PangeaMT – our own MT solution range!

PangeaMT is Pangeanic's own translation technology division with a clear focus on customized, domain-specific MT.

As a forward-thinking and technology-open LSP, Pangeanic won a contract in 2007 to post-edit MT output for the European Commission. It was at this time that we became acquainted with institutional user needs and (re-)evaluated several commercial MT products we had been using. Soon we decided to develop our own MT technology.

We are now keen followers of the statistics-driven paradigm of MT, and so PangeaMT, our MT solution range, is based on SMT and Moses. However, we have been able to overcome some Moses shortcomings: our solutions go beyond text-based MT and can take input and produce output in industry-standard formats such as TMX and XLIFF. Using open standards means that you do not depend on expensive TM software to work with our output, and our solutions avoid locking you in to costly upgrades year after year.

Another PangeaMT breakthrough is our in-line mark-up parser. Statistical machine translation systems usually produce plain text output because plain text is also the only format they can process. However, we are keen to see PangeaMT solutions in use and adapted to the most demanding language industry requirements, so we focused our effort on developing SMT engines capable of handling the in-line coding typical of other content formats used in localization production environments. Thanks to this parser, PangeaMT can identify in-lines without attempting to translate them, and it places them back in the resulting text, too. First, a placeholder stands in for each in-line while all XML and code information is copied and transferred to a separate module. The translation engine then does its work, and the in-lines are placed back into the translated segment. To our knowledge, our in-line parser constitutes an innovation well above the current level of maturity of well-known SMT systems.
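The placeholder approach described above can be illustrated with a minimal sketch. This is not PangeaMT's actual parser: the function names, the __TAGn__ placeholder format and the toy word-for-word "engine" are all assumptions made for illustration, and a real system would also have to cope with the engine reordering or dropping placeholders.

```python
import re
from typing import Callable, List, Tuple

# Assumption: in-lines look like simple XML/HTML tags such as <b> or <ph id="1"/>.
TAG_RE = re.compile(r"<[^>]+>")
PLACEHOLDER_RE = re.compile(r"__TAG(\d+)__")


def mask_inlines(segment: str) -> Tuple[str, List[str]]:
    """Replace each in-line tag with a numbered placeholder and store the tag."""
    tags: List[str] = []

    def stash(match: "re.Match[str]") -> str:
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_RE.sub(stash, segment), tags


def restore_inlines(translated: str, tags: List[str]) -> str:
    """Put the stored tags back where their placeholders ended up."""
    return PLACEHOLDER_RE.sub(lambda m: tags[int(m.group(1))], translated)


def translate_with_inlines(segment: str, engine: Callable[[str], str]) -> str:
    """Mask in-lines, translate the remaining text, then restore the in-lines."""
    masked, tags = mask_inlines(segment)
    return restore_inlines(engine(masked), tags)


def toy_engine(text: str) -> str:
    """Hypothetical stand-in for an MT engine: plain word substitution."""
    for source, target in {"Click": "Haga clic en", "Save": "Guardar", "now": "ahora"}.items():
        text = text.replace(source, target)
    return text


if __name__ == "__main__":
    print(translate_with_inlines("Click <b>Save</b> now", toy_engine))
```

The engine only ever sees text and opaque placeholders, so the mark-up itself is never translated. Keeping the tags in a separate list is what allows them to be restored unchanged after translation.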

We keep learning and improving with every development commissioned by an existing or new client and for every new language combination. We therefore remain open to applying new (hybridization) techniques that we research and implement ourselves or could develop in conjunction with others. We are aware that some language combinations will require linguistically informed techniques as part of the pre- or post-processing phases. Correct word and phrase reordering in the MT output is not an easy goal to achieve, especially when the languages involved are not closely related from a language family standpoint, or when one of the two languages has a very flexible, and therefore MT-challenging, word order (WO). Some language-specific fixing procedures may come in handy. In other cases, it may be useful to use one language as a pivot to train engines for closely related languages. These and other techniques may be used or taken as a basis for expanding our PangeaMT solution palette.

Our PangeaMT division and solutions have been presented and discussed at many GILT (globalization, internationalization, localization, translation) industry events.

Please visit our sister website to learn more about PangeaMT. Thanks for your interest!

Next time you think languages, think Pangeanic