Prague hosts machine translation marathon

09-03-2010 16:38 | Ruth Fraňková

Prague’s Charles University recently hosted an unusual marathon that tested the capabilities of various machine translation systems. The annual event is part of the Euromatrix project, which aims to establish machine translation systems for all European languages. The participants had a week to translate some 12,000 sentences from various newspapers and news sites. In the coming weeks their output will be compared with translations done by professional “human” translators. Ruth Fraňková spoke to Ondřej Bojar from the Institute of Formal and Applied Linguistics, which is taking part in the Euromatrix project:

“The project involves, if I am correct, seven universities and two companies, and we are one of those universities. Our main focus is translation from English to Czech and also from Czech to English, but we prefer Czech as the target language. We are working with the deep syntactic representations of the sentence which means that we aim at a translation where linguistics is applied. We have built collections of hundreds of thousands of sentences that are manually annotated with the syntactic representation of the sentence and we are now transferring the knowledge we have about Czech syntax into English.”

Is Czech a difficult language compared to other European languages?

“Czech has some specific properties that make it particularly difficult for translation, for example from English, and the difference between Czech and English is the rich morphology in Czech. While in English you have just a single form of a word, say green for the colour, in Czech you have seven cases, four genders and two numbers. Not all these combinations are different on the surface level but the number of possible Czech word forms is much higher and the system has to choose a correct one so this is a challenge.”
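
The arithmetic behind that answer can be sketched in a few lines of Python. This is only an illustration, not part of any Euromatrix system: the case names are the standard Czech ones, the “four genders” are read here as masculine animate, masculine inanimate, feminine and neuter (an assumption about what the speaker meant), and the handful of forms of “zelený” (green) is a small sample rather than the full paradigm.

```python
from itertools import product

CASES = ["nominative", "genitive", "dative", "accusative",
         "vocative", "locative", "instrumental"]
# Assumption: "four genders" means the masculine is split by animacy.
GENDERS = ["masculine animate", "masculine inanimate", "feminine", "neuter"]
NUMBERS = ["singular", "plural"]

# Every (case, gender, number) combination an adjective may need to express.
slots = list(product(CASES, GENDERS, NUMBERS))

# A few singular forms of "zelený" (green), for illustration only;
# this is not the complete paradigm.
sample_forms = {
    ("nominative", "masculine animate", "singular"): "zelený",
    ("nominative", "masculine inanimate", "singular"): "zelený",
    ("nominative", "feminine", "singular"): "zelená",
    ("nominative", "neuter", "singular"): "zelené",
    ("genitive", "feminine", "singular"): "zelené",
    ("accusative", "feminine", "singular"): "zelenou",
    ("instrumental", "feminine", "singular"): "zelenou",
}

# Several slots share one surface form, as the speaker notes.
distinct_forms = set(sample_forms.values())

print(len(slots))                              # 7 * 4 * 2 = 56 slots
print(len(sample_forms), len(distinct_forms))  # 7 sampled slots, 4 distinct forms
```

English “green” fills every one of those slots with a single form, so a system translating into Czech has to pick the right slot from context before it can produce the word at all.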

What about word order?

“The word order actually helps us when we are translating from English to Czech because Czech allows nearly any permutation of words in the sentence as a correct word order provided that the case markings and things like that are correct. When we are translating back from Czech into English and the Czech is produced by native speakers the situation is much more difficult. You have to identify where is the subject, where is the verb, where is the object, and these have to be in the canonical English order otherwise the sentence wouldn’t be comprehensible for a speaker of English.”
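
The reordering step described above can be sketched as follows. This is a toy under strong assumptions: the grammatical roles are handed to the function ready-made, whereas a real system would have to infer them from case markings and syntax, and the function name `to_english_order` is hypothetical.

```python
# Canonical English constituent order for a simple transitive clause.
ENGLISH_ORDER = ["subject", "verb", "object"]

def to_english_order(constituents):
    """Arrange (text, role) pairs into subject-verb-object order.

    Assumes exactly one constituent per role; roles are given, not inferred.
    """
    by_role = {role: text for text, role in constituents}
    return " ".join(by_role[role] for role in ENGLISH_ORDER)

# English glosses in the order of the Czech sentence "Psa honí kočka":
# object first, then verb, then subject.
czech_order = [("the dog", "object"), ("chases", "verb"), ("the cat", "subject")]

print(to_english_order(czech_order))  # the cat chases the dog
```

Copying the Czech order word for word would give “the dog chases the cat”, which reverses the meaning; that is why identifying subject, verb and object matters so much in this direction.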

What is the future of machine translation systems? Do you think they can replace humans?

“I do believe that machine translation systems can replace humans in the case of repetitive texts. For example, weather reports were translated from English to French already in the 1970s. Now I think we are moving towards European legislation and I estimate that 60 percent of the texts or even more can be automatically translated with no human intervention.”
