The Tenjinno Machine Translation Competition
Downloading the competition problems

There are currently 4 different problems to be solved, as described in the following table. You can download the training and testing data simply by clicking on the hyperlinks in the 4th and 5th columns of the table. The individual files are compressed with gzip. Alternatively, you can get all of the files in one shot. Note that you can force Netscape to download to a file instead of displaying it by using Shift + left click on the link.
These files represent 4 different problems, ranked in difficulty from 1 to 4 (with 1 being the easiest and 4 the hardest). The way in which the target problems, training sets and testing sets were created is described here. The winner of each individual problem will be the first contestant who submits a correctly labelled test set for that problem to the oracle. The winner of the competition will be the winner of the highest-ranking problem. We reserve the right to add additional problems to the competition if the initial set of problems proves to be too simple or too hard.

File Formats

The files train.?.gz contain a list of sentences in the source language along with their translations in the target language. You should use them to infer the transducers. You can test your answers using test.?.gz, which contain strings for you to translate. The translated sentences can be sent to the Tenjinno Oracle. The format of the file is a comma-separated text file, with the first column being the input sentence and the second column being the translation of that sentence. It is possible for a sentence in the input language to be translated to the empty string.
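As an illustration of the file format described above, here is a minimal Python sketch that reads a gzip-compressed training file into (input sentence, translation) pairs. It is not part of the official competition tooling. The function names `read_pairs` and `load_training_file` are hypothetical, and the sketch assumes UTF-8 text, one pair per line, no commas inside the input sentence (each line is split on its first comma), and that blank lines carry no data.

```python
import gzip


def read_pairs(lines):
    """Yield (source, translation) pairs from an iterable of text lines.

    Assumption: the input sentence contains no comma, so each line is
    split on its first comma only.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines are not data
        source, _, target = line.partition(",")
        # An empty second column means the sentence translates to the
        # empty string, which the format explicitly allows.
        yield source.strip(), target.strip()


def load_training_file(path):
    """Read a gzip-compressed training file into a list of pairs."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(read_pairs(handle))
```

For example, `load_training_file("train.1.gz")` would return the pairs for problem 1, assuming the file has been downloaded to the current directory under that name.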
For any comments or questions about these pages please contact the Tenjinno organisers.

Brad Starkie, Menno van Zaanen, Dominique Estival