The Tenjinno Machine Translation Competition

Downloading the competition problems

There are currently 4 different problems to be solved, as described in the following table. You can download the training and testing data by clicking on the hyperlinks in the 4th and 5th columns of the table. The individual files are compressed with gzip. Alternatively, you can get all of the files in one shot. Note that you can force Netscape to download a link to a file instead of displaying it by shift-left-clicking on the link.

Problem Number | Formalism                          | Log2 (Complexity) | Training set | Testing sets | Status
---------------+------------------------------------+-------------------+--------------+--------------+---------
1              | Deterministic FST                  | 7.27*10^31        | train.1.gz   | test.1.gz    | Solved
2              | Non-deterministic Unambiguous FST  | 7.27*10^31        | train.2.gz   | test.2.gz    | Unsolved
3              | Deterministic SDTS                 | 7.24*10^34        | train.3.gz   | test.3.gz    | Unsolved
4              | Non-deterministic Unambiguous SDTS | 8.28*10^34        | train.4.gz   | test.4.gz    | Unsolved

These files represent 4 different problems, ranked in difficulty from 1 to 4 (with 1 being the easiest and 4 the hardest). The way in which the target problems, training and testing sets were created is described here.

The winner of each individual problem will be the first contestant to submit a correctly labelled test set for that problem to the oracle. The winner of the competition will be the winner of the highest-ranking problem. We reserve the right to add additional problems to the competition if the initial set of problems proves to be too simple or too hard.

File Formats

The files train.?.gz contain a list of sentences in the source language along with their translations in the target language. You should use them to infer the transducers. You can test your answers using the files test.?.gz, which contain strings for you to translate. The translated sentences can be sent to the Tenjinno Oracle.

Each file is a comma-separated text file, with the first column being the input sentence and the second column being the translation of that sentence. It is possible for a sentence in the input language to be translated to the empty string.
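As an illustration of the format described above, the following Python sketch reads sentence pairs from a gzip-compressed file. It is not part of the competition software. The function names parse_pairs and load_training_pairs are hypothetical, and the sketch assumes one pair per line, that sentences themselves contain no commas, that surrounding whitespace is not significant, and that the files are plain ASCII text. None of these details are stated on this page, so check them against the downloaded data.

```python
import gzip


def parse_pairs(lines):
    """Yield (source, target) tuples from comma-separated lines.

    Assumption: the first comma on a line separates the two columns.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            # A completely blank line carries no separator; skip it.
            continue
        source, _, target = line.partition(",")
        # The target may legitimately be the empty string.
        yield source.strip(), target.strip()


def load_training_pairs(path):
    """Read all pairs from a gzip-compressed file such as train.1.gz."""
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        return list(parse_pairs(handle))


# Made-up sample lines (not taken from the competition data).
sample = ["a b c,x y\n", "d e,\n"]
for source, target in parse_pairs(sample):
    print(repr(source), "->", repr(target))
```

Once train.1.gz has been downloaded, load_training_pairs("train.1.gz") would return the pairs as a list. The second sample line shows a sentence whose translation is the empty string.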

For any comments or questions about these pages please contact the Tenjinno organisers.
Brad Starkie
Menno van Zaanen
Dominique Estival

Copyright 2005 ICGI. Last updated: Tue Apr 11 09:15:01 EST 2006