[Ltg] SALS-SIG/LTG Seminar [Dan Flickinger, 2008-06-11, E6A 357, 11:30am]
Marc Tilbrook
marct at ics.mq.edu.au
Wed Jun 4 14:29:53 EST 2008
----
SALS-SIG/LTG Seminar
- see: http://www.clt.mq.edu.au/Events/SALS-SIG.html
Wednesday 11 June 2008, 11:30am - 12:30pm
Macquarie University, E6A, Room 357
----
Title: Allies in Babel's Aftermath: Combining deep grammars with statistics
in machine translation
Speaker: Dan Flickinger, Stanford University, http://lingo.stanford.edu/dan/
Abstract:
During the past two decades, machine translation has experienced renewed and
growing interest, driven in part by new applications and markets on the Web,
and in part by the invention of new approaches, in particular data-driven
methods like Statistical Machine Translation (SMT) and Example-Based MT.
While these methods have shown very promising initial results, it has
recently become clear even to proponents of SMT that further improvements in
quality of output will require something in addition to the current
statistical methods alone. There is an emerging consensus within
computational linguistics that hybrid approaches combining rich symbolic
resources and powerful statistical techniques will be necessary to produce
NLP applications with a satisfactory balance of robustness and precision.
In this talk, I will present and demonstrate one such hybrid approach in a
semantic-transfer based MT system, LOGON, developed in Norway, which makes
use of two pre-existing wide-coverage hand-built grammars of Norwegian and
English to parse and generate, combined with statistical methods to rank the
outputs of each of the components for analysis, transfer, and generation. The
statistical models were trained on treebanks produced by manually
disambiguating the grammar-based parsing results in each language for a
sentence-aligned Norwegian-English corpus consisting of tourism booklets on
Norwegian back-country hiking. The transfer component consists of a large
set of rules, both manually and semi-automatically produced, which map
meaning representations in Minimal Recursion Semantics (MRS) produced by the
Norwegian grammar into roughly equivalent English-specific MRS
representations suitable for input to the English generator. Manual
harmonization of linguistic analyses across the two languages improved the
generality and robustness of these transfer rules, and reduced the quantity
of training data needed for the statistical models.
Biosketch:
Dan Flickinger is a Senior Researcher at the Center for the Study of
Language and Information, Stanford University, and manager of the LinGO
(Linguistic Grammars Online) project there. His areas of research focus
include syntactic theory, the organization of the lexicon, grammar
engineering, deep linguistic processing and formal semantics. He has
experience both in industry (at Hewlett-Packard Labs and various Silicon
Valley start-ups) and in academia (including a Visiting Professor position
at Saarland University in Germany).