[Ltg] LTG Seminar [Tanja Gaustad van Zaanen, Dec 20, 11 am, E6A357]
Rolf Schwitter
rolfs at ics.mq.edu.au
Wed Dec 15 10:01:21 EST 2004
----
LTG Seminar
- see: http://www.clt.mq.edu.au/Events/Seminars.html
Monday, December 20, 2004 at 11am
Macquarie Uni, E6A 357
----
Title: Linguistic Knowledge and Word Sense Disambiguation
Speaker: Tanja Gaustad van Zaanen
In this talk I will present the findings of my PhD research. The main
research question I try to answer in my thesis is which linguistic knowledge
sources are most useful for word sense disambiguation (WSD), more
specifically word sense disambiguation of Dutch. Therefore, the structure of
the thesis - and of this talk - is based on the various levels of linguistic
information tested for WSD, including morphology, information on the
syntactic class of a particular ambiguous word, and the syntactic structure
of the entire sentence containing an ambiguous word.
The goal of the project was to develop a tool which is able to automatically
determine the meaning of a particular ambiguous word in context, a so-called
word sense disambiguation system. I will first introduce the experimental
setup of the WSD system, including a brief presentation of the corpus, the
classification algorithm used, as well as the "features" (or sources of linguistic
knowledge) integrated in the model. In a second step, I will present a novel
approach to building a statistical WSD system which includes morphological
information in the form of lemmas as the key element. This will be followed
by a presentation of various results, highlighting the importance of
linguistic information in the WSD system presented.
In this statistical WSD system, the addition of deep linguistic knowledge in
particular greatly improves disambiguation accuracy. In combination with an
approach taking advantage of morphological information, the best results for
WSD of Dutch (on the Senseval-2 data set) are obtained. My system achieves
significantly higher disambiguation accuracy than any result for Dutch
reported in the literature to date and is thus state-of-the-art for Dutch
WSD.