[Ltg] LTG Seminar [Tanja Gaustad van Zaanen, Dec 20, 11 am, E6A357]

Rolf Schwitter rolfs at ics.mq.edu.au
Wed Dec 15 10:01:21 EST 2004


----
LTG Seminar
 - see: http://www.clt.mq.edu.au/Events/Seminars.html

Monday, December 20, 2004 at 11am
Macquarie Uni, E6A 357
----

Title: Linguistic Knowledge and Word Sense Disambiguation
Speaker: Tanja Gaustad van Zaanen

In this talk I will present the findings of my PhD research. The main 
research question I try to answer in my thesis is which linguistic knowledge 
sources are most useful for word sense disambiguation (WSD), more 
specifically word sense disambiguation of Dutch. Therefore, the structure of 
the thesis - and of this talk - is based on the various levels of linguistic 
information tested for WSD, including morphology, information on the 
syntactic class of a particular ambiguous word, and the syntactic structure 
of the entire sentence containing an ambiguous word.

The goal of the project was to develop a tool that can automatically 
determine the meaning of a particular ambiguous word in context, a so-called 
word sense disambiguation system. I will first introduce the experimental 
setup of the WSD system, including a brief presentation of the corpus, the 
classification algorithm used, and the "features" (or sources of linguistic 
knowledge) integrated in the model. In a second step, I will present a novel 
approach to building a statistical WSD system in which morphological 
information, in the form of lemmas, is the key element. This will be followed 
by a presentation of various results, highlighting the importance of 
linguistic information in the WSD system presented.

In this statistical WSD system, the addition of deep linguistic knowledge in 
particular greatly improves disambiguation accuracy. In combination with an 
approach taking advantage of morphological information, it yields the best 
results for WSD of Dutch (on the Senseval-2 data set). My system achieves 
significantly higher disambiguation accuracy than any results for Dutch 
reported in the literature so far and is thus state of the art for Dutch 
WSD.



