[Ltg] LTG Seminars [13 + 20 December, E6A, 11am]
Rolf Schwitter
rolfs at ics.mq.edu.au
Wed Dec 1 20:40:10 EST 2004
----
LTG Seminar
- see: http://www.clt.mq.edu.au/Events/Seminars.html
Monday, December 13, 2004 at 11am
Macquarie Uni, E6A 357
----
Speaker: Menno van Zannen
Title: How to get from A to ABL
If you want to structure sequences (such as natural language sentences), you
could, for example, use a grammar to parse them. However, what do
you do if such a grammar is not available? This problem can occur, for
example, when analyzing sentences written in an ancient natural language or
a natural language for which no grammar has been created yet. To make
things worse, the sequence may also come from a "language" for which it is
unclear what the grammar should look like (for example, in music).
The Alignment-Based Learning framework is a grammatical inference framework
(or more precisely a structure inference framework). By analyzing the
sentences, it tries to find regularities that may be used to assign
structure to unstructured sequences.
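
As a rough illustration of what "finding regularities" between sentences
can look like, the sketch below aligns two tokenized sentences and treats
the stretches where they differ, in otherwise shared context, as candidate
constituents. It is a minimal sketch under stated assumptions and not the
system presented in the talk: the function name hypothesize_constituents is
hypothetical, and Python's difflib.SequenceMatcher is used here as a
stand-in aligner.

```python
from difflib import SequenceMatcher


def hypothesize_constituents(sent_a, sent_b):
    """Return pairs of differing word spans found by aligning two sentences.

    Assumption for illustration: words shared by both sentences act as
    context, and the spans that replace each other between shared words
    are proposed as candidate constituents of the same type.
    """
    a, b = sent_a.split(), sent_b.split()
    # autojunk=False keeps the aligner from discarding frequent tokens.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    hypotheses = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" marks a span of a that lines up with a different span of b.
        if tag == "replace":
            hypotheses.append((tuple(a[i1:i2]), tuple(b[j1:j2])))
    return hypotheses


if __name__ == "__main__":
    print(hypothesize_constituents("the cat sleeps", "the dog sleeps"))
```

Here "cat" and "dog" occur in the same context ("the ... sleeps"), so they
are proposed as interchangeable units. Repeating this over many sentence
pairs yields overlapping hypotheses that a later step would have to select
among.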
In this talk, I will explain how the system works, how it arose, how it
evolved and how it can be used in the future. This will give a compact
overview of the previous work I have done and some ideas on future work I
would like to perform.
Monday, December 20, 2004 at 11am
Macquarie Uni, E6A 357
----
Speaker: Tanja Gaustad van Zannen
Title: Linguistic Knowledge and Word Sense Disambiguation
In this talk I will present the findings of my PhD research. The main
research question I try to answer in my thesis is which linguistic knowledge
sources are most useful for word sense disambiguation (WSD), more
specifically word sense disambiguation of Dutch. Therefore, the structure of
the thesis - and of this talk - is based on the various levels of linguistic
information tested for WSD, including morphology, information on the
syntactic class of a particular ambiguous word, and the syntactic structure
of the entire sentence containing an ambiguous word.
The goal of the project was to develop a tool which is able to automatically
determine the meaning of a particular ambiguous word in context, a so-called
word sense disambiguation system. I will first introduce the experimental
setup of the WSD system, including a brief presentation of the corpus, the
classification algorithm used, as well as the "features" (or sources of
linguistic knowledge) integrated in the model. In a second step, I will
present a novel approach to building a statistical WSD system which includes
morphological information in the form of lemmas as the key element. This
will be followed by a presentation of various results, highlighting the
importance of linguistic information in the WSD system presented.
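
One possible reading of "lemmas as the key element" is that training
examples for different inflected forms are pooled under their lemma, so
that each ambiguous lemma is modelled with more data. The sketch below
illustrates only that pooling idea. Everything in it is an assumption made
for illustration: the tiny LEMMAS table, the invented Dutch toy examples,
the sense labels, and the simple context-word overlap score, which stands
in for whatever classification algorithm the actual system uses.

```python
from collections import Counter, defaultdict

# Hypothetical lemma table; a real system would use a morphological analyser.
LEMMAS = {"bank": "bank", "banken": "bank"}


def train(examples):
    """Count context words per sense, keyed by lemma rather than word form.

    examples: iterable of (word_form, context_words, sense) triples.
    """
    model = defaultdict(lambda: defaultdict(Counter))
    for form, context, sense in examples:
        lemma = LEMMAS.get(form, form)
        model[lemma][sense].update(context)
    return model


def disambiguate(model, form, context):
    """Pick the sense whose training contexts overlap most with `context`."""
    lemma = LEMMAS.get(form, form)
    senses = model.get(lemma)
    if not senses:
        return None  # lemma never seen in training
    return max(senses, key=lambda s: sum(senses[s][w] for w in context))


if __name__ == "__main__":
    # Invented toy data: singular seen with a financial sense only,
    # plural seen with a furniture sense only.
    toy = [
        ("bank", ["geld", "lening"], "financial"),
        ("banken", ["zitten", "kamer"], "furniture"),
    ]
    model = train(toy)
    print(disambiguate(model, "banken", ["lening", "rente"]))
```

Because both forms share one lemma entry, the plural "banken" can still be
assigned the financial sense even though that sense was only observed with
the singular form in the toy data.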
In this statistical WSD system, the addition of deep linguistic knowledge
in particular greatly improves disambiguation accuracy. In combination with an
approach taking advantage of morphological information, the best results for
WSD of Dutch (on the Senseval-2 data set) are obtained. My system achieves
significantly higher disambiguation accuracy than any results for Dutch that
have been reported in the literature up to now and is thus state-of-the-art
for Dutch WSD.