[Ltg] LTG Seminar [PACLING Practice Presentations 2007-09-10, E6A 357, 11am]

Mary Gardiner gardiner at ics.mq.edu.au
Mon Sep 10 08:22:47 EST 2007


On Mon, Sep 10, 2007, Marc Tilbrook wrote:
> ----
>    LTG Seminar
>     - see: http://www.clt.mq.edu.au/Events/Seminars.html
> 
>     Monday, 3rd September, 2007, 10am
>     Macquarie University, E6A, Room 357
> ----
> 
>    ----
>     * Please note we are back to our 11am start.
>    ----
> 
> We will be having two PACLING practice presentations, by Stephen Wan and 
> Mary Gardiner.

Title: Corpus Statistics Approaches to Discriminating Among
Near-Synonyms
Speaker: Mary Gardiner

Abstract:

Near-synonyms are words that mean approximately the same thing, and
which tend to be assigned to the same leaf in ontologies such as
WordNet. However, they can differ from each other subtly in both
meaning and usage---consider the pair of near-synonyms "frugal"
and "stingy"---and therefore choosing the appropriate near-synonym
for a given context is not a trivial problem.

Early work on this problem was done by Edmonds (1997), who reported an 
experiment using lexical co-occurrence networks to predict which of a 
set of near-synonyms would be used in a given context. That work 
concluded that corpus statistics approaches did not appear to work well 
for this type of problem, and it led instead to the development of 
machine learning approaches over lexical resources such as "Choose the 
Right Word" (Hayakawa, 1994).

Our hypothesis is that some kind of corpus statistics approach may still 
be effective in some situations, particularly if the near-synonyms 
differ from each other in sentiment. Work in sentiment analysis suggests 
an intuition: if the distribution of words embodying some characteristic 
of sentiment can predict the overall sentiment or attitude of a 
document, then perhaps these same words can predict the choice of an 
individual `attitudinal' near-synonym given its context, even if this is 
not necessarily true for other types of near-synonym. This would reopen 
problems involving this type of near-synonym to corpus statistics 
methods. As a first step, then, we investigate whether 
attitudinal near-synonyms are more likely to be successfully predicted 
by a corpus statistics method than other types. In this paper we present 
a larger-scale experiment based on Edmonds (1997), and show that 
attitudinal near-synonyms can in fact be predicted more accurately using 
corpus statistics methods.
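
To make the prediction task concrete, here is a minimal sketch of one
possible corpus statistics predictor. It is not the lexical
co-occurrence network method of Edmonds (1997), nor the method of the
paper: it is a simpler first-order stand-in that scores each candidate
near-synonym by its summed pointwise mutual information (PMI) with the
words in the context. The function names (train, pmi, choose) and the
window size are assumptions made for illustration.

```python
import math
from collections import Counter


def train(sentences, window=2):
    """Count words and ordered co-occurrences within `window` tokens."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, w in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(w, tokens[j])] += 1
    return word_counts, pair_counts


def pmi(a, b, word_counts, pair_counts):
    """PMI of a and b in log base 2; 0.0 if they never co-occur."""
    joint = pair_counts[(a, b)]
    if joint == 0:
        return 0.0
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    p_ab = joint / total_pairs
    p_a = word_counts[a] / total_words
    p_b = word_counts[b] / total_words
    return math.log2(p_ab / (p_a * p_b))


def choose(candidates, context, word_counts, pair_counts):
    """Pick the candidate with the highest summed PMI over the context."""
    def score(candidate):
        return sum(pmi(candidate, w, word_counts, pair_counts)
                   for w in context)
    return max(candidates, key=score)
```

Given counts from a tokenised corpus, a call such as
choose(["frugal", "stingy"], context_words, word_counts, pair_counts)
fills a gap in a sentence with whichever near-synonym co-occurs most
strongly with the surrounding words. Ties go to the first candidate
listed.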

