[Ltg] LTG Seminar [Brett Powley 07-04-23,E6A 357, 11am]
Marc Tilbrook
marct at ics.mq.edu.au
Fri Apr 20 11:46:17 EST 2007
----
LTG Seminar
- see: http://www.clt.mq.edu.au/Events/Seminars.html
Monday, 23rd April, 2007, 11am
Macquarie University, E6A, Room 357
----
Title: Extracting citations from academic papers: finding them is only half
the job
Speaker: Brett Powley
My research is focused on extraction and analysis of citations from
collections of academic papers. In this talk, I'd like to report on two
aspects of my work. For the first part, I'll give an overview of the issues
in performing high-accuracy citation extraction. I'll present some results
on applying an evidence-based algorithm that I've developed to the ACL
Anthology and also to a second heterogeneous corpus comprising papers from a
broad range of disciplines. This work is the subject of two conference
papers to be presented later this year.
For the second part, I'll present some work-in-progress on one specific
problem related to interpreting citing sentences: the resolution of
ambiguity in parenthetical citations. I'll show that the syntactic
attachment of a parenthetical citation to its containing sentence is often
ambiguous, and that syntactic and lexical cues within the sentence are usually
not sufficient to resolve the ambiguity. I'd then like to outline some
ideas I have for solving this problem by finding evidence from other citing
sentences and measuring alignment between those sentences and the sentence
in question. This presents a number of as-yet-unresolved challenges,
including measuring alignment in spite of paraphrasing, determining how to
weight evidence, and determining how to represent multiple sources of
evidence as a feature vector suitable for machine learning experiments. As
this is early work, I'd like to solicit feedback and ideas on approaches to
this problem.