Within the field of
Automatic Text Summarization, I am interested in developing an
approach that accounts for coherence statistically and ensures the
verisimilitude of generated text.
My aim, then, is to build a system that can generate a
paragraph-length abstractive summary.
In my first year, I examined techniques (e.g.,
Witbrock and Mittal, 1999) for generating a single-sentence summary of
a news article (similar in content to the headline). The
benefit of such a summarizer is that it generates sentences from
scratch, so a generated sentence may be a concise statistical paraphrase
of information found across several source sentences. The resulting
system can be found at the Headliner
Project Webpage.
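To make the idea of generating a sentence "from scratch" more concrete, here is a minimal sketch of one common way to frame statistical headline generation: a content selection model scores which article words are likely to appear in a headline, and a bigram language model scores word order. This is only an illustration under my own assumptions, not the Headliner system itself. The function names (`train`, `generate`), the add-one smoothing, the beam search and the toy data are all hypothetical choices made for the example.

```python
import math
from collections import Counter, defaultdict


def train(pairs):
    """Estimate both models from (article_tokens, headline_tokens) pairs.

    Returns:
      select:  word -> P(word in headline | word in article), relative frequency
      bigrams: previous word -> Counter of following words, from headlines only
    """
    in_doc, in_both = Counter(), Counter()
    bigrams = defaultdict(Counter)
    for article, headline in pairs:
        head_set = set(headline)
        for w in set(article):
            in_doc[w] += 1
            if w in head_set:
                in_both[w] += 1
        for prev, cur in zip(["<s>"] + headline, headline + ["</s>"]):
            bigrams[prev][cur] += 1
    select = {w: in_both[w] / in_doc[w] for w in in_doc}
    return select, bigrams


def bigram_logprob(bigrams, prev, cur, vocab_size):
    """Add-one smoothed log P(cur | prev) (smoothing is an assumption)."""
    counts = bigrams.get(prev, {})
    total = sum(counts.values())
    return math.log((counts.get(cur, 0) + 1) / (total + vocab_size))


def generate(article, select, bigrams, length=3, beam=5):
    """Beam search for a word sequence drawn from the article's words."""
    # Only words with non-zero selection probability are candidates.
    candidates = sorted({w for w in article if select.get(w, 0) > 0})
    vocab_size = len(select) + 1
    beams = [(0.0, ["<s>"])]
    for _ in range(length):
        expanded = []
        for score, seq in beams:
            for w in candidates:
                if w in seq:  # use each word at most once
                    continue
                step = math.log(select[w]) + bigram_logprob(
                    bigrams, seq[-1], w, vocab_size
                )
                expanded.append((score + step, seq + [w]))
        if not expanded:
            break
        # Highest score first; ties broken alphabetically for determinism.
        expanded.sort(key=lambda item: (-item[0], item[1]))
        beams = expanded[:beam]
    return beams[0][1][1:]  # drop the "<s>" start marker


# Toy, made-up training data purely for illustration.
toy_pairs = [
    (["the", "bank", "raised", "rates", "today"], ["bank", "raised", "rates"]),
    (["a", "bank", "cut", "rates", "again"], ["bank", "cut", "rates"]),
]
select, bigrams = train(toy_pairs)
headline = generate(["the", "bank", "raised", "rates", "again"], select, bigrams)
print(" ".join(headline))
```

Because the words are chosen and ordered by the two models rather than copied as a span, the output need not match any single sentence of the article, which is the property described above.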
Over the next few years, I will use
this system as a building block for generating summaries longer than one
sentence. Accounting for coherence and verisimilitude will be the focus of the thesis.
The thesis will cover research topics such as:
- Statistical Text Generation
- Statistical Content Planning
- Statistical Surface Realization
- Paraphrase Tree Alignment
- Summarization Evaluation
- Sentence Ordering and Statistical Cohesion
I am currently looking at UN Humanitarian Aid
Proposals as the data for my PhD.
My other interests include applying NLP in areas such as computer games,
entertainment, distance
education software, and electronic government. I am especially
interested in Computational Creativity and Computational Music.