STEPHEN WAN


PhD RESEARCH

Supervisors: Dr. Robert Dale, Dr. Mark Dras, Dr. Cécile Paris (external)
 

 

Research Topic:

Within the field of Automatic Text Summarization, I am interested in developing an approach that accounts for coherence statistically and ensures the verisimilitude of generated text. My aim is to build a system that can generate a paragraph-length abstract summary.

In my first year, I examined techniques (e.g., Witbrock and Mittal, 1999) for generating a single-sentence summary of a news article (similar in content to the headline). The benefit of such a summarizer is that it generates sentences from scratch, each of which may be a concise statistical paraphrase of information found across several sentences. The resulting system can be found at the Headliner Project Webpage.

The next few years of work will use this system as a building block for generating summaries longer than one sentence. Accounting for coherence and verisimilitude will be the focus of the thesis.

The thesis will cover research topics such as:

  • Statistical Text Generation
    • Statistical Content Planning
    • Statistical Surface Realization
  • Paraphrase Tree Alignment
  • Summarization Evaluation
  • Sentence Ordering and Statistical Cohesion

I am currently looking at UN Humanitarian Aid Proposals as the data for my PhD.

My other interests include applying NLP to applications such as computer games, entertainment, distance education software and electronic government. I am especially interested in Computational Creativity and Computational Music.

Publications:

  • For a full list see: http://www.ict.csiro.au/staff/stephen.wan/publications.htm
     
  • Stephen Wan, Kathleen McKeown. (2004) Generating Overview Summaries of Ongoing Email Thread Discussions. In the Proceedings of the 20th International Conference on Computational Linguistics. August, 2004, Geneva, Switzerland. [PDF, 68K]
     
  • Stephen Wan, Robert Dale, Mark Dras, Cécile Paris. (2003) Straight to the Point: Discovering Themes for Summary Generation. In the Proceedings of the Australian Workshop on Natural Language Processing 2003, Melbourne, Australia. [PostScript, 746K]
     
  • Stephen Wan, Mark Dras, Cécile Paris, Robert Dale. (2003) Using Thematic Information in Statistical Headline Generation. In "The Proceedings of the Workshop on Multilingual Summarization and Question Answering" at ACL 2003, July 11, Sapporo, Japan. [PDF, 2.4M]

Funding:

  • Research Award For Areas and Centres of Excellence (RAACE) at Macquarie University
  • CSIRO Top-Up Scholarship
 

 

 
