[Ltg] LTG Seminar [Steve Cassidy 2007-10-29, E6A 357, 11am]

Marc Tilbrook marct at ics.mq.edu.au
Thu Oct 25 15:37:51 EST 2007


----
  LTG Seminar
  - see: http://www.clt.mq.edu.au/Events/Seminars.html

  Monday, 29th October, 2007, 11am
  Macquarie University, E6A, Room 357
----

Title: Putting your Corpora on the (Semantic) Web
Speaker: Steve Cassidy

Language corpora are distributed in all kinds of formats, supported by all
kinds of tools. Sharing corpora is done by distributing media or by putting
files up on the web. Corrections and updates are often managed by issuing
new versions or distributing patch files, or are not managed at all. Having
many people work collaboratively on annotating a data set is not something
we've managed to do without considerable administrative overhead.

What I'll propose in this talk is that language corpora should live on the
web in a way that satisfies many of these needs: that all parts of a corpus
be available over HTTP for direct download; that updates and additions to
annotation can be made via HTTP; and that versions of corpora be managed
directly and be available on the web.

To achieve this I'll propose a new annotation model that builds on RDF, the
core data model of the Semantic Web effort.  I'll show how this model
subsumes a number of other annotation models that are in use and argue that
there are new advantages that come out of deploying RDF in this space. Since
persuading everyone to convert their tools over to RDF is not feasible, I'll
also describe how annotations in various formats can be delivered over HTTP
to satisfy legacy clients.
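
As a rough illustration only (the abstract does not specify the model, so
this is not the speaker's design), one annotation might be described as RDF-style
subject/predicate/object triples along the following lines. The example.org
URIs, the property names (annotates, start, end, label) and both helper
functions are hypothetical, chosen just to show the shape of the idea:

```python
# Hypothetical namespaces; these URIs and property names are illustrative
# assumptions, not part of any published annotation vocabulary.
BASE = "http://example.org/corpus/"
VOCAB = "http://example.org/annotation#"


def annotation_triples(doc, ann_id, start, end, label):
    """Describe one annotation as (subject, predicate, object) triples."""
    subject = f"{BASE}{doc}/annotation/{ann_id}"
    return [
        (subject, VOCAB + "annotates", f"{BASE}{doc}"),
        (subject, VOCAB + "start", start),
        (subject, VOCAB + "end", end),
        (subject, VOCAB + "label", label),
    ]


def to_ntriples(triples):
    """Serialise triples as N-Triples-style lines.

    Objects that look like http:// URIs are written as resources;
    everything else is written as a plain string literal.
    """
    lines = []
    for s, p, o in triples:
        if isinstance(o, str) and o.startswith("http://"):
            obj = f"<{o}>"
        else:
            escaped = str(o).replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# One annotation labelling characters 0-5 of a document as "NP".
triples = annotation_triples("doc1", "a1", 0, 5, "NP")
print(to_ntriples(triples))
```

Because each annotation in this sketch has its own URI, it could in
principle be fetched or updated individually over HTTP, which is the kind
of access the abstract argues for.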




