Archive for the 'Annotation' Category


Sparql Endpoint for Python WSGI

As part of DADA (and yes, that page is a bit out of date) I wanted to provide a Sparql endpoint to allow experimentation with querying the raw RDF annotation data. So far, we’ve built everything using Redland in Python, but it seems there is no existing Sparql endpoint implementation for this combination. The Sparql protocol document is long, but as far as I can tell the core of the protocol is a simple GET request carrying a URL-encoded Sparql query; results come back as raw XML in the Sparql query results format, or as RDF/XML if the return type is a graph. This proves to be very easy to implement on top of Redland, since its query operator returns exactly those result types.
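To make the request side concrete, here is a minimal sketch of how a client might build that GET request. The endpoint address is a made-up placeholder and `build_query_url` is a hypothetical helper name; the only part taken from the protocol is that the query text travels in a URL-encoded `query` parameter.

```python
from urllib.parse import urlencode


def build_query_url(endpoint, sparql):
    """Return the GET URL for a Sparql protocol query request."""
    # The query text is carried in a URL-encoded "query" parameter.
    return endpoint + "?" + urlencode({"query": sparql})


# Hypothetical endpoint address, used purely for illustration.
url = build_query_url(
    "http://example.org/sparql",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5",
)
print(url)
```

Fetching that URL with any HTTP client should then return the XML result document described above.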

So, I present SparqlEndpoint-0.1, a Python module that provides a WSGI-conformant implementation of a Sparql endpoint for Redland. It almost certainly doesn’t implement all of the protocol standard, and there is plenty of room for improvement, for example by making it independent of the RDF backend it queries (e.g. by using RDFLib).
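For readers who want a feel for the shape of such an endpoint, the following is a rough sketch of a backend-independent WSGI application. It is not the SparqlEndpoint-0.1 code: `make_endpoint`, `run_query` and `fake_run_query` are hypothetical names, and the backend is assumed to be a function that takes the query text and returns a flag saying whether the result is a graph, plus the already-serialised XML. It is written for Python 3 and uses only the standard library.

```python
from urllib.parse import parse_qs

# Media types for the two kinds of result mentioned above.
SELECT_TYPE = "application/sparql-results+xml"
GRAPH_TYPE = "application/rdf+xml"


def make_endpoint(run_query):
    """Build a WSGI app around run_query(sparql) -> (is_graph, xml_text).

    run_query is an assumed callback; a Redland or RDFLib backend
    would be wrapped to fit this signature.
    """
    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        queries = params.get("query")
        if environ.get("REQUEST_METHOD") != "GET" or not queries:
            body = b"Expected a GET request with a 'query' parameter.\n"
            start_response("400 Bad Request", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]

        is_graph, xml_text = run_query(queries[0])
        body = xml_text.encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", GRAPH_TYPE if is_graph else SELECT_TYPE),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app


def fake_run_query(sparql):
    """Stand-in backend: ignores the query and returns an empty shell."""
    xml = ('<?xml version="1.0"?>'
           '<sparql xmlns="http://www.w3.org/2005/sparql-results#"/>')
    return False, xml
```

An app built with `make_endpoint(fake_run_query)` can be served for local experiments with `wsgiref.simple_server.make_server` from the standard library. Keeping the backend behind a single callback is one way to get the independence from the RDF store mentioned above.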

I’m not putting up a demo endpoint just yet, as I’m having severe performance issues with my development server in combination with Redland. The triple store is growing rapidly into the millions of triples, and the result is huge latency (tens of minutes) for some queries. Given some recent discussion on the Redland list, I’m wondering whether a jump to one of the RDF-specific stores is the thing to do. This would probably mean rewriting my code in Java, but based on the Berlin Sparql Benchmark numbers, Sesame and Jena have the kind of performance I need (sub-second query response times on 100M triples).

Well, enough of that. If you are interested in SparqlEndpoint please download and take a look. If there is interest I’m happy to share it and host development somewhere accessible.

Posted on 21st August 2008
Under: Annotation, RDF

Version Control for RDF Triple Stores

RDF, the core data format for the Semantic Web, is increasingly being deployed both from automated sources and via human authoring either directly or through tools that generate RDF output. As individuals build up large amounts of RDF data and as groups begin to collaborate on authoring knowledge stores in RDF, the need for some kind of version management becomes apparent. While there are many version control systems available for program source code and even for XML data, the use of version control for RDF data is not a widely explored area. This paper examines an existing version control system for program source code, Darcs, which is grounded in a semi-formal theory of patches, and proposes an adaptation to directly manage versions of an RDF triple store.
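As a rough illustration of what a patch over a triple store could look like, here is a small sketch. It is an assumption made for illustration, not the formulation used in the paper: a patch is modelled as a set of triples to add and a set to remove, and, as in Darcs, every patch has an inverse that undoes it. The `Patch` class and the example triples are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """A change to a triple store: triples to add and triples to remove."""
    added: frozenset
    removed: frozenset

    def apply(self, store):
        """Return a new store with this patch applied."""
        # Refuse to apply if a removed triple is missing or an added
        # triple is already present, so that invert() is an exact undo.
        if not self.removed <= store or self.added & store:
            raise ValueError("patch does not apply cleanly to this store")
        return (store - self.removed) | self.added

    def invert(self):
        """Return the patch that undoes this one."""
        return Patch(added=self.removed, removed=self.added)


# Hypothetical triples, written as (subject, predicate, object) tuples.
old = ("ex:doc1", "ex:label", "draft")
new = ("ex:doc1", "ex:label", "final")

store = frozenset({old})
patch = Patch(added=frozenset({new}), removed=frozenset({old}))

updated = patch.apply(store)
restored = patch.invert().apply(updated)
```

Here `restored` holds the same triples as the original `store`, which is the basic property a version history built from patches relies on.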

Paper presented at ICSOFT 2007, Barcelona, Spain, July 2007. Download PDF

Posted on 3rd September 2007
Under: Annotation, Conferences, RDF, publication

PhD Scholarship in Semantic Web Technologies for Annotation

I have a PhD scholarship available for a project in applying Semantic Web technologies (RDF, Sparql, Annotea) to the Linguistic Annotation problem. Here’s an outline:

Shared collaborative distributed annotation using semantic web technologies.

The Semantic Web augments the current Web with machine-processable information enabling humans and machines to work in cooperation; in our context, we are using it as the basis of a linguistic annotation system that is used by language researchers to annotate language resources. This project will look at the issues raised when we allow many people to collaborate on authoring these annotations and making shared annotations available to a community of researchers. This crosses a number of existing areas of research including the semantic web and social computing, and extends the range of interactions available to researchers over the web.

Of course, as usual there is scope for variation on this theme. If you’re interested in this problem space and want to pursue a PhD in Australia, please get in touch. The scholarship is open to Australian and international students.

Posted on 30th March 2007
Under: Annotation, ProjectIdea

Annotation - Spoken Word Services

Annotation - Spoken Word Services is another project that provides web-based annotation of audio recordings, this time in a learning environment.

Posted on 29th May 2006
Under: Annotation, Speech

BBC Annotatable Audio

Tom Coates describes a currently internal BBC initiative to have everyone annotate audio content Flickr-style. This is a very cool application, and a real-world (as in non-academic) example of the kind of shared annotation we want to enable in our eResearch project. It’s probably what the people at Dart would like to enable, or at least that was my understanding of one of their goals.

Posted on 9th November 2005
Under: Annotation, Speech