Steve Cassidy

The Days Run Away…

Archive for the ‘Uncategorized’ Category

NIST Speaker Recognition Evaluations

Looking at the 2003 NIST SR evaluations: we’re too late to enter this year, but there is some useful data available. For example, the automatically generated word transcripts for some of their training data might be interesting to look at.

In their Rich Transcription track there is an interesting task which might be nice to attempt: metadata extraction. In the last evaluation it was just “Who Spoke When” annotation, but they intend to add more target metadata in future rounds. This is quite close to some of our goals for the Meeting Room Project.

Quote from the NIST TREC-9 SDR page:

The results of the TREC-9 2000 SDR evaluation presented at TREC on November 14, 2000 showed that retrieval performance for sites on their own recognizer transcripts was virtually the same as their performance on the human reference transcripts. Therefore, retrieval of excerpts from broadcast news using automatic speech recognition for transcription was deemed to be a solved problem - even with word error rates of 30%.

Gosh!

And there’s more… Transcripts of meeting room data are available, and they seem to be manual. The meeting content doesn’t seem too exciting :-) but it gives some idea of dialogue structure etc. in this kind of data.
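For context on the 30% figure in the TREC-9 quote above: word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. A minimal sketch of that calculation follows; the function name and the example sentences are my own and this is not NIST’s scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# Made-up example: 10 reference words, 2 substitutions and 1 deletion.
REFERENCE = "retrieval of excerpts from broadcast news is a solved problem"
HYPOTHESIS = "retrieval of experts from broadcast news is solved problems"
EXAMPLE_WER = word_error_rate(REFERENCE, HYPOTHESIS)
```

With these two sentences the rate works out to 3 errors over 10 reference words, i.e. the 30% level the quote refers to.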

Written by Steve Cassidy

May 8th, 2003 at 2:00 pm

Posted in Uncategorized

OSCOM

An article on Advogato describes the efforts of OSCOM to unify Open Source content management systems. It mentions Twingle:

Twingle, an OSCOM project to build a common CMS authoring tool … it helped incent servers to move WebDAV support up on the priority list. Since Twingle uses Mozilla’s RDF engine, it brought CMS attention to the Semantic Web.

which looks like it might be worth finding out about…

May 6th, 2003 at 2:00 pm

Tim Bray on Good Web Citizenship

Tim Bray talks eloquently about what Apple could do to make their iTunes Music Store (iTMS) service a good web citizen, including:

  • Don’t invent new URI schemes (Apple uses itms:) since the plumbing of the web doesn’t understand them
  • Don’t use text/xml as the MIME type for XML. This is something I wasn’t aware of. It seems that text/* is a license for a proxy to transcode the content, say from UTF-8 to US-ASCII. So the answer is to use application/xml instead, avoiding the whole mess.
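As a rough illustration of the second point, here is a minimal sketch of a WSGI application that labels its XML as application/xml. The application, the sample document and the `call_app` helper are all hypothetical, made up for this example; only the choice of Content-Type comes from the advice above.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical UTF-8 document containing a non-ASCII character.
XML_BODY = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<greeting>caf\u00e9</greeting>\n"
).encode("utf-8")


def xml_app(environ, start_response):
    """Serve the document as application/xml rather than text/xml."""
    headers = [
        ("Content-Type", "application/xml"),
        ("Content-Length", str(len(XML_BODY))),
    ]
    start_response("200 OK", headers)
    return [XML_BODY]


def call_app(app):
    """Invoke a WSGI app once with a default test environ and capture the reply."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


STATUS, HEADERS, BODY = call_app(xml_app)
```

The encoding is then carried by the XML declaration inside the document itself, and the bytes that arrive should still decode as UTF-8.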

Postscript: Apple also invented a new URI scheme (webcal:) for its web calendar service, e.g. you can get the Australian holidays in iCal format at webcal://ical.mac.com/ical/Australian32Holidays.ics. Here again, plain http also works. So Apple is using URI schemes as MIME-type indicators…
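Since plain http works for the same resource, a client that doesn’t know the webcal: scheme can simply swap the scheme. A small sketch, where the function name is mine and not part of any Apple API:

```python
from urllib.parse import urlsplit


def webcal_to_http(uri: str) -> str:
    """Rewrite a webcal: URI to the equivalent http: URI; leave others alone."""
    parts = urlsplit(uri)
    if parts.scheme != "webcal":
        return uri
    return parts._replace(scheme="http").geturl()


HTTP_URI = webcal_to_http("webcal://ical.mac.com/ical/Australian32Holidays.ics")
```

The information the custom scheme was carrying (“this is a calendar”) would then normally travel in the response’s Content-Type header, for instance text/calendar for iCalendar data.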

May 1st, 2003 at 2:00 pm

JXPath

JXPath is a Java API for traversing object graphs using an XPath-like syntax. They collapse the notion of axis down to only ‘child’, which really becomes ‘anything I can traverse to from here’.

Our generalisations of XPath go much further, though, and could achieve what JXPath does relatively easily with the appropriate axis definitions.

It makes an interesting point, though: a path-like language is a very useful tool for accessing parts of data structures in programs/scripts. This is how they’ve been used in relational databases in the past (i.e. making pointer chasing less verbose), so why not use them to locate data in more general data sources?
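To make the idea concrete, here is a toy sketch in Python (not JXPath’s actual API or syntax) where a single ‘child’ step means “whatever I can reach from here”: an attribute, a dictionary key or a list index. The classes and the `traverse` function are invented for illustration.

```python
from dataclasses import dataclass, field


def traverse(root, path):
    """Follow a slash-separated path through attributes, dict keys and list indices."""
    node = root
    for step in (s for s in path.split("/") if s):
        if isinstance(node, dict):
            node = node[step]           # child = dictionary entry
        elif isinstance(node, (list, tuple)):
            node = node[int(step)]      # child = positional element
        else:
            node = getattr(node, step)  # child = object attribute
    return node


# Hypothetical object graph.
@dataclass
class Author:
    name: str
    links: dict = field(default_factory=dict)


@dataclass
class Post:
    title: str
    author: Author
    comments: list = field(default_factory=list)


post = Post(
    title="JXPath",
    author=Author("A. Writer", {"home": "http://example.org/"}),
    comments=["first", "second"],
)

AUTHOR_NAME = traverse(post, "author/name")
SECOND_COMMENT = traverse(post, "comments/1")
HOME_LINK = traverse(post, "author/links/home")
```

Compared with `post.author.links["home"]`, the path is just a string, so it can live in a config file or a query, which is much of the appeal.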

March 12th, 2003 at 1:00 pm

RDF model vs. Syntax

Don Box’s Spoutlet:
My love affair with RDF began in 1999 when I had to prepare a tutorial on XML metadata formats for XTech. My RDF love affair was with the Model of RDF, mind you.

This touches on something I’ve been thinking about lately: the anti-RDF arguments by Dave Winer and others in the RSS world. To me RDF is simply the triple-based metadata model, however serialised. The arguments with the RDF/XML syntax aren’t really arguments with RDF itself. On the other hand, I’ve not yet done anything large scale with the RDF model, so we’ll see…
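A small sketch of what “the model, however serialised” means in practice: the same set of triples written out in two unrelated syntaxes and read back to an identical set. The subject URI and the simplified line format (no escaping, literal objects only) are assumptions made for the example; this is not a conforming RDF parser.

```python
import json
import re

# Hypothetical statements as (subject, predicate, object) triples.
TRIPLES = {
    ("http://example.org/post/1", "http://purl.org/dc/elements/1.1/title", "RDF model vs. Syntax"),
    ("http://example.org/post/1", "http://purl.org/dc/elements/1.1/creator", "A. Writer"),
}

LINE = re.compile(r'<([^>]*)> <([^>]*)> "(.*)" \.')


def to_lines(triples):
    """Simplified N-Triples-like text: one '<s> <p> "o" .' line per triple."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in sorted(triples))


def from_lines(text):
    """Parse the simplified line format back into a set of triples."""
    return {LINE.fullmatch(line).groups() for line in text.splitlines()}


def to_json(triples):
    """A completely different syntax for the same model: a JSON array of arrays."""
    return json.dumps(sorted(triples))


def from_json(text):
    return {tuple(item) for item in json.loads(text)}


AS_LINES = to_lines(TRIPLES)
AS_JSON = to_json(TRIPLES)
```

The two strings look nothing alike, but both parse back to the same set, which is the sense in which an objection to one syntax isn’t an objection to the model.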

November 28th, 2002 at 1:00 pm

Children’s Books

The International Children’s Digital Library has 200 children’s books scanned for public access. I can’t see it because of the Java requirement though :-(

November 28th, 2002 at 1:00 pm

REST

XML-RPC case study

There is a deep symmetry in being able to GET the same stuff you POST that should be exploited when possible. [Paul Prescod]
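One way to picture the symmetry in the quote is a toy in-memory store (not a real HTTP server) where POSTing a representation to a collection yields a URI, and GETting that URI returns exactly what was posted. The class and method names are hypothetical.

```python
import itertools


class ResourceStore:
    """Toy stand-in for a REST-style service: what you POST is what you GET."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def post(self, collection, content_type, body):
        """Store a new resource under the collection and return its URI."""
        uri = f"{collection.rstrip('/')}/{next(self._ids)}"
        self._resources[uri] = (content_type, body)
        return uri

    def get(self, uri):
        """Return the (content type, body) pair exactly as it was posted."""
        return self._resources[uri]


store = ResourceStore()
ENTRY = b"<entry><title>REST</title></entry>"
ENTRY_URI = store.post("/entries", "application/xml", ENTRY)
FETCHED = store.get(ENTRY_URI)
```

Because the format going in and coming out is the same, a client can fetch a resource, edit it and send it back without translating between a “request” format and a “response” format.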

November 27th, 2002 at 1:00 pm

PIMs

The Open Source Applications Foundation’s Vista prototype is another Outlook killer, perhaps interesting this time as it’s based on an RDF database underneath and is written in Python/Tkinter. Some nice ideas.

I’m getting annoyed with Evolution after only a few weeks of using it in anger; it’s too unstable and has quirks I can’t get used to. I mostly like the calendar and Palm integration, and the mail handler seems good but isn’t as polished as I’d like. Perhaps the next release will settle things a little.

November 7th, 2002 at 1:00 pm

Zero Install

Don Park’s Blog says the net needs zero-install extensible client platforms, which .NET and Java Web Start aren’t. Perhaps CANTCL can be something like that: along with starpacks, we could have a zero-install client application which can be extended and updated via web downloads, each of which is relatively small. Why does the Java machinery have to be so damn big?

November 7th, 2002 at 1:00 pm

Overlapping trees in XML

xmlhack: One tree isn’t enough talks about a couple of proposals for encoding overlapping trees.

LMNL is a non-XML markup language which allows for overlapping ranges. JITT (Just in time trees) is an XML-based system which can derive different trees by parsing the file differently. They have a page on Overlapping Hierarchies/Concurrent Markup.
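The underlying problem can be sketched with plain character ranges over a text (this is neither LMNL nor JITT syntax, and every name here is hypothetical): two ranges that overlap without one containing the other cannot both be elements of a single XML tree.

```python
from typing import NamedTuple


class Range(NamedTuple):
    name: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def span(name, text, fragment):
    """Build a Range covering the first occurrence of fragment in text."""
    start = text.index(fragment)
    return Range(name, start, start + len(fragment))


def crosses(a, b):
    """True when the ranges overlap but neither one contains the other."""
    overlap = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return overlap and not (a_in_b or b_in_a)


TEXT = "one two three four five six"
LINE_1 = span("line", TEXT, "one two three")
LINE_2 = span("line", TEXT, "four five six")
PHRASE = span("phrase", TEXT, "three four")  # straddles the line break

CROSSINGS = [
    (a.name, b.name)
    for a, b in [(LINE_1, LINE_2), (LINE_1, PHRASE), (LINE_2, PHRASE)]
    if crosses(a, b)
]
```

Here the phrase crosses both lines, so a tree can respect either the line structure or the phrase structure, but not both at once; that is the case range-based or multi-parse approaches are meant to handle.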

This looks like it deserves some more attention.

October 24th, 2002 at 2:00 pm