Steve Cassidy

The Days Run Away…

Archive for the ‘Speech’ Category

SCOPE

with one comment

So today I make my TV debut! A few weeks ago a film crew from Channel 10 came to shoot a segment for the CSIRO/Channel 10 kids’ science show SCOPE. The episode, on sound, airs today at 4pm.

I had great fun making the segment. I’ve never done anything like this before, and it’s amazing how much work goes into producing such a short piece. I can’t wait to see how it turns out.

If you watched the show and are interested in having a look at speech, you might want to download one of the programs I used in the show: the WaveSurfer tool (you want to get the Binary release for Windows from this page). WaveSurfer will let you record your voice and see spectrogram patterns like the ones I was looking at on the show (you’ll need a microphone for your computer; a cheap headset will do). To get a good-looking display, select “New” from the File menu and choose “Demonstration” when asked what configuration to use. Then press the red record button and speak into the microphone.

Here’s an experiment to try: record yourself saying “hid”, “hod”, “head”, “had”. Look at the spectrogram of each word and see if you can tell the difference. Look particularly for the brighter bands in the display: these are called formants and they’re different for every vowel sound.
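If you'd rather poke at the idea in code, here is a minimal Python sketch of what one vertical slice of a spectrogram measures. It is not how WaveSurfer works internally, and it does not use a real recording: the "vowels" are just sums of sine waves at made-up frequencies that stand in for formant bands. The function names, the 8000 Hz sample rate and the 400-sample frame are all assumptions chosen for the illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
FRAME_SIZE = 400    # 50 ms frame, so each frequency bin is 20 Hz wide


def synth_vowel(band_freqs, n_samples, sample_rate=SAMPLE_RATE):
    """Sum of sine waves standing in for the bright bands of a vowel."""
    return [
        sum(math.sin(2 * math.pi * f * n / sample_rate) for f in band_freqs)
        for n in range(n_samples)
    ]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2: one column of a spectrogram."""
    n = len(frame)
    # A Hann window tapers the frame edges to reduce spectral leakage.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def brightest_bands(frame, count, sample_rate=SAMPLE_RATE):
    """Frequencies (Hz) of the `count` strongest local peaks in the frame."""
    mags = magnitude_spectrum(frame)
    peaks = [
        k for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    strongest = sorted(peaks, key=lambda k: mags[k], reverse=True)[:count]
    return sorted(k * sample_rate / len(frame) for k in strongest)


if __name__ == "__main__":
    # Two hypothetical "vowels" with different band positions.
    for name, bands in [("vowel A", [500, 1500]), ("vowel B", [300, 2300])]:
        frame = synth_vowel(bands, FRAME_SIZE)
        print(name, brightest_bands(frame, 2))
```

A real spectrogram repeats this for many overlapping frames and draws the magnitudes as brightness, which is why a band that stays strong over time shows up as a bright horizontal stripe. Real speech is much messier than two sine waves, so picking formants out of a recording takes more care than this peak search.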

Another experiment: record two children and an adult saying the same word, for example “SCOPE”. Can you tell the difference between them? Which pair looks more similar: the two children, or one of the children and the adult?

Please leave a comment if you’ve seen the show!

Written by Steve Cassidy

August 28th, 2006 at 7:23 am

Posted in Speech

Transcribed Podcasts and Audio Books

without comments

Jon Udell is tagging some of his del.icio.us links to podcasts with transcriptavailable; the transcripts have been generated manually. This could be a nice source of data for experiments with information retrieval from podcasts.

Sort of relatedly, I just discovered LibriVox, which hosts volunteer recordings of out-of-copyright literary works (e.g. Project Gutenberg books). I sampled War of the Worlds and the quality seems great. Worth a browse.

Written by Steve Cassidy

August 23rd, 2006 at 8:28 pm

Posted in Blogging, Speech

Annotation - Spoken Word Services

without comments

Annotation - Spoken Word Services is another project that provides web-based annotation of audio recordings, this time in a learning environment.

Written by Steve Cassidy

May 29th, 2006 at 1:28 pm

Posted in Annotation, Speech

BBC Annotatable Audio

without comments

Tom Coates describes a currently internal BBC initiative to have everyone annotate audio content Flickr-style. This is a very cool application, and it's a real-world (as in non-academic) example of the kind of shared annotation we want to enable in our eResearch project. It’s probably what the people at Dart would like to enable, or at least that was my understanding of one of their goals.

Written by Steve Cassidy

November 9th, 2005 at 1:00 pm

Posted in Annotation, Speech

Augmenting Conversations Using Dual Purpose Speech

without comments

Augmenting Conversations Using Dual Purpose Speech. Kent Lyons, Christopher Skeels, Thad Starner, Cornelis M. Snoeck, Benjamin A. Wong, Daniel Ashbrook. College of Computing and GVU Center, Georgia Institute of Technology.

In this paper, we explore the concept of dual purpose speech: speech that is socially appropriate in the context of a human to human conversation which also provides meaningful input to a computer. We motivate the use of dual purpose speech and explore issues of privacy and technological challenges related to mobile speech recognition. We present three applications that utilize dual purpose speech to assist a user in conversational tasks: the Calendar Navigator Agent, DialogTabs, and Speech Courier. The Calendar Navigator Agent navigates a user's calendar based on socially appropriate speech used while scheduling appointments. DialogTabs allows a user to postpone cognitive processing of conversational material by providing short term capture of transient information. Finally, Speech Courier allows asynchronous delivery of relevant conversational information to a third party.

Written by Steve Cassidy

October 3rd, 2004 at 2:00 pm

Posted in Speech

SpeechBot

without comments

SpeechBot is a search engine for audio and video content that is hosted and played from other websites. Recordings are indexed by running speech recognition on the audio. It's a very interesting experiment that seems to work for some queries; the transcripts are noisy, but a term that is repeated often in a piece will clearly get indexed well. My search for ancient history troy returned a few relevant results but many false hits, because troy is an easily inserted token.

Written by Steve Cassidy

May 24th, 2004 at 2:00 pm

Posted in Speech