Archive for the ‘Speech’ Category
SCOPE
So today I make my TV debut! A few weeks ago a film crew from Channel 10 came to shoot a segment for the CSIRO/Channel 10 kids science show SCOPE. The episode, on sound, airs today at 4pm.
I had great fun making the segment. I’ve never done anything like this before, and it’s amazing how much work goes into producing such a short piece. I can’t wait to see how it turns out.
If you watched the show and are interested in having a look at speech, you might want to download one of the programs I used in the show: the WaveSurfer tool (you want the binary release for Windows from this page). WaveSurfer will let you record your voice and see spectrogram patterns like the ones I was looking at on the show (you’ll need a microphone for your computer; a cheap headset will do). To get a good-looking display, select “New” from the File menu and choose “Demonstration” when asked what configuration to use. Then press the red record button and speak into the microphone.
Here’s an experiment to try: record yourself saying “hid”, “hod”, “head”, “had”. Look at the spectrogram of each word and see if you can tell the difference. Look particularly for the brighter bands in the display: these are called formants, and they’re different for every vowel sound.
Another experiment: record two children and an adult saying the same word, for example “SCOPE”. Can you tell the difference between them? Which pair looks more similar: the two children’s voices, or one of the children and the adult?
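If you are curious what a spectrogram display is computing, here is a rough Python sketch. It is not how WaveSurfer does it; it is a toy version that chops a signal into short frames and measures how much energy each frame has at each frequency. The "vowels" are just pairs of pure tones at frequencies picked for illustration (assumptions of this sketch, not measured formant values), and the sample rate, frame size and function names are likewise made up for the example.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
FRAME_SIZE = 256    # samples per analysis frame (assumed)


def synth_vowel(freqs, duration=0.25):
    """Sum of sine waves at the given frequencies: a crude stand-in for a vowel."""
    n = int(SAMPLE_RATE * duration)
    return [sum(math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f in freqs)
            for t in range(n)]


def frame_spectrum(frame):
    """Magnitudes of the DFT of one Hann-windowed frame (lower half of the bins)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2)]


def spectrogram(signal):
    """One spectrum per non-overlapping frame; each is a column of the display."""
    return [frame_spectrum(signal[start:start + FRAME_SIZE])
            for start in range(0, len(signal) - FRAME_SIZE + 1, FRAME_SIZE)]


def strongest_bands(spectrum, count=2):
    """Frequencies in Hz of the `count` largest local peaks in one spectrum."""
    peaks = [k for k in range(1, len(spectrum) - 1)
             if spectrum[k] > spectrum[k - 1] and spectrum[k] >= spectrum[k + 1]]
    peaks.sort(key=lambda k: spectrum[k], reverse=True)
    return sorted(k * SAMPLE_RATE / FRAME_SIZE for k in peaks[:count])


# Two made-up "vowels" whose bright bands sit at different frequencies.
vowel_a = spectrogram(synth_vowel([500, 1500]))
vowel_b = spectrogram(synth_vowel([750, 1250]))
print(strongest_bands(vowel_a[0]))
print(strongest_bands(vowel_b[0]))
```

The two printed lists differ because the bright bands sit at different frequencies, which is the same kind of difference you are looking for by eye in the first experiment. Real speech is much messier than two pure tones, so treat this only as a picture of the idea.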
Please leave a comment if you’ve seen the show!
Transcribed Podcasts and Audio Books
John Udell is tagging some of his del.icio.us links to podcasts with transcriptavailable; the transcripts have been generated manually. This could be a nice source of data for experiments with information retrieval from podcasts.
Sort of relatedly, I just discovered LibriVox, which hosts volunteer recordings of out-of-copyright literary works (e.g. Project Gutenberg books). I sampled War of the Worlds and the quality seems great. Worth a browse.
Annotation - Spoken Word Services
Annotation - Spoken Word Services is another project providing web-based annotation of audio recordings, this time in a learning environment.
BBC Annotatable Audio
Tom Coates describes a currently internal BBC initiative to have everyone annotate audio content Flickr-style. This is a very cool application, and a real-world (as in non-academic) example of the kind of shared annotation we want to enable in our eResearch project. It’s probably what the people at Dart would like to enable, or at least that was my understanding of one of their goals.
Augmenting Conversations Using Dual Purpose Speech
Augmenting Conversations Using Dual Purpose Speech. Kent Lyons, Christopher Skeels, Thad Starner, Cornelis M. Snoeck, Benjamin A. Wong, Daniel Ashbrook. College of Computing and GVU Center, Georgia Institute of Technology.
In this paper, we explore the concept of dual purpose speech: speech that is socially appropriate in the context of a human to human conversation which also provides meaningful input to a computer. We motivate the use of dual purpose speech and explore issues of privacy and technological challenges related to mobile speech recognition. We present three applications that utilize dual purpose speech to assist a user in conversational tasks: the Calendar Navigator Agent, DialogTabs, and Speech Courier. The Calendar Navigator Agent navigates a user’s calendar based on socially appropriate speech used while scheduling appointments. DialogTabs allows a user to postpone cognitive processing of conversational material by providing short term capture of transient information. Finally, Speech Courier allows asynchronous delivery of relevant conversational information to a third party.
SpeechBot
SpeechBot is a search engine for audio and video content that is hosted and played from other websites. Recordings are indexed by running speech recognition on the audio. A very interesting experiment which seems to work for some queries; the transcripts are noisy, but a term that is repeated often in a piece will clearly get indexed well. My search for “ancient history troy” returned a few relevant results but many false hits, because “troy” is an easily inserted token.
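To illustrate why repeated terms survive noisy transcripts while one-off insertions cause false hits, here is a small Python sketch. It is not SpeechBot's actual method, which I don't know; it is a toy inverted index over made-up transcripts, and the function names, the sample text and the `min_count` threshold are all assumptions for the example.

```python
from collections import Counter, defaultdict


def build_index(transcripts):
    """Map each term to a Counter of {recording_id: number of occurrences}."""
    index = defaultdict(Counter)
    for rec_id, text in transcripts.items():
        for term in text.lower().split():
            index[term][rec_id] += 1
    return index


def search(index, query, min_count=1):
    """Rank recordings by the total number of occurrences of the query terms.

    A term only counts for a recording if it occurs at least `min_count`
    times there, so raising `min_count` ignores one-off occurrences.
    """
    scores = Counter()
    for term in query.lower().split():
        for rec_id, count in index.get(term, {}).items():
            if count >= min_count:
                scores[rec_id] += count
    return [rec_id for rec_id, _ in scores.most_common()]


# Made-up transcripts. The "troy" in the weather report stands in for a
# word the recogniser inserted by mistake.
transcripts = {
    "history_lecture": "the siege of troy troy fell after ten years troy ancient history",
    "weather_report": "rain will troy ease by the evening",
}
index = build_index(transcripts)

print(search(index, "ancient history troy"))
print(search(index, "ancient history troy", min_count=2))
```

With the default threshold both recordings come back, the weather report being the false hit. Requiring a term to appear at least twice drops it, at the cost of also ignoring terms that were genuinely spoken only once, which is the trade-off any such filter would have to make.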
