THE VOICE IN THE MACHINE

August 8, 2012

Demystifying Speech Recognition … but not too much

I really appreciate when people try to give a simplified view of technology with the goal to let the general public understand what’s behind the hood, and how complex is, oftentimes, to make things works properly. That is the goal I had in mind when I embarked on the project of writing The Voice in the Machine. However, I believe,…
Read more →
July 29, 2012

The hard job of getting meanings

If I had to chose one of the areas of human-machine natural communication where we haven’t ben able to make any significant stride during the past decades, I would choose “general” language understanding. Don’t get me wrong. Language understanding per se has made huge steps ahead. IBM Watson‘s victory over Jeopardy! human champions is a testimony of…
Read more →
July 17, 2012

Apples and Oranges

There is a lot of talking about the performance of Apple’s Siri. An article appeared on the New York Times a few days ago brutally destroying Siri from the point of view of its performance, and others compare it with Google Voice Search. As a professional in the field, having followed Google Voice Search closely, knowing well the people who work…
Read more →
July 14, 2012

Singing computers

Building a computer that speaks with the same naturalness and intelligibility of humans is not a much easier task than building a computer that understand speech. In fact it took decades to reach the quality of modern speech synthesizer, and yet the superiority of real human voice is still unbeatable. Still today, whenever possible, automated spoken dialog systems on the phone…
Read more →
July 5, 2012

Put that there!

One of the first multimodal interaction systems, dubbed Put-that-there, was built at the MIT Architecture Machine lab in the late 1970s by Chris Schmandt, who is now the director of the Speech and Mobility Group at MIT Media Labs. Here is a demo from 1979, where you see the integration of speech and gesture recognition to…
Read more →
July 3, 2012

What’s wrong with speech recognition?

Although speech recognition is getting better and better, it keeps making mistakes that often annoy us. Much more than humans would do in similar situations. And we have been trying to make it better for decades. What’s wrong with it? Scientists are constantly testing and trying to improve speech recognition in adverse conditions, such as…
Read more →
June 26, 2012

An ante-litteram vision of Siri

Some of you may remember the Apple’s knowledge navigator video posted here. It was released in 1987 as a vision of a future tight integration of advanced technologies on a flat tablet computer. The computer in this entertaining vision clip sports a fancy Web-like interface, a realistic avatar, touch screen with gesture recognition, teleconferencing, speech…
Read more →
June 18, 2012

The mythical 10 years

Speech recognition is one of those technologies who have been around for a while, but have never become mature enough to be considered established and part of everyday’s life like, instead, digital cameras, retina displays, and bluetooth. However, for a few years now speech recognition technology has be “sort-of” working so as some of us…
Read more →
June 18, 2012

Can machines really think?

Can machines really think? This question has been haunting machine intelligence experts and philosopher way before today’s Siri’s speech understanding, and IBM Watson’s Jeopardy! question answering challenge. Even though today we are not thinking anymore about whether machines can think or not, the question was quite popular in the heydays of Artificial Intelligence, the cold war, and the pioneers of computer…
Read more →
June 16, 2012

The Voice in the Machine is Alive

Hello there. A couple of months after the publication of my book, The Voice in the Machine: Building Computers that Understand Speech, I decided to start a new blog on the same theme. Having the fortune of working at ICSI, the International Computer Science Institute, one of the few independent advanced research places where computer speech, language, and AI…
Read more →

Instagram / TikTok / X

Designed with WordPress