About

About this blog

A couple of months after the publication of my book, The Voice in the Machine: Building Computers that Understand Speech, I decided to start a new blog on the same theme. I have the fortune of working at ICSI, the International Computer Science Institute, one of the few independent advanced research places where computer speech, language, and AI research is still vibrant and unfettered. Since computers that understand speech are one of my lifelong loves, I will use this blog to collect my notes and my points of view on the evolution of the science and technology of what I like to call “talking machines”. Having the additional fortune of having worked on both sides of the research-industry chasm for more than three decades, I will do my best to put into words my presumably unbiased point of view on the related science, technology, and business.

About me, Roberto Pieraccini

Since January 2012 I have been the director of the International Computer Science Institute (ICSI) in Berkeley, CA, an independent research institution affiliated with the University of California at Berkeley. ICSI includes, among its research staff, world-class scientists in a wide range of computer science disciplines, such as internet networking and security, computer speech and vision, advanced computer architectures, neurosciences and bio-informatics.

I have been in speech technology research and business for more than 30 years. Prior to joining ICSI, I was the Chief Technology Officer at SpeechCycle (acquired by Synchronoss in May 2012), a company specializing in advanced spoken human-machine interaction systems for enterprise customer care (yes, those annoying “please tell me the reason you are calling about” computers that prevent you from talking to human operators when you need them). With the desire to make those annoying computers better, I led an effort at SpeechCycle to develop new technology that helped those computers learn from their own mistakes and, hopefully, improve. All of that was based on statistics, data, and machine learning.

Before SpeechCycle, around 2003-2005, I managed a speech research team at IBM T.J. Watson Research in Yorktown Heights, NY. Prior to that, between 1999 and 2003, I was at SpeechWorks International, which is now known as Nuance, today’s largest computer speech company worldwide.

The turning point in my computer speech research career came in 1988, when I joined Bell Labs (part of which later morphed into AT&T Labs, after the AT&T trivestiture in 1996), where I worked with some of the most influential scientists in computer speech, such as Larry Rabiner. I arrived at Bell Laboratories from Italy, where in the 1980s I was a researcher at CSELT, the laboratories of the national Italian telephone company.

Over all this time, I have written, as author or co-author, about 150 scientific papers and articles in the fields of speech recognition, spoken language understanding and dialog, multimodal interaction, and machine learning. I am best known for my original contributions to statistical methods for spoken language understanding and machine learning for spoken dialog systems.

My book “The Voice in the Machine” on the history of computer speech understanding technology, published by MIT Press, tells the story of 60 years of computer speech technology in a way that is accessible to general scientific readers.