Singing computers

Building a computer that speaks with the same naturalness and intelligibility as a human is not much easier than building a computer that understands speech. In fact, it took decades to reach the quality of modern speech synthesizers, and the real human voice is still superior. Even today, whenever possible, automated spoken dialog systems on the phone use prompts recorded by voice talents carefully coached by voice user interface designers, rather than speech synthesized by a computer. Having said that, speech synthesis and text-to-speech technology have come a long way since their first attempts, and they are used in many commercial applications, including the navigator in your car.

One thing that may not be obvious is that making a computer sing is, in a way, easier than making it speak (for humans it is the other way around: speaking is easier than singing). The reason is that speech synthesizers, in order to produce natural-sounding speech, have to give it the right intonation and rhythm, which depend on many factors, such as the structure of the sentences, their general meaning, the specific message to convey to the listener, the context of the whole text, and so forth. Generating the right intonation and rhythm automatically from written text alone is not an easy feat. We have developed algorithms to do that, but our algorithms, though much better than what we had decades ago, are not perfect yet. In singing, by contrast, the intonation and rhythm are prescribed exactly by the score; it is enough to follow the music. So a synthetic singing voice may end up sounding more natural than the corresponding speaking voice.
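To make the point concrete, here is a minimal Python sketch of how a score fixes pitch and timing with no guesswork. The score format (pairs of MIDI note number and length in beats), the function names, and the example notes are assumptions made up for illustration; they are not taken from any real synthesizer. The only formula used is the standard equal-tempered tuning, where A4 (MIDI note 69) is 440 Hz.

```python
def midi_to_hz(note: int) -> float:
    """Equal-tempered pitch: MIDI note 69 (A4) is 440 Hz, 12 notes per octave."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


def score_to_targets(score, tempo_bpm):
    """Turn (midi_note, beats) pairs into (start_s, duration_s, pitch_hz) targets.

    Hypothetical helper: the score alone determines when each note starts,
    how long it lasts, and at which pitch it is sung.
    """
    seconds_per_beat = 60.0 / tempo_bpm
    targets = []
    start = 0.0
    for note, beats in score:
        duration = beats * seconds_per_beat
        targets.append((start, duration, midi_to_hz(note)))
        start += duration
    return targets


if __name__ == "__main__":
    # An arbitrary four-note phrase, not a transcription of any real song.
    phrase = [(69, 1.0), (72, 0.5), (76, 0.5), (69, 2.0)]
    for start, duration, hz in score_to_targets(phrase, tempo_bpm=120):
        print(f"start {start:5.2f} s  length {duration:4.2f} s  pitch {hz:7.2f} Hz")
```

A speech synthesizer has no such table to read from: it has to predict the equivalent of every row from the text itself.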

In one of the most dramatic scenes of the classic 1968 sci-fi movie “2001: A Space Odyssey,” HAL 9000, the villain computer, sings a tune while it is being deactivated by astronaut David Bowman. The tune is Daisy Bell.

Apparently Stanley Kubrick and Arthur C. Clarke visited the most prestigious technology research centers of the time to get ideas for the movie before they started production. Certainly they visited Bell Labs and heard the first singing computer, created in 1961 by computer music pioneer Max Mathews. And certainly they heard that computer perform one of its favorite songs: Daisy Bell.

~ by Roberto Pieraccini on July 14, 2012.

One Response to “Singing computers”

  1. Thanks to Federico Stano, I got this historic recording of MUSA, the MUltichannel Speaking Automaton built at CSELT (the research center of the Italian telephony company in the 1970s and 1980s). It appeared as a plastic record in a magazine in 1978. I got it then, and listened to it in awe. It is also thanks to this that I got into a speech technology career.
