Microsoft Claims Interpretation Breakthrough

Rashid Demonstrates Interpretation

Microsoft’s chief research officer, Rick Rashid, has announced that the software giant hopes to have “systems that can completely break down language barriers” within the next few years. In a video demonstration, Rashid spoke in English and was then echoed, in his own voice, by a Mandarin Chinese translation.

Microsoft has been working on the core speech-recognition technology, which it calls Deep Neural Net (DNN) translation, for the last couple of years, and it already offers it as a commercial service called inCus. However, as Rashid explained in a blog post, the company has now taken the system a step further.

Rashid wrote the post, he said, due to interest in a speech he gave at Microsoft Research Asia’s 21st Century Computing event. In that speech, Rashid’s words were simultaneously interpreted into Mandarin, with the translation relayed in a simulation of his own voice.

“The first [step] takes my words and finds the Chinese equivalents, and while non-trivial, this is the easy part,” Rashid wrote. “The second reorders the words to be appropriate for Chinese, an important step for correct translation between languages. Of course, there are still likely to be errors in both the English text and the translation into Chinese, and the results can sometimes be humorous. Still, the technology has developed to be quite useful.”

For the final, text-to-speech leg of the process, Microsoft had to record a few hours of a native Chinese speaker’s speech, and around an hour of Rashid’s own voice.
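The two translation steps Rashid describes, word lookup followed by reordering, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the four-word lexicon, the single hand-written reordering rule and the function names are hypothetical, and Microsoft's actual system is statistical rather than rule-based. Speech recognition and text-to-speech are left out entirely.

```python
# Toy illustration only. The lexicon and the reordering rule are hypothetical
# stand-ins; they are not Microsoft's models or data.
LEXICON = {"i": "我", "eat": "吃", "at": "在", "home": "家"}
LOCATION_PREPOSITIONS = {"在"}


def translate_words(english_words):
    """Step 1: replace each English word with a Chinese equivalent."""
    # Unknown words are passed through unchanged.
    return [LEXICON.get(word.lower(), word) for word in english_words]


def reorder_for_chinese(words):
    """Step 2: move a trailing location phrase in front of the verb.

    Handles only the pattern subject, verb, preposition, noun.
    """
    if len(words) == 4 and words[2] in LOCATION_PREPOSITIONS:
        subject, verb, preposition, noun = words
        return [subject, preposition, noun, verb]
    return words


def interpret(recognized_text):
    """Run both steps on text that a speech recogniser has already produced."""
    words = recognized_text.split()
    return "".join(reorder_for_chinese(translate_words(words)))


print(interpret("I eat at home"))
```

Word-for-word lookup alone would keep the English order (我吃在家); the reordering step places the location phrase before the verb, which is the usual Mandarin word order.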

All of the common speech-recognition and automatic translation systems are based on a statistical technique known as hidden Markov modeling, which has an error rate of between 20 and 25%. According to Rashid, the new DNN technique reduces that rate to about 14 to 18%.

“This means that rather than having one word in four or five incorrect, now the error rate is one word in seven or eight,” he wrote. “While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modelling in 1979, and as we add more data to the training we believe that we will get even better results.”
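The "one word in N" phrasing is simply the reciprocal of the error rate. As a rough check of the figures quoted in this article, the short sketch below converts each percentage and computes the relative improvement. The function names are illustrative, and the pairing of the low and high ends of each range is an assumption.

```python
def words_per_error(error_rate):
    """Average number of words spoken for each incorrect word."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be a fraction in (0, 1]")
    return 1 / error_rate


def relative_reduction(old_rate, new_rate):
    """Fraction by which the error rate falls, relative to the old rate."""
    return (old_rate - new_rate) / old_rate


# Error rates as reported in the article, expressed as fractions.
reported = [
    ("hidden Markov, low end", 0.20),
    ("hidden Markov, high end", 0.25),
    ("DNN, low end", 0.14),
    ("DNN, high end", 0.18),
]

for label, rate in reported:
    print(f"{label}: {rate:.0%} is about one word in {words_per_error(rate):.1f}")

# Assumption: compare low end with low end and high end with high end.
print(f"relative reduction, low ends: {relative_reduction(0.20, 0.14):.0%}")
print(f"relative reduction, high ends: {relative_reduction(0.25, 0.18):.0%}")
```

On these numbers, 20 to 25% corresponds to one word in five or four, while 14% is roughly one word in seven and 18% falls between one in five and one in six, so the "seven or eight" in the quote lines up with the lower end of the reported DNN range.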

“The results are still not perfect, and there is still much work to be done, but the technology is very promising, and we hope that in a few years we will have systems that can completely break down language barriers,” Rashid added.

