Last week we saw the release of Google Assistant, a voice-controlled assistant for mobile phones. While these tech super-giants are in heated competition to corner the voice-assist market, Siri has the advantage of being able to speak 21 languages localized for 36 countries. The latest, Shanghainese, is a variety of Wu Chinese spoken in the central districts of the City of Shanghai and its surrounding areas. This Sino-Tibetan language, like many other Wu varieties, is not readily understood outside the Wu region by speakers of other Chinese varieties such as Mandarin.
It is a complicated matter for a voice-assist program to communicate effectively with speakers of different languages. One major hurdle is that many words do not translate directly from one language to another, so voice-assist programs must be able to pick up on words and phrases specific to each language.
Apple tries to combat this by bringing in humans to read passages in a range of accents and dialects, which are then transcribed by hand so that the computer has an exact representation of the spoken text to learn from. Apple speech team leader Alex Acero told Reuters that Apple also captures a small percentage of the audio recordings from these sessions and anonymizes them. These recordings are far from perfect, complete with background noise and human speech errors such as mumbling. Other participants then transcribe the recordings, a process that helps cut the speech recognition error rate in half. Once enough data and recordings have been gathered and a voice actor has recorded the new voice for Siri, Apple estimates what the most common questions will be; after release, Siri can gather user data to see what real-world users are frequently asking.
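The "error rate" mentioned above is conventionally measured as word error rate (WER): the edit distance (substitutions, insertions, and deletions) between the system's transcript and the human reference, divided by the length of the reference. The sketch below is purely illustrative of that metric; it is not Apple's code, and the function name is an assumption for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Computed with the classic Levenshtein dynamic-programming table
    over words rather than characters.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, if the system hears "the cat sat" but transcribes only "the cat", that single missed word gives a WER of 1/3; halving the error rate, as the article describes, would mean half as many such mistakes per reference word.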
All of this adds up to a still far-from-perfect, yet quite functional voice-assist program that can reach thousands of new users thanks to the inclusion of a new language.