Chinese tech giant Baidu has surpassed both Microsoft and Google in AI language understanding. The company, which is sometimes referred to as China’s Google, achieved the highest-ever score on the General Language Understanding Evaluation (GLUE), which is widely considered the benchmark for AI language understanding. It consists of nine different tests for things like picking out the names of people in a sentence and figuring out what a pronoun like “it” refers to when there are multiple potential options. The average person scores about 87 points out of a hundred on the GLUE scale; Baidu is the first to score over 90. The company used its own AI language model, called ERNIE (which stands for “Enhanced Representation through kNowledge IntEgration”).
According to Karen Hao of MIT Technology Review, what’s notable about Baidu’s achievement is that it illustrates how AI research benefits from a diversity of contributors. Baidu’s researchers developed a technique for ERNIE that was designed specifically for Chinese. It turned out, however, to make the model better at understanding English as well.
Baidu’s ERNIE was modeled after Google’s BERT (Bidirectional Encoder Representations from Transformers), which was created in 2018. Before BERT, natural language models were much less capable: they could predict words for applications like autocomplete but couldn’t sustain a train of thought. BERT considers the context before and after a word at once, which makes it bidirectional and able to put each word in its complete context.
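To make the idea of bidirectional context concrete, here is a minimal Python sketch. It is only an illustration, not BERT's actual code: the function names, the example sentence, and the "[MASK]" placeholder are assumptions chosen for this example, and it assumes the earlier, one-directional models saw only the words to the left of the one being predicted.

```python
def left_to_right_context(tokens, i):
    """Words visible to a hypothetical one-directional predictor for position i."""
    return tokens[:i]


def bidirectional_context(tokens, i, mask="[MASK]"):
    """Words visible on both sides of position i, with the target word hidden."""
    return tokens[:i] + [mask] + tokens[i + 1:]


# Illustrative sentence; "bank" is ambiguous until the later words are seen.
sentence = "the bank raised interest rates again".split()
target = 1  # position of "bank"

print(left_to_right_context(sentence, target))
print(bidirectional_context(sentence, target))
```

With only the left-hand context, the word "bank" has nothing but "the" to go on. With both sides visible, "raised interest rates" points to the financial meaning.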
The Baidu researchers took this idea further and trained ERNIE to predict sets of missing words, which is essential for understanding Chinese, in which individual words rarely work alone. While BERT specialized in predicting individual words, ERNIE was able to predict phrases. This ability carried over well into English, where the model could likewise predict entire sets of words. Like Chinese, English has words whose meanings depend on their context.
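The difference between hiding one word and hiding a whole phrase can be sketched in a few lines of Python. This is a simplified illustration rather than Baidu's implementation: the helper names, the "[MASK]" placeholder, and the example sentence are assumptions, and the phrase boundaries are picked by hand here instead of being detected automatically.

```python
MASK = "[MASK]"  # placeholder for a hidden word (an assumption for this sketch)


def mask_word(tokens, index):
    """Hide a single word, as in word-level prediction."""
    return [MASK if i == index else tok for i, tok in enumerate(tokens)]


def mask_span(tokens, start, end):
    """Hide every word in tokens[start:end], as in phrase-level prediction."""
    return [MASK if start <= i < end else tok for i, tok in enumerate(tokens)]


tokens = "Harry Potter is a series of fantasy novels".split()

word_masked = mask_word(tokens, 1)       # hides only "Potter"
phrase_masked = mask_span(tokens, 0, 2)  # hides the whole name "Harry Potter"

print(" ".join(word_masked))
print(" ".join(phrase_masked))
```

In the first case the model could guess "Potter" just from seeing "Harry" next to it. In the second it has to work out the whole name from the rest of the sentence, which is a harder task that depends more on meaning.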
“When we first started this work, we were thinking specifically about certain characteristics of the Chinese language,” says Hao Tian, the chief architect of Baidu Research. “But we quickly discovered that it was applicable beyond that.”