Imagine you’re training a computer with a solid vocabulary and a basic knowledge of parts of speech. How would it understand this sentence: “The chef who ran to the store was out of food”?
Did the chef run out of food? Did the store? Did the chef run the store that ran out of food?
Most human English speakers will instantly come up with the right answer, but even advanced artificial intelligence systems can get confused. After all, part of the sentence literally says that “the store was out of food.”
Advanced new machine learning models have made enormous progress on these problems, mainly by training on huge datasets, or “treebanks,” of sentences that humans have hand-labeled to teach grammar, syntax and other linguistic principles.
The problem is that treebanks are expensive and labor-intensive to build, and computers still struggle with many ambiguities. The same collection of words can have very different meanings, depending on the sentence structure and context.
But a pair of new studies by artificial intelligence researchers at Stanford find that advanced AI systems can figure out linguistic principles on their own, without first practicing on sentences that humans have labeled for them. That is much closer to how human children learn languages, long before adults teach them grammar or syntax.
Even more surprising, the researchers found that the AI model appears to infer “universal” grammatical relationships that apply to many different languages.
That has big implications for natural language processing, which is increasingly central to AI systems that answer questions, translate languages, help customers and even review resumes. It could also facilitate systems that learn languages spoken by very small numbers of people.
The key to success? It appears that machines learn a lot about language just by playing billions of fill-in-the-blank games reminiscent of “Mad Libs.” To get better at predicting the missing words, the systems gradually build their own models of how words relate to each other.
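To make the fill-in-the-blank idea concrete, here is a deliberately tiny sketch in Python. It is not the researchers' model: the systems described here are large neural networks trained on billions of words, while this toy simply counts which word appears between each pair of neighbors in a made-up five-sentence corpus. The names `CORPUS`, `train`, `fill_blank` and `MODEL` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only);
# real systems train on billions of words.
CORPUS = [
    "the chef ran to the store",
    "the chef cooked the food",
    "the dog ran to the park",
    "the chef was out of food",
    "the store was out of milk",
]


def train(sentences):
    """Count which word fills each (left neighbor, right neighbor) blank."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with boundary markers so the first and last words have two neighbors.
        words = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(words) - 1):
            counts[(words[i - 1], words[i + 1])][words[i]] += 1
    return counts


def fill_blank(counts, left, right):
    """Return the word seen most often between `left` and `right`, or None."""
    candidates = counts.get((left, right))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


MODEL = train(CORPUS)
```

With this corpus, `fill_blank(MODEL, "ran", "the")` returns `"to"`, and a context the toy has never seen returns `None`. Even at this scale, guessing the blank forces the program to record something about which words go together; the large models take the same game much further, using the whole sentence rather than just two neighbors.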
“As these models get bigger and more flexible, it turns out that they actually self-organize to discover and learn the structure of human language,” says Christopher Manning, the Thomas M. Siebel Professor in Machine Learning and professor of linguistics and of computer science at Stanford, and an associate director of Stanford’s Institute for Human-Centered Artificial Intelligence (HAI). “It’s similar to what a human child does.”
Read more: Tech Xplore