The Center for Data Innovation spoke with Brenden Lake, Moore-Sloan Data Science Fellow at New York University’s Center for Data Science. Lake discussed his recent research to develop a machine learning algorithm that learns in a similar manner to the human brain, as well as how principles from his research could help pave the way for new machine learning techniques that might eventually achieve the same capacity to learn as humans.
Joshua New: You recently published a research paper called “Human-level concept learning through probabilistic program induction” that made headlines. In layman’s terms, could you explain this research?
Brenden Lake: Machine learning has made remarkable advances in recent years, yet for most types of cognitively natural concepts, people are still better learners than machines. A child may need just a few examples to learn new words like “hairbrush” or “pineapple,” or an adult may need just one example to learn a new word like “Segway,” while the best computer algorithms require tens or hundreds of examples. People also use their concepts in far richer ways than conventional machine learning systems, allowing them to go beyond just recognizing new examples. For instance, people can sketch a novel instance of a Segway or imagine a novel vehicle inspired by the Segway.
In our recent paper, my coauthors and I studied how people learn simple visual concepts—that is, handwritten characters collected from alphabets around the world. This domain provides a large number of cognitively natural concepts for comparing human and machine learning, yet these concepts are simple enough that it is possible to reverse engineer how people learn. Guided by psychological data, we developed a new algorithm that learns to represent these visual concepts as simple programs, or structured procedures for generating new examples of that concept. While most machine learning algorithms treat concepts as patterns and learning as a process of finding and recognizing patterns, our algorithm treats concepts as simple models of the world, and learning as a process of building models.
We found that the algorithm can perform a range of tasks, such as classification, generating new examples of a concept, and generating new concepts, in ways that are difficult to distinguish from human behavior.
New: What are the real world applications of computer systems adept at recognizing handwritten characters?
Lake: There are many real-world applications of character recognition. The post office uses machine learning to read addresses and zip codes on envelopes, and banks use machine learning to read the value of a check. While these systems are very accurate and impressive pattern recognizers, they typically see hundreds or thousands of examples of each digit or each letter before achieving high accuracy. In contrast, people can learn a new letter from a foreign alphabet from just a single example, suggesting that they are doing something different. In the paper, we developed a computational model with similar learning capabilities, with the aim of finding insights that generalize to other domains.
New: A lot of the press about your research implied that the system you developed could give computers the same capacity to learn as humans. Is this an oversimplification, or does your research really pave the way for software that can learn as quickly and as well as humans?
Lake: I think we made some interesting progress. Our paper shows how to capture, in computational terms, a range of human abilities in this domain of simple visual concepts. An important limitation is that the current algorithm only works for handwritten characters, and even with characters, people can see additional structure that the model misses. Nonetheless, the model embodies three principles that were important for its performance and can be applied more broadly: compositionality, causality, and learning-to-learn. Compositionality is the old idea that representations are built up from simpler primitives (characters are composed of pen strokes, or cars are composed of parts). Causality means that learning is about discovering aspects of the real causal process that generates examples of a concept. Last, learning-to-learn means that previous learning from related concepts is used to accelerate the learning of new concepts.
These principles may help explain how people quickly learn and use concepts in other domains. Our approach shows how these three principles can work together, although they could also be incorporated into other machine learning paradigms. A key point is that we need to learn the right form of representation, not just learn from bigger data sets, if we want to build more human-like learning algorithms.
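As a rough illustration of how the three principles can fit together, here is a minimal Python sketch. It is not the model from the paper: the stroke primitives, the `Concept` class, and the functions `learn_prior`, `sample_concept`, and `draw` are all hypothetical names invented for this example, and the "motor noise" is a simple Gaussian assumption. A concept is a short program built from stroke primitives (compositionality), an example is produced by running the pen through that program with noise (causality), and primitive frequencies estimated from already-known characters shape which new concepts are proposed (learning-to-learn).

```python
import random
from collections import Counter
from dataclasses import dataclass

# Hypothetical stroke primitives: each is a short list of (dx, dy) pen moves.
PRIMITIVES = {
    "vline": [(0.0, 1.0)],
    "hline": [(1.0, 0.0)],
    "diag": [(1.0, 1.0)],
    "hook": [(0.5, 0.0), (0.0, -0.5)],
}


@dataclass(frozen=True)
class Concept:
    """A character type: an ordered tuple of primitive names (compositionality)."""
    strokes: tuple


def learn_prior(background):
    """Learning-to-learn: estimate primitive frequencies from known concepts."""
    counts = Counter(name for concept in background for name in concept.strokes)
    total = sum(counts.values())
    # Add-one smoothing keeps unseen primitives at a nonzero probability.
    return {
        name: (counts[name] + 1) / (total + len(PRIMITIVES))
        for name in PRIMITIVES
    }


def sample_concept(prior, n_strokes, rng):
    """Propose a new concept by sampling primitives according to the prior."""
    names = list(prior)
    weights = [prior[name] for name in names]
    return Concept(tuple(rng.choices(names, weights=weights, k=n_strokes)))


def draw(concept, rng, noise=0.05):
    """Causality: produce one example by running the strokes with motor noise."""
    x, y = 0.0, 0.0
    trajectory = [(x, y)]
    for name in concept.strokes:
        for dx, dy in PRIMITIVES[name]:
            x += dx + rng.gauss(0, noise)
            y += dy + rng.gauss(0, noise)
            trajectory.append((x, y))
    return trajectory


if __name__ == "__main__":
    rng = random.Random(0)
    # Characters the learner already knows (made up for this sketch).
    known = [
        Concept(("vline", "hline")),
        Concept(("vline", "diag")),
        Concept(("vline", "hook")),
    ]
    prior = learn_prior(known)
    new_concept = sample_concept(prior, n_strokes=3, rng=rng)
    print("new concept:", new_concept.strokes)
    print("one example:", draw(new_concept, rng))
    print("another example:", draw(new_concept, rng))
```

Each call to `draw` on the same `Concept` yields a slightly different trajectory, which mirrors the idea that one underlying program can generate many distinct examples of a character.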
New: Would it be easy to repackage your algorithm for different applications driven by machine learning, such as image classification or speech recognition? Or are the processes too different?
Lake: Other domains with symbolic concepts, such as speech or gesture recognition, are closely analogous to characters and the principles transfer quite seamlessly. People may only need one or a few examples of an unfamiliar name, like “Ban Ki-moon,” or a novel hand gesture, like “hang loose,” to basically “get” the concept, allowing them to recognize new examples or produce a semblance of the concept themselves, even before the symbolic meaning is clear. In the case of speech, a new word can be represented causally by abstractly modeling the articulatory process, building the representation compositionally from phonemes, primitives that are themselves a product of learning-to-learn. More broadly, the three key principles could also help explain rapid learning of other types of objects like vehicles, tools, furniture, and so on, but this is a greater challenge for future work.
New: What is next for you? What would you like your future research to focus on or accomplish?
Lake: In addition to exploring other domains, my collaborators and I are pursuing questions related to cognitive development and neural representation. We want to study how children learn novel simple visual concepts, especially how this ability changes as children learn to write their native alphabet. We are also using brain imaging to study the role of action representations, especially those in the premotor and primary motor cortex, in the perception and learning of new characters.
More broadly, I see myself continuing to study the many computational problems that people are better at solving than machines. We have a lot to learn from these cases, which have interesting applications in many fields, including cognitive science, data science, machine learning, and artificial intelligence. By reverse engineering the human solutions to these computational challenges, there is the potential both to better understand people and to develop a more human-like learning capacity in machines.