Published on May 16th, 2016 | by Joshua New
5 Q’s for Dr. Jacob Vogelstein, Program Manager at IARPA
The Center for Data Innovation spoke with Dr. R. Jacob Vogelstein, program manager at the Intelligence Advanced Research Projects Activity (IARPA) within the Office of the Director of National Intelligence. Vogelstein explained a new IARPA project to reverse engineer the brain and make machines better at learning like humans.
This interview has been lightly edited.
Joshua New: At IARPA, you oversee the Machine Intelligence from Cortical Networks (MICrONS) program, which aims to “reverse engineer the algorithms of the brain” to improve machine learning. Could you explain this mission in layman’s terms?
Jacob Vogelstein: The MICrONS program aims to create a new generation of algorithms that can make better decisions with smaller amounts of data. This is one of the hallmarks of human intelligence—as humans, we are exposed to a large number of events, stimuli, pictures, sounds, and so on and can make sense of this information really quickly, even if it’s the first time we’re exposed to it. In contrast, today’s state-of-the-art machine learning algorithms are able to find similarities in data only when they have a lot of examples to learn from—thousands or millions of data points. But these aren’t very good at making abstractions or generalizing from a small number of examples. Our goal is to close the gap between machine intelligence and human intelligence, specifically in this area of sparse data.
New: How do you actually go about understanding how the brain solves problems and recreating this process?
Vogelstein: For decades, it has been proposed that the brain is organized in a modular fashion. That is, there are common computational units that are physically manifested in the brain and that repeat a number of times within a given area. For example, in the visual cortex—the part of our brain that processes visual information—it has been proposed that there is a circuit, which in this context means a set of computations carried out by a group of neurons, that is copied and replicated in a canonical form to perform the many different kinds of information processing necessary for us to make sense of an image. The idea is that this common module is replicated to tile across space to process all of the data coming through the eyes, and then also applied repeatedly as information is processed from an early, simple stage to an advanced, more complicated stage.
Beyond vision, the theory is that there are canonical circuits throughout the brain and that the brain essentially has a finite library of these elements that it combines to perform different computations. This has historically only been theoretical because until recently it has been very difficult to collect hard data about these modules. The kinds of tools we’ve developed in neuroscience have been excellent at interrogating the brain at multiple scales of resolution, but they are missing the key resolution needed to observe this phenomenon.
Most of what we know about the brain right now, we know from observing either one neuron at a time or millions of neurons at a time. When you get an MRI, for example, the machine measures units of activity aggregated from millions of neurons. Conversely, if you conduct a traditional neuroscience experiment using electrodes, those electrodes measure one or just a small handful of neurons at a time. In contrast, these modules that people have hypothesized as the base units of computation in the brain are on the order of tens of thousands or hundreds of thousands of neurons. Essentially, the actual implementation of these computations occurs at the local network level, rather than at the individual neuron level or the broad network level across the whole brain.
So we’re trying to reverse engineer the contents of a small region of the brain that we believe will have circuits that are representative of the modules repeated in multiple areas across the brain. If we can get some insight into the contents of these modules and how these computations are structured, then the hope is that we can employ those effectively in new algorithms with silicon.
New: Is the focus of the project just to advance machine learning research generally, or are there specific areas you’re focusing on that stand to benefit the most from this approach?
Vogelstein: Our findings would apply broadly across machine learning and simple forms of artificial intelligence. In this program in particular, however, we’re focusing on sensory information processing, because that’s the part of the brain that’s best understood: how we go from hearing sounds in the ear to understanding words. We’re focused on these areas because that’s where we think we can make the most sense of what we find. However, we expect the algorithms we develop will definitely be broadly applicable.
I think right now, if you were to examine the state of the art in machine learning, algorithms for sensory information processing are similarly the best developed. Computer vision algorithms and text-to-speech algorithms, for example, are among the most successful machine learning applications that we have. These are called artificial neural networks for a good reason: they’re derived from a high-level understanding of sensory information processing architectures in the brain. This is the domain where we expect to have the most obvious and direct impact with our research. But eventually, you could abstract this approach to apply to non-sensory information processing applications, such as cybersecurity, financial intelligence, and a number of other data domains.
New: What does success look like for the MICrONS program? What’s the timeline?
Vogelstein: The program started in January 2016, and it’s a five-year program, so we have a long way to go. But we’re already seeing some really exciting data coming out of the lab. This program is structured around multiple phases of data collection, followed by analysis, and then incorporation of these findings into new algorithms. In this first phase, we’re doing the initial data collection in the lab with teams collecting a very large sample of activity from the visual cortex of a few different animal models. The idea is to collect a large amount of data from a lot of different neurons running in parallel at the same time so we can get a better understanding of the dynamics of the brain. Then, once we’ve recorded this data, there will be a second phase of collection on the same regions of the brain, focusing on identifying the anatomical structure and morphology of all the neurons involved. With this, we can paint a pretty comprehensive picture of how neurons are manifested in the brain, how they are wired together, and what activity they generate in the brain while an animal is processing visual information.
In the first 18 months of the program, our goal is to carry out this experiment for a small, million-cubic-micron region of the brain. This contains a few hundred or thousand cells and will act as a pilot for the second phase of the program, which will run for 24 months after phase one ends. This second phase will repeat the same process but for a cubic millimeter of the brain, which is significantly larger and the hypothesized size of these modules we’re trying to study. One cubic millimeter would contain tens of thousands or one hundred thousand neurons. So we’re focusing on scaling up to reach that point, and the data sets we’re generating today are already unprecedented in terms of their size, scope, and resolution for this kind of research.
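As a rough check on the scale-up described above, the two volumes can be compared directly. The sketch below is illustrative only: the variable names are made up for this example, and the only figures taken from the interview are the two region sizes (a million cubic microns and one cubic millimeter).

```python
# Illustrative unit conversion; all names here are hypothetical and
# are not taken from the MICrONS program itself.
MICRONS_PER_MILLIMETER = 1_000

# Phase one: the "million-cubic-micron region" mentioned in the interview.
phase_one_volume_um3 = 1_000_000

# Phase two: one cubic millimeter, expressed in cubic microns.
phase_two_volume_um3 = MICRONS_PER_MILLIMETER ** 3

# Ratio of the two volumes (exact integer division here).
scale_factor = phase_two_volume_um3 // phase_one_volume_um3

print(f"Phase two volume: {phase_two_volume_um3:,} cubic microns")
print(f"Scale-up over phase one: {scale_factor}x")
```

Under these figures, the phase-two region is a thousand times the volume of the pilot region.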
New: Before IARPA, you worked for several years at the financial firm Global Domain Partners, where you developed algorithmic trading strategies. How impactful has machine learning been for the investment industry? Is there potential for what you’re working on now to benefit that industry?
Vogelstein: I started about a decade ago to develop algorithms for trading futures contracts. The conventional wisdom at the time was that if you were trying to identify patterns in the data at a relatively slow time scale—from day to day—there were not really signals in there that you could exploit because the market was relatively efficient and would eliminate any transient noise. We went in with the hypothesis that the markets are so complicated that there are thousands and thousands of different factors that contribute to an underlying price. Some are financial, some are geopolitical, some are even influenced by the weather. There’s so much data contributing to this underlying holding that it would be impossible for any one human to look at all those signals and make a logical decision about the trajectory of a price, but a computer algorithm could make more sense of this data and have a competitive advantage.
That was a fairly successful strategy, and we had positive returns for the seven years we ran it. Now, the prevailing wisdom is that there are absolutely signals to exploit in the data that computers can make much better use of than humans ever could hope to. This is true for small timescales, such as with high frequency trading, as well as large timescales that are beyond the capacity of humans to understand confidently. The biggest difference is that we’ve realized that there’s so much data to exploit and that we can actually build algorithms now that can capitalize on this.
The goal of MICrONS isn’t to figure out financial markets, but I do think that as machine learning becomes more mature and we get better at generalizing from sparse data, these insights will definitely apply. We’re trying to develop algorithms that learn from sparse data like the brain can, and in finance, predicting stock market anomalies would be a great application of this. One of the problems with market crashes, whether they’re nefarious or just naturally occurring, is that there just aren’t that many you can go back and look at and learn from. Trying to figure out a pattern with three examples is impossible for today’s algorithms that need millions of data points to learn patterns. The insights we gain from MICrONS into the human ability to generalize from small numbers of examples will definitely be applicable here, where there simply isn’t enough data to draw on.