Before joining DeepMind, I did my PhD with Anne Collins and Silvia Bunge at UC Berkeley.
In my first PhD project, I used pupillometry to study rule inference.
After that, I did a series of projects on cognitive modeling. I led a large effort to collect a developmental dataset of over 250 participants (ages 8-30), each of whom performed three different cognitive tasks and provided physiological measures in the lab.
I first used one of the tasks to understand how learning develops with age. The standard approach is to pick one kind of model (e.g., Reinforcement Learning or Bayesian inference) and tweak it until it fits. Instead of picking just one, I fit both. This let me do things that single-model studies cannot: I had two explanations instead of one, I could compare them directly against each other, and I could distill them into a single, simple story.
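To give a flavor of what fitting two model classes side by side looks like, here is a minimal sketch in Python. The task, data, and model details are placeholders I chose for illustration, not the ones from the actual study: a delta-rule RL learner and a Beta-Bernoulli Bayesian learner are each fit to the same choices by maximum likelihood and compared with AIC.

```python
import numpy as np
from scipy.optimize import minimize

def rl_nll(params, choices, rewards):
    """Negative log-likelihood of a delta-rule RL model with softmax choice."""
    alpha, inv_temp = params
    q = np.zeros(2)                          # value estimate per arm
    nll = 0.0
    for c, r in zip(choices, rewards):
        p = np.exp(inv_temp * q) / np.exp(inv_temp * q).sum()
        nll -= np.log(p[c] + 1e-12)
        q[c] += alpha * (r - q[c])           # prediction-error update
    return nll

def bayes_nll(params, choices, rewards):
    """Negative log-likelihood of a Beta-Bernoulli learner with softmax choice."""
    inv_temp, = params
    a, b = np.ones(2), np.ones(2)            # Beta(1, 1) prior per arm
    nll = 0.0
    for c, r in zip(choices, rewards):
        mean = a / (a + b)                   # posterior mean reward per arm
        p = np.exp(inv_temp * mean) / np.exp(inv_temp * mean).sum()
        nll -= np.log(p[c] + 1e-12)
        a[c] += r
        b[c] += 1 - r                        # Bernoulli posterior update
    return nll

# Placeholder data standing in for one participant's bandit choices.
choices = np.random.randint(0, 2, 100)
rewards = np.random.randint(0, 2, 100)

rl = minimize(rl_nll, x0=[0.5, 3.0], args=(choices, rewards),
              bounds=[(0.01, 0.99), (0.1, 20.0)])
by = minimize(bayes_nll, x0=[3.0], args=(choices, rewards),
              bounds=[(0.1, 20.0)])

aic_rl = 2 * 2 + 2 * rl.fun                  # 2 free parameters
aic_bayes = 2 * 1 + 2 * by.fun               # 1 free parameter
print(f"AIC  RL: {aic_rl:.1f}   Bayes: {aic_bayes:.1f}")
```

With both models fit per participant, the same pipeline supports comparison (which model explains each participant better?) and distillation (where do the two explanations agree?).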
I next combined all three tasks to understand what happens when we fit a variety of cognitive models on a variety of cognitive tasks. What I found was unexpected: the cognitive processes a model parameter captures often vary tremendously between tasks. This has important implications for how we interpret computational models.
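A sketch of the kind of check this involves: fit the same model separately to each task, then ask whether a given parameter, say the learning rate, lines up across tasks within participants. If the parameter captured one stable cognitive process, the per-participant estimates should correlate strongly across tasks. The arrays below are random placeholders, not real estimates.

```python
import numpy as np

# Hypothetical per-participant learning rates, fitted separately on two tasks
# (e.g., via the fitting routine sketched above). Values are placeholders.
alpha_task_a = np.random.uniform(0.1, 0.9, size=250)
alpha_task_b = np.random.uniform(0.1, 0.9, size=250)

# A low correlation suggests "learning rate" does not reflect the same
# cognitive process in both tasks.
r = np.corrcoef(alpha_task_a, alpha_task_b)[0, 1]
print(f"Cross-task correlation of fitted learning rates: r = {r:.2f}")
```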
I also have a longstanding interest in hierarchy and abstraction – our ability to reason at different timescales, and at different levels of granularity. I believe that the human propensity to think abstractly makes us particularly efficient and flexible.
More recently, I have become interested in how AI can help us understand cognition. I now create cognitive models that combine classic cognitive models (like RL) with neural networks: the classic part makes these hybrid models interpretable and normative, while the neural network part lets us go beyond hard-cut theories, gain data-driven insights, and fit data really well (predictive power).
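As a toy illustration of the hybrid idea (a sketch of one possible design, not the exact architecture from my work): the snippet below keeps the interpretable delta-rule RL update, but lets a small neural network set the learning rate on each trial, and trains everything end to end on the likelihood of observed choices.

```python
import torch
import torch.nn as nn

class HybridRL(nn.Module):
    """Delta-rule RL value update, with the learning rate produced by a
    small neural network instead of a fixed free parameter. Architecture
    and feature choices here are illustrative."""

    def __init__(self, n_actions=2, hidden=16):
        super().__init__()
        # Map the latest (choice, reward, prediction error) to a
        # trial-specific learning rate in (0, 1).
        self.alpha_net = nn.Sequential(
            nn.Linear(3, hidden), nn.Tanh(), nn.Linear(hidden, 1), nn.Sigmoid()
        )
        self.inv_temp = nn.Parameter(torch.tensor(3.0))
        self.n_actions = n_actions

    def forward(self, choices, rewards):
        """Return per-trial log-probabilities of the observed choices."""
        q = torch.zeros(self.n_actions)
        logps = []
        for c, r in zip(choices, rewards):
            logps.append(torch.log_softmax(self.inv_temp * q, dim=0)[c])
            pe = r - q[c]                              # prediction error
            feat = torch.stack([c.float(), r, pe])
            alpha = self.alpha_net(feat).squeeze()     # data-driven step size
            q = q.clone()
            q[c] = q[c] + alpha * pe                   # interpretable RL update
        return torch.stack(logps)

# Fit by maximizing the likelihood of behavior (placeholder data).
choices = torch.randint(0, 2, (100,))
rewards = torch.randint(0, 2, (100,)).float()
model = HybridRL()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = -model(choices, rewards).sum()              # negative log-likelihood
    loss.backward()
    opt.step()
```

Because the value update stays a delta rule, the fitted model can still be read as prediction-error learning; the network only adds flexibility where a fixed-parameter theory is too rigid.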
For a comprehensive list of publications, check out Google Scholar.