All Things Attention: Bridging Different Perspectives on Attention

Four colorful panels of cartoons of robots and people looking at images of eyes

Generated by DALL-E

In conjunction with NeurIPS 22: December 2, 2022. Hybrid in-person and virtual.

Attention is widely studied in many fields, including neuroscience, psychology, and machine learning. A better understanding and conceptualization of attention in both humans and machines has led to significant progress across fields. At the same time, attention is far from a clear or unified concept, with many definitions within and across fields.

Cognitive scientists study how the brain flexibly controls its limited computational resources to accomplish its objectives. Inspired by cognitive attention, machine learning researchers introduce attention as an inductive bias in their models to improve performance or interpretability. Human-computer interaction designers monitor people’s attention during interactions to implicitly detect aspects of their mental states.

While the aforementioned research areas all consider attention, each formalizes and operationalizes it in different ways. Bridging this gap will facilitate:

(Cogsci for AI) More principled forms of attention in AI agents, towards more human-like abilities such as robust generalization, quicker learning, and faster planning.
(AI for cogsci) Better computational models of human behaviors that involve attention.
(HCI) Modeling attention during interactions from implicit signals for fluent and efficient coordination.
(HCI/ML) Artificial models of algorithmic attention that enable intuitive interpretations of deep models.

Topics of Interest

The All Things Attention workshop aims to foster connections across disparate academic communities that conceptualize “attention”, such as Neuroscience, Psychology, Machine Learning, and Human-Computer Interaction. Workshop topics of interest include:

Relationships between biological and artificial attention

What are the connections between forms of attention in the human brain and deep neural network architectures?
Can the anatomy of human attention models inspire designs of architectures for artificial systems?
Given the same task and learning objective, how do learned attention mechanisms in machines differ from those in humans?

Attention for reinforcement learning and decision making

How have reinforcement learning agents leveraged attention in decision making?
Do decision-making agents today have implicit or explicit formalisms of attention?
How can AI agents develop notions of attention without having them explicitly built in?
Can attention significantly enable AI agents to scale, e.g., through gains in sample efficiency and generalization?
How should learning systems reason about computational attention (which parts of sensed inputs to focus computation on)?

Attention mechanisms for continual / lifelong learning

How can continual learning agents use attention to maintain already-learned knowledge?
How can attention control the amount of interference between different inputs?
How does the executive control of attention evolve with learning in humans?
How does understanding the development of human attentional systems in infancy and childhood explain how attention can be learned in artificial systems?

Attention for interpretation and explanation

How can attention models aid visualization?
How is attention used for interpretability in AI?
What are the major bottlenecks and common pitfalls in applying attention methods for explaining the decisions of AI agents?

Attention in human-computer interaction

How do we detect aspects of human attention during interactions, from sensing to processing to representations?
What systems benefit from human attention modeling, and how do they use these models?
How can systems influence a user’s attention, and what systems benefit from this capability?
How can a system communicate or simulate its own attention (humanlike or algorithmic) in an interaction, and to what benefit?

Attention mechanisms in Deep Neural Network (DNN) architectures

How does attention in DNNs such as transformers relate to existing formalisms of attention in cogsci/psychology?
How does self-attention in transformers contribute to their vast success in recent models such as GPT-2, GPT-3, and DALL-E?
How can an understanding of attention from other fields inspire future DNN research?

Ways to Participate

A recording of the event is available here.

Questions

Ask questions on Slido using this link or the embedded page.

RocketChat and NeurIPS workshop page

You can participate in live discussions during the workshop here. Invited speakers will also be encouraged to answer questions offline using the same link. Please note you need to be registered for the workshop to access RocketChat.

Virtual poster session and hangouts

The virtual poster session will be held on the Discord server — see the NeurIPS page for a link. It’ll be open for the whole day, so feel free to jump in just to chat!

Schedule

Time in CST	Event
9:00 AM - 11:00 AM	Talks Session I
9:00 AM - 9:05 AM	Workshop Intro
9:05 AM - 9:25 AM	Ida Momennejad
9:25 AM - 9:45 AM	James Whittington
9:45 AM - 10:05 AM	Henny Admoni
10:05 AM - 10:25 AM	Tobias Gerstenberg
10:25 AM - 11:00 AM	Spotlight Talks: Foundations of Attention Mechanisms in Deep Neural Network Architectures; Is Attention Interpretation? A Quantitative Assessment On Sets
11:00 AM - 12:00 PM	In-Person Panel Discussion. Panelists: Megan deBettencourt, Tobias Gerstenberg, Erin Grant, Ida Momennejad, Ramakrishna Vedantam, James Whittington, Cyril Zhang
12:00 PM - 1:00 PM	Lunch and Virtual Social Event
1:00 PM - 2:00 PM	Coffee Break / Poster Session
2:00 PM - 4:00 PM	Talks Session II
2:00 PM - 2:20 PM	Shalini De Mello
2:20 PM - 2:40 PM	Pieter Roelfsema
2:40 PM - 3:00 PM	Erin Grant
3:00 PM - 3:20 PM	Vidhya Navalpakkam
3:20 PM - 4:00 PM	Spotlight Talks: Wide Attention Is The Way Forward For Transformers; Fine-tuning hierarchical circuits through learned stochastic co-modulation; Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement
4:00 PM - 5:00 PM	Coffee Break / Poster Session
5:00 PM - 6:00 PM	Virtual Panel Discussion. Panelists: Henny Admoni, David Ha, Brian Kingsbury, John Langford, Shalini De Mello, Vidhya Navalpakkam, Ashish Vaswani

Invited Speakers

Ida Momennejad

Principal Researcher, Microsoft Research

Attention in Task-sets, Planning, and the Prefrontal Cortex

What we pay attention to depends on the context and the task at hand. On the one hand, the prefrontal cortex can modulate how to direct attention outward to the external world. On the other hand, attention to internal states enables metacognition and configuration of internal states using repertoires of memories and skills. I will first discuss ongoing work in which, inspired by the role of attention in affordances and task-sets, we analyze large-scale gameplay data in the Xbox 3D game Bleeding Edge in an interpretable way. I will briefly mention ongoing directions including decoding of plans during chess based on eye-tracking. I will conclude with how future models of multi-scale predictive representations could include prefrontal cortical modulation during planning and task performance.

James Whittington

Postdoc, University of Oxford

Relating Transformers to Models and Neural Representations of the Hippocampal Formation

Many deep neural network architectures loosely based on brain networks have recently been shown to replicate neural firing patterns observed in the brain. One of the most exciting and promising novel architectures, the Transformer neural network, was developed without the brain in mind. In this work, we show that transformers, when equipped with recurrent position encodings, replicate the precisely tuned spatial representations of the hippocampal formation; most notably place and grid cells. Furthermore, we show that this result is no surprise since it is closely related to current hippocampal models from neuroscience. We additionally show the transformer version offers dramatic performance gains over the neuroscience version. This work continues to bind computations of artificial and brain networks, offers a novel understanding of the hippocampal-cortical interaction, and suggests how wider cortical areas may perform complex tasks beyond current neuroscience models such as language comprehension.

Henny Admoni

A. Nico Habermann Assistant Professor, Carnegie Mellon University

Eye Gaze in Human-Robot Collaboration

In robotics, human-robot collaboration works best when robots are responsive to their human partners’ mental states. Human eye gaze has been used as a proxy for one such mental state: attention. While eye gaze can be a useful signal, for example enabling intent prediction, it is also a noisy one. Gaze serves several functions beyond attention, and thus recognizing what people are attending to from their eye gaze is a complex task. In this talk, I will discuss our research on modeling eye gaze to understand human attention in collaborative tasks such as shared manipulation and assisted driving.

Tobias Gerstenberg

Assistant Professor of Cognitive Psychology, Stanford University

Attending to What's Not There

When people make sense of the world, they don’t only pay attention to what’s actually happening. Their mind also takes them to counterfactual worlds of what could have happened. In this talk, I will illustrate how we can use eye-tracking to uncover the human mind’s forays into the imaginary. I will show that when people make causal judgments about physical interactions, they don’t just look at what actually happens. They mentally simulate what would have happened in relevant counterfactual situations to assess whether the cause made a difference. And when people try to figure out what happened in the past, they mentally simulate the different scenarios that could have led to the outcome. Together these studies illustrate how attention is not only driven by what’s out there in the world, but also by what’s hidden inside the mind.

Shalini De Mello

Principal Research Scientist, NVIDIA

Exploiting Human Interactions to Learn Human Attention

Unconstrained eye gaze estimation using ordinary webcams in smartphones and tablets is immensely useful for many applications. However, current eye gaze estimators are limited in their ability to generalize to a wide range of unconstrained conditions, including head poses, eye gaze angles, and lighting conditions. This is mainly due to the lack of gaze training data collected in in-the-wild conditions. Notably, eye gaze is a natural form of human communication while humans interact with each other. Visual data (videos or images) containing human interaction are also abundantly available on the internet and are constantly growing as people upload more. Could we leverage visual data containing human interaction to learn unconstrained gaze estimators? In this talk we will describe our foray into addressing this challenging problem. Our findings point to the great potential of human interaction data as a low-cost and ubiquitously available source of training data for unconstrained gaze estimators. By lessening the burden of specialized data collection and annotation, we hope to foster greater real-world adoption and proliferation of gaze estimation technology in end-user devices.

Pieter Roelfsema

Department Head, Netherlands Institute for Neuroscience

BrainProp: How Attentional Processes in the Brain Solve the Credit Assignment Problem

Humans and many other animals have an enormous capacity to learn about sensory stimuli and to master new skills. Many of the mechanisms that enable us to learn remain to be understood. One of the greatest challenges of systems neuroscience is to explain how synaptic connections change to support maximally adaptive behaviour. We will provide an overview of factors that determine the change in the strength of synapses. Specifically, we will discuss the influence of attention, neuromodulators and feedback connections in synaptic plasticity and suggest a specific framework, called BrainProp, in which these factors interact to improve the functioning of the entire network.

Much recent work focuses on learning in the brain using presumed biologically plausible variants of supervised learning algorithms. However, the biological plausibility of these approaches is limited, because there is no teacher in the motor cortex that instructs the motor neurons. Instead, learning in the brain usually depends on reward and punishment. BrainProp is a biologically plausible reinforcement learning scheme for deep networks with any number of layers. The network chooses an action by selecting a unit in the output layer and uses feedback connections to assign credit to the units in lower layers that are responsible for this action. After the choice, the network receives reinforcement so that there is no need for a teacher. We showed how BrainProp is mathematically equivalent to error backpropagation, for one output unit at a time (Pozzi et al., 2020). We illustrate learning of classical and hard image-classification benchmarks (MNIST, CIFAR10, CIFAR100 and Tiny ImageNet) by deep networks. BrainProp achieves an accuracy that is equivalent to that of standard error-backpropagation, and better than other state-of-the-art biologically inspired learning schemes. Additionally, the trial-and-error nature of learning is associated with limited additional training time, so that BrainProp is slower by a factor of 1 to 3.5. These results provide new insights into how deep learning may be implemented in the brain.

Erin Grant

Senior Research Fellow, University College London

Attention as Interpretable Information Processing in Machine Learning Systems

Attention in psychology and neuroscience conceptualizes how the human mind prioritizes information as a result of limited resources. Machine learning systems do not necessarily share the same limits, but implementations of attention have nevertheless proven useful in machine learning across a broad set of domains. Why is this so? I will focus on one aspect: interpretability, which is an ongoing challenge for machine learning systems. I will discuss two different implementations of attention in machine learning that tie closely to conceptualizations of attention in two domains of psychological research. Using these case studies as a starting point, I will discuss the broader strengths and drawbacks of using attention to constrain and interpret how machine learning systems process information. I will end with a problem statement highlighting the need to move away from localized notions to a global view of how attention-like mechanisms modulate information processing in artificial systems.

Vidhya Navalpakkam

Principal Scientist, Google Research

Accelerating Human Attention Research via ML Applied to Smartphones

Attention and eye movements are thought to be a window to the human mind, and have been extensively studied across Neuroscience, Psychology and HCI. However, progress in this area has been severely limited as the underlying methodology relies on specialized hardware that is expensive (up to $30,000) and hard to scale. In this talk, I will present our recent work from Google, which shows that ML applied to smartphone selfie cameras can enable accurate gaze estimation, comparable to state-of-the-art hardware-based devices, at 1/100th the cost and without any additional hardware. Via extensive experiments, we show that our smartphone gaze tech can successfully replicate key findings from prior hardware-based eye movement research in Neuroscience and Psychology, across a variety of tasks including traditional oculomotor tasks, saliency analyses on natural images and reading comprehension. We also show that smartphone gaze could enable applications in improved health/wellness, for example, as a potential digital biomarker for detecting mental fatigue. These results show that smartphone-based attention has the potential to unlock advances by scaling eye movement research, and enabling new applications for improved health, wellness and accessibility, such as gaze-based interaction for patients with ALS/stroke who cannot otherwise interact with devices.

Panelists

Ida Momennejad

Principal Researcher, Microsoft Research

James Whittington

Postdoc, University of Oxford

Henny Admoni

A. Nico Habermann Assistant Professor, Carnegie Mellon University

Tobias Gerstenberg

Assistant Professor of Cognitive Psychology, Stanford University

Shalini De Mello

Principal Research Scientist, NVIDIA

Erin Grant

Senior Research Fellow, University College London

Vidhya Navalpakkam

Principal Scientist, Google Research

Megan deBettencourt

Postdoc, University of Chicago

David Ha

Head of Strategy, Stability AI

Ramakrishna Vedantam

Research Scientist, Facebook AI Research (FAIR)

Cyril Zhang

Senior Researcher, Microsoft Research

Ashish Vaswani

Chief Scientist and Co-Founder, Adept AI Labs

Brian Kingsbury

Distinguished Research Scientist and Manager, IBM Research

John Langford

Partner Research Manager, Microsoft Research

Accepted Papers

Oral Presentations

Fine-tuning hierarchical circuits through learned stochastic co-modulation. Caroline Haimerl, Eero P Simoncelli, Cristina Savin

Poster Presentations

Organizers

Microsoft Research

Carnegie Mellon University

McGill University / MILA, Montreal

Tufts University

Stanford University

University College London / New York University

University of Texas, Austin / University of Massachusetts

Relevant Publications

CogSci and Neuroscience

Machine Learning, Deep Learning and Reinforcement Learning

HCI, HRI and Robotics