Paper Detail

Paper: PS-1B.44
Session: Poster Session 1B
Location: Symphony/Overture
Session Time: Thursday, September 6, 18:45 - 20:45
Presentation Time: Thursday, September 6, 18:45 - 20:45
Presentation: Poster
Paper Title: Global-and-local attention networks for visual recognition
Authors: Drew Linsley, Brown University, United States; Dan Shiebler, Twitter, United States; Sven Eberhardt, Amazon, United States; Thomas Serre, Brown University, United States
Abstract: Most recent gains in machine vision have originated from the development of network architectures which incorporate some form of attention. While biology is sometimes mentioned as a source of inspiration, the attentional mechanisms that have been considered by the computer vision community remain limited in comparison to the richness and diversity of the processes used by our visual system. Here, we describe a biologically-motivated "global-and-local attention" (GALA) module which is shown to yield state-of-the-art object recognition accuracy when embedded in a modern deep neural network. We further describe ClickMe.ai, a large-scale online experiment designed for human participants to identify diagnostic image regions for visual recognition in order to co-train a GALA network. Adding humans-in-the-loop is shown to significantly improve network accuracy, while also yielding visual representations that are more interpretable and more similar to those used by human observers.
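The abstract describes a "global-and-local" attention module that gates feature maps with both a channel-wise (global) signal and a per-position spatial (local) signal. As an illustration only, here is a minimal NumPy sketch of that general idea; the function name, weight shapes, tanh activation, and multiplicative combination of the two pathways are assumptions for this example, not the paper's actual GALA formulation.

```python
import numpy as np

def gala_style_attention(x, w_global, w_local):
    """Hypothetical sketch of a global-and-local attention gate.

    x:        feature map of shape (C, H, W)
    w_global: (C, C) weights for the global (channel-wise) pathway
    w_local:  (C,) weights acting like a 1x1 conv for the local
              (spatial) pathway
    """
    # Global pathway: squeeze spatial dims, produce one gate per channel.
    g = x.mean(axis=(1, 2))                 # (C,) global average pool
    a_global = np.tanh(w_global @ g)        # (C,) channel attention

    # Local pathway: 1x1 projection to a per-position saliency map.
    a_local = np.tanh(np.tensordot(w_local, x, axes=(0, 0)))  # (H, W)

    # Combine: broadcast the channel gate over space and the spatial
    # gate over channels, then modulate the input features.
    a = a_global[:, None, None] * a_local[None, :, :]  # (C, H, W)
    return x * a

# Usage: a tiny 2-channel, 3x3 feature map.
x = np.ones((2, 3, 3))
out = gala_style_attention(x, np.eye(2), np.ones(2))
```

In a trained network the attention weights would be learned end-to-end; the paper additionally co-trains them against human-derived importance maps collected via ClickMe.ai.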