Paper Detail

Paper: PS-1A.45
Session: Poster Session 1A
Location: Symphony/Overture
Session Time: Thursday, September 6, 16:30 - 18:30
Presentation Time: Thursday, September 6, 16:30 - 18:30
Presentation: Poster
Paper Title: Auditory scene analysis as Bayesian inference in sound source models
Authors: Maddie Cusimano, Luke Hewitt, Joshua B. Tenenbaum, Josh H. McDermott, Massachusetts Institute of Technology, United States
Abstract: Inferring individual sound sources from the mixture of sound waves that enters our ears is a central problem in auditory perception, termed auditory scene analysis (ASA). The study of ASA has uncovered a diverse set of illusions that suggest general principles underlying perceptual organization. However, most explanations for these illusions remain intuitive or are narrowly focused, without formal models that predict perceived sound sources from the acoustic waveform. Whether ASA phenomena can be explained by a small set of principles is unclear. We present a Bayesian model based on representations of simple acoustic sources, for which a neural network is used to guide Markov chain Monte Carlo inference. Given a sound waveform, our system infers the number of sources present, parameters defining each source, and the sound produced by each source. This model qualitatively accounts for perceptual judgments on a variety of classic ASA illusions, and can in some cases infer perceptually valid sources from simple audio recordings.
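Illustrative sketch (not the authors' implementation): the abstract describes Markov chain Monte Carlo inference guided by a bottom-up proposal. The toy Python example below shows that general idea for a single pure tone whose only unknown is its frequency. Everything in it is an assumption made for illustration: the function names, the sample rate, the noise level, the uniform prior range, and above all the use of an FFT peak as a stand-in for the neural network. It does not infer the number of sources or any other source parameters.

```python
import numpy as np

SR = 8000                    # sample rate in Hz (assumed for this toy example)
NOISE_SD = 0.5               # observation noise, assumed known
F_LO, F_HI = 100.0, 1000.0   # assumed uniform prior range over frequency (Hz)

def make_tone(freq, n_samples=800):
    """Hypothetical source model: a unit-amplitude sine with zero phase."""
    t = np.arange(n_samples) / SR
    return np.sin(2 * np.pi * freq * t)

def log_posterior(freq, observed):
    """Unnormalized log posterior: uniform prior plus Gaussian likelihood."""
    if not (F_LO <= freq <= F_HI):
        return -np.inf
    resid = observed - make_tone(freq, len(observed))
    return -0.5 * np.sum(resid ** 2) / NOISE_SD ** 2

def bottom_up_guess(observed):
    """Stand-in for a learned proposal network: frequency of the FFT peak."""
    spectrum = np.abs(np.fft.rfft(observed))
    freqs = np.fft.rfftfreq(len(observed), d=1.0 / SR)
    return freqs[np.argmax(spectrum)]

def log_normal_pdf(x, mean, sd):
    return -0.5 * ((x - mean) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))

def log_q(to, frm, guess, rw_sd, guide_sd, p_guide):
    """Log density of the mixture proposal: random walk plus guided jump."""
    return np.logaddexp(
        np.log(1 - p_guide) + log_normal_pdf(to, frm, rw_sd),
        np.log(p_guide) + log_normal_pdf(to, guess, guide_sd),
    )

def run_mh(observed, n_steps=3000, seed=0, rw_sd=0.1, guide_sd=5.0, p_guide=0.2):
    """Metropolis-Hastings over frequency with a bottom-up guided proposal."""
    rng = np.random.default_rng(seed)
    guess = bottom_up_guess(observed)
    current = guess                      # start the chain at the bottom-up guess
    current_lp = log_posterior(current, observed)
    samples = []
    for _ in range(n_steps):
        if rng.random() < p_guide:       # guided jump near the bottom-up guess
            proposal = rng.normal(guess, guide_sd)
        else:                            # local random-walk refinement
            proposal = rng.normal(current, rw_sd)
        proposal_lp = log_posterior(proposal, observed)
        # Hastings ratio uses the full mixture density in both directions.
        log_ratio = (proposal_lp - current_lp
                     + log_q(current, proposal, guess, rw_sd, guide_sd, p_guide)
                     - log_q(proposal, current, guess, rw_sd, guide_sd, p_guide))
        if np.log(rng.random()) < log_ratio:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return np.array(samples)

# Usage: synthesize a noisy 443 Hz tone and summarize the posterior samples.
observed = make_tone(443.0) + np.random.default_rng(1).normal(0, NOISE_SD, 800)
samples = run_mh(observed)
print("bottom-up guess (Hz):", bottom_up_guess(observed))
print("posterior mean after burn-in (Hz):", samples[1000:].mean())
```

The chain is started at the bottom-up guess because a narrow guided proposal makes jumps from far-away states unlikely to be accepted: the reverse move has very low proposal density. A model that also infers the number of sources would need moves that add and remove sources, which this sketch leaves out.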