Technical Program

Paper Detail

Paper: PS-2B.3
Session: Poster Session 2B
Location: Symphony/Overture
Session Time: Friday, September 7, 19:30 - 21:30
Presentation Time: Friday, September 7, 19:30 - 21:30
Presentation: Poster
Paper Title: Combining heuristics with counterfactual play in reinforcement learning.
Authors: Erik Peterson, Necati Müyesser, Kyle Dunovan, Tim Verstynen, Carnegie Mellon University, United States
Abstract: Deep reinforcement learning can sometimes match and exceed human performance, but if even minor changes are introduced, artificial networks can't adapt what they've learned to new situations. Two reasons why people are so eminently adaptable are their use of heuristics and their ability to imagine new environments and learn from them, a kind of counterfactual reasoning. We've developed a model of hierarchical reinforcement learning that includes both of these elements. Using Wythoff's game, a board game with a known optimal strategy, we show that this "stumbler-strategist" network promotes generalizability and robustness to new environments and rule changes, while also improving post-training interpretability of learning outcomes.
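
For readers unfamiliar with Wythoff's game, the following is a minimal sketch of its known optimal strategy. It is not taken from the paper or the authors' code. It assumes the standard two-pile formulation: a move removes any positive number of tokens from one pile, or the same number from both, and whoever takes the last token wins. The function names (`is_cold`, `brute_force_cold`) are hypothetical and used only for illustration. A position is "cold" (lost for the player to move) exactly when its smaller pile equals floor(k * phi), where k is the difference between the piles and phi is the golden ratio.

```python
import math


def is_cold(a: int, b: int) -> bool:
    """Return True if (a, b) is a losing position for the player to move.

    Uses the closed form: with k = |a - b|, the position is cold exactly
    when min(a, b) == floor(k * phi), phi = (1 + sqrt(5)) / 2.
    Integer arithmetic avoids floating-point error:
    floor(k * phi) == (k + isqrt(5 * k * k)) // 2.
    """
    lo, hi = min(a, b), max(a, b)
    k = hi - lo
    return lo == (k + math.isqrt(5 * k * k)) // 2


def brute_force_cold(n: int) -> set:
    """Enumerate cold positions with both piles below n by exhaustive search.

    A position is cold if no legal move reaches another cold position.
    Positions are visited in an order that guarantees every position
    reachable by a move has already been classified.
    """
    cold = set()
    for a in range(n):
        for b in range(n):
            moves = (
                [(a - t, b) for t in range(1, a + 1)]        # take from first pile
                + [(a, b - t) for t in range(1, b + 1)]      # take from second pile
                + [(a - t, b - t) for t in range(1, min(a, b) + 1)]  # take from both
            )
            if not any(m in cold for m in moves):
                cold.add((a, b))
    return cold


if __name__ == "__main__":
    n = 30
    closed_form = {(a, b) for a in range(n) for b in range(n) if is_cold(a, b)}
    # The closed form and the exhaustive search should agree on the small board.
    print(closed_form == brute_force_cold(n))
```

Under these assumptions, optimal play means always moving to a cold position whenever one is reachable. This gives a ground truth against which a learned policy could be compared.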