10.1184/R1/6720692.v1
Brian D. Ziebart
Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy
Carnegie Mellon University
2010
Machine learning
decision making
probabilistic modeling
maximum entropy
inverse optimal control
influence diagrams
informational revelation
feedback
causality
goal inference
2010-12-01 00:00:00
Thesis
https://kilthub.cmu.edu/articles/thesis/Modeling_Purposeful_Adaptive_Behavior_with_the_Principle_of_Maximum_Causal_Entropy/6720692
Predicting human behavior from a small number of training examples is a challenging machine
learning problem. In this thesis, we introduce the principle of maximum causal entropy, a general
technique for applying information theory to decision-theoretic, game-theoretic, and control
settings where relevant information is sequentially revealed over time. This approach guarantees
decision-theoretic performance by matching purposeful measures of behavior (Abbeel & Ng,
2004), and/or enforces game-theoretic rationality constraints (Aumann, 1974), while otherwise being
as uncertain as possible, which minimizes worst-case predictive log-loss (Grünwald & Dawid,
2003).
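For orientation, the quantity being maximized can be written out as follows. The abstract does not fix any notation, so the symbols here are assumptions: A_t is the action chosen at time t, S_t is the side information (for example, state) revealed at time t, and T is the horizon.

```latex
% Causally conditioned probability: each action depends only on
% information revealed up to that point, never on future side information.
P(\mathbf{A}^T \,\|\, \mathbf{S}^T) = \prod_{t=1}^{T} P(A_t \mid S_{1:t}, A_{1:t-1})

% Causal entropy: the expected negative log of that quantity,
% which decomposes into a sum of per-step conditional entropies.
H(\mathbf{A}^T \,\|\, \mathbf{S}^T)
  = \mathbb{E}\!\left[-\log P(\mathbf{A}^T \,\|\, \mathbf{S}^T)\right]
  = \sum_{t=1}^{T} H(A_t \mid S_{1:t}, A_{1:t-1})
```

Maximizing this entropy subject to the behavior-matching constraints mentioned above yields a distribution that commits to nothing beyond what the constraints and the already revealed information require.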
We derive probabilistic models for decision, control, and multi-player game settings using this
approach. We then develop corresponding algorithms for efficient inference that include relaxations
of the Bellman equation (Bellman, 1957), and simple learning algorithms based on convex
optimization. We apply the models and algorithms to a number of behavior prediction tasks.
Specifically, we present empirical evaluations of the approach in the domains of vehicle route
preference modeling using over 100,000 miles of collected taxi driving data, pedestrian motion
modeling from weeks of indoor movement data, and robust prediction of game play in stochastic
multi-player games.
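As a rough illustration of what a relaxation of the Bellman equation can look like, the sketch below replaces the usual max over actions with a log-sum-exp ("softmax") backup, which gives a stochastic policy rather than a single best action. This is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the tabular toy problem, and the finite-horizon setup are all hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def soft_value_iteration(reward, transition, horizon):
    """Finite-horizon softmax Bellman backups (requires horizon >= 1).

    reward[s][a]         : immediate reward (hypothetical toy input)
    transition[s][a][s2] : probability of reaching s2 from s under action a
    Returns (V, policy) for the first time step; policy[s][a] is a probability.
    """
    n_states = len(reward)
    n_actions = len(reward[0])
    V = [0.0] * n_states  # value after the final step
    policy = None
    for _ in range(horizon):
        # Q(s, a) = r(s, a) + sum_s2 P(s2 | s, a) * V(s2)
        Q = [
            [
                reward[s][a]
                + sum(transition[s][a][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        # Soft backup: log-sum-exp over actions instead of max.
        V = [logsumexp(Q[s]) for s in range(n_states)]
        # Policy: pi(a | s) = exp(Q(s, a) - V(s)), which sums to 1 over a.
        policy = [
            [math.exp(Q[s][a] - V[s]) for a in range(n_actions)]
            for s in range(n_states)
        ]
    return V, policy


# Hypothetical two-state, two-action problem with deterministic transitions.
toy_reward = [[0.0, 1.0], [0.0, 0.0]]
toy_transition = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: action 0 stays, action 1 moves
    [[0.0, 1.0], [0.0, 1.0]],  # from state 1: both actions stay
]
toy_V, toy_policy = soft_value_iteration(toy_reward, toy_transition, horizon=1)
```

In this toy problem the higher-reward action in state 0 receives more probability but not all of it, while the two equally rewarded actions in state 1 are chosen uniformly.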