Houjun Liu

GAMMA

Past Work

self play: this is a \(\text{coNP}\) vs \(\text{NP}\) problem: competitive self-play has to defend against all strategies, whereas collaborative self-play only needs to find one useful strategy; that one strategy doesn’t generalize well to humans, because a human was never the partner during training
behavior cloning:
Population Based Training: computationally super expensive

Novelty

instead, learn a generative model from simulated agents, human data, or both
then, sample from this generative model
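
A toy sketch of the two steps above (fit a generative model on partner data, then sample new partners from it). Everything here is an assumption for illustration: the notes do not say what the generative model is, so an independent per-dimension Gaussian stands in for it, and the "policies" are made-up two-number vectors rather than real agent or human data.

```python
import random
import statistics


def fit_gaussian(policies):
    """Fit an independent Gaussian per dimension to a list of policy vectors.

    Stand-in for the generative model; the real model class is not given in the notes.
    """
    dims = list(zip(*policies))  # transpose: one tuple per dimension
    means = [statistics.mean(d) for d in dims]
    stdevs = [statistics.stdev(d) for d in dims]
    return means, stdevs


def sample_partners(model, n, rng):
    """Draw n new partner policy vectors from the fitted model."""
    means, stdevs = model
    return [[rng.gauss(m, s) for m, s in zip(means, stdevs)] for _ in range(n)]


# Hypothetical data: each partner is summarized by a 2-dimensional policy vector.
simulated = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
human = [[0.6, 0.4], [0.7, 0.2]]

# Step 1: learn the generative model from the pooled data.
model = fit_gaussian(simulated + human)

# Step 2: sample fresh partners to train against.
rng = random.Random(0)
partners = sample_partners(model, 4, rng)
```

The sampled `partners` would then be used as training partners in place of a self-play copy or a fixed population.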

Notable Methods

Key Figs

New Concepts

Notes