Paul Christiano
1 min readJun 29, 2017

--

Arthur only needs to predict approval[T](a), but it needs to use observations to learn. It learns from observations by having a joint model and conditioning the model on its observations.

--

--

No responses yet