Paul Christiano
1 min read · Apr 1, 2018

By default, the agent will also need to explore “don’t shut down when asked” in order to verify that it gets a low return, though hopefully you’d also deal with that using simulated data.
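
As a rough illustration of that idea, here is a minimal sketch in which the agent learns the low return of ignoring a shutdown request purely from simulated requests, and then acts greedily on real ones. The reward values, the two action names, and the function names are all hypothetical choices made for this example, not anything specified above.

```python
import random

# Hypothetical returns for each response to a shutdown request.
# These numbers are assumptions for illustration only.
REWARDS = {"comply": 0.0, "ignore": -1.0}
ACTIONS = list(REWARDS)


def train_on_simulated_requests(n_episodes=200, epsilon=0.5, lr=0.1, seed=0):
    """Estimate the return of each action using only simulated requests."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(n_episodes):
        # Epsilon-greedy: exploration here is harmless because the
        # shutdown request is simulated rather than real.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=q.get)
        # Move the estimate toward the observed (simulated) return.
        q[action] += lr * (REWARDS[action] - q[action])
    return q


def act_on_real_request(q):
    """No exploration on real requests: pick the best-known action."""
    return max(ACTIONS, key=q.get)


q = train_on_simulated_requests()
print(q, act_on_real_request(q))
```

The point of the split is that all of the exploration of "ignore" happens in `train_on_simulated_requests`, so the agent never has to try it on a real request to learn that it is bad.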
