1 min readApr 24, 2018
I mean: we don’t care if our model has some pathological behavior that is only activated on a particular input, we only care about the actual computations that the model is doing on the training set. Intuitively, this might make it a lot easier to notice if something is wrong.