Implicit bias in machine learning
- Ohad Shamir (Weizmann Institute of Science)
Most practical algorithms for supervised machine learning boil down to optimizing the average performance over a training dataset. However, it is increasingly recognized that although the optimization objective is the same, the manner in which it is optimized plays a decisive role in the properties of the resulting predictor. For example, when training large neural networks, there are generally many weight combinations that will perfectly fit the training data. However, gradient-based training methods somehow tend to reach those which, on the one hand, do not overfit, and on the other hand, are very brittle to adversarially crafted examples. Why the dynamics of these methods lead to such "implicit biases" is still far from being fully understood. In this talk, I'll describe several recent theoretical results related to this question, in the context of benign overfitting and adversarial examples.
Based on joint work with Gal Vardi and Gilad Yehudai.