Influence Functions in Machine Learning Tasks

This was our final course project for COMPSCI 689: Machine Learning.

Machine learning models are generally complex and it is difficult to interpret the learned parameters. It is thus hard to follow their training and behavior. It would be great if we could understand how the model is learning from the data and making predictions. To this effect, Koh and Liang [2017] showed how influence functions can be used to calculate the influence of a given training point on test prediction of a binary classification model, and reported some applications. In this work, we extend the influence functions framework to cover more Machine Learning tasks, so that they can be used more widely in this field to understand and improve training and performance. We investigate how to calculate the influence of populations of training samples, and extend the framework to regression and multi-class problems. We also look at how we can use influence functions to train a given model better.