Stochastic Second Order Optimization Methods: Theory and Applications
4:10pm Webster 11
Farbod Roosta-Khorasani ("Fred Roosta")
Abstract: Many machine learning, scientific computing, and data analysis applications require the solution of optimization problems involving a sum of a large number of functions. We first motivate the problem by describing an application from scientific computing, namely PDE-constrained optimization. We then consider a more general setting: minimizing a sum of n functions over a convex constraint set. For such problems, algorithms that carefully sub-sample to reduce n can improve computational efficiency while maintaining the original convergence properties. For second-order methods, we give quantitative convergence results for variants of Newton's method in which the Hessian or the gradient is uniformly sub-sampled. We then show that, under certain assumptions, we can extend our analysis to non-uniform sampling, which yields modified algorithms exhibiting more robustness and better dependence on problem-specific quantities, such as the condition number. We finally present a simple machine learning application demonstrating the properties of our methods.