Title: Implementation of Backtracking line search in Deep Neural Networks: Theory and Practice
Speaker: Tuyen Truong - University of Oslo
Date/Time: Friday, Aug 13 2021 - 11:00 am (GMT+7)
Video recording: https://youtu.be/EimWBpAUhTk
About the Speaker:
Tuyen Truong holds an undergraduate degree in mathematics from the University of Science (Ho Chi Minh City) and a PhD in mathematics from Indiana University. After postdoctoral positions at Syracuse University, the Korea Institute for Advanced Study, and The University of Adelaide, he joined the University of Oslo as an Associate Professor in mathematics. He works in both pure and applied mathematics, with a view towards applications in Deep Learning (which, as an amateur player of the game of Go, he was drawn to through the big news about AlphaGo). He is open to different viewpoints, new discussions, and collaborations.
Backtracking line search (or Armijo’s algorithm) has been around for more than 60 years. However, until around 2015-2016, there were still threads on Reddit and Stack Overflow claiming that it cannot be implemented successfully in Deep Neural Networks (DNN) because it is too expensive. About a year later, there was an implementation of Wolfe’s method (more complicated than Armijo’s) in DNN, but that work seems not to have been replicated by independent teams. Since August 2018, joint work of the speaker with Hang Tuan Nguyen (Axon AI Research) has successfully implemented Armijo’s algorithm (and its combination with Momentum and NAG) on reasonably large datasets (Cifar10 and Cifar100), and has also provided new theoretical results on Armijo’s method (for example, showing that its theoretical guarantees are better than known results for Wolfe’s algorithm). Since then, there have been similar implementations from other research teams, including one from Canadian institutions. This talk will overview the theoretical and practical considerations around implementing Armijo’s algorithm, and will include many illustrative experiments (from small-scale optimization, stochastic optimization, and DNN). Future directions are discussed at the end.
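For readers unfamiliar with the method, the following is a minimal sketch of backtracking line search with the Armijo (sufficient-decrease) condition, applied to plain gradient descent on a toy quadratic. All function names and parameter values here are illustrative defaults, not details from the talk or the speaker's implementation:

```python
import numpy as np

def armijo_step(f, grad, x, alpha0=1.0, beta=0.5, c=1e-4):
    """Backtracking line search: start from alpha0 and shrink the
    step size by a factor beta until the Armijo condition
        f(x - alpha*g) <= f(x) - c * alpha * ||g||^2
    holds, where g = grad(x). Parameter values are common textbook
    defaults, not those used in the speaker's experiments."""
    g = grad(x)
    fx = f(x)
    alpha = alpha0
    while f(x - alpha * g) > fx - c * alpha * g.dot(g):
        alpha *= beta
    return alpha

# Gradient descent with Armijo backtracking on f(x) = 0.5*||x||^2
f = lambda x: 0.5 * x.dot(x)
grad = lambda x: x
x = np.array([3.0, -4.0])
for _ in range(50):
    alpha = armijo_step(f, grad, x)
    x = x - alpha * grad(x)
print(f(x))  # decreases towards 0
```

The point of the backtracking loop is that the step size adapts automatically to the local landscape, which is exactly the property that makes the method attractive (and, historically, was claimed too expensive) for DNN training.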