Seminar: On optimal transport in machine learning and statistics: Computational, modeling, and theoretical perspectives

COGO Working Space, 4th Floor, Sun Grand City Ancora Residence, No.3 Luong Yen Street, Hai Ba Trung District, Hanoi, Vietnam

2019-10-24

10:00 am - 11:30 am

Nhat Ho

 

Abstract

From its origins in work by Monge and Kantorovich in the eighteenth and twentieth centuries, respectively, and through to the present day, the optimal transport (OT) problem has played a determinative role in machine learning, statistics, and data science. In the current era, the strong and increasing linkage between optimization and machine learning has brought new applications of OT to the fore. In this talk, we study the OT problem from three different perspectives.

From the computational viewpoint, we consider the entropic regularized version of the OT problem, which we refer to as entropic regularized OT, in a discrete setting, where we assume that the target and source probability distributions each have at most n support points. The Sinkhorn algorithm, a block coordinate descent algorithm, has been used widely in practice for solving entropic regularized OT and achieves a complexity upper bound of order n^2/ε^2 for approximating the OT problem, where ε denotes the tolerance. We demonstrate that the Sinkhorn algorithm can be accelerated directly by considering the monotone constant step scheme of accelerated randomized coordinate descent. The new algorithm, named the Randkhorn algorithm, has a complexity bound of order n^{7/3}/ε, which is better than that of the Sinkhorn algorithm in terms of ε, and achieves state-of-the-art practical performance.
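The Sinkhorn iterations referred to above alternate between rescaling the rows and the columns of a Gibbs kernel built from the cost matrix. A minimal sketch in Python (NumPy only; the function name, step counts, and the toy 4-point example are illustrative, not taken from the talk):

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=1000, tol=1e-9):
    """Block coordinate (Sinkhorn) iterations for entropic-regularized OT.

    a, b : source / target probability vectors (at most n support points each)
    C    : n x n cost matrix; eps : entropic regularization strength.
    Returns a transport plan whose marginals approximate a and b.
    """
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        v = b / (K.T @ u)                 # column-scaling block update
        u_next = a / (K @ v)              # row-scaling block update
        if np.max(np.abs(u_next - u)) < tol:
            u = u_next
            break
        u = u_next
    return u[:, None] * K * v[None, :]

# Toy example: two distributions on 4 support points with a random cost.
rng = np.random.default_rng(0)
a = np.full(4, 0.25)
b = np.array([0.1, 0.2, 0.3, 0.4])
C = rng.random((4, 4))
P = sinkhorn(a, b, C)
```

Each pass updates one block of dual scaling variables while holding the other fixed, which is why the method is a block coordinate descent; the Randkhorn acceleration discussed in the talk replaces this cyclic update with an accelerated randomized coordinate scheme.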

From the theoretical side, we consider an application of optimal transport to study the statistical efficiency of parameter estimation in Gaussian mixtures of experts, a class of piecewise regression models that have found applications in natural language processing, speech recognition, and deep learning. We introduce a novel notion of algebraic independence among expert functions and establish a connection between this algebraic independence and a certain class of partial differential equations (PDEs). Exploiting this connection allows us to derive convergence rates and minimax lower bounds for parameter estimation.

From the methodological standpoint, we propose novel and flexible joint-optimization approaches based on several notions of Wasserstein distance for the problem of multilevel clustering, which aims to simultaneously partition the data in each group and discover grouping patterns among groups in a potentially large, hierarchically structured corpus of data. Our methods run efficiently on large-scale real-world datasets consisting of hundreds of millions of data points via a parallel implementation with Apache Spark.
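As a small illustration of the kind of distance such clustering methods build on, the 1-D Wasserstein distance between two empirical measures with equally many samples has a closed form via sorting (a simplified special case for intuition only; the talk's methods operate on richer multilevel structures and general ground spaces):

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    """W_p distance between two 1-D empirical measures with the same number
    of samples: the optimal coupling is the monotone (sorted) matching."""
    xs = np.sort(np.asarray(x, dtype=float))
    ys = np.sort(np.asarray(y, dtype=float))
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))

# Shifting an empirical measure by a constant c moves it exactly W_p = c.
x = np.array([0.0, 1.0, 2.0])
print(wasserstein_1d(x, x + 1.0))  # 1.0
```

Because Wasserstein distances compare whole distributions rather than single points, they let a clustering objective treat each group of data as a distribution and cluster the groups themselves, which is the core idea behind the multilevel formulation.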

About the speaker

Nhat Ho is currently a postdoctoral fellow in the Electrical Engineering and Computer Sciences (EECS) Department at the University of California, Berkeley, where he is supervised by Professor Michael I. Jordan and Professor Martin J. Wainwright. Before going to Berkeley, he finished his Ph.D. degree in 2017 at the Department of Statistics, University of Michigan, Ann Arbor, where he was advised by Professor Long Nguyen and Professor Ya'acov Ritov. Going further back in time, he received his Bachelor's degree in Mathematics and Computer Science from Ho Chi Minh University of Science in 2011. His current research focuses on four principles of statistics, machine learning, and data science: heterogeneity, interpretability, stability, and scalability.
