Gradient compression for distributed training of machine learning models

Project Details

Program

Computer Science

Field of Study

computer science, mathematics, machine learning

Division

Computer, Electrical and Mathematical Sciences and Engineering

Faculty Lab Link

https://richtarik.org/

Project Description

Modern supervised machine learning models are trained on enormous amounts of data, which requires distributed computing systems. The training data is partitioned across the memory of the system's nodes, and in each step of the training process the updates computed by all nodes from their local data must be aggregated. This aggregation step requires communicating a large tensor, which is the bottleneck limiting the efficiency of the training method.

To mitigate this issue, various compression schemes (e.g., sparsification, quantization, dithering) have recently been proposed in the literature. However, many theoretical, system-level and practical questions remain open. In this project the intern will aim to advance the state of the art in some aspect of this field. As this is a fast-moving field, the details of the project will only be finalized together with the successful applicant. Background reading based on research on this topic done in my group:

https://arxiv.org/abs/1905.11261

https://arxiv.org/abs/1905.10988

https://arxiv.org/abs/1903.06701

https://arxiv.org/abs/1901.09437

https://arxiv.org/abs/1901.09269

https://www.frontiersin.org/articles/10.3389/fams.2018.00062/abstract

https://arxiv.org/abs/1610.05492

https://arxiv.org/abs/1610.02527
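
To make the idea of compressed aggregation concrete, here is a minimal sketch of one common sparsification operator, random-k: each node keeps k randomly chosen coordinates of its update and rescales them by d/k, so the compressed vector equals the original in expectation. This illustrates the general technique only and is not the method of any particular paper listed above. The function names, dimensions and synthetic gradients are assumptions made for the example.

```python
import random


def rand_k(vector, k, rng):
    """Keep k randomly chosen coordinates, scaled by d/k so the result is unbiased."""
    d = len(vector)
    kept = rng.sample(range(d), k)
    scale = d / k
    # Only these (index, value) pairs would need to be communicated.
    return {i: vector[i] * scale for i in kept}


def aggregate(sparse_updates, d):
    """Average the sparse updates from all nodes into one dense vector."""
    total = [0.0] * d
    for update in sparse_updates:
        for i, value in update.items():
            total[i] += value
    n = len(sparse_updates)
    return [t / n for t in total]


rng = random.Random(0)
d, n_nodes, k = 8, 4, 2  # hypothetical sizes, chosen for illustration

# Synthetic stand-ins for the gradients each node computes on its local data.
local_grads = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_nodes)]

# Each node sends k entries instead of d; the aggregate is a noisy estimate
# of the true average gradient.
compressed = [rand_k(g, k, rng) for g in local_grads]
estimate = aggregate(compressed, d)
```

Each node here communicates k = 2 of d = 8 coordinates per step. The price is additional variance in the aggregated update, and how that variance affects convergence is one of the questions the research in this area addresses.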

About the Researcher

Peter Richtarik

Professor, Computer Science

Computer, Electrical and Mathematical Sciences and Engineering Division

Affiliations

Computer Science
Statistics
Applied Mathematics and Computational Science

Education Profile

PhD, Operations Research, Cornell University, 2007
MS, Operations Research, Cornell University, 2006
Mgr, Mathematics, Comenius University, 2001
Bc, Management, Comenius University, 2001
Bc, Mathematics, Comenius University, 2000

Research Interests

Prof. Richtarik's research interests lie at the intersection of mathematics, computer science, machine learning, optimization, numerical linear algebra, high-performance computing and applied probability. He is interested in developing zeroth-, first- and second-order algorithms for convex and nonconvex optimization problems described by big data, with a particular focus on randomized, parallel and distributed methods. He is the co-inventor of federated learning, a Google platform for machine learning on mobile devices preserving privacy of users' data.

Selected Publications

R. M. Gower, D. Goldfarb and P. Richtarik. Stochastic block BFGS: squeezing more curvature out of data, Proceedings of The 33rd International Conference on Machine Learning, pp. 1869-1878, 2016
J. Konecny, J. Liu, P. Richtarik and M. Takac. Mini-batch semi-stochastic gradient descent in the proximal setting, IEEE Journal of Selected Topics in Signal Processing 10(2):242-255, 2016
P. Richtarik and M. Takac. Parallel coordinate descent methods for big data optimization, Mathematical Programming 156(1):433-484, 2016
R. M. Gower and P. Richtarik. Randomized iterative methods for linear systems, SIAM Journal on Matrix Analysis and Applications 36(4):1660-1690, 2015
O. Fercoq and P. Richtarik. Accelerated, parallel and proximal coordinate descent, SIAM Journal on Optimization 25(4):1997-2023, 2015

Desired Project Deliverables

Ideally, author or coauthor a research paper and submit it to a premier conference in the field (e.g., ICML, AISTATS, NeurIPS, ICLR).
