The increasing popularity of large-scale and fully decentralized computational architectures, fueled for instance by the advent of the “Internet of Things” and machine learning in networks of agents (sensors, mobile phones, etc.), motivates the development of efficient optimization algorithms adapted to such constraints. Beyond standard parallelized optimization, machine learning algorithms must be adapted to settings with limited control over the network dynamics. This workshop aims at presenting state-of-the-art techniques dealing with distributed and decentralized settings on trending topics such as adapting models and fully asynchronous algorithms.
Analysis of asynchronous optimization algorithms: simpler proofs and sharper bounds
Mikael Johansson - KTH
Many machine learning tasks are naturally posed as optimization problems. Examples include linear or logistic regression, training of support vector machines and (deep) neural network training. Increasingly often, data sets are so large that they cannot be conveniently handled in a single computer but need to be processed in a parallel and distributed manner. To fully reap the benefits of parllelization, it is essential that the optimization algorithms can run asynchronously and avoid the significant idle waiting associated with global synchronization of worker nodes.
By now, several optimization algorithms exist with guaranteed convergence under asynchrony. However, most proofs are very long and difficult to penetrate, convergence conditions on step-sizes and other algorithm parameters are often hard to verify, and the performance guarantees tend to scale poorly with increasing level of asynchrony.
In this talk, I will describe our efforts to provide powerful, yet simple to apply, results for analyzing asynchronous algorithms. I will first discuss how time-delay models allow to capture phenomena such as inconsistent read from a shared memory and time-varying return times of workers in a master-server architecture. I will then describe two new convergence results for delayed sequences and show how they allow for simple convergence proofs with sharper guarantees than the current state of the art. Experimental results will be used to validate our theoretical findings.
Decentralized Collaborative Learning of Personalized Models over Networks
Aurélien Bellet - Inria Lille
We consider a set of learning agents in a collaborative peer-to-peer network, where each agent learns a personalized model according to its own learning objective. The question addressed in this paper is: how can agents improve upon their locally trained model by communicating with other agents that have similar objectives? We introduce and analyze two asynchronous gossip algorithms running in a fully decentralized manner. Our first approach, inspired from label propagation, aims to smooth pre-trained local models over the network while accounting for the confidence that each agent has in its initial model. In our second approach, agents jointly learn and propagate their model by making iterative updates based on both their local dataset and the behavior of their neighbors. Our algorithm to optimize this challenging objective in a decentralized way is based on ADMM.
Stochastic reformulations of linear systems and efficient randomized algorithms
Peter Richtarik - University of Edinburgh
We propose a new paradigm for solving linear systems. In our paradigm, the system is reformulated into a stochastic problem, and then solved with a randomized algorithm. Our reformulation can be equivalently seen as a stochastic optimization problem, stochastically preconditioned linear system, stochastic fixed point problem and as a probabilistic intersection problem. We propose and analyze basic and accelerated stochastic algorithms for solving the reformulated problem, with linear convergence rates.