# Distributed Gradient Methods with Variable Number of Working Nodes

@article{Jakoveti2016DistributedGM, title={Distributed Gradient Methods with Variable Number of Working Nodes}, author={Du{\vs}an Jakoveti{\'c} and Dragana Bajovi{\'c} and Nata{\vs}a Kreji{\'c} and Nata{\vs}a Krklec Jerinki{\'c}}, journal={IEEE Transactions on Signal Processing}, year={2016}, volume={64}, pages={4080-4095} }

We consider distributed optimization where N nodes in a connected network minimize the sum of their local costs subject to a common constraint set. We propose a distributed projected gradient method where each node, at each iteration k, performs an update (is active) with probability pk, and stays idle (is inactive) with probability 1-pk. Whenever active, each node performs an update by weight-averaging its solution estimate with the estimates of its active neighbors, taking a negative gradient…
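
One iteration of the idling scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method in full: the ring topology, the quadratic local costs, the ball constraint set, and the schedules for the activation probability p_k and step size alpha_k are all assumptions made here for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 4                       # number of nodes
d = 2                       # dimension of the decision variable
A = rng.standard_normal((N, d, d))
b = rng.standard_normal((N, d))

def grad(i, x):
    """Gradient of the illustrative local cost f_i(x) = 0.5 * ||A_i x - b_i||^2."""
    return A[i].T @ (A[i] @ x - b[i])

def project(x, radius=10.0):
    """Euclidean projection onto the ball ||x|| <= radius (the common constraint set)."""
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

# ring topology: each node weight-averages with its two neighbors
neighbors = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}

x = [np.zeros(d) for _ in range(N)]
for k in range(200):
    p_k = 1.0 - 0.5 ** (k + 1)          # activation probability (assumed schedule)
    alpha_k = 1.0 / (k + 1)             # diminishing step size (assumed schedule)
    active = rng.random(N) < p_k        # each node is active independently w.p. p_k
    x_new = [xi.copy() for xi in x]
    for i in range(N):
        if not active[i]:
            continue                    # inactive nodes stay idle this iteration
        # weight-average with the *active* neighbors only
        peers = [j for j in neighbors[i] if active[j]]
        w = 1.0 / (len(peers) + 1)
        avg = w * x[i] + sum(w * x[j] for j in peers)
        # negative gradient step, then projection onto the constraint set
        x_new[i] = project(avg - alpha_k * grad(i, avg))
    x = x_new
```

Letting p_k grow toward 1 is what lets idle iterations stay cheap early on without sacrificing convergence later; the exact weighting and schedules in the paper differ from this toy choice.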

#### 24 Citations

Distributed first and second order methods with increasing number of working nodes

- Mathematics, Computer Science
- 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP)
- 2016

The presented mechanism relaxes the requirement that all nodes are active at all iterations k, which significantly improves the computational and communication efficiency of some recently proposed first and second order distributed methods for solving distributed optimization problems.

Distributed second order methods with variable number of working nodes

- Computer Science, Mathematics
- ArXiv
- 2017

It is demonstrated that the idling mechanism can be successfully incorporated in distributed second order methods as well, thus achieving the same order of convergence rate (R-linear) as the standard DQN, but with significantly cheaper updates.

Newton-like Method with Diagonal Correction for Distributed Optimization

- Mathematics, Computer Science
- SIAM J. Optim.
- 2017

This work proposes a class of distributed Newton-like methods, referred to as Distributed Quasi Newton (DQN), that approximate the Hessian inverse by: 1) splitting the Hessian into its diagonal and off-diagonal parts, 2) inverting the diagonal part, and 3) approximating the inverse of the off-diagonal part through a weighted linear function.
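
The splitting idea summarized in this blurb can be sketched in a few lines. This is a hedged illustration only: the first-order correction and the weighting parameter `theta` below are assumptions standing in for the paper's exact weighted linear function.

```python
import numpy as np

def dqn_like_inverse(H, theta=1.0):
    """Approximate H^{-1} by splitting H = D + O into diagonal and
    off-diagonal parts, inverting D exactly (cheap), and applying a
    linear correction for O (illustrative first-order form)."""
    D_inv = np.diag(1.0 / np.diag(H))   # exact inverse of the diagonal part
    O = H - np.diag(np.diag(H))         # off-diagonal part
    # weighted linear approximation of (D + O)^{-1}
    return D_inv - theta * D_inv @ O @ D_inv

# small diagonally dominant example: the approximation is close to exact
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])
approx = dqn_like_inverse(H)
exact = np.linalg.inv(H)
```

The approximation is exact whenever H is diagonal and degrades as the off-diagonal part grows, which is why such schemes suit diagonally dominant Hessians.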

Resource-Aware Exact Decentralized Optimization Using Event-Triggered Broadcasting

- Computer Science, Mathematics
- IEEE Transactions on Automatic Control
- 2021

The linearized augmented Lagrangian method is used to design an event-triggered decentralized algorithm that requires only light local computation at generic time instants and peer-to-peer communication at sporadic triggering time instants; results show its performance and superiority in exploiting communication resources.

Optimality Trade-offs For Distributed Estimation

- 2018

This paper proposes the Communication efficient REcursive Distributed estimatiOn algorithm, CREDO, for networked multi-worker setups without a central master node. CREDO is designed for scenarios in…

Accelerated Distributed Dual Averaging Over Evolving Networks of Growing Connectivity

- Computer Science, Mathematics
- IEEE Transactions on Signal Processing
- 2018

This paper extends the distributed dual averaging (DDA) subgradient algorithm to evolving networks of growing connectivity and provides quantitative evaluation of DDA acceleration for distributed optimization that is absent in the existing analysis.

Stochastic sub-gradient algorithm for distributed optimization with random sleep scheme

- Mathematics
- 2015

In this paper, we consider a distributed convex optimization problem of a multi-agent system with the global objective function as the sum of agents’ individual objective functions. To solve such an…

Inference and Optimization over Networks: Communication Efficiency and Optimality

- Computer Science
- 2018

This thesis proposes distributed recursive algorithms in which the networked entities simultaneously incorporate locally sensed information and information obtained from the neighborhood, targeting distributed setups that exhibit data parallelism and potentially analytically intractable loss functions.

Communication-Efficient Distributed Strongly Convex Stochastic Optimization: Non-Asymptotic Rates.

- Mathematics
- 2018

We examine fundamental tradeoffs in iterative distributed zeroth and first order stochastic optimization in multi-agent networks in terms of *communication cost* (number of per-node…

Distributed Event-Triggered Gradient Method for Constrained Convex Minimization

- Mathematics, Computer Science
- IEEE Transactions on Automatic Control
- 2020

This paper investigates the distributed gradient method for large-scale convex constrained problems with event-triggered consensus protocols. It shows that convergence can be ensured provided that the event-triggering threshold bound is square summable and the stepsize satisfies specific conditions characterized by the Lipschitz constant of the gradient.
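
The square-summable triggering rule mentioned here can be illustrated with a toy loop: a node rebroadcasts only when its state has drifted from the last broadcast value by more than a shrinking threshold e_k. The threshold schedule, constants, and the random state drift below are all assumptions for illustration, not the paper's setup.

```python
import numpy as np

def threshold(k, c=1.0, s=1.0):
    """e_k = c / (k+1)^s; the sequence (e_k) is square summable for s > 1/2."""
    return c / (k + 1) ** s

rng = np.random.default_rng(1)
x = np.zeros(3)                  # node's current state (toy, 3 components)
last_broadcast = x.copy()
events = 0
for k in range(1000):
    # toy local update: the state drifts a little each iteration
    x = x + 0.05 * rng.standard_normal(3) / (k + 1)
    # event trigger: broadcast only when drift exceeds the threshold
    if np.linalg.norm(x - last_broadcast) > threshold(k):
        last_broadcast = x.copy()
        events += 1
# `events` counts how many of the 1000 iterations triggered communication
```

The point of the square-summability condition is that the total triggering error stays bounded, so the event-triggered iterates can still converge exactly rather than to a neighborhood.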

#### References

Showing 1-10 of 66 references

Fast Distributed Gradient Methods

- Mathematics, Computer Science
- IEEE Transactions on Automatic Control
- 2014

This work proposes two fast distributed gradient algorithms based on the centralized Nesterov gradient algorithm and establishes their convergence rates in terms of the per-node communications K and the per-node gradient evaluations k.

Fast cooperative distributed learning

- Mathematics, Computer Science
- 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR)
- 2012

This work proposes a distributed gradient-like algorithm, built from the (centralized) Nesterov gradient method, that converges at rate O(log k/k), and demonstrates the gains obtained on two simulation examples: acoustic source localization and learning a linear classifier based on l2-regularized logistic loss.

Network Newton-Part I: Algorithm and Convergence

- Mathematics
- 2015

We study the problem of minimizing a sum of convex objective functions where the components of the objective are available at different nodes of a network and nodes are allowed to only communicate…

A fast distributed proximal-gradient method

- Computer Science, Mathematics
- 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
- 2012

This paper shows that this distributed proximal-gradient method for optimizing the average of convex functions, each of which is the private local objective of an agent in a network with time-varying topology, converges at rate 1/k, which is faster than the convergence rate of existing distributed methods for solving this problem.

Network Newton

- Mathematics, Computer Science
- 2014 48th Asilomar Conference on Signals, Systems and Computers
- 2014

The Network Newton method is proposed as a distributed algorithm that incorporates second order information via distributed evaluation of approximations to Newton steps; adaptive (A)NN is introduced in order to establish exact convergence.

Distributed Stochastic Subgradient Projection Algorithms for Convex Optimization

- Computer Science, Mathematics
- J. Optim. Theory Appl.
- 2010

This paper considers a distributed multi-agent network system where the goal is to minimize a sum of convex objective functions of the agents subject to a common convex constraint set, and investigates the effects of stochastic subgradient errors on the convergence of the algorithm.

Convergence analysis of distributed subgradient methods over random networks

- Mathematics, Computer Science
- 2008 46th Annual Allerton Conference on Communication, Control, and Computing
- 2008

This work considers the problem of cooperatively minimizing the sum of convex functions, where the functions represent local objective functions of the agents, and proposes a distributed subgradient method that uses averaging algorithms for locally sharing information among the agents.

Asynchronous gossip algorithms for stochastic optimization

- Computer Science
- Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference
- 2009

An asynchronous algorithm motivated by random gossip schemes, in which each agent has a local Poisson clock; it is proved that the gradients converge to zero with probability 1 and the iterates converge to an optimal solution almost surely.
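
The gossip mechanism summarized here — independent Poisson clocks, with a pairwise exchange on each tick — can be sketched as a simple averaging loop. The ring topology, unit clock rates, and plain pairwise averaging (without the gradient step) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 5
vals = rng.standard_normal(N)      # each agent holds one scalar
mean0 = vals.mean()
neighbors = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}  # ring

for _ in range(2000):
    # with rate-1 Poisson clocks, the next tick belongs to a uniformly
    # random agent; it then gossips with a random neighbor
    i = int(rng.integers(N))
    j = int(rng.choice(neighbors[i]))
    avg = 0.5 * (vals[i] + vals[j])
    vals[i] = vals[j] = avg        # pairwise averaging preserves the sum

# the agents' values contract toward the initial network-wide average
```

Because each exchange preserves the sum while shrinking disagreement, the values converge to consensus on the initial average; the cited algorithm interleaves such gossip steps with local (sub)gradient updates.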

Distributed Nesterov-like gradient algorithms

- Mathematics, Computer Science
- 2012 IEEE 51st IEEE Conference on Decision and Control (CDC)
- 2012

This paper presents a distributed, constant step size, Nesterov-like gradient algorithm that converges much faster than existing distributed (sub)gradient methods, with zero additional communications and very small additional computations per iteration, on a class of convex functions with Lipschitz continuous first derivative.

Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling

- Mathematics, Computer Science
- IEEE Transactions on Automatic Control
- 2012

This work develops and analyzes distributed algorithms based on dual subgradient averaging, provides sharp bounds on their convergence rates as a function of the network size and topology, and shows that the number of iterations required by the algorithm scales inversely in the spectral gap of the network.