
Optimization Trade-offs in Asynchronous Federated Learning: A Stochastic Networks Approach

arXiv, March 30, 2026

arXiv:2603.26231v1, announce type: new

Authors: Abdelkrim Alahyane (LAAS-SARA), Céline Comte (CNRS, LAAS-SARA), Matthieu Jonckheere (CNRS, LAAS-SARA)


Abstract: Synchronous federated learning scales poorly due to the straggler effect. Asynchronous algorithms increase the update throughput by processing updates upon arrival, but they introduce two fundamental challenges: gradient staleness, which degrades convergence, and bias toward faster clients under heterogeneous data distributions. Although algorithms such as AsyncSGD and Generalized AsyncSGD mitigate this bias via client-side task queues, most existing analyses neglect the underlying queueing dynamics and lack closed-form characterizations of the update throughput and gradient staleness. To close this gap, we develop a stochastic queueing-network framework for Generalized AsyncSGD that jointly models random computation times at the clients and the central server, as well as random uplink and downlink communication delays. Leveraging product-form network theory, we derive a closed-form expression for the update throughput, alongside closed-form upper bounds for both the communication round complexity and the expected wall-clock time required to reach an $\epsilon$-stationary point. These results formally characterize the trade-off between gradient staleness and wall-clock convergence speed. We further extend the framework to quantify energy consumption under stochastic timing, revealing an additional trade-off between convergence speed and energy efficiency. Building on these analytical results, we propose gradient-based optimization strategies to jointly optimize routing and concurrency. Experiments on EMNIST demonstrate reductions of 29% to 46% in convergence time and 36% to 49% in energy consumption compared to AsyncSGD.
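The abstract's link between concurrency, update throughput, and staleness can be illustrated with a generic textbook computation. The sketch below is not the paper's model or its closed-form expression: it applies standard mean value analysis (MVA) to a closed product-form network of single-server FIFO stations with exponential service times. The function name `mva_throughput`, the station layout (one server plus three clients), the mean service times, and the routing probabilities are all hypothetical values chosen for illustration.

```python
def mva_throughput(service_times, visit_ratios, n_tasks):
    """Mean value analysis for a closed network of single-server FIFO stations.

    service_times[k]: mean service time at station k (assumed exponential).
    visit_ratios[k]:  mean number of visits to station k per task cycle.
    n_tasks:          number of tasks circulating (the concurrency level).
    Returns (throughput in cycles per unit time, mean queue length per station).
    """
    queue_len = [0.0] * len(service_times)
    throughput = 0.0
    for n in range(1, n_tasks + 1):
        # Arrival theorem: an arriving task sees the network with n - 1 tasks.
        resp = [s * (1.0 + q) for s, q in zip(service_times, queue_len)]
        cycle_time = sum(v * r for v, r in zip(visit_ratios, resp))
        throughput = n / cycle_time
        # Little's law applied station by station.
        queue_len = [throughput * v * r for v, r in zip(visit_ratios, resp)]
    return throughput, queue_len


# Hypothetical setup: station 0 is the central server, stations 1..3 are clients.
# Each cycle visits the server once, then one client chosen with the given probability.
service_times = [0.1, 1.0, 2.0, 4.0]   # assumed mean service times
visit_ratios = [1.0, 0.5, 0.3, 0.2]    # assumed routing probabilities for clients

for n_tasks in (1, 2, 4, 8, 16):
    x, q = mva_throughput(service_times, visit_ratios, n_tasks)
    print(f"tasks={n_tasks:2d}  throughput={x:.3f}  tasks queued at clients={sum(q[1:]):.2f}")
```

Under these assumptions, raising the number of circulating tasks increases throughput with diminishing returns while more tasks wait at the clients, which is a rough proxy for how stale a gradient is when it finally reaches the server. Changing `visit_ratios` plays the role of the routing decision mentioned in the abstract.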

Subjects: Machine Learning (cs.LG); Performance (cs.PF); Optimization and Control (math.OC); Probability (math.PR)

Cite as: arXiv:2603.26231 [cs.LG] (or arXiv:2603.26231v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.26231

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Céline Comte (via CCSD proxy)

[v1] Fri, 27 Mar 2026 09:53:53 UTC (3,781 KB)




