Workshop on Distributed Computing, Optimization & Learning WDCL 2025
Schedule
Wednesday, September 3
Time | Speaker / Activity | Topic |
---|---|---|
09:00 – 09:15 | Registration and arrival | |
09:15 – 09:30 | Welcome by the Organizers | |
09:30 – 10:30 | Rüdiger Urbanke | The Universality Lens: Information-Theoretic Insights into Over-Parameterized Models and Generalization |
10:30 – 11:30 | Coffee break | |
11:30 – 12:15 | Salim El Rouayheb | Random Walk Learning and the Pac-Man Attack |
12:15 – 14:00 | Lunch | |
14:00 – 14:45 | Razane Tajeddine | Privacy-preserving and Fair Data Sharing from Vertically Partitioned Data |
14:45 – 15:30 | Themistoklis Charalambous | Communication-aware Distributed Coordination and Optimization - Part II |
15:30 – 17:30 | Poster session + Snacks and Coffee | |
19:00 | Informal gathering | |
Thursday, September 4
Time | Speaker / Activity | Topic |
---|---|---|
09:30 – 10:30 | Ananda Theertha Suresh | A Tale at Two Scales: Efficient Distributed Learning |
10:30 – 11:30 | Coffee break | |
11:30 – 12:15 | Alexandre Graell i Amat | Subgraph Federated Learning via Spectral Methods |
12:15 – 14:00 | Lunch | |
14:00 – 14:45 | Zheng Chen | Enhancing Differentially Private Decentralized Learning via Correlated Noise |
14:45 – 15:30 | Abdellatif Zaidi | Distributed Statistical Learning: Architectures, Algorithms and Information and Communication Views |
15:30 – 17:30 | Poster session + Snacks and Coffee | |
19:00 | Social Dinner | Max-Emanuel Brauerei |
Friday, September 5
Time | Speaker / Activity | Topic |
---|---|---|
09:30 – 10:30 | Mohammad Ali Maddah-Ali | Learning Theory Meets Coding Theory: A Foundation for General Coded Computing |
10:30 – 11:30 | Coffee break | |
11:30 – 12:15 | Alex Sprintson | Algorithms for Privacy-Preserving and Secure Computation in Untrusted Environments |
12:15 – 14:00 | Lunch | |
14:00 – 14:45 | Ali Khalesi | Tessellated Distributed Computing and Beyond: From Linear Foundations to Non-Linear Frontiers |
14:45 – 15:30 | Zoran Utkovski | AI for RAN or RAN for AI? |
15:30 – 17:30 | Concluding Remarks | |
Talks
Rüdiger Urbanke (École Polytechnique Fédérale de Lausanne)
The Universality Lens: Information-Theoretic Insights into Over-Parameterized Models and Generalization
Wednesday, September 3, 09:30-10:30
Abstract:
We investigate generalization through the lens of a Bayesian mixture learner using log-loss and an (almost) uniform prior over a vast hypothesis class. A central result reveals that the learner’s regret is not governed by the size of the hypothesis class, but rather by the cumulative probability mass of models close—measured via Kullback-Leibler (KL) divergence—to the true data-generating process. We term this quantity the weight of the hypothesis.
This gives rise to a natural definition of model simplicity: simple models are those with high weight and require fewer samples to generalize, while complex models have low weight and demand more data. This framework provides a rigorous and intuitive explanation for why over-parameterized models often generalize well—posterior concentration favors simple hypotheses when supported by data.
We also bridge theory with practice by noting that stochastic gradient descent with Langevin dynamics approximates Bayesian posterior sampling. This connection enables the theoretical learner to be realized using standard machine learning techniques augmented with ensemble learning.
Our analysis produces non-uniform regret bounds and connects with key practical concepts such as flat minima and model distillation. The results apply broadly across online, batch, and supervised learning settings, offering a unified, principled view of generalization in modern AI systems.
This is joint work with Meir Feder and Yaniv Fogel.
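For readers who want a concrete handle on the quantities described above, the following is a schematic rendering, in our own notation, of a standard Bayesian-mixture regret bound of this flavor; the precise statement, assumptions, and constants in the talk may differ.

```latex
% Schematic only: prior \pi over hypothesis class \mathcal{H}, true source p^*,
% and the "weight" as the prior mass of models within KL-radius \epsilon of p^*.
w(\epsilon) \;=\; \sum_{h \in \mathcal{H} \,:\, D_{\mathrm{KL}}\!\left(p^* \,\|\, p_h\right) \le \epsilon} \pi(h),
\qquad
\mathrm{Regret}_n \;\lesssim\; \log \frac{1}{w(\epsilon)} \;+\; n\,\epsilon .
```

In this reading, "simple" models are those for which $w(\epsilon)$ is large, so the $\log(1/w(\epsilon))$ term, and hence the number of samples needed to generalize, is small.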
Salim El Rouayheb (Rutgers University)
Random Walk Learning and the Pac-Man Attack
Wednesday, September 3, 11:30-12:15
Abstract:
Random walks (RWs) offer a communication-efficient and scalable primitive for decentralized learning, but they face serious robustness challenges under adversarial conditions. We introduce the Pac-Man attack, where a malicious node probabilistically terminates visiting walks, progressively undermining the learning process. Even if the system begins with multiple redundant walks, all RWs will eventually be killed. To counter this, we view RWs as a population that can multiply through self-duplication. We study the Average Crossing algorithm, where walks duplicate based solely on timing information between successive visits at each node. Our analysis shows that, under suitable parameter regimes, the RW population remains bounded yet persistent, with the probability of extinction bounded away from one. Moreover, we uncover a sharp phase transition in survival probability as a function of the duplication threshold.
Joint work with Xingran Chen and Parimal Parag.
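To give a feel for the dynamics described in the abstract, here is a small toy simulation in Python. The kill rule and the timing-based duplication rule below are our own simplifications, not the exact Pac-Man model or the Average Crossing algorithm from the talk, and all parameters are illustrative.

```python
# Toy simulation: random walks on a complete graph, an adversarial node that
# probabilistically "eats" visiting walks, and a duplication rule that fires
# when the gap since a node's previous visit is unusually large. The rules and
# parameters are illustrative assumptions, not the talk's algorithm.
import random

def simulate(n_nodes=30, n_walks=4, kill_prob=0.3, dup_factor=2.0, steps=5000):
    adversary = 0
    last_visit = [0] * n_nodes             # last time step each node was visited
    avg_gap = [float(n_nodes)] * n_nodes    # running average inter-visit gap
    walks = [random.randrange(1, n_nodes) for _ in range(n_walks)]
    for t in range(1, steps + 1):
        survivors = []
        for _ in walks:
            nxt = random.randrange(n_nodes)           # next node of this walk
            if nxt == adversary and random.random() < kill_prob:
                continue                              # the walk is eaten
            gap = t - last_visit[nxt]
            if gap > dup_factor * avg_gap[nxt]:
                survivors.append(nxt)                 # duplicate the walk
            avg_gap[nxt] = 0.9 * avg_gap[nxt] + 0.1 * gap
            last_visit[nxt] = t
            survivors.append(nxt)
        walks = survivors[:500]                       # cap to keep the toy cheap
        if not walks:
            return t                                  # extinction time
    return None                                       # population survived

print("extinction step:", simulate())
```

Varying kill_prob and dup_factor in this simplified setting is a way to explore survival versus extinction, though the toy says nothing about where the actual phase transition of the talk lies.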
Razane Tajeddine (American University of Beirut)
Privacy-preserving and Fair Data Sharing from Vertically Partitioned Data
Wednesday, September 3, 14:00-14:45
Abstract:
Predictive machine learning is increasingly reliant on large volumes of data to forecast future events. While large amounts of data lead to more accurate and powerful models, they also raise critical concerns about user privacy and fairness, particularly when the data contains sensitive or personally identifiable information. These challenges are further amplified when the data is distributed or partitioned across multiple parties. Combining data from different parties for training typically allows for better model performance, as it enables the model to learn from a more comprehensive and diverse dataset, possibly resulting in improved generalization and predictive power. However, in most cases, each party holds a portion of the data that cannot be directly shared due to legal, ethical, or regulatory constraints, including data protection laws like GDPR, institutional policies, or user consent agreements. Furthermore, data heterogeneity, the variation in data distributions across different clients, can introduce or exacerbate significant group fairness issues. This often results in models that perform well for the majority or well-represented groups but underperform for minority, underrepresented, or marginalized populations, leading to inequitable outcomes.
Themistoklis Charalambous (University of Cyprus)
Communication-aware Distributed Coordination and Optimization - Part II
Wednesday, September 3, 14:45-15:30
Abstract:
In today’s increasingly connected world, there is growing interest in a new generation of intelligent, often mobile, cooperative autonomous systems. These systems rely on advanced sensing and decision-making capabilities to coordinate with one another in distributed environments. Such coordination, which is essential in areas like intelligent transportation and factory automation, is typically carried out over wireless networks. In this context, communication is not just a supporting function but a core enabler of efficient, scalable, and robust cooperation. This presentation will begin by recapping the key ideas and findings introduced in last year’s talk, which focused on the fundamentals of communication-aware distributed coordination and optimization. Building on those results, we will present the latest developments in how wireless communication can be more tightly integrated into coordination and optimization strategies.
Ananda Theertha Suresh (Google Research)
A Tale at Two Scales: Efficient Distributed Learning
Thursday, September 4, 09:30-10:30
Abstract:
This talk discusses the recent rise in popularity of distributed learning at two very different scales. We start with cross-device federated learning, a method used to train relatively small models on user data from millions of devices. We then shift to the other end of the spectrum and explore the recent growth of distributed training for large language models. We examine the common bottlenecks that affect training at both scales: communication bandwidth and memory. Finally, we review a few of the recent solutions that address these bottlenecks and highlight some key open questions.
Alexandre Graell i Amat (Chalmers University of Technology)
Subgraph Federated Learning via Spectral Methods
Thursday, September 4, 11:30-12:15
Abstract:
We consider the problem of federated learning (FL) with graph-structured data distributed across multiple clients. In particular, we address the prevalent scenario of interconnected subgraphs, where interconnections between clients significantly influence the learning process. Existing approaches suffer from critical limitations, either requiring the exchange of sensitive node embeddings, thereby posing privacy risks, or relying on computationally intensive steps that hinder scalability.
To tackle these challenges, we propose FedLap, a novel framework that leverages global structure information via Laplacian smoothing in the spectral domain to effectively capture inter-node dependencies while ensuring privacy and scalability. We provide a formal analysis demonstrating that FedLap preserves privacy. Notably, FedLap is the first subgraph FL scheme with strong privacy guarantees. We also show that FedLap achieves competitive or superior utility compared to existing techniques.
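For background, the snippet below shows plain graph Laplacian smoothing of node features, the basic operation referenced in the abstract, in a generic centralized form. It is not the FedLap scheme itself, whose federated and spectral-domain aspects are the subject of the talk.

```python
# Generic graph Laplacian smoothing of node features (background only; this is
# NOT the FedLap algorithm). Solves (I + lam * L) X_smooth = X with L = D - A.
import numpy as np

def laplacian_smooth(adjacency: np.ndarray, features: np.ndarray, lam: float = 1.0):
    degrees = np.diag(adjacency.sum(axis=1))
    laplacian = degrees - adjacency
    n = adjacency.shape[0]
    return np.linalg.solve(np.eye(n) + lam * laplacian, features)

# Tiny example: a 4-node path graph with 2-dimensional node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.randn(4, 2)
print(laplacian_smooth(A, X, lam=0.5))
```

The solve corresponds to penalizing differences between neighbouring nodes' features, which is the sense in which the smoothing captures inter-node dependencies.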
Zheng Chen (Linköping University)
Enhancing Differentially Private Decentralized Learning via Correlated Noise
Thursday, September 4, 14:00-14:45
Abstract:
Decentralized learning facilitates collaborative training of a shared machine learning model among distributed agents without the need for a central server, by iteratively combining local computation and inter-agent mixing. Although each agent retains its dataset locally, sharing local models can still expose private information about the local training datasets to adversaries. To mitigate privacy attacks, a common strategy is to inject random artificial noise at each agent before exchanging local models between neighbors. However, this often leads to utility degradation due to the negative effects of accumulated privacy noise on the learning algorithm. In this talk, I will demonstrate how the privacy-utility tradeoff can be improved by leveraging the power of correlated noise. The key idea is to generate topology-aware correlated noise across distributed agents using a covariance-based approach, allowing the aggregated noise to largely cancel out during the mixing process. Our generalized correlated noise design unifies several state-of-the-art methods as special cases and outperforms existing independent and pairwise correlation schemes, improving model performance under formal privacy guarantees.
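As a concrete illustration of the cancellation idea, here is a minimal sketch of the pairwise-correlated special case mentioned at the end of the abstract: neighbouring agents add equal-and-opposite noise, which is invisible in the network-wide average. The covariance-based design discussed in the talk generalizes this pattern; the graph, sizes, and noise scale below are illustrative.

```python
# Minimal illustration of pairwise-correlated privacy noise on a graph:
# neighbours i and j add +z_ij and -z_ij to their models, so the noise cancels
# in the network average. This is the pairwise special case the abstract refers
# to, not the generalized covariance design of the talk.
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]          # a 4-agent ring
models = rng.normal(size=(4, 5))                  # 4 agents, 5 parameters each

noisy = models.copy()
for i, j in edges:
    z = rng.normal(scale=1.0, size=5)             # shared noise for edge (i, j)
    noisy[i] += z                                 # agent i adds +z
    noisy[j] -= z                                 # agent j adds -z

# Each individual model is perturbed, but the average is untouched.
print(np.allclose(noisy.mean(axis=0), models.mean(axis=0)))   # True
```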
Abdellatif Zaidi (Université Gustave-Eiffel and Huawei)
Distributed Statistical Learning: Architectures, Algorithms and Information and Communication Views
Thursday, September 4, 14:45-15:30
Abstract:
This talk is aimed at providing a survey of recent progress in the area of distributed statistical learning as well as its applications. First, we show that, perhaps in sharp contrast to what intuition suggests, statistical learning can perform better when applied distributively, rather than in a centralized manner. In particular, it will be shown that distributed learning has a smaller generalization error, i.e., is more robust to variations in the input data, and that the improvement over centralized processing increases with the number of clients. We also discuss multi-round interactive learning models (such as the popular FedAvg) and show that, in this case, more interactions with the parameter server do not result in a smaller generalization error. This suggests that local SGD, in which learning is performed on small batches of data and the resulting models are then aggregated suitably, is better than the popular SGD. Then, we show that data heterogeneity across clients, a phenomenon that is largely believed to be detrimental to efficient learning, can actually help generalize better.
If time permits, in the second part of this talk, we also discuss architectures, training algorithms, as well as a number of relevant connections with seemingly unrelated communication and information-theoretic problems, such as communication over Cloud Radio Access Networks with oblivious relays and multiterminal hypothesis testing under communication constraints.
Mohammad Ali Maddah-Ali (University of Minnesota Twin Cities)
Learning Theory Meets Coding Theory: A Foundation for General Coded Computing
Friday, September 5, 09:30-10:30
Abstract:
Running distributed learning on server clusters faces challenges such as straggler delays and adversarial behavior. Coded computation, which injects redundancy into the process, offers a promising remedy. However, existing methods, rooted in classical error-correcting codes, lack the flexibility required for the diverse and complex computations of modern distributed machine learning.
In this talk, we introduce a new framework for code design that is grounded in learning theory. Within this framework, the estimation error of coded computing is linked to the generalization gaps arising in a sequence of two nested learning problems. This perspective unlocks the rich toolbox of non-parametric learning, enabling us to systematically construct numerically stable encoders and decoders for general computations, while providing provable approximation guarantees. Numerical experiments further demonstrate the effectiveness of this approach on challenging tasks, including inference in models with millions of parameters.
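For context, the sketch below shows the classical, error-correcting-code style of coded computing that the abstract contrasts with: a matrix-vector product is encoded with a Vandermonde (MDS) code so that any k of n worker results suffice, tolerating stragglers. The talk's learning-theoretic framework targets general, possibly non-linear computations beyond this setting; the sizes and code choice here are illustrative assumptions.

```python
# Classical coded computing baseline (for context, not the talk's framework):
# encode the row-blocks of A with a Vandermonde/MDS code so that A @ x can be
# recovered from any k out of n worker results, tolerating n - k stragglers.
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 5                                   # k data blocks, n workers
A = rng.normal(size=(6, 4))
x = rng.normal(size=4)

blocks = np.split(A, k)                       # three 2x4 blocks of A
evals = np.arange(1, n + 1)                   # distinct evaluation points 1..5
G = np.vander(evals, k, increasing=True)      # 5x3 Vandermonde generator matrix
coded = [sum(G[w, i] * blocks[i] for i in range(k)) for w in range(n)]

results = {w: coded[w] @ x for w in range(n)} # each worker returns a 2-vector
survivors = [0, 2, 4]                         # pretend workers 1 and 3 straggle
Gs = G[survivors, :]                          # any k rows are invertible
decoded = np.linalg.solve(Gs, np.stack([results[w] for w in survivors]))
print(np.allclose(np.concatenate(decoded), A @ x))   # True
```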
Alex Sprintson (Texas A&M University)
Algorithms for Privacy-Preserving and Secure Computation in Untrusted Environments
Friday, September 5, 11:30-12:15
Abstract:
In many practical settings, users must rely on an untrusted cloud server to perform computational tasks. The main challenge in such cases is that the information a user wants to keep private might be exposed to a curious adversary hosting the computation service. We introduce several algebraic and code-theoretic techniques that balance efficiency and privacy. Unlike generic homomorphic encryption methods, which require high computational effort, our approaches utilize the unique properties of the specific tasks to enable fast and secure computation. We demonstrate how our techniques can be applied to data analysis and machine learning workloads.
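As generic background on the kind of algebraic building block such settings often use, and not the speaker's specific constructions, the snippet below shows additive secret sharing over a prime field: two non-colluding servers each see only a uniformly random share of the input, yet their partial results combine to the desired linear computation. The modulus, sizes, and linear task are illustrative assumptions.

```python
# Generic additive secret sharing over a prime field: each server sees a
# uniformly random share of x, yet the partial results sum to A @ x (mod p).
# Background illustration only -- not the specific techniques of the talk.
import numpy as np

p = 65_537                             # a prime modulus (small, to avoid overflow)
rng = np.random.default_rng(2)

x = rng.integers(0, p, size=4)         # private input vector
A = rng.integers(0, p, size=(3, 4))    # public linear computation

share1 = rng.integers(0, p, size=4)    # uniformly random share for server 1
share2 = (x - share1) % p              # x = share1 + share2 (mod p)

y1 = (A @ share1) % p                  # computed by server 1
y2 = (A @ share2) % p                  # computed by server 2
print(np.array_equal((y1 + y2) % p, (A @ x) % p))    # True
```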
Ali Khalesi (Institut Polytechnique des Sciences Avancées, Paris)
Tessellated Distributed Computing and Beyond: From Linear Foundations to Non-Linear Frontiers
Friday, September 5, 14:00-14:45
Abstract:
This talk introduces Tessellated Distributed Computing (TDC), a new framework for analyzing the interplay between computation, communication, and accuracy in large-scale distributed systems. By reformulating multi-user distributed computation of linearly decomposable functions as a sparse fixed-support matrix factorization problem, TDC leverages tessellation patterns and SVD-based decompositions to design schemes that achieve provably optimal tradeoffs in the zero-error regime and asymptotically tight bounds in the lossy regime. The analysis reveals sharp capacity characterizations, computation–communication tradeoffs, and concise error expressions derived from random matrix theory.
I will then discuss how these foundations pave the way for future extensions beyond linearity. In particular, I will highlight the challenges of non-linear decompositions, quantized computation, and structured approximations, and their implications for distributed optimization, large-scale learning, and federated inference. Time permitting, I will also touch on related open problems where tessellation-inspired approaches may reveal new limits and algorithmic insights for modern distributed systems.
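To unpack the reformulation mentioned in the first paragraph, here is a schematic statement, in our own notation, of how multi-user computation of linearly decomposable functions can be cast as fixed-support matrix factorization; the exact model, supports, and error criteria in the talk may differ.

```latex
% Our notation, schematic only. K users demand linear combinations of L basis
% sub-functions f_1,\dots,f_L of the data, collected in a demand matrix \Gamma.
F(\mathbf{w}) \;=\; \Gamma \,\big(f_1(\mathbf{w}),\dots,f_L(\mathbf{w})\big)^{\!\top},
\qquad
\Gamma \;\approx\; D\,E, \quad D \in \mathbb{R}^{K \times N},\; E \in \mathbb{R}^{N \times L},
% where N is the number of servers, the fixed support of E reflects which
% sub-functions each server may compute, the fixed support of D reflects which
% servers each user can hear, and exact factorization corresponds to zero error.
```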
Zoran Utkovski (Fraunhofer - Institut für Nachrichtentechnik, Heinrich-Hertz-Institut)
AI for RAN or RAN for AI?
Friday, September 5, 14:45-15:30
Abstract:
There has been a recent focus on integrating AI into cellular technology with the aim of enhancing mobile network efficiency, reducing power consumption, and retrofitting existing communication infrastructure. According to the AI RAN Alliance, this opens the way for research and innovation in the areas of “AI for RAN” (advancing RAN capabilities through AI) and “AI on RAN” (deploying AI services at the network edge through the RAN to offer new services to mobile users). In this context, we will present solutions for edge inference in which inference tasks are split between the device and the network side, with special attention to the emerging trade-offs between computation complexity and energy efficiency on the one side, and communication overhead and inference accuracy on the other. We will also relate this to “semantic communication” and discuss the underlying information-theoretic background. Selected aspects will be presented in the form of proof-of-concept demonstrations.
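To make the device/network split concrete, here is a minimal sketch of split inference with a quantized intermediate feature. The toy two-layer model, the sizes, and the 8-bit quantization step are illustrative assumptions only, chosen to show where the computation/communication trade-off arises.

```python
# Minimal sketch of split ("device/network") inference: the device runs the
# first part of a model and transmits a small quantized feature vector; the
# network side finishes the computation. Model and sizes are toy assumptions.
import numpy as np

rng = np.random.default_rng(3)
W_device = rng.normal(size=(16, 64))           # device-side layer
W_server = rng.normal(size=(10, 16))           # network-side layer

def device_part(x):
    h = np.maximum(W_device @ x, 0.0)          # local computation (ReLU layer)
    peak = float(np.abs(h).max())
    scale = peak / 127 if peak > 0 else 1.0    # 8-bit quantization of the feature
    return np.round(h / scale).astype(np.int8), scale

def server_part(h_q, scale):
    return W_server @ (h_q.astype(np.float32) * scale)   # finish inference

x = rng.normal(size=64)                        # sensor input on the device
payload, s = device_part(x)                    # only 16 bytes cross the air interface
print("predicted class:", int(np.argmax(server_part(payload, s))))
```

Moving the split point earlier or later in the model shifts the balance between on-device computation, uplink traffic, and achievable accuracy, which is the trade-off space the talk addresses.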