Fundamental limits of Byzantine-resilient distributed learning
Description
In a distributed learning setting, multiple worker machines are employed to help a main server with the expensive training of a machine learning model. Each worker is assigned a subtask, and the main server reconstructs the overall computation result from the workers' responses.
In the presence of faulty or malicious workers, called Byzantine workers, a common strategy is to distribute the subtasks redundantly [3]. Since this redundancy introduces a large computation overhead for the workers, strategies to reduce the overhead are required. One approach is to use interactive consistency checks at the main server, which can reduce the redundancy by up to 50% [1].
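To make the redundancy-based approach concrete, the following minimal Python sketch (an illustration under simplifying assumptions, not the scheme of [1] or [2]) lets each partial gradient be computed by r workers; the main server accepts a value only if at least b+1 returned copies agree, so a replication factor of r = 2b+1 tolerates b Byzantine workers by plain majority voting. All names and parameters are illustrative.

from collections import Counter

def assign_partitions(num_partitions, num_workers, r):
    # Cyclic redundant assignment: partition p is computed by workers
    # p, p+1, ..., p+r-1 (mod num_workers).
    return {p: [(p + i) % num_workers for i in range(r)] for p in range(num_partitions)}

def reconstruct(results, b):
    # results[p] holds the copies of partial gradient p returned by its assigned workers.
    # With at most b Byzantine workers and r = 2b+1 copies per partition, the honest value
    # always appears at least b+1 times, while any fabricated value appears at most b times.
    total = 0.0
    for p, copies in results.items():
        value, count = Counter(copies).most_common(1)[0]
        if count < b + 1:
            raise ValueError(f"partition {p}: no value confirmed by at least b+1 workers")
        total += value
    return total

# Example: 4 partitions, 6 workers, replication r = 3 tolerates b = 1 Byzantine worker.
# assignment = assign_partitions(4, 6, 3)
# gradient = reconstruct({0: [1.5, 1.5, 9.9], 1: [0.2, 0.2, 0.2]}, b=1)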
The interactive consistency checks are not free, however: they incur additional computation and communication cost. For specific parameter regimes, this cost is well studied. However, it is unknown how large this cost is in general. Therefore, we ask the following research questions:
1) How much computation is needed to guarantee error-free reconstruction of the computation result?
2) How much communication is needed?
3) What is the best trade-off between communication and computation cost?
The focus of this project is to study these research questions on a fundamental level. That is, we aim to characterize the least amount of communication and computation that is possible. The student will analyze these questions using mathematical tools, such as graph theory and information theory. The findings shall be compared against existing schemes [1,2] to evaluate their (sub-)optimality.
[1] C. Hofmeister, L. Maßny, E. Yaakobi and R. Bitar, "Byzantine-Resilient Gradient Coding Through Local Gradient Computations," in IEEE Transactions on Information Theory, vol. 71, no. 4, pp. 3142-3156, April 2025, doi: 10.1109/TIT.2025.3542896.
[2] S. Jain, L. Maßny, C. Hofmeister, E. Yaakobi and R. Bitar, "Interactive Byzantine-Resilient Gradient Coding for General Data Assignments," 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 2024, pp. 3273-3278, doi: 10.1109/ISIT57864.2024.10619596.
[3] R. Tandon, Q. Lei, A. G. Dimakis and N. Karampatziakis, "Gradient Coding: Avoiding Stragglers in Distributed Learning," in Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3368-3376, 2017.
Prerequisites
Mandatory:
- strong mathematical background
- prior knowledge in information theory
- basic knowledge in graph theory
- interest in theoretical fundamental research
Recommended:
- proficiency in algebra and probability theory
- basic knowledge in coding theory
Contact
luis.massny@tum.de, christoph.hofmeister@tum.de
Understanding Guarantees and Pitfalls of Differential Privacy
Description
Many popular applications, such as recommender systems, data mining, or machine learning, require the collection and processing of large datasets. In view of this ubiquitous data collection, users demand privacy, but guaranteeing privacy while providing useful data is challenging. Differential Privacy (DP) [1] has become a popular notion of data privacy, which quantifies the ability of a curious data analyst to identify the value of sensitive data points. To use DP in practical systems, it is important to understand the fundamental guarantees of a system that claims to ensure DP.
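Formally, following the standard definition in [1], a randomized mechanism M satisfies ε-differential privacy if Pr[M(D) ∈ S] ≤ exp(ε) · Pr[M(D') ∈ S] for all output sets S and all neighboring datasets D, D' that differ in a single entry. The following minimal Python sketch illustrates the classical Laplace mechanism from [1] for a counting query; the function name and parameters are illustrative, not taken from a specific library.

import numpy as np

def laplace_count(data, predicate, epsilon):
    # Counting query: how many records satisfy the predicate.
    # Its sensitivity is 1, since adding or removing one record changes the count
    # by at most 1, so Laplace noise with scale 1/epsilon yields epsilon-DP [1].
    true_count = sum(1 for record in data if predicate(record))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Example: noisy number of salaries above 50,000 with privacy parameter epsilon = 0.5.
# noisy = laplace_count([42000, 67000, 51000], lambda s: s > 50000, epsilon=0.5)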
While it is sometimes believed that DP guarantees hold unconditionally and even in the presence of arbitrary side information, it has been shown that it is not possible to provide privacy and utility without making assumptions about how the data are generated [2]. In particular, dependence (correlation) between different database entries can be exploited to break the alleged privacy guarantees [3].
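As a simple back-of-envelope illustration (our own, not the specific attack in [3]): if a database contains k entries that are perfect copies of one individual's record, then changing that individual changes k entries at once, and the standard group-privacy argument only yields Pr[M(D) ∈ S] ≤ exp(k·ε) · Pr[M(D') ∈ S]. Noise calibrated to single-entry neighbors therefore provides an effective privacy parameter of k·ε for that individual, which becomes meaningless as k grows.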
In this seminar topic, the student will familiarize herself with the definition and formal guarantees of DP and study the issues and pitfalls of DP, particularly with a focus on dependent data distributions. The student will summarize the findings in the form of a scientific presentation and a scientific article, based on her own reading of scientific papers. These can include, but are not necessarily limited to, the recommended references [1-3].
[1] C. Dwork and A. Roth, “The Algorithmic Foundations of Differential Privacy,” Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3-4, 2014.
[2] D. Kifer and A. Machanavajjhala, "No free lunch in data privacy," Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11).
[3] C. Liu, S. Chakraborty, and P. Mittal, “Dependence Makes You Vulnerable: Differential Privacy Under Dependent Tuples,” in Proceedings of the Network and Distributed System Security Symposium, 2016.
Contact
Luis Maßny (luis.massny@tum.de)