Fundamental Limits of Information-Theoretic Secure Aggregation
Description
Secure aggregation is a fundamental cryptographic technique used in distributed machine learning paradigms, such as federated learning, enabling a central server to compute the sum of users' inputs without learning anything about the individual inputs beyond that sum.
This seminar explores the fundamental information-theoretic limits of secure aggregation by analyzing two critical extensions to the standard model. The first paper [1] investigates a two-round communication protocol for handling unknown user dropouts, characterizing the optimal communication cost required. The second paper [2] extends the traditional client-server architecture to a hierarchical network (clients, relays, and server). It introduces a "relay security" constraint to keep intermediate nodes oblivious to user inputs and establishes the optimal trade-off between communication efficiency and key generation efficiency.
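To make the standard model concrete, here is a minimal sketch of one-time-pad-style information-theoretic secure aggregation: users hold correlated keys that sum to zero over a finite field, so the server recovers only the sum of the inputs. The prime modulus, vector length, and the trusted-dealer key distribution are illustrative assumptions, not details from the papers above.

```python
import secrets

P = 2**31 - 1  # illustrative prime modulus; inputs are vectors over Z_P

def make_keys(num_users, length):
    """Generate keys Z_1..Z_K that sum to zero mod P (trusted-dealer assumption)."""
    keys = [[secrets.randbelow(P) for _ in range(length)] for _ in range(num_users - 1)]
    # Last key is chosen so that each coordinate of the key sum is 0 mod P.
    keys.append([(-sum(col)) % P for col in zip(*keys)])
    return keys

def mask(w, z):
    """User sends X_i = W_i + Z_i (mod P); X_i alone is uniformly distributed, hiding W_i."""
    return [(a + b) % P for a, b in zip(w, z)]

def aggregate(masked):
    """Server sums the masked vectors; the keys cancel, leaving only the sum of inputs."""
    return [sum(col) % P for col in zip(*masked)]

# Three users, length-4 inputs
inputs = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
keys = make_keys(3, 4)
masked = [mask(w, z) for w, z in zip(inputs, keys)]
print(aggregate(masked))  # [15, 18, 21, 24]: the elementwise sum, nothing else
```

The two papers study refinements of exactly this setting: [1] asks how much communication is needed when some users drop out before sending their masked inputs, and [2] asks what happens when the masked inputs pass through intermediate relays.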
[1]. Zhao, Yizhou, and Hua Sun. "Information theoretic secure aggregation with user dropouts." IEEE Transactions on Information Theory 68.11 (2022): 7471-7484.
[2]. Zhang, Xiang, et al. "Optimal communication and key rate region for hierarchical secure aggregation with user collusion." IEEE Transactions on Information Theory 72.2 (2025): 1030-1050.
Prerequisites
Information Theory
Supervisor:
Multi-round Privacy in Federated Learning
Description
Federated learning (FL) allows a machine learning model to be trained in a distributed manner: the training data are collected and stored locally by users such as mobile devices or institutions. Training is coordinated by a central server and performed iteratively. In each iteration, the server sends the current global model to the users, who update their local models and send the local updates to the server for aggregation.
FL is intended to protect users' sensitive data, since the training data never leave the user devices. However, prior work has shown that the local updates still leak information about the local datasets. To address this leakage, SecAgg [1] was proposed: secure aggregation ensures that the server obtains only the aggregate of the local updates rather than each individual update.
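The core idea behind SecAgg can be sketched with pairwise additive masking: every pair of users agrees on a shared seed, user i adds the pseudorandom mask derived from that seed while user j subtracts it, and the masks cancel in the server's sum. This is a simplified illustration; the seed names, modulus, and the use of Python's `random` as a stand-in PRG are assumptions, and the real protocol in [1] adds key agreement, secret sharing, and dropout recovery.

```python
import random

P = 2**31 - 1  # illustrative modulus for the update vectors

def prg(seed, length):
    """Expand a shared seed into a pseudorandom mask (stand-in for a real PRG)."""
    rng = random.Random(seed)
    return [rng.randrange(P) for _ in range(length)]

def masked_update(i, w, seeds, length):
    """User i adds PRG(s_ij) for each j > i and subtracts it for each j < i."""
    x = list(w)
    for j, seed in seeds.items():
        m = prg(seed, length)
        sign = 1 if j > i else -1
        x = [(a + sign * b) % P for a, b in zip(x, m)]
    return x

users = [0, 1, 2]
length = 3
inputs = {0: [1, 1, 1], 1: [2, 2, 2], 2: [3, 3, 3]}
# Pairwise seeds would be agreed via key exchange in the real protocol; assumed given here.
seeds = {(i, j): random.randrange(P) for i in users for j in users if i < j}

def seeds_of(i):
    return {j: seeds[tuple(sorted((i, j)))] for j in users if j != i}

masked = [masked_update(i, inputs[i], seeds_of(i), length) for i in users]
total = [sum(col) % P for col in zip(*masked)]
print(total)  # [6, 6, 6]: the pairwise masks cancel in the sum
```

Each individual masked update looks uniformly random to the server, yet their sum equals the sum of the true updates, since every mask appears once with a plus sign and once with a minus sign.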
However, recent work [2] has shown that SecAgg preserves user privacy only within a single training round. Due to user selection in federated learning, by observing the aggregated models over multiple training rounds, the server is able to recover individual local models of the users.
The goal of this seminar is to study and understand SecAgg [1], the multi-round privacy leakage it suffers from, and how this problem is solved in [2].
[1]. Bonawitz, Keith, et al. "Practical secure aggregation for privacy-preserving machine learning." Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017.
[2]. So, Jinhyun, et al. "Securing secure aggregation: Mitigating multi-round privacy leakage in federated learning." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 8. 2023.