Fairness and Robustness in Federated Learning
Description
Due to inherent data heterogeneity, partial client participation, and the direct impact of client updates on the global model, fairness and robustness are two central concerns in federated learning.
Fairness can be evaluated by the performance (accuracy) of the global model on clients' local data, while robustness measures the system's resilience to misbehavior from malicious clients.
Statistical heterogeneity often underlies the tension between accuracy, fairness, and robustness. Robust aggregation rules mitigate the impact of outliers, but in doing so they may also filter out informative updates and hurt fairness; conversely, up-weighting rare clients to improve fairness makes it easier for the model to overfit to corrupted devices.
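To make this tension concrete, the toy sketch below (illustrative only, not taken from the papers listed afterwards) compares plain averaging, a robust coordinate-wise median, and a q-FFL-style loss-based reweighting on synthetic client updates; the toy updates, losses, and the parameter q are assumed values.

```python
# Toy comparison of aggregation rules; all data here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Five "honest" client updates plus one corrupted outlier.
honest = rng.normal(loc=1.0, scale=0.1, size=(5, 3))
corrupted = np.full((1, 3), 10.0)
updates = np.vstack([honest, corrupted])

# Plain FedAvg mean: the outlier drags the aggregate away from honest clients.
fedavg = updates.mean(axis=0)

# Robust aggregation (coordinate-wise median): suppresses the outlier,
# but would equally suppress a legitimate client whose data is simply rare.
robust = np.median(updates, axis=0)

# Fairness-oriented reweighting (q-FFL flavour): clients with higher local
# loss get more weight -- which here also amplifies the corrupted client.
local_losses = np.array([0.5, 0.6, 0.4, 0.5, 0.7, 5.0])  # assumed values
q = 1.0
weights = local_losses ** q
weights /= weights.sum()
fair = (weights[:, None] * updates).sum(axis=0)

print("FedAvg:              ", fedavg)
print("Median (robust):     ", robust)
print("Loss-weighted (fair):", fair)
```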
In this seminar, you will learn strategies for tackling fairness and robustness simultaneously in the federated learning setting. The following papers are a good starting point.
- Li, Tian, et al. "Fair resource allocation in federated learning." arXiv preprint arXiv:1905.10497 (2019).
- Blanchard, Peva, et al. "Machine learning with adversaries: Byzantine tolerant gradient descent." Advances in Neural Information Processing Systems 30 (2017).
- Li, Tian, et al. "Ditto: Fair and robust federated learning through personalization." International conference on machine learning. PMLR, 2021.
Supervisor:
Multi-round Privacy in Federated Learning
Description
Federated learning (FL) makes it possible to train a machine learning model in a distributed manner: the training data are collected and stored locally by users such as mobile devices or institutions. Training is coordinated by a central server and performed iteratively. In each iteration, the server sends the current global model to the users, who update their local models and send the local updates back to the server for aggregation.
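As a rough illustration of this loop, the sketch below simulates a single FedAvg-style round with a toy linear model; the local objective, the clients' data, and the learning rate are illustrative assumptions, not part of any specific protocol discussed here.

```python
# Minimal sketch of one federated training round (FedAvg-style).
import numpy as np

rng = np.random.default_rng(1)
global_model = np.zeros(3)   # weights of a toy linear model
lr = 0.1

def local_update(model, X, y, steps=5):
    """One client's local gradient steps on a least-squares objective."""
    w = model.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Each client holds its own (possibly heterogeneous) local dataset.
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]

# Server broadcasts the global model; clients train locally; server averages.
local_models = [local_update(global_model, X, y) for X, y in clients]
global_model = np.mean(local_models, axis=0)
print("updated global model:", global_model)
```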
FL was proposed to protect users' sensitive data, since the training data never leave the user devices. However, prior work has shown that the local updates still leak information about the local datasets. To mitigate this leakage, secure aggregation (SecAgg) [1] was proposed: it ensures that the server only obtains the aggregate of the local updates rather than each individual update.
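A minimal sketch of the pairwise-masking idea behind SecAgg is given below. It only assumes shared random masks that cancel in the sum; the actual protocol [1] additionally uses key agreement, secret sharing, modular arithmetic, and dropout handling, all omitted here.

```python
# Toy sketch of pairwise masking: the server sees only masked updates,
# yet their sum equals the true sum of the local updates.
import numpy as np

rng = np.random.default_rng(2)
n_clients, dim = 4, 3
updates = rng.normal(size=(n_clients, dim))   # true local updates (secret)

# Each pair (i, j) with i < j agrees on a shared random mask r_ij.
# Client i adds r_ij, client j subtracts it, so all masks cancel in the sum.
pair_masks = {(i, j): rng.normal(size=dim)
              for i in range(n_clients) for j in range(i + 1, n_clients)}

masked = updates.copy()
for (i, j), r in pair_masks.items():
    masked[i] += r
    masked[j] -= r

print("server's aggregate:", masked.sum(axis=0))
print("true aggregate:    ", updates.sum(axis=0))
```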
However, recent work [2] has shown that SecAgg only preserves the privacy of the users within a single training round. Due to user selection in federated learning, by observing the aggregated models over multiple training rounds the server is able to recover individual local models of the users.
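The following toy example illustrates the core observation studied in [2]: assuming a user's update stays (nearly) constant across rounds, differing participant sets let the server isolate that update from the securely aggregated sums alone. The concrete vectors and participant sets are assumptions for illustration.

```python
# Multi-round leakage despite secure aggregation (toy example).
import numpy as np

x1, x2, x3 = np.array([1., 0.]), np.array([0., 2.]), np.array([3., 1.])

# Round t: clients {1, 2, 3} participate; round t+1: only {1, 2}.
agg_round_t  = x1 + x2 + x3   # all the server learns in round t
agg_round_t1 = x1 + x2        # all the server learns in round t+1

# Differencing the two aggregates reveals client 3's individual update.
print("recovered x3:", agg_round_t - agg_round_t1)
```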
The goal of this seminar is to study and understand SecAgg [1], the multi-round privacy leakage it suffers from, and how this problem is solved in [2].
[1] Bonawitz, Keith, et al. "Practical secure aggregation for privacy-preserving machine learning." Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017.
[2] So, Jinhyun, et al. "Securing secure aggregation: Mitigating multi-round privacy leakage in federated learning." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 8. 2023.