### Seminars

## Random Walks for Decentralized Learning

#### Description

Fully decentralized schemes do not require a central entity and have been studied in [1, 2]. These works aim to reach consensus on a desirable machine learning model among all clients. We can mainly distinguish between i) gossip algorithms [3] where clients share their result with all neighbors, naturally leading to high communication complexities, and ii) random walk approaches like [4, 5] where the model is communicated only to a specific neighbor until matching certain convergence criteria. Such random walk approaches are used in federated learning to reduce the communication load in the network and at the clients’ side.

The main task of the student is to study the work in [5], which additionally accounts for the heterogeneity of the clients’ data. Further, drawbacks and limitations of the proposed approach should be determined.

[1] J. B. Predd, S. B. Kulkarni, and H. V. Poor, “Distributed learning in wireless sensor networks,” IEEE Signal Process. Mag., vol. 23, no. 4, pp. 56–69, 2006.

[2] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2011.

[3] S. S. Ram, A. Nedi ?c, and V. V. Veeravalli, “Asynchronous gossip algorithms for stochastic optimization,” in IEEE Conf. Decis. Control. IEEE, 2009, pp. 3581–3586.

[4] D. Needell, R. Ward, and N. Srebro, “Stochastic gradient descent, weighted sampling, and the randomized kaczmarz algorithm,” Adv. Neural Inf. Process. Syst., vol. 27, 2014.

[5] G. Ayache, V. Dassari, and S. E. Rouayheb, “Walk for learning: A random walk approach for federated learning from heterogeneous data,” arXiv preprint arXiv:2206.00737, 2022.

#### Prerequisites

- Machine Learning and Statistics

- Information Theory

#### Supervisor:

## Hard Problems in Coding Theory

#### Description

For many applications, including post-quantum cryptography, it is of interest to know which problems are difficult to solve.

The task of the student would be to familiarize themselves with the various difficult problems that appear in coding theory such as the syndrome decoding problem, code equivalence problem, etc. and their variations. They would then summarise what is known about them in terms of the difficulty of solving them e.g. best known complexity, whether they are known to be NP-hard, etc.

**References:**

- Barg, Alexander. "Complexity issues in coding theory." Algebraic Coding (1998).

- Berlekamp, Elwyn, Robert McEliece, and Henk Van Tilborg. "On the inherent intractability of certain coding problems (corresp.)." IEEE Transactions on Information Theory 24.3 (1978): 384-386.

#### Prerequisites

- Channel Coding
- Complexity Theory

#### Supervisor:

## Post-Quantum Cyptography based on Codes: Ternary Syndrome Decoding

**Keywords:**

code-based cryptography, ternary syndrome decoding with large weight, decoding attack

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [2].

Recently, different variants of the classical syndrome decoding problem (SDP) in the Hamming metric have been proposed [1,3].

The main reason for this is that it appears hard to build an efficient digital signature scheme around the classical SDP.

One such variant is the ternary syndrome decoding with large weight, in which the error has few or no zero-entries [1].

The goal of this topic is understanding the properties of the decoding problem and analyzing the cost of existing solvers such as Sterns algorithm [2].

**Main Paper:**

[1] Bricout, R., Chailloux, A., Debris-Alazard, T., & Lequesne, M. (2020). Ternary syndrome decoding with large weight. In *Selected Areas in Cryptography–SAC 2019: 26th International Conference, Waterloo, ON, Canada, August 12–16, 2019, Revised Selected Papers 26* (pp. 437-466). Springer International Publishing.

can be found here: https://arxiv.org/pdf/1903.07464.pdf

**References:**

[2] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. *arXiv preprint arXiv:2201.07119*.

[3] Baldi, M., Bitzer, S., Pavoni, A., Santini, P., Wachter-Zeh, A., & Weger, V. (2023). Generic Decoding of Restricted Errors. *arXiv preprint arXiv:2303.08882*.

#### Prerequisites

Security in Communications and Storage

#### Supervisor:

## Rank-Metric Codes and Their Applications

#### Description

Rank-matric codes are codes that live in a vector space that is endowed with a different metric than the Hamming metric: in the rank-metric the distance between two codewords, represented as matrices over a smaller field, is defined as the rank of their difference.

The theory of rank-metric codes has interesting connections to many topics in discrete mathematics and promissing applications to code-based cryptography and network coding.

In this seminar work, the student will study properties and constructions of rank-metric codes and one or more applications. The goal is to understand and summarize parts of the following papers:

[1] H. Bartz, L. Holzbaur, H. Liu, S. Puchinger, J. Renner, A. Wachter-Zeh (2022). “Rank-Metric Codes and Their Applications”. arXiv: 2203.12384 https://arxiv.org/pdf/2203.12384.pdf

[2] K. Marshall, (2016). “A study of cryptographic systems based on Rank metric codes”, Dissertation, University of Zurich https://www.zora.uzh.ch/id/eprint/127105/1/Diss%20Kyle.pdf

[3] T. Randrianarisoa, (2018). “Rank metric codes, codes using linear complexity and applications to public key cryptosystems”, Dissertation, University of Zurich https://www.zora.uzh.ch/id/eprint/153545/1/153545.pdf

[4] E. Gorla (2019). “Rank-metric codes”. arXiv: 1902.02650 https://arxiv.org/pdf/1902.02650.pdf

[5] E. Gorla and A. Ravagnani. (2018). “Codes endowed with the rank metric”. In: Network Coding and Subspace Designs. Springer. 3–23. https://link.springer.com/content/pdf/10.1007/978-3-319-70293-3.pdf

#### Supervisor:

## Decoding Schemes beyond Unique Decoding Radius

**Keywords:**

unique decoding radius, Reed-Solomon codes, Homogeneous interleaved codes, array codes

**Short Description:**

Error-correcting codes beyond the designed decoding radius with high probability are interesting in storage applications.

#### Description

In the literature on coding theory, there are two classes of codes: scaler codes and array codes. In the array codes, a codeword of a linear code over a finite field $\mathbb{F}_p$, for some prime number $p$ represents a row of the array codeword over $\mathbb{F}_{p^m}$, where $m > 1$ is referred to as the subpacketization. This class of codes is promising in storage applications. For example, a flash memory unit commonly stores any of the $2^m$ values as a binary column representation such that each bit located on a different page belongs to a particular $2^m$-ary memory cell and holds either $0$ or $1$. The remarkable property of array codes is that they are seen as codes over an extension field with the same parameters, then $m > 1$ is referred to as the extension degree. Homogeneous interleaved codes [1] [2], types of array codes, have a significant feature which is increasing the decoding radius beyond $\lfloor\frac{d-1}{2}\rfloor$, i.e., they decode up to $d-2$ errors with high probability.

The student's tasks are differentiating these classes of codes, understanding their applications, and focusing on array codes. Then the student summarizes the algorithms/schemes that correct errors beyond the unique decoding radius, i.e., beyond $\lfloor\frac{d-1}{2}\rfloor$ suggested by [1] and [2].

[1] V. Y. Krachkovsky and Yuan Xing Lee, "Decoding for iterative Reed-Solomon coding schemes," in IEEE Transactions on Magnetics, vol. 33, no. 5, pp. 2740-2742, Sept. 1997, doi: 10.1109/20.617715.

[2] J. J. Metzner and E. J. Kapturowski, "A general decoding technique applicable to replicated file disagreement location and concatenated code decoding," in IEEE Transactions on Information Theory, vol. 36, no. 4, pp. 911-917, July 1990, doi: 10.1109/18.53757.

#### Prerequisites

- Basics of Channel Coding
- Finite field theory
- Linear Algebra

#### Supervisor:

## String Reconstruction from Substring Compositions: Application to Polymer-Based Storage

**Keywords:**

Polymer-based data storage, string reconstruction, composition errors, insertions, deletions

#### Description

Polymer-based storage involves storing information strings as a chain of molecules, the masses of which indicate whether the encoded bit is a 0 or 1. The readout mechanism requires tandem mass spectrometers which output compositions of all possible substrings, i.e. the number of 0s and 1s in a particular substring. The authors of [1] demonstrated that unique string reconstruction is only possible for specific string lengths, and also presented an algorithm for the same. In [2-5], code constructions were presented to facilitate this for all string lengths and also to correct substitution errors in the measured compositions.

[1] J. Acharya, H. Das, O. Milenkovic, A. Orlitsky, and S. Pan, “String

reconstruction from substring compositions,” SIAM Journal on Discrete

Mathematics, vol. 29, no. 3, pp. 1340--1371, 2015.

[2] S. Pattabiraman, R. Gabrys and O. Milenkovic, "Coding for Polymer-Based Data Storage", arXiv:2003.02121, 2020

[3] R.Gabrys, S.Pattabiraman, and O.Milenkovic, "Mass error-correction codes for polymer-based data storage,” IEEE International Symposium on Information Theory, ISIT 2020, Los Angeles, CA, USA, June 21-26, 2020, pp. 25--30, IEEE, 2020.

[4] S.~Pattabiraman, R.~Gabrys, and O.~Milenkovic, “Reconstruction and error-correction codes for polymer-based data storage,” in IEEE Information Theory Workshop, ITW 2019, Visby, Sweden, August 25-28,2019, pp. 1--5, IEEE, 2019.

[5] R. Gabrys, S. Pattabiraman and O. Milenkovic, "Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions," 2020 IEEE Information Theory Workshop (ITW), 2021, pp. 1-5.