Wir verwenden Google für unsere Suche. Mit Klick auf „Suche aktivieren“ aktivieren Sie das Suchfeld und akzeptieren die Nutzungsbedingungen.

# Offered Theses

Please contact the doctoral researchers directly if you are interested in a Bachelor or Master thesis, a student job, an "Ingenieurspraxis" or a "Forschungspraxis". It is also usually possible to find a topic that matches your specific interests. Please include a curriculum vitae together with a list of attended courses when applying for a thesis. If your "Ingenieurspraxis" is selected to be supervised by one of our professors, please hand in the documents to Doris Dorn (Room N2401).

## Coding with Feedback

#### Description

In paper

"Correcting a Single Error in Feedback Channels" the problem of correcting a single error with feedback was investigated. It was mainly devoted to binary channels, namely, binary symmetric and asymmetric channels. A general theorem, which allows constructing strategies with one feedback, was proved. For the symmetric channel with one error, it was proved that with two feedbacks one can transmit as many messages as with complete feedback. For the asymmetric channel, some strategies for small lengths have been proposed. Later some results for the binary symmetric channel have been generalized for the q-ary symmetric channel. (These results are not published yet, I can send the draft).

There are multiple ways, how this research can be continued:

1. One can try to generalize the results from mentioned paper for the case of 2 errors or a constant number of errors.
2. The methods developed in the paper "Correcting a Single Error in Feedback Channels" allow us to construct optimal codes for the asymmetric channel with complete feedback and compute their sizes. However, it requires a lot of time. It would be interesting to prove a general formula for the size of such optimal codes.
3. Another direction is to consider non-symmetric q-ary channels.

## Coding theory in different metrics

#### Description

In this thesis, the student will study the mathematics of codes in diffent metrics such as the Hamming metric, (sum-)rank metric, column/row-cover metric, etc.

The focus can lie on similar mathematical ideas shared across different metrics, such as bounds on codes, good code constructions, decoding algorithms, applications.

## Code-based Cryptography

#### Description

In this thesis, the student will study the mathematics of linear codes and how they can be used to design cryptosystems.

## [identification] Idnetification and Secrecy with Physically-Unclonable-Functions (PUFs)

Keywords:
PUF secrecy identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

From previous work we have a fairly efficient implementation based Reed-Muller code which can be found at

Secrecy in this identification codes has also been implemented in unpublished work. Furthermore, the theoretical work on Identification with PUF's has been done in

The goal of the project will be to bridge the three topics and create practical and efficient secret identification codes in the PUF setting.

The working language will be in English.

Environment: this is a project in collaboration with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## Statistical Decoding

Keywords:
code-based cryptography, decoding attack

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [1].

In order to solve the generic decoding problem, algorithms from the  information set decoding (ISD) family can be used.
During the last 60 years, small improvements to this approach were made. During this time, other algorithms, such as statistical decoding [2], were proposed, but failed to achieve the performance of ISD [3].

Recently, a variant of statistical decoding was proposed that claims to perfom better than the best ISD variants for low code rates [4].

If you are interested, please write an email, then we'll discuss the details.

Main Paper:

Carrier, K., Debris-Alazard, T., Meyer-Hilfiger, C., & Tillich, J. P. (2022). Statistical Decoding 2.0: Reducing Decoding to LPN. arXiv preprint arXiv:2208.02201.

References:

[1] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. arXiv preprint arXiv:2201.07119.

[2] Jabri, A. A. (2001, December). A statistical decoding algorithm for general linear block codes. In IMA International Conference on Cryptography and Coding (pp. 1-8). Springer, Berlin, Heidelberg.

[3] Debris-Alazard, T., & Tillich, J. P. (2017, June). Statistical decoding. In 2017 IEEE International Symposium on Information Theory (ISIT).

[4] Carrier, K., Debris-Alazard, T., Meyer-Hilfiger, C., & Tillich, J. P. (2022). Statistical Decoding 2.0: Reducing Decoding to LPN. arXiv preprint arXiv:2208.02201.

#### Prerequisites

Channel coding

Security in Communications and Storage

Probability theory and statistics

## Private and Secure Federated Learning

#### Description

In federated learning, a machine learning model shall be trained on private user data with the help of a central server, the so-called federator. This setting differs from other machine learning settings in that the user data shall not be shared with the federator for privacy reasons and/or to decrease the communication load of the system.

Even though only intermediate results are shared, extra care is necessary to guarantee data privacy. An additional challenge arises if the system includes malicious users that breach protocol and send corrupt computation results.

The goal of this work is to design, implement and analyze coding- and information-theoretic solutions for privacy and security in federated learning.

#### Prerequisites

• Coding Theory (e.g., Channel Coding)
• Information Theory
• Machine Learning Basics

## [identification] Pseudo-Random Identification

Keywords:
random pseudo identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

One advantage can be gained by using pseudo randomness to generate both the input and the code itself.
Your task will be implementing the identification codes described in the attached pdf (an english translation of a paper published in russian in a russian journal) aiming at the fastest implementation and smallest collisions, and testing their performance in comparison to other current implementations.

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI from TUM and CeTI from TU Dresden, the latter having already some preliminary implementation of pseudo-random identification using various pseudo-random generators. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## Error Correcting Codes for Memories with (Partially) Defects

Keywords:
Linear Codes, Algebraic Codes, Error Correction , Masking Defects, Flash Memories, Phase-Change Memories

#### Description

For different applications, the demand for reliable memory solutions in particular for non-volatile memories such as phase-change memories (PCMs)  is rapidly increasing. PCM cells may become defective (also called stuck) either fully or partially if they fail in switching their states, and therefore these cells can only hold a single phase. In response to these defects, combined masking and error-correcting code constructions have been proposed, where masking is for hiding the defects while error-correcting is to compromise potential added-channel errors.  We want to investigate further code constructions such that less overall redundancy is required to handle these two types of errors. As an alternate, work for combined erasure errors and masking code constructions could be investigated.

#### Prerequisites

- Basic principle of Linear Algebra

- Channel Coding/Coding Theory

- Basic knowledge in Information Theory

#### Contact

M.Eng. Haider Al Kim

Doctoral Researcher

Technical University of Munich

Department of Electrical and Computer Engineering /

Coding and Cryptography (COD) Group

Email: haider.alkim@tum.de

## Deterministic K-Identification For The DMC With Power Constraint

Keywords:
Identification via channel, K-identification, deterministic codes
Short Description:
K-identification capacity of a DMC is derived.

#### Description

The student attempt to study the deterministic identification capacity

of a DMC subject to power constraint and generalize it for the K-identification.

#### Prerequisites

Basics of Information Theory and Channel Coding.

Familiarity with the fundamentals of Identification Theory

## [identification] Implementation of identification with universal hash functions

Keywords:
universal hash identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

aiming at the fastest implementation, and testing their performance in comparison to other current implementations.

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## Efficient Block Propagation in Cryptocurrency Networks

#### Description

Cryptocurrencies like Bitcoin and Ethereum use a decentralized ledger called Blockchain to track transactions. Whenever a new block is added to the Blockchain, the change is spread through the network using a gossip-like protocol. This process is known as block propagation.

To increase scalability, the efficiency of block propagation is crucial. This thesis aims to explore the information theoretic limits of block propagation, derive realistic models based on real data, and investigate innovative and efficient techniques for block propagation.

The thesis will be conducted at the Institute of Communications and Navigation at DLR (German Aerospace Center) in Oberpfaffenhofen.

#### Prerequisites

Required qualifications are

• basic knowledge of information theory
• programming experience in Matlab, C, or python.
• Interest in cryptocurrencies.

#### Contact

Interested applicants may contact Dr. Francisco Lázaro via email at francisco.lazaroblasco@dlr.de.

#### Supervisor:

Juan Diego Lentner Ibanez - Dr. Francisco Lázaro (DLR (German Aerospace Center))

## Channel Coding: Efficient Decoding for GC Codes and General Codes

Keywords:
channel coding, efficient decoding, generalized concatenated codes
Short Description:
We develop efficient decoders for short block codes.

#### Description

Arising applications, such as machine-to-maschine communication require error-correction codes with short information length.The design of such codes and efficient decoders is an open problem [1].

Recently, Reed-Muller codes have gained a lot of interest, because of thier good error-correction capability and their structure, which allows for low-complexity decoders, see, e.g., [2].

It has been known for quite some time that Reed-Muller codes belong to a more general class of codes: the Generalized Concatenated (GC) Codes [3].
This class allows for more flexible code design, e.g., with respect to the information rate of the code.
Hence, by transfering and refining results for Reed-Muller codes to GC codes, one could improve over existing solutions.

Another approach to obtain good results in the short-length regime is using the best-known codes [4].
Since these codes do usually not have a structure that enables efficient decoding, one has to perform decoding of a general linear code. The most efficient approaches are variants of Ordered Statistics Decoding (OSD) [5]. The idea for improving over state-of-the-art varaints is to encorporate recent improvements from another field of research: Information Set Decoding.

If you are interested in either of the directions (or have some other direction in mind), please write an email, then we'll discuss the details.

References:

[1] Co?kun, M. C., Durisi, G., Jerkovits, T., Liva, G., Ryan, W., Stein, B., & Steiner, F. (2019). Efficient error-correcting codes in the short blocklength regime. Physical Communication, 34, 66-79.

[2] Geiselhart, M., Elkelesh, A., Ebada, M., Cammerer, S., & ten Brink, S. (2021). Automorphism ensemble decoding of Reed–Muller codes. IEEE Transactions on Communications, 69(10), 6424-6438.

[3] Schnabl, G., & Bossert, M. (1995). Soft-decision decoding of Reed-Muller codes as generalized multiple concatenated codes. IEEE Transactions on Information Theory, 41(1), 304-308.

[4] Markus Grassl. "Bounds on the minimum distance of linear codes and quantum codes." Online available at http://www.codetables.de.

[5] Fossorier, M. P., & Lin, S. (1995). Soft-decision decoding of linear block codes based on ordered statistics. IEEE Transactions on Information Theory, 41(5), 1379-1396.

Channel coding

## Coding with Feedback

#### Description

In paper

"Correcting a Single Error in Feedback Channels" the problem of correcting a single error with feedback was investigated. It was mainly devoted to binary channels, namely, binary symmetric and asymmetric channels. A general theorem, which allows constructing strategies with one feedback, was proved. For the symmetric channel with one error, it was proved that with two feedbacks one can transmit as many messages as with complete feedback. For the asymmetric channel, some strategies for small lengths have been proposed. Later some results for the binary symmetric channel have been generalized for the q-ary symmetric channel. (These results are not published yet, I can send the draft).

There are multiple ways, how this research can be continued:

1. One can try to generalize the results from mentioned paper for the case of 2 errors or a constant number of errors.
2. The methods developed in the paper "Correcting a Single Error in Feedback Channels" allow us to construct optimal codes for the asymmetric channel with complete feedback and compute their sizes. However, it requires a lot of time. It would be interesting to prove a general formula for the size of such optimal codes.
3. Another direction is to consider non-symmetric q-ary channels.

## Code-based Cryptography: digital signatures

Keywords:
code-based cryptography, digital signatures

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [1].

The McEliece cryptosystem is a promising candidate for asymmetric encryption.
However, many attempts at constructing a code-based signature scheme have resulted in impractical parameters or security problems.

NIST's announcement of a competetion dedicated to standardizing post-quantum signatures has lead to the publication of several new code-based schemes

In this work we pick one of the proposals and analyse its (in)security [2,3,4].
If you are interested, please write an email, then we'll discuss the details.

References:

[1] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. arXiv preprint arXiv:2201.07119.

[2] Cho, J., No, J. S., Lee, Y., Koo, Z., & Kim, Y. S. (2022). Enhanced pqsigRM: Code-Based Digital Signature Scheme with Short Signature and Fast Verification for Post-Quantum Cryptography. Cryptology ePrint Archive.

[3] Baldi, M., Chiaraluce, F., & Santini, P. (2022). SPANSE: combining sparsity with density for efficient one-time code-based digital signatures. arXiv preprint arXiv:2205.12887.

[4] Barenghi, A., Biasse, J. F., Persichetti, E., & Santini, P. (2022). On the computational hardness of the code equivalence problem in cryptography. Cryptology ePrint Archive.

#### Prerequisites

Channel coding

Security in Communications and Storage

## Code-based Cryptography: Information Set Decoding

Keywords:
code-based cryptography, information set decoding

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [1].

In order to solve the generic decoding problem, algorithms from the  information set decoding (ISD) family can be used.
During the last 60 years, small improvements to this approach were made.

Recently, new variants of the classical decoding problem were proposed [2,3,4].
This work adapts strategies for the classical problem to one of the new settings.
The goal is to develop decoding algorithms, analyse their complexity and do a (proof of concept) implementation.
There is also a webpage which provides instances that we can attempt to solve.

If you are interested, please write an email, then we'll discuss the details.

References:

[1] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. arXiv preprint arXiv:2201.07119.

[2] Bricout, R., Chailloux, A., Debris-Alazard, T., & Lequesne, M. (2020). Ternary syndrome decoding with large weight. In Selected Areas in Cryptography–SAC 2019: 26th International Conference, Waterloo, ON, Canada, August 12–16, 2019, Revised Selected Papers 26 (pp. 437-466). Springer International Publishing.

[3] Weger, V., Khathuria, K., Horlemann, A. L., Battaglioni, M., Santini, P., & Persichetti, E. (2020). On the hardness of the Lee syndrome decoding problem. arXiv preprint arXiv:2002.12785.

[4] Baldi, M., Battaglioni, M., Chiaraluce, F., Horlemann-Trautmann, A. L., Persichetti, E., Santini, P., & Weger, V. (2020). A new path to code-based signatures via identification schemes with restricted errors. arXiv preprint arXiv:2008.06403.

#### Prerequisites

Channel coding

Security in Communications and Storage

## [identification] Idnetification and Secrecy with Physically-Unclonable-Functions (PUFs)

Keywords:
PUF secrecy identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

From previous work we have a fairly efficient implementation based Reed-Muller code which can be found at

Secrecy in this identification codes has also been implemented in unpublished work. Furthermore, the theoretical work on Identification with PUF's has been done in

The goal of the project will be to bridge the three topics and create practical and efficient secret identification codes in the PUF setting.

The working language will be in English.

Environment: this is a project in collaboration with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## Capacity upper Bounds for ISI Channels with Direct Detection

#### Description

We are interested in computing upper bounds (on capacity) for frequency-selective channels with a memoryless nonlineary at the transmitter/receiver.

One application for these bounds are short-reach fiber-optic communication systems with a single photodiode at the receiver. The photodiode is a memoryless nonlinearity, as it produces an output that is proportional to the squared magnitude of the input signal.

A simple upper bound for the above model is given in [Sec. III D, 2].

D. Plabst et al., "Achievable Rates for Short-Reach Fiber-Optic Channels With Direct Detection," in Journal of Lightwave Technology, vol. 40, no. 12, pp. 3602-3613, 15 June15, 2022, doi: 10.1109/JLT.2022.3149574.

#### Prerequisites

Information Theory

Linear System Theory

## Private and Secure Federated Learning

#### Description

In federated learning, a machine learning model shall be trained on private user data with the help of a central server, the so-called federator. This setting differs from other machine learning settings in that the user data shall not be shared with the federator for privacy reasons and/or to decrease the communication load of the system.

Even though only intermediate results are shared, extra care is necessary to guarantee data privacy. An additional challenge arises if the system includes malicious users that breach protocol and send corrupt computation results.

The goal of this work is to design, implement and analyze coding- and information-theoretic solutions for privacy and security in federated learning.

#### Prerequisites

• Coding Theory (e.g., Channel Coding)
• Information Theory
• Machine Learning Basics

## [identification] Pseudo-Random Identification

Keywords:
random pseudo identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

One advantage can be gained by using pseudo randomness to generate both the input and the code itself.
Your task will be implementing the identification codes described in the attached pdf (an english translation of a paper published in russian in a russian journal) aiming at the fastest implementation and smallest collisions, and testing their performance in comparison to other current implementations.

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI from TUM and CeTI from TU Dresden, the latter having already some preliminary implementation of pseudo-random identification using various pseudo-random generators. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## [quantum] Quantum Machine Learning for Communication

Keywords:
physical, layer, quantum, machine learning, non-linear

#### Description

As part of an ongoing project with Huawei we are looking into quantum machine learning algorithms applied to decoding at the end of an optical fiber in the non-linear regime.

So far we have tried only the quantum version of k-mean clustering, however the goal is to test further quantum algorithms, in particular quantum support vector machines next, and their classical quantum-inspired counterpart.

The projects will involve reading the literature on quantum machine learning algorithms and quantum-inspired algorithms, find or come up with an implementation (this will involve the use of quantum libraries, in particular so far we have use qiskit), and benchmark the performance.

#### Prerequisites

Knowledge of quantum mechanics or quantum information is highly recommended.

## Error Correcting Codes for Memories with (Partially) Defects

Keywords:
Linear Codes, Algebraic Codes, Error Correction , Masking Defects, Flash Memories, Phase-Change Memories

#### Description

For different applications, the demand for reliable memory solutions in particular for non-volatile memories such as phase-change memories (PCMs)  is rapidly increasing. PCM cells may become defective (also called stuck) either fully or partially if they fail in switching their states, and therefore these cells can only hold a single phase. In response to these defects, combined masking and error-correcting code constructions have been proposed, where masking is for hiding the defects while error-correcting is to compromise potential added-channel errors.  We want to investigate further code constructions such that less overall redundancy is required to handle these two types of errors. As an alternate, work for combined erasure errors and masking code constructions could be investigated.

#### Prerequisites

- Basic principle of Linear Algebra

- Channel Coding/Coding Theory

- Basic knowledge in Information Theory

#### Contact

M.Eng. Haider Al Kim

Doctoral Researcher

Technical University of Munich

Department of Electrical and Computer Engineering /

Coding and Cryptography (COD) Group

Email: haider.alkim@tum.de

## [identification] Implementation of identification with algebraic-geometry (Goppa) codes

Keywords:
goppa algebraic geometry codes identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

Your task will be implementing identification with Goppa codes, aiming at the fastest implementation, and testing their performance in comparison to other current implementations. The reference articles for this implementation are:

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## [identification] Implementation of identification with universal hash functions

Keywords:
universal hash identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

aiming at the fastest implementation, and testing their performance in comparison to other current implementations.

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## [identification] Applications of Identification Codes in V2X Communications

Keywords:
sumo, ns3, ns-3, vehicular, communication, identification, c++, Reed-Muller

#### Description

As part of the NewCom Project, new communication paradigms are investigated from an experimental perspective in order to construct proof-of-concept implementations that demonstrate the theoretical results obtained for Post-Shannon Communication schemes. In particular, this MSc thesis focuses on Identification Codes and their integration into a simulation environment where vehicular networks are modelled.

For this, the master student will first conduct a review of the state-of-the-art use cases for identification in the scientific literature and in form of patents, with an emphasis on V2X communications. By using an open-source V2X implementation based on LDR’s Simulation of Urban Mobility (SUMO) framework integrated with ns-3’s implementation of the ITS-G5 and LTE standards and conducting simulation in specific scenarios, the student will gain a first impression of the performance of the system using traditional transmission schemes. The integration of existing implementation of identification codes culminates this thesis, where KPIs will be defined in order to compare the advantages of using identification instead of transmission in the context of V2X communications.

The software used for the simulation of the vehicular network communication is ezCar2x

which build on and integrates the NS-3 (network simulation) and SUMO (traffic simulation) libraries

For the identification part and identification code based on Reed-Muller codes needs to be reimplemented (work in progress) from Python into C++ using the Givaro library

#### Prerequisites

• Knowledge of communications engineering, mobile communications, wireless channel models, signal processing, and channel coding techniques (experience in LTE/5G cellular networks is a plus)

• Interest in novel communication concepts as well in their practical implementation

• Software experience: MATLAB, C++ and Python (experience with ns-3 or SUMO is a plus)

• Comfortable working with Linux operative systems and distributed version control tools (e.g., gitlab)

• Goal-oriented and structured work style

#### Contact

To apply, Please send your application by e-mail to Roberto Ferrara (roberto.ferrara@tum.de) and Luis Torres-Figueroa (luis.torres.figueroa@tum.de) with the following documents:

• Curriculum vitae

• Short motivation (0.5 – 1 page)

## [security] Practical implementation of physical-layer semantic security

Keywords:
semantic, security, secrecy, programming, implementation

#### Description

The goal of this project is to implement in Python/Sagemath the security functions (at least one of four) described in https://arxiv.org/abs/2102.00983
Sagemath contains libraries for mosaics, BIBDs, etc, that can be used for the project.

Motivation:
There are various types of security definitions.
The mutual information based types, in increasing order of security requirement are

1. Weak secresy asks that the average mutual information of the eavesdropper I(M:E)/n goes to 0 for a uniform message M (average here means averaged over the blocklength n, an additional average over M is implicit in the mutual information)
2. Strong secrecy asks that the total mutual information I(M:E) goes to 0,
3. Semantic security asks that the total mutual informaiton I(M:E) goes to 0 for any distribution of the message M (and thus in particular for all distributions that pick any of two chosen messages with 1/2 probabilty)

Then there are the almost-equivalent respective indistiguishablity types  of security requirements (below |P-Q|_1 is the statistical distance and Exp_M is expectation value over M)

1. average indistinguishability 1/n Exp_M | P_{E|M} - P_E |_1 for a uniform message M goes to 0 (again average refers over the blocklegth n, clearly there is also the average over M)
2. total indistiguishability Exp_M | P_{E|M} - P_E |_1 for a uniform message M goes to 0
3. indistinguishability |P_{E|m} - P_{E|m'}|_1 for any two messages m and m' goes to 0.

Each of the indistiguishabilities can also be written using KL digvergence instead of statistical distance, in which case the conditions are exactly equivalent to their mutual information versions.

Strong secrecy is the standard security requirement considered in information-theoretic security, while semantic security is the minimum requirement considered in computational security.
Information-theoretic (physical-layer) security differs from computational security in that the secrecy is guaranteed irrespective of the power of the adversary, while in computational security E is computationally bounded. Computational security also assumes that the message is at least of a certain length for the schemes to work, and thus if the message to be secured is too small it needs to be padded to a larger message.

In practice, information theoretic security is expensive, because the messages that can be secured can be only as long as the keys that can be generated. However, in identification only a very small part of the message needs to be secured, which in computational security triggers padding and thus waste, but on the other side makes information-theoretic security accessible and not so expensive.

At the same time, the security of identification implicitly requires semantic security. It has been known for a while that hash functions provide information-theoretic strong secrecy. However, because the standard for information-theoretic security has been strong secrecy, before https://arxiv.org/abs/2102.00983 no efficient functions where known to provide information-theoretic semantic security.
We need an implementation of these type of functions so that we can integrate information-theoretic security into our identification project.

## [quantum] Realignment criterion and upper bounds in device-independent QKD

#### Description

This paper uses the partial transpose as a tool to derive upper bounds on device-independent QKD
https://arxiv.org/abs/2005.13511
In this project the goal is to try to generalize the above to the other tools like the reallignment criterion:
https://arxiv.org/abs/quant-ph/0205017
https://arxiv.org/abs/0802.2019

#### Prerequisites

basics of quantum information/quantum formalism

## [quantum] Semantic security of infinite-dimensional classical-quantum channels

#### Description

Generalize semantic security of classical-quantum channels to infinite dimensional channel (not necessarily gaussian)

#### Prerequisites

quantum information theory

## [quantum] Asymptotic continuity of restricted quantum relative entropies under general channels

Keywords:
quantum, relative entropy, Pinsker, reverse, inequality, information thoery, asymptotic, continuity

#### Description

Asypmtotic continuity is a property in the form of inequalities (classically known also as inequalities of the reverse-Pinker type) that is necessary to prove upper bounds on operational capacities.

The (quantum) relative entropy (also known as quantum divergence and classically also known as Kullbackt-Leibler divergence), can be used to define various entanglment measures many of which have a proven asymptotic continuity.

Of particular interest are the restricted quantum relative entropies defined by Marco Piani (https://arxiv.org/abs/0904.2705), many of which satisfy asymptotic continuity (A.S.)

In the above there are maybe 2-3 different proof styles.
We can group the results in the above as follows:

• A.S. for entropy, conditional entropies, mutual information, conditional mutual information
• A.S. for relative entropies with infimum over states on the second argument
• A.S. relative entropies with infimum over state *and maximization over measurement channels*

The goal of the project is to generalize the last case to asymptotic continuity for relative entropies with infimum over state and maximization over *general* channels.

Possible new proof directions are

#### Prerequisites

Knowledge of quantum information is highly recommended/required.
Knowledge of matrix analysis will be a strong advantage.

#### Contact

roberto.ferrara@tum.de

## [quantum] Practical protocols for quantum synchronization in classical network

Keywords:
quantum, network, synchronization

#### Prerequisites

Knowledge of quantum theory as provided by the course Algorithms in Quantum Theory or similar

## [quantum] Entanglement-measures upper bounds on device-independent distillable key

Keywords:
quantum, qkd, entanglement

#### Description

The goal of this work is to try to upper bound the device-independent distillable key in terms of locally restricted relative entropy of entanglement (an entanglement measure).

The following are relevant works/articles

#### Prerequisites

Strong background in quantum theory is required, preferably in quantum information theory, which is not covered by the course Algorithms in Quantum Theory

## Channel Coding: Efficient Decoding for GC Codes and General Codes

Keywords:
channel coding, efficient decoding, generalized concatenated codes
Short Description:
We develop efficient decoders for short block codes.

#### Description

Arising applications, such as machine-to-maschine communication require error-correction codes with short information length.The design of such codes and efficient decoders is an open problem [1].

Recently, Reed-Muller codes have gained a lot of interest, because of thier good error-correction capability and their structure, which allows for low-complexity decoders, see, e.g., [2].

It has been known for quite some time that Reed-Muller codes belong to a more general class of codes: the Generalized Concatenated (GC) Codes [3].
This class allows for more flexible code design, e.g., with respect to the information rate of the code.
Hence, by transfering and refining results for Reed-Muller codes to GC codes, one could improve over existing solutions.

Another approach to obtain good results in the short-length regime is using the best-known codes [4].
Since these codes do usually not have a structure that enables efficient decoding, one has to perform decoding of a general linear code. The most efficient approaches are variants of Ordered Statistics Decoding (OSD) [5]. The idea for improving over state-of-the-art varaints is to encorporate recent improvements from another field of research: Information Set Decoding.

If you are interested in either of the directions (or have some other direction in mind), please write an email, then we'll discuss the details.

References:

[1] Co?kun, M. C., Durisi, G., Jerkovits, T., Liva, G., Ryan, W., Stein, B., & Steiner, F. (2019). Efficient error-correcting codes in the short blocklength regime. Physical Communication, 34, 66-79.

[2] Geiselhart, M., Elkelesh, A., Ebada, M., Cammerer, S., & ten Brink, S. (2021). Automorphism ensemble decoding of Reed–Muller codes. IEEE Transactions on Communications, 69(10), 6424-6438.

[3] Schnabl, G., & Bossert, M. (1995). Soft-decision decoding of Reed-Muller codes as generalized multiple concatenated codes. IEEE Transactions on Information Theory, 41(1), 304-308.

[4] Markus Grassl. "Bounds on the minimum distance of linear codes and quantum codes." Online available at http://www.codetables.de.

[5] Fossorier, M. P., & Lin, S. (1995). Soft-decision decoding of linear block codes based on ordered statistics. IEEE Transactions on Information Theory, 41(5), 1379-1396.

Channel coding

## Coding with Feedback

#### Description

In paper

"Correcting a Single Error in Feedback Channels" the problem of correcting a single error with feedback was investigated. It was mainly devoted to binary channels, namely, binary symmetric and asymmetric channels. A general theorem, which allows constructing strategies with one feedback, was proved. For the symmetric channel with one error, it was proved that with two feedbacks one can transmit as many messages as with complete feedback. For the asymmetric channel, some strategies for small lengths have been proposed. Later some results for the binary symmetric channel have been generalized for the q-ary symmetric channel. (These results are not published yet, I can send the draft).

There are multiple ways, how this research can be continued:

1. One can try to generalize the results from mentioned paper for the case of 2 errors or a constant number of errors.
2. The methods developed in the paper "Correcting a Single Error in Feedback Channels" allow us to construct optimal codes for the asymmetric channel with complete feedback and compute their sizes. However, it requires a lot of time. It would be interesting to prove a general formula for the size of such optimal codes.
3. Another direction is to consider non-symmetric q-ary channels.

## Code-based Cryptography: digital signatures

Keywords:
code-based cryptography, digital signatures

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [1].

The McEliece cryptosystem is a promising candidate for asymmetric encryption.
However, many attempts at constructing a code-based signature scheme have resulted in impractical parameters or security problems.

NIST's announcement of a competetion dedicated to standardizing post-quantum signatures has lead to the publication of several new code-based schemes

In this work we pick one of the proposals and analyse its (in)security [2,3,4].
If you are interested, please write an email, then we'll discuss the details.

References:

[1] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. arXiv preprint arXiv:2201.07119.

[2] Cho, J., No, J. S., Lee, Y., Koo, Z., & Kim, Y. S. (2022). Enhanced pqsigRM: Code-Based Digital Signature Scheme with Short Signature and Fast Verification for Post-Quantum Cryptography. Cryptology ePrint Archive.

[3] Baldi, M., Chiaraluce, F., & Santini, P. (2022). SPANSE: combining sparsity with density for efficient one-time code-based digital signatures. arXiv preprint arXiv:2205.12887.

[4] Barenghi, A., Biasse, J. F., Persichetti, E., & Santini, P. (2022). On the computational hardness of the code equivalence problem in cryptography. Cryptology ePrint Archive.

#### Prerequisites

Channel coding

Security in Communications and Storage

## Code-based Cryptography: Information Set Decoding

Keywords:
code-based cryptography, information set decoding

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [1].

In order to solve the generic decoding problem, algorithms from the  information set decoding (ISD) family can be used.
During the last 60 years, small improvements to this approach were made.

Recently, new variants of the classical decoding problem were proposed [2,3,4].
This work adapts strategies for the classical problem to one of the new settings.
The goal is to develop decoding algorithms, analyse their complexity and do a (proof of concept) implementation.
There is also a webpage which provides instances that we can attempt to solve.

If you are interested, please write an email, then we'll discuss the details.

References:

[1] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. arXiv preprint arXiv:2201.07119.

[2] Bricout, R., Chailloux, A., Debris-Alazard, T., & Lequesne, M. (2020). Ternary syndrome decoding with large weight. In Selected Areas in Cryptography–SAC 2019: 26th International Conference, Waterloo, ON, Canada, August 12–16, 2019, Revised Selected Papers 26 (pp. 437-466). Springer International Publishing.

[3] Weger, V., Khathuria, K., Horlemann, A. L., Battaglioni, M., Santini, P., & Persichetti, E. (2020). On the hardness of the Lee syndrome decoding problem. arXiv preprint arXiv:2002.12785.

[4] Baldi, M., Battaglioni, M., Chiaraluce, F., Horlemann-Trautmann, A. L., Persichetti, E., Santini, P., & Weger, V. (2020). A new path to code-based signatures via identification schemes with restricted errors. arXiv preprint arXiv:2008.06403.

#### Prerequisites

Channel coding

Security in Communications and Storage

## Distributed Noise Generation for Secure Over-the-Air Computation with Applications in Federated Learning

Short Description:
Over-the-Air (OtA) computation is a promising approach with the potential to drastically reduce the communication overhead of wireless distributed data-processing systems (e.g. Federated Learning). Since this method, however, is prone to eavesdropping, artificial noise can be employed to secure the communication. An open problem however, is the distributed design of artifical noise among different users.

#### Description

Novel use cases for mobile communication networks include the aggregation of large amounts of data, which is stored in a distributed manner across network users. For instance, Federated Learning requires the aggregation of machine learning model updates from contributing users.

Over-the-Air (OtA) computation is an approach with the potential to drastically reduce the communication overhead of wireless distributed data-processing systems (e.g. Federated Learning). It exploits the multiple-access property and linearity of the wireless channel to compute sums of pre-processed data by the channel. This important property at the same time opens great opportunities for eavesdroppers to learn about the transmitted signal. If the legitimate receiver shall have exclusive access to the computation result, it is crucial to employ additional security measures.

Artificial noise can be employed to secure the communication. This noise is either generated by dedicated users jamming the communication [3], or by jointly designing the noise contribution of each user, [1][2]. The latter approach makes it possible to minimize the distortion at the legitimate receiver, but requires a centrally coordinated noise design. Therefore, an open problem is how to allow for the distributed design of artifical noise.

[1] Maßny, Luis, and Antonia Wachter-Zeh. "Secure Over-the-Air Computation using Zero-Forced Artificial Noise." arXiv preprint arXiv:2212.04288 (2022).
[2] Liao, Jialing, Zheng Chen, and Erik G. Larsson. "Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations." arXiv preprint arXiv:2210.02235 (2022).
[3] Yan, Na, et al. "Toward Secure and Private Over-the-Air Federated Learning." arXiv preprint arXiv:2210.07669 (2022).

#### Prerequisites

- basic knowledge in statistics and estimation theory
- basic knowledge about linear wireless channels

## [identification] Idnetification and Secrecy with Physically-Unclonable-Functions (PUFs)

Keywords:
PUF secrecy identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

From previous work we have a fairly efficient implementation based Reed-Muller code which can be found at

Secrecy in this identification codes has also been implemented in unpublished work. Furthermore, the theoretical work on Identification with PUF's has been done in

The goal of the project will be to bridge the three topics and create practical and efficient secret identification codes in the PUF setting.

The working language will be in English.

Environment: this is a project in collaboration with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## Capacity upper Bounds for ISI Channels with Direct Detection

#### Description

We are interested in computing upper bounds (on capacity) for frequency-selective channels with a memoryless nonlineary at the transmitter/receiver.

One application for these bounds are short-reach fiber-optic communication systems with a single photodiode at the receiver. The photodiode is a memoryless nonlinearity, as it produces an output that is proportional to the squared magnitude of the input signal.

A simple upper bound for the above model is given in [Sec. III D, 2].

D. Plabst et al., "Achievable Rates for Short-Reach Fiber-Optic Channels With Direct Detection," in Journal of Lightwave Technology, vol. 40, no. 12, pp. 3602-3613, 15 June15, 2022, doi: 10.1109/JLT.2022.3149574.

#### Prerequisites

Information Theory

Linear System Theory

## Statistical Decoding

Keywords:
code-based cryptography, decoding attack

#### Description

Due to the recent advances in quantum computers, the search for cryptosystems that survive quantum attacks is of great interest. Code-based cryptography is a promising candidate, since it is build on the NP-hard problem of decoding a random code [1].

In order to solve the generic decoding problem, algorithms from the  information set decoding (ISD) family can be used.
During the last 60 years, small improvements to this approach were made. During this time, other algorithms, such as statistical decoding [2], were proposed, but failed to achieve the performance of ISD [3].

Recently, a variant of statistical decoding was proposed that claims to perfom better than the best ISD variants for low code rates [4].

If you are interested, please write an email, then we'll discuss the details.

Main Paper:

Carrier, K., Debris-Alazard, T., Meyer-Hilfiger, C., & Tillich, J. P. (2022). Statistical Decoding 2.0: Reducing Decoding to LPN. arXiv preprint arXiv:2208.02201.

References:

[1] Weger, V., Gassner, N., & Rosenthal, J. (2022). A Survey on Code-Based Cryptography. arXiv preprint arXiv:2201.07119.

[2] Jabri, A. A. (2001, December). A statistical decoding algorithm for general linear block codes. In IMA International Conference on Cryptography and Coding (pp. 1-8). Springer, Berlin, Heidelberg.

[3] Debris-Alazard, T., & Tillich, J. P. (2017, June). Statistical decoding. In 2017 IEEE International Symposium on Information Theory (ISIT).

[4] Carrier, K., Debris-Alazard, T., Meyer-Hilfiger, C., & Tillich, J. P. (2022). Statistical Decoding 2.0: Reducing Decoding to LPN. arXiv preprint arXiv:2208.02201.

#### Prerequisites

Channel coding

Security in Communications and Storage

Probability theory and statistics

## MAB-Based Efficient Distributed ML on the Cloud

Keywords:
Distributed Machine Learning (ML), Multi-Armed Bandits (MABs), Cloud Simulations (AWS, GCP, ...)

#### Description

We consider the problem of running a distributed machine learning algorithm on the cloud. This imposes several challenges. In particular, cloud instances may have different performances/speeds. To fully leverage the performance of the instances, we want to characterize their speed and potentially use the fastest ones. To explore the speed of the instances while exploiting them (assigning computational tasks), we use the theory of multi-armed bandits (MABs).

The goal of the research intership is to start by implementing existing theoretical algorithms [1] and possibly adapting them based on the experimental observations.

[1] M. Egger, R. Bitar, A. Wachter-Zeh and D. Gündüz, Efficient Distributed Machine Learning via Combinatorial Multi-Armed Bandits, submitted to IEEE Journal on Selected Areas in Communications (JSAC), 2022.

#### Prerequisites

• Information Theory
• Machine Learning Basics
• Python (Intermediate Level)

## Private and Secure Federated Learning

#### Description

In federated learning, a machine learning model shall be trained on private user data with the help of a central server, the so-called federator. This setting differs from other machine learning settings in that the user data shall not be shared with the federator for privacy reasons and/or to decrease the communication load of the system.

Even though only intermediate results are shared, extra care is necessary to guarantee data privacy. An additional challenge arises if the system includes malicious users that breach protocol and send corrupt computation results.

The goal of this work is to design, implement and analyze coding- and information-theoretic solutions for privacy and security in federated learning.

#### Prerequisites

• Coding Theory (e.g., Channel Coding)
• Information Theory
• Machine Learning Basics

## [identification] Pseudo-Random Identification

Keywords:
random pseudo identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

One advantage can be gained by using pseudo randomness to generate both the input and the code itself.
Your task will be implementing the identification codes described in the attached pdf (an english translation of a paper published in russian in a russian journal) aiming at the fastest implementation and smallest collisions, and testing their performance in comparison to other current implementations.

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI from TUM and CeTI from TU Dresden, the latter having already some preliminary implementation of pseudo-random identification using various pseudo-random generators. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## Dimension of s-Lifted Reed-Solomon Codes

#### Description

In the work [1], we developed a method to analyze the dimension of quadratic-curve-lifted Reed-Solomon codes and its asymptotic behavior. The method gives tighter bound on the dimension compared to the estimation given in [2], which defines a more general class of lifted Reed-Solomon codes with regard to curves of higher degree.

In this work, the student is expected to extended the method in [1] to the general class of codes in [2] to analyse the dimension and its asymptotic behavior.

## [quantum] Quantum Machine Learning for Communication

Keywords:
physical, layer, quantum, machine learning, non-linear

#### Description

As part of an ongoing project with Huawei we are looking into quantum machine learning algorithms applied to decoding at the end of an optical fiber in the non-linear regime.

So far we have tried only the quantum version of k-mean clustering, however the goal is to test further quantum algorithms, in particular quantum support vector machines next, and their classical quantum-inspired counterpart.

The projects will involve reading the literature on quantum machine learning algorithms and quantum-inspired algorithms, find or come up with an implementation (this will involve the use of quantum libraries, in particular so far we have use qiskit), and benchmark the performance.

#### Prerequisites

Knowledge of quantum mechanics or quantum information is highly recommended.

## [identification] Implementation of identification with algebraic-geometry (Goppa) codes

Keywords:
goppa algebraic geometry codes identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

Your task will be implementing identification with Goppa codes, aiming at the fastest implementation, and testing their performance in comparison to other current implementations. The reference articles for this implementation are:

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## [identification] Implementation of identification with universal hash functions

Keywords:
universal hash identification

#### Description

Identification is a communication scheme that allows rate doubly exponential in the blocklemght, with the tradeoff that identities cannot be decoded (as messages do) but can only be verified.

The double exponential growth presents various challenges in the finite regime: there are heavy computational costs introduced at the encoder and decoder and heavy trade-offs between the error and the codes sizes.

The ultimate goal is to find a fast, reliable implementation while still achieving large code sizes.

Identification codes can be achieved by first removing the errors from the channel with regular transmission channel coding, and then sending a challenge though the corrected channel. For every identity i, The channenge is generated by picking a random input m and computing the corresponding output T_i(m) using a function T_i that depends on the identity. The challenge is then the pair m,T_i(m) and the receiver wanting to verify an identity j will verify whether j=i by testing the challenge. This is done by recomputing the output with T_j and verifying whether T_j(m)= T_i(m). The errors are reduced by ensuring that the various functions collide on a small fraction of the possible inputs.

It turns out that choosing good sets of funtions {T_i} is the same as choosing error-correction codes {c_i} with large distance, where now each codeword c_i defines a function by mapping positions m (sometimes called code locators) to symbols c_im of the codeword.
We can thus construct identification codes by choosing error-correction codes where we are only interested in the performance of the error correction encoders (we are not interested in the error-correction decoder or error-correction codes).

aiming at the fastest implementation, and testing their performance in comparison to other current implementations.

For reference, our previous work on identification based on Reed-Solomon and Reed-Muller code can be found at

The coding will be in Python/Sagemath.
The working language will be in English.

Environment: we collaborate with LTI. At LNT and LTI there is currently a lot of funding for research in identification. Therefore you will find a large group of people that might be available for discussion and collaboration.

## [quantum] Realignment criterion and upper bounds in device-independent QKD

#### Description

This paper uses the partial transpose as a tool to derive upper bounds on device-independent QKD
https://arxiv.org/abs/2005.13511
In this project the goal is to try to generalize the above to the other tools like the reallignment criterion:
https://arxiv.org/abs/quant-ph/0205017
https://arxiv.org/abs/0802.2019

#### Prerequisites

basics of quantum information/quantum formalism

## On the Equivalence of Identification and Authentication

Keywords:
Identification via channel, identification codes, authentication, authentication codes
Short Description:
A Certain equivalence of identification and authentication would be shown.

#### Description

It would be shown that under suitable formulations (preserving all salient features) the two problem of Identification (Ahlswede and Dueck, 1989) and Authentication (Simmons, G. J. 1984) are in essence very close to each other. This equivalency was conjectured first by M. S. Pinsker. In this research internship the student is expected to address this conjecture. Both problems must be studied separately and then the similar essence of them should be drawn out. In particular the identification codes and authentication codes along with theire relation will be investigated.

#### Prerequisites

1. Background in Information Theory and Channel Coding
2. Familiarity with fundamentals of Identification Theory

References:

1. Simmons, G. J. 1984, “Message authentication: a game on hypergraphs,” Congressus Numer. 45:161-192.
2. Simmons, G. J. 1982, “A game theory model of digital message authentication,”  Congressus Numer., 34, 413-424
3. Simmons, G. J. 1985, “Authentication theory/coding theory,” in: Advances in Cryptology: Proceedings of CRYPTO 84, Lecture Notes in Computer Science, vol. 196, Springer-Verlag, Berlin, pp. 411-432.
4. E. Gilbert, F. J. MacWilliams and N.J. A. Sloane, 1974, “Codes which detect deception,” Bell System Tech. J., 53, 405-424.
5. R. Ahlswede and G. Dueck, “Identification via channels,” in IEEE Trans. on Inf. Theory, vol. 35, no. 1, pp. 15-29, Jan. 1989, doi: 10.1109/18.42172.
6. L. A. Bassalygo, M. V. Burnashev, “Authentication, Identification, and Pairwise Separated Measures”, Problems Inform. Transmission, 32:1 (1996), 33–39

## Coding with Feedback

#### Description

In paper

"Correcting a Single Error in Feedback Channels" the problem of correcting a single error with feedback was investigated. It was mainly devoted to binary channels, namely, binary symmetric and asymmetric channels. A general theorem, which allows constructing strategies with one feedback, was proved. For the symmetric channel with one error, it was proved that with two feedbacks one can transmit as many messages as with complete feedback. For the asymmetric channel, some strategies for small lengths have been proposed. Later some results for the binary symmetric channel have been generalized for the q-ary symmetric channel. (These results are not published yet, I can send the draft).

There are multiple ways, how this research can be continued:

1. One can try to generalize the results from mentioned paper for the case of 2 errors or a constant number of errors.
2. The methods developed in the paper "Correcting a Single Error in Feedback Channels" allow us to construct optimal codes for the asymmetric channel with complete feedback and compute their sizes. However, it requires a lot of time. It would be interesting to prove a general formula for the size of such optimal codes.
3. Another direction is to consider non-symmetric q-ary channels.

## Seminar Topics

The three Seminars "Seminar on Coding and Cryptography", "Seminar on Digital Communications" and "Seminar on Optical Communications" are organized jointly.