Bachelorarbeiten
Sliding Window Decoding of Spatially Coupled Product-Like Codes
Beschreibung
This work covers the implementation and theoretical analysis of sliding window decoding of spatially coupled product-like codes. The student will explore the performance-complexity tradeoff for eBCH component codes and precoded polar component codes under suboptimal component decoders like the Chase decoder or the SCL decoder. We consider adaptive coding for channels with state known at the transmitter and compute iterative decoding thresholds of corresponding code ensembles.
You may also contact me with your own project idea related to iterative decoding.
Voraussetzungen
"Channel Coding" lecture
In-depth programming skills in Matlab, Python or C.
Recommended: Lecture "Channel Codes for Iterative Decoding"
Betreuer:
Dispersion Compensation in IM/DD Systems Using the Gerchberg-Saxton Algorithm
Beschreibung
The Gerchberg-Saxton algorithm is an iterative phase retrieval method originally developed in optics, and it has been adapted for electronic dispersion compensation in optical communication systems. The student will implement the algorithm and apply it to intensity-modulation/direct-detection (IM/DD) systems to explore its potential for waveform optimization. The focus lies on simulating the algorithm's behavior in the presence of fiber dispersion and analyzing convergence properties and performance.
Betreuer:
Thesis in Optical Communications and Receiver Design
Beschreibung
Please reach out if you are interested in a thesis in any of my research fields. Possible areas include optical communications, particularly physical modeling and nonlinearity mitigation for single-mode fiber, and aspects of receiver design, such as receivers for channels with memory.
A good background in optical communication systems and applied information theory is preferable, but the requirements generally depend on your interests.
Please include a description of your interests and corresponding academic background in your application. If you have a thesis idea, I am happy to discuss your suggestions. Also, I am available to supervise external theses as long as they are in my field of expertise.
Betreuer:
Beyond Shannon: Exploring Rényi Entropy and Its Applications
Beschreibung
A foundational concept in information theory is Shannon entropy. However, Shannon entropy does not always provide sufficient flexibility when dealing with various optimization problems, robustness considerations, or scenarios where fine control over uncertainty quantification is required. In 1961, Rényi provided a parametric generalization of Shannon entropy [1], allowing for a more nuanced analysis of information measures.
Rényi entropy has found applications in diverse fields such as hypothesis testing, machine learning, privacy and security (e.g., differential privacy), and statistical physics. The target of this project is to understand the difference between Shannon and Rényi entropy, conditional entropy, and divergence, as well as its applications in both theoretical and applied research.
[1] A. Rényi, “On measures of entropy and information,” in Proc. 4th Berkeley Symp. Mathematics, Statistics, and Probability, vol. 1, Berkeley, CA, USA, 1961, pp. 547–561.
Betreuer:
Topics in Intelligent Reflecting Surfaces (IRS), Integrated Sensing and Communication and Wireless Communication
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.
Voraussetzungen
Prerequirements:
- Wireless Communication
- Mobile Communication
- Information Theory
- Multi-User Information Theory
Betreuer:
Thesis in Polar Coding, Probabilistic Shaping, and Applied Information Theory
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.
Betreuer:
Masterarbeiten
Coding for Multi-User Wireless Random Access Protocols
Beschreibung
Unsourced multi-access protocols ensure that multiple users can transmit on the same physical resources without pre-allocation of resources to the different users. To avoid information loss caused by collision of messages transmitted simultaneously, we investigate how to use (adaptive) coding schemes that allow the concurrent transmission of coded messages stemming from different users with lossless reconstruction of the payloads.
The student should be proficient in communications engineering and coding theory, i.e., the following prerequisites (or similar) are minimal requirements:
- Channel Coding
- Nachrichtentechnik
Betreuer:
Multi-round Privacy in Federated Learning
Beschreibung
Federated learning allows to train a machine learning model in a distributed manner, i.e., the training data are collected and stored locally by users such as mobile devices or multiple institutes. The training is under the coordination of a central server and performed iteratively. In each iteration, the server sends the current global model to the users, who update their local model and send the local updates to the server for aggregation.
FL is proposed to protect user's sensitive data since these training data never leave the user devices. However, works have shown that the local updates still leaks information about the local datasets. To deal with this leakage, SecAgg[1] is proposed. Secure aggregation is to make sure that the server only obtains the aggregation of the local updates rather than each individual update.
However, recent work [2] has shown that, SecAgg only preserves privacy of the users in a single training round. Due to user selection in federated learning, by observing the aggregated models over multiple training rounds, the server is able to recoverindividual local models of the users.
The goal of this seminar is to study and understand SecAgg [1], the multi-round privacy leakage it suffers and how is this problem solved in [2].
[1]. Bonawitz, Keith, et al. "Practical secure aggregation for privacy-preserving machine learning." proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017.
[2]. So, Jinhyun, et al. "Securing secure aggregation: Mitigating multi-round privacy leakage in federated learning." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 8. 2023.
Betreuer:
Jointly Supervised Master's Thesis in Equalization-Enhanced Phase Noise
This thesis aims to investigate the modeling and/or mitigation of equalization-enhanced phase noise. The student works for Nokia and is jointly supervised by Nokia and the Institute for Communications Engineering, TUM.
Beschreibung
As an optical signal propagates through a fiber, it is distorted by chromatic dispersion. On the receiver side, the signal further experiences phase noise from the local oscillator. Compensating chromatic dispersion without consideration of the phase noise enhances the latter, which is referred to as equalization-enhanced phase noise (EEPN).
Depending on the student's interests, the thesis can focus on different aspects of EEPN, e.g., proper modeling or mitigation.
For this thesis, the student works for Nokia, but is jointly supervised and can expect close supervision from both sides. We aim to start the thesis on 7 January 2026.
Voraussetzungen
Suitable students have a basic background in optical communications and information theory. Depending on the focus of the thesis, a background in machine learning can be a plus.
Betreuer:
Improved Decoding of Interleaved RS Codes
Reed-Solomon codes, Interleaved decoding, List decoding, Coded computing
Beschreibung
Reed-Solomon (RS) codes are a very popular class of codes with applications in distributed storage, computation and communication. Interleaved RS (IRS) codes can be viewed as multiple parallel codewords of possibly different RS codes. Whenever errors occur within the same subset of positions for all the codewords, with high probability, these codes allow to correct many more random errors than guaranteed by the unique error correction capability of the individual codes [1]. Very recently, the semi-adversarial setting where some errors occur not uniformly random but are potentially adversarial was investigated [3], but many questions remain open. Alternatively, decoding beyond the unique decoding radius is possible by allowing the decoder to output a list of potential codewords. In some cases, if extra information about the transmitted codeword is known, the list can be filtered down to at most one codeword [2]. List-decoding of interleaved RS codes is an underexplored research direction.
Lagrange coded computing (LCC) [4] is a distributed computation scheme which utilizes RS decoding to achieve resilience against adversarial and straggling workers, as well as privacy assuming limited collusion. The objective of this project is to design and analyze decoding techniques for IRS codes with the goal of improving upon LCC.
[1] https://www.cs.umd.edu/~gasarch/TOPICS/pir/breakpir2.pdf
[2] https://ieeexplore.ieee.org/document/1214429
[3] https://arxiv.org/pdf/2504.10399
[4] https://proceedings.mlr.press/v89/yu19b/yu19b.pdf
Voraussetzungen
Channel Coding
Betreuer:
Homomorphic Evaluation of Kalman Filtering
homomorphic encryption, kalman filtering
Beschreibung
Kalman filtering is one of the most prominent and widely used examples of estimation methods. The Kalman filter is a recursive algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, to produce estimates of unknown variables.
It basically estimates the state of a dynamic system using a series of noisy measurements and operates in two steps: prediction, where it forecasts the next state using the system's model, and update, where it refines the estimate with new measurements.
This technique is widely applied in navigation, control systems, and signal processing for real-time data fusion and noise reduction.
One of the main challenges in Kalman filtering, especially for privacy-preserving methods, is the inversion of matrices. To preserve privacy, computations can be done via homomorphic encryption. Homomorphic encryption is a form of encryption that allows computations to be performed directly on encrypted data without needing to decrypt it first.
For instance, the homomorphic evaluation of Kalman filters requires efficient matrix inversions, which can benefit from tailored homomorphic encryption methods. Effective privacy-preserving Kalman filtering can thus play a critical role in applications ranging from autonomous navigation and prediction of vehicle trajectories to financial modeling.
The goal of this project is to explore and implement efficient privacy-preserving techniques for statistical data analysis, with a focus on homomorphic evaluation of algorithms such as Kalman filters and clustering.
References:
[1] D. Simon, Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, Hoboken, NJ, USA: Wiley, 2006.
[2] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice Using MATLAB, 4th ed., Hoboken, NJ, USA: Wiley-IEEE Press, 2015.
[3] Z. Brakerski, ‘Fundamentals of fully homomorphic encryption’, in Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali, New York, NY, USA: Association for Computing Machinery, 2019, pp. 543–563. Accessed: Oct. 22, 2024. [Online]. Available: https://doi.org/10.1145/3335741.3335762
[4] V. Vaikuntanathan, ‘Computing Blindfolded: New Developments in Fully Homomorphic Encryption’, in 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, Oct. 2011, pp. 5–16. doi: 10.1109/FOCS.2011.98.
[5] S. Halevi, ‘Homomorphic Encryption’, Y. Lindell, Ed., in Information Security and Cryptography. Cham: Springer International Publishing, 2017, pp. 219–276. doi: 10.1007/978-3-319-57048-8_5.
Voraussetzungen
Linear Algebra
Probability and Statistics
Cryptography
Betreuer:
Upper Bounds on Integer Partitions
combinatorics, number theory, sum-rank metric
Beschreibung
How many ways are there to write down n nonnegative integers, all of which being strictly smaller than q, such that their sum is k? In other words, what is the coefficient of x^k in (1 + x + ... + x^(q-1))^n?
In this thesis, we are going to dive into the integer partitioning problem with a certain number of partitions and an upper bound on partition size.
The goal of this thesis is to take an already existing bound for the value mentioned above, which holds for a specific k value - and extend it to general k.
The upper bound can then be used for proving better upper bounds in coding theory for certain metrics other than the Hamming metric.
[1] H. B.-S. Couvée, T. Jerkovits, and J. Bariffi, ‘Bounds on Sphere Sizes in the Sum-Rank Metric and Coordinate-Additive Metrics’, Des. Codes Cryptogr., Mar. 2025, doi: 10.1007/s10623-025-01604-0.
Voraussetzungen
information theory, channel coding, strong interest in combinatorics and number theory
Betreuer:
Coding for Composite DNA
Beschreibung
DNA Data Storage
Data storage on DNA molecules is a promising approach for archiving massive data.
In classical DNA storage systems, binary information is encoded into sequences consisting of the four DNA bases {A, C, G, T}. The encoded sequences are used to generate DNA molecules called strands using the biochemical process of DNA synthesis. The synthesized strands are stored together in a tube. To retrieve the binary information, the strand must be read via DNA sequencing and decoded back into the binary representation.
The synthesis and the sequencing procedures are error-prone, and with the natural degradation of DNA they introduce errors to the DNA strands. To ensure data reliability, the errors have to be corrected by algorithms and error-correcting codes (ECCs).
A 5min video with an overview of DNA storage: https://youtu.be/r8qWc9X4f6k?si=Yzm5sOW-a6VDnBu3
Composite DNA
Recently, to allow higher potential information capacity, [1,2] introduced the DNA composite synthesis method. In this method, the multiple copies created by the standard DNA synthesis method are utilized to create composite DNA symbols, defined by a mixture of DNA bases and their ratios in a specific position of the strands. By defining different mixtures and ratios, the alphabet can be extended to have more than 4 symbols. More formally, a composite DNA symbol in a specific position can be abstracted as a quartet of probabilities {p_A, p_C, p_G, p_T}, in which p_X, 0 ≤ p_X ≤ 1, is the fraction of the base X in {A, C, G, T} in the mixture and p_A+p_C+ p_G+ p_T =1. Thus, to identify composite symbols it is required to sequence multiple reads and then to estimate p_A, p_C, p_G, p_T in each position.
Problem description
ECCs for DNA data storage differ in many aspects from classical error correction codes. In this model, new error type gain relevance, like deletions and insertions which affect the synchronization of the sequences. Especially for composite DNA data storage, these error types received only little attention.
The most related work to this problem was recently studied by Zhang et al. in [6]. The authors initiated the study of error-correcting codes for DNA composite. They considered an error model for composite symbols, which assumes that errors occur in at most t symbols, and their magnitude is limited by l. They presented several code constructions as well as bounds for this model. In this thesis, we want to analyse a different way to model the composite synthesis method and studies additional error models. We already have some results for substitution and single deletion errors. This thesis should focus on evaluating more error models in the channel model.
This should only roughly introduce the problem. No need to review all references. If you are interested, please reach out to me, and we can discuss a suitable direction for you.
References
[1] L. Anavy, I. Vaknin, O. Atar, R. Amit, and Z. Yakhini, “Data storage in DNA with fewer synthesis cycles using composite DNA letters,” Nat Biotechnol, vol. 37, no. 10, pp. 1229–1236, Oct. 2019, doi: 10.1038/s41587-019-0240-x.
[2] Y. Choi et al., “High information capacity DNA-based data storage with augmented encoding characters using degenerate bases,” Sci Rep, vol. 9, no. 1, Art. no. 1, Apr. 2019, doi: 10.1038/s41598-019-43105-w.
[3] V. Guruswami and J. Håstad, “Explicit Two-Deletion Codes With Redundancy Matching the Existential Bound,” IEEE Transactions on Information Theory, vol. 67, no. 10, pp. 6384–6394, Oct. 2021, doi: 10.1109/TIT.2021.3069446.
[4] J. Sima, N. Raviv, and J. Bruck, “Two Deletion Correcting Codes From Indicator Vectors,” IEEE Trans. Inform. Theory, vol. 66, no. 4, pp. 2375–2391, Apr. 2020, doi: 10.1109/TIT.2019.2950290.
[5] I. Smagloy, L. Welter, A. Wachter-Zeh, and E. Yaakobi, “Single-Deletion Single-Substitution Correcting Codes,” IEEE Transactions on Information Theory, pp. 1–1, 2023, doi: 10.1109/TIT.2023.3319088.
[6] W. Zhang, Z. Chen, and Z. Wang, “Limited-Magnitude Error Correction for Probability Vectors in DNA Storage,” in ICC 2022 - IEEE International Conference on Communications, Seoul, Korea, Republic of: IEEE, May 2022, pp. 3460–3465. doi: 10.1109/ICC45855.2022.9838471.
Voraussetzungen
Betreuer:
Error correction for DNA storage
Beschreibung
DNA-based data storage is a novel approach for long term digital data archiving.
Due to the unique nature of writing and reading DNA, the channel associated with these processes is still relatively poorly understood and varies over different synthesis (writing) and sequencing (reading) technologies. The task of the student is to evaluate various decoding strategies for certain error-correcting schemes tailored for the DNA storage channel.
Voraussetzungen
- Basic principles of stochastic and algebra
- Channel Coding
- Information Theory
A few references to suggest possible research directions:
[1] J. Sima, N. Raviv, M. Schwartz, and J. Bruck, “Error Correction for DNA Storage.” arXiv, Oct. 02, 2023. Available: http://arxiv.org/abs/2310.01729.
[2] S. Chandak et al., “Overcoming High Nanopore Basecaller Error Rates for DNA Storage via Basecaller-Decoder Integration and Convolutional Codes,” in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2020, pp. 8822–8826. doi: 10.1109/ICASSP40776.2020.9053441
[3] L. Welter, R. Sokolovskii, T. Heinis, A. Wachter-Zeh, E. Rosnes, and A. G. i Amat, “An End-to-End Coding Scheme for DNA-Based Data Storage With Nanopore-Sequenced Reads.” arXiv, Jun. 18, 2024. Available: http://arxiv.org/abs/2406.12955.
[4] A. Vidal, V. B. Wijekoon, and E. Viterbo, “Concatenated Nanopore DNA Codes,” IEEE Transactions on NanoBioscience, vol. 23, no. 2, pp. 310–318, Apr. 2024, doi: 10.1109/TNB.2024.3350001
Betreuer:
Decoding Codes in Hamming Metric using Rank-Metric Decoders
decoding complexity
Beschreibung
Given a parity-check matrix H and a syndrome s, the Hamming Syndrome Decoding (SD) problem asks to find a vector e, the "error vector", of Hamming weight at most t. This difficulty of solving this problem has important applications to cryptography and the best solvers are the so-called Information Set Decoding (ISD) algorithms [1], [2, Section 5.3]. When the code is defined over an extension field F_{q^m}, the error vector has the additional property that its rank over the base field F_q is also at most t. Hence, one can see this as an instance of the so-called Rank Syndrome Decoding (RSD) problem, which is also an important problem with many existing solvers [3, 4].
The task of the student will be to:
- understand the existing solvers for RSD
- apply these solvers to the Hamming Syndrome Decoding problem over F_{q^m}
- compare the obtained complexity to the existing solvers in the Hamming-metric, i.e., the ISD algorithms
[1] Peters, Christiane. 2010. “Information-Set Decoding for Linear Codes over Fq.” In Post-Quantum Cryptography, edited by Nicolas Sendrier, 81–94. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-12929-2_7.
[2] Weger, Violetta, Niklas Gassner, and Joachim Rosenthal. 2022. “A Survey on Code-Based Cryptography.” arXiv. https://doi.org/10.48550/arXiv.2201.07119.
[3] Gaborit, Philippe, Olivier Ruatta, and Julien Schrek. 2016. “On the Complexity of the Rank Syndrome Decoding Problem.” IEEE Transactions on Information Theory 62 (2): 1006–19. https://doi.org/10.1109/TIT.2015.2511786.
[4] Aragon, Nicolas, Philippe Gaborit, Adrien Hauteville, and Jean-Pierre Tillich. 2018. “A New Algorithm for Solving the Rank Syndrome Decoding Problem.” In 2018 IEEE International Symposium on Information Theory (ISIT), 2421–25. https://doi.org/10.1109/ISIT.2018.8437464.
Voraussetzungen
- Channel Coding (in particular, finite fields and basic coding theory)
- Linear Algebra
- Prior acquaintance with the rank-metric or motivation to learn more about it is beneficial
Betreuer:
Fundamental limits of Byzantine-resilient distributed learning
Beschreibung
In a distributed learning setting, multiple worker machines are hired to help a main server with the expensive training of a machine learning model. Each worker is assigned a subtask, from which the main server tries to reconstruct the total computation result.
In the presence of faulty or malicious workers, called Byzantine workers, a common strategy is to distribute subtasks redundantly [3]. Since the redundancy introduces large computation overheads for the workers, strategies to reduce this overhead are required. One approach is to use interactive consistency checks at the main server, which can reduce the redundancy by up to 50% [1].
The interactive consistency checks are not for free, but cause additional computation and communication cost. For specific parameter regimes, this cost is well-studied. However, it is unkown how large this cost is in general. Therefore, we ask the following research questions:
1) How much computation is needed to guarantee error-free reconstruction of the computation result?
2) How much communication is needed?
3) What is the best trade-off between communication and computation cost?
The focus of this project is to study these research questions fundamentally. That is, we aim at understanding what the least amount of communication and computation possible is. The student will analyze these questions through mathematical tools, such as graph theory or information theory. The findings shall be compared against existing schemes [1,2] to evaluate their (sub-)optimality.
[1] C. Hofmeister, L. Maßny, E. Yaakobi and R. Bitar, "Byzantine-Resilient Gradient Coding Through Local Gradient Computations," in IEEE Transactions on Information Theory, vol. 71, no. 4, pp. 3142-3156, April 2025, doi: 10.1109/TIT.2025.3542896.
[2] S. Jain, L. Maßny, C. Hofmeister, E. Yaakobi and R. Bitar, "Interactive Byzantine-Resilient Gradient Coding for General Data Assignments," 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 2024, pp. 3273-3278, doi: 10.1109/ISIT57864.2024.10619596.
[3] R. Tandon, Q. Lei, A. G. Dimakis, N. Karampatziakis, "Gradient Coding: Avoiding Stragglers in Distributed Learning", in Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3368-3376, 2017.
Voraussetzungen
Mandatory:
- strong mathematical background
- prior knowledge in information theory
- basic knowledge in graph theory
- interest in theoretical fundamental research
Recommended:
- proficiency in algebra and probability theory
- basic knowledge in coding theory
Kontakt
luis.massny@tum.de, christoph.hofmeister@tum.de
Betreuer:
Thesis in Optical Communications and Receiver Design
Beschreibung
Please reach out if you are interested in a thesis in any of my research fields. Possible areas include optical communications, particularly physical modeling and nonlinearity mitigation for single-mode fiber, and aspects of receiver design, such as receivers for channels with memory.
A good background in optical communication systems and applied information theory is preferable, but the requirements generally depend on your interests.
Please include a description of your interests and corresponding academic background in your application. If you have a thesis idea, I am happy to discuss your suggestions. Also, I am available to supervise external theses as long as they are in my field of expertise.
Betreuer:
Topics in Intelligent Reflecting Surfaces (IRS), Integrated Sensing and Communication and Wireless Communication
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.
Voraussetzungen
Prerequirements:
- Wireless Communication
- Mobile Communication
- Information Theory
- Multi-User Information Theory
Betreuer:
Measuring Uncertainty in Structured Stochastic Processes
Beschreibung
The entropy rate of a stochastic process quantifies the average uncertainty per time step. In the context of Hidden Markov Models (HMMs)—where we observe a sequence of outputs generated by an underlying Markov process—this notion becomes particularly intriguing. While standard Markov chains have a well-defined structure for computing entropy rates, the introduction of a hidden layer complicates matters significantly. The entropy rate of an HMM reflects the long-term unpredictability of its observable outputs, capturing both the randomness of the state transitions and the ambiguity introduced by the observation process.
Understanding the entropy rate of HMMs has broad implications, from speech recognition and bioinformatics to machine learning and signal processing. Despite their practical importance, entropy rates of HMMs are notoriously hard to compute exactly, often requiring approximate methods or asymptotic analysis. This challenge opens up interesting theoretical questions and motivates efficient computational approaches. In this project, we investigate how structure, memory, and noise in the model impact entropy rate, and what this reveals about the information limits of systems modeled by HMMs.
Voraussetzungen
Information Theory
Betreuer:
Thesis in Polar Coding, Probabilistic Shaping, and Applied Information Theory
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.
Betreuer:
Communication with Coarse Quantization
Beschreibung
Motivated by the cell-free and massive MIMO (multiple input multiple outputs) communication scenarios, the number of power amplifiers (PA), digital to analog converters (DAC), etc., is increased. Thus, using coarse quantized transmission reduces the channel's hardware cost and nonlinear effects. More details can be found here.
In this project, we investigate algorithms for mapping modulated data to coarsely quantized signals and compare their information rates.
The student needs an understanding of information theory and communication systems.
Betreuer:
Code Constructions for Burst Metrics
Coding theory, Lattices, Discrete mathematics
Beschreibung
This thesis is concerned with developing code constructions for a new weight function (and associated metric) on (Z/qZ)^n called the unit-burst weight, suitable for measuring same-symbol burst errors. A unit burst is defined as a vector that has some consecutive positions of ones and is zero otherwise.
Any vector v in (Z/qZ)^n can be written as a (not necessarily unique) linear combination of these bursts. The burst weight is then the minimum number of bursts that need to be added or subtracted to produce v.
This metric has a connection to the A_n root lattice, a special lattice in Z^n+1 of vectors whose entries sum to zero. More precisely, the unit bursts relate to the shortest vectors of A_n called roots, and the burst weight corresponds to the smallest decomposition of a lattice point into roots.
We have already derived some basic properties and algorithms for this new metric and now would like to find some bounds and code constructions achieving those bounds.
Kontakt
anna.baumeister@tum.de, hugo.sauerbier-couvee@tum.de
Betreuer:
Forschungspraxis (Research Internships)
Improved Decoding of Interleaved RS Codes
Reed-Solomon codes, Interleaved decoding, List decoding, Coded computing
Beschreibung
Reed-Solomon (RS) codes are a very popular class of codes with applications in distributed storage, computation and communication. Interleaved RS (IRS) codes can be viewed as multiple parallel codewords of possibly different RS codes. Whenever errors occur within the same subset of positions for all the codewords, with high probability, these codes allow to correct many more random errors than guaranteed by the unique error correction capability of the individual codes [1]. Very recently, the semi-adversarial setting where some errors occur not uniformly random but are potentially adversarial was investigated [3], but many questions remain open. Alternatively, decoding beyond the unique decoding radius is possible by allowing the decoder to output a list of potential codewords. In some cases, if extra information about the transmitted codeword is known, the list can be filtered down to at most one codeword [2]. List-decoding of interleaved RS codes is an underexplored research direction.
Lagrange coded computing (LCC) [4] is a distributed computation scheme which utilizes RS decoding to achieve resilience against adversarial and straggling workers, as well as privacy assuming limited collusion. The objective of this project is to design and analyze decoding techniques for IRS codes with the goal of improving upon LCC.
[1] https://www.cs.umd.edu/~gasarch/TOPICS/pir/breakpir2.pdf
[2] https://ieeexplore.ieee.org/document/1214429
[3] https://arxiv.org/pdf/2504.10399
[4] https://proceedings.mlr.press/v89/yu19b/yu19b.pdf
Voraussetzungen
Channel Coding
Betreuer:
Homomorphic Evaluation of Kalman Filtering
homomorphic encryption, kalman filtering
Beschreibung
Kalman filtering is one of the most prominent and widely used examples of estimation methods. The Kalman filter is a recursive algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, to produce estimates of unknown variables.
It basically estimates the state of a dynamic system using a series of noisy measurements and operates in two steps: prediction, where it forecasts the next state using the system's model, and update, where it refines the estimate with new measurements.
This technique is widely applied in navigation, control systems, and signal processing for real-time data fusion and noise reduction.
One of the main challenges in Kalman filtering, especially for privacy-preserving methods, is the inversion of matrices. To preserve privacy, computations can be done via homomorphic encryption. Homomorphic encryption is a form of encryption that allows computations to be performed directly on encrypted data without needing to decrypt it first.
For instance, the homomorphic evaluation of Kalman filters requires efficient matrix inversions, which can benefit from tailored homomorphic encryption methods. Effective privacy-preserving Kalman filtering can thus play a critical role in applications ranging from autonomous navigation and prediction of vehicle trajectories to financial modeling.
The goal of this project is to explore and implement efficient privacy-preserving techniques for statistical data analysis, with a focus on homomorphic evaluation of algorithms such as Kalman filters and clustering.
References:
[1] D. Simon, Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, Hoboken, NJ, USA: Wiley, 2006.
[2] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice Using MATLAB, 4th ed., Hoboken, NJ, USA: Wiley-IEEE Press, 2015.
[3] Z. Brakerski, ‘Fundamentals of fully homomorphic encryption’, in Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali, New York, NY, USA: Association for Computing Machinery, 2019, pp. 543–563. Accessed: Oct. 22, 2024. [Online]. Available: https://doi.org/10.1145/3335741.3335762
[4] V. Vaikuntanathan, ‘Computing Blindfolded: New Developments in Fully Homomorphic Encryption’, in 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, Oct. 2011, pp. 5–16. doi: 10.1109/FOCS.2011.98.
[5] S. Halevi, ‘Homomorphic Encryption’, Y. Lindell, Ed., in Information Security and Cryptography. Cham: Springer International Publishing, 2017, pp. 219–276. doi: 10.1007/978-3-319-57048-8_5.
Voraussetzungen
Linear Algebra
Probability and Statistics
Cryptography
Betreuer:
LDPC Codes for Deletion Correction
Beschreibung
Overview:
Traditional error correction codes like LDPC (Low-Density Parity-Check) are highly effective for substitution noise but struggle when applied to deletion channels, where bits are removed and the sequence alignment is shifted. This project explores an experimental, graph-based decoding approach for handling such synchronization errors, inspired by methods in the references below.
You will design and simulate a novel decoder that tracks and corrects drift—the cumulative effect of deletions—by using a factor graph with locally connected variable nodes and drift constraints. A central idea is to model the decoding process as inference on this structured graph, leveraging probabilistic message passing.
What You'll Learn:
-
How deletion errors affect classical coding theory
-
Principles of LDPC codes and belief propagation
-
Techniques for modeling time-varying constraints (drift) in a graphical setting
-
Implementation of custom decoders and simulators in Python
References:
-
R. Shibata, G. Hosoya, and H. Yashima, “Design of Irregular LDPC Codes without Markers for Insertion/Deletion Channels,” IEEE GLOBECOM, 2019. doi:
-
F. Wang, D. Fertonani, and T. M. Duman, “Symbol-Level Synchronization and LDPC Code Design for Insertion/Deletion Channels,” IEEE Trans. Commun., vol. 59, no. 5, 2011. doi:
Voraussetzungen
- Channel Coding
- Codes on Graphs or Channel Codes for Iterative Decoding (Understanding of LDPC codes and factor graphs)
- Good understanding of probability and linear algebra
- Familiarity with Python programming
Betreuer:
Coding for Composite DNA
Beschreibung
DNA Data Storage
Data storage on DNA molecules is a promising approach for archiving massive data.
In classical DNA storage systems, binary information is encoded into sequences consisting of the four DNA bases {A, C, G, T}. The encoded sequences are used to generate DNA molecules called strands using the biochemical process of DNA synthesis. The synthesized strands are stored together in a tube. To retrieve the binary information, the strand must be read via DNA sequencing and decoded back into the binary representation.
The synthesis and the sequencing procedures are error-prone, and with the natural degradation of DNA they introduce errors to the DNA strands. To ensure data reliability, the errors have to be corrected by algorithms and error-correcting codes (ECCs).
A 5min video with an overview of DNA storage: https://youtu.be/r8qWc9X4f6k?si=Yzm5sOW-a6VDnBu3
Composite DNA
Recently, to allow higher potential information capacity, [1,2] introduced the DNA composite synthesis method. In this method, the multiple copies created by the standard DNA synthesis method are utilized to create composite DNA symbols, defined by a mixture of DNA bases and their ratios in a specific position of the strands. By defining different mixtures and ratios, the alphabet can be extended to have more than 4 symbols. More formally, a composite DNA symbol in a specific position can be abstracted as a quartet of probabilities {p_A, p_C, p_G, p_T}, in which p_X, 0 ≤ p_X ≤ 1, is the fraction of the base X in {A, C, G, T} in the mixture and p_A+p_C+ p_G+ p_T =1. Thus, to identify composite symbols it is required to sequence multiple reads and then to estimate p_A, p_C, p_G, p_T in each position.
Problem description
ECCs for DNA data storage differ in many aspects from classical error correction codes. In this model, new error type gain relevance, like deletions and insertions which affect the synchronization of the sequences. Especially for composite DNA data storage, these error types received only little attention.
The most related work to this problem was recently studied by Zhang et al. in [6]. The authors initiated the study of error-correcting codes for DNA composite. They considered an error model for composite symbols, which assumes that errors occur in at most t symbols, and their magnitude is limited by l. They presented several code constructions as well as bounds for this model. In this thesis, we want to analyse a different way to model the composite synthesis method and studies additional error models. We already have some results for substitution and single deletion errors. This thesis should focus on evaluating more error models in the channel model.
This should only roughly introduce the problem. No need to review all references. If you are interested, please reach out to me, and we can discuss a suitable direction for you.
References
[1] L. Anavy, I. Vaknin, O. Atar, R. Amit, and Z. Yakhini, “Data storage in DNA with fewer synthesis cycles using composite DNA letters,” Nat Biotechnol, vol. 37, no. 10, pp. 1229–1236, Oct. 2019, doi: 10.1038/s41587-019-0240-x.
[2] Y. Choi et al., “High information capacity DNA-based data storage with augmented encoding characters using degenerate bases,” Sci Rep, vol. 9, no. 1, Art. no. 1, Apr. 2019, doi: 10.1038/s41598-019-43105-w.
[3] V. Guruswami and J. Håstad, “Explicit Two-Deletion Codes With Redundancy Matching the Existential Bound,” IEEE Transactions on Information Theory, vol. 67, no. 10, pp. 6384–6394, Oct. 2021, doi: 10.1109/TIT.2021.3069446.
[4] J. Sima, N. Raviv, and J. Bruck, “Two Deletion Correcting Codes From Indicator Vectors,” IEEE Trans. Inform. Theory, vol. 66, no. 4, pp. 2375–2391, Apr. 2020, doi: 10.1109/TIT.2019.2950290.
[5] I. Smagloy, L. Welter, A. Wachter-Zeh, and E. Yaakobi, “Single-Deletion Single-Substitution Correcting Codes,” IEEE Transactions on Information Theory, pp. 1–1, 2023, doi: 10.1109/TIT.2023.3319088.
[6] W. Zhang, Z. Chen, and Z. Wang, “Limited-Magnitude Error Correction for Probability Vectors in DNA Storage,” in ICC 2022 - IEEE International Conference on Communications, Seoul, Korea, Republic of: IEEE, May 2022, pp. 3460–3465. doi: 10.1109/ICC45855.2022.9838471.
Voraussetzungen
Betreuer:
Error correction for DNA storage
Beschreibung
DNA-based data storage is a novel approach for long term digital data archiving.
Due to the unique nature of writing and reading DNA, the channel associated with these processes is still relatively poorly understood and varies over different synthesis (writing) and sequencing (reading) technologies. The task of the student is to evaluate various decoding strategies for certain error-correcting schemes tailored for the DNA storage channel.
Voraussetzungen
- Basic principles of stochastic and algebra
- Channel Coding
- Information Theory
A few references to suggest possible research directions:
[1] J. Sima, N. Raviv, M. Schwartz, and J. Bruck, “Error Correction for DNA Storage.” arXiv, Oct. 02, 2023. Available: http://arxiv.org/abs/2310.01729.
[2] S. Chandak et al., “Overcoming High Nanopore Basecaller Error Rates for DNA Storage via Basecaller-Decoder Integration and Convolutional Codes,” in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2020, pp. 8822–8826. doi: 10.1109/ICASSP40776.2020.9053441
[3] L. Welter, R. Sokolovskii, T. Heinis, A. Wachter-Zeh, E. Rosnes, and A. G. i Amat, “An End-to-End Coding Scheme for DNA-Based Data Storage With Nanopore-Sequenced Reads.” arXiv, Jun. 18, 2024. Available: http://arxiv.org/abs/2406.12955.
[4] A. Vidal, V. B. Wijekoon, and E. Viterbo, “Concatenated Nanopore DNA Codes,” IEEE Transactions on NanoBioscience, vol. 23, no. 2, pp. 310–318, Apr. 2024, doi: 10.1109/TNB.2024.3350001
Betreuer:
Decoding Codes in Hamming Metric using Rank-Metric Decoders
decoding complexity
Beschreibung
Given a parity-check matrix H and a syndrome s, the Hamming Syndrome Decoding (SD) problem asks to find a vector e, the "error vector", of Hamming weight at most t. This difficulty of solving this problem has important applications to cryptography and the best solvers are the so-called Information Set Decoding (ISD) algorithms [1], [2, Section 5.3]. When the code is defined over an extension field F_{q^m}, the error vector has the additional property that its rank over the base field F_q is also at most t. Hence, one can see this as an instance of the so-called Rank Syndrome Decoding (RSD) problem, which is also an important problem with many existing solvers [3, 4].
The task of the student will be to:
- understand the existing solvers for RSD
- apply these solvers to the Hamming Syndrome Decoding problem over F_{q^m}
- compare the obtained complexity to the existing solvers in the Hamming-metric, i.e., the ISD algorithms
[1] Peters, Christiane. 2010. “Information-Set Decoding for Linear Codes over Fq.” In Post-Quantum Cryptography, edited by Nicolas Sendrier, 81–94. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-12929-2_7.
[2] Weger, Violetta, Niklas Gassner, and Joachim Rosenthal. 2022. “A Survey on Code-Based Cryptography.” arXiv. https://doi.org/10.48550/arXiv.2201.07119.
[3] Gaborit, Philippe, Olivier Ruatta, and Julien Schrek. 2016. “On the Complexity of the Rank Syndrome Decoding Problem.” IEEE Transactions on Information Theory 62 (2): 1006–19. https://doi.org/10.1109/TIT.2015.2511786.
[4] Aragon, Nicolas, Philippe Gaborit, Adrien Hauteville, and Jean-Pierre Tillich. 2018. “A New Algorithm for Solving the Rank Syndrome Decoding Problem.” In 2018 IEEE International Symposium on Information Theory (ISIT), 2421–25. https://doi.org/10.1109/ISIT.2018.8437464.
Voraussetzungen
- Channel Coding (in particular, finite fields and basic coding theory)
- Linear Algebra
- Prior acquaintance with the rank-metric or motivation to learn more about it is beneficial
Betreuer:
Fundamental limits of Byzantine-resilient distributed learning
Beschreibung
In a distributed learning setting, multiple worker machines are hired to help a main server with the expensive training of a machine learning model. Each worker is assigned a subtask, from which the main server tries to reconstruct the total computation result.
In the presence of faulty or malicious workers, called Byzantine workers, a common strategy is to distribute subtasks redundantly [3]. Since the redundancy introduces large computation overheads for the workers, strategies to reduce this overhead are required. One approach is to use interactive consistency checks at the main server, which can reduce the redundancy by up to 50% [1].
The interactive consistency checks are not for free, but cause additional computation and communication cost. For specific parameter regimes, this cost is well-studied. However, it is unkown how large this cost is in general. Therefore, we ask the following research questions:
1) How much computation is needed to guarantee error-free reconstruction of the computation result?
2) How much communication is needed?
3) What is the best trade-off between communication and computation cost?
The focus of this project is to study these research questions fundamentally. That is, we aim at understanding what the least amount of communication and computation possible is. The student will analyze these questions through mathematical tools, such as graph theory or information theory. The findings shall be compared against existing schemes [1,2] to evaluate their (sub-)optimality.
[1] C. Hofmeister, L. Maßny, E. Yaakobi and R. Bitar, "Byzantine-Resilient Gradient Coding Through Local Gradient Computations," in IEEE Transactions on Information Theory, vol. 71, no. 4, pp. 3142-3156, April 2025, doi: 10.1109/TIT.2025.3542896.
[2] S. Jain, L. Maßny, C. Hofmeister, E. Yaakobi and R. Bitar, "Interactive Byzantine-Resilient Gradient Coding for General Data Assignments," 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 2024, pp. 3273-3278, doi: 10.1109/ISIT57864.2024.10619596.
[3] R. Tandon, Q. Lei, A. G. Dimakis, N. Karampatziakis, "Gradient Coding: Avoiding Stragglers in Distributed Learning", in Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3368-3376, 2017.
Voraussetzungen
Mandatory:
- strong mathematical background
- prior knowledge in information theory
- basic knowledge in graph theory
- interest in theoretical fundamental research
Recommended:
- proficiency in algebra and probability theory
- basic knowledge in coding theory
Kontakt
luis.massny@tum.de, christoph.hofmeister@tum.de
Betreuer:
Sliding Window Decoding of Spatially Coupled Product-Like Codes
Beschreibung
This work covers the implementation and theoretical analysis of sliding window decoding of spatially coupled product-like codes. The student will explore the performance-complexity tradeoff for eBCH component codes and precoded polar component codes under suboptimal component decoders like the Chase decoder or the SCL decoder. We consider adaptive coding for channels with state known at the transmitter and compute iterative decoding thresholds of corresponding code ensembles.
You may also contact me with your own project idea related to iterative decoding.
Voraussetzungen
"Channel Coding" lecture
In-depth programming skills in Matlab, Python or C.
Recommended: Lecture "Channel Codes for Iterative Decoding"
Betreuer:
Dispersion Compensation in IM/DD Systems Using the Gerchberg-Saxton Algorithm
Beschreibung
The Gerchberg-Saxton algorithm is an iterative phase retrieval method originally developed in optics, and it has been adapted for electronic dispersion compensation in optical communication systems. The student will implement the algorithm and apply it to intensity-modulation/direct-detection (IM/DD) systems to explore its potential for waveform optimization. The focus lies on simulating the algorithm's behavior in the presence of fiber dispersion and analyzing convergence properties and performance.
Betreuer:
Thesis in Optical Communications and Receiver Design
Beschreibung
Please reach out if you are interested in a thesis in any of my research fields. Possible areas include optical communications, particularly physical modeling and nonlinearity mitigation for single-mode fiber, and aspects of receiver design, such as receivers for channels with memory.
A good background in optical communication systems and applied information theory is preferable, but the requirements generally depend on your interests.
Please include a description of your interests and corresponding academic background in your application. If you have a thesis idea, I am happy to discuss your suggestions. Also, I am available to supervise external theses as long as they are in my field of expertise.
Betreuer:
Beyond Shannon: Exploring Rényi Entropy and Its Applications
Beschreibung
A foundational concept in information theory is Shannon entropy. However, Shannon entropy does not always provide sufficient flexibility when dealing with various optimization problems, robustness considerations, or scenarios where fine control over uncertainty quantification is required. In 1961, Rényi provided a parametric generalization of Shannon entropy [1], allowing for a more nuanced analysis of information measures.
Rényi entropy has found applications in diverse fields such as hypothesis testing, machine learning, privacy and security (e.g., differential privacy), and statistical physics. The target of this project is to understand the difference between Shannon and Rényi entropy, conditional entropy, and divergence, as well as its applications in both theoretical and applied research.
[1] A. Rényi, “On measures of entropy and information,” in Proc. 4th Berkeley Symp. Mathematics, Statistics, and Probability, vol. 1, Berkeley, CA, USA, 1961, pp. 547–561.
Betreuer:
Topics in Intelligent Reflecting Surfaces (IRS), Integrated Sensing and Communication and Wireless Communication
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.
Voraussetzungen
Prerequirements:
- Wireless Communication
- Mobile Communication
- Information Theory
- Multi-User Information Theory
Betreuer:
Thesis in Polar Coding, Probabilistic Shaping, and Applied Information Theory
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.
Betreuer:
Code Constructions for Burst Metrics
Coding theory, Lattices, Discrete mathematics
Beschreibung
This thesis is concerned with developing code constructions for a new weight function (and associated metric) on (Z/qZ)^n called the unit-burst weight, suitable for measuring same-symbol burst errors. A unit burst is defined as a vector that has some consecutive positions of ones and is zero otherwise.
Any vector v in (Z/qZ)^n can be written as a (not necessarily unique) linear combination of these bursts. The burst weight is then the minimum number of bursts that need to be added or subtracted to produce v.
This metric has a connection to the A_n root lattice, a special lattice in Z^n+1 of vectors whose entries sum to zero. More precisely, the unit bursts relate to the shortest vectors of A_n called roots, and the burst weight corresponds to the smallest decomposition of a lattice point into roots.
We have already derived some basic properties and algorithms for this new metric and now would like to find some bounds and code constructions achieving those bounds.
Kontakt
anna.baumeister@tum.de, hugo.sauerbier-couvee@tum.de
Betreuer:
Attaining Fully Homomorphic Encryption through Private Information Retrieval
homomorphic encryption, learning with errors, private information retrieval
Beschreibung
A fully homomorphic encryption (FHE) scheme is an encryption scheme which enables performing arbitrary operations on plaintexts in their encrypted form.
Private Information Retrieval (PIR) is the problem of retrieving a desired data from a database while preventing the database server from finding out the retrieved data.
The goal of this thesis/project is to understand and analyze how to obtain FHE from any PIR scheme [1], [3].
References:
[1] S. Halevi, ‘Homomorphic Encryption’, Y. Lindell, Ed., in Information Security and Cryptography. Cham: Springer International Publishing, 2017, pp. 219–276. doi: 10.1007/978-3-319-57048-8_5.
[2] Z. Brakerski and V. Vaikuntanathan, ‘Efficient Fully Homomorphic Encryption from (Standard) LWE’, in 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, Palm Springs, CA, USA: IEEE, Oct. 2011, pp. 97–106. doi: 10.1109/FOCS.2011.12.
[3] V. Vaikuntanathan, ‘Computing Blindfolded: New Developments in Fully Homomorphic Encryption’, in 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, Oct. 2011, pp. 5–16. doi: 10.1109/FOCS.2011.98.
[4] ‘Fully Homomorphic Encryption from Ring-LWE and Security for Key Dependent Messages’, in Lecture Notes in Computer Science, Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 505–524. doi: 10.1007/978-3-642-22792-9_29.
[5] Y. Ishai and A. Paskin, “Evaluating branching programs on encrypted data,” in TCC, ser. Lecture Notes in Computer Science, S. P. Vadhan, Ed., vol. 4392. Springer, 2007, pp. 575–594.
Voraussetzungen
Security in Communications and Storage
Betreuer:
Ingenieurpraxis
Sliding Window Decoding of Spatially Coupled Product-Like Codes
Beschreibung
This work covers the implementation and theoretical analysis of sliding window decoding of spatially coupled product-like codes. The student will explore the performance-complexity tradeoff for eBCH component codes and precoded polar component codes under suboptimal component decoders like the Chase decoder or the SCL decoder. We consider adaptive coding for channels with state known at the transmitter and compute iterative decoding thresholds of corresponding code ensembles.
You may also contact me with your own project idea related to iterative decoding.
Voraussetzungen
"Channel Coding" lecture
In-depth programming skills in Matlab, Python or C.
Recommended: Lecture "Channel Codes for Iterative Decoding"
Betreuer:
Beyond Shannon: Exploring Rényi Entropy and Its Applications
Beschreibung
A foundational concept in information theory is Shannon entropy. However, Shannon entropy does not always provide sufficient flexibility when dealing with various optimization problems, robustness considerations, or scenarios where fine control over uncertainty quantification is required. In 1961, Rényi provided a parametric generalization of Shannon entropy [1], allowing for a more nuanced analysis of information measures.
Rényi entropy has found applications in diverse fields such as hypothesis testing, machine learning, privacy and security (e.g., differential privacy), and statistical physics. The target of this project is to understand the difference between Shannon and Rényi entropy, conditional entropy, and divergence, as well as its applications in both theoretical and applied research.
[1] A. Rényi, “On measures of entropy and information,” in Proc. 4th Berkeley Symp. Mathematics, Statistics, and Probability, vol. 1, Berkeley, CA, USA, 1961, pp. 547–561.
Betreuer:
Thesis in Polar Coding, Probabilistic Shaping, and Applied Information Theory
Beschreibung
I may not always have prepared thesis topics available. Please feel free to reach out if you are interested in working on a thesis within any of my research areas.