Wireless Resource Management in Multi-Path 5G/6G Networks
Description
Traditional mobile networks typically rely on a single access technology or a single radio link, which makes them prone to link failures, congestion, and interference. These limitations lead to service interruptions, high latency, and reduced throughput, which can compromise safety and efficiency in mission-critical applications.
This work involves developing and implementing wireless resource management algorithms by
- formulating mathematical optimization problems
- solving the problems with heuristics or machine learning-based approaches
- evaluating the algorithms in simulators or hardware testbeds
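As a toy illustration of the first two steps (formulating an allocation problem and solving it with a heuristic), the sketch below balances a traffic demand across redundant radio links so that the most loaded link is as lightly utilized as possible. The link names, capacities, and demand are invented for this sketch and do not come from any real testbed.

```python
def balance_load(demand, capacities, steps=1000):
    """Greedy water-filling heuristic: hand out the demand in small
    increments, always to the link whose utilization (load/capacity)
    is currently lowest."""
    loads = {link: 0.0 for link in capacities}
    inc = demand / steps
    for _ in range(steps):
        link = min(loads, key=lambda l: loads[l] / capacities[l])
        loads[link] += inc
    return loads

# Hypothetical multi-path setup: three links with different capacities (Mbit/s)
alloc = balance_load(90.0, {"5g": 100.0, "wifi": 50.0, "lifi": 30.0})
```

In this toy instance the heuristic converges to near-proportional allocation, i.e., every link ends up at roughly the same utilization. A mathematical formulation of the same problem (min-max utilization subject to capacity constraints) could then be handed to an exact solver or a learning-based policy instead.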
If you are interested in this work, please send me an email with a short introduction of yourself and your interests along with your CV and grade transcript.
Note: There are multiple thesis/internship positions available within this umbrella topic. The specific topic could be adapted to suit your interests.
Prerequisites
- Experience with programming in C/C++ and Python is required
- Strong foundation in wireless networking concepts is required
- Motivation to learn 5G concepts
- Experience with reinforcement learning frameworks would be an advantage
Supervisor:
Working Student for the 6G Multi-Technology Testbed
Description
Connected robots need fast and reliable communication, but relying on a single wireless link can lead to failures during critical tasks. This student job focuses on combining technologies such as 6G-cellular, WiFi, and LiFi into a unified 6G network, improving reliability, reducing latency, and increasing data rates for next-generation robotic systems.
The goal of this work is to help implement and maintain a 6G multi-technology testbed and then use the testbed to investigate the performance of multipath networks. In this work, you will have the opportunity to work with open-source mobile network software as well as learn about a new technology, LiFi. This working student position can also be followed up with a research internship or thesis.
If you are interested in this work, please send me an email with a short introduction of yourself along with your CV and grade transcript.
Prerequisites
- Experience with programming in C/C++
- Strong foundation in wireless networking concepts
- Motivation to learn 5G concepts
- Availability to work on-site
Supervisor:
Improving the Effectiveness of Over-the-Air WLAN Fuzzing Against Android Targets
Description
Overview
Fuzzing is a software testing method used to uncover implementation problems and bugs. By feeding the input interfaces of the test target (e.g., library functions) with inconsistent or invalid inputs, the aim is to reach edge cases and drive the target into invalid states (e.g., crashes, infinite loops). Fuzzing is usually applied to software, but it can also be used to test the implementation of firmware running on embedded chips (e.g., WLAN chips). This is done by sending invalid or incomplete WiFi frames to the device over the air (over-the-air fuzzing).
The goal of this work is to build on previous work on over-the-air fuzzing and to improve the fuzzer by enhancing the generation of fuzz test cases and/or the fuzzer's introspection into the target device. This can be done by modifying the fuzzer or by installing additional tools on the target device.
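As a minimal illustration of what fuzz test-case generation means at the byte level, the sketch below produces mutated variants of a captured frame by flipping random bytes. The template bytes and parameters are invented for this sketch; a real over-the-air fuzzer mutates at the protocol-field level and uses feedback from the target.

```python
import random

def mutate_frame(frame: bytes, n_mutations: int, seed: int = 0) -> bytes:
    """Return a copy of `frame` with `n_mutations` random bytes altered."""
    rng = random.Random(seed)          # seeded so every test case is reproducible
    data = bytearray(frame)
    for _ in range(n_mutations):
        pos = rng.randrange(len(data))
        data[pos] ^= rng.randrange(1, 256)   # XOR with non-zero value flips >= 1 bit
    return bytes(data)

# A shortened, made-up management-frame-like byte string as the mutation template.
template = bytes.fromhex("80000000ffffffffffff") * 3
fuzz_case = mutate_frame(template, n_mutations=4)
```

Because the generator is seeded, a crashing input can be regenerated exactly, which is essential for triaging results on the target device.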
Goals
- Literature review: Detailed survey of the current state of research on over-the-air fuzzing and of possible improvements to an existing fuzzer
- Concept: Design of possible improvement(s) to the fuzzer or to auxiliary software on the target device
- Prototyping: Implementation of the designed improvement(s)
- Scientific evaluation: Tests and measurements comparing the performance of the improved fuzzer with the original
- Documentation: Thorough documentation of the research process, the results, and the decisions taken
Prerequisites
- Interest in wireless networks and security analysis in general, and in fuzzing in particular
- Programming skills in C and Python (ideally also in C++ and Java/Kotlin)
- Initial experience with reverse engineering, assembly code, and emulation is an advantage, but not required
Important Notes
This work is carried out in cooperation with the Zentrale Stelle für Informationstechnik im Sicherheitsbereich (ZITiS).
All applications must therefore be submitted via the partner's INTERAMT portal:
- Master's thesis: https://interamt.de/koop/app/trefferliste?30&partner=339&suchText=%22abschlussarbeiten+%28master%29%22
To avoid problems with your application, please read the notes on that page carefully.
The application should contain at least the following:
- a short CV
- a current grade transcript
- the keyword ’T3-TFSC-OTAFUZZ’ as a comment
- a short letter of motivation (optional)
For any questions and further information about the topic and the application process:
- Julian Sturm (TUM), Email: julian.sturm@tum.de
- Forschungsreferat Telekommunikation (ZITiS), Email: t3@zitis.bund.de
Contact
Julian Sturm (julian.sturm@tum.de)
Forschungsreferat Telekommunikation (ZITiS), Email: t3@zitis.bund.de
Supervisor:
WLAN Access Point Fingerprinting Against Evil Twin Attacks
Description
Overview
WLAN clients continuously scan for nearby networks in order to connect to known ones. Especially for open (unencrypted) networks, it is trivially easy to "clone" a network, i.e., to broadcast a network that looks identical to a known network during the network discovery phase. Clients that know this network then connect to the cloned network. This is known as an "evil twin" attack.
The goal of this work is to detect such evil twin attacks using fingerprinting methods. To this end, various attributes of the AP (e.g., header information, supported features, timing information) and of the network's environment (e.g., neighboring networks, geolocation) are carefully combined into a fingerprint. Fingerprinting can already uniquely identify clients without using the (randomized) MAC address [1]. This methodology is now to be used to re-identify an AP and to assess whether a network is "real" or "fake". When choosing a network to connect to, this information can be used to significantly reduce the probability of a successful evil twin attack.
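As a rough sketch of the fingerprinting idea, the snippet below combines several AP attributes into a weighted match score between a stored fingerprint and a newly observed AP. The feature names, weights, and values are invented for illustration; a real system would use measured header fields, capability bitmaps, vendor information elements, and neighbor lists.

```python
# Hypothetical feature weights: stable, hard-to-clone features count more.
WEIGHTS = {"supported_rates": 3.0, "capabilities": 3.0, "vendor_ies": 2.0,
           "neighbours": 2.0, "channel": 1.0}

def match_score(stored: dict, observed: dict) -> float:
    """Weighted fraction of fingerprint features that agree (0.0 .. 1.0)."""
    total = sum(WEIGHTS.values())
    agree = sum(w for f, w in WEIGHTS.items() if stored.get(f) == observed.get(f))
    return agree / total

known = {"supported_rates": (1, 2, 5.5, 11), "capabilities": 0x0431,
         "vendor_ies": ("00:50:f2",), "neighbours": ("cafe", "lobby"),
         "channel": 6}
# An evil twin typically reproduces the beacon basics but not the environment:
clone = dict(known, vendor_ies=(), neighbours=())
score = match_score(known, clone)
```

A client could then require the score to exceed a threshold before auto-connecting; choosing that threshold (and robust features) is exactly the design question of this thesis.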
Goals
- Literature review: In-depth survey of the current state of research on WLAN evil twin attacks, WLAN fingerprinting (client- and AP-side), and other countermeasures
- Design of a countermeasure: Selection of suitable features and development of a fingerprinting method for distinguishing real and fake APs
- Software development: Development of an Android app, or modification of the underlying Android system, so that the fingerprinting results are taken into account when selecting the network to connect to
- Evaluation: Assessment of the effectiveness of the fingerprinting
- Documentation: Thorough documentation of the research process, the results, and the decisions taken
Prerequisites
- Interest and knowledge in the area of WLAN (alternatively cellular, Bluetooth, or similar)
- Initial experience with Android development (app development and/or operating system modifications, custom ROMs, etc.) is an advantage
- Alternatively: programming skills in C/C++ and/or Java/Kotlin
Important Notes
This work is carried out in cooperation with the Zentrale Stelle für Informationstechnik im Sicherheitsbereich (ZITiS).
All applications must therefore be submitted via the partner's INTERAMT portal:
- Master's thesis: https://interamt.de/koop/app/trefferliste?30&partner=339&suchText=%22abschlussarbeiten+%28master%29%22
To avoid problems with your application, please read the notes on that page carefully.
The application should contain at least the following:
- a short CV
- a current grade transcript
- the keyword ’T3-TFSC-WLAN-AP-FP’ as a comment
- a short letter of motivation (optional)
For any questions and further information about the topic and the application process:
- Julian Sturm (TUM), Email: julian.sturm@tum.de
- Forschungsreferat Telekommunikation (ZITiS), Email: t3@zitis.bund.de
Contact
Julian Sturm (julian.sturm@tum.de)
Forschungsreferat Telekommunikation (ZITiS), Email: t3@zitis.bund.de
Supervisor:
Working Student For Medical Robotics in 6G
Description
The provision of medical expertise is limited in many regions of the world due to insufficient infrastructure and a lack of specialized medical personnel. 6G network technology provides the prerequisites for medical telepresence and high-precision telediagnostics. In the research project 6G-life2, we are developing a medical robotic system that enables patients to be examined remotely using 6G. To achieve this, we need support with the implementation and optimization of the robotic platform as well as with the management of the clinical testbed.
The tasks will include:
- Development, improvement, and evaluation of functions for a telemedicine robotic system
- Management and optimization of the ROS2 environment in the clinical testbed
Note: This position is offered by our project partners from the MITI Group!
The starting date is as soon as possible. Please contact Sven Kolb (sven.kolb@tum.de) directly to apply or for further information.
Prerequisites
- Knowledge in robotics software development (especially ROS2, Docker, Git)
- Experience with Franka Panda robotic arms desirable
- AI/ML knowledge is also welcome
Contact
sven.kolb@tum.de
Supervisor:
Comparison and Optimization of Map-Matching Algorithms for Cell Traces
Description
Map matching refers to a family of techniques that use map data to refine individual measurement points with imprecise position information. Most established algorithms were originally designed and evaluated for high-frequency and relatively precise data, for example GPS measurements with a sampling rate in the range of seconds and position errors of a few meters.
In cellular networks, by contrast, only the coarse location of the cell a device is currently using is available. These observations are recorded much less frequently (typically at intervals of several minutes) and carry a much higher uncertainty (often up to 15 km).
However, initial experiments with the AntMapper algorithm show that some well-known map-matching algorithms are able to reconstruct real trajectories from such low-frequency cell data with acceptable accuracy.
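The core idea behind HMM-style map matching, adapted to sparse cell observations, can be sketched as a Viterbi search: each noisy position emits a Gaussian score for nearby candidate road points, and long jumps between consecutive candidates are penalized. All coordinates below are invented toy values; a real implementation would draw candidates from actual map data within the cell's uncertainty radius.

```python
import math

def viterbi_match(observations, candidates, sigma):
    """observations: (x, y) cell-derived positions, one per time step.
    candidates: per time step, candidate road points within the uncertainty.
    Returns the highest-scoring candidate sequence."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    # score = log-emission (Gaussian around the observation)
    #       + transition penalty proportional to the jump between road points
    paths = [(-dist(observations[0], c) ** 2 / (2 * sigma ** 2), [c])
             for c in candidates[0]]
    for t in range(1, len(observations)):
        new_paths = []
        for c in candidates[t]:
            emit = -dist(observations[t], c) ** 2 / (2 * sigma ** 2)
            score, path = max(paths, key=lambda p: p[0] - dist(p[1][-1], c))
            new_paths.append((score - dist(path[-1], c) + emit, path + [c]))
        paths = new_paths
    return max(paths, key=lambda p: p[0])[1]

obs = [(0, 0), (5, 1), (10, 0)]                       # noisy cell positions
cands = [[(0, 1), (0, -4)], [(5, 0), (5, 6)], [(10, 1), (14, 0)]]
path = viterbi_match(obs, cands, sigma=2.0)
```

With minute-scale sampling and kilometer-scale errors, the interesting design questions are the transition model (hence time-expanded graphs) and how to generate candidates from very large cell footprints.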
Goals
- Building a database that links GPS trajectories with the corresponding cell information.
- Implementing a method for reconstructing GPS trajectories from cell traces, based on time-expanded graphs [1] or comparable approaches.
- Evaluating the reconstruction accuracy and identifying potential sources of error.
Prerequisites
- Solid knowledge of algorithmic methods.
- Solid programming experience (e.g., Python, Rust).
- Willingness to collect and prepare cell trace data using an app.
Important Notes
This work is carried out in cooperation with the Zentrale Stelle für Informationstechnik im Sicherheitsbereich (ZITiS).
All applications must therefore be submitted via the partner's INTERAMT portal:
- Master's thesis: https://interamt.de/koop/app/trefferliste?30&partner=339&suchText=%22abschlussarbeiten+%28master%29%22
To avoid problems with your application, please read the notes on that page carefully.
The application should contain at least the following:
- a short CV
- a current grade transcript
- the keyword ’T3-TFMK-CTTEG’ as a comment
- a short letter of motivation (optional)
For any questions and further information about the topic and the application process:
- Julian Sturm (TUM), Email: julian.sturm@tum.de
- Forschungsreferat Telekommunikation (ZITiS), Email: t3@zitis.bund.de
Contact
Julian Sturm (julian.sturm@tum.de)
Forschungsreferat Telekommunikation (ZITiS), Email: t3@zitis.bund.de
Supervisor:
Evaluating the Necessity of an Orchestration Tool in Kubernetes-Based CNF Deployments: A Design Science Approach
Kubernetes, Cloud Orchestration, 5G Core Network, Cloud-Native Network Functions
Description
In the ongoing digital transformation, telecommunications companies are shifting from Virtual Network Functions (VNFs) to Cloud-Native Network Functions (CNFs) to meet the demand for agile, scalable, and resilient services. Deutsche Telekom is at the forefront of this transition, moving its network services onto a self-hosted bare-metal cloud infrastructure using Kubernetes as the core platform for container orchestration.
Kubernetes, widely recognized for its robust orchestration capabilities, is the foundation of Deutsche Telekom's cloud-native strategy. However, as network services are usually complex software solutions, deploying and provisioning CNFs pose several orchestration challenges that may require additional tooling. Various tools on the market are designed to manage these orchestration complexities, but the necessity and efficiency of such tools in a Kubernetes-based environment remain an open question.
This thesis seeks to answer the following question: "Is an additional orchestration tool necessary for managing CNF deployments in Kubernetes, or can a custom Kubernetes operator effectively address these orchestration needs?". The purpose of this master's thesis is to evaluate whether a dedicated orchestration tool is needed when deploying and managing CNFs in a Kubernetes setup, where Kubernetes already acts as an orchestrator. This thesis will also explore the design and development of a Kubernetes operator as a potential alternative to using an external orchestration tool.
For more details, please check the PDF with the thesis description.
Prerequisites
We’re looking for motivated and technically skilled individuals to undertake a challenging and rewarding thesis project. To ensure success, the following prerequisites are essential:
- Strong Technical Acumen: A solid understanding of technical concepts and the ability to quickly adapt to and adopt new technologies.
- Programming Expertise: Proficiency in programming, ideally with experience in Go.
- Containerization Knowledge: Familiarity with container technologies for software deployment (e.g., Docker).
- (Kubernetes Experience): Prior exposure to Kubernetes is a plus but not mandatory.
Contact
- Dr. Patrick Derckx (patrick.derckx@telekom.de)
- Razvan-Mihai Ursu (razvan.ursu@tum.de)
Supervisor:
Most energy efficient Core on a private Telco Cloud: Energy optimized redundancy model for telco applications
Kubernetes, Energy Efficiency, 5G Core Network
Description
Motivation:
Deutsche Telekom operates, and continuously develops and improves, its own cloud to run internet and telephony services (Telco applications). The Kubernetes-based cloud and the Telco applications together form a TaaP (Telco as a Platform). The TaaP comprises thousands of servers and hundreds of applications. The energy efficiency of the TaaP is a key success criterion for optimizing costs, energy consumption, and carbon emissions. Hence, a concept of Full Stack Energy Management is established. The focus is to jointly optimize hardware, software, and services towards energy efficiency without affecting service availability and robustness.
Problem & Challenge:
In the Telco industry, so far, HW redundancy has been the baseline for service robustness and resilience. The introduction of virtualization and containerization concepts resulted in an additional redundancy level above the hardware. Classical redundancy models no longer apply to this multi-layer Software redundancy. Moreover, so far, there is no mathematical model that captures the service availability for these new architectures. For example, a Telco-Service can be provided via 3 data centers with 50 servers each, forming 1 cluster hosting 500 Kubernetes pods. There is a mix of data center redundancy, hardware redundancy in the data centers, and Kubernetes worker node and pod redundancy.
Specific Problem Formulation:
On a TaaP there are multiple layers of redundancy in Hardware and Software. On the one hand, there are multiple site deployments, where each site has multiple hundreds of servers. On the other hand, on each site, each server has multiple redundant hardware parts, like the power supply. Moreover, a Kubernetes Cluster, which is homed on one site, hosts multiple microservices, each with a different redundancy concept like active/passive, n+1, n+m, etc. This setup of mixed HW and SW redundancy causes inefficiency and is not easy to calculate or simulate in terms of overall service availability, network, site, redundancy, and energy consumption.
Solution Approach:
There are multiple different parameters in HW and SW that impact the service availability and energy consumption. Firstly, a comprehensive list of these parameters is required, including modeling of dependencies. Secondly, a mathematical model needs to be set up that considers all of these parameters in "one equation". Thirdly, a graphical simulation should be set up to demonstrate the dependencies and results.
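As a starting point for the "one equation" the approach asks for, the classic building blocks are serial and parallel availability combination. The sketch below chains them across invented layers and figures (all numbers are purely illustrative, not Deutsche Telekom data); the thesis would replace these with the full parameter list and dependency model.

```python
def parallel(avail, n):
    """Availability of n redundant replicas where any single one suffices:
    the system is down only if all replicas are down simultaneously."""
    return 1.0 - (1.0 - avail) ** n

def serial(*avails):
    """Availability of components that must all be up at the same time."""
    result = 1.0
    for a in avails:
        result *= a
    return result

# Invented per-layer figures purely for illustration:
server = serial(0.99, 0.995)                       # server HW and its power feed
pod_on_site = parallel(serial(server, 0.99), 2)    # 2 redundant pods on one site
service = parallel(pod_on_site, 3)                 # service served from 3 sites
```

Real deployments mix active/passive, n+1, and n+m schemes with shared failure modes across layers, so the actual model cannot simply multiply independent terms; capturing those dependencies is the core of the problem formulation.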
Expected Outcome:
A simulation and mathematical model should be developed that considers software and hardware redundancy across multiple sites and SW layers to calculate the network-wide service availability. This is key in order to further optimize the HW and SW footprint and improve sustainability. Moreover, the model should allow the optimization of the following parameters: least required HW based on predefined service availability, least energy consumption, and best redundancy.
Working on the live network or setting up a lab deployment is out of scope for this thesis. The focus of this thesis is the modeling and simulation of the deployment.
Prerequisites
- Proficiency in English. The project language is English, and the team spans across four EU countries.
- Advanced Kubernetes know-how.
- Basic knowledge about 5G Telco networks.
- Familiarity with tools such as GitLab and Wiki platforms.
- High level of self-engagement and motivation.
Contact
- Manuel Keipert (manuel.keipert@telekom.de)
- Valentin Haider (valentin.haider@tum.de)
- Razvan-Mihai Ursu (razvan.ursu@tum.de)
Supervisor:
Decentralized Federated Learning on Constrained IoT Devices
Description
The Internet of Things (IoT) is an increasingly prominent aspect of our daily lives, with connected devices offering unprecedented convenience and efficiency. As we move towards a more interconnected world, ensuring the privacy and security of data generated by these devices is paramount. That is where decentralized federated learning comes in.
Federated Learning (FL) is a machine-learning paradigm that enables multiple parties to collaboratively train a model without sharing their data directly. This thesis focuses on taking FL one step further by removing the need for a central server, allowing IoT devices to directly collaborate in a peer-to-peer manner.
In this project, you will explore and develop decentralized federated learning frameworks specifically tailored for constrained IoT devices with limited computational power, memory, and energy resources. The aim is to design and implement efficient algorithms that can harness the collective power of these devices while ensuring data privacy and device autonomy. This involves tackling challenges related to resource-constrained environments, heterogeneous device capabilities, and maintaining security and privacy guarantees.
The project offers a unique opportunity to contribute to cutting-edge research with real-world impact. Successful outcomes will enable secure and private machine learning on IoT devices, fostering new applications in areas such as smart homes, industrial automation, and wearable health monitoring.
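The peer-to-peer collaboration described above can be sketched as gossip averaging: each device repeatedly averages its model parameters with a randomly chosen peer, so all devices converge to the global mean without a central server. The device names, toy two-dimensional "models", and round count below are invented for illustration; real frameworks exchange tensors and weight updates by sample counts.

```python
import random

def gossip_round(models, rng):
    """One round: every device averages parameters with one random peer."""
    for device in list(models):
        peer = rng.choice([d for d in models if d != device])
        avg = [(a + b) / 2 for a, b in zip(models[device], models[peer])]
        models[device] = list(avg)
        models[peer] = list(avg)

# Three devices with toy parameter vectors:
models = {"dev_a": [1.0, 0.0], "dev_b": [0.0, 1.0], "dev_c": [4.0, 4.0]}
rng = random.Random(42)
for _ in range(30):
    gossip_round(models, rng)
# Pairwise averaging preserves the global sum, so every device drifts
# towards the global mean [5/3, 5/3] without any coordinator.
```

On constrained IoT devices, the interesting constraints are exactly the ones this sketch ignores: message size, energy per exchange, stragglers, and adversarial peers.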
Responsibilities:
- Literature review on decentralized federated learning, especially in relation to IoT and decentralized systems.
- Design and development of decentralized FL frameworks suitable for constrained IoT devices.
- Implementation and evaluation of the proposed framework using real-world datasets and testbeds.
- Analysis of security and privacy aspects, along with resource utilization.
- Documentation and presentation of findings in a thesis report, possibly leading to publications in top venues.
Requirements:
- Enrollment in a Master's program in Computer Engineering, Computer Science, Electrical Engineering or related fields
- Solid understanding of machine learning algorithms and frameworks (e.g., TensorFlow, PyTorch)
- Proficiency in the C and Python programming languages
- Experience with IoT devices and embedded systems development
- Excellent analytical skills and a systematic problem-solving approach
Nice to Have:
- Knowledge of cybersecurity and privacy principles
- Familiarity with blockchain or other decentralized technologies
- Interest in distributed computing and edge computing paradigms
Contact
Email: navid.asadi@tum.de
Supervisor:
Attacks on Cloud Autoscaling Mechanisms
Cloud Computing, Kubernetes, autoscaling, low and slow attacks, Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), cloud security, container orchestration
Description
In the era of cloud-native computing, Kubernetes has emerged as a leading container orchestration platform, enabling seamless scalability and reliability for modern applications.
However, with its widespread adoption comes a new frontier in cybersecurity challenges, particularly low and slow attacks that exploit autoscaling features to disrupt services subtly yet effectively.
This project aims to delve into the intricacies of these attacks, examining their impact on Kubernetes' Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), and proposing mitigation strategies for more resilient systems.
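To see why such attacks work, recall the HPA's core scaling rule, desired = ceil(currentReplicas * currentMetricValue / targetValue). The toy loop below (all load figures invented) shows that a load held just above the per-replica target keeps surplus replicas allocated round after round, even though the attacker never produces a spike:

```python
import math

def hpa_desired(current, usage_per_replica, target):
    """The HPA scaling rule: desired = ceil(current * usage / target)."""
    return max(1, math.ceil(current * usage_per_replica / target))

target = 0.5          # target CPU utilization per replica
total_load = 1.1      # "low and slow": barely more than two replicas' worth
replicas = 1
for _ in range(5):    # successive control-loop iterations
    replicas = hpa_desired(replicas, total_load / replicas, target)
# The cluster settles at ceil(1.1 / 0.5) = 3 replicas and stays there,
# paying for standing capacity against a cheap, inconspicuous load.
```

Because the per-replica utilization never crosses an alerting threshold, the waste is easy to miss, which is what makes the attack "subtle yet effective" and mitigation non-trivial.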
Responsibilities:
- Conduct a thorough literature review to identify existing knowledge gaps and research on similar attacks.
- Develop methodologies to simulate low and slow attack scenarios on Kubernetes clusters with varying configurations of autoscaling mechanisms.
- Analyze the impact of these attacks on resource utilization, service availability, and overall system performance.
- Evaluate current defense mechanisms and propose novel strategies to enhance the resilience of Kubernetes' autoscaling features.
- Implement and test selected mitigation approaches in a controlled environment.
- Document findings, present a comparative analysis of effectiveness, and discuss implications for future development in cloud security practices.
Requirements:
- A strong background in computer engineering, computer science or a related field.
- Familiarity with Kubernetes architecture and container orchestration concepts.
- Experience in deploying and managing applications on Kubernetes clusters.
- Proficiency in at least one scripting/programming language (e.g., Python, Go).
- Understanding of cloud computing and cybersecurity fundamentals.
Nice to Have:
- Prior research or hands-on experience in cloud security, particularly in the context of Kubernetes.
- Knowledge of network protocols and low-level system interactions.
- Experience with DevOps tools and practices.
Contact
Email: navid.asadi@tum.de
Supervisor:
Working Student/Research Internship - On-Device Training on Microcontrollers
Description
We are seeking a highly motivated and skilled student to replicate a research paper that explores the application of pruning techniques for on-device training on microcontrollers. The original paper demonstrated the feasibility of deploying deep neural networks on resource-constrained devices, and achieved significant reductions in model size and computational requirements while maintaining acceptable accuracy.
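As a framework-agnostic sketch of the kind of technique the paper applies, the snippet below performs magnitude pruning: it zeroes the fraction of weights with the smallest absolute value (plain Python lists stand in for real tensors; on a microcontroller this sparsity would then be exploited to shrink storage and compute).

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |value|.
    (Ties at the threshold may prune slightly more than requested.)"""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Toy weight vector: the smallest-magnitude half gets zeroed.
pruned = prune_by_magnitude([0.4, -0.05, 0.9, 0.01, -0.3, 0.2], sparsity=0.5)
```

Replicating the paper then means measuring, for such pruned networks, the accuracy/size/compute trade-off on the actual microcontroller targets rather than in simulation.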
Responsibilities:
- Extend our existing framework by implementing the pruning techniques on a microcontroller-based platform (e.g., Arduino, ESP32)
- Replicate the experiments described in the original paper to validate the results
- Evaluate the performance of the pruned models on various benchmark datasets
- Compare the results with the original paper and identify areas for improvement
- Document the replication process, results, and findings in a clear and concise manner
Requirements:
- Strong programming skills in C and Python
- Experience with deep learning frameworks (e.g., TensorFlow, PyTorch) and microcontroller-based platforms
- Familiarity with pruning techniques for neural networks is a plus
- Excellent analytical and problem-solving skills
- Ability to work independently and manage time effectively
- Strong communication and documentation skills
Contact
Email: navid.asadi@tum.de
Supervisor:
Working Student - Machine Learning Serving on Kubernetes
Machine Learning, Kubernetes, Containerization, Docker, Orchestration, Cloud Computing, MLOps, Machine Learning Operations, DevOps, Microservices Architecture,
Description
We are seeking an ambitious and forward-thinking working student to join our dynamic team working at the intersection of Machine Learning (ML) and Kubernetes. In this exciting role, you will be immersed in a cutting-edge environment where advanced ML models meet the power of container orchestration through Kubernetes. Your contributions will directly impact the development and optimization of scalable and robust ML serving systems leveraging the benefits of Kubernetes.
If you are a student passionate about both Machine Learning and Kubernetes, we invite you to join us on this exciting journey! We offer the chance to pioneer cutting-edge solutions that leverage the power of these two transformative technologies.
Responsibilities:
- Collaborate with a cross-functional team to design and implement ML workflows on Kubernetes.
- Assist in packaging and deploying ML models as microservices using containers (Docker) and managing them effectively through Kubernetes.
- Optimize resource allocation, scheduling, and scaling strategies for efficient model serving at varying workloads.
- Implement monitoring solutions specific to ML inference tasks within the Kubernetes cluster.
- Troubleshoot and debug issues related to containerized ML applications
- Document best practices, tutorials, and guides on leveraging Kubernetes for ML serving
Requirements:
- Currently enrolled in a Bachelor's or Master's program at the School of CIT
- Strong programming skills in Python with experience in software development lifecycle methodologies.
- Familiarity with machine learning frameworks such as TensorFlow and PyTorch.
- Proficiency in container technologies. Docker and Kubernetes certification would be a plus but not mandatory.
- Experience with cloud computing platforms; e.g., AWS, GCP or Azure.
- Demonstrated ability to work independently with effective time management and strong problem-solving analytical skills.
- Excellent communication and teamwork capabilities.
Nice to Have:
- Kubernetes Certification: Having a valid Kubernetes certification (CKA, CKAD, or CKE) demonstrates your expertise in container orchestration and can be a significant advantage.
- Experience with DevOps and/or MLOps Tools: Familiarity with MLOps tools such as MLflow, Kubeflow, or TensorFlow Extended (TFX) can help you streamline the machine learning workflow and improve collaboration. Experience with OpenTelemetry, Jaeger, Istio, and monitoring tools is a plus.
- Knowledge of Distributed Systems: Understanding distributed systems architecture and design patterns can help you optimize the performance and scalability of your machine learning models.
- Contributions to Open-Source Projects: Having contributed to open-source projects related to Kubernetes, machine learning, or MLOps demonstrates your ability to collaborate with others and adapt to new technologies.
- Familiarity with Agile Methodologies: Knowledge of agile development methodologies such as Scrum or Kanban can help you work efficiently in a fast-paced environment and deliver results quickly.
- Cloud-Native Application Development: Experience with cloud-native application development using frameworks like Cloud Foundry or AWS Cloud Development Kit (CDK) can be beneficial in designing scalable and efficient machine learning workflows.
Contact
Email: navid.asadi@tum.de
Supervisor:
Working Student for the Edge AI Testbed
IoT, Edge Computing, Machine Learning, Measurement, Power Characterization
Description
We are seeking a highly motivated and enthusiastic Working Student to join our team as part of the Edge AI Testbed project. As a Working Student, a key member of our research team, you will contribute to the development and testing of cutting-edge Artificial Intelligence (AI) systems at the edge of the network. You will work closely with our researchers and engineers to design, implement, and evaluate innovative AI solutions that can operate efficiently on resource-constrained edge devices.
Responsibilities:
- Assist in designing and implementing AI models for edge computing
- Develop and test software components for the Edge AI Testbed
- Collaborate with team members to integrate AI models with edge hardware platforms
- Participate in performance optimization and evaluation of AI systems on edge devices
- Contribute to the development of tools and scripts for automated testing and deployment
- Document and report on project progress, results, and findings
If you are a motivated and talented student looking to gain hands-on experience in Edge AI, we encourage you to apply for this exciting opportunity!
Requirements:
- Currently enrolled in a Bachelor's or Master's program at the School of CIT
- Strong programming skills in languages such as Python and C++
- Experience with AI frameworks such as TensorFlow, PyTorch, or Keras
- Familiarity with edge computing platforms and devices (e.g., Raspberry Pi, NVIDIA Jetson)
- Basic knowledge of Linux operating systems and shell scripting
- Excellent problem-solving skills and ability to work independently
- Strong communication and teamwork skills
Nice to Have:
- Experience with containerization using Docker
- Familiarity with cloud computing platforms (e.g., Kubernetes)
- Experience with Apache Ray
- Knowledge of computer vision or natural language processing
- Participation in open-source projects or personal projects related to AI and edge computing
Contact
Email: navid.asadi@tum.de
Supervisor:
An AI Benchmarking Suite for Microservices-Based Applications
Kubernetes, Deep Learning, Video Analytics, Microservices
Description
In the realm of AI applications, the deployment strategy significantly impacts performance metrics.
This research internship aims to investigate and benchmark AI applications in two predominant deployment configurations: monolithic and microservices-based, specifically within Kubernetes environments.
The central question revolves around understanding how these deployment strategies affect various performance metrics and determining the more efficient configuration. This inquiry is crucial as the deployment strategy plays a pivotal role in the operational efficiency of AI applications.
Currently, the field lacks a comprehensive benchmarking suite that evaluates AI applications from an end-to-end deployment perspective. Our approach includes the development of a benchmarking suite tailored for microservice-based AI applications.
This suite will capture metrics such as CPU/GPU/Memory utilization, interservice communication, end-to-end and per-service latency, and cache misses.
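Two of the listed metrics, per-service latency and end-to-end latency, fall out directly from per-request service spans. The sketch below uses invented timings for a hypothetical three-service video-analytics pipeline; a real suite would collect spans from tracing instrumentation inside the Kubernetes cluster.

```python
# Invented (start, end) span timings in seconds for one request:
spans = {"decode": (0.000, 0.012),
         "detect": (0.012, 0.045),
         "track":  (0.045, 0.052)}

# Per-service latency: duration of each service's own span.
per_service = {name: end - start for name, (start, end) in spans.items()}

# End-to-end latency: from the earliest span start to the latest span end.
end_to_end = (max(end for _, end in spans.values())
              - min(start for start, _ in spans.values()))
```

In the microservices deployment, the gap between the sum of per-service latencies and the end-to-end latency exposes inter-service communication overhead, one of the effects the benchmark is meant to quantify against the monolithic baseline.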
Prerequisites
- Familiarity with Kubernetes
- Familiarity with Deep Learning frameworks (e.g., PyTorch or TensorFlow)
- Basics of computer networking
Contact
Email: navid.asadi@tum.de
Supervisor:
Performance Evaluation of Serverless Frameworks
Serverless, Function as a Service, Machine Learning, Distributed ML
Description
Serverless computing is a cloud computing paradigm that separates infrastructure management from software development and deployment. It offers advantages such as low development overhead, fine-grained unmanaged autoscaling, and reduced customer billing. From the cloud provider's perspective, serverless reduces operational costs through multi-tenant resource multiplexing and infrastructure heterogeneity.
However, the serverless paradigm also comes with its challenges. First, a systematic methodology is needed to assess the performance of heterogeneous open-source serverless solutions; to our knowledge, existing surveys lack a thorough comparison between these frameworks. Second, there are inherent challenges associated with the serverless architecture, specifically due to its short-lived and stateless nature.
Requirements:
- Familiarity with Kubernetes
- Basics of computer networking
Contact
Email: navid.asadi@tum.de