Open Theses, Internship Opportunities, and Working Student Positions
Ongoing Theses
Scaling and Model Disaggregating Distributed ML Systems: Monolithic vs. Microservices Performance Analysis
Kubernetes, Deep Learning, Video Analytics, Microservices
Description
The increasing complexity of video analysis tasks has led to the development of sophisticated machine learning applications that rely on multiple interconnected deep-learning models. However, deploying these applications on edge servers with limited resources poses significant challenges in terms of balancing response time, accuracy, and resource utilization.
This research aims to investigate the trade-offs between monolithic and microservice-based architectures for multi-model video analytics applications. We will leverage Apache Ray and Kubernetes to develop a comprehensive benchmarking and monitoring pipeline that systematically analyzes scaling configurations and resource utilization. Our goal is to identify strategies for reducing latency and communication overhead while handling the complexity of multi-model architectures.
Through case studies, we will implement two applications consisting of multiple ML models and explore various deployment configurations, from monolithic to fully microservice-based. We will analyze their performance under different scaling strategies and investigate the impact of model disaggregation on performance metrics during scaling.
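To illustrate what "various deployment configurations" could look like in practice, the stdlib-only sketch below enumerates candidate replica counts per model under a per-node resource budget. The model names, per-replica CPU costs, and node budget are hypothetical placeholders, not figures from the project:

```python
from itertools import product

# Hypothetical per-replica CPU cost (cores) for each model in a
# two-model video-analytics pipeline. These numbers are assumptions.
MODELS = {"detector": 2.0, "classifier": 1.0}
NODE_CPUS = 8.0  # assumed per-node CPU budget

def candidate_configs(max_replicas=4):
    """Enumerate replica counts per model that fit the CPU budget."""
    names = list(MODELS)
    configs = []
    for counts in product(range(1, max_replicas + 1), repeat=len(names)):
        cost = sum(MODELS[n] * c for n, c in zip(names, counts))
        if cost <= NODE_CPUS:
            configs.append(dict(zip(names, counts)))
    return configs

for cfg in candidate_configs():
    print(cfg)
```

A benchmarking pipeline would then deploy and measure each surviving configuration, from a single monolithic replica set up to fully disaggregated per-model services.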
The research will focus on addressing key challenges, including:
- Implementing effective monitoring and profiling across distributed services
- Balancing performance and resource utilization across multiple models
- Researching scalability solutions to meet strict latency requirements in edge computing scenarios
Our expected outcomes include a deeper understanding of the trade-offs between monolithic and microservice-based architectures and the development of strategies for optimizing the deployment of multi-model ML applications.
Supervisor:
Finding Anti-Patterns in Microservices-Based AI Applications
Machine Learning, Kubernetes, Microservices
Description
This project aims to identify, analyze, and mitigate anti-patterns in AI applications that use microservices-based architectures in distributed systems. The focus will be on Apache Ray within a Kubernetes environment. The goal is to develop strategies to optimize performance and resource efficiency by addressing common anti-patterns that can degrade system performance, such as inefficient communication between services, over-provisioning of resources, and poor task distribution.
To achieve this, the project will benchmark AI applications, including video surveillance and action detection systems, to understand how antipatterns emerge and affect key performance metrics like latency, throughput, and resource utilization.
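For instance, the "inefficient communication" anti-pattern can be made concrete with a simple cost model comparing per-frame remote calls against batched calls, where each call pays a fixed serialization/network overhead. The overhead and compute figures below are assumptions for illustration, not measurements:

```python
# Hypothetical cost model for one detector -> classifier hop:
# each remote call pays a fixed overhead (serialization + network),
# plus per-item compute time. Both constants are assumed values.
CALL_OVERHEAD_MS = 5.0
PER_ITEM_MS = 2.0

def total_latency_ms(n_frames: int, batch_size: int) -> float:
    """Total time to push n_frames through the hop with a given batch size."""
    n_calls = -(-n_frames // batch_size)  # ceiling division
    return n_calls * CALL_OVERHEAD_MS + n_frames * PER_ITEM_MS

chatty = total_latency_ms(100, batch_size=1)    # 100 calls
batched = total_latency_ms(100, batch_size=25)  # 4 calls
print(chatty, batched)  # prints 700.0 220.0
```

Benchmarking would aim to surface exactly this kind of gap: same workload, same models, very different latency and throughput depending on how the services communicate.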
Supervisor:
Performance Evaluation of Serverless Frameworks
Serverless, Function as a Service, Machine Learning, Distributed ML
Description
Serverless computing is a cloud computing paradigm that separates infrastructure management from software development and deployment. It offers advantages such as low development overhead, fine-grained autoscaling that requires no management by the developer, and reduced customer costs through pay-per-use billing. From the cloud provider's perspective, serverless reduces operational costs through multi-tenant resource multiplexing and infrastructure heterogeneity.
However, the serverless paradigm also comes with its challenges. First, a systematic methodology is needed to assess the performance of heterogeneous open-source serverless solutions; to our knowledge, existing surveys lack a thorough comparison of these frameworks. Second, there are inherent challenges associated with the serverless architecture itself, stemming from its short-lived, stateless execution model.
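To make the statelessness challenge concrete, the toy sketch below contrasts a function whose in-memory state is discarded on every invocation (as on a platform that spins up a fresh instance per request) with one that externalizes its state. Here a plain dict stands in for an external cache or database; this is a simulation, not any particular framework's API:

```python
# Simulate a stateless function platform: each invocation starts
# from a fresh environment, so in-memory state does not survive.
def invoke_stateless():
    counter = 0          # re-initialized on every invocation
    counter += 1
    return counter

external_store = {}      # stands in for an external cache/database

def invoke_with_external_state():
    # State survives because it lives outside the function instance.
    external_store["counter"] = external_store.get("counter", 0) + 1
    return external_store["counter"]

print([invoke_stateless() for _ in range(3)])            # prints [1, 1, 1]
print([invoke_with_external_state() for _ in range(3)])  # prints [1, 2, 3]
```

The round trip to an external store is exactly the kind of overhead a performance evaluation of serverless frameworks has to account for.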