Available Theses

Title (students; supervisors)
  • Scalable and Adaptive Stream Processing Architectures (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Fault Tolerance and Resilience in Stream Processing (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Reinforcement/Machine/Federated Learning Integration in Streaming Data (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Hybrid Batch and Stream Processing Systems (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Fault-Tolerant Stream Processing with Sparse and Quantized Transformer Partitioning in Distributed Systems (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Layer-Wise Sparse Attention and Quantization for Scalable Transformer Partitioning (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Automatic LLM-based generation of novel language specifications (1; Juan Aznar Poveda and Marlon Etheredge)
  • Fault Tolerance for a Novel Cloud-Edge-IoT Programming Model (1; Thomas Fahringer)
  • Extension of a Novel Cloud-Edge-IoT Programming Model (1; Thomas Fahringer)
  • Generation of Realistic Use Cases, Including Evaluation and Testing (1; Thomas Fahringer)
  • Novel Programming Model and Runtime System for Edge/Cloud Infrastructures (1; Thomas Fahringer)
  • Decentralized Federated-Learning Policy Update for Smart Buildings (1; Juan Aznar)
  • Python Frontend for Serverless Workflows (1; Juan Aznar)
  • Instrumentation, Monitoring, and Visualization of Edge-Cloud Applications (1; Thomas Fahringer)
  • Automatic Data Dependence Analysis for Simple C Programs (1; Thomas Fahringer)
  • Additional list of Master theses offered by Peter Thoman (Peter Thoman)

Title Scalable and Adaptive Stream Processing Architectures
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Designing and evaluating novel algorithms and system architectures for real-time data processing in large-scale, heterogeneous environments (e.g., cloud-edge networks). Emphasis on resource allocation, parallelism, and load balancing to achieve high throughput and low latency.
Description This project involves exploring cutting-edge techniques in distributed stream processing, including partitioning strategies, dynamic scheduling, and performance modeling. Students will develop or extend existing frameworks to handle massive and high-velocity data streams.
Tasks Dynamic Resource Allocation: Create adaptive load balancing and scheduling strategies.
Scalability Mechanisms: Explore partitioning, parallelism, and distributed execution techniques.
Performance Evaluation: Develop benchmarks and simulation frameworks for empirical validation.
Theoretical skills Distributed Systems Concepts, Scalability and Performance Modeling, Queueing Theory/Scheduling Algorithms
Practical skills Programming in Java/Scala/Python, Familiarity with Apache Spark/Flink/Storm, Cloud Deployment (AWS/Azure)
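The two core mechanisms named in the tasks, key-based partitioning and windowed processing, can be illustrated with a minimal pure-Python sketch. The function names and the hash-based strategy are illustrative only, not tied to Spark, Flink, or Storm:

```python
from collections import defaultdict

def partition(key, n_workers):
    # Hash-based partitioning: a simple static load-balancing
    # strategy that pins each key to one worker.
    return hash(key) % n_workers

def tumbling_window_counts(events, window_size):
    """Count events per key in fixed-size (tumbling) time windows.

    `events` is an iterable of (timestamp, key) pairs with
    non-negative integer timestamps.
    """
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        windows[ts // window_size][key] += 1
    return {w: dict(counts) for w, counts in windows.items()}

events = [(0, "a"), (1, "b"), (2, "a"), (5, "a"), (6, "b")]
print(tumbling_window_counts(events, window_size=5))
# window 0 covers timestamps 0-4, window 1 covers 5-9
```

Adaptive strategies studied in this thesis would replace the static hash with a policy that reacts to observed load per worker.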

Title Fault Tolerance and Resilience in Stream Processing
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Investigating fault detection and recovery mechanisms, checkpointing, and consistency models in distributed stream processing frameworks to ensure reliability without sacrificing performance.
Description Students will explore techniques to improve system robustness under node failures, network disruptions, and software bugs. This includes designing lightweight checkpointing and self-healing strategies that minimize overhead while maintaining data consistency.
Tasks Lightweight Checkpointing: Implement and test new recovery methods.
Self-Healing Systems: Develop adaptive failure detection and recovery strategies.
Consistency Models: Research trade-offs between strong consistency and high performance in real-time pipelines.
Theoretical skills Distributed Systems Reliability, Consistency Models and Algorithms, Fault Tolerance Theories
Practical skills Programming in Java/Scala, Experience with Stream Processing Frameworks, Familiarity with Docker/Kubernetes
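The checkpoint-and-replay recovery idea behind the first task can be sketched in a few lines of Python. This is a deliberately simplified single-node model (the function and its failure-injection parameter are hypothetical), not a distributed protocol:

```python
def process_with_checkpoints(events, checkpoint_every, fail_at=None):
    """Sum a stream with periodic checkpointing; on failure,
    restore the last checkpoint and replay the unprocessed tail."""
    checkpoint = (0, 0)   # (events processed, running sum)
    state, seen = 0, 0
    try:
        for value in events:
            if fail_at is not None and seen == fail_at:
                raise RuntimeError("simulated node failure")
            state += value
            seen += 1
            if seen % checkpoint_every == 0:
                checkpoint = (seen, state)  # lightweight snapshot
    except RuntimeError:
        # Recovery: restore state and replay events after the checkpoint.
        seen, state = checkpoint
        for value in events[seen:]:
            state += value
            seen += 1
    return state

# With or without an injected failure, the result is identical.
print(process_with_checkpoints([1, 2, 3, 4, 5], checkpoint_every=2, fail_at=3))
```

The research trade-off is exactly what this toy hides: how often to checkpoint (overhead) versus how much must be replayed (recovery time).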

Title Reinforcement/Machine/Federated Learning Integration in Streaming Data
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Designing systems that integrate streaming data analytics with online and incremental machine learning methods, enabling continuous model training, inference, and adaptation in real time.
Description The project aims to build pipelines that can handle rapidly evolving data streams, applying machine learning models that update on the fly. Students will investigate online learning, incremental deep learning, and anomaly detection while ensuring low latency and scalability.
Tasks Online Learning Models: Adapt machine learning algorithms to learn from new data continuously.
Incremental Deep Learning: Implement and evaluate deep neural networks that update in real time.
Explainability: Investigate interpretability methods for real-time predictions.
Theoretical skills Machine Learning/Deep Learning Fundamentals, Online Learning Theory, Statistics and Data Analysis
Practical skills Python (TensorFlow, PyTorch) or Java-based ML libraries, Streaming Frameworks (Apache Flink, Kafka Streams), Model Deployment and Monitoring
Additional information Subprojects could include designing new online learning algorithms, benchmarking incremental ML frameworks, or developing explainable real-time AI.
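Online learning, as opposed to batch retraining, means one model update per arriving sample. A minimal illustration (a one-parameter linear model trained by streaming SGD; learning rate and data are arbitrary choices for the sketch):

```python
def online_update(w, x, y, lr):
    """One SGD step on the squared error (y - w*x)**2."""
    grad = -2 * (y - w * x) * x
    return w - lr * grad

w = 0.0
# Stream of (x, y) samples drawn from y = 2x; the model updates on
# every arriving sample instead of retraining over the full history.
for x, y in [(1, 2), (2, 4), (3, 6)] * 50:
    w = online_update(w, x, y, lr=0.05)
print(round(w, 3))  # converges toward 2.0
```

The thesis topics above extend this idea to deep models, concept drift, and low-latency serving.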

Title Hybrid Batch and Stream Processing Systems
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Building and evaluating unified systems that combine historical (batch) data analytics with real-time (stream) processing, enabling comprehensive insights and efficient data lifecycle management.
Description This project explores architectures that handle both high-throughput batch workloads and low-latency streaming workloads. Students will investigate query optimization, data synchronization, and transactional consistency across batch and stream pipelines.
Tasks Unified Data Models: Propose or refine data models that integrate batch and stream data.
Transactional Consistency: Examine consistency and synchronization challenges in hybrid systems.
System Integration: Prototype solutions bridging real-time analytics with big data warehousing.
Theoretical skills Big Data Architectures, Database and Transaction Theory, Distributed Computing Concepts
Practical skills Hadoop/Spark for batch processing, Kafka/Flume/Flink for streaming, SQL and NoSQL Databases
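A common pattern for such hybrid systems is to answer queries by merging a precomputed batch view with recent results from the speed (streaming) layer. A minimal sketch of that merge step, with hypothetical example data:

```python
def merge_views(batch_view, speed_view):
    """Serve a query by combining the precomputed batch view with
    counts accumulated by the real-time (speed) layer."""
    merged = dict(batch_view)
    for key, count in speed_view.items():
        merged[key] = merged.get(key, 0) + count
    return merged

batch_view = {"clicks": 1000, "views": 5000}   # from the nightly batch job
speed_view = {"clicks": 12, "signups": 3}      # events since the last batch run
print(merge_views(batch_view, speed_view))
```

The hard problems the thesis targets (transactional consistency, avoiding double counting when the batch view is refreshed) start exactly where this sketch ends.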

Title Fault-Tolerant Stream Processing with Sparse and Quantized Transformer Partitioning in Distributed Systems
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Design a fault-tolerant partitioning framework for Transformer models that integrates sparse computation and quantization techniques to handle stream processing in distributed systems.
Description The project aims to partition and deploy Transformer-based DNNs for streaming data while ensuring continuous operation and resilience against node failures. The framework will maintain efficient processing even under resource constraints by utilizing sparsity and quantization.
Tasks • Develop partitioning strategies for Transformer models using sparse computation.
• Integrate quantization techniques to reduce computational overhead.
• Implement fault-tolerance mechanisms for distributed stream processing.
• Evaluate the framework using benchmarks and fault injection experiments.
Theoretical skills Distributed Systems, Fault Tolerance, Sparse Computation, Neural Network Quantization
Practical skills Programming in Python/C++, Deep Learning Frameworks, Experience with Distributed Computing
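To make the quantization ingredient concrete: uniform symmetric quantization maps each weight to a small integer plus a shared scale, shrinking the data that partitions must ship between nodes. A pure-Python sketch (the function names and 8-bit choice are illustrative):

```python
def quantize(weights, bits=8):
    """Uniform symmetric quantization of a weight list to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.75]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# each restored weight is within half a quantization step of the original
```

In the thesis, such quantized (and sparsified) layer shards would be the units that are replicated or re-deployed when a node fails.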

Title Layer-Wise Sparse Attention and Quantization for Scalable Transformer Partitioning
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Develop a layer-wise strategy that leverages sparse attention and quantization to enable efficient partitioning of Transformer models across multiple edge cloud platforms, optimizing resource usage and scalability.
Description This project focuses on enhancing the scalability of Transformer models in multi-cloud environments. The approach aims to optimize resource allocation and reduce inter-cloud communication overhead while ensuring high model performance by applying sparse attention mechanisms and quantization layer-by-layer.
Tasks • Analyze Transformer architecture for partitioning opportunities.
• Develop layer-wise sparse attention mechanisms.
• Integrate quantization techniques to minimize resource usage.
• Benchmark the partitioning strategy in multi-cloud setups.
Theoretical skills Deep Learning Architectures, Sparse Computation, Quantization Methods, Multi-Cloud Computing Concepts
Practical skills Experience with Deep Learning Frameworks, Cloud Deployment (AWS, Azure, GCP), Programming in Python/C++
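The sparse attention ingredient can be illustrated on a single attention row: keep only the top-k scores and renormalize, so most key positions need not be fetched from remote partitions. A pure-Python sketch (a layer-wise scheme would choose k per layer; all names here are illustrative):

```python
import math

def sparse_attention_row(scores, k):
    """Top-k sparse attention for one query: softmax over the k
    largest scores, zero weight everywhere else."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    exps = {i: math.exp(scores[i]) for i in top}
    total = sum(exps.values())
    return [exps.get(i, 0.0) / total for i in range(len(scores))]

weights = sparse_attention_row([2.0, 0.1, 1.5, -1.0], k=2)
# only positions 0 and 2 receive non-zero attention weight
```

Zeroed positions correspond to key/value blocks that never have to cross the inter-cloud link, which is where the communication savings come from.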

Title Automatic LLM-based generation of novel language specifications
Number of students 1
Language English
Supervisors Juan Aznar Poveda and Marlon Etheredge
Description We research and develop a novel programming model based on state machines for high-performance, fault-tolerant, and dynamic Edge-Cloud applications. Automating the language specification would greatly increase developers' productivity and favor massive deployments across the continuum. For this purpose, this topic focuses on (i) investigating suitable Large Language Models able to automatically convert requirements into applications written in the novel language developed by the DPS, (ii) implementing the tool and integrating it into our system, and (iii) rigorously assessing it for various use cases.
Tasks
  • Study LLMs
  • Implement the LLM-based language generator
  • Complete evaluation for different use cases
Theoretical skills
  • AI/ML foundations
  • Cloud computing
  • Programming models
  • Formal specifications

Familiarity with computer science topics such as the following will become important throughout this project: artificial intelligence, cloud computing, distributed systems, state machines.

Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of AI/ML
Additional information IMPORTANT: As the system that supports our programming model is intended to run large-scale cloud applications, any implementation must be of high quality. Therefore, software and code quality are essential aspects of this work. For this reason, we are looking for students with excellent programming abilities.

Title Fault Tolerance for a Novel Cloud-Edge-IoT Programming Model
Number of students 1
Language English
Supervisors Thomas Fahringer
Description We research and develop a novel Cloud-Edge-IoT programming model for high-performance, fault-tolerant, and dynamic cloud applications.
Our programming model lends itself to cloud applications that require fault tolerance. For this reason, we would like to extend our programming model with fault-tolerant constructs.
For this project, it is required to investigate possible extensions to our programming model, implement these extensions in our system, and perform a complete evaluation, including use case development.
Tasks
  • Study fault-tolerant extensions to our programming model
  • Implement the extensions in the system that implements our programming model
  • Complete evaluation of the added extensions (theoretical and practical)
  • Development of use cases that demonstrate the extensions
Theoretical skills
  • Cloud computing
  • Programming models
  • Formal specifications

Familiarity with computer science topics such as the following will become important throughout this project: consensus algorithms, state machines, distributed systems, and cloud computing.

Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of distributed systems and cloud computing
Additional information As the system that supports our programming model is intended to run large-scale cloud applications, any implementation must be of high quality. Therefore, software and code quality are essential aspects of this work. For this reason, we are looking for students with excellent programming abilities.

Title Extension of a Novel Cloud-Edge-IoT Programming Model
Number of students 1
Language English
Supervisors Thomas Fahringer
Description We research and develop a novel Cloud-Edge-IoT programming model for high-performance, fault-tolerant, and dynamic cloud applications.
Our programming model could be extended to increase the level of abstraction and reusability of developed components.
Extensions that we are currently considering as being useful additions to our programming models are analogous to concepts such as:

  • Abstraction
  • Encapsulation
  • Inheritance
  • Polymorphism
  • Reusability

However, new ideas and their application to our programming model will be integral to this project.
For this project, it is required to investigate possible extensions to our programming model, implement these extensions in our system, and perform a complete evaluation, including use case development.

Tasks
  • Studying new extensions to our programming model
  • Implementation of extensions
Theoretical skills
  • Cloud computing
  • Programming language design
  • Parsers
  • Formal specification
Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of distributed systems and cloud computing
Additional information As the system that supports our programming model is intended to run large-scale cloud applications, any implementation must be of high quality. Therefore, software and code quality are essential aspects of this work. For this reason, we are looking for students with excellent programming abilities.

Title Generation of Realistic Use Cases, Including Evaluation and Testing
Number of students 1
Language English
Supervisors Thomas Fahringer
Description We research and develop a novel Cloud-Edge-IoT programming model for high-performance, fault-tolerant, and dynamic cloud applications. As an extension to this work, we are currently interested in accepting a bachelor student to work on extending the system with tooling that simplifies the development of cloud applications.
An essential aspect of our programming model is the study of realistic (real-world) use cases. Such use cases may include applications such as smart cities, smart buildings, surveillance, and AI analysis.
The generation of realistic use cases will entail the following:

  • The studying (identification) of such realistic use cases
  • Modeling of applications for these use cases using our programming model
  • Complete evaluation of the applications in a real-world environment
  • Comparison to existing programming models

The expected outcome of this project is the development of several realistic use cases.
For this project, it is required to study possible real-world use cases, develop applications for these use cases, implement required services, and perform a complete evaluation of any implemented use case.

Tasks
  • Studying novel and realistic real-world use cases
  • Modeling of applications for use cases using our programming model
  • Evaluating use cases in a real-world environment
  • Evaluation and comparison with existing programming models
Theoretical skills
  • Cloud computing
  • Modeling

Familiarity with computer science topics such as the following will become important throughout this project: modeling and AI analysis (for instance, image analysis).

Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of distributed systems and cloud computing
Additional information As our system is intended to solve real-world problems, any applications developed must be of high quality. For this reason, we are looking for students with excellent programming abilities.

Title Novel Programming Model and Runtime System for Edge/Cloud Infrastructures
Number of students 1
Language English
Supervisors Thomas Fahringer
Description Develop a novel programming model (language) which is based on a set of state machines and transitions among these state machines triggered by events. Develop a highly distributed and scalable runtime system that executes programs written in this novel programming model. Test it with 2 – 3 applications that should be developed as part of this thesis. Testing will be done in a local edge laboratory combined with a public cloud.
Tasks
  • Specify a novel programming language based on state machines with YAML or JSON
  • Convert programs written in the above language into a Java intermediate representation
  • Develop a runtime system that can execute programs written in the above language targeting edge/cloud infrastructures
  • Reuse existing event-based environments and monitoring frameworks as part of the runtime system
  • Study related work and compare against this new approach
  • Test the overall framework with 2 – 3 applications that should be developed as part of this thesis
  • Incorporate rigorous testing from the very beginning in the development process
Theoretical skills Cloud computing, Serverless, Machine/Federated Learning
Practical skills Java (expert programmer), git and GitHub, JSON or YAML, serverless computing, Container and Virtual Machine technologies
Additional information You should have passed the lecture and PS on Verteilte Systeme in the computer science bachelor program or otherwise demonstrate that you are familiar with the practical skills mentioned above. You should be an expert programmer in Java. The student will have the opportunity to work with a state-of-the-art Apollo Edge-Cloud infrastructure. The developed environments will be reused for international projects and published as open source. Collaborative work in an international project is possible if the student is interested. In the best case, this work can also be published and the student can travel to a conference to present his/her work.
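As a purely hypothetical illustration of an event-driven state-machine language specified in JSON, a minimal specification plus interpreter might look like the following. The field names are an assumption for this sketch, not the actual DPS language:

```python
import json

# Hypothetical JSON specification: named states and event-driven
# transitions (the schema is illustrative only).
spec = json.loads("""
{
  "initial": "idle",
  "transitions": [
    {"from": "idle",    "event": "start", "to": "running"},
    {"from": "running", "event": "done",  "to": "idle"}
  ]
}
""")

def run(spec, events):
    """Drive the state machine through a sequence of events."""
    state = spec["initial"]
    table = {(t["from"], t["event"]): t["to"] for t in spec["transitions"]}
    for event in events:
        state = table.get((state, event), state)  # ignore unknown events
    return state

print(run(spec, ["start", "done", "start"]))  # → running
```

The thesis would replace this toy interpreter with a distributed runtime, where each state machine instance can execute on a different edge or cloud node and events travel over the network.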

Title Decentralized Federated-Learning Policy Update for Smart Buildings
Number of students 1
Language English
Supervisors Juan Aznar
Description Smart buildings (SB) should provide smart and automatic responses to undesired events such as fire, earthquakes, water leakages, etc. To this end, edge devices are deployed to recognize risky scenarios in SBs. In this thesis, a federated learning [1] scheme should be applied to optimally select the actions taken by the smart building and to improve the different machine learning models trained over time in a distributed fashion. Finally, the Apollo system [2] will be used not only to exploit parallelism and scalability, but also to intelligently select resources across the cloud-edge continuum according to the nature of the scheduled tasks.
[1] https://www.tensorflow.org/federated
[2] https://apollowf.github.io/learn.html
Tasks
  • Deploy a federated learning approach to improve smart building capabilities over time using decentralized data stemming from edge devices.
  • Create one or more simple workflows whose tasks will be orchestrated by Apollo in a distributed fashion.
  • Study the performance and scalability of the proposed solution and compare it with a solution without shared learning across edge devices and without orchestration.
Theoretical skills Cloud computing, Serverless, Machine/Federated Learning
Practical skills Python, TensorFlow (Lite & Federated), git, GitHub
Additional information
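The aggregation step at the heart of federated learning (FedAvg-style weighted averaging of client models) fits in a few lines. A pure-Python sketch, with made-up client data, of what TensorFlow Federated performs at scale:

```python
def federated_average(client_weights, client_sizes):
    """FedAvg aggregation: average client model weights,
    weighted by each client's local data size."""
    total = sum(client_sizes)
    dims = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dims)
    ]

# Two edge devices with different amounts of local data: the device
# with more samples contributes more to the global model.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [10, 30])
print(global_model)  # → [2.5, 3.5]
```

In the thesis, rounds of local training on the edge devices would alternate with this aggregation step, orchestrated as Apollo workflow tasks.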

Title Python Frontend for Serverless Workflows
Number of students 1
Language English
Supervisors Juan Aznar
Description Apollo (https://apollowf.github.io/) is the DPS research orchestration and runtime system for Edge-Cloud infrastructures. We use AFCL (https://apollowf.github.io/learn.html) to describe serverless workflows for distributed applications. As part of this thesis, you will have to create a Python version of AFCL so that application developers can build workflows with Python programs instead of using AFCL directly. Furthermore, you have to create a transformation system that automatically converts these Python programs into AFCL, which is the input to Apollo.
Tasks
  • Create a Python specification that fully represents the AFCL language constructs, so that every AFCL program also has a Python representation.
  • There are multiple solution paths to this problem, for instance, building a parser or a transformation system that converts the Python representation into AFCL. Other solutions may be possible as well.
  • Your solution should be modular and easy to extend in case of any changes to AFCL.
  • Convert at least 3 AFCL use cases into the Python representation.
Theoretical skills
Practical skills Advanced Python programmer, git and GitHub, JSON or YAML
Additional information It is not mandatory but of great help if you passed the lecture and PS on Verteilte Systeme in the computer science bachelor program. This Master Thesis will be supervised by Juan Aznar (IFI/DPS). The student will have the opportunity to work with a state-of-the-art Apollo Edge-Cloud infrastructure. The developed Python frontend will be reused for international projects and published as open source. Collaborative work in an international project is possible if the student is interested. In the best case, this work can also be published and the student can travel to a conference to present his/her work.
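One plausible shape for such a frontend is a fluent builder whose method calls accumulate a workflow description that a backend then serializes. This sketch emits plain JSON with an invented layout; the real target would be the AFCL schema, which this example does not reproduce:

```python
import json

class Workflow:
    """Hypothetical fluent builder: Python calls accumulate a workflow
    description that a backend could translate into AFCL. The JSON
    layout below is illustrative, not the real AFCL schema."""

    def __init__(self, name):
        self.name, self.functions = name, []

    def function(self, name, source):
        self.functions.append({"name": name, "source": source})
        return self  # return self to enable call chaining

    def to_json(self):
        return json.dumps({"name": self.name, "workflow": self.functions})

wf = Workflow("demo").function("resize", "resize.py").function("detect", "detect.py")
print(wf.to_json())
```

The modularity requirement in the tasks suggests keeping the builder (user-facing API) strictly separated from the serializer (AFCL emitter), so AFCL changes touch only the latter.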

Title Instrumentation, Monitoring, and Visualization of Edge-Cloud Applications
Number of students 1
Language English
Supervisors Thomas Fahringer
Description Port two existing applications or develop new applications for our own Edge-Cloud infrastructure at DPS. For this purpose, you will have to read papers that describe such applications or find them on the internet; the more realistic these applications are, the better. Next, you will have to implement an instrumentation, monitoring, and analysis service under our Apollo (https://apollowf.github.io/) orchestration and runtime system for Edge-Cloud infrastructures. You will have to instrument the application and the runtime system for various parameters such as runtime, memory, transfer time, energy consumption, economic costs, etc. Then you will have to port an existing monitoring system, or develop your own, that collects the performance data in a highly decentralized fashion. The monitoring data should be analyzed in real time within a Dashboard to be developed for this purpose. The Askalon Visualization Diagrams (http://www.dps.uibk.ac.at/projects/askalon/visualization) or any other suitable service can be used for this purpose. For this work, we should try to reuse as much software as possible; however, the result should be stable and sustainable as part of the Apollo system.
Tasks
  • Port two existing applications to the Edge-Cloud infrastructure with APOLLO (AFCL)
  • Develop or port an existing instrumentation service for applications and runtime system to the APOLLO system
  • Develop a scalable and highly decentralized monitoring service to collect instrumented data of the above applications
  • Develop a Dashboard to visualize performance data in real time
  • Visualization could be done based on the Askalon Visualization Diagrams.
Theoretical skills
Practical skills Advanced Java programmer, Distributed Systems, Cloud systems, Docker, git and GitHub
Additional information It is not mandatory but of great help if you passed the lecture and PS on Verteilte Systeme in the computer science bachelor program. The student will have the opportunity to work with a state-of-the-art Apollo Edge-Cloud infrastructure. The developed instrumentation and monitoring system will be reused for international projects and published as open source. Collaborative work in an international project is possible if the student is interested. In the best case, this work can also be published and the student can travel to a conference to present his/her work.
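Instrumentation of the kind described above often starts as a wrapper that records metrics around each task invocation. A minimal sketch, assuming an in-memory metrics list where a real system would push records to a decentralized monitoring service:

```python
import time
from functools import wraps

metrics = []  # stand-in for a monitoring service collecting from all nodes

def instrument(fn):
    """Record the wall-clock runtime of every call to `fn`."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        metrics.append({"task": fn.__name__,
                        "runtime_s": time.perf_counter() - start})
        return result
    return wrapper

@instrument
def work(n):
    return sum(range(n))

work(1000)
print(metrics[0]["task"], metrics[0]["runtime_s"] >= 0.0)
```

The thesis extends this idea to further parameters (memory, transfer time, energy, cost) and to collecting and visualizing the records in real time.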

Title Automatic Data Dependence Analysis for Simple C Programs
Number of students 1
Language German or English
Supervisor Thomas Fahringer
Description We have developed a simple dependence testing tool for simple C programs. The goal of this project is to detect and fix errors in this tool and to extend its data dependence analysis capabilities. Among other things, the compiler of this tool should be extended for countable dependence testing, which inserts code to instrument and monitor array subscript expressions that are written into a trace file. At the end of the execution of such programs, the trace file is analyzed and dependences are determined based on its content.
A next step would be to include a new dependence tester, such as one based on the polyhedral library, replacing the existing dependence tester in the above-mentioned tool with the objective of improving the accuracy of dependence testing.
Tasks
  • Understand the internals of the existing tool, test and debug where necessary.
  • Update the tool for countable dependence testing based on compiler technology.
  • Add the polyhedral dependence tester as a new dependence test to improve the accuracy of the dependence testing.
  • Visualization of results.
  • Development of a test suite and extensive testing.
Theoretical Skills Data dependence analysis, compiler technology (e.g., flex and bison)
Practical Skills Scripting languages, C or C++
Additional Information
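For context, one classical static test that such a tool may employ is the GCD test: two array accesses A[a1*i + c1] and A[a2*j + c2] can refer to the same element only if gcd(a1, a2) divides c2 - c1. A minimal sketch (function name is illustrative):

```python
from math import gcd

def may_depend(a1, c1, a2, c2):
    """GCD dependence test for accesses A[a1*i + c1] and A[a2*j + c2]:
    a dependence is possible only if gcd(a1, a2) divides c2 - c1."""
    return (c2 - c1) % gcd(a1, a2) == 0

# A[2*i] vs A[2*j + 1]: even vs odd indices, never the same element
print(may_depend(2, 0, 2, 1))  # → False
# A[4*i] vs A[2*j]: gcd 2 divides 0, a dependence is possible
print(may_depend(4, 0, 2, 0))  # → True
```

Note the test is conservative: a True answer means "possible", not "certain", which is why more precise polyhedral testers are worth adding.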