Available Theses

Title (students; supervisors)
  • Scalable and Adaptive Stream Processing Architectures (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Fault Tolerance and Resilience in Stream Processing (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Reinforcement/Machine/Federated Learning Integration in Streaming Data (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Hybrid Batch and Stream Processing Systems (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Fault-Tolerant Stream Processing with Sparse and Quantized Transformer Partitioning in Distributed Systems (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Layer-Wise Sparse Attention and Quantization for Scalable Transformer Partitioning (1 or 2; Thomas Fahringer, Abolfazl Younesi)
  • Automatic LLM-based generation of novel language specifications (1; Juan Aznar Poveda and Marlon Etheredge)
  • Fault Tolerance for a Novel Cloud-Edge-IoT Programming Model (1; Thomas Fahringer)
  • Extension of a Novel Cloud-Edge-IoT Programming Model (1; Thomas Fahringer)
  • Generation of Realistic Use Cases, Including Evaluation and Testing (1; Thomas Fahringer)
  • Novel Programming Model and Runtime System for Edge/Cloud Infrastructures (1; Thomas Fahringer)
  • Decentralized Federated-Learning Policy Update for Smart Buildings (1; Juan Aznar)
  • Python Frontend for Serverless Workflows (1; Juan Aznar)
  • Instrumentation, Monitoring, and Visualization of Edge-Cloud Applications (1; Thomas Fahringer)
  • Automatic Data Dependence Analysis for Simple C Programs (1; Thomas Fahringer)
  • Additional list of Master theses offered by Peter Thoman (Peter Thoman)

Title Scalable and Adaptive Stream Processing Architectures
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Designing and evaluating novel algorithms and system architectures for real-time data processing in large-scale, heterogeneous environments (e.g., cloud-edge networks). Emphasis on resource allocation, parallelism, and load balancing to achieve high throughput and low latency.
Description This project involves exploring cutting-edge techniques in distributed stream processing, including partitioning strategies, dynamic scheduling, and performance modeling. Students will develop or extend existing frameworks to handle massive and high-velocity data streams.
Tasks Dynamic Resource Allocation: Create adaptive load balancing and scheduling strategies.
Scalability Mechanisms: Explore partitioning, parallelism, and distributed execution techniques.
Performance Evaluation: Develop benchmarks and simulation frameworks for empirical validation.
Theoretical skills Distributed Systems Concepts, Scalability and Performance Modeling, Queueing Theory/Scheduling Algorithms
Practical skills Programming in Java/Scala/Python, Familiarity with Apache Spark/Flink/Storm, Cloud Deployment (AWS/Azure)
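The two core mechanisms named in the tasks, key-based partitioning and windowed processing, can be illustrated with a minimal pure-Python sketch. The function names and the hash-based strategy are illustrative only, not tied to Spark, Flink, or Storm:

```python
from collections import defaultdict

def partition(key, n_workers):
    # Hash-based partitioning: a simple static load-balancing
    # strategy that pins each key to one worker.
    return hash(key) % n_workers

def tumbling_window_counts(events, window_size):
    """Count events per key in fixed-size (tumbling) time windows.

    `events` is an iterable of (timestamp, key) pairs with
    non-negative integer timestamps.
    """
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        windows[ts // window_size][key] += 1
    return {w: dict(counts) for w, counts in windows.items()}

events = [(0, "a"), (1, "b"), (2, "a"), (5, "a"), (6, "b")]
print(tumbling_window_counts(events, window_size=5))
# window 0 covers timestamps 0-4, window 1 covers 5-9
```

Adaptive strategies studied in this thesis would replace the static hash with a policy that reacts to observed load per worker.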

Title Fault Tolerance and Resilience in Stream Processing
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Investigating fault detection and recovery mechanisms, checkpointing, and consistency models in distributed stream processing frameworks to ensure reliability without sacrificing performance.
Description Students will explore techniques to improve system robustness under node failures, network disruptions, and software bugs. This includes designing lightweight checkpointing and self-healing strategies that minimize overhead while maintaining data consistency.
Tasks Lightweight Checkpointing: Implement and test new recovery methods.
Self-Healing Systems: Develop adaptive failure detection and recovery strategies.
Consistency Models: Research trade-offs between strong consistency and high performance in real-time pipelines.
Theoretical skills Distributed Systems Reliability, Consistency Models and Algorithms, Fault Tolerance Theories
Practical skills Programming in Java/Scala, Experience with Stream Processing Frameworks, Familiarity with Docker/Kubernetes
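The checkpoint-and-replay recovery idea behind the first task can be sketched in a few lines of Python. This is a deliberately simplified single-node model (the function and its failure-injection parameter are hypothetical), not a distributed protocol:

```python
def process_with_checkpoints(events, checkpoint_every, fail_at=None):
    """Sum a stream with periodic checkpointing; on failure,
    restore the last checkpoint and replay the unprocessed tail."""
    checkpoint = (0, 0)   # (events processed, running sum)
    state, seen = 0, 0
    try:
        for value in events:
            if fail_at is not None and seen == fail_at:
                raise RuntimeError("simulated node failure")
            state += value
            seen += 1
            if seen % checkpoint_every == 0:
                checkpoint = (seen, state)  # lightweight snapshot
    except RuntimeError:
        # Recovery: restore state and replay events after the checkpoint.
        seen, state = checkpoint
        for value in events[seen:]:
            state += value
            seen += 1
    return state

# With or without an injected failure, the result is identical.
print(process_with_checkpoints([1, 2, 3, 4, 5], checkpoint_every=2, fail_at=3))
```

The research trade-off is exactly what this toy hides: how often to checkpoint (overhead) versus how much must be replayed (recovery time).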

Title Reinforcement/Machine/Federated Learning Integration in Streaming Data
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Designing systems that integrate streaming data analytics with online and incremental machine learning methods, enabling continuous model training, inference, and adaptation in real time.
Description The project aims to build pipelines that can handle rapidly evolving data streams, applying machine learning models that update on the fly. Students will investigate online learning, incremental deep learning, and anomaly detection while ensuring low latency and scalability.
Tasks Online Learning Models: Adapt machine learning algorithms to learn from new data continuously.
Incremental Deep Learning: Implement and evaluate deep neural networks that update in real time.
Explainability: Investigate interpretability methods for real-time predictions.
Theoretical skills Machine Learning/Deep Learning Fundamentals, Online Learning Theory, Statistics and Data Analysis
Practical skills Python (TensorFlow, PyTorch) or Java-based ML libraries, Streaming Frameworks (Apache Flink, Kafka Streams), Model Deployment and Monitoring
Additional information Subprojects could include designing new online learning algorithms, benchmarking incremental ML frameworks, or developing explainable real-time AI.
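Online learning, as opposed to batch retraining, means one model update per arriving sample. A minimal illustration (a one-parameter linear model trained by streaming SGD; learning rate and data are arbitrary choices for the sketch):

```python
def online_update(w, x, y, lr):
    """One SGD step on the squared error (y - w*x)**2."""
    grad = -2 * (y - w * x) * x
    return w - lr * grad

w = 0.0
# Stream of (x, y) samples drawn from y = 2x; the model updates on
# every arriving sample instead of retraining over the full history.
for x, y in [(1, 2), (2, 4), (3, 6)] * 50:
    w = online_update(w, x, y, lr=0.05)
print(round(w, 3))  # converges toward 2.0
```

The thesis topics above extend this idea to deep models, concept drift, and low-latency serving.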

Title Hybrid Batch and Stream Processing Systems
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Building and evaluating unified systems that combine historical (batch) data analytics with real-time (stream) processing, enabling comprehensive insights and efficient data lifecycle management.
Description This project explores architectures that handle both high-throughput batch workloads and low-latency streaming workloads. Students will investigate query optimization, data synchronization, and transactional consistency across batch and stream pipelines.
Tasks Unified Data Models: Propose or refine data models that integrate batch and stream data.
Transactional Consistency: Examine consistency and synchronization challenges in hybrid systems.
System Integration: Prototype solutions bridging real-time analytics with big data warehousing.
Theoretical skills Big Data Architectures, Database and Transaction Theory, Distributed Computing Concepts
Practical skills Hadoop/Spark for batch processing, Kafka/Flume/Flink for streaming, SQL and NoSQL Databases
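A common pattern for such hybrid systems is to answer queries by merging a precomputed batch view with recent results from the speed (streaming) layer. A minimal sketch of that merge step, with hypothetical example data:

```python
def merge_views(batch_view, speed_view):
    """Serve a query by combining the precomputed batch view with
    counts accumulated by the real-time (speed) layer."""
    merged = dict(batch_view)
    for key, count in speed_view.items():
        merged[key] = merged.get(key, 0) + count
    return merged

batch_view = {"clicks": 1000, "views": 5000}   # from the nightly batch job
speed_view = {"clicks": 12, "signups": 3}      # events since the last batch run
print(merge_views(batch_view, speed_view))
```

The hard problems the thesis targets (transactional consistency, avoiding double counting when the batch view is refreshed) start exactly where this sketch ends.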

Title Fault-Tolerant Stream Processing with Sparse and Quantized Transformer Partitioning in Distributed Systems
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Design a fault-tolerant partitioning framework for Transformer models that integrates sparse computation and quantization techniques to handle stream processing in distributed systems.
Description The project aims to partition and deploy Transformer-based DNNs for streaming data while ensuring continuous operation and resilience against node failures. The framework will maintain efficient processing even under resource constraints by utilizing sparsity and quantization.
Tasks • Develop partitioning strategies for Transformer models using sparse computation.
• Integrate quantization techniques to reduce computational overhead.
• Implement fault-tolerance mechanisms for distributed stream processing.
• Evaluate the framework using benchmarks and fault injection experiments.
Theoretical skills Distributed Systems, Fault Tolerance, Sparse Computation, Neural Network Quantization
Practical skills Programming in Python/C++, Deep Learning Frameworks, Experience with Distributed Computing
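To make the quantization ingredient concrete: uniform symmetric quantization maps each weight to a small integer plus a shared scale, shrinking the data that partitions must ship between nodes. A pure-Python sketch (the function names and 8-bit choice are illustrative):

```python
def quantize(weights, bits=8):
    """Uniform symmetric quantization of a weight list to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.75]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# each restored weight is within half a quantization step of the original
```

In the thesis, such quantized (and sparsified) layer shards would be the units that are replicated or re-deployed when a node fails.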

Title Layer-Wise Sparse Attention and Quantization for Scalable Transformer Partitioning
Number of students 1 – 2 (preferred)
Language English
Supervisors Thomas Fahringer, Abolfazl Younesi
Focus Develop a layer-wise strategy that leverages sparse attention and quantization to enable efficient partitioning of Transformer models across multiple edge cloud platforms, optimizing resource usage and scalability.
Description This project focuses on enhancing the scalability of Transformer models in multi-cloud environments. The approach aims to optimize resource allocation and reduce inter-cloud communication overhead while ensuring high model performance by applying sparse attention mechanisms and quantization layer-by-layer.
Tasks • Analyze Transformer architecture for partitioning opportunities.
• Develop layer-wise sparse attention mechanisms.
• Integrate quantization techniques to minimize resource usage.
• Benchmark the partitioning strategy in multi-cloud setups.
Theoretical skills Deep Learning Architectures, Sparse Computation, Quantization Methods, Multi-Cloud Computing Concepts
Practical skills Experience with Deep Learning Frameworks, Cloud Deployment (AWS, Azure, GCP), Programming in Python/C++
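The sparse attention ingredient can be illustrated on a single attention row: keep only the top-k scores and renormalize, so most key positions need not be fetched from remote partitions. A pure-Python sketch (a layer-wise scheme would choose k per layer; all names here are illustrative):

```python
import math

def sparse_attention_row(scores, k):
    """Top-k sparse attention for one query: softmax over the k
    largest scores, zero weight everywhere else."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    exps = {i: math.exp(scores[i]) for i in top}
    total = sum(exps.values())
    return [exps.get(i, 0.0) / total for i in range(len(scores))]

weights = sparse_attention_row([2.0, 0.1, 1.5, -1.0], k=2)
# only positions 0 and 2 receive non-zero attention weight
```

Zeroed positions correspond to key/value blocks that never have to cross the inter-cloud link, which is where the communication savings come from.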

Title Automatic LLM-based generation of novel language specifications
Number of students 1
Language English
Supervisors Juan Aznar Poveda and Marlon Etheredge
Description We research and develop a novel programming model based on state machines for high-performance, fault-tolerant, and dynamic Edge-Cloud applications. Automating the language specification would greatly increase developers' productivity and favor massive deployments across the continuum. For this purpose, this topic focuses on (i) investigating suitable Large Language Models able to automatically convert requirements into applications written in the novel language developed by the DPS, (ii) implementing the tool and integrating it into our system, and (iii) rigorously assessing it for various use cases.
Tasks
  • Study LLMs
  • Implement the LLM-based language generator
  • Complete evaluation for different use cases
Theoretical skills
  • AI/ML foundations
  • Cloud computing
  • Programming models
  • Formal specifications

Familiarity with computer science topics such as the following will become important throughout this project: artificial intelligence, cloud computing, distributed systems, state machines.

Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of AI/ML
Additional information IMPORTANT: As the system that supports our programming model is intended to run large-scale cloud applications, any implementation must be of high quality. Therefore, software and code quality are essential aspects of this work. For this reason, we are looking for students with excellent programming abilities.

Title Fault Tolerance for a Novel Cloud-Edge-IoT Programming Model
Number of students 1
Language English
Supervisors Thomas Fahringer
Description We research and develop a novel Cloud-Edge-IoT programming model for high-performance, fault-tolerant, and dynamic cloud applications.
Our programming model lends itself to cloud applications that require fault tolerance. For this reason, we would like to extend our programming model with fault-tolerant constructs.
For this project, it is required to investigate possible extensions to our programming model, implement these extensions in our system, and perform a complete evaluation, including use case development.
Tasks
  • Study fault-tolerant extensions to our programming model
  • Implement the extensions in the system that implements our programming model
  • Complete evaluation of the added extensions (theoretical and practical)
  • Development of use cases that demonstrate the extensions
Theoretical skills
  • Cloud computing
  • Programming models
  • Formal specifications

Familiarity with computer science topics such as the following will become important throughout this project: consensus algorithms, state machines, distributed systems, and cloud computing.

Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of distributed systems and cloud computing
Additional information As the system that supports our programming model is intended to run large-scale cloud applications, any implementation must be of high quality. Therefore, software and code quality are essential aspects of this work. For this reason, we are looking for students with excellent programming abilities.

Title Extension of a Novel Cloud-Edge-IoT Programming Model
Number of students 1
Language English
Supervisors Thomas Fahringer
Description We research and develop a novel Cloud-Edge-IoT programming model for high-performance, fault-tolerant, and dynamic cloud applications.
Our programming model could be extended to increase the level of abstraction and reusability of developed components.
Extensions that we are currently considering as being useful additions to our programming models are analogous to concepts such as:

  • Abstraction
  • Encapsulation
  • Inheritance
  • Polymorphism
  • Reusability

However, new ideas and their application to our programming model will be integral to this project.
For this project, it is required to investigate possible extensions to our programming model, implement these extensions in our system, and perform a complete evaluation, including use case development.

Tasks
  • Studying new extensions to our programming model
  • Implementation of extensions
Theoretical skills
  • Cloud computing
  • Programming language design
  • Parsers
  • Formal specification
Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of distributed systems and cloud computing
Additional information As the system that supports our programming model is intended to run large-scale cloud applications, any implementation must be of high quality. Therefore, software and code quality are essential aspects of this work. For this reason, we are looking for students with excellent programming abilities.

Title Generation of Realistic Use Cases, Including Evaluation and Testing
Number of students 1
Language English
Supervisors Thomas Fahringer
Description We research and develop a novel Cloud-Edge-IoT programming model for high-performance, fault-tolerant, and dynamic cloud applications. As an extension to this work, we are currently interested in accepting a bachelor student to work on extending the system with tooling that simplifies the development of cloud applications.
An essential aspect of our programming model is the study of realistic (real-world) use cases. Such use cases may include applications such as smart cities, smart buildings, surveillance, and AI analysis.
The generation of realistic use cases will entail the following:

  • The studying (identification) of such realistic use cases
  • Modeling of applications for these use cases using our programming model
  • Complete evaluation of the applications in a real-world environment
  • Comparison to existing programming models

The expected outcome of this project is the development of several realistic use cases.
For this project, it is required to study possible real-world use cases, develop applications for these use cases, implement required services, and perform a complete evaluation of any implemented use case.

Tasks
  • Studying novel and realistic real-world use cases
  • Modeling of applications for use cases using our programming model
  • Evaluating use cases in a real-world environment
  • Evaluation and comparison with existing programming models
Theoretical skills
  • Cloud computing
  • Modeling

Familiarity with computer science topics such as the following will become important throughout this project: modeling and AI analysis (for instance, image analysis).

Practical skills
  • Excellent programming abilities in Python as well as Java
  • In-depth knowledge of distributed systems and cloud computing
Additional information As our system is intended to solve real-world problems, any applications developed must be of high quality. For this reason, we are looking for students with excellent programming abilities.

Title Novel Programming Model and Runtime System for Edge/Cloud Infrastructures
Number of students 1
Language English
Supervisors Thomas Fahringer
Description Develop a novel programming model (language) which is based on a set of state machines and transitions among these state machines triggered by events. Develop a highly distributed and scalable runtime system that executes programs written in this novel programming model. Test it with 2 – 3 applications that should be developed as part of this thesis. Testing will be done in a local edge laboratory combined with a public cloud.
Tasks
  • Specify a novel programming language based on state machines with YAML or JSON
  • Convert programs written in the above language into a Java intermediate representation
  • Develop a runtime system that can execute programs written in the above language targeting edge/cloud infrastructures
  • Reuse existing event-based environments and monitoring frameworks as part of the runtime system
  • Study related work and compare against this new approach
  • Test the overall framework with 2 – 3 applications that should be developed as part of this thesis
  • Incorporate rigorous testing from the very beginning in the development process
Theoretical skills Cloud computing, Serverless, Machine/Federated Learning
Practical skills Java (expert programmer), git and GitHub, JSON or YAML, serverless computing, Container and Virtual Machine technologies
Additional information You should have passed the lecture and PS on Verteilte Systeme in the computer science bachelor program or otherwise demonstrate that you are familiar with the practical skills mentioned above. You should be an expert programmer in Java. The student will have the opportunity to work with a state-of-the-art Apollo Edge-Cloud infrastructure. The developed environments will be reused for international projects and published as open source. Collaborative work in an international project is possible if the student is interested. In the best case, this work can also be published and the student can travel to a conference to present his/her work.
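As a purely hypothetical illustration of an event-driven state-machine language specified in JSON, a minimal specification plus interpreter might look like the following. The field names are an assumption for this sketch, not the actual DPS language:

```python
import json

# Hypothetical JSON specification: named states and event-driven
# transitions (the schema is illustrative only).
spec = json.loads("""
{
  "initial": "idle",
  "transitions": [
    {"from": "idle",    "event": "start", "to": "running"},
    {"from": "running", "event": "done",  "to": "idle"}
  ]
}
""")

def run(spec, events):
    """Drive the state machine through a sequence of events."""
    state = spec["initial"]
    table = {(t["from"], t["event"]): t["to"] for t in spec["transitions"]}
    for event in events:
        state = table.get((state, event), state)  # ignore unknown events
    return state

print(run(spec, ["start", "done", "start"]))  # → running
```

The thesis would replace this toy interpreter with a distributed runtime, where each state machine instance can execute on a different edge or cloud node and events travel over the network.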

Title Decentralized Federated-Learning Policy Update for Smart Buildings
Number of students 1
Language English
Supervisors Juan Aznar
Description Smart buildings (SB) should provide smart and automatic responses to undesired events such as fire, earthquakes, water leakages, etc. To this end, edge devices are deployed to recognize risky scenarios in SBs. In this thesis, a federated learning [1] scheme should be applied to optimally select the actions taken by the smart building and to improve the different machine learning models trained over time in a distributed fashion. Finally, the Apollo system [2] will be used not only to exploit parallelism and scalability, but also to intelligently select resources across the cloud-edge continuum according to the nature of the scheduled tasks.
[1] https://www.tensorflow.org/federated
[2] https://apollowf.github.io/learn.html
Tasks
  • Deploy a federated learning approach to improve smart building capabilities over time using decentralized data stemming from edge devices.
  • Create one or more simple workflows whose tasks will be orchestrated by Apollo in a distributed fashion.
  • Study the performance and scalability of the proposed solution and compare it with a solution without shared learning across edge devices and without orchestration.
Theoretical skills Cloud computing, Serverless, Machine/Federated Learning
Practical skills Python, TensorFlow (Lite & Federated), git, GitHub
Additional information
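The aggregation step at the heart of federated learning (FedAvg-style weighted averaging of client models) fits in a few lines. A pure-Python sketch, with made-up client data, of what TensorFlow Federated performs at scale:

```python
def federated_average(client_weights, client_sizes):
    """FedAvg aggregation: average client model weights,
    weighted by each client's local data size."""
    total = sum(client_sizes)
    dims = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dims)
    ]

# Two edge devices with different amounts of local data: the device
# with more samples contributes more to the global model.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [10, 30])
print(global_model)  # → [2.5, 3.5]
```

In the thesis, rounds of local training on the edge devices would alternate with this aggregation step, orchestrated as Apollo workflow tasks.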

Title Python Frontend for Serverless Workflows
Number of students 1
Language English
Supervisors Juan Aznar
Description Apollo (https://apollowf.github.io/) is the DPS research orchestration and runtime system for Edge-Cloud infrastructures. We use AFCL (https://apollowf.github.io/learn.html) to describe serverless workflows for distributed applications. As part of this thesis, you will have to create a Python version of AFCL so that application developers can build workflows with Python programs instead of using AFCL directly. Furthermore, you have to create a transformation system that automatically converts these Python programs into AFCL, which is the input to Apollo.
Tasks
  • Create a Python specification that fully represents the AFCL language constructs, so that every AFCL program also has a Python representation.
  • There are multiple solution paths to this problem, for instance, building a parser or a transformation system that converts the Python representation into AFCL. Other solutions may be possible as well.
  • Your solution should be modular and easy to extend in case of any changes to AFCL.
  • Convert at least 3 AFCL use cases into the Python representation.
Theoretical skills
Practical skills Advanced Python programmer, git and GitHub, JSON or YAML
Additional information It is not mandatory but of great help if you passed the lecture and PS on Verteilte Systeme in the computer science bachelor program. This Master Thesis will be supervised by Juan Aznar (IFI/DPS). The student will have the opportunity to work with a state-of-the-art Apollo Edge-Cloud infrastructure. The developed Python frontend will be reused for international projects and published as open source. Collaborative work in an international project is possible if the student is interested. In the best case, this work can also be published and the student can travel to a conference to present his/her work.
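One plausible shape for such a frontend is a fluent builder whose method calls accumulate a workflow description that a backend then serializes. This sketch emits plain JSON with an invented layout; the real target would be the AFCL schema, which this example does not reproduce:

```python
import json

class Workflow:
    """Hypothetical fluent builder: Python calls accumulate a workflow
    description that a backend could translate into AFCL. The JSON
    layout below is illustrative, not the real AFCL schema."""

    def __init__(self, name):
        self.name, self.functions = name, []

    def function(self, name, source):
        self.functions.append({"name": name, "source": source})
        return self  # return self to enable call chaining

    def to_json(self):
        return json.dumps({"name": self.name, "workflow": self.functions})

wf = Workflow("demo").function("resize", "resize.py").function("detect", "detect.py")
print(wf.to_json())
```

The modularity requirement in the tasks suggests keeping the builder (user-facing API) strictly separated from the serializer (AFCL emitter), so AFCL changes touch only the latter.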

Title Instrumentation, Monitoring, and Visualization of Edge-Cloud Applications
Number of students 1
Language English
Supervisors Thomas Fahringer
Description Port two existing applications or develop new applications for our own Edge-Cloud infrastructure at DPS. For this purpose, you will have to read papers that describe such applications or find them on the internet; the more realistic these applications are, the better. Next, you will have to implement an instrumentation, monitoring, and analysis service under our Apollo (https://apollowf.github.io/) orchestration and runtime system for Edge-Cloud infrastructures. You will have to instrument the application and the runtime system for various parameters such as runtime, memory, transfer time, energy consumption, economic costs, etc. Then you will have to port an existing monitoring system, or develop your own, that collects the performance data in a highly decentralized fashion. The monitoring data should be analyzed in real time within a Dashboard to be developed for this purpose. The Askalon Visualization Diagrams (http://www.dps.uibk.ac.at/projects/askalon/visualization) or any other suitable service can be used for this purpose. For this work, we should try to reuse as much software as possible; however, the result should be stable and sustainable as part of the Apollo system.
Tasks
  • Port two existing applications to the Edge-Cloud infrastructure with APOLLO (AFCL)
  • Develop or port an existing instrumentation service for applications and runtime system to the APOLLO system
  • Develop a scalable and highly decentralized monitoring service to collect instrumented data of the above applications
  • Develop a Dashboard to visualize performance data in real time
  • Visualization could be done based on the Askalon Visualization Diagrams.
Theoretical skills
Practical skills Advanced Java programmer, Distributed Systems, Cloud systems, Docker, git and GitHub
Additional information It is not mandatory but of great help if you passed the lecture and PS on Verteilte Systeme in the computer science bachelor program. The student will have the opportunity to work with a state-of-the-art Apollo Edge-Cloud infrastructure. The developed instrumentation and monitoring system will be reused for international projects and published as open source. Collaborative work in an international project is possible if the student is interested. In the best case, this work can also be published and the student can travel to a conference to present his/her work.
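Instrumentation of the kind described above often starts as a wrapper that records metrics around each task invocation. A minimal sketch, assuming an in-memory metrics list where a real system would push records to a decentralized monitoring service:

```python
import time
from functools import wraps

metrics = []  # stand-in for a monitoring service collecting from all nodes

def instrument(fn):
    """Record the wall-clock runtime of every call to `fn`."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        metrics.append({"task": fn.__name__,
                        "runtime_s": time.perf_counter() - start})
        return result
    return wrapper

@instrument
def work(n):
    return sum(range(n))

work(1000)
print(metrics[0]["task"], metrics[0]["runtime_s"] >= 0.0)
```

The thesis extends this idea to further parameters (memory, transfer time, energy, cost) and to collecting and visualizing the records in real time.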

Title Automatic Data Dependence Analysis for Simple C Programs
Number of students 1
Language German or English
Supervisor Thomas Fahringer
Description We have developed a simple dependence testing tool for simple C programs. The goal of this project is to detect and fix errors in this tool and to extend its data dependence analysis capabilities. Among other things, the compiler of this tool should be extended for countable dependence testing, which inserts code to instrument and monitor array subscript expressions that are written into a trace file. At the end of the execution of such programs, the trace file is analyzed and dependences are determined based on its content.
A next step would be to include a new dependence tester, such as one based on the polyhedral library, replacing the existing dependence tester in the above-mentioned tool with the objective of improving the accuracy of dependence testing.
Tasks
  • Understand the internals of the existing tool, test and debug where necessary.
  • Update the tool for countable dependence testing based on compiler technology.
  • Add the polyhedral dependence tester as a new dependence test to improve the accuracy of the dependence testing.
  • Visualization of results.
  • Development of a test suite and extensive testing.
Theoretical Skills Data dependence analysis, compiler technology (e.g., flex and bison)
Practical Skills Scripting languages, C or C++
Additional Information
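For context, one classical static test that such a tool may employ is the GCD test: two array accesses A[a1*i + c1] and A[a2*j + c2] can refer to the same element only if gcd(a1, a2) divides c2 - c1. A minimal sketch (function name is illustrative):

```python
from math import gcd

def may_depend(a1, c1, a2, c2):
    """GCD dependence test for accesses A[a1*i + c1] and A[a2*j + c2]:
    a dependence is possible only if gcd(a1, a2) divides c2 - c1."""
    return (c2 - c1) % gcd(a1, a2) == 0

# A[2*i] vs A[2*j + 1]: even vs odd indices, never the same element
print(may_depend(2, 0, 2, 1))  # → False
# A[4*i] vs A[2*j]: gcd 2 divides 0, a dependence is possible
print(may_depend(4, 0, 2, 0))  # → True
```

Note the test is conservative: a True answer means "possible", not "certain", which is why more precise polyhedral testers are worth adding.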