Available Theses

Title Student(s) Supervisor Description
Dynamic and fault tolerant scheduling of scientific workflow ensembles using spot instances 1 Sashko Ristov details
Cost- and performance-driven scheduling algorithm for tasks with various resource requirements 1 Sashko Ristov details
Pro-active memory management with page faults prediction in clouds 1 Sashko Ristov details
Disaster tolerant and cost effective replica placement in content delivery networks 1 Sashko Ristov details
Experiments and Data Analysis for Clouds 1 Thomas Fahringer details

Title Dynamic and fault tolerant scheduling of scientific workflow ensembles using spot instances
Number of students 1
Language English
Supervisor Sashko Ristov
Description Public cloud providers offer various on-demand IaaS resources that differ in capacity, price, and reliability. For example, Amazon offers on-demand instances, which are more reliable than spot instances but more expensive. Amazon can reclaim a spot instance if the bid price falls below the current market price or if available resources run short. In return, spot instances are offered at discounts of up to 90% off the on-demand price. Users can therefore choose their own trade-off between these parameters. The goal of this thesis is to develop a new algorithm that schedules the execution of many scientific workflows while considering fault tolerance, price, and performance.
Tasks
  • Review the state of the art solutions
  • Design a model for executions
  • Define heuristics to reduce the search space
  • Evaluate the model with simulation
Theoretical Skills Optimization
Practical Skills Java
Additional Information
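As a rough illustration of the trade-off described above, the following Python sketch replays one task on a spot instance against an hourly price trace and compares the result with an on-demand run. The function name, the prices, and the trace are all hypothetical. The sketch also assumes that an interrupted hour yields no progress and no charge, that billing follows the market price, and that capacity-driven interruptions and checkpointing can be ignored.

```python
def spot_run(bid, market_prices, hours_needed):
    """Simulate one task on a spot instance, hour by hour.

    Assumption: the instance is reclaimed in any hour where the bid is
    below the market price; that hour brings no progress and no charge.
    Returns (hours_elapsed, cost), or None if the task does not finish
    within the price trace.
    """
    done, cost = 0, 0.0
    for hour, price in enumerate(market_prices, start=1):
        if bid < price:
            continue  # interrupted: no progress this hour
        done += 1
        cost += price  # assumption: billed at the market price, not the bid
        if done == hours_needed:
            return hour, cost
    return None


# Hypothetical numbers, for illustration only.
ON_DEMAND_PRICE = 1.00
trace = [0.10, 0.12, 0.35, 0.11, 0.10, 0.40, 0.12]

result = spot_run(bid=0.20, market_prices=trace, hours_needed=4)
on_demand = (4, 4 * ON_DEMAND_PRICE)
print("spot:", result, "on-demand:", on_demand)
```

With these made-up numbers the spot run finishes one hour later than the on-demand run but at a fraction of the cost, which is the kind of trade-off a scheduler for workflow ensembles would have to weigh.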

Title Cost- and performance-driven scheduling algorithm for tasks with various resource requirements
Number of students 1
Language English
Supervisor Sashko Ristov
Description Horizontal and vertical scaling in the cloud usually speeds up the execution of jobs and tasks or increases throughput (more tasks can be executed). However, both jobs and resources are heterogeneous, which makes scaling inefficient. The goal of this thesis is to define and evaluate an algorithm that optimizes the overall job execution at minimum cost by consolidating several tasks with different requirements within a single virtual machine, so that resources are utilized more efficiently. This consolidation is expected to reduce cost significantly with only a small loss in performance.
Tasks
  • Review the state of the art solutions
  • Design a model for executions
  • Define heuristics to reduce the search space
  • Evaluate the model with simulation
Theoretical Skills Optimization
Practical Skills Java
Additional Information
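To make the consolidation idea above more concrete, here is a minimal Python sketch that packs tasks with different CPU and memory requirements onto identical virtual machines using a first-fit decreasing heuristic. This is a stand-in technique chosen for illustration, not the algorithm the thesis is meant to produce. The `VM` class, the `consolidate` function, and all task sizes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    cpu: float  # free vCPUs
    mem: float  # free memory in GiB
    tasks: list = field(default_factory=list)

    def fits(self, task):
        return task["cpu"] <= self.cpu and task["mem"] <= self.mem

    def place(self, task):
        self.cpu -= task["cpu"]
        self.mem -= task["mem"]
        self.tasks.append(task["name"])


def consolidate(tasks, vm_cpu, vm_mem):
    """First-fit decreasing: put each task on the first VM with room."""
    vms = []
    # Largest tasks first, ranked by their dominant share of one VM.
    ordered = sorted(
        tasks,
        key=lambda t: max(t["cpu"] / vm_cpu, t["mem"] / vm_mem),
        reverse=True,
    )
    for task in ordered:
        target = next((vm for vm in vms if vm.fits(task)), None)
        if target is None:
            target = VM(cpu=vm_cpu, mem=vm_mem)
            if not target.fits(task):
                raise ValueError(f"task {task['name']} exceeds VM capacity")
            vms.append(target)
        target.place(task)
    return vms


# Hypothetical tasks with complementary resource profiles.
tasks = [
    {"name": "cpu-bound", "cpu": 3, "mem": 1},
    {"name": "mem-bound", "cpu": 1, "mem": 6},
    {"name": "small-a", "cpu": 1, "mem": 2},
    {"name": "small-b", "cpu": 1, "mem": 2},
]
vms = consolidate(tasks, vm_cpu=4, vm_mem=8)
for i, vm in enumerate(vms):
    print(i, vm.tasks)
```

In this toy input the CPU-bound and the memory-bound task share one VM because their requirements complement each other, so four tasks need two VMs instead of four. The sketch does not model the performance interference that such co-location may cause.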

Title Pro-active memory management with page faults prediction in clouds
Number of students 1
Language English
Supervisor Sashko Ristov
Description Many memory pages in the RAM of virtual machines are identical, because their operating system, code segment, and many parts of the data segment are the same. However, there is no cooperation among these virtual machines, neither between virtual machines hosted on different servers nor among the servers themselves. If an instruction or data access generates a TLB miss on one node, the same instruction or data access will likely generate a TLB miss on another node of the cluster that hosts the oblivious, horizontally scaled VM siblings. The goal of this thesis is to develop automatic and autonomous memory management across multiple servers (and their hosted virtual machines). A profiler will analyze memory access patterns and page faults to design a predictor, which will improve the memory organization and enable better exploitation of memory patterns. The final goal is to reduce the page fault rate.
Tasks
  • Review the state of the art solutions
  • Reconfigure a hypervisor
  • Develop a profiler and predictor
  • Evaluate the model
Theoretical Skills Memory page faults, operating systems, hypervisors
Practical Skills Xen, Java
Additional Information
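The profiler-and-predictor idea above can be pictured with a very small Python sketch: a first-order Markov predictor that guesses the next faulting page from the page that faulted last. This model is only an illustrative assumption; the thesis description does not prescribe a prediction technique. The class name and the page-fault trace are hypothetical.

```python
from collections import Counter, defaultdict


class NextFaultPredictor:
    """Predict the next faulting page from the page that faulted last."""

    def __init__(self):
        self.successors = defaultdict(Counter)  # page -> counts of next pages
        self.last = None

    def observe(self, page):
        if self.last is not None:
            self.successors[self.last][page] += 1
        self.last = page

    def predict(self):
        seen = self.successors.get(self.last)
        if not seen:
            return None  # no history for this page yet
        return seen.most_common(1)[0][0]


# Hypothetical trace of faulting page numbers, e.g. collected by a profiler.
trace = [7, 8, 9, 7, 8, 9, 7, 8]
predictor = NextFaultPredictor()
hits = 0
for page in trace:
    if predictor.predict() == page:
        hits += 1  # this page could have been prefetched
    predictor.observe(page)
print(f"{hits} of {len(trace)} faults predicted")
```

A predictor trained on one VM could, in principle, be shared with sibling VMs on other servers that run the same code, which is the kind of cooperation the description says is currently missing.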

Title Disaster tolerant and cost effective replica placement in content delivery networks
Number of students 1
Language English
Supervisor Sashko Ristov
Description Fault tolerance is an important feature of a distributed system because it ensures transparency: a user should not be aware of the failure of one or several system components. Replicating content is one approach to improving fault tolerance. This thesis focuses on the content placement problem in a content delivery network of data centers. The goal of the thesis is to determine an optimal number of replicas and an optimized placement of all content in a distributed system while optimizing two conflicting objectives: network utilization and fault tolerance.
Tasks
  • Review the state of the art solutions
  • Design a model for executions
  • Define heuristics to reduce the search space
  • Evaluate the model with simulation
Theoretical Skills Optimization
Practical Skills Java
Additional Information
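The conflict between the two objectives above can be sketched in a few lines of Python. The example enumerates every replica placement for a single content item over three data centers and keeps the Pareto-optimal ones. The data centers, their failure probabilities, and the per-replica shipping costs (used here as a crude proxy for network utilization) are hypothetical, and failures are assumed to be independent. Exhaustive enumeration only works at toy scale, which is why the task list calls for heuristics.

```python
from itertools import combinations

# Hypothetical data centers: (failure probability, cost of shipping a replica).
DATA_CENTERS = {
    "A": (0.05, 1.0),
    "B": (0.10, 0.5),
    "C": (0.02, 3.0),
}


def evaluate(placement):
    """Return (unavailability, network_cost) for a set of data centers."""
    unavailable = 1.0
    cost = 0.0
    for dc in placement:
        p_fail, ship = DATA_CENTERS[dc]
        # Assumption: independent failures, so content is lost only if
        # every replica fails.
        unavailable *= p_fail
        cost += ship
    return unavailable, cost


def pareto_front(candidates):
    """Keep placements that no other placement beats on both objectives."""
    front = []
    for name, (u, c) in candidates.items():
        dominated = any(
            u2 <= u and c2 <= c and (u2 < u or c2 < c)
            for other, (u2, c2) in candidates.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


candidates = {
    "".join(p): evaluate(p)
    for k in range(1, len(DATA_CENTERS) + 1)
    for p in combinations(sorted(DATA_CENTERS), k)
}
for name in pareto_front(candidates):
    print(name, candidates[name])
```

With these numbers, placing a single replica in C is dominated: replicas in A and B together are both cheaper and less likely to be unavailable. Every other placement represents a different point on the trade-off curve.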

Title Experiments and Data Analysis for Clouds
Number of students 1
Language German
Supervisor Thomas Fahringer
Description The goal of this thesis is to carry out a series of experiments to evaluate the properties and capabilities of cloud infrastructures (e.g., Amazon EC2). Numerous virtual machine (VM) instances are tested with small programs. The times for the VMs and the programs are measured and then analyzed. The measured times include the time until a VM is allocated and started, the time to execute the programs (including measurement of memory and CPU consumption), the time to release the VM again, and many more. A large number of experiments are launched by a script program. VMs and programs must be instrumented beforehand. The measured data must be stored in a database and then statistically analyzed and visualized. A special feature is the consideration of spot instances, which are particularly cheap but can be withdrawn by the cloud provider at any time. To obtain such spot VMs, a bidding procedure must be implemented. The aim is a better understanding of cloud resources for different programs. In particular, the trade-off between performance and cost is to be examined more closely.
Tasks
  • Script program for instrumenting VMs and programs
  • Script program for reading measurement data and storing it in a database
  • Implementation of a bidding procedure for cloud spot instances
  • Execution of experiments on a real cloud infrastructure
  • Analysis and visualization of the measured data
Theoretical Skills Basic knowledge of statistics
Practical Skills Scripting language, databases, data visualization
Additional Information
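The measure-store-analyze loop described above might look roughly like the following Python sketch, which times a small program several times, stores the durations in an SQLite database, and computes a simple statistic. The table layout, the `measure` helper, and the stand-in benchmark command are assumptions made for illustration. A real experiment would additionally record VM allocation and release times as well as memory and CPU consumption.

```python
import sqlite3
import subprocess
import sys
import time


def measure(command):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


db = sqlite3.connect(":memory:")  # a file path would persist the results
db.execute("CREATE TABLE runs (program TEXT, seconds REAL)")

# Stand-in for a small benchmark program (hypothetical workload).
command = [sys.executable, "-c", "sum(range(100000))"]
for _ in range(3):
    db.execute("INSERT INTO runs VALUES (?, ?)", ("sum-range", measure(command)))
db.commit()

count, mean = db.execute("SELECT COUNT(*), AVG(seconds) FROM runs").fetchone()
print(f"{count} runs, mean {mean:.3f} s")
```

Keeping the raw measurements in a database rather than only the aggregates makes it possible to run further statistical analyses and visualizations later without repeating the experiments.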