Current Theses

Title | Student | Supervisor | Description
Parallelizing a Real-world Astrophysics Application With Legion: A Venture Into the Unknown | Clemens Prosser | Philipp Gschwandtner | Details
Power and Energy Efficiency Analysis of HPC Workloads on Modern CPU Architectures | Thomas Klotz | Philipp Gschwandtner | Details
Porting and Optimization of RF Pulse Scheduling for Trapped Ion Quantum Computing | Mederika Zangerl | Philipp Gschwandtner | Details
Performance Effects of GPU Buffer Indexing Methods in Structured Grid Applications | Julian Stecher | Philipp Gschwandtner | Details
Adaptive Mesh Refinement in Hydrodynamics: Redesigning Cronos | Joshua Ocker | Philipp Gschwandtner | Details

Title Parallelizing a Real-world Astrophysics Application With Legion: A Venture Into the Unknown
Student Clemens Prosser
Language English
Supervisor Philipp Gschwandtner
Description Designing fast and highly efficient parallel applications for distributed memory systems entails overcoming several obstacles, with efficient data decomposition and distribution being one of the key requirements. This makes application development in HPC a tedious and error-prone process, as the most prominent distributed-memory programming model, MPI, leaves data decomposition, resource management and application tuning to hard-coded solutions implemented by the developer. Several endeavours aim at mitigating this issue, including Legion. Legion is a data-centric programming model for writing high-performance applications for distributed heterogeneous architectures by implicitly extracting parallelism and relinquishing control over data movement to the runtime system. The goal of this bachelor thesis is to port an existing structured grid application from astrophysics, Cronos, to Legion, while evaluating both the programmer effort entailed and the performance and scalability of the resulting application.
Tasks
  • Code analysis of the Cronos reimplementation and the original implementation
  • Study of the Legion programming model
  • Design and implementation of a Legion-based parallelization for Cronos
  • Performance evaluation on several hardware platforms
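As an illustration of the hand-written data decomposition that the description says MPI leaves to the developer, the following is a minimal sketch of a 1D block decomposition. It is not taken from Cronos or Legion; the names `Range` and `block_range` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Half-open index range [begin, end) owned by one process.
// (Hypothetical helper type, not part of Cronos, MPI or Legion.)
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Split n_cells grid cells into n_parts contiguous blocks whose sizes differ
// by at most one cell. The first (n_cells % n_parts) blocks get one extra cell.
Range block_range(std::size_t n_cells, std::size_t n_parts, std::size_t part) {
    assert(n_parts > 0 && part < n_parts);
    const std::size_t base = n_cells / n_parts;
    const std::size_t extra = n_cells % n_parts;
    const std::size_t begin = part * base + (part < extra ? part : extra);
    const std::size_t size = base + (part < extra ? 1 : 0);
    return {begin, begin + size};
}
```

With 10 cells and 3 parts, this assigns the ranges [0, 4), [4, 7) and [7, 10). In a data-centric model such as Legion, this kind of bookkeeping is what the runtime system is meant to take over.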
Theoretical Skills modern parallel programming models, high performance computing, scientific computing
Practical Skills C++, Legion, scientific computing, debugging
Additional Information

Title Power and Energy Efficiency Analysis of HPC Workloads on Modern CPU Architectures
Student Thomas Klotz
Language English
Supervisor Philipp Gschwandtner
Description Modern CPUs feature complex mechanisms to manage the trade-off between performance and energy, such as DVFS or power capping. Part of the data used to drive these mechanisms is available to the user, enabling detailed analyses of the power and energy efficiency of various workloads and comparison across architectures. The goal of this thesis is to investigate modern power control and measurement technologies available in contemporary processors, and use them to gain insight into the efficiency of HPC-relevant workloads.
Tasks
  • Survey and analysis of modern power control and measurement technologies
  • Implementation of suitable benchmarks that explore e.g. parallelism and vectorization
  • Benchmarking and analysis on multiple architectures
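The kind of measurement described above often comes down to reading an energy counter before and after a workload. The sketch below assumes a hypothetical 32-bit counter that wraps around and a known energy unit per tick; both are assumptions for illustration, and the width and unit of a real interface must be taken from its documentation.

```cpp
#include <cassert>
#include <cstdint>

// Energy consumed between two readings of a wrapping 32-bit counter.
// Assumption: the counter wrapped at most once between the readings.
// Unsigned subtraction is performed modulo 2^32, which handles that wrap.
double energy_joules(std::uint32_t before, std::uint32_t after,
                     double joules_per_tick) {
    assert(joules_per_tick > 0.0);
    const std::uint32_t ticks = after - before;
    return static_cast<double>(ticks) * joules_per_tick;
}

// Average power over the measured interval: energy divided by elapsed time.
double average_power_watts(double energy_j, double seconds) {
    assert(seconds > 0.0);
    return energy_j / seconds;
}
```

Sampling often enough that the counter cannot wrap twice between readings is the caller's responsibility in this sketch.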
Theoretical Skills Parallel programming, vectorization and node-level optimization in general, basic knowledge of CPU hardware
Practical Skills C, C++, OpenMP
Additional Information

Title Porting and Optimization of RF Pulse Scheduling for Trapped Ion Quantum Computing
Student Mederika Zangerl
Language English
Supervisor Philipp Gschwandtner
Description In a quantum computer, information is stored in quantum bits (qubits). While classical bits can have exactly two different states, a qubit can be in a superposition of states. Qubits can, for example, be realized using trapped ions manipulated by laser pulses driven by radio-frequency (RF) signals. When performing computations, a provided quantum circuit needs to be translated into a sequence of RF pulses. These RF pulses in turn need to be arranged such that they can run on the targeted real-time RF generators. The hardware limits the number of simultaneous events and imposes setup times, so a transpiler is needed that handles these constraints with high performance to maximize program throughput. The aim of this thesis is to improve the performance of a scheduler for RF pulse sequences for a trapped ion quantum device and the associated transpiling passes, as well as the maintainability of the underlying code. To accomplish this, the existing Python implementation is ported to the Rust programming language. Rust combines the speed of a compiled language with high-level features such as an advanced static type system, enforced explicit error handling and intrinsic memory safety. Additionally, Rust can be easily integrated into existing Python code bases. In a second step, the Rust version will be optimized with regard to algorithmic complexity and data structures.
Tasks
  • port existing implementation to Rust
  • regression tests for the new implementation
  • performance evaluation of scheduler code
  • optimization of algorithms and data structures
Theoretical Skills algorithms and data structures, complexity analysis
Practical Skills Rust, Python, software engineering principles
Additional Information

Title Performance Effects of GPU Buffer Indexing Methods in Structured Grid Applications
Student Julian Stecher
Language German or English
Supervisor Philipp Gschwandtner
Description Accelerated clusters are ubiquitous, with 7 of the top 10 fastest supercomputers worldwide supported by accelerators of some form (9 of the top 10 on the Green500 list). A key aspect of accelerators is that their computing and memory architecture differs from that of the host in which they are installed. This bachelor thesis focuses on implementing and benchmarking multiple variants of structured grid proxy apps for several dimensions in CUDA and investigating relevant performance effects.
Tasks
  • porting of 1D, 2D and 3D stencil codes to CUDA
  • investigating 4D data structures on top of 3D buffers
  • exploring usability and performance tradeoffs among various methods of allocating and indexing into CUDA buffers
  • performance evaluation across multiple Nvidia GPU models
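One indexing method of the kind the tasks above refer to is manual linearization of grid coordinates into a flat buffer. The following host-side C++ sketch shows the arithmetic only; the same expressions could be used inside a CUDA kernel. The names `Extent3`, `index3` and `index4` are hypothetical, and storing the fourth dimension as consecutive 3D blocks is just one assumed layout, not necessarily the one studied in the thesis.

```cpp
#include <cassert>
#include <cstddef>

// Grid size in each dimension (hypothetical helper type).
struct Extent3 {
    std::size_t nx, ny, nz;
};

// Linear index of cell (x, y, z) in a flat buffer, with x varying fastest.
std::size_t index3(Extent3 e, std::size_t x, std::size_t y, std::size_t z) {
    assert(x < e.nx && y < e.ny && z < e.nz);
    return (z * e.ny + y) * e.nx + x;
}

// 4D access on top of the 3D layout: field q selects one of n_fields
// consecutive 3D blocks stored back to back in the same buffer.
std::size_t index4(Extent3 e, std::size_t n_fields, std::size_t q,
                   std::size_t x, std::size_t y, std::size_t z) {
    assert(q < n_fields);
    return q * (e.nx * e.ny * e.nz) + index3(e, x, y, z);
}
```

Which coordinate varies fastest determines whether neighbouring threads touch neighbouring addresses, so the ordering chosen here is a tunable design decision rather than a fixed rule.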
Theoretical Skills parallel programming, high performance computing, GPU computing, performance analysis
Practical Skills C++, CUDA, working with GPUs
Additional Information

Title Adaptive Mesh Refinement in Hydrodynamics: Redesigning Cronos
Student Joshua Ocker
Language German or English
Supervisor Philipp Gschwandtner
Description High performance computing is a branch of computer science that evolves very quickly, with increasingly complex architectures in both software and hardware. However, HPC application software is often developed by domain scientists, who may struggle to keep up with these innovations. Cronos is one such example, a structured grid simulation capable of computing gamma ray emissions of binary star systems. With development started over a decade ago, its programming style reflects various language standards of C++ and lacks modern features such as accelerator support or adaptive mesh refinement (AMR). The idea of AMR is to drastically reduce the computational work of a grid-based simulation by dynamically adjusting local resolution. The goal of this bachelor thesis is to provide a preliminary prototype for a new Cronos implementation that supports AMR and can be easily extended to contemporary programming models.
Tasks
  • Code analysis of original Cronos implementation
  • Redesigning a minimal working prototype with a strong focus on sustainable software
  • Evaluation of basic AMR support
  • Performance evaluation
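To make the idea of dynamically adjusting local resolution more concrete, the sketch below flags cells of a 1D grid for refinement wherever the solution changes sharply between neighbours. This is a simplified illustration under assumed criteria, not the approach used in Cronos; the function name `flag_for_refinement` and the jump-based threshold are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mark both cells on either side of every jump larger than `threshold`.
// Flagged cells would be refined; all others keep the coarse resolution.
std::vector<bool> flag_for_refinement(const std::vector<double>& u,
                                      double threshold) {
    assert(threshold >= 0.0);
    std::vector<bool> flags(u.size(), false);
    for (std::size_t i = 0; i + 1 < u.size(); ++i) {
        if (std::abs(u[i + 1] - u[i]) > threshold) {
            flags[i] = true;
            flags[i + 1] = true;
        }
    }
    return flags;
}
```

Only the flagged cells need the finer grid, which is where the reduction in computational work comes from when sharp features occupy a small part of the domain.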
Theoretical Skills parallel programming, high performance computing, scientific computing
Practical Skills C++, working with legacy code bases, scientific computing
Additional Information

A bachelor student who wants to set a date for their initial or final presentation (or their supervisor) MUST contact Sashko Ristov to schedule it!

Details for the theses