Resource Allocation for Scientific Workflows in Heterogeneous Infrastructures

Our overall goal is to enhance the portability of data analysis workflows (DAWs), allowing scientists to focus on the domain-specific challenges in their DAWs. We aim to achieve this by exploiting the characteristics of heterogeneous infrastructures to find an adaptive task-resource allocation. First, we want to develop novel methods to automatically describe and discover heterogeneous components and topologies. This knowledge is then used to dynamically create infrastructure-aware task execution profiles at workflow runtime.

Motivation

Scientific workflows consist of a large number of recurring tasks, and their execution can take several days. Resource managers handle the task-resource assignments and ensure that CPU and memory constraints are respected. However, when managing heterogeneous clusters, they do not consider other resource characteristics such as CPU clock rates or memory latencies. As a result, only simplistic black-box scheduling algorithms can be used, which account for neither task characteristics nor the heterogeneity of the infrastructure.

Approach

In a first step, we want to improve the knowledge a resource manager has about the existing infrastructure. To this end, we want to develop novel methods to describe and profile heterogeneous infrastructures and networks. In a second step, we want to use historic runtime data to model task-resource profiles and to predict the runtime of tasks. With this more extensive knowledge, the task-resource assignment process can be improved and the workflow runtime decreased. We implement our solutions in scientific workflow systems and evaluate them on real-world scientific workflows.
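As a rough illustration of these two steps, the following Python sketch combines a profiled node description with historic runtimes to predict task runtimes and pick a node per task. It is not our implementation: the names `Node`, `Task`, `predict_runtime`, and `assign`, the example values, and the assumption that runtime scales inversely with CPU clock rate are all hypothetical and chosen only to keep the example small.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass
class Node:
    """Hypothetical infrastructure profile of one cluster node."""
    name: str
    cpu_ghz: float            # profiled CPU clock rate
    memory_gb: float          # memory available to a task
    busy_until: float = 0.0   # predicted time (s) at which the node is free


@dataclass
class Task:
    """Hypothetical task with historic runtime observations."""
    name: str
    memory_gb: float
    # (runtime in seconds, clock rate in GHz of the node it ran on)
    history: list[tuple[float, float]]


def estimated_work(task: Task) -> float:
    """Average 'work' in GHz-seconds derived from historic runs."""
    return mean(runtime * ghz for runtime, ghz in task.history)


def predict_runtime(task: Task, node: Node) -> float:
    """Assumption: runtime scales inversely with the CPU clock rate."""
    return estimated_work(task) / node.cpu_ghz


def assign(tasks: list[Task], nodes: list[Node]) -> dict[str, str]:
    """Greedily place the largest tasks first on the node that is
    predicted to finish them earliest and has enough memory."""
    plan: dict[str, str] = {}
    for task in sorted(tasks, key=estimated_work, reverse=True):
        candidates = [n for n in nodes if n.memory_gb >= task.memory_gb]
        if not candidates:
            raise ValueError(f"no node has enough memory for {task.name}")
        best = min(candidates,
                   key=lambda n: n.busy_until + predict_runtime(task, n))
        best.busy_until += predict_runtime(task, best)
        plan[task.name] = best.name
    return plan


if __name__ == "__main__":
    nodes = [Node("fast", cpu_ghz=4.0, memory_gb=64),
             Node("slow", cpu_ghz=2.0, memory_gb=16)]
    tasks = [Task("align", memory_gb=32, history=[(100.0, 2.0)]),
             Task("count", memory_gb=4, history=[(40.0, 2.0), (20.0, 4.0)])]
    print(assign(tasks, nodes))
```

A realistic model would draw on more of the profiled characteristics (for example memory latency or network topology) and on a learned runtime predictor rather than a single scaling factor. The greedy earliest-finish rule is likewise only one of many possible assignment strategies.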

Results

Our first experiments show promising results for resource allocation for tasks in scientific workflows, which is why we seek to expand and intensify our efforts.

Publications

Contact

If you have any questions or are interested in collaborating with us on this topic, please get in touch with Jonathan!

Acknowledgments

This work was funded by the German Research Foundation (DFG), CRC 1404: “FONDA: Foundations of Workflows for Large-Scale Scientific Data Analysis”.