Workflow Execution
Workflow Execution handles the runtime jobs of a workflow in the Grid environment, together with the fault tolerance and runtime optimization of the execution. The Execution Engine is a service responsible for distributing jobs to different Grid resources. The main features of Workflow Execution are:
- Service-oriented architecture [4]
- Execute the basic and advanced control flow (e.g. sequence, if, switch, while, for, dag, parallel, parallelfor, etc.)
- Execute and optimize up to 12 different kinds of basic and advanced data flow (e.g. stage-in, stage-out, collection, stream, etc.)
- Automatic organization of the working environment and automatic Grid garbage cleanup
- Job submission control and file transfer number control, which avoid Denial of Service (DoS) problems
- Runtime optimization of workflow execution [1]
- Support for VSEE (Virtual Single Execution Environment) [1]
- Support for multiple distributed log databases (PostgreSQL, MySQL, ...); distributed engines can send log information to, and collect it from, these databases
- Data mining based fault prediction and detection [2]
- Fault tolerance [4]: retry, replication, checkpointing/restart, migration, user-defined exception, rescue workflow...
- Run-time collection of the state of activities: execution time, queuing time, submitted sites, ...
- Manage the basic and advanced data flow (e.g. collection, stream, activity argument transfer, file, database, etc.) and transfer all kinds of data (e.g. string, integer, file, etc.)
- Handle different run-time failures with different strategies (e.g. retry, checkpointing, replication, etc.) on the workflow level, as shown in the figures below
- Visualization of workflow execution
Figure 1: Architecture of the workflow engine.
Figure 2: Visualization of workflow execution.
Figure 3: Histogram of workflow execution.
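The job submission control mentioned in the feature list can be pictured as a cap on the number of submissions in flight at any moment. The following is a minimal sketch under that assumption, using a counting semaphore; it is not the engine's actual implementation. The class `ThrottledSubmitter`, its methods, and the limit of 3 are hypothetical names and values chosen for illustration, and the code uses Java 8 lambdas rather than J2SE 1.5.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: limits how many jobs are submitted at the same time.
class ThrottledSubmitter {
    private final Semaphore permits;                             // free submission slots
    private final AtomicInteger inFlight = new AtomicInteger();  // jobs currently running
    private final AtomicInteger peak = new AtomicInteger();      // highest concurrency seen

    ThrottledSubmitter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Blocks until a slot is free, runs the job, then frees the slot.
    void submit(Runnable job) throws InterruptedException {
        permits.acquire();
        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        try {
            job.run();
        } finally {
            inFlight.decrementAndGet();
            permits.release();
        }
    }

    int peakConcurrency() {
        return peak.get();
    }

    // Submits `jobs` short dummy jobs from separate threads and
    // returns the highest number that ever ran concurrently.
    static int runDemo(int jobs, int maxConcurrent) {
        ThrottledSubmitter submitter = new ThrottledSubmitter(maxConcurrent);
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                try {
                    submitter.submit(() -> pause(20));  // stand-in for a real Grid job
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return submitter.peakConcurrency();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Eight jobs compete for three slots; the reported peak never exceeds 3.
        System.out.println("peak concurrency: " + runDemo(8, 3));
    }
}
```

Because a permit is acquired before the in-flight counter is incremented and released only after it is decremented, the counter can never exceed the configured limit, which is the property that protects a remote service from being flooded.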
The Execution Engine has so far successfully run several real-world applications (WIEN2k, Invmod, AstroGrid, Mateoro) with good results.
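Of the fault tolerance strategies listed above, retry is the simplest to illustrate. The sketch below is a generic retry loop, assuming an activity can be modelled as a `Callable`; it does not reproduce the engine's own code described in [4]. The names `RetryingExecutor` and `runWithRetry` are hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the "retry" strategy for a single activity.
class RetryingExecutor {

    // Runs the activity up to maxAttempts times and returns its first
    // successful result; if every attempt fails, rethrows the last failure.
    static <T> T runWithRetry(Callable<T> activity, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return activity.call();
            } catch (Exception e) {
                last = e;  // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated activity that fails twice before succeeding.
        Callable<String> flaky = () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "done";
        };
        System.out.println(runWithRetry(flaky, 3) + " after " + calls[0] + " attempts");
    }
}
```

A real engine would typically combine such a loop with the other strategies named above, for example moving to a different site (migration) once the retry budget is exhausted.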
Implementation and dependent software
- Globus, Java CoG, Java programming language (J2SE 1.5)
People
Related Publications
- [1] Rubing Duan, Radu Prodan, Thomas Fahringer.
Run-time Optimization for Grid Workflow Applications
accepted by 7th IEEE/ACM International Conference on Grid Computing, September 28th-29th 2006, Barcelona, Spain.
- [2] Rubing Duan, Radu Prodan, Thomas Fahringer.
Data Mining-based Fault Prediction and Detection on the Grid
accepted by 15th IEEE International Symposium on High Performance Distributed Computing (HPDC'06), June 2006, Paris, France.
- [3] Thomas Fahringer, Radu Prodan, Rubing Duan, Francesco Nerieri, Stefan Podlipnig, Jun Qin, Mumtaz Siddiqui, Hong-Linh Truong, Alex Villazon and Marek Wieczorek.
ASKALON: A Grid Application Development and Computing Environment
6th IEEE/ACM International Workshop on Grid Computing (GRID 2005), Copyright (C) IEEE Computer Society Press, November 2005, Seattle, USA.
- [4] Rubing Duan, Radu Prodan, Thomas Fahringer.
DEE: A Distributed Fault Tolerant Workflow Enactment Engine for Grid Computing
The 2005 International Conference on High Performance Computing and Communications (HPCC-05), Copyright (C) Springer-Verlag, LNCS 3726, Proceedings. September 21-25, 2005, Sorrento, Italy.
- [5] Rubing Duan, Thomas Fahringer, Radu Prodan, Jun Qin, Alex Villazon and Marek Wieczorek.
Real World Workflow Applications in the Askalon Grid Environment
To appear in Proceedings of European Grid Conference 2005 (EGC 2005)