LAIK

LAIK is a research library, under ongoing development, for data distribution on HPC facilities. It targets fault tolerance and load balancing while maintaining application efficiency.

LAIK stands for "Leichtgewichtige AnwendungsIntegrierte Datenhaltungskomponente" (German for "Lightweight Application-Integrated Data Management Component").

This library supports the distribution of global data in parallel applications. It calculates the communication required when changing a partitioning or switching between different partitionings. By associating partitions with workload using the "owner-computes" rule, LAIK can be used for automatic load balancing.

Partitionings specify the task-local accessibility that must be ensured. Access rights stay valid until a consistency point with different access rights is hit. To this end, a consistency point triggers data transfers that depend on the previously enforced consistency point. Multiple partitionings with different access behavior may be declared for the same index space and the same consistency point. Hitting the point thus ensures that, for each index, a value written by a previous writer is broadcast to all tasks that want to read it. Reductions are supported via aggregating write behavior: multiple written values are aggregated and broadcast to all readers when switching.

Re-partitioning (and thus load balancing) is enabled by specifically marked consistency points which require global synchronisation. At such a point, a handler is called which may change partitionings before data transfers are triggered. LAIK-managed index spaces can be coupled to support data transfers of different data structures at the same time.
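
To make the idea of "calculating communication requirements" for a partitioning switch concrete, here is a minimal Python sketch. It is not LAIK's API (LAIK is a C library): the functions `block_partition` and `transfers` are hypothetical names chosen for illustration, and the sketch assumes a 1D index space split into contiguous blocks, with each index owned by exactly one task.

```python
def block_partition(size, weights):
    """Split indexes 0..size-1 into contiguous half-open ranges,
    one per task, with lengths proportional to the given weights."""
    total = sum(weights)
    bounds = [0]
    acc = 0
    for w in weights:
        acc += w
        bounds.append(size * acc // total)
    return [(bounds[i], bounds[i + 1]) for i in range(len(weights))]


def transfers(old, new):
    """Return (sender, receiver, start, end) for every index range
    whose owner changes when switching from 'old' to 'new'."""
    plan = []
    for dst, (new_start, new_end) in enumerate(new):
        for src, (old_start, old_end) in enumerate(old):
            lo, hi = max(new_start, old_start), min(new_end, old_end)
            # Only overlapping ranges with a different owner need to move.
            if lo < hi and src != dst:
                plan.append((src, dst, lo, hi))
    return plan


# Four tasks start with equal shares of 100 indexes.
old = block_partition(100, [1, 1, 1, 1])
# Re-balance: task 3 is assumed to be twice as fast and gets a double share.
new = block_partition(100, [1, 1, 1, 2])
plan = transfers(old, new)
for sender, receiver, start, end in plan:
    print(f"task {sender} -> task {receiver}: indexes [{start}, {end})")
```

Under the "owner-computes" rule, the weights double as a load-balancing knob: a task that owns more indexes does more work, and only the index ranges whose owner changes have to be communicated.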

LAIK is completely open source and under active development. We frequently offer student projects on this topic.

GitHub: https://github.com/envelope-project/laik