StarPU is a runtime system that offers support for heterogeneous
multicore machines. While much effort is devoted to designing efficient
computation kernels for those architectures (e.g. to implement BLAS
kernels on GPUs or on Cell's SPUs), StarPU not only takes care of
offloading such kernels (and maintaining data coherency across
the machine), but it also makes sure the kernels are executed as
efficiently as possible.