charm4py is a high-level parallel and distributed programming framework with a simple and powerful API, based on migratable Python objects and remote method invocation. It is built on top of an adaptive C/C++ runtime system that provides speed, scalability and dynamic load balancing.

charm4py allows writing parallel and distributed applications in Python based on the Charm++ programming model. Charm++ has seen extensive use in the scientific and high-performance computing (HPC) communities across a wide variety of computing disciplines, and has been used to produce several large parallel applications, such as NAMD, that run on the largest supercomputers.

With charm4py, all the application code can be written in Python. The core Charm++ runtime is implemented in a C/C++ shared library which the charm4py module interfaces with.

As with any Python program, several options are available for speeding up performance-critical functions where needed. These include, among others: using NumPy; writing the desired functions in Python and JIT-compiling them to native machine instructions with Numba; or calling C or Fortran code through f2py. Another option for increased speed is to run the program on a fast Python implementation (e.g. PyPy).
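As a rough illustration of the Numba option, the sketch below JIT-compiles a small numeric loop. The function `sum_of_squares` is a hypothetical kernel invented for this example and is not part of charm4py. The fallback decorator is also an assumption of this sketch, added only so that the code still runs (uncompiled) when Numba is not installed.

```python
try:
    # Optional dependency: compile to native code when Numba is available.
    from numba import njit
except ImportError:
    # Assumption of this sketch: a no-op stand-in so the example still runs
    # as plain Python when Numba is missing.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]          # used as a bare decorator: @njit
        return lambda func: func    # used with options: @njit(...)


@njit
def sum_of_squares(n):
    """Hypothetical compute kernel: sum of i*i for i in [0, n)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # The first call triggers compilation when Numba is present;
    # later calls reuse the compiled machine code.
    print(sum_of_squares(1000))
```

A function like this could then be called from ordinary Python application code, so only the hot loop needs to satisfy Numba's restrictions while the rest of the program stays unchanged.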

We have found that, with charm4py and Numba, it is possible to build parallel applications entirely in Python that perform the same as or similarly to the equivalent C++ application (whether based on Charm++ or MPI), and that scale to hundreds of thousands of cores.

Example applications are in the examples subdirectory of the source code repository.