Easily Parallelising Python Across Nodes Using MPI

dataset
posted on 2025-04-08, 14:02 authored by Sachit Kshatriya

One of the key benefits of large-scale cluster computing is the ability to run massively parallel tasks and cut completion times by orders of magnitude. Yet custom scripts often fail to exploit multi-core, multi-node clusters. Interpreted languages such as Python readily support parallelisation across multiple CPU cores, but parallelisation across cluster nodes is typically more complex and is usually implemented in lower-level languages such as C.


The ubiquity of Python in life science research presents an opportunity for large reductions in compute time, should researchers adopt parallelisation techniques. This is a brief introduction to the mpi4py library and a novel tool that exemplifies the strategic use of mpi4py to parallelise custom Python scripts in a straightforward way, not only across multiple cores but simultaneously across multiple nodes at OSC.
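
As a rough illustration of the kind of pattern mpi4py makes possible, the sketch below splits a list of independent tasks across MPI ranks and gathers the results on rank 0. It is not the tool described in this dataset. The function names `process`, `chunk_for_rank`, and `run` are hypothetical, and the serial fallback is an assumption added so the sketch also runs where mpi4py is not installed.

```python
"""Sketch: split independent tasks across MPI ranks with mpi4py."""

try:
    from mpi4py import MPI  # requires an MPI implementation on the system

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Assumption for illustration: fall back to a single serial "rank".
    comm, rank, size = None, 0, 1


def process(item):
    """Hypothetical stand-in for the per-item work of a custom script."""
    return item * item


def chunk_for_rank(items, rank, size):
    """Round-robin split: rank r takes items r, r + size, r + 2 * size, ..."""
    return items[rank::size]


def run(items):
    """Process this rank's share; return all results on rank 0, None elsewhere."""
    local = [process(x) for x in chunk_for_rank(items, rank, size)]
    if comm is None:
        return local
    # gather returns a list of per-rank lists on the root and None on other ranks.
    gathered = comm.gather(local, root=0)
    if rank != 0:
        return None
    return [y for part in gathered for y in part]


if __name__ == "__main__":
    results = run(list(range(10)))
    if rank == 0:
        # Round-robin splitting does not preserve input order, so sort for display.
        print(sorted(results))
```

A script like this is typically started through an MPI launcher, for example `mpiexec -n 4 python script.py` (the file name is a placeholder). On a cluster the job scheduler decides how those processes are spread across nodes, so the exact launch command depends on the site.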

Funding

From viromes to virocells: dissecting viral roles in terrestrial microbiomes and nutrient cycling

Office of Biological and Environmental Research


BII-Implementation: The EMERGE Institute: Identifying EMergent Ecosystem Responses through Genes-to-Ecosystems Integration

Directorate for Biological Sciences

