Benchmarking atlas-level data integration in single-cell genomics - integration task datasets

posted on 2022-01-24, 11:25 authored by Malte Luecken, Maren Buttner, Anna Danese, Marta Interlandi, Michaela Müller, Daniel Strobl, Luke Zappia, Martin Dugas, Maria Colomé-Tatché, Fabian Theis, Kridsadakorn Chaichoompu
RNA integration tasks for the pancreas, lung atlas, human immune cell, and human and mouse immune cell data, and all ATAC mouse brain integration tasks from the manuscript "Benchmarking atlas-level data integration in single-cell genomics".

These datasets were aggregated from public datasets; cell annotations were harmonized or reannotated, and the data were consistently preprocessed using scran pooling and log(x+1) transformation (for RNA tasks). In the immune cell datasets, an erythrocyte development trajectory was also annotated. Details on dataset preprocessing can be found in the paper and in the accompanying GitHub repository.
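
As a rough illustration of the RNA preprocessing described above, the sketch below divides each cell's counts by a per-cell size factor and then applies a log(x+1) transformation. It is not the authors' pipeline: the function names `log1p_normalize` and `library_size_factors` are hypothetical, the toy count matrix is made up, and the size factors are simple library-size factors standing in for the ones scran pooling would estimate.

```python
import math


def library_size_factors(counts):
    """Hypothetical stand-in for scran pooling: per-cell total counts scaled to mean 1."""
    totals = [sum(cell_counts) for cell_counts in counts]
    mean_total = sum(totals) / len(totals)
    return [total / mean_total for total in totals]


def log1p_normalize(counts, size_factors):
    """Divide each cell's counts by its size factor, then apply log(x + 1)."""
    if len(counts) != len(size_factors):
        raise ValueError("one size factor per cell is required")
    normalized = []
    for cell_counts, size_factor in zip(counts, size_factors):
        if size_factor <= 0:
            raise ValueError("size factors must be positive")
        normalized.append([math.log1p(c / size_factor) for c in cell_counts])
    return normalized


# Toy matrix (assumption): rows are cells, columns are genes.
counts = [[0, 2, 8], [1, 4, 15], [0, 0, 10]]
size_factors = library_size_factors(counts)
normalized = log1p_normalize(counts, size_factors)
```

Zero counts stay at zero after the transformation, which is one reason log(x+1) is commonly used for sparse count data.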

Please cite the paper and the papers the individual datasets were aggregated from when using this data.


ExNet-0041-Phase2-3 („SyNergy-HMGU“)

Incubator grant # ZT-I-0007 sparse2big

Chan Zuckerberg Initiative DAF 182835

European Union’s Horizon 2020 research and innovation programme under grant agreement No 874656

Chan Zuckerberg foundation #2019-002438, Human Lung Cell Atlas 1.0