Many computational methods have been developed to predict perturbation effects. Despite claims of promising performance, concerns about the true efficacy of these models persist, particularly when they are evaluated across diverse unseen cellular contexts and unseen perturbations. To address this, we conducted a comprehensive benchmark of 27 single-cell perturbation response prediction methods, covering both genetic and chemical perturbations. We used 29 datasets and multiple evaluation metrics to assess how well the methods generalize to unseen cellular contexts and perturbations. The benchmark yields insights into method limitations, method generalization and method selection. Finally, we present a solution that leverages prior knowledge through cellular context embedding to improve the generalizability of models to new cellular contexts.
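To make "unseen cellular context" and "unseen perturbation" concrete, here is a minimal sketch of a hold-out split on toy records. The field names (`cell_type`, `perturbation`), the `control` label, and the `holdout_split` helper are illustrative assumptions, not part of this repository; the split protocol actually used in the benchmark is described in the manuscript.

```python
def holdout_split(cells, key, held_out):
    """Split toy cell records by holding out one value of `key`.

    Assumption for illustration: perturbed cells whose `key` equals
    `held_out` form the test set; all other cells, including the
    control cells of a held-out cell type, remain in training.
    """
    train, test = [], []
    for cell in cells:
        is_target = cell[key] == held_out and cell["perturbation"] != "control"
        (test if is_target else train).append(cell)
    return train, test


# Toy data: two cell types, one control condition and two drugs.
cells = [
    {"cell_type": ct, "perturbation": p}
    for ct in ("B", "T")
    for p in ("control", "drugA", "drugB")
]

# Unseen cellular context: perturbed T cells are never seen in training.
ctx_train, ctx_test = holdout_split(cells, "cell_type", "T")

# Unseen perturbation: drugB is never seen in training, in any cell type.
pert_train, pert_test = holdout_split(cells, "perturbation", "drugB")
```

In the first split the model still sees control T cells and must predict their perturbed state; in the second it sees every cell type but must predict the effect of a perturbation absent from training.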


Benchmark methods

All benchmark methods analyzed in our study are listed below. Details of the settings are available in our manuscript.

| Method | Generalization | Journal | Year | Title | Version |
| --- | --- | --- | --- | --- | --- |
| biolord | Cellular context/Perturbation | Nature Biotechnology | 2024 | Disentanglement of single-cell data with biolord | 0.0.3 |
| CellOT | Cellular context | Nature Methods | 2023 | Learning single-cell perturbation responses using neural optimal transport | 0.0.1 |
| inVAE | Cellular context | Bioengineering | 2023 | Homogeneous Space Construction and Projection for Single-Cell Expression Prediction Based on Deep Learning | 0.0.1 |
| scDisInFact | Cellular context | Nature Communications | 2024 | scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data | 0.1.0 |
| scGen | Cellular context | Nature Methods | 2019 | scGen predicts single-cell perturbation responses | 2.1.0 |
| scPRAM | Cellular context | Bioinformatics | 2024 | scPRAM accurately predicts single-cell gene expression perturbation response based on attention mechanism | 0.0.1 |
| scPreGAN | Cellular context | Bioinformatics | 2022 | scPreGAN, a deep generative model for predicting the response of single-cell expression to perturbation | 0.0.1 |
| SCREEN | Cellular context | Frontiers of Computer Science | 2024 | SCREEN: predicting single-cell gene expression perturbation responses via optimal transport | 0.0.1 |
| scVIDR | Cellular context | Patterns | 2023 | Generative modeling of single-cell gene expression for dose-dependent chemical perturbations | 0.0.3 |
| trVAE | Cellular context | Bioinformatics | 2020 | Conditional out-of-distribution generation for unpaired data using transfer VAE | 1.1.2 |
| AttentionPert | Perturbation | Bioinformatics | 2021 | AttentionPert: Accurately Modeling Multiplexed Genetic Perturbations with Multi-scale Effects | 0.0.1 |
| CPA | Perturbation | Molecular Systems Biology | 2023 | Predicting cellular responses to complex perturbations in high-throughput screens | 0.8.5 |
| GEARS | Perturbation | Nature Biotechnology | 2022 | Predicting transcriptional outcomes of novel multigene perturbations with GEARS | 0.1.0 |
| GenePert | Perturbation | bioRxiv | 2024 | GenePert: Leveraging GenePT Embeddings for Gene Perturbation Prediction | 0.0.1 |
| linearModel | Perturbation | bioRxiv | 2024 | Deep learning-based predictions of gene perturbation effects do not yet outperform simple linear methods | 0.0.1 |
| scGPT | Perturbation | Nature Methods | 2024 | scGPT: toward building a foundation model for single-cell multi-omics using generative AI | 0.2.1 |
| scFoundation | Perturbation | Nature Methods | 2024 | Large-scale foundation model on single-cell transcriptomics | 0.0.1 |
| chemCPA | Perturbation | arXiv | 2022 | Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution | 2.0.0 |
| scouter | Perturbation | bioRxiv | 2024 | Scouter: Predicting Transcriptional Responses to Genetic Perturbations with LLM embeddings | 0.0.1 |
| scELMo | Perturbation | bioRxiv | 2024 | scELMo: Embeddings from Language Models are Good Learners for Single-cell Data Analysis | 0.0.1 |
| GeneCompass | Perturbation | Cell Research | 2024 | GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model | 0.0.1 |
| cycleCDR | Perturbation | bioRxiv | 2024 | Predicting single-cell cellular responses to perturbations using cycle consistency learning | 0.0.1 |
| PRnet | Perturbation | Nature Communications | 2024 | Predicting transcriptional responses to novel chemical perturbations using deep generative model for drug discovery | 0.0.1 |
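The Generalization column in the table above can serve as a first filter when choosing a method. The following is a minimal sketch that copies a few rows from the table into a Python list; the `METHODS` list and the `methods_for` helper are hypothetical names introduced here for illustration and are not part of the benchmark code.

```python
# A few rows copied from the table above: (method, generalization, version).
METHODS = [
    ("biolord", "Cellular context/Perturbation", "0.0.3"),
    ("scGen", "Cellular context", "2.1.0"),
    ("trVAE", "Cellular context", "1.1.2"),
    ("CPA", "Perturbation", "0.8.5"),
    ("GEARS", "Perturbation", "0.1.0"),
    ("chemCPA", "Perturbation", "2.0.0"),
]


def methods_for(scenario):
    """Return methods whose Generalization entry includes `scenario`.

    Entries such as "Cellular context/Perturbation" list more than one
    scenario separated by "/", so each entry is split before matching.
    """
    return [name for name, gen, _ in METHODS if scenario in gen.split("/")]


context_methods = methods_for("Cellular context")
perturbation_methods = methods_for("Perturbation")
```

Here biolord appears in both lists because its table entry names both scenarios.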

Benchmark datasets summary

All datasets analyzed in our study are listed in the Workflow. All benchmark datasets have been uploaded to Figshare and Zenodo and can be obtained from Figshare-Cellular, Figshare-Perturbation, Zenodo-Cellular and Zenodo-Perturbation.


[Logos: scPerturb, PerturBase, sc-pert, pertpy]