The Single-cell Multi-omics Gene Regulatory Network Database (SC-MO-GRN-DB) is a molecular data repository for constructing and validating gene regulatory networks (GRNs) from single-cell multi-omics datasets. The platform provides data and curated knowledge for investigators who aim to build, benchmark, and explore GRNs with single-cell multi-omics data.
SC-MO-GRN-DB covers multiple single-cell data modalities: scRNA-Seq, scATAC-Seq, scChIP-Seq, scHi-C, and single-cell DNA methylation datasets, as well as single-cell perturbation data from scCRISPR-Seq. In addition to these single-cell resources, we provide curated gene interaction data from ChIP-Seq, gene knock-down, and gene knock-out experiments to test, benchmark, and validate single-cell GRNs.
SC-MO-GRN-DB houses two primary types of data: reference networks and single-cell multi-omics datasets. Reference networks capture transcription factor (TF) to target gene (TG) interactions, identified either through TF localization (ChIP-Seq with nearest-neighbor analysis) or through TF perturbation studies such as knockouts. Single-cell multi-omics datasets are typically structured as matrices in which each column represents a single cell of a specific cell type.
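As a rough illustration of these two data types, the sketch below holds a reference network as a list of TF-to-TG edges and a single-cell dataset as a matrix with one column per cell. All names and values (TF_A, GENE_1, cell_1, the counts, the evidence labels) are made-up placeholders, and the assumption that rows correspond to genes is ours; the actual file formats used by SC-MO-GRN-DB may differ.

```python
from collections import defaultdict

# Hypothetical reference-network edges: (TF, target gene, evidence type).
# These are placeholder names, not records from the database.
reference_edges = [
    ("TF_A", "GENE_1", "ChIP-Seq"),
    ("TF_A", "GENE_2", "knockout"),
    ("TF_B", "GENE_3", "ChIP-Seq"),
]

# Hypothetical single-cell matrix: each column is one cell.
# Assumption: each row is a gene (not stated in the text above).
genes = ["GENE_1", "GENE_2", "GENE_3"]
cells = ["cell_1", "cell_2", "cell_3", "cell_4"]
counts = [
    [5, 0, 3, 1],  # GENE_1
    [2, 1, 0, 0],  # GENE_2
    [0, 4, 1, 2],  # GENE_3
]


def targets_by_tf(edges):
    """Group target genes under each TF, ignoring the evidence type."""
    grouped = defaultdict(set)
    for tf, target, _evidence in edges:
        grouped[tf].add(target)
    return dict(grouped)


def cell_profile(matrix, gene_names, cell_names, cell):
    """Return the column for one cell as a gene -> count mapping."""
    col = cell_names.index(cell)
    return {gene: row[col] for gene, row in zip(gene_names, matrix)}


tf_targets = targets_by_tf(reference_edges)
profile = cell_profile(counts, genes, cells, "cell_2")
```

Here `tf_targets` maps each TF to the set of its targets, and `profile` is the per-gene readout of a single cell, which is the unit a GRN inference method would consume.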
SC-MO-GRN-DB is designed to support flexible mapping between single-cell sequencing datasets and reference gene regulatory networks derived from independent sources. For a given tissue or cell type, the database may contain multiple single-cell datasets and multiple reference networks, reflecting different experimental conditions, modalities, and evidence types. These datasets are not intended to be paired in a one-to-one manner; instead, users may combine any compatible single-cell dataset with one or more reference networks when constructing or evaluating GRNs. We recommend leveraging all available datasets rather than relying on a single pairing, as this enables more robust and reproducible analyses. For benchmarking and method evaluation, an all-to-all strategy, testing each single-cell dataset against each reference network, allows users to assess consistency and performance across data sources. For biological discovery, results may be integrated across datasets and reference networks to construct consensus GRNs that capture shared regulatory signals.
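The all-to-all strategy and the consensus idea can be sketched as follows. This is a minimal example under stated assumptions: the inferred networks are taken as given (the inference method itself is outside the sketch), edges are unweighted (TF, TG) pairs, precision and recall are one possible choice of score, and the dataset names, reference names, and `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import product


def precision_recall(predicted, reference):
    """Score one inferred edge set against one reference edge set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def all_to_all(inferred, references):
    """Evaluate every inferred network against every reference network."""
    return {
        (dataset, ref): precision_recall(inferred[dataset], references[ref])
        for dataset, ref in product(inferred, references)
    }


def consensus_edges(inferred, min_support):
    """Keep edges recovered in at least `min_support` datasets."""
    support = Counter(edge for edges in inferred.values() for edge in edges)
    return {edge for edge, n in support.items() if n >= min_support}


# Placeholder GRNs inferred from two single-cell datasets (toy values).
inferred = {
    "dataset_1": {("TF_A", "GENE_1"), ("TF_A", "GENE_2"), ("TF_B", "GENE_3")},
    "dataset_2": {("TF_A", "GENE_1"), ("TF_B", "GENE_4")},
}
# Placeholder reference networks from two evidence types (toy values).
references = {
    "chipseq_ref": {("TF_A", "GENE_1"), ("TF_B", "GENE_3")},
    "knockout_ref": {("TF_A", "GENE_2"), ("TF_B", "GENE_4")},
}

scores = all_to_all(inferred, references)
consensus = consensus_edges(inferred, min_support=2)
```

With two datasets and two references, `scores` holds four (precision, recall) pairs, one per combination, so inconsistent results across data sources become visible instead of being hidden by a single pairing.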
For users interested in multi-modal GRN inference, SC-MO-GRN-DB provides guidance based on whether sequencing modalities are truly paired or unpaired. Jointly generated (paired) datasets bundle multiple modalities within a single dataset, allowing direct application of joint analysis tools such as CellOracle or LINGER without additional preprocessing. For unpaired datasets, each modality should be downloaded separately and integrated using established methods such as LIGER or Seurat CCA. Differences in experimental design and sequencing platforms can introduce batch effects and alignment challenges, so we recommend using truly paired datasets whenever possible. For all paired datasets in the database, barcode matching has already been performed, reducing the technical burden on users and facilitating downstream multi-modal GRN inference.
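For readers unfamiliar with barcode matching, the sketch below shows the basic idea: keep only the cells whose barcode appears in both modalities and record their positions so the two matrices can be put in the same column order. Paired datasets in the database are described as already matched, so this is only an illustration or a sanity check; the barcode strings and the function name `match_barcodes` are hypothetical.

```python
def match_barcodes(rna_barcodes, atac_barcodes):
    """Find cells present in both modalities.

    Returns the shared barcodes (in RNA order) together with the column
    indices needed to reorder each modality's matrix consistently.
    """
    rna_index = {bc: i for i, bc in enumerate(rna_barcodes)}
    atac_index = {bc: i for i, bc in enumerate(atac_barcodes)}
    shared = [bc for bc in rna_barcodes if bc in atac_index]
    rna_cols = [rna_index[bc] for bc in shared]
    atac_cols = [atac_index[bc] for bc in shared]
    return shared, rna_cols, atac_cols


# Toy barcodes: two cells are shared, one is unique to each modality.
rna = ["AAAC-1", "AAAG-1", "AACT-1"]
atac = ["AACT-1", "AAAC-1", "TTTG-1"]

shared, rna_cols, atac_cols = match_barcodes(rna, atac)
```

Selecting columns `rna_cols` from the RNA matrix and `atac_cols` from the ATAC matrix yields two matrices whose columns refer to the same cells in the same order.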
This workflow outlines the construction of reference networks, which capture transcription factor (TF) to target gene (TG) interactions. The methodology integrates multiple data sources, including ChIP-Seq to determine TF localization and perturbation studies such as TF knockouts to infer regulatory relationships.
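One step of such a workflow, turning ChIP-Seq peaks for a TF into TF-to-TG edges, might look like the sketch below. It assumes that nearest-neighbor analysis means assigning each peak to the gene with the closest transcription start site (TSS) on the same chromosome, and it adds a hypothetical `max_distance` cutoff; neither detail is confirmed by the description above, and the coordinates are toy values.

```python
def nearest_gene(peak, tss_by_chrom):
    """Return (gene, distance) for the TSS closest to the peak midpoint.

    `peak` is (chromosome, start, end); `tss_by_chrom` maps a chromosome
    to a list of (gene, tss_position). Returns None if the chromosome
    has no annotated genes.
    """
    chrom, start, end = peak
    midpoint = (start + end) // 2
    candidates = tss_by_chrom.get(chrom, [])
    if not candidates:
        return None
    gene, tss = min(candidates, key=lambda item: abs(item[1] - midpoint))
    return gene, abs(tss - midpoint)


def peaks_to_edges(tf, peaks, tss_by_chrom, max_distance):
    """Build TF -> TG edges from peaks within `max_distance` of a TSS."""
    edges = set()
    for peak in peaks:
        hit = nearest_gene(peak, tss_by_chrom)
        if hit is not None and hit[1] <= max_distance:
            edges.add((tf, hit[0]))
    return edges


# Toy annotation and peaks (placeholder names and coordinates).
tss_by_chrom = {"chr1": [("GENE_1", 1000), ("GENE_2", 50000)]}
peaks = [
    ("chr1", 900, 1100),       # midpoint on the GENE_1 TSS
    ("chr1", 48000, 49000),    # 1,500 bp from the GENE_2 TSS
    ("chr1", 200000, 200500),  # too far from any TSS
    ("chr2", 10, 20),          # chromosome without annotation
]

edges = peaks_to_edges("TF_A", peaks, tss_by_chrom, max_distance=10000)
```

Edges derived this way reflect binding evidence only; perturbation evidence such as knockouts would contribute a separate edge set.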
This workflow outlines the construction of single-cell multi-omics datasets, covering data acquisition, processing, and integration of molecular outputs at the single-cell level. These datasets typically serve as control data in experiments and focus on a single cell type, allowing detailed characterization of gene regulatory activity in a controlled setting.
SC-MO-GRN-DB supports multiple single-cell modalities, including scRNA-Seq, scATAC-Seq, scChIP-Seq, scHi-C, scDNA-Met, and scCRISPR-Seq datasets. Together, these modalities provide a comprehensive view of gene expression, chromatin accessibility, and epigenetic modifications at the single-cell level.