Preface and Overview

Published: Jun 2026

Omics Systems

At Complex Data Insights (CDI), we define Omics Systems as a systems-oriented framework for reproducible omics data analysis, interpretation, and reporting.

Rather than focusing on individual tools or isolated workflows, Omics Systems emphasizes complete analytical systems that connect:

  • biological questions
  • data generation
  • data processing
  • statistical analysis
  • biological interpretation
  • reproducible reporting

The goal is to transform biological data into transparent, interpretable, and defensible scientific insight.


Why Omics Systems?

Modern omics analyses often focus heavily on software tools and isolated workflows.

However, generating outputs alone does not guarantee meaningful biological understanding.

Omics Systems promotes a systems-oriented approach:

Biological Question
        ↓
Experimental Design
        ↓
Data Generation
        ↓
Omics Data Processing
        ↓
Quality Control
        ↓
Feature Generation
        ↓
Domain-Specific Analysis
        ↓
Statistical Inference
        ↓
Biological Interpretation
        ↓
Reproducible Reporting

This perspective helps ensure that analytical results remain transparent, reproducible, and scientifically defensible.
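One way to make the flow above concrete is to treat it as an ordered checklist. The sketch below is illustrative only: the `Stage` enum and the `missing_stages` helper are hypothetical names, not part of any CDI package. It reports which stages an analysis has not yet documented.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages of the flow above, numbered in the order they are listed."""
    BIOLOGICAL_QUESTION = 1
    EXPERIMENTAL_DESIGN = 2
    DATA_GENERATION = 3
    OMICS_DATA_PROCESSING = 4
    QUALITY_CONTROL = 5
    FEATURE_GENERATION = 6
    DOMAIN_SPECIFIC_ANALYSIS = 7
    STATISTICAL_INFERENCE = 8
    BIOLOGICAL_INTERPRETATION = 9
    REPRODUCIBLE_REPORTING = 10


def missing_stages(documented):
    """Return the stages an analysis has not documented, in flow order."""
    done = set(documented)
    return [stage for stage in Stage if stage not in done]


# Hypothetical analysis that ran tools but documented nothing else.
tool_only = [Stage.OMICS_DATA_PROCESSING, Stage.DOMAIN_SPECIFIC_ANALYSIS]
gaps = missing_stages(tool_only)
print([stage.name for stage in gaps])
```

An analysis that only ran processing and domain-specific tools leaves eight stages undocumented, from the biological question through reproducible reporting. That is the gap between generating outputs and building a system.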


A Common Omics Architecture

Although RNA-Seq, microbiome, proteomics, GWAS, single-cell, and multi-omics analyses address different biological questions, they share a common analytical foundation.

Shared Omics Infrastructure

All Omics System Builds rely on common principles and processing layers:

  • experimental design
  • data generation technologies
  • data management
  • omics data processing
  • quality control
  • reproducibility practices

Domain-Specific Extensions

The systems diverge after feature generation and enter specialized analytical workflows.

Build         Primary Focus                                        Status
RNA-Seq       Gene expression                                      Implemented
Microbiome    Microbial communities                                Implemented
Proteomics    Protein abundance and functional interpretation      Current addition
GWAS          Genetic variants                                     Planned expansion
Single-cell   Cellular heterogeneity                               Planned expansion
Multi-omics   Cross-domain integration                             Planned expansion

Every Omics System Build extends this common architecture for a specific biological domain.


Omics Systems Architecture

flowchart TD

    A[Biological Question]
    B[Experimental Design]
    C[Data Generation]
    D[Omics Data Processing]
    E[Quality Control]
    F[Feature Generation]

    G[RNA-Seq]
    H[Microbiome]
    P[Proteomics]
    I[GWAS]
    J[Single-cell]
    K[Multi-omics]

    L[Statistical Inference]
    M[Biological Interpretation]
    N[Reproducible Reporting]

    A --> B
    B --> C
    C --> D
    D --> E
    E --> F

    F --> G
    F --> H
    F --> P
    F --> I
    F --> J
    F --> K

    G --> L
    H --> L
    P --> L
    I --> L
    J --> L
    K --> L

    L --> M
    M --> N
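
The same fan-out and fan-in structure can be expressed as a small dependency graph. This is a minimal sketch using Python's standard-library `graphlib` (Python 3.9+). The names `SHARED`, `DOMAINS`, `DOWNSTREAM`, `build_predecessors`, and `path_for` are assumptions made for illustration and are not taken from a CDI codebase.

```python
from graphlib import TopologicalSorter

# Node labels copied from the flowchart.
SHARED = [
    "Biological Question", "Experimental Design", "Data Generation",
    "Omics Data Processing", "Quality Control", "Feature Generation",
]
DOMAINS = ["RNA-Seq", "Microbiome", "Proteomics", "GWAS", "Single-cell", "Multi-omics"]
DOWNSTREAM = ["Statistical Inference", "Biological Interpretation", "Reproducible Reporting"]


def build_predecessors():
    """Map each node to the set of nodes that must come before it."""
    preds = {}
    # Shared layers form a simple chain: A --> B --> ... --> F.
    for prev, nxt in zip(SHARED, SHARED[1:]):
        preds.setdefault(nxt, set()).add(prev)
    # Every domain branches from Feature Generation and rejoins at Statistical Inference.
    for domain in DOMAINS:
        preds.setdefault(domain, set()).add(SHARED[-1])
        preds.setdefault(DOWNSTREAM[0], set()).add(domain)
    # Downstream layers form a second chain: L --> M --> N.
    for prev, nxt in zip(DOWNSTREAM, DOWNSTREAM[1:]):
        preds.setdefault(nxt, set()).add(prev)
    return preds


def path_for(domain):
    """Return the full stage sequence that a single build follows."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    return SHARED + [domain] + DOWNSTREAM


# A valid execution order exists only if the graph has no cycles.
order = list(TopologicalSorter(build_predecessors()).static_order())
print(" -> ".join(path_for("Proteomics")))
```

`static_order()` raises `graphlib.CycleError` if a stage were accidentally wired back into an earlier one, so the sketch also works as a basic sanity check on the architecture.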



Current Omics System Builds

The current CDI Omics Systems guide summarizes the implementation of the Omics Systems Architecture across major biological data domains.

RNA-Seq and Microbiome are the first implemented Omics System Builds. Proteomics is the current addition, extending the architecture from transcript-level and community-level analysis into protein-level biological interpretation.

GWAS, single-cell RNA-Seq, and multi-omics integration are included as planned expansion systems that extend the same architecture into variant-level, cell-resolution, and cross-domain biological analysis.

Build 01 · RNA-Seq System

System focus: transcript-level biological analysis
Primary features: genes, transcripts, expression counts
Typical outputs: differential expression results, pathway summaries, reproducible reports
Status: implemented system

Build 02 · Microbiome System

System focus: microbial community analysis
Primary features: ASVs, OTUs, taxa, microbial pathways
Typical outputs: diversity summaries, taxonomic profiles, differential abundance results, biological interpretation reports
Status: implemented system

Build 03 · Proteomics System

System focus: protein-level biological analysis
Primary features: proteins, peptides, protein groups, functional annotations
Typical outputs: differential protein abundance results, GO/pathway enrichment, protein-network interpretation, reproducible reports
Status: current addition / active implementation

Build 04 · GWAS System

System focus: variant-level association analysis
Primary features: SNPs, genotypes, phenotypes, association statistics
Typical outputs: association results, Manhattan plots, candidate loci, variant interpretation summaries
Status: planned expansion

Build 05 · Single-cell RNA-Seq System

System focus: cell-resolution transcriptomics
Primary features: cells, genes, clusters, marker genes, cell types
Typical outputs: clustering results, marker-gene tables, cell-type annotations, trajectory or state-level interpretation
Status: planned expansion

Build 06 · Multi-omics Integration System

System focus: cross-domain biological integration
Primary features: harmonized samples, shared metadata, multi-layer molecular features
Typical outputs: integrated signatures, cross-omics associations, pathway-level synthesis, biological interpretation reports
Status: planned expansion
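
The build summaries above share one shape (focus, primary features, status), so they can be captured in a single record type. The sketch below is a hypothetical illustration: the `Build` dataclass, the `BUILDS` tuple, and `builds_with_status` are invented names, status strings are shortened from the summaries, and typical outputs are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    """One Omics System Build, as summarized above."""
    number: int
    name: str
    focus: str
    features: tuple
    status: str


BUILDS = (
    Build(1, "RNA-Seq", "transcript-level biological analysis",
          ("genes", "transcripts", "expression counts"), "implemented"),
    Build(2, "Microbiome", "microbial community analysis",
          ("ASVs", "OTUs", "taxa", "microbial pathways"), "implemented"),
    Build(3, "Proteomics", "protein-level biological analysis",
          ("proteins", "peptides", "protein groups", "functional annotations"),
          "current addition"),
    Build(4, "GWAS", "variant-level association analysis",
          ("SNPs", "genotypes", "phenotypes", "association statistics"),
          "planned expansion"),
    Build(5, "Single-cell RNA-Seq", "cell-resolution transcriptomics",
          ("cells", "genes", "clusters", "marker genes", "cell types"),
          "planned expansion"),
    Build(6, "Multi-omics Integration", "cross-domain biological integration",
          ("harmonized samples", "shared metadata", "multi-layer molecular features"),
          "planned expansion"),
)


def builds_with_status(status):
    """Return build names with the given status, in build-number order."""
    return [build.name for build in BUILDS if build.status == status]


print(builds_with_status("implemented"))
print(builds_with_status("planned expansion"))
```

Keeping the build descriptions in one structure like this makes it easy to keep the status table and the per-build summaries consistent as planned systems move into implementation.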


Technology Ecosystem

Depending on the build, workflows may integrate:

  • R
  • Python
  • Bioconductor
  • Quarto
  • QIIME 2
  • Scanpy
  • Plotly
  • GitHub Pages
  • reproducible computational environments

Technology choices are secondary to analytical reasoning. Tools may evolve, but the underlying system architecture remains consistent.


Workflow Philosophy

Across all builds, Omics Systems emphasizes:

  • systems over outputs
  • interpretation over automation
  • reproducibility over convenience
  • transparency over complexity
  • scientific reasoning over software execution

The goal is to help analysts move beyond running workflows toward building analytical systems that can be understood, reproduced, communicated, and trusted.