BiocPy: Facilitate Bioconductor Workflows in Python

Published: January 14, 2026

Welcome to BiocPy

Looking for the older website?

A previous version of this book is published here, and the 2024 Bioconductor workshop is available here.

BiocPy is designed to bridge the mature Bioconductor ecosystem and the Python landscape. Bioconductor is an open-source software project that provides tools for the analysis and comprehension of genomic data. One of its main advantages is the availability of standard data representations and a large number of analysis tools tailored for genomic experiments. These data structures allow researchers to store, manipulate, and analyze data across multiple packages and workflows in R without converting between formats.

Inspired by Bioconductor, BiocPy aims to facilitate Bioconductor workflows in Python. To achieve this goal, we developed several core data structures that align closely with the Bioconductor implementations. Because the Python structures mirror their R counterparts, data can move between R and Python with little friction.
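To make the idea of a standard data representation concrete, here is a minimal sketch in plain Python. The `ToyExperiment` class and its fields are hypothetical and written only for illustration; it is not the BiocPy or Bioconductor API. It assumes the central idea of a SummarizedExperiment-style container: a matrix of measurements travels together with its feature names and sample annotations, and subsetting keeps all three in sync.

```python
from dataclasses import dataclass


@dataclass
class ToyExperiment:
    """Hypothetical, simplified stand-in for a SummarizedExperiment-style container."""

    counts: list      # rows = features (e.g. genes), columns = samples
    row_names: list   # one name per feature
    col_data: dict    # per-sample annotations, one value per sample

    def __post_init__(self):
        # Validate once at construction so every instance is internally consistent.
        if len(self.row_names) != len(self.counts):
            raise ValueError("row_names must match the number of rows")
        if self.counts:
            n_cols = len(self.counts[0])
            if any(len(row) != n_cols for row in self.counts):
                raise ValueError("all rows must have the same number of columns")
            for key, values in self.col_data.items():
                if len(values) != n_cols:
                    raise ValueError(f"col_data[{key!r}] must match the number of columns")

    @property
    def shape(self):
        n_cols = len(self.counts[0]) if self.counts else 0
        return (len(self.counts), n_cols)

    def subset(self, rows, cols):
        """Return a new container restricted to the given row and column indices."""
        return ToyExperiment(
            counts=[[self.counts[r][c] for c in cols] for r in rows],
            row_names=[self.row_names[r] for r in rows],
            col_data={k: [v[c] for c in cols] for k, v in self.col_data.items()},
        )


exp = ToyExperiment(
    counts=[[10, 0, 3], [5, 8, 1]],
    row_names=["geneA", "geneB"],
    col_data={"condition": ["ctrl", "treated", "ctrl"]},
)

# Keep both genes but only the control samples; annotations follow the matrix.
controls = exp.subset(rows=[0, 1], cols=[0, 2])
print(controls.shape, controls.col_data["condition"])  # (2, 2) ['ctrl', 'ctrl']
```

The point of the design is that measurements and metadata cannot drift apart: any operation that drops a sample from the matrix also drops its annotation. The actual BiocPy containers are considerably richer than this toy version.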

About this Book

This book is currently in active development and is organized into the following sections:

- Foundations: The core data structures that underpin the ecosystem.
  - GenomicRanges: Manipulation of genomic intervals.
  - Data Containers: Rich semantic data containers like SummarizedExperiment.
- BioC Hubs: Access to Bioconductor’s cloud resources.
  - ExperimentHub: Discover and download datasets.
  - AnnotationHub: Interact with gene models (TxDb) and organism databases (OrgDb).
- Interoperability: Tools to bridge R and Python.
  - R Interoperability: Read R data files (RDS) directly in Python.
  - ArtifactDB: Language-agnostic storage for genomic data in R and Python.
- Workflows: Real-world usage examples.
  - Single-Cell Analysis: Multi-modal single-cell analysis with scranpy.
  - Annotation: Automated cell type annotation with singler.
