BiocPy Packages

Getting Started

The BiocPy ecosystem is modular. You can install the collection of core packages via PyPI.

Installation

To get the core data structures and utilities:

pip install biocutils genomicranges summarizedexperiment singlecellexperiment multiassayexperiment

To install interoperability tools:

pip install rds2py txdb orgdb experimenthub
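
After installing, it can be useful to confirm which of these packages are actually present in the current environment. The sketch below uses only the standard library's importlib.metadata. The list of names is copied from the two pip commands above, and the helper name installed_versions is hypothetical, not part of any BiocPy package.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the two pip commands above.
BIOCPY_PACKAGES = [
    "biocutils", "genomicranges", "summarizedexperiment",
    "singlecellexperiment", "multiassayexperiment",
    "rds2py", "txdb", "orgdb", "experimenthub",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(BIOCPY_PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Packages reported as "not installed" can be added with the corresponding pip command.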

Core Representations

These structures are the foundational building blocks on which larger, more complex representations are built.

  • BiocFrame: A Bioconductor-like data frame class.
  • GenomicRanges: Aids in representing genomic regions and facilitating analysis.
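
To make the idea of "representing genomic regions" concrete, here is a minimal plain-Python sketch of a region and an overlap query. It is an illustration of the concept only and does not use or mirror the GenomicRanges API. The Region class, its fields, and both helper functions are hypothetical, and the coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A hypothetical genomic interval (not the GenomicRanges class)."""
    seqname: str       # chromosome or contig name, e.g. "chr1"
    start: int         # assumed 1-based, inclusive
    end: int           # inclusive
    strand: str = "*"  # "+", "-", or "*" for unstranded


def overlaps(a, b):
    """True if both regions lie on the same sequence and share a position."""
    return a.seqname == b.seqname and a.start <= b.end and b.start <= a.end


def find_overlaps(query, subjects):
    """Return the indices of the subject regions that overlap the query."""
    return [i for i, subject in enumerate(subjects) if overlaps(query, subject)]


genes = [
    Region("chr1", 100, 200, "+"),
    Region("chr1", 150, 250, "-"),
    Region("chr2", 100, 200, "+"),
]
hits = find_overlaps(Region("chr1", 180, 300), genes)
```

This sketch ignores strand when testing for overlap and scans every subject linearly. A dedicated container is useful precisely because it handles such details, along with metadata, consistently.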

Container classes such as SummarizedExperiment, SingleCellExperiment, and MultiAssayExperiment represent single- or multi-omic experimental data and metadata.

BiocPy also provides classes for atomic data types, including float, string, and integer lists, as well as named lists. These generics and utilities are provided by the BiocUtils package. Delayed and file-backed array operations are provided by DelayedArray and its derivatives (HDF5Array, TileDbArray).

Analysis Packages

BiocPy provides bindings to libscran and other single-cell analysis methods through the scranpy package, which supports analysis of multi-modal single-cell datasets. It also integrates the SingleR algorithm, which annotates cell types by matching cells to known references based on their expression profiles.

R Interoperability

The rds2py package provides bindings to the rds2cpp library. This enables reading RDS files directly in Python, without additional data conversion tools or intermediate formats, which simplifies moving data between R and Python.
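
A minimal sketch of reading an RDS file might look like the following. It assumes that the installed rds2py release exposes a read_rds function that takes a file path, which should be checked against the rds2py documentation for your version. The wrapper load_rds and the file name in the usage note are hypothetical.

```python
from pathlib import Path


def load_rds(path):
    """Read an RDS file into a Python object (hypothetical helper)."""
    rds_path = Path(path)
    if not rds_path.is_file():
        raise FileNotFoundError(f"No RDS file at {rds_path}")
    # Imported lazily so the path check above works even without rds2py.
    # Assumption: rds2py provides read_rds(path) in the installed version.
    from rds2py import read_rds
    return read_rds(str(rds_path))
```

With rds2py installed, calling load_rds("experiment.rds") on a file saved from R with saveRDS would return the parsed object. The Python type of the result depends on what was stored in the file.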

The following table serves as a directory of the core packages in the BiocPy ecosystem. All packages within the BiocPy ecosystem are published to the Python Package Index (PyPI).

| Package | Description | Latest Version | Links |
|---|---|---|---|
| BiocFrame | Flexible dataframe representation to support nested structures. | 0.7.2 | GitHub |
| DelayedArray | Delayed array operations from Bioconductor | 0.6.2 | GitHub |
| GenomicRanges | Container class to represent and operate over genomic regions and annotations. | 0.8.3 | GitHub |
| IRanges | Python implementation of the IRanges Bioconductor package. | 0.7.1 | GitHub |
| MultiAssayExperiment | Container class for representing and managing multi-omics genomic experiments | 0.6.0 | GitHub |
| SingleCellExperiment | Container class for single-cell experiments | 0.6.2 | GitHub |
| SpatialExperiment | Container class for storing data from spatial-omics experiments | 0.1.0 | GitHub |
| SpatialFeatureExperiment | Container class for storing data from spatial feature experiments | 0.0.5 | GitHub |
| SummarizedExperiment | Container to represent data from genomic experiments | 0.6.5 | GitHub |
| biocsetup | A CLI interface to quickly scaffold new BiocPy Python packages | 0.3.3 | GitHub |
| biocutils | Utilities to use across the biocpy packages. | 0.3.3 | GitHub |
| biostrings | Efficient manipulation of genomic sequences | 0.0.2 | GitHub |
| celldex | Index of Reference Cell Type Datasets | 0.3.0 | GitHub |
| compressed-lists | Memory-efficient container for list-like objects | 0.4.4 | GitHub |
| experimenthub | Access Bioconductors experimenthub resources | 0.0.1 | GitHub |
| hdf5array | HDF5-backed objects for array and matrix like data | 0.5.0 | GitHub |
| mopsy | Matrix operations for numpy and scipy | 0.3.0 | GitHub |
| orgdb | Access OrgDB annotations | 0.0.1 | GitHub |
| pyBiocFileCache | File based cache for resources and metadata | 0.7.0 | GitHub |
| rds2py | Parse and construct Python representations for datasets stored in RDS files | 0.8.0 | GitHub |
| scranpy | Analyze multi-modal single-cell data! | 0.3.0 | GitHub |
| scrnaseq | Collection of Public Single-Cell RNA-Seq Datasets | 0.3.1 | GitHub |
| singler | Python bindings to the singleR algorithm to annotate cell types from known references. | 0.5.0 | GitHub |
| tiledbarray | TileDb backed objects for array and matrix like data | 0.2.0 | GitHub |
| txdb | Python interface to access and manipulate genome annotations in TxDB SQLite format. | 0.0.4 | GitHub |

Developer Guide

If you are interested in developing new Python packages, check out the developer guide, which covers the philosophy and tools we employ to ensure code quality and consistency within and across all the packages. A more detailed Python package management process is documented in the biocsetup package.