This package provides a container class to represent single-cell experimental data as 2-dimensional matrices. In these matrices, the rows typically denote features or genomic regions of interest, while the columns represent cells. In addition, a SingleCellExperiment (SCE) object may contain low-dimensionality embeddings and alternative experiments performed on the same sample or set of cells.
Important
The design of the SingleCellExperiment class and its derivatives adheres to the R/Bioconductor specification, where rows correspond to features and columns represent cells.
Note
These classes follow a functional paradigm for accessing or setting properties, with further details discussed in the functional paradigm section.
SingleCellExperiment extends RangedSummarizedExperiment and contains additional attributes:
reduced_dims: Slot for low-dimensionality embeddings for each cell.
alternative_experiments: Manages multi-modal experiments performed on the same sample or set of cells.
row_pairs or column_pairs: Stores relationships between features or cells.
Note
In contrast to R, matrices in Python are unnamed and do not contain row or column names. Hence, these matrices cannot be directly used as values in assays or alternative experiments. We strictly enforce type checks in these cases. To relax these restrictions for alternative experiments, set type_check_alternative_experiments to False.
Important
If you are using the alternative_experiments slot, the number of cells must match the parent experiment. If the numbers differ, the cells are assumed not to share the same sample or annotations, and the experiment cannot be stored as an alternative experiment.
Before we construct a SingleCellExperiment object, let's generate information about rows and columns, and mock experimental data from a single-cell RNA-seq experiment:
Tip
You can also use delayed or file-backed arrays to represent experimental data; see the corresponding section of the SummarizedExperiment documentation.
Interop with anndata
We provide convenient methods for loading an AnnData or h5ad file into SingleCellExperiment objects.
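For example, let's create an AnnData object:
import anndata as ad
import numpy as np
from scipy import sparse as sp

# AnnData stores cells in rows and features in columns (100 cells x 2000 features).
counts = sp.csr_matrix(np.random.poisson(1, size=(100, 2000)), dtype=np.float32)
adata = ad.AnnData(counts)
adata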
AnnData object with n_obs × n_vars = 100 × 2000
Converting an AnnData object to a SingleCellExperiment is straightforward:
The reverse conversion works as well. All assays from the SCE are represented in the layers slot of the AnnData object:
You can subset experimental data by using the subset ([]) operator. This operation accepts different slice input types, such as a boolean vector, a slice object, a list of indices, or names (if available) to subset.
In our previous example, we didn’t include row or column names. Let’s create another SingleCellExperiment object that includes names.
SingleCellExperiment implements methods for the combine generic from BiocUtils.
These methods enable the merging or combining of multiple SingleCellExperiment objects, allowing users to aggregate data from different experiments or conditions. Note: row_pairs and column_pairs are currently ignored as part of this operation.
The combine_rows and combine_columns operations expect all experiments to contain the same assay names. To combine experiments by row:
from biocutils import relaxed_combine_columns, combine_columns, combine_rows, relaxed_combine_rows

sce_combined = combine_rows(sce2, sce1)
print(sce_combined)
Note
You can use relaxed_combine_columns or relaxed_combine_rows when there is a mismatch in the number of features or samples. Missing rows or columns in any object are filled in with appropriate placeholder values before combining, e.g. missing assays are replaced with a masked numpy array.
# sce_alts1 contains an additional assay not present in sce_alts2
sce_relaxed_combine = relaxed_combine_columns(sce_alts1, sce_alts2)
print(sce_relaxed_combine)
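To make the placeholder idea concrete, the following numpy-only sketch shows how a missing assay can be stood in for by a fully masked array before concatenating along the cell axis. It illustrates the concept only and is not the library's actual implementation; the array names are hypothetical.

```python
import numpy as np

# Assay present in experiment A: 3 features x 2 cells.
counts_a = np.arange(6).reshape(3, 2)

# Experiment B has 4 cells but lacks this assay, so every
# entry of its placeholder is masked (treated as missing).
placeholder_b = np.ma.masked_all((3, 4), dtype=counts_a.dtype)

# Combine by column: features line up, cells are appended.
combined = np.ma.concatenate([np.ma.asarray(counts_a), placeholder_b], axis=1)

print(combined.shape)    # (3, 6)
print(combined.count())  # 6 unmasked values, all from experiment A
```

Masked entries are skipped by numpy's masked-array reductions such as combined.sum(), so the placeholder does not distort downstream summaries.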
Export as AnnData or MuData
The package also provides methods to convert a SingleCellExperiment object into a MuData representation:
mdata = sce.to_mudata()
mdata