scranpy.aggregation package#
Submodules#
scranpy.aggregation.aggregate_across_cells module#
- class scranpy.aggregation.aggregate_across_cells.AggregateAcrossCellsOptions(compute_sums=True, compute_detected=True, assay_type=0, num_threads=1)[source]#
Bases:
object
Options to pass to aggregate_across_cells().
- compute_sums#
Whether to compute the sum of each group.
- compute_detected#
Whether to compute the number of cells with detected expression in each group.
- assay_type#
Assay to use from input if it is a SummarizedExperiment.
- num_threads#
Number of threads.
- __annotations__ = {'assay_type': typing.Union[int, str], 'compute_detected': <class 'bool'>, 'compute_sums': <class 'bool'>, 'num_threads': <class 'int'>}#
- __eq__(other)#
Return self==value.
- __hash__ = None#
- __repr__()#
Return repr(self).
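The attributes listed above indicate an ordinary, non-frozen dataclass with value-based equality and no hash. The behaviour that follows from this can be illustrated with a local stand-in that mirrors the documented fields and defaults. The class name AggregateOptionsSketch is hypothetical and is not imported from scranpy; this is a sketch of standard dataclass semantics, not the library's source.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-in mirroring the documented fields and defaults.
# It is NOT the scranpy class; it only illustrates dataclass behaviour.
@dataclass
class AggregateOptionsSketch:
    compute_sums: bool = True
    compute_detected: bool = True
    assay_type: Union[int, str] = 0
    num_threads: int = 1


# Defaults apply unless overridden by keyword.
default_options = AggregateOptionsSketch()
custom_options = AggregateOptionsSketch(compute_detected=False, num_threads=4)

# eq=True: two instances with identical field values compare equal.
same_values = default_options == AggregateOptionsSketch()

# eq=True with frozen=False sets __hash__ to None, so instances
# cannot be used as dictionary keys or set members.
try:
    hash(default_options)
    is_hashable = True
except TypeError:
    is_hashable = False

print(same_values, is_hashable, custom_options)
```

Because such instances are mutable, a field can also be changed after construction (for example, custom_options.num_threads = 2).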
- scranpy.aggregation.aggregate_across_cells.aggregate_across_cells(input, groups, options=AggregateAcrossCellsOptions(compute_sums=True, compute_detected=True, assay_type=0, num_threads=1))[source]#
Aggregate expression values for groups of cells.
- Parameters:
input (Union[TatamiNumericPointer, SummarizedExperiment]) – Matrix-like object where rows are features and columns are cells, typically containing expression values of some kind. This should be a matrix class that can be converted into a TatamiNumericPointer. Alternatively, a SummarizedExperiment containing such a matrix in its assays. Developers may also provide a TatamiNumericPointer directly.
groups (Union[Sequence, Tuple[Sequence], dict, BiocFrame]) – A sequence of length equal to the number of columns of input, specifying the group to which each column is assigned. Alternatively, a tuple, dictionary, or BiocFrame of one or more such sequences, in which case each unique combination of levels across all sequences is defined as a “group”.
options (AggregateAcrossCellsOptions) – Further options.
- Return type:
- Returns:
A SummarizedExperiment where each row corresponds to a row in input and each column corresponds to a group. Assays contain the sum of expression values (if options.compute_sums = True) and the number of cells with detected expression (if options.compute_detected = True) for each group. Column data contains the identity of each group; for groups containing multiple sequences, the identity of each group is defined as a unique combination of levels from each sequence.
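The grouping rule described above (one output column per unique combination of levels) can be sketched in plain Python. The function aggregate_sketch and the toy data are hypothetical and do not call scranpy. The sketch assumes that a value greater than zero counts as "detected" and that group columns are ordered by sorted level combination; the library may differ on both points.

```python
def aggregate_sketch(matrix, groups):
    """Sum values and count detected cells per group.

    matrix: list of rows (features), each a list of per-cell values.
    groups: one sequence, or a tuple of sequences, with one entry per cell.
    """
    # Treat a single sequence as a tuple containing one grouping factor.
    factors = groups if isinstance(groups, tuple) else (groups,)

    # Each cell's group is the combination of its levels across all factors.
    keys = list(zip(*factors))
    levels = sorted(set(keys))
    column_of = {level: i for i, level in enumerate(levels)}

    sums = [[0] * len(levels) for _ in matrix]
    detected = [[0] * len(levels) for _ in matrix]
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            g = column_of[keys[c]]
            sums[r][g] += value
            if value > 0:  # assumption: "detected" means a positive value
                detected[r][g] += 1
    return levels, sums, detected


# Toy data: 2 features by 4 cells, grouped by two factors.
counts = [[1, 0, 2, 3],
          [0, 0, 5, 0]]
batch = ["A", "A", "B", "B"]
cluster = ["x", "y", "x", "x"]

levels, sums, detected = aggregate_sketch(counts, (batch, cluster))
print(levels)
print(sums)
print(detected)
```

Only combinations that actually occur in the data become groups here, so the combination ("B", "y") produces no column in this toy example.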
scranpy.aggregation.downsample_by_neighbors module#
- class scranpy.aggregation.downsample_by_neighbors.DownsampleByNeighborsOptions[source]#
Bases:
object
Options to pass to downsample_by_neighbors().
- num_threads#
Number of threads to use.
- __annotations__ = {'num_threads': <class 'int'>}#
- scranpy.aggregation.downsample_by_neighbors.downsample_by_neighbors(input, k, options=<scranpy.aggregation.downsample_by_neighbors.DownsampleByNeighborsOptions object>)[source]#
Downsample a dataset by its neighbors. We do this by considering a cell to be a “representative” of its nearest neighbors, allowing us to downsample by removing all of its neighbors; this is repeated until all cells are assigned to a representative, starting from the cells in the densest part of the dataset and working our way down. This approach aims to preserve the relative density of points for a faithful downsampling while guaranteeing the representation of rare subpopulations.
- Parameters:
input (Union[NeighborIndex, NeighborResults, ndarray]) – Object containing per-cell nearest neighbor results or data that can be used to derive them.
This may be a 2-dimensional ndarray containing per-cell coordinates, where rows are cells and columns are features/dimensions. This is most typically the result of run_pca().
Alternatively, input may be a pre-built neighbor search index (NeighborIndex) for the dataset, typically constructed from the PC coordinates for all cells.
Alternatively, input may be pre-computed neighbor search results (NeighborResults) for all cells in the dataset. The number of neighbors should be consistent with the perplexity provided in InitializeTsneOptions (see also tsne_perplexity_to_neighbors()).
k (int) – Number of neighbors to use for downsampling. Larger values result in more downsampling at the cost of speed. Only relevant if input is not a NeighborResults object.
options (DownsampleByNeighborsOptions) – Further options.
- Return type:
- Returns:
The first value is of length less than the number of observations, and contains the indices of the observations that were retained after downsampling. The second value is of length equal to the number of observations, and contains the index of the representative observation for each observation in the dataset.
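The representative-selection idea and the shape of the two returned values can be illustrated with a simplified greedy sketch in plain Python. This is not the library's implementation. The function downsample_sketch is hypothetical, uses a brute-force neighbor search, and assumes that the distance to the k-th neighbor is an acceptable stand-in for local density (smaller distance meaning denser).

```python
import math


def downsample_sketch(points, k):
    """Greedy neighbor-based downsampling on a list of coordinate tuples."""
    n = len(points)
    neighbors = []
    kth_distance = []
    for i in range(n):
        # Brute-force search for the k nearest neighbors of cell i.
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        neighbors.append([j for _, j in nearest])
        kth_distance.append(nearest[-1][0] if nearest else 0.0)

    # Visit cells from the densest region outwards (assumed density proxy).
    order = sorted(range(n), key=lambda i: (kth_distance[i], i))

    representative = [-1] * n
    retained = []
    for i in order:
        if representative[i] != -1:
            continue  # already covered by an earlier representative
        representative[i] = i
        retained.append(i)
        for j in neighbors[i]:
            if representative[j] == -1:
                representative[j] = i
    return sorted(retained), representative


# Four cells in a tight cluster plus one isolated cell.
coordinates = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
retained, representative = downsample_sketch(coordinates, k=3)
print(retained)
print(representative)
```

In this toy case the tight cluster collapses to a single retained cell, while the isolated cell is kept as its own representative, which mirrors the stated aim of preserving rare subpopulations.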