scranpy.aggregation package#

Submodules#

scranpy.aggregation.aggregate_across_cells module#

class scranpy.aggregation.aggregate_across_cells.AggregateAcrossCellsOptions(compute_sums=True, compute_detected=True, assay_type=0, num_threads=1)[source]#

Bases: object

Options to pass to aggregate_across_cells().

compute_sums#

Whether to compute the sum of expression values for each group.

compute_detected#

Whether to compute the number of cells with detected expression in each group.

assay_type#

Assay to use from input if it is a SummarizedExperiment.

num_threads#

Number of threads.

__eq__(other)#

Return self==value.

__hash__ = None#
__repr__()#

Return repr(self).

assay_type: Union[int, str] = 0#
compute_detected: bool = True#
compute_sums: bool = True#
num_threads: int = 1#
scranpy.aggregation.aggregate_across_cells.aggregate_across_cells(input, groups, options=AggregateAcrossCellsOptions(compute_sums=True, compute_detected=True, assay_type=0, num_threads=1))[source]#

Aggregate expression values for groups of cells.

Parameters:
Return type:

SummarizedExperiment

Returns:

A SummarizedExperiment where each row corresponds to a row in input and each column corresponds to a group. Assays contain the sum of expression values (if options.compute_sums = True) and the number of cells with detected expression (if options.compute_detected = True) for each group. Column data contains the identity of each group; if groups contains multiple sequences, each group is defined as a unique combination of levels from each sequence.
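
The following is a minimal NumPy-only sketch of the computation described above, not scranpy's implementation. The helper name aggregate_sketch and the toy data are hypothetical, and a detected value is assumed here to mean a count greater than zero. It returns plain arrays instead of a SummarizedExperiment, and it shows how several grouping sequences combine into one group per unique combination of levels.

```python
import numpy as np


def aggregate_sketch(matrix, *factors):
    """Sum values and count detected cells per group (illustration only).

    matrix: 2-D array where rows are features and columns are cells.
    factors: one or more sequences, each with one entry per cell.
    """
    matrix = np.asarray(matrix)
    # Each cell's group is the tuple of its levels across all factors.
    keys = list(zip(*factors))
    if len(keys) != matrix.shape[1]:
        raise ValueError("each factor needs one entry per cell")

    # The sorted unique combinations define the output columns.
    levels = sorted(set(keys))
    column_of = {key: j for j, key in enumerate(levels)}

    sums = np.zeros((matrix.shape[0], len(levels)), dtype=matrix.dtype)
    detected = np.zeros((matrix.shape[0], len(levels)), dtype=np.int64)
    for cell, key in enumerate(keys):
        j = column_of[key]
        sums[:, j] += matrix[:, cell]
        # Assumption: "detected" means a strictly positive value.
        detected[:, j] += matrix[:, cell] > 0
    return sums, detected, levels


# Toy data: 3 features (rows) by 4 cells (columns).
counts = np.array([
    [1, 0, 2, 0],
    [0, 0, 3, 4],
    [5, 1, 0, 0],
])
cluster = ["a", "a", "b", "b"]
batch = ["x", "y", "x", "x"]

# One grouping sequence: one output column per cluster.
sums_one, detected_one, levels_one = aggregate_sketch(counts, cluster)

# Two grouping sequences: one output column per observed
# (cluster, batch) combination.
sums_two, detected_two, levels_two = aggregate_sketch(counts, cluster, batch)
```

In this sketch, levels plays the role of the column data: each entry identifies the group that the matching column of sums and detected was computed from.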

scranpy.aggregation.downsample_by_neighbors module#

class scranpy.aggregation.downsample_by_neighbors.DownsampleByNeighborsOptions[source]#

Bases: object

Options to pass to downsample_by_neighbors().

num_threads#

Number of threads to use.

num_threads: int = 1#
scranpy.aggregation.downsample_by_neighbors.downsample_by_neighbors(input, k, options=<scranpy.aggregation.downsample_by_neighbors.DownsampleByNeighborsOptions object>)[source]#

Downsample a dataset by its neighbors. We do this by considering a cell to be a “representative” of its nearest neighbors, allowing us to downsample by removing all of its neighbors; this is repeated until all cells are assigned to a representative, starting from the cells in the densest part of the dataset and working our way down. This approach aims to preserve the relative density of points for a faithful downsampling while guaranteeing the representation of rare subpopulations.

Parameters:
  • input (Union[NeighborIndex, NeighborResults, ndarray]) –

    Object containing per-cell nearest neighbor results or data that can be used to derive them.

    This may be a 2-dimensional ndarray containing per-cell coordinates, where rows are cells and columns are features/dimensions. This is most typically the result of run_pca().

    Alternatively, input may be a pre-built neighbor search index (NeighborIndex) for the dataset, typically constructed from the PC coordinates for all cells.

    Alternatively, input may be pre-computed neighbor search results (NeighborResults) for all cells in the dataset. The number of neighbors should be consistent with the perplexity provided in InitializeTsneOptions (see also tsne_perplexity_to_neighbors()).

  • k (int) – Number of neighbors to use for downsampling. Larger values result in more downsampling at the cost of speed. Only relevant if input is not a NeighborResults object.

  • options (DownsampleByNeighborsOptions) – Further options.

Return type:

Tuple[ndarray, ndarray]

Returns:

The first value is of length less than the number of observations, and contains the indices of the observations that were retained after downsampling. The second value is of length equal to the number of observations, and contains the index of the representative observation for each observation in the dataset.
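
To make the procedure and the two return values concrete, here is a simplified greedy sketch in plain NumPy. It is not scranpy's algorithm: the function name downsample_sketch is hypothetical, neighbors are found by brute force, and the distance to the k-th neighbor is assumed as a stand-in for local density (a smaller distance means a denser region).

```python
import numpy as np


def downsample_sketch(coords, k):
    """Greedy neighbor-based downsampling (illustration only).

    coords: 2-D array where rows are cells and columns are dimensions.
    k: number of nearest neighbors that each representative absorbs.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    if not 0 < k < n:
        raise ValueError("k must be between 1 and the number of cells - 1")

    # Brute-force pairwise Euclidean distances; suitable for small inputs only.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]

    # Assumed density proxy: distance to the k-th nearest neighbor.
    kth = np.take_along_axis(dist, neighbors[:, -1:], axis=1)[:, 0]
    order = np.argsort(kth, kind="stable")  # densest cells first

    assigned = np.full(n, -1, dtype=np.int64)
    retained = []
    for cell in order:
        if assigned[cell] != -1:
            continue  # already represented by an earlier cell
        assigned[cell] = cell  # the cell represents itself
        retained.append(cell)
        for nb in neighbors[cell]:
            if assigned[nb] == -1:
                assigned[nb] = cell  # the neighbor is downsampled away
    return np.array(retained, dtype=np.int64), assigned


# Toy data: a dense cluster of five cells plus one isolated "rare" cell.
points = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.1],
    [0.1, 0.1],
    [0.05, 0.05],
    [10.0, 10.0],
])
retained, assigned = downsample_sketch(points, k=2)

# Number of cells that each representative stands for.
representatives, sizes = np.unique(assigned, return_counts=True)
```

As in the return value described above, retained is shorter than the number of cells, while assigned has one entry per cell. The isolated cell is never among the nearest neighbors of a cluster cell, so it ends up representing itself, which illustrates how rare subpopulations stay represented.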

Module contents#