compressed_lists package

Submodules

compressed_lists.base module

class compressed_lists.base.CompressedList(unlist_data, partitioning, element_type=None, element_metadata=None, metadata=None, validate=True)[source]

Bases: object

Base class for compressed list objects.

CompressedList stores list elements concatenated in a single vector-like object with partitioning information that defines where each list element starts and ends.
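The storage idea described above can be illustrated with plain Python lists. This is a minimal standalone sketch, not the package's implementation: it does not import compressed_lists, and the names `elements`, `ends`, and `element_at` are hypothetical.

```python
# Standalone illustration of the "flat vector + partition ends" layout.
elements = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# Concatenate all elements into one flat vector.
unlist_data = [x for element in elements for x in element]

# Record the exclusive end position of each element in the flat vector.
ends = []
total = 0
for element in elements:
    total += len(element)
    ends.append(total)


def element_at(i):
    """Recover element i: it starts where element i - 1 ended."""
    start = ends[i - 1] if i > 0 else 0
    return unlist_data[start:ends[i]]


recovered = [element_at(i) for i in range(len(ends))]
```

Only the flat vector and the end positions are stored; each element is sliced out on demand.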

__copy__()[source]
Returns:

A shallow copy of the current CompressedList.

__deepcopy__(memo=None, _nil=[])[source]
Returns:

A deep copy of the current CompressedList.

__getitem__(key)[source]

Get an element or slice of elements from the list.

Parameters:

key (Union[int, str, slice]) – Integer index, string name, or slice.

Return type:

Any

Returns:

List element(s).

__init__(unlist_data, partitioning, element_type=None, element_metadata=None, metadata=None, validate=True)[source]

Initialize a CompressedList.

Parameters:
  • unlist_data (Any) – Vector-like object containing concatenated elements.

  • partitioning (Partitioning) – Partitioning object defining element boundaries (exclusive).

  • element_type (Any) – Class for the type of elements.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • validate (bool) – Internal use only.

__iter__()[source]

Iterate over list elements.

Return type:

Iterator[Any]

__len__()[source]

Return the number of list elements.

Return type:

int

__repr__()[source]
Return type:

str

Returns:

A string representation.

copy()[source]

Alias for __copy__().

property element_metadata: dict

Alias for get_element_metadata.

property element_type: str

Alias for get_element_type, provided for back-compatibility.

extract_range(start, end)[source]

Extract a range from unlist_data.

This method must be implemented by subclasses to handle type-specific extraction from unlist_data.

Parameters:
  • start (int) – Start index (inclusive).

  • end (int) – End index (exclusive).

Return type:

Any

Returns:

Extracted element.

extract_subset(indices)[source]

Extract a subset of elements by indices.

Parameters:

indices (Sequence[int]) – Sequence of indices to extract.

Return type:

CompressedList

Returns:

A new CompressedList with only the selected elements.
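One way to picture subset extraction is as a rebuild of both the flat data and the end positions for the chosen elements. The sketch below is a standalone illustration on plain lists; the helper `extract_subset_sketch` is hypothetical and is not the method's actual code.

```python
def extract_subset_sketch(unlist_data, ends, indices):
    """Return (new_data, new_ends) holding only the selected elements.

    `ends` holds exclusive end positions; starts are derived from them.
    """
    starts = [0] + list(ends[:-1])
    new_data, new_ends = [], []
    for i in indices:
        # Copy the slice for element i, then record the new running end.
        new_data.extend(unlist_data[starts[i]:ends[i]])
        new_ends.append(len(new_data))
    return new_data, new_ends


subset_data, subset_ends = extract_subset_sketch(
    [1, 2, 3, 4, 5, 6, 7, 8, 9], [3, 5, 9], [2, 0]
)
```

Here the third element is selected first and the first element second, so the result follows the order of `indices`.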

classmethod from_list(lst, names=None, metadata=None)[source]

Create a CompressedList from a regular list.

This method must be implemented by subclasses to handle type-specific conversion from list to unlist_data.

Return type:

CompressedList

Returns:

A new CompressedList.

get_element_lengths()[source]

Get the lengths of each list element.

Return type:

ndarray

get_element_metadata()[source]
Return type:

dict

Returns:

Dictionary of metadata for each element in this object.

get_element_type()[source]

Return the element_type.

Return type:

str

get_metadata()[source]
Return type:

dict

Returns:

Dictionary of metadata for this object.

get_names()[source]

Get the names of list elements.

Return type:

Optional[NamedList]

get_partitioning()[source]

Return the partitioning info.

Return type:

Partitioning

get_unlist_data()[source]

Get all elements.

Return type:

Any

lapply(func)[source]

Apply a function to each element.

Parameters:

func (Callable) – Function to apply to each element.

Return type:

CompressedList

Returns:

A new CompressedList with the results.
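Conceptually, applying a function per element means slicing each element out of the flat data, calling the function, and re-compressing the results. The following is a standalone sketch on plain lists; `lapply_sketch` is a hypothetical helper, and it assumes `func` returns a list-like result for each element.

```python
def lapply_sketch(unlist_data, ends, func):
    """Apply func to each element and re-compress the results."""
    new_data, new_ends = [], []
    start = 0
    for end in ends:
        result = func(unlist_data[start:end])
        new_data.extend(result)          # append the transformed element
        new_ends.append(len(new_data))   # its new exclusive end position
        start = end
    return new_data, new_ends


data, ends = [1, 2, 3, 4, 5], [3, 5]
doubled = lapply_sketch(data, ends, lambda xs: [2 * x for x in xs])
heads = lapply_sketch(data, ends, lambda xs: xs[:1])
```

Because the function may change element lengths, the end positions are recomputed rather than reused.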

property metadata: dict

Alias for get_metadata.

property names: NamedList | None

Alias for get_names.

property paritioning: Partitioning

Alias for get_partitioning, provided for back-compatibility.

relist(unlist_data)[source]

Create a new CompressedList with the same partitioning but different data.

Parameters:

unlist_data (Any) – New unlisted data.

Return type:

CompressedList

Returns:

A new CompressedList.
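Reusing a partitioning with new data only makes sense when the new flat vector has the same total length as the old one. The sketch below illustrates that idea on plain lists; `relist_sketch` is hypothetical, and the length check is an assumption about sensible behaviour rather than a statement of what the package validates.

```python
def relist_sketch(new_unlist_data, ends):
    """Split new flat data using existing exclusive end positions."""
    expected = ends[-1] if ends else 0
    if len(new_unlist_data) != expected:
        raise ValueError(
            f"expected {expected} values, got {len(new_unlist_data)}"
        )
    starts = [0] + list(ends[:-1])
    return [new_unlist_data[s:e] for s, e in zip(starts, ends)]


letters = relist_sketch(list("abcde"), [3, 5])
```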

set_element_metadata(element_metadata, in_place=False)[source]

Set new element metadata.

Parameters:
  • element_metadata (dict) – New element metadata for this object.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.

set_metadata(metadata, in_place=False)[source]

Set additional metadata.

Parameters:
  • metadata (dict) – New metadata for this object.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.
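The `in_place` flag on the setters follows a common copy-or-mutate pattern. A minimal standalone sketch of that pattern is shown below; the `Record` class is hypothetical and only stands in for any object with such a setter.

```python
import copy


class Record:
    """Hypothetical stand-in for an object with an in_place-style setter."""

    def __init__(self, metadata=None):
        self.metadata = dict(metadata or {})

    def set_metadata(self, metadata, in_place=False):
        # Mutate self when in_place is True, otherwise work on a shallow copy.
        target = self if in_place else copy.copy(self)
        target.metadata = dict(metadata)
        return target


original = Record({"source": "a"})
updated = original.set_metadata({"source": "b"})
same = original.set_metadata({"source": "c"}, in_place=True)
```

With the default `in_place=False`, the original object is left untouched and a modified copy is returned; with `in_place=True`, the returned object is the original itself.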

set_names(names, in_place=False)[source]

Set the names of list elements.

Parameters:
  • names – New names, one per list element. May be None to remove names.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.

set_unlist_data(unlist_data, in_place=False)[source]

Set new list elements.

Parameters:
  • unlist_data (Any) – New vector-like object containing concatenated elements.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.

to_list()[source]

Convert to a regular Python list.

Return type:

List[Any]

Returns:

A regular Python list with all elements.

unlist(use_names=True)[source]

Get the underlying unlisted data.

Parameters:

use_names (bool) –

Whether to include names in the result if applicable.

Currently not used.

Return type:

Any

Returns:

The unlisted data.

property unlist_data: Any

Alias for get_unlist_data.

compressed_lists.biocframe_list module

class compressed_lists.biocframe_list.CompressedBiocFrameList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList for BiocFrames.

__getitem__(key)[source]

Override to handle column extraction using splitAsCompressedList.

__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedBiocFrameList.

Parameters:
  • unlist_data (BiocFrame) – BiocFrame object.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • kwargs – Additional arguments.

extract_range(start, end)[source]

Extract a range from unlist_data.

This method must be implemented by subclasses to handle type-specific extraction from unlist_data.

Parameters:
  • start (int) – Start index (inclusive).

  • end (int) – End index (exclusive).

Return type:

BiocFrame

Returns:

Extracted element.

classmethod from_list(lst, names=None, metadata=None)[source]

Create a CompressedBiocFrameList from a regular list.

This concatenates the list of BiocFrame objects.

Return type:

CompressedBiocFrameList

Returns:

A new CompressedBiocFrameList.

compressed_lists.bool_list module

class compressed_lists.bool_list.CompressedBooleanList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of booleans.

__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedBooleanList.

Parameters:
  • unlist_data (BooleanList) – List of booleans.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • kwargs – Additional arguments.

compressed_lists.float_list module

class compressed_lists.float_list.CompressedFloatList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of floats.

__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedFloatList.

Parameters:
  • unlist_data (FloatList) – List of floats.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • kwargs – Additional arguments.

compressed_lists.integer_list module

class compressed_lists.integer_list.CompressedIntegerList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of integers.

__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedIntegerList.

Parameters:
  • unlist_data (IntegerList) – List of integers.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • kwargs – Additional arguments.

compressed_lists.numpy_list module

class compressed_lists.numpy_list.CompressedNumpyList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of NumPy vectors.

__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedNumpyList.

Parameters:
  • unlist_data (ndarray) – List of NumPy vectors.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • kwargs – Additional arguments.

classmethod from_list(lst, names=None, metadata=None)[source]

Create a CompressedNumpyList from a list of NumPy vectors.

Return type:

CompressedNumpyList

Returns:

A new CompressedNumpyList.

compressed_lists.partition module

class compressed_lists.partition.Partitioning(ends, names=None, validate=True)[source]

Bases: object

Represents partitioning information for a CompressedList.

This is similar to the PartitioningByEnd class in Bioconductor. It keeps track of where each element begins and ends in the unlisted data.
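Storing only the end positions is enough to derive everything else about the partitions. The sketch below shows that derivation with plain Python lists; it is a standalone illustration with hypothetical variable names, not the class's code.

```python
# Exclusive end position of each partition in the unlisted data.
ends = [3, 5, 9]

# Each partition starts where the previous one ended; the first starts at 0.
starts = [0] + ends[:-1]

# Length of each partition and its (start, end) range.
lengths = [end - start for start, end in zip(starts, ends)]
ranges = list(zip(starts, ends))

# Total number of objects across all partitions is the last end position.
nobj = ends[-1] if ends else 0
```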

__copy__()[source]
Returns:

A shallow copy of the current Partitioning.

__deepcopy__(memo=None, _nil=[])[source]
Returns:

A deep copy of the current Partitioning.

__getitem__(key)[source]

Get partition range(s) by index or slice.

Parameters:

key (Union[int, slice]) – Integer index or slice.

Return type:

Union[tuple, List[tuple]]

Returns:

Tuple of (start, end) or list of such tuples.

__init__(ends, names=None, validate=True)[source]

Initialize a Partitioning object.

Parameters:
  • ends (Sequence[int]) – Sequence of ending positions for each partition (exclusive).

  • names (Optional[Sequence[str]]) – Optional names for each partition.

  • validate (bool) – Internal use only.

__len__()[source]

Return the number of partitions.

Return type:

int

__repr__()[source]
Return type:

str

Returns:

A string representation.

copy()[source]

Alias for __copy__().

element_lengths()[source]

Alias for get_element_lengths.

Return type:

ndarray

property ends: ndarray

Alias for get_ends, provided for back-compatibility.

classmethod from_lengths(lengths, names=None)[source]

Create a Partitioning from a sequence of lengths.

Return type:

Partitioning

Returns:

A new Partitioning object.

classmethod from_list(lst, names=None)[source]

Create a Partitioning from a list by using the lengths of each element.

Parameters:
  • lst (List) – A list to create partitioning from.

  • names (Optional[Sequence[str]]) – Optional names for each partition.

Return type:

Partitioning

Returns:

A new Partitioning object.

get_element_lengths()[source]

Return the lengths of each partition.

Return type:

ndarray

get_ends()[source]

Return the end positions of each partition.

Return type:

ndarray

get_names()[source]

Return the names of each partition.

Return type:

Optional[Names]

get_nobj()[source]

Return the total number of objects across all partitions.

Return type:

int

get_partition_range(i)[source]

Get the start and end indices for partition i.

Return type:

tuple

get_starts()[source]

Return the starts of each partition.

Return type:

ndarray

property names: Names | None

Alias for get_names, provided for back-compatibility.

nobj()[source]

Alias for get_nobj.

Return type:

int

set_names(names, in_place=False)[source]

Set the names of list elements.

Parameters:
  • names (Optional[Sequence[str]]) –

    New names, one per partition.

    May be None to remove names.

  • in_place (bool) – Whether to modify the Partitioning in place.

Return type:

Partitioning

Returns:

A modified Partitioning object, either as a copy of the original or as a reference to the (in-place-modified) original.

property starts: ndarray

Alias for get_starts, provided for back-compatibility.

compressed_lists.split_generic module

compressed_lists.split_generic.groups_to_partition(data, groups, names=None)[source]

Convert group membership vector to partitioned data and Partitioning object.

Parameters:
  • data (Any) – The data to be split (flat vector-like object).

  • groups (list) – Group membership vector, same length as data.

  • names (Optional[Sequence[str]]) – Optional names for groups.

Return type:

Tuple[List[Any], Partitioning]

Returns:

Tuple of (partitioned_data_list, partitioning_object)
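The conversion from a group membership vector to partitioned data can be sketched with a dictionary of buckets. This is a standalone illustration: `split_by_groups` is a hypothetical helper, and ordering groups by first appearance is an assumption made for the example, which may differ from the ordering the package uses.

```python
def split_by_groups(data, groups):
    """Return (partitioned_data, ends, group_names) for a membership vector."""
    if len(data) != len(groups):
        raise ValueError("data and groups must have the same length")

    # Collect values per group; dicts keep first-appearance order.
    buckets = {}
    for value, group in zip(data, groups):
        buckets.setdefault(group, []).append(value)

    partitioned = list(buckets.values())

    # Exclusive end positions, as a running total of the group sizes.
    ends, total = [], 0
    for chunk in partitioned:
        total += len(chunk)
        ends.append(total)
    return partitioned, ends, list(buckets)


parts, part_ends, part_names = split_by_groups(
    [10, 20, 30, 40, 50], ["a", "b", "a", "c", "b"]
)
```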

compressed_lists.split_generic.splitAsCompressedList(data, groups_or_partitions, names=None, metadata=None)[source]

Generic function to split data into an appropriate CompressedList subclass.

This function can work in two modes:

  1. Group-based splitting, where a flat vector is split according to group membership.

  2. Partition-based splitting, where a flat vector is split according to explicit partitions.

Parameters:
  • data (Any) – The data to split into a CompressedList.

  • groups_or_partitions (Union[list, Partitioning]) – Group membership vector (same length as data) or explicit Partitioning object.

  • names (Optional[Sequence[str]]) – Optional names for the list elements.

  • metadata (Optional[dict]) – Optional metadata for the CompressedList.

Return type:

CompressedList

Returns:

An appropriate CompressedList subclass instance.

compressed_lists.string_list module

class compressed_lists.string_list.CompressedCharacterList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedStringList

class compressed_lists.string_list.CompressedStringList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of strings.

__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedStringList.

Parameters:
  • unlist_data (StringList) – List of strings.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (Optional[dict]) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • kwargs – Additional arguments.

Module contents