compressed_lists package

Submodules

compressed_lists.CompressedIntegerList module

class compressed_lists.CompressedIntegerList.CompressedIntegerList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of integers.

__abstractmethods__ = frozenset({})
__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedIntegerList.

Parameters:
  • unlist_data (ndarray) – NumPy array of integers.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (dict) – Optional metadata for elements.

  • metadata (dict) – Optional general metadata.

  • kwargs – Additional arguments.

classmethod from_list(lst, names=None, metadata=None)[source]

Create a CompressedIntegerList from a list of integer lists.

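Parameters:
  • lst – List of integer lists to convert.

  • names – Optional names for list elements.

  • metadata – Optional metadata.

Return type: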

CompressedIntegerList

Returns:

A new CompressedIntegerList.

compressed_lists.CompressedList module

class compressed_lists.CompressedList.CompressedList(unlist_data, partitioning, element_type=None, element_metadata=None, metadata=None, validate=True)[source]

Bases: ABC

Base class for compressed list objects.

CompressedList stores list elements concatenated in a single vector-like object with partitioning information that defines where each list element starts and ends.
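To make the storage layout concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: the helper names compress and decompress are hypothetical, and 0-based, end-exclusive positions are assumed.

```python
from itertools import accumulate


def compress(lst):
    """Flatten a list of lists into (unlist_data, ends). Hypothetical helper."""
    unlist_data = [x for element in lst for x in element]
    # Each end is the running total of element lengths.
    ends = list(accumulate(len(element) for element in lst))
    return unlist_data, ends


def decompress(unlist_data, ends):
    """Rebuild the list of lists from the flat data and the end positions."""
    starts = [0] + ends[:-1]
    return [unlist_data[s:e] for s, e in zip(starts, ends)]


data, ends = compress([[1, 2, 3], [4, 5], [6, 7, 8, 9]])
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(ends)  # [3, 5, 9]
```

Storing one flat vector plus a small vector of ends lets element-wise operations run over a single contiguous object instead of many small lists.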

__abstractmethods__ = frozenset({'_extract_range', 'from_list'})
__annotations__ = {}
__copy__()[source]
Returns:

A shallow copy of the current CompressedList.

__deepcopy__(memo=None, _nil=[])[source]
Returns:

A deep copy of the current CompressedList.

__getitem__(key)[source]

Get an element or slice of elements from the list.

Parameters:

key (Union[int, str, slice]) – Integer index, string name, or slice.

Return type:

Any

Returns:

List element(s).
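The three key types accepted above can be illustrated with a small stand-alone sketch. The function get_item is hypothetical and works on plain lists. For a slice it returns a plain list of elements, which may differ from what the library returns.

```python
def get_item(unlist_data, ends, names, key):
    """Look up one element (int or str key) or several (slice key)."""
    starts = [0] + ends[:-1]
    if isinstance(key, str):
        # Translate a name into its integer position.
        if names is None or key not in names:
            raise KeyError(key)
        key = names.index(key)
    if isinstance(key, slice):
        positions = range(*key.indices(len(ends)))
        return [unlist_data[starts[i]:ends[i]] for i in positions]
    return unlist_data[starts[key]:ends[key]]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ends = [3, 5, 9]
names = ["A", "B", "C"]

print(get_item(data, ends, names, 1))            # [4, 5]
print(get_item(data, ends, names, "C"))          # [6, 7, 8, 9]
print(get_item(data, ends, names, slice(0, 2)))  # [[1, 2, 3], [4, 5]]
```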

__init__(unlist_data, partitioning, element_type=None, element_metadata=None, metadata=None, validate=True)[source]

Initialize a CompressedList.

Parameters:
  • unlist_data (Any) – Vector-like object containing concatenated elements.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_type (str) – String identifier for the type of elements.

  • element_metadata (dict) – Optional metadata for elements.

  • metadata (Optional[dict]) – Optional general metadata.

  • validate (bool) – Internal use only.

__iter__()[source]

Iterate over list elements.

Return type:

Iterator[Any]

__len__()[source]

Return the number of list elements.

Return type:

int

__repr__()[source]
Return type:

str

Returns:

A string representation.

copy()[source]

Alias for __copy__().

property element_metadata: dict

Alias for get_element_metadata.

property element_type: str

Alias for get_element_type, provided for back-compatibility.

extract_subset(indices)[source]

Extract a subset of elements by indices.

Parameters:

indices – Sequence of indices to extract.

Returns:

A new CompressedList with only the selected elements.
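As a rough illustration of subsetting, the sketch below picks elements by position and rebuilds both the flat data and the end positions. The function name and its return shape (a tuple of plain lists) are assumptions made for this example, not the library's behavior.

```python
from itertools import accumulate


def extract_subset(unlist_data, ends, indices):
    """Select elements by index and recompute the partitioning. Hypothetical."""
    starts = [0] + ends[:-1]
    picked = [unlist_data[starts[i]:ends[i]] for i in indices]
    new_data = [x for element in picked for x in element]
    new_ends = list(accumulate(len(element) for element in picked))
    return new_data, new_ends


new_data, new_ends = extract_subset([1, 2, 3, 4, 5, 6, 7, 8, 9], [3, 5, 9], [2, 0])
print(new_data)  # [6, 7, 8, 9, 1, 2, 3]
print(new_ends)  # [4, 7]
```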

abstractmethod classmethod from_list(lst, names=None, metadata=None)[source]

Create a CompressedList from a regular list.

This method must be implemented by subclasses to handle type-specific conversion from list to unlist_data.

Parameters:
  • lst – List to convert.

  • names – Optional names for list elements.

  • metadata – Optional metadata.

Returns:

A new CompressedList.

get_element_lengths()[source]

Get the lengths of each list element.

Return type:

ndarray

get_element_metadata()[source]
Return type:

dict

Returns:

Dictionary of metadata for each element in this object.

get_element_type()[source]

Return the element_type.

Return type:

str

get_metadata()[source]
Return type:

dict

Returns:

Dictionary of metadata for this object.

get_names()[source]

Get the names of list elements.

Return type:

Optional[NamedList]

get_partitioning()[source]

Return the partitioning info.

Return type:

Partitioning

get_unlist_data()[source]

Get all elements, concatenated into a single vector-like object.

Return type:

Any

lapply(func)[source]

Apply a function to each element.

Parameters:

func – Function to apply to each element.

Returns:

A new CompressedList with the results.
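A sketch of the apply-then-recompress pattern follows. It assumes func returns a list for each element, so the result can be flattened again. The function below is a hypothetical stand-in that operates on plain lists rather than on a CompressedList.

```python
from itertools import accumulate


def lapply(unlist_data, ends, func):
    """Apply func to every element, then rebuild flat data and ends."""
    starts = [0] + ends[:-1]
    results = [func(unlist_data[s:e]) for s, e in zip(starts, ends)]
    new_data = [x for result in results for x in result]
    # Results may change length, so the end positions are recomputed.
    new_ends = list(accumulate(len(result) for result in results))
    return new_data, new_ends


def keep_even(element):
    return [x for x in element if x % 2 == 0]


evens, even_ends = lapply([1, 2, 3, 4, 5, 6, 7, 8, 9], [3, 5, 9], keep_even)
print(evens)      # [2, 4, 6, 8]
print(even_ends)  # [1, 2, 4]
```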

property metadata: dict

Alias for get_metadata.

property names: NamedList | None

Alias for get_names.

property paritioning: Partitioning

Alias for get_partitioning, provided for back-compatibility.

relist(unlist_data)[source]

Create a new CompressedList with the same partitioning but different data.

Parameters:

unlist_data – New unlisted data.

Returns:

A new CompressedList.
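The idea of reusing a partitioning with different data can be sketched as below. The length check is an assumption about what a sensible implementation would enforce. The function is hypothetical and returns a plain list of lists.

```python
def relist(new_unlist_data, ends):
    """Split new flat data using existing end positions. Hypothetical helper."""
    total = ends[-1] if ends else 0
    if len(new_unlist_data) != total:
        raise ValueError(
            f"expected {total} values to match the partitioning, "
            f"got {len(new_unlist_data)}"
        )
    starts = [0] + ends[:-1]
    return [new_unlist_data[s:e] for s, e in zip(starts, ends)]


letters = relist(list("abcdefghi"), [3, 5, 9])
print(letters)  # [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h', 'i']]
```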

set_element_metadata(element_metadata, in_place=False)[source]

Set new element metadata.

Parameters:
  • element_metadata (dict) – New element metadata for this object.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.
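The copy-versus-in-place contract shared by the setters can be sketched with a small stand-in class. Record and its private attribute are hypothetical. Only the in_place parameter and the return behavior mirror what is documented here.

```python
import copy


class Record:
    """Hypothetical stand-in that demonstrates the in_place contract."""

    def __init__(self, metadata=None):
        self._metadata = dict(metadata or {})

    def set_metadata(self, metadata, in_place=False):
        # Work on self when in_place is True, otherwise on a shallow copy.
        target = self if in_place else copy.copy(self)
        target._metadata = dict(metadata)
        return target


original = Record({"source": "a"})
updated = original.set_metadata({"source": "b"})               # new object
same = original.set_metadata({"source": "c"}, in_place=True)   # same object

print(updated is original)  # False
print(same is original)     # True
```

Returning the object in both cases allows calls to be chained regardless of which mode is used.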

set_metadata(metadata, in_place=False)[source]

Set additional metadata.

Parameters:
  • metadata (dict) – New metadata for this object.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.

set_names(names, in_place=False)[source]

Set the names of list elements.

Parameters:
  • names – New names; the length must match the number of list elements. May be None to remove names.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.

set_unlist_data(unlist_data, in_place=False)[source]

Set new list elements.

Parameters:
  • unlist_data (Any) – New vector-like object containing concatenated elements.

  • in_place (bool) – Whether to modify the CompressedList in place.

Return type:

CompressedList

Returns:

A modified CompressedList object, either as a copy of the original or as a reference to the (in-place-modified) original.

to_list()[source]

Convert to a regular Python list.

Return type:

List[Any]

Returns:

A regular Python list with all elements.

unlist(use_names=True)[source]

Get the underlying unlisted data.

Parameters:

use_names (bool) – Whether to include names in the result if applicable. Currently not used.

Return type:

Any

Returns:

The unlisted data.

property unlist_data: Any

Alias for get_unlist_data.

compressed_lists.CompressedStringList module

class compressed_lists.CompressedStringList.CompressedStringList(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Bases: CompressedList

CompressedList implementation for lists of strings.

__abstractmethods__ = frozenset({})
__annotations__ = {}
__init__(unlist_data, partitioning, element_metadata=None, metadata=None, **kwargs)[source]

Initialize a CompressedStringList.

Parameters:
  • unlist_data (List[str]) – List of strings.

  • partitioning (Partitioning) – Partitioning object defining element boundaries.

  • element_metadata (dict) – Optional metadata for elements.

  • metadata (dict) – Optional general metadata.

  • kwargs – Additional arguments.

classmethod from_list(lst, names=None, metadata=None)[source]

Create a CompressedStringList from a list of string lists.

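Parameters:
  • lst – List of string lists to convert.

  • names – Optional names for list elements.

  • metadata – Optional metadata.

Return type: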

CompressedStringList

Returns:

A new CompressedStringList.

compressed_lists.partition module

class compressed_lists.partition.Partitioning(ends, names=None, validate=True)[source]

Bases: object

Represents partitioning information for a CompressedList.

This is similar to the PartitioningByEnd class in Bioconductor. It keeps track of where each element begins and ends in the unlisted data.
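As a brief illustration of end-based partitioning, the starts, lengths, and ranges can all be derived from the ends alone. The sketch assumes 0-based, end-exclusive positions, which is a choice made for this example and has not been confirmed against the library.

```python
ends = [3, 5, 9]

# Each partition starts where the previous one ended; the first starts at 0.
starts = [0] + ends[:-1]
lengths = [e - s for s, e in zip(starts, ends)]
ranges = list(zip(starts, ends))

print(starts)   # [0, 3, 5]
print(lengths)  # [3, 2, 4]
print(ranges)   # [(0, 3), (3, 5), (5, 9)]
```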

__copy__()[source]
Returns:

A shallow copy of the current Partitioning.

__deepcopy__(memo=None, _nil=[])[source]
Returns:

A deep copy of the current Partitioning.

__getitem__(key)[source]

Get partition range(s) by index or slice.

Parameters:

key (Union[int, slice]) – Integer index or slice.

Return type:

Union[tuple, List[tuple]]

Returns:

Tuple of (start, end) or list of such tuples.

__init__(ends, names=None, validate=True)[source]

Initialize a Partitioning object.

Parameters:
  • ends (Sequence[int]) – Sequence of ending positions for each partition.

  • names (Optional[Sequence[str]]) – Optional names for each partition.

  • validate (bool) – Internal use only.

__len__()[source]

Return the number of partitions.

Return type:

int

__repr__()[source]
Return type:

str

Returns:

A string representation.

copy()[source]

Alias for __copy__().

element_lengths()[source]

Alias for get_element_lengths.

Return type:

ndarray

property ends: Names | None

Alias for get_ends, provided for back-compatibility.

classmethod from_lengths(lengths, names=None)[source]

Create a Partitioning from a sequence of lengths.

Parameters:
  • lengths – Sequence of lengths for each partition.

  • names – Optional names for each partition.

Return type:

Partitioning

Returns:

A new Partitioning object.
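Converting lengths to end positions amounts to a cumulative sum. The sketch below uses itertools.accumulate from the standard library and is only an illustration of the arithmetic, not the library's code.

```python
from itertools import accumulate

lengths = [3, 2, 4]
ends = list(accumulate(lengths))
print(ends)  # [3, 5, 9]

# A zero-length partition repeats the previous end position.
ends_with_empty = list(accumulate([2, 0, 1]))
print(ends_with_empty)  # [2, 2, 3]
```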

classmethod from_list(lst, names=None)[source]

Create a Partitioning from a list by using the lengths of each element.

Parameters:
  • lst (List) – A list to create partitioning from.

  • names (Optional[Sequence[str]]) – Optional names for each partition.

Return type:

Partitioning

Returns:

A new Partitioning object.

get_element_lengths()[source]

Return the lengths of each partition.

Return type:

ndarray

get_ends()[source]

Return the ends of each partition.

Return type:

Optional[NamedList]

get_names()[source]

Return the names of each partition.

Return type:

Optional[NamedList]

get_nobj()[source]

Return the total number of objects across all partitions.

Return type:

int

get_partition_range(i)[source]

Get the start and end indices for partition i.

Return type:

tuple

get_starts()[source]

Return the starts of each partition.

Return type:

Optional[NamedList]

property names: Names | None

Alias for get_names, provided for back-compatibility.

nobj()[source]

Alias for get_nobj.

Return type:

int

set_names(names, in_place=False)[source]

Set the names of list elements.

Parameters:
  • names (Optional[List[str]]) –

    New names; the length must match the number of partitions.

    May be None to remove names.

  • in_place (bool) – Whether to modify the Partitioning in place.

Return type:

Partitioning

Returns:

A modified Partitioning object, either as a copy of the original or as a reference to the (in-place-modified) original.

property starts: Names | None

Alias for get_starts, provided for back-compatibility.

Module contents