biocutils package
Submodules
biocutils.BooleanList module
- class biocutils.BooleanList.BooleanList(data=None, names=None, _validate=True)
Bases: NamedList
List of booleans. This mimics a regular Python list except that anything added to it will be coerced into a boolean. None values are also acceptable and are treated as missing booleans. The list may also be named (see NamedList), which provides some dictionary-like functionality.
- safe_append(value, in_place=False)
Calls safe_append() after coercing value to a boolean.
- safe_extend(other, in_place=True)
Calls safe_extend() after coercing elements of other to booleans.
- safe_insert(index, value, in_place=False)
Calls safe_insert() after coercing value to a boolean.
- set_slice(index, value, in_place=False)
Calls set_slice() after coercing value to booleans.
- set_value(index, value, in_place=False)
Calls set_value() after coercing value to a boolean.
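The coercion rule described for these methods can be pictured with a small stand-alone sketch. It does not import biocutils; the helper name coerce_optional_bool is hypothetical, and it assumes that coercion amounts to calling bool() on anything other than None.

```python
from typing import Any, Optional


def coerce_optional_bool(value: Any) -> Optional[bool]:
    """Hypothetical helper: coerce to bool, keeping None as a missing value."""
    if value is None:
        return None  # missing booleans stay missing
    return bool(value)


values = [1, 0, "yes", "", None]
coerced = [coerce_optional_bool(v) for v in values]
print(coerced)  # [True, False, True, False, None]
```

The same pattern would apply to the float, integer and string lists, with bool() swapped for float(), int() or str().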
biocutils.Factor module
- class biocutils.Factor.Factor(codes, levels, ordered=False, names=None, _validate=True)
Bases: object
Factor class, equivalent to R’s factor.
This is a vector of integer codes, each of which is an index into a list of unique strings. The aim is to encode a list of strings as integers for easier numerical analysis.
- __eq__(other)
- Parameters:
other (Factor) – Another Factor.
- Returns:
Whether the current object is equal to other, i.e., same codes, levels, names and ordered status.
- __getitem__(index)
If index is a scalar, this is an alias for get_value().
If index is a sequence, this is an alias for get_slice().
- __hash__ = None
- __iter__()
- Returns:
An iterator over the factor. This will iterate over the codes and report the corresponding level (or None).
- __setitem__(index, value)
If index is a scalar, this is an alias for set_value().
If index is a sequence, this is an alias for set_slice().
- property codes: ndarray
Alias for get_codes().
- drop_unused_levels(in_place=False)
Drop unused levels.
- Parameters:
in_place (bool) – Whether to perform this modification in-place.
- Returns:
If in_place = False, returns same type as caller (a new Factor object) where all unused levels have been removed.
If in_place = True, unused levels are removed from the current object; a reference to the current object is returned.
- static from_sequence(x, levels=None, sort_levels=True, ordered=False, names=None, **kwargs)
Convert a sequence of hashable values into a factor.
- Parameters:
x (Sequence[str]) – A sequence of strings. Any value may be None to indicate missingness.
levels (Optional[Sequence[str]]) – Sequence of reference levels, against which the entries in x are compared. If None, this defaults to all unique values of x.
sort_levels (bool) – Whether to sort the automatically-determined levels. If False, the levels are kept in order of their appearance in x. Not used if levels is explicitly supplied.
ordered (bool) – Whether the levels should be assumed to be ordered. Note that this refers to their importance and has nothing to do with their sorting order or with the setting of sort_levels.
names (Optional[Sequence[str]]) – List of names. This should have same length as x. Alternatively None, if the factor has no names.
kwargs – Further arguments to pass to factorize().
- Returns:
A Factor object.
- get_codes()
- Returns:
Array of integer codes, used as indices into the levels from get_levels(). Missing values are marked with -1.
This should be treated as a read-only reference. To modify the codes, use set_codes() instead.
- get_levels()
- Returns:
List of strings containing the factor levels.
This should be treated as a read-only reference. To modify the levels, use replace_levels() instead.
- get_names()
- Returns:
Names for the factor elements.
This should be treated as a read-only reference. To modify the names, use set_names() instead.
- get_slice(index)
- Parameters:
index (Union[slice, range, Sequence, int, str, bool, NormalizedSubscript]) – Subset of elements to obtain, see normalize_subscript() for details. Strings are matched to names in the current object, using the first occurrence if duplicate names are present. Scalars are treated as length-1 sequences.
- Returns:
A Factor is returned containing the specified subset.
- get_value(index)
- Parameters:
index (Union[str, int]) – Integer index of the element to obtain. Alternatively, a string containing the name of the element, using the first occurrence if duplicate names are present.
- Returns:
The factor level for the code at the specified position, or None if the entry is missing.
- property levels: StringList
Alias for get_levels().
- property names: Names
Alias for get_names().
- property ordered: bool
Alias for get_ordered().
- remap_levels(levels, in_place=False)
Remap codes to a replacement list of levels. Each entry of the remapped Factor will refer to the same string across the old and new levels, provided that string is present in both sets of levels. (To change the levels without altering the codes of the Factor, use replace_levels() instead.)
- Parameters:
levels (Union[str, Sequence[str]]) – A sequence of replacement levels. These should be unique strings with no missing values.
Alternatively a single string containing an existing level in this object. The new levels are defined as a permutation of the existing levels where the provided string is now the first level. The order of all other levels is preserved.
in_place (bool) – Whether to perform this modification in-place.
- Returns:
If in_place = False, returns same type as caller (a new Factor object) where the levels have been replaced. This will automatically update the codes so that they still refer to the same string in the new levels. If a code refers to a level that is not present in the new levels, it is set to a missing value.
If in_place = True, the levels are replaced in the current object, and a reference to the current object is returned.
- replace_levels(levels, in_place=False)
Replace the existing levels with a new list. The codes of the returned Factor are unchanged by this method and will index into the replacement levels, so each element of the Factor may refer to a different string after the levels are replaced. (To change the levels while ensuring that each element of the Factor refers to the same string, use remap_levels() instead.)
- Returns:
If in_place = False, returns same type as caller (a new Factor object) where the levels have been replaced. Codes are unchanged and may refer to different strings.
If in_place = True, the levels are replaced in the current object, and a reference to the current object is returned.
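The difference between remapping and replacing levels is easiest to see on plain lists of codes. The following is a minimal sketch of the behaviour described above, not the library's implementation: it does not import biocutils, the function names are hypothetical, and codes are held in ordinary Python lists with -1 marking a missing value.

```python
from typing import List, Optional, Sequence, Tuple


def replace_levels_sketch(codes: Sequence[int], new_levels: Sequence[str]) -> Tuple[List[int], List[str]]:
    # Codes are left untouched; they now index into the new levels.
    return list(codes), list(new_levels)


def remap_levels_sketch(
    codes: Sequence[int], old_levels: Sequence[str], new_levels: Sequence[str]
) -> Tuple[List[int], List[str]]:
    # Codes are rewritten so each element keeps pointing at the same string.
    position = {level: i for i, level in enumerate(new_levels)}
    new_codes = []
    for code in codes:
        if code < 0:
            new_codes.append(-1)  # already missing
        else:
            # Levels absent from new_levels become missing (-1).
            new_codes.append(position.get(old_levels[code], -1))
    return new_codes, list(new_levels)


def decode(codes: Sequence[int], levels: Sequence[str]) -> List[Optional[str]]:
    return [None if c < 0 else levels[c] for c in codes]


codes = [0, 1, 2, 0, -1]
levels = ["A", "B", "C"]

remapped_codes, remapped_levels = remap_levels_sketch(codes, levels, ["C", "A"])
print(decode(remapped_codes, remapped_levels))  # ['A', None, 'C', 'A', None]

replaced_codes, replaced_levels = replace_levels_sketch(codes, ["x", "y", "z"])
print(decode(replaced_codes, replaced_levels))  # ['x', 'y', 'z', 'x', None]
```

After remapping, every surviving element still decodes to its original string; after replacing, the codes are identical but decode to different strings.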
- set_codes(codes, in_place=False)
- Returns:
A modified Factor object with the new codes, either as a new object or as a reference to the current object.
- set_levels(levels, remap=True, in_place=False)
Alias for remap_levels() if remap = True, otherwise an alias for replace_levels(). The first alias is deprecated and remap_levels() should be used directly if that is the intent.
- set_slice(index, value, in_place=False)
Replace items in the Factor list. The index elements in the current object are replaced with the corresponding values in value. This is performed by finding the level for each entry of the replacement value, matching it to a level in the current object, and replacing the entry of codes with the code of the matched level. If there is no matching level, a missing value is inserted.
- Parameters:
index (Union[slice, range, Sequence, int, str, bool, NormalizedSubscript]) – Subset of elements to replace, see normalize_subscript() for details. Strings are matched to names in the current object, using the first occurrence if duplicate names are present. Scalars are treated as length-1 sequences.
value (Factor) – A Factor of the same length containing the replacement values.
in_place (bool) – Whether the replacement should be performed in place.
- Returns:
A Factor object with values at index replaced by value. This is either a new object or a reference to the current object, depending on in_place.
- set_value(index, value, in_place=False)
- Parameters:
index (Union[str, int]) – Integer index of the element to replace. Alternatively, a string containing the name of the element, using the first occurrence if duplicate names are present.
value (Optional[str]) – Replacement value. This should be a string corresponding to a factor level, or None if missing.
in_place (bool) – Whether to perform the modification in place.
- Returns:
A Factor object with the modified entry at index. This is either a new object or a reference to the current object.
- to_pandas()
Coerce to Categorical object.
- Returns:
A Categorical object.
- Return type:
Categorical
biocutils.FloatList module
- class biocutils.FloatList.FloatList(data=None, names=None, _validate=True)
Bases: NamedList
List of floats. This mimics a regular Python list except that anything added to it will be coerced into a float. None values are also acceptable and are treated as missing floats. The list may also be named (see NamedList), which provides some dictionary-like functionality.
- __annotations__ = {}
- safe_append(value, in_place=False)
Calls safe_append() after coercing value to a float.
- safe_extend(other, in_place=True)
Calls safe_extend() after coercing elements of other to floats.
- safe_insert(index, value, in_place=False)
Calls safe_insert() after coercing value to a float.
- set_slice(index, value, in_place=False)
Calls set_slice() after coercing value to floats.
- set_value(index, value, in_place=False)
Calls set_value() after coercing value to a float.
biocutils.IntegerList module
- class biocutils.IntegerList.IntegerList(data=None, names=None, _validate=True)
Bases: NamedList
List of integers. This mimics a regular Python list except that anything added to it will be coerced into an integer. None values are also acceptable and are treated as missing integers. The list may also be named (see NamedList), which provides some dictionary-like functionality.
- __annotations__ = {}
- safe_append(value, in_place=False)
Calls safe_append() after coercing value to an integer.
- safe_extend(other, in_place=True)
Calls safe_extend() after coercing elements of other to integers.
- safe_insert(index, value, in_place=False)
Calls safe_insert() after coercing value to an integer.
- set_slice(index, value, in_place=False)
Calls set_slice() after coercing value to integers.
- set_value(index, value, in_place=False)
Calls set_value() after coercing value to an integer.
biocutils.NamedList module
- class biocutils.NamedList.NamedList(data=None, names=None, _validate=True)
Bases: object
A list-like object that could have names for each element, equivalent to R’s named list. This combines list and dictionary functionality, e.g., it can be indexed by position or slices (list) but also by name (dictionary).
- __add__(other)
Alias for safe_extend().
- __annotations__ = {}
- __deepcopy__(memo=None, _nil=[])
- Parameters:
memo – See deepcopy() for details.
_nil – See deepcopy() for details.
- Returns:
A deep copy of a NamedList with the same contents.
- __getitem__(index)
If index is a scalar, this is an alias for get_value().
If index is a sequence, this is an alias for get_slice().
- __hash__ = None
- __iadd__(other)
Alias for extend(), returning a reference to the current object after the in-place modification.
- __setitem__(index, value)
If index is a scalar, this is an alias for set_value() with in_place = True.
If index is a sequence, this is an alias for set_slice() with in_place = True.
- append(value)
Alias for safe_append() with in_place = True.
- as_list()
- Returns:
The underlying list of elements.
The returned object should be treated as a read-only reference.
- extend(other)
Alias for safe_extend() with in_place = True.
- get_names()
- Returns:
Names for the list elements.
The returned object should be treated as a read-only reference. To modify the names, use set_names() instead.
- get_slice(index)
- Parameters:
index (Union[slice, range, Sequence, int, str, bool, NormalizedSubscript]) – Subset of elements to obtain, see normalize_subscript() for details. Strings are matched to names in the current object, using the first occurrence if duplicate names are present. Scalars are treated as length-1 sequences.
- Returns:
A NamedList is returned containing the specified subset.
- insert(index, value)
Alias for safe_insert() with in_place = True.
- property names: Names
Alias for get_names().
- safe_append(value, in_place=False)
- Returns:
A NamedList where value is added to the end. If in_place = False, this is a new object, otherwise it is a reference to the current object. If names are present in the current object, the newly added element has its name set to an empty string.
- safe_extend(other, in_place=False)
- Returns:
A NamedList where items in other are added to the end. If in_place = False, this is a new object, otherwise a reference to the current object is returned.
- safe_insert(index, value, in_place=False)
- Parameters:
index (Union[int, str]) – An integer index containing a position to insert at. Alternatively, the name of the value to insert at (the first occurrence of each name is used).
value (Any) – A value to be inserted into the current object.
in_place (bool) – Whether to modify the current object in place.
- Returns:
A NamedList where value is inserted at index. This is a new object if in_place = False, otherwise it is a reference to the current object. If names are present in the current object, the newly inserted element’s name is set to an empty string.
- set_names(names, in_place=False)
- Returns:
A modified NamedList with the new names. If in_place = False, this is a new NamedList, otherwise it is a reference to the current NamedList.
- set_slice(index, value, in_place=False)
- Parameters:
index (Union[slice, range, Sequence, int, str, bool, NormalizedSubscript]) – Subset of elements to replace, see normalize_subscript() for details. Strings are matched to names in the current object, using the first occurrence if duplicate names are present. Scalars are treated as length-1 sequences.
value (Sequence) – If index is a sequence, a sequence of the same length containing values to be set at the positions in index.
If index is a scalar, any object to be used as the replacement value for the position at index.
in_place (bool) – Whether to perform the replacement in place.
- Returns:
A NamedList where the entries at index are replaced with the contents of value. If in_place = False, this is a new object, otherwise it is a reference to the current object.
Unlike set_value(), this will not add new elements if index contains names that do not already exist in the object; a missing name error is raised instead.
- set_value(index, value, in_place=False)
- Parameters:
index (Union[str, int]) – Integer index of the element to replace. Alternatively, a string containing the name of the element; we consider the first occurrence of the name if duplicates are present.
value (Any) – Replacement value of the list element.
in_place (bool) – Whether to perform the replacement in place.
- Returns:
A NamedList is returned after the value at the specified position (or with the specified name) is replaced. If in_place = False, this is a new object, otherwise it is a reference to the current object.
If index is a name that does not already exist in the current object, value is added to the end of the list, and the index is added as a new name.
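The name-handling rules of set_value() can be illustrated with two parallel lists standing in for the data and the names. This is a sketch of the documented behaviour only: it does not import biocutils, set_value_sketch is a hypothetical name, and it assumes that names are present and that copies are always returned (as with in_place = False).

```python
from typing import Any, List, Sequence, Tuple, Union


def set_value_sketch(
    data: Sequence[Any], names: Sequence[str], index: Union[int, str], value: Any
) -> Tuple[List[Any], List[str]]:
    data, names = list(data), list(names)  # work on copies
    if isinstance(index, str):
        if index in names:
            # list.index() returns the first occurrence of a duplicated name.
            data[names.index(index)] = value
        else:
            # Unknown name: append the value and register the new name.
            data.append(value)
            names.append(index)
    else:
        data[index] = value
    return data, names


data, names = [1, 2, 3], ["a", "b", "a"]

by_name = set_value_sketch(data, names, "a", 10)
print(by_name)  # ([10, 2, 3], ['a', 'b', 'a'])

new_name = set_value_sketch(data, names, "z", 99)
print(new_name)  # ([1, 2, 3, 99], ['a', 'b', 'a', 'z'])
```

A set_slice() counterpart would differ only in the unknown-name branch, where an error is raised instead of appending.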
biocutils.Names module
- class biocutils.Names.Names(names=None, _validate=True)
Bases: object
List of strings containing names. Typically used to decorate sequences, such that callers can get or set elements by name instead of position.
- __add__(other)
- Parameters:
other (list) – List of names.
- Returns:
A new Names containing the combined contents of the current object and other.
- __deepcopy__(memo=None, _nil=[])
- Parameters:
memo – See deepcopy() for details.
_nil – See deepcopy() for details.
- Returns:
A deep copy of this Names object with the same contents.
- __getitem__(index)
If index is a scalar, this is an alias for get_value.
If index is a sequence, this is an alias for get_slice.
- __hash__ = None
- __iadd__(other)
- Parameters:
other (list) – List of names.
- Returns:
The current object is modified by adding other to its names.
- __iter__()
- Return type:
list_iterator
- Returns:
An iterator on the underlying list of names.
- __setitem__(index, value)
If index is a scalar, this is an alias for set_value with in_place = True.
If index is a sequence, this is an alias for set_slice with in_place = True.
- append(value)
Alias for safe_append with in_place = True.
- extend(value)
Alias for safe_extend with in_place = True.
- get_slice(index)
- Parameters:
index (Union[slice, range, Sequence, int, bool, NormalizedSubscript]) – Positions of interest, see the allowed indices in normalize_subscript() for more details. Scalars are treated as length-1 sequences.
- Returns:
A Names object containing the names at the specified positions.
- insert(index, value)
Alias for safe_insert with in_place = True.
- set_slice(index, value, in_place=False)
- Returns:
A modified Names object with the replacement name, either as a new object or as a reference to the current object.
biocutils.StringList module
- class biocutils.StringList.StringList(data=None, names=None, _validate=True)
Bases: NamedList
List of strings. This mimics a regular Python list except that anything added to it will be coerced into a string. None values are also acceptable and are treated as missing strings. The list may also be named (see NamedList), which provides some dictionary-like functionality.
- __annotations__ = {}
- safe_append(value, in_place=False)
Calls safe_append() after coercing value to a string.
- safe_extend(other, in_place=True)
Calls safe_extend() after coercing elements of other to strings.
- safe_insert(index, value, in_place=False)
Calls safe_insert() after coercing value to a string.
- set_slice(index, value, in_place=False)
Calls set_slice() after coercing value to strings.
- set_value(index, value, in_place=False)
Calls set_value() after coercing value to a string.
biocutils.assign module
- biocutils.assign.assign(x, indices, replacement)
Generic assign that checks if the objects are n-dimensional for n > 1 (i.e. has a shape property of length greater than 1); if so, it calls assign_rows() to assign them along the first dimension, otherwise it assumes that they are vector-like and calls assign_sequence() instead.
biocutils.assign_rows module
- biocutils.assign_rows.assign_rows(x, indices, replacement)
Assign replacement values to a copy of x at the rows specified by indices. This defaults to creating a deep copy of x and then assigning replacement to the first dimension of the copy.
- Parameters:
x (Any) – Any high-dimensional object.
indices (Sequence[int]) – Sequence of non-negative integers specifying rows of x.
replacement (Any) – Replacement values to be assigned to x. This should have the same number of rows as the length of indices. Typically replacement will have the same dimensionality as x.
- Returns:
A copy of x with the rows replaced at indices.
biocutils.assign_sequence module
- biocutils.assign_sequence.assign_sequence(x, indices, replacement)
Assign replacement values to a copy of x at the specified indices. This defaults to creating a deep copy of x and then iterating through indices to assign the values of replacement.
- Returns:
A copy of x with the replacement values.
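The default strategy described above (deep copy, then assign element by element) can be sketched in a few lines. This is an illustration that does not import biocutils; assign_sequence_sketch is a hypothetical name, and it assumes x supports item assignment by integer index.

```python
import copy
from typing import Any, Sequence


def assign_sequence_sketch(x: Any, indices: Sequence[int], replacement: Sequence[Any]) -> Any:
    output = copy.deepcopy(x)  # the input is never modified
    for position, index in enumerate(indices):
        output[index] = replacement[position]
    return output


original = [10, 20, 30, 40]
updated = assign_sequence_sketch(original, [1, 3], ["b", "d"])
print(updated)   # [10, 'b', 30, 'd']
print(original)  # [10, 20, 30, 40]
```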
biocutils.combine module
- biocutils.combine.combine(*x)
Generic combine that checks if the objects are n-dimensional for n > 1 (i.e. has a shape property of length greater than 1); if so, it calls combine_rows() to combine them by the first dimension, otherwise it assumes that they are vector-like and calls combine_sequences() instead.
- Parameters:
x (Any) – Objects to combine.
- Returns:
A combined object, typically the same type as the first element in x.
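The dispatch rule shared by the generic functions (assign, combine, subset) hinges on the length of the shape property. The sketch below shows only that decision and does not import biocutils; FakeMatrix, is_high_dimensional_sketch and choose_combiner are hypothetical names, and the handling of mixed inputs is an assumption since it is not described here.

```python
from typing import Any


def is_high_dimensional_sketch(x: Any) -> bool:
    # "n-dimensional for n > 1" means a shape property with more than one entry.
    shape = getattr(x, "shape", None)
    return shape is not None and len(shape) > 1


class FakeMatrix:
    """Hypothetical stand-in for a 2-dimensional object such as a NumPy array."""

    def __init__(self, nrow: int, ncol: int) -> None:
        self.shape = (nrow, ncol)


def choose_combiner(*x: Any) -> str:
    # Assumption: rows are combined only when every input is high-dimensional.
    if x and all(is_high_dimensional_sketch(y) for y in x):
        return "combine_rows"
    return "combine_sequences"


print(choose_combiner(FakeMatrix(2, 3), FakeMatrix(1, 3)))  # combine_rows
print(choose_combiner([1, 2], [3]))                         # combine_sequences
```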
biocutils.combine_columns module
- biocutils.combine_columns.combine_columns(*x)
Combine n-dimensional objects along the second dimension.
If all elements are ndarray, we combine them using numpy’s concatenate().
If all elements are either spmatrix or sparray, these objects are combined using scipy’s hstack.
If all elements are DataFrame objects, they are combined using concat() along the second axis.
- Parameters:
x (Any) – n-dimensional objects to combine. All elements of x are expected to be the same class.
- Returns:
Combined object, typically the same type as the first entry of x.
biocutils.combine_rows module
- biocutils.combine_rows.combine_rows(*x)
Combine n-dimensional objects along their first dimension.
If all elements are ndarray, we combine them using numpy’s concatenate().
If all elements are either spmatrix or sparray, these objects are combined using scipy’s vstack.
If all elements are DataFrame objects, they are combined using concat() along the first axis.
- Parameters:
x (Any) – One or more n-dimensional objects to combine. All elements of x are expected to be the same class.
- Returns:
Combined object, typically the same type as the first entry of x.
biocutils.combine_sequences module
- biocutils.combine_sequences.combine_sequences(*x)
Combine vector-like objects (1-dimensional arrays).
If all elements are ndarray, we combine them using numpy’s concatenate().
If all elements are Series objects, they are combined using concat().
For all other scenarios, all elements are coerced to a list and combined.
- Parameters:
x (Any) – Vector-like objects to combine. All elements of x are expected to be the same class or at least compatible with each other.
- Returns:
A combined object, ideally of the same type as the first element in x.
biocutils.convert_to_dense module
biocutils.extract_column_names module
biocutils.extract_row_names module
biocutils.factorize module
- biocutils.factorize.factorize(x, levels=None, sort_levels=False, dtype=None, fail_missing=None)
Convert a sequence of hashable values into a factor.
- Parameters:
x (Sequence) – A sequence of hashable values. Any value may be None to indicate missingness.
levels (Optional[Sequence]) – Sequence of reference levels, against which the entries in x are compared. If None, this defaults to all unique values of x.
sort_levels (bool) – Whether to sort the automatically-determined levels. If False, the levels are kept in order of their appearance in x. Not used if levels is explicitly supplied.
dtype (Optional[dtype]) – NumPy type of the array of indices, see match() for details.
fail_missing (Optional[bool]) – Whether to raise an error upon encountering missing levels in x, see match() for details.
- Returns:
Tuple where the first element is a list of unique levels and the second element is a NumPy array containing integer codes, i.e., indices into the first list. Indexing the first list by the second array will recover x, with the exception of any None or masked values in x that will instead be represented by -1 in the second array.
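The relationship between levels and codes can be sketched in pure Python. This is not the library's implementation and does not import biocutils: factorize_sketch is a hypothetical name, codes are returned as a plain list rather than a NumPy array, only None is treated as missing, and the dtype and fail_missing options are left out.

```python
from typing import Any, Hashable, List, Optional, Sequence, Tuple


def factorize_sketch(
    x: Sequence[Optional[Hashable]],
    levels: Optional[Sequence[Hashable]] = None,
    sort_levels: bool = False,
) -> Tuple[List[Any], List[int]]:
    if levels is None:
        # Unique non-missing values, in order of first appearance.
        levels = list(dict.fromkeys(v for v in x if v is not None))
        if sort_levels:
            levels = sorted(levels)
    # Map each level to its position, keeping the first occurrence.
    index = {}
    for i, level in enumerate(levels):
        index.setdefault(level, i)
    # Missing values and values absent from the levels are coded as -1.
    codes = [-1 if v is None else index.get(v, -1) for v in x]
    return list(levels), codes


x = ["b", "a", None, "b", "c"]
levels_unsorted, codes_unsorted = factorize_sketch(x)
print(levels_unsorted, codes_unsorted)  # ['b', 'a', 'c'] [0, 1, -1, 0, 2]

levels_sorted, codes_sorted = factorize_sketch(x, sort_levels=True)
print(levels_sorted, codes_sorted)  # ['a', 'b', 'c'] [1, 0, -1, 1, 2]
```

Indexing the levels by each non-negative code recovers the original values, which is the round-trip property described in the return value above.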
biocutils.get_height module
biocutils.intersect module
- biocutils.intersect.intersect(*x, duplicate_method='first')
Identify the intersection of values in multiple sequences, while preserving the order of values in the first sequence.
- Parameters:
x (Sequence) – Zero, one or more sequences of interest containing hashable values. We ignore missing values as defined by is_missing_scalar().
duplicate_method (Literal['first', 'last']) – Whether to keep the first or last occurrence of duplicated values when preserving order in the first sequence.
- Returns:
Intersection of values across all x.
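A pure-Python sketch of the ordering rule may help. It does not import biocutils; intersect_sketch is a hypothetical name, and only None is treated as missing, which is narrower than whatever is_missing_scalar() covers.

```python
from typing import Hashable, List, Optional, Sequence


def intersect_sketch(*x: Sequence[Optional[Hashable]], duplicate_method: str = "first") -> List[Hashable]:
    if not x:
        return []
    first = [v for v in x[0] if v is not None]
    others = [{v for v in seq if v is not None} for seq in x[1:]]
    if duplicate_method == "last":
        first.reverse()  # walk backwards so the last occurrence is seen first
    seen = set()
    output = []
    for value in first:
        if value not in seen and all(value in other for other in others):
            seen.add(value)
            output.append(value)
    if duplicate_method == "last":
        output.reverse()  # restore the order of the first sequence
    return output


a = ["b", "a", "b", "c", None]
b = ["a", "b", "d"]

keep_first = intersect_sketch(a, b)
print(keep_first)  # ['b', 'a']

keep_last = intersect_sketch(a, b, duplicate_method="last")
print(keep_last)  # ['a', 'b']
```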
biocutils.is_high_dimensional module
biocutils.is_list_of_type module
- biocutils.is_list_of_type.is_list_of_type(x, target_type, ignore_none=False)
Checks if x is a list, and whether all elements of the list are of the same type.
- Returns:
True if x is a list or tuple and all elements are of the target type (or None, if ignore_none = True). Otherwise, False.
biocutils.is_missing_scalar module
biocutils.map_to_index module
- biocutils.map_to_index.map_to_index(x, duplicate_method='first')
Create a dictionary to map values of a sequence to positional indices.
- Parameters:
x (Sequence) – Sequence of hashable values. We ignore missing values defined by is_missing_scalar().
duplicate_method (Literal['first', 'last']) – Whether to consider the first or last occurrence of a duplicated value in x.
- Returns:
Dictionary that maps values of x to their position inside x.
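The effect of duplicate_method is easiest to see in a small sketch. It does not import biocutils; map_to_index_sketch is a hypothetical name, and only None is treated as missing.

```python
from typing import Dict, Hashable, Optional, Sequence


def map_to_index_sketch(x: Sequence[Optional[Hashable]], duplicate_method: str = "first") -> Dict[Hashable, int]:
    mapping: Dict[Hashable, int] = {}
    for position, value in enumerate(x):
        if value is None:
            continue  # missing values are skipped
        # "first" keeps the earliest position; "last" overwrites with later ones.
        if duplicate_method == "last" or value not in mapping:
            mapping[value] = position
    return mapping


x = ["a", "b", None, "a"]
first_map = map_to_index_sketch(x)
print(first_map)  # {'a': 0, 'b': 1}

last_map = map_to_index_sketch(x, duplicate_method="last")
print(last_map)  # {'a': 3, 'b': 1}
```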
biocutils.match module
- biocutils.match.match(x, targets, duplicate_method='first', dtype=None, fail_missing=None)
Find a matching value of each element of x in targets.
- Parameters:
x (Sequence) – Sequence of values to match.
targets (Union[dict, Sequence]) – Sequence of targets to be matched against. Alternatively, a dictionary generated by passing a sequence of targets to map_to_index().
duplicate_method (Literal['first', 'last']) – How to handle duplicate entries in targets. Matches can be reported to the first or last occurrence of duplicates.
dtype (Optional[ndarray]) – NumPy type of the output array. This should be an integer type; if missing values are expected, the type should be a signed integer. If None, a suitable signed type is automatically determined.
fail_missing (Optional[bool]) – Whether to raise an error if x cannot be found in targets. If None, this defaults to True if dtype is an unsigned type, otherwise it defaults to False.
- Returns:
Array of length equal to x, containing the integer position of each entry of x inside targets; or -1, if the entry of x is None or cannot be found in targets.
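The matching rule can be sketched without NumPy as follows. This does not import biocutils; match_sketch is a hypothetical name, the result is a plain list instead of an array, only None is treated as missing, and the dtype and fail_missing options are omitted.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Union


def match_sketch(
    x: Sequence[Optional[Hashable]],
    targets: Union[Dict[Hashable, int], Sequence[Optional[Hashable]]],
    duplicate_method: str = "first",
) -> List[int]:
    if isinstance(targets, dict):
        mapping = targets  # already a value-to-position dictionary
    else:
        mapping = {}
        for position, value in enumerate(targets):
            if value is None:
                continue
            if duplicate_method == "last" or value not in mapping:
                mapping[value] = position
    # Missing or unmatched entries are reported as -1.
    return [-1 if value is None else mapping.get(value, -1) for value in x]


targets = ["a", "b", "c", "a"]
first_hits = match_sketch(["c", "a", None, "z"], targets)
print(first_hits)  # [2, 0, -1, -1]

last_hits = match_sketch(["c", "a", None, "z"], targets, duplicate_method="last")
print(last_hits)  # [2, 3, -1, -1]
```

Passing a prebuilt dictionary avoids rebuilding the lookup when the same targets are matched against repeatedly.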
biocutils.normalize_subscript module
- class biocutils.normalize_subscript.NormalizedSubscript(subscript)
Bases: object
Subscript normalized by normalize_subscript(). This is used to indicate that no further normalization is required, such that normalize_subscript() is just a no-op.
- biocutils.normalize_subscript.normalize_subscript(sub, length, names=None, non_negative_only=True)
Normalize a subscript for __getitem__ or friends into a sequence of integer indices, for consistent downstream use.
- Parameters:
sub (Union[slice, range, Sequence, int, str, bool, NormalizedSubscript]) – The subscript. This can be any of the following:
  - A slice.
  - A range containing indices to elements. Negative values are allowed. An error is raised if the indices are out of range.
  - A single integer specifying the index of an element. A negative value is allowed. An error is raised if the index is out of range.
  - A single string that can be found in names, which is converted to the index of the first occurrence of that string in names. An error is raised if the string cannot be found.
  - A single boolean, which is converted into a list containing the first element if true, and an empty list if false.
  - A sequence of strings, integers and/or booleans. Strings are converted to indices based on first occurrence in names, as described above. Integers should be indices to an element. Each truthy boolean is converted to an index equal to its position in sub, and each falsy boolean is ignored.
  - A NormalizedSubscript, in which case the subscript property is directly returned.
length (int) – Length of the object.
names (Optional[Sequence[str]]) – List of names for each entry in the object. If not None, this should have length equal to length. Some optimizations are possible if this is a Names object.
non_negative_only (bool) – Whether negative indices must be converted into non-negative equivalents. Setting this to False may improve efficiency.
- Returns:
A tuple containing (i) a sequence of integer indices in [0, length) specifying the subscript elements, and (ii) a boolean indicating whether sub was a scalar.
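The normalization rules listed above can be approximated in pure Python. This is a partial sketch that does not import biocutils: normalize_subscript_sketch is a hypothetical name, negative indices are always converted, and the single-boolean and NormalizedSubscript cases are deliberately left out.

```python
from typing import Any, List, Optional, Sequence, Tuple


def normalize_subscript_sketch(
    sub: Any, length: int, names: Optional[Sequence[str]] = None
) -> Tuple[List[int], bool]:
    def from_int(i: int) -> int:
        if i < -length or i >= length:
            raise IndexError(f"subscript {i} is out of range")
        return i + length if i < 0 else i

    def from_name(s: str) -> int:
        if names is None or s not in names:
            raise KeyError(f"cannot find name {s!r}")
        return list(names).index(s)  # first occurrence

    if isinstance(sub, bool):  # bool is a subclass of int, so check it first
        raise NotImplementedError("single booleans are left out of this sketch")
    if isinstance(sub, int):
        return [from_int(sub)], True
    if isinstance(sub, str):
        return [from_name(sub)], True
    if isinstance(sub, slice):
        return list(range(*sub.indices(length))), False

    indices = []
    for position, entry in enumerate(sub):  # ranges and other sequences
        if isinstance(entry, bool):
            if entry:
                indices.append(position)  # truthy booleans select their own position
        elif isinstance(entry, str):
            indices.append(from_name(entry))
        else:
            indices.append(from_int(entry))
    return indices, False


print(normalize_subscript_sketch(-1, 4))                                  # ([3], True)
print(normalize_subscript_sketch(slice(1, None), 4))                      # ([1, 2, 3], False)
print(normalize_subscript_sketch(["b", 0, True], 3, names=["a", "b", "c"]))  # ([1, 0, 2], False)
```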
biocutils.package_utils module
biocutils.print_truncated module
- biocutils.print_truncated.print_truncated(x, truncated_to=3, full_threshold=10)
Pretty-print an object, replacing the middle elements of lists/dictionaries with an ellipsis if there are too many. This provides a useful preview of an object without spewing out all of its contents on the screen.
- Returns:
String containing the pretty-printed contents.
- biocutils.print_truncated.print_truncated_dict(x, truncated_to=3, full_threshold=10, transform=None, sep=', ', include_brackets=True)
Pretty-print a dictionary, replacing the middle elements with an ellipsis if there are too many. This provides a useful preview of an object without spewing out all of its contents on the screen.
- Parameters:
x (Dict) – Dictionary to be printed.
truncated_to (int) – Number of elements to truncate to, at the start and end of the sequence. This should be less than half of full_threshold.
full_threshold (int) – Threshold on the number of elements, below which the list is shown in its entirety.
transform (Optional[Callable]) – Optional transformation to apply to the values of x after truncation but before printing. Defaults to print_truncated() if not supplied.
sep (str) – Separator between elements in the printed list.
include_brackets (bool) – Whether to include the start/end brackets.
- Returns:
String containing the pretty-printed truncated dict.
- biocutils.print_truncated.print_truncated_list(x, truncated_to=3, full_threshold=10, transform=None, sep=', ', include_brackets=True)
Pretty-print a list, replacing the middle elements with an ellipsis if there are too many. This provides a useful preview of an object without spewing out all of its contents on the screen.
- Parameters:
x (List) – List to be printed.
truncated_to (int) – Number of elements to truncate to, at the start and end of the list. This should be less than half of full_threshold.
full_threshold (int) – Threshold on the number of elements, below which the list is shown in its entirety.
transform (Optional[Callable]) – Optional transformation to apply to the elements of x after truncation but before printing. Defaults to print_truncated() if not supplied.
sep (str) – Separator between elements in the printed list.
include_brackets (bool) – Whether to include the start/end brackets.
- Returns:
String containing the pretty-printed truncated list.
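How truncated_to and full_threshold interact can be sketched as below. This does not import biocutils and is not the library's code: print_truncated_list_sketch is a hypothetical name, repr() stands in for the default transform, the exact ellipsis text is a guess, and treating a list of exactly full_threshold elements as short enough to print in full is an assumption.

```python
from typing import Any, List


def print_truncated_list_sketch(
    x: List[Any],
    truncated_to: int = 3,
    full_threshold: int = 10,
    sep: str = ", ",
    include_brackets: bool = True,
) -> str:
    if len(x) > full_threshold:
        head = [repr(v) for v in x[:truncated_to]]
        tail = [repr(v) for v in x[len(x) - truncated_to:]]
        parts = head + ["..."] + tail  # middle elements collapse to an ellipsis
    else:
        parts = [repr(v) for v in x]
    body = sep.join(parts)
    return "[" + body + "]" if include_brackets else body


long_preview = print_truncated_list_sketch(list(range(20)))
print(long_preview)  # [0, 1, 2, ..., 17, 18, 19]

short_preview = print_truncated_list_sketch(list(range(5)))
print(short_preview)  # [0, 1, 2, 3, 4]
```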
biocutils.print_wrapped_table module
- biocutils.print_wrapped_table.create_floating_names(names, indices)
Create the floating names to use in print_wrapped_table(). If no names are present, positional indices are used instead.
- biocutils.print_wrapped_table.print_type(x)
Print the type of an object, with some special behavior for certain classes (e.g., to add the data type of NumPy arrays). This is intended for display at the top of the columns of print_wrapped_table().
- Parameters:
x – Some object.
- Returns:
String containing the class of the object.
- biocutils.print_wrapped_table.print_wrapped_table(columns, floating_names=None, sep=' ', window=None)
Pretty-print a table with aligned and wrapped columns. All column contents are padded so that they are right-justified. Wrapping is performed whenever a new column would exceed the window width, in which case the entire column (and all subsequent columns) are printed below the previous columns.
- Parameters:
columns (List[Sequence[str]]) – List of lists of strings, where each inner list is the same length and contains the visible contents of a column. Strings are typically generated by calling repr() on data column values.
Callers are responsible for inserting ellipses, adding column type information (e.g., with print_type()) or truncating long strings (e.g., with truncate_strings()).
floating_names (Optional[Sequence[str]]) – List of strings to be added to the left of the table. This is printed repeatedly for each set of wrapped columns.
See also create_floating_names().
sep (str) – Separator between columns.
window (Optional[int]) – Size of the terminal window, in characters. We attempt to determine this automatically, otherwise it is set to 150.
- Returns:
String containing the pretty-printed table.
- biocutils.print_wrapped_table.truncate_strings(values, width=40)
Truncate long strings for printing in print_wrapped_table().
biocutils.relaxed_combine_columns module
- biocutils.relaxed_combine_columns.relaxed_combine_columns(*x)
Combine n-dimensional objects along the second dimension.
- Parameters:
x (Any) – n-dimensional objects to combine. All elements of x are expected to be the same class.
- Returns:
Combined object, typically the same type as the first entry of x.
biocutils.relaxed_combine_rows module
- biocutils.relaxed_combine_rows.relaxed_combine_rows(*x)
Combine n-dimensional objects along their first dimension.
- Parameters:
x (Any) – One or more n-dimensional objects to combine. All elements of x are expected to be the same class.
- Returns:
Combined object, typically the same type as the first entry of x.
biocutils.reverse_index module
biocutils.show_as_cell module
- biocutils.show_as_cell.show_as_cell(x, indices)
Show the contents of x as a cell of a table, typically for use in the __str__ method of a class that contains x.
- Returns:
List of strings of length equal to indices, containing a string summary of each of the specified elements of x.
biocutils.subset module
- biocutils.subset.subset(x, indices)
Generic subset that checks if the objects are n-dimensional for n > 1 (i.e. has a shape property of length greater than 1); if so, it calls subset_rows() to subset them along the first dimension, otherwise it assumes that they are vector-like and calls subset_sequence() instead.
- Parameters:
x (Any) – Object to be subsetted.
- Returns:
The subsetted object, typically the same type as x.
biocutils.subset_rows module
biocutils.subset_sequence module
biocutils.union module
- biocutils.union.union(*x, duplicate_method='first')
Identify the union of values in multiple sequences, while preserving the order of the first (or last) occurrence of each value.
- Parameters:
x (Sequence) – Zero, one or more sequences of interest containing hashable values. We ignore missing values as defined by is_missing_scalar().
duplicate_method (Literal['first', 'last']) – Whether to take the first or last occurrence of each value in the ordering of the output. If first, the first occurrence in the earliest sequence of x is reported; if last, the last occurrence in the latest sequence of x is reported.
- Returns:
Union of values across all x.
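The ordering rule for the union can be sketched by concatenating the inputs and de-duplicating from the appropriate end. This does not import biocutils; union_sketch is a hypothetical name, and only None is treated as missing.

```python
from typing import Hashable, List, Optional, Sequence


def union_sketch(*x: Sequence[Optional[Hashable]], duplicate_method: str = "first") -> List[Hashable]:
    flat = [value for seq in x for value in seq if value is not None]
    if duplicate_method == "last":
        flat.reverse()  # walk backwards so the last occurrence is seen first
    seen = set()
    output = []
    for value in flat:
        if value not in seen:
            seen.add(value)
            output.append(value)
    if duplicate_method == "last":
        output.reverse()  # restore left-to-right order
    return output


first_union = union_sketch(["a", "b"], ["b", "c", "a"])
print(first_union)  # ['a', 'b', 'c']

last_union = union_sketch(["a", "b"], ["b", "c", "a"], duplicate_method="last")
print(last_union)  # ['b', 'c', 'a']
```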