iranges package¶

Submodules¶

iranges.IRanges module¶

class iranges.IRanges.IRanges(start=[], width=[], names=None, mcols=None, metadata=None, validate=True)[source]¶

Bases: object

A collection of integer ranges, equivalent to the IRanges class from the Bioconductor package of the same name. It enables efficient storage and manipulation of genomic intervals defined by start positions and widths.

Each range consists of a start position and width. For genomic sequences, the start is typically 1-based, though other applications may use zero or negative values. The width represents the length of the interval. Ends are inclusive.
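The start/width/end relationship can be sketched in plain Python. This is an illustration only, not the package's code: the helper name `inclusive_ends` is hypothetical, and it assumes that an inclusive end means `end = start + width - 1`.

```python
def inclusive_ends(starts, widths):
    """Return the inclusive end of each range, assuming end = start + width - 1."""
    if len(starts) != len(widths):
        raise ValueError("starts and widths must have the same length")
    return [s + w - 1 for s, w in zip(starts, widths)]


print(inclusive_ends([1, 5, -3], [10, 1, 4]))  # [10, 5, 0]
```

Under this convention a zero-width range has an end one position before its start.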

__copy__()[source]¶

Shallow copy of the object.

Return type:: IRanges
Returns:: Same type as the caller, a shallow copy of this object.

__deepcopy__(memo)[source]¶

Deep copy of the object.

Parameters:: memo – Passed to internal deepcopy() calls.
Return type:: IRanges
Returns:: Same type as the caller, a deep copy of this object.

__getitem__(subset)[source]¶

Subset the IRanges.

Parameters:: subset (Union[Sequence, int, str, bool, slice, range]) – Integer indices, a boolean filter, or (if the current object is named) names specifying the ranges to be extracted, see normalize_subscript().
Return type:: IRanges
Returns:: A new IRanges object containing the ranges of interest.

__init__(start=[], width=[], names=None, mcols=None, metadata=None, validate=True)[source]¶

Parameters:

start (Sequence[int]) – Sequence of integers containing the start position for each range. All values should fall within the range that can be represented by a 32-bit signed integer.
width (Sequence[int]) – Sequence of integers containing the width for each range. This should be of the same length as start. All values should be non-negative and fall within the range that can be represented by a 32-bit signed integer. Similarly, start + width should not exceed the range of a 32-bit signed integer.
names (Optional[Sequence[str]]) – Sequence of strings containing the name for each range. This should have length equal to start and should only contain strings. If no names are present, None may be supplied instead.
mcols (Optional[BiocFrame]) – A data frame containing additional metadata columns for each range. This should have number of rows equal to the length of start. If None, defaults to a zero-column data frame.
metadata (Optional[dict]) – Additional metadata. If None, defaults to an empty dictionary.
validate (bool) – Whether to validate the arguments, internal use only.
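The constraints listed for start and width can be written out as explicit checks. The following is a minimal sketch in plain Python, assuming the limits are exactly those stated above; `check_ranges` is a hypothetical helper, not the validation routine used by the package.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def check_ranges(start, width):
    """Raise ValueError if the inputs break the documented constraints."""
    if len(start) != len(width):
        raise ValueError("'start' and 'width' must have the same length")
    for s, w in zip(start, width):
        if not INT32_MIN <= s <= INT32_MAX:
            raise ValueError(f"start {s} does not fit in a 32-bit signed integer")
        if not 0 <= w <= INT32_MAX:
            raise ValueError(f"width {w} must be non-negative and fit in 32 bits")
        if s + w > INT32_MAX:
            raise ValueError(f"start + width exceeds the 32-bit range for ({s}, {w})")
    return True


print(check_ranges([1, 10], [5, 0]))  # True
```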

__iter__()[source]¶

Iterator over ranges.

Return type:: IRangesIter

__len__()[source]¶

Return type:: int
Returns:: Length of this object.

__repr__()[source]¶

Return repr(self).

Return type:: str

__setitem__(args, value)[source]¶

Add or update positions (in-place operation).

Parameters:

args – Integer indices, a boolean filter, or (if the current object is named) names specifying the ranges to be replaced, see normalize_subscript().
value (IRanges) – An IRanges object of length equal to the number of ranges to be replaced, as specified by args.

Returns:

Specified ranges are replaced by value in the current object.

combine(*other)[source]¶

Combine multiple range objects into one.

Wrapper around combine_sequences().

Return type:: IRanges
Returns:: An IRanges containing all the combined ranges.

count_overlaps(query, query_type='any', max_gap=-1, min_overlap=0, delete_index=True, num_threads=1)[source]¶

Count number of overlaps for each range in query.

Parameters:

query (IRanges) – Query IRanges.
query_type (Literal['any', 'start', 'end', 'within']) –
Overlap query type, must be one of
- “any”: Any overlap is good
- “start”: Overlap at the beginning of the range
- “end”: Must overlap at the end of the range
- “within”: Fully contain the query interval
Defaults to “any”.
max_gap (int) – Maximum gap allowed in the overlap. Defaults to -1 (no gap allowed).
min_overlap (int) – Minimum overlap with query. Defaults to 0.
delete_index (bool) – Defaults to True, to delete the cached ncls index. Set to False, to reuse the index across multiple queries.
num_threads (int) – Number of threads to use. Defaults to 1.

Return type:

ndarray

Returns:

NumPy vector with the same length as query; each value is the number of ranges in self that overlap the corresponding query range.
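To make the "any" overlap type concrete, here is a naive quadratic sketch in plain Python. It assumes inclusive ends and positive widths, ignores max_gap and min_overlap, and does not use the ncls index mentioned above; `count_overlaps_any` is a hypothetical name.

```python
def count_overlaps_any(self_ranges, query_ranges):
    """Count, for each query, the ranges in self_ranges sharing at least one position.

    Both arguments are lists of (start, width) tuples with width > 0.
    """
    counts = []
    for q_start, q_width in query_ranges:
        q_end = q_start + q_width - 1
        n = 0
        for s_start, s_width in self_ranges:
            s_end = s_start + s_width - 1
            # Two inclusive intervals overlap when each starts before the other ends.
            if s_start <= q_end and q_start <= s_end:
                n += 1
        counts.append(n)
    return counts


subject = [(1, 5), (4, 4), (20, 3)]   # covers 1-5, 4-7, 20-22
queries = [(5, 1), (8, 10), (23, 2)]  # covers 5-5, 8-17, 23-24
print(count_overlaps_any(subject, queries))  # [2, 0, 0]
```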

coverage(shift=None, width=None, weight=None, circle_length=None, method='auto')[source]¶

Compute weighted coverage of ranges.

Parameters:

shift (Optional[ndarray]) – Array of shift values. Defaults to None for no shift.
width (Optional[int]) – Maximum width to clip to. Defaults to None for no clipping.
weight (Optional[ndarray]) – Array of weights. Defaults to None for equal weights for all ranges (weight = 1).
circle_length (Optional[int]) – Length of circular sequence. Defaults to None for linear sequence.
method (Literal['auto', 'sort', 'hash', 'naive']) – Coverage computation method. Defaults to “auto”.

Return type:

ndarray

Returns:

NumPy array containing coverage values.
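Weighted coverage can be illustrated with a difference array. This sketch is plain Python and makes several assumptions that the text above does not confirm: starts are at least 1, the output begins at position 1, and there is no shift, clipping, or circular sequence. It is one simple technique, not necessarily any of the methods named in the method parameter.

```python
def coverage(ranges, weights=None):
    """Sum the weights of the ranges covering each position 1..max(end).

    ranges: list of (start, width) tuples with start >= 1.
    """
    if not ranges:
        return []
    if weights is None:
        weights = [1] * len(ranges)
    length = max(s + w - 1 for s, w in ranges)
    # Difference array: +weight where a range begins, -weight just past its end.
    diff = [0] * (length + 2)
    for (s, w), wt in zip(ranges, weights):
        if w == 0:
            continue
        diff[s] += wt
        diff[s + w] -= wt
    cov, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]
        cov.append(running)
    return cov


print(coverage([(1, 3), (2, 4)]))  # [1, 2, 2, 1, 1]
```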

disjoin(with_reverse_map=False)[source]¶

Calculate disjoint ranges.

Parameters:: with_reverse_map (bool) – Whether to return a map of indices back to the original object. Defaults to False.
Return type:: IRanges
Returns:: A new IRanges containing disjoint ranges.
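One way to picture disjoining is to cut every range at every start and end, then keep the pieces that are still covered. The sketch below is plain Python under that assumption, with inclusive ends and empty ranges dropped; `disjoin` here is a stand-alone illustration, not the method itself.

```python
def disjoin(ranges):
    """Split (start, width) ranges at every start/end so the pieces never overlap."""
    ranges = [(s, w) for s, w in ranges if w > 0]
    # Breakpoints: each start, and the position just past each inclusive end.
    cuts = sorted({s for s, _ in ranges} | {s + w for s, w in ranges})
    pieces = []
    for left, right in zip(cuts, cuts[1:]):
        # Keep the piece [left, right - 1] only if some input range covers it.
        if any(s <= left and right <= s + w for s, w in ranges):
            pieces.append((left, right - left))
    return pieces


print(disjoin([(1, 5), (3, 5), (20, 2)]))  # [(1, 2), (3, 3), (6, 2), (20, 2)]
```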

disjoint_bins()[source]¶

Split ranges into a set of bins so that the ranges in each bin are disjoint.

Return type:: ndarray
Returns:: A NumPy vector indicating the bin index for each range.
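A greedy assignment is one plausible way to build such bins: visit the ranges in sorted order and place each in the first bin whose contents end before it starts. This is a plain-Python sketch with zero-based bin indices; the actual binning strategy and index base of the package are assumptions here.

```python
def disjoint_bins(ranges):
    """Assign each (start, width) range the lowest bin in which it overlaps nothing."""
    order = sorted(range(len(ranges)), key=lambda i: ranges[i])
    bin_ends = []                 # last inclusive end placed in each bin
    bins = [0] * len(ranges)
    for i in order:
        start, width = ranges[i]
        end = start + width - 1
        for b, last_end in enumerate(bin_ends):
            if last_end < start:  # fits after everything already in bin b
                bin_ends[b] = end
                bins[i] = b
                break
        else:
            bin_ends.append(end)  # no existing bin fits, so open a new one
            bins[i] = len(bin_ends) - 1
    return bins


print(disjoint_bins([(1, 5), (3, 5), (6, 2), (4, 1)]))  # [0, 1, 0, 2]
```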

distance(query)[source]¶

Calculate the pair-wise distance between ranges.

Parameters:: query (IRanges) – Query IRanges.
Return type:: ndarray
Returns:: NumPy vector containing distances for each range in query.

classmethod empty()[source]¶

Create a zero-length IRanges object.

Returns:: Same type as caller, in this case an IRanges.

property end: ndarray¶

Get all end positions (read-only).

Returns:: NumPy array of 32-bit signed integers containing the end position for all ranges.

find_overlaps(query, query_type='any', select='all', max_gap=-1, min_overlap=0, delete_index=True, num_threads=1)[source]¶

Find overlaps with query.

Parameters:

query (IRanges) – Query IRanges.
query_type (Literal['any', 'start', 'end', 'within']) –
Overlap query type, must be one of
- “any”: Any overlap is good
- “start”: Overlap at the beginning of the range
- “end”: Must overlap at the end of the range
- “within”: Fully contain the query interval
Defaults to “any”.
select (Literal['all', 'first', 'last', 'arbitrary']) –
Determine what hit to choose when there are multiple hits for a query range.

Must be one of “all”, “first”, “last”, “arbitrary”.

Defaults to “all”.
max_gap (int) – Maximum gap allowed in the overlap. Defaults to -1 (no gap allowed).
min_overlap (int) – Minimum overlap with query. Defaults to 0.
delete_index (bool) – Defaults to True, to delete the cached ncls index. Set to False, to reuse the index across multiple queries.
num_threads (int) – Number of threads to use. Defaults to 1.

Returns:

A BiocFrame with two columns:

query_hits: Indices into query ranges
self_hits: Corresponding indices into self ranges that overlap the query

Each row represents an overlapping query-self pair.

Return type:

BiocFrame

flank(width, start=True, both=False, in_place=False)[source]¶

Compute flanking ranges for each range. The logic is from the IRanges package.

If start is True for a given range, the flanking occurs at the start, otherwise the end. The widths of the flanks are given by the width parameter.

width can be negative, in which case the flanking region is reversed so that it represents a prefix or suffix of the range.

Notes

ir.flank(3, True), where “x” indicates a range in ir and “-” indicates the resulting flanking region:

---xxxxxxx

If start were False, the result becomes:

xxxxxxx---

For negative width, i.e. ir.flank(-3, False), where “*” indicates the overlap between “x” and the result:

xxxx***

If both is True, then, for all ranges in ir, the flanking regions are extended into (or out of, if width is negative) the range, so that the result straddles the given endpoint and has twice the width given by width.

This is illustrated below for ir.flank(3, both=True):

---***xxxx

Check out the documentation of the Bioconductor IRanges package for more details.

Parameters:

width (int) – Width to flank by. May be negative.
start (bool) – Whether to only flank starts. Defaults to True.
both (bool) – Whether to flank both starts and ends. Defaults to False.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the flanked ranges. Otherwise, the current object is directly modified and a reference to it is returned.
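The flanking rules described in the notes can be sketched for a single range in plain Python. This follows the diagrams above and assumes inclusive ends; the function is a stand-alone illustration, and `at_start` is a hypothetical name standing in for the `start` flag.

```python
def flank(start, width, flank_width, at_start=True, both=False):
    """Return (start, width) of the flanking region for one range."""
    end = start + width - 1          # inclusive end of the input range
    size = abs(flank_width)
    if both:
        # Straddle the chosen endpoint with `size` positions on each side.
        anchor = start if at_start else end + 1
        return anchor - size, 2 * size
    if at_start:
        # Positive: region just before the range; negative: a prefix of it.
        return (start - size, size) if flank_width >= 0 else (start, size)
    # Positive: region just after the range; negative: a suffix of it.
    return (end + 1, size) if flank_width >= 0 else (end - size + 1, size)


# A range covering positions 10-16.
print(flank(10, 7, 3))                   # (7, 3)
print(flank(10, 7, -3, at_start=False))  # (14, 3)
print(flank(10, 7, 3, both=True))        # (7, 6)
```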

follow(query, select='last', delete_index=True, num_threads=1)[source]¶

Find nearest positions that are downstream/follow each query range.

Parameters:

query (IRanges) – Query IRanges.
select (Literal['all', 'last']) – Whether to return “all” hits or just “last”. Defaults to “last”.
delete_index (bool) – Defaults to True, to delete the cached ncls index. Set to False, to reuse the index across multiple queries.
num_threads (int) – Number of threads to use. Defaults to 1.

Returns:

If select=“last”: A NumPy array of integers with length matching query, containing indices into self for the closest downstream position of each query range. Value may be None if there are no matches.
If select=“all”: A BiocFrame with two columns:
- query_hits: Indices into query ranges
- self_hits: Corresponding indices into self ranges
Each row represents a query-self pair where self follows query.

classmethod from_pandas(input)[source]¶

Create an IRanges object from a pandas DataFrame.

Parameters:: input – Input data must contain columns ‘start’ and ‘width’.
Return type:: IRanges
Returns:: An IRanges object.

classmethod from_polars(input)[source]¶

Create an IRanges object from a polars DataFrame.

Parameters:: input – Input data must contain columns ‘start’ and ‘width’.
Return type:: IRanges
Returns:: An IRanges object.

gaps(start=None, end=None)[source]¶

Return an IRanges object representing the intervals that remain after the ranges are removed from the interval specified by the start and end arguments.

Parameters:

start (Optional[int]) – Restrict start position. Defaults to None.
end (Optional[int]) – Restrict end position. Defaults to None.

Return type:

IRanges

Returns:

A new IRanges with the gap regions.
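A rough plain-Python sketch of gap finding follows. It assumes inclusive ends, that a start of None means no gap is reported before the first range, and that an end of None means none is reported after the last; the method's actual handling of these edge cases is not stated above.

```python
def gaps(ranges, start=None, end=None):
    """Return (start, width) gaps between ranges, optionally limited to [start, end]."""
    covered = sorted((s, s + w - 1) for s, w in ranges if w > 0)
    out = []
    cursor = start                    # first position not yet accounted for
    for s, e in covered:
        if cursor is not None and s > cursor:
            out.append((cursor, s - 1))
        cursor = e + 1 if cursor is None else max(cursor, e + 1)
    if end is not None and cursor is not None and cursor <= end:
        out.append((cursor, end))
    if end is not None:
        # Trim or drop gaps that extend past the requested end.
        out = [(s, min(e, end)) for s, e in out if s <= end]
    return [(s, e - s + 1) for s, e in out]


print(gaps([(3, 2), (8, 3)]))                   # [(5, 3)]
print(gaps([(3, 2), (8, 3)], start=1, end=12))  # [(1, 2), (5, 3), (11, 2)]
```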

get_end()[source]¶

Get end positions (inclusive).

Return type:: ndarray
Returns:: NumPy array of 32-bit signed integers containing the end position for all ranges.

get_mcols()[source]¶

Get metadata about ranges.

Return type:: BiocFrame
Returns:: Data frame containing additional metadata columns for all ranges.

get_metadata()[source]¶

Get additional metadata.

Return type:: dict
Returns:: Dictionary containing additional metadata.

get_names()[source]¶

Get range names.

Return type:: Optional[Names]
Returns:: List containing the names for all ranges, or None if no names are present.

get_row(index_or_name)[source]¶

Access a row by index or row name.

Parameters:

index_or_name (Union[str, int]) –

Integer index of the row to access.

Alternatively, you may provide a string specifying the row name to access, only if names are available.

Raises:

ValueError – If index_or_name is not in row names, or if the integer index is greater than the number of rows.
TypeError – If index_or_name is neither a string nor an integer.

Returns:

A sliced IRanges object.

Return type:

IRanges

get_start()[source]¶

Get start positions.

Return type:: ndarray
Returns:: NumPy array of 32-bit signed integers containing the start positions for all ranges.

get_width()[source]¶

Get widths.

Return type:: ndarray
Returns:: NumPy array of 32-bit signed integers containing the widths for all ranges.

intersect(other)[source]¶

Find intersecting ranges with other.

Parameters:: other (IRanges) – An IRanges object.
Raises:: TypeError – If other is not IRanges.
Return type:: IRanges
Returns:: A new IRanges object with all intersecting ranges.

intersect_ncls(other, delete_index=True, num_threads=1)[source]¶

Find intersecting ranges with other. Uses the nclist index.

Parameters:

other (IRanges) – An IRanges object.
delete_index (bool) – Defaults to True, to delete the cached ncls index. Set to False, to reuse the index across multiple queries.
num_threads (int) – Number of threads to use. Defaults to 1.

Raises:

TypeError – If other is not IRanges.

Return type:

IRanges

Returns:

A new IRanges object with all intersecting ranges.

is_disjoint()[source]¶

Check if the ranges are disjoint.

Return type:: bool
Returns:: True if all ranges are non-overlapping, otherwise False.

property mcols: BiocFrame¶

Get metadata.

Returns:: Data frame containing additional metadata columns for all ranges.

property metadata: dict¶

Get additional metadata.

Returns:: Dictionary containing additional metadata.

property names: Names | None¶

Get names.

Returns:: List containing the names for all ranges, or None if no names are available.

narrow(start=None, width=None, end=None, in_place=False)[source]¶

Narrow ranges.

Important: These arguments are relative shifts in position for each range.

Parameters:

start (Union[int, List[int], ndarray, None]) – Relative start position. Defaults to None.
width (Union[int, List[int], ndarray, None]) – Width of each interval position. Defaults to None.
end (Union[int, List[int], ndarray, None]) – Relative end position. Defaults to None.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the narrowed ranges. Otherwise, the current object is directly modified and a reference to it is returned.

nearest(query, select='arbitrary', delete_index=True, num_threads=1)[source]¶

Find nearest ranges in both directions.

Parameters:

query (IRanges) – Query IRanges.
select (Literal['all', 'arbitrary']) – Whether to return “all” hits or “arbitrary” choice.
delete_index (bool) – Delete the cached ncls index. Internal use only.
num_threads (int) – Number of threads to use. Defaults to 1.

Returns:

If select=“arbitrary”: A NumPy array of integers with length matching query, containing indices into self for the closest range to each query range. Value may be None if there are no matches.
If select=“all”: A BiocFrame with two columns:
- query_hits: Indices into query ranges
- self_hits: Corresponding indices into self ranges
Each row represents a query-subject pair where subject is nearest to query.

order(decreasing=False)[source]¶

Get the order of indices for sorting.

Parameters:: decreasing (bool) – Whether to sort in descending order. Defaults to False.
Return type:: ndarray
Returns:: NumPy vector containing index positions in the sorted order.

overlap_indices(start=None, end=None)[source]¶

Find overlaps with the start and end positions.

Parameters:

start (Optional[int]) – Start position. Defaults to None.
end (Optional[int]) – End position. Defaults to None.

Return type:

ndarray

Returns:

NumPy vector containing indices that overlap with the given range.

precede(query, select='first', delete_index=True, num_threads=1)[source]¶

Find nearest positions that are upstream/precede each query range.

Parameters:

query (IRanges) – Query IRanges.
select (Literal['all', 'first']) – Whether to return “all” hits or just “first”. Defaults to “first”.
delete_index (bool) – Defaults to True, to delete the cached ncls index. Set to False, to reuse the index across multiple queries.
num_threads (int) – Number of threads to use. Defaults to 1.

Returns:

If select=“first”: A NumPy array of integers with length matching query, containing indices into self for the closest upstream position of each query range. Value may be None if there are no matches.
If select=“all”: A BiocFrame with two columns:
- query_hits: Indices into query ranges
- self_hits: Corresponding indices into self ranges that are upstream
Each row represents a query-self pair where self precedes query.

promoters(upstream=2000, downstream=200, in_place=False)[source]¶

Get promoter regions (upstream and downstream of TSS sites).

Generates promoter ranges relative to the transcription start site (TSS), where TSS is start(x). The promoter range is expanded around the TSS according to the upstream and downstream arguments. Upstream represents the number of nucleotides in the 5’ direction and downstream the number in the 3’ direction. The full range is defined as, (start(x) - upstream) to (start(x) + downstream - 1).

Parameters:

upstream (int) – Number of positions to extend in the 5’ direction. Defaults to 2000.
downstream (int) – Number of positions to extend in the 3’ direction. Defaults to 200.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the promoter ranges. Otherwise, the current object is directly modified and a reference to it is returned.
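The promoter formula given above, (start - upstream) to (start + downstream - 1), translates directly into start/width arithmetic. The following plain-Python sketch works on a list of start positions; it is an illustration of the formula, not the method's implementation.

```python
def promoters(starts, upstream=2000, downstream=200):
    """Return (start, width) pairs spanning start - upstream .. start + downstream - 1."""
    # Width follows from the inclusive span: upstream + downstream positions.
    return [(s - upstream, upstream + downstream) for s in starts]


print(promoters([5000, 10000]))  # [(3000, 2200), (8000, 2200)]
```

With the defaults, each promoter range is 2200 positions wide and ends at start + 199.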

range()[source]¶

Concatenate and compute the min and max across all ranges.

Return type:: IRanges
Returns:: A new IRanges instance with a single range, spanning the minimum of all start positions to the maximum of all end positions.

reduce(with_reverse_map=False, drop_empty_ranges=False, min_gap_width=1)[source]¶

Reduce orders the ranges, then merges overlapping or adjacent ranges.

Parameters:

with_reverse_map (bool) – Whether to return map of indices back to original object. Defaults to False.
drop_empty_ranges (bool) – Whether to drop empty ranges. Defaults to False.
min_gap_width (int) – Ranges separated by a gap of at least min_gap_width positions are not merged. Defaults to 1.

Return type:

IRanges

Returns:

A new IRanges object with reduced ranges.
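The sort-then-merge behaviour, including the role of min_gap_width, can be sketched in plain Python. This assumes inclusive ends and does not model with_reverse_map or drop_empty_ranges; `reduce_ranges` here is a stand-alone illustration.

```python
def reduce_ranges(ranges, min_gap_width=1):
    """Merge (start, width) ranges whose gap is smaller than min_gap_width."""
    merged = []
    for start, width in sorted(ranges):
        end = start + width - 1
        if merged:
            last_start, last_end = merged[-1]
            gap = start - last_end - 1   # positions strictly between the two ranges
            if gap < min_gap_width:
                merged[-1] = (last_start, max(last_end, end))
                continue
        merged.append((start, end))
    return [(s, e - s + 1) for s, e in merged]


print(reduce_ranges([(1, 5), (6, 2), (10, 3), (11, 1)]))  # [(1, 7), (10, 3)]
```

With the default of 1, adjacent ranges (gap of 0) are merged; raising min_gap_width to 3 in this example would also bridge the two-position gap and give a single range.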

reflect(bounds, in_place=False)[source]¶

Reverses each range in x relative to the corresponding range in bounds.

Reflection preserves the width of a range, but shifts it such that the distance from the left bound to the start of the range becomes the distance from the end of the range to the right bound. This is illustrated below, where x represents a range in x and [ and ] indicate the bounds:

[..xxx.....] becomes [.....xxx..]

Parameters:

bounds (IRanges) – IRanges with the same length as the current object specifying the bounds.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the reflected ranges. Otherwise, the current object is directly modified and a reference to it is returned.
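The reflection rule can be checked with a small plain-Python sketch for a single range and its bound, assuming inclusive ends. The function below is illustrative only.

```python
def reflect(start, width, bound_start, bound_width):
    """Mirror one range inside its bounds; the width is unchanged."""
    end = start + width - 1
    bound_end = bound_start + bound_width - 1
    # The old distance from the range end to the right bound becomes
    # the new distance from the left bound to the range start.
    new_start = bound_start + (bound_end - end)
    return new_start, width


# Bounds cover 1-10 and the range covers 3-5, as in the diagram above.
print(reflect(3, 3, 1, 10))  # (6, 3)
```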

resize(width, fix='start', in_place=False)[source]¶

Resize ranges to the specified width where either the start, end, or center is used as an anchor.

Parameters:

width (Union[int, List[int], ndarray]) – Width to resize to; must be non-negative.
fix (Union[Literal['start', 'end', 'center'], List[Literal['start', 'end', 'center']]]) –
Fix positions by “start”, “end”, or “center”.

Alternatively, fix may be a list with the same size as this IRanges object, denoting what to use as an anchor for each interval.

Defaults to “start”.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the resized ranges. Otherwise, the current object is directly modified and a reference to it is returned.
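The three anchoring modes can be sketched for a single range in plain Python. The "center" case assumes the start moves by the floor of half the width difference, which is a guess at the rounding rule rather than something stated above.

```python
def resize(start, width, new_width, fix="start"):
    """Return (start, width) after resizing one range around the chosen anchor."""
    if new_width < 0:
        raise ValueError("new_width must be non-negative")
    if fix == "start":
        new_start = start                             # start stays put
    elif fix == "end":
        new_start = start + width - new_width         # inclusive end stays put
    elif fix == "center":
        new_start = start + (width - new_width) // 2  # assumed rounding
    else:
        raise ValueError("fix must be 'start', 'end', or 'center'")
    return new_start, new_width


# A range covering positions 10-19, resized to width 4.
print(resize(10, 10, 4, fix="start"))   # (10, 4)
print(resize(10, 10, 4, fix="end"))     # (16, 4)
print(resize(10, 10, 4, fix="center"))  # (13, 4)
```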

restrict(start=None, end=None, keep_all_ranges=False)[source]¶

Restrict ranges to the given start and end positions.

Parameters:

start (Union[int, List[int], ndarray, None]) – Start position. Defaults to None.
end (Union[int, List[int], ndarray, None]) – End position. Defaults to None.
keep_all_ranges (bool) – Whether to keep ranges that do not overlap with start and end. Defaults to False.

Return type:

IRanges

Returns:

A new IRanges with the restricted ranges.

set_mcols(mcols, in_place=False)[source]¶

Set new metadata about ranges.

Parameters:

mcols (Optional[BiocFrame]) – Data frame of additional columns, see the constructor for details.
in_place (bool) – Whether to modify the object in place.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the modified metadata columns. Otherwise, the current object is directly modified and a reference to it is returned.

set_metadata(metadata, in_place=False)[source]¶

Set or replace metadata.

Parameters:

metadata (Optional[dict]) – Additional metadata.
in_place (bool) – Whether to modify the object in place.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the modified metadata. Otherwise, the current object is directly modified and a reference to it is returned.

set_names(names, in_place=False)[source]¶

Parameters:

names (Optional[Sequence[str]]) – Sequence of names or None, see the constructor for details.
in_place (bool) – Whether to modify the object in place.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the modified names. Otherwise, the current object is directly modified and a reference to it is returned.

set_start(start, in_place=False)[source]¶

Modify start positions.

Parameters:

start (Sequence[int]) – Sequence of start positions, see the constructor for details.
in_place (bool) – Whether to modify the object in place.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the modified start positions. Otherwise, the current object is directly modified and a reference to it is returned.

set_width(width, in_place=False)[source]¶

Parameters:

width (Sequence[int]) – Sequence of widths, see the constructor for details.
in_place (bool) – Whether to modify the object in place.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the modified widths. Otherwise, the current object is directly modified and a reference to it is returned.

setdiff(other)[source]¶

Find set difference with other.

Parameters:: other (IRanges) – An IRanges object.
Raises:: TypeError – If other is not IRanges.
Return type:: IRanges
Returns:: A new IRanges object.

shift(shift, in_place=False)[source]¶

Shift ranges by specified amount.

Parameters:

shift (Union[int, List[int], ndarray]) – Amount to shift by.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the shifted ranges. Otherwise, the current object is directly modified and a reference to it is returned.

shift_and_clip_ranges(shift, width=None, circle_length=None)[source]¶

Shift and clip interval ranges.

Parameters:

shift (ndarray) – Array of shift values (will be recycled if necessary).
width (Optional[int]) – Maximum width to clip to. Defaults to None for no clipping.
circle_length (Optional[int]) – Length of circular sequence. Defaults to None for linear sequence.

Returns:

A tuple of:

Array of shifted/clipped start positions
Array of shifted/clipped widths
Coverage length
Boolean indicating if ranges are in tiling configuration

Return type:

Tuple

sliding_windows(width, step=1)[source]¶

Create sliding windows of fixed width and step size.

Parameters:

width (int) – Width of each window.
step (int) – Step size between window starts.

Return type:

List[IRanges]

Returns:

List of IRanges objects, one per input range containing the windows.

sort(decreasing=False, in_place=False)[source]¶

Sort the ranges.

Parameters:

decreasing (bool) – Whether to sort in descending order. Defaults to False.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the sorted ranges. Otherwise, the current object is directly modified and a reference to it is returned.

property start: ndarray¶

Get start positions.

Returns:: NumPy array of 32-bit signed integers containing the start positions for all ranges.

subset_by_overlaps(query, query_type='any', select='all', max_gap=-1, min_overlap=0, delete_index=True, num_threads=1)[source]¶

Subset to overlapping ranges with query.

Parameters:

query (IRanges) – Query IRanges object.
query_type (Literal['any', 'start', 'end', 'within']) –
Overlap query type, must be one of
- “any”: Any overlap is good
- “start”: Overlap at the beginning of the range
- “end”: Must overlap at the end of the range
- “within”: Fully contain the query interval
Defaults to “any”.
select (Literal['all', 'first', 'last', 'arbitrary']) –
Determine what hit to choose when there are multiple hits for a query range.

Must be one of “all”, “first”, “last”, “arbitrary”.

Defaults to “all”.
max_gap (int) – Maximum gap allowed in the overlap. Defaults to -1 (no gap allowed).
min_overlap (int) – Minimum overlap with query. Defaults to 0.
delete_index (bool) – Defaults to True, to delete the cached ncls index. Set to False, to reuse the index across multiple queries.
num_threads (int) – Number of threads to use. Defaults to 1.

Return type:

IRanges

Returns:

A new IRanges object containing ranges that overlap with query.

terminators(upstream=2000, downstream=200, in_place=False)[source]¶

Get terminator regions (upstream and downstream of TES).

Parameters:

upstream (int) – Number of positions to extend in the 5’ direction. Defaults to 2000.
downstream (int) – Number of positions to extend in the 3’ direction. Defaults to 200.
in_place (bool) – Whether to modify the object in place. Defaults to False.

Return type:

IRanges

Returns:

If in_place = False, a new IRanges is returned with the terminator ranges. Otherwise, the current object is directly modified and a reference to it is returned.

threebands(start=None, end=None, width=None)[source]¶

Split ranges into three parts: left, middle, and right.

Parameters:

starts – Array of start positions.
widths – Array of widths.
start (Union[int, ndarray, None]) – Start positions for middle band.
end (Union[int, ndarray, None]) – End positions for middle band.
width (Union[int, ndarray, None]) – Width for middle band.

Returns:

Dictionary with the following keys:

‘left’: IRanges for left bands
‘middle’: IRanges for middle bands
‘right’: IRanges for right bands

Return type:

Dictionary

tile(n=None, width=None)[source]¶

Split ranges into either n equal parts or parts of fixed width.

Parameters:

n (Union[int, ndarray, None]) – Number of tiles per range (mutually exclusive with width).
width (Union[int, ndarray, None]) – Width of each tile (mutually exclusive with n).

Return type:

List[IRanges]

Returns:

List of IRanges objects, one per input range containing the tiles.

to_pandas()[source]¶

Convert this IRanges object to a pandas DataFrame.

Returns:: A pandas DataFrame object.

to_polars()[source]¶

Convert this IRanges object to a polars DataFrame.

Returns:: A polars DataFrame object.

union(other)[source]¶

Find union of ranges with other.

Parameters:: other (IRanges) – An IRanges object.
Raises:: TypeError – If other is not IRanges.
Return type:: IRanges
Returns:: A new IRanges object with all ranges.

property width: ndarray¶

Get widths.

Returns:: NumPy array of 32-bit signed integers containing the widths for all ranges.

class iranges.IRanges.IRangesIter(obj)[source]¶

Bases: object

An iterator to IRanges.

Parameters:: obj (IRanges) – Object to iterate.

__init__(obj)[source]¶

Initialize the iterator.

Parameters:: obj (IRanges) – Source object to iterate.

__iter__()[source]¶

__next__()[source]¶

iranges.lib_iranges module¶

IRanges C++ implementations.

class iranges.lib_iranges.NCListHandler¶

Bases: pybind11_object

Manages an nclist-cpp index for overlap queries.

__init__(self: iranges.lib_iranges.NCListHandler, starts: numpy.ndarray[numpy.int32], ends: numpy.ndarray[numpy.int32]) → None¶

find_overlaps(self: iranges.lib_iranges.NCListHandler, query_starts: numpy.ndarray[numpy.int32], query_ends: numpy.ndarray[numpy.int32], query_type: str = 'any', select: str = 'all', max_gap: int = -1, min_overlap: int = 1, num_threads: int = 1) → tuple¶: Finds overlaps between query intervals and the indexed subject intervals.

class iranges.lib_iranges.NCListSearchHandler¶

Bases: pybind11_object

Manages nearest neighbor queries.

__annotations__ = {}¶

__init__(self: iranges.lib_iranges.NCListSearchHandler, starts: numpy.ndarray[numpy.int32], ends: numpy.ndarray[numpy.int32]) → None¶

follow(self: iranges.lib_iranges.NCListSearchHandler, query_starts: numpy.ndarray[numpy.int32], select: str = 'last', num_threads: int = 1) → object¶: Find nearest positions that are upstream/precede each query range.

nearest(self: iranges.lib_iranges.NCListSearchHandler, query_starts: numpy.ndarray[numpy.int32], query_ends: numpy.ndarray[numpy.int32], select: str = 'arbitrary', num_threads: int = 1) → object¶: Find nearest ranges in both directions.

precede(self: iranges.lib_iranges.NCListSearchHandler, query_ends: numpy.ndarray[numpy.int32], select: str = 'first', num_threads: int = 1) → object¶: Find nearest positions that are downstream/follow each query range.

iranges.lib_iranges.coverage(starts: numpy.ndarray[numpy.int32], widths: numpy.ndarray[numpy.int32], shift: numpy.ndarray[numpy.int32], width: object, weight: numpy.ndarray[numpy.float64], circle_len: object, method: str = 'auto') → numpy.ndarray[numpy.float64]¶: Compute weighted coverage of ranges

iranges.lib_iranges.disjoint_bins(starts: numpy.ndarray[numpy.int32], widths: numpy.ndarray[numpy.int32]) → numpy.ndarray[numpy.int32]¶: Assign ranges to disjoint bins

iranges.lib_iranges.find_overlaps_groups(self_starts: numpy.ndarray[numpy.int32], self_ends: numpy.ndarray[numpy.int32], self_groups: list[numpy.ndarray[numpy.int32]], query_starts: numpy.ndarray[numpy.int32], query_ends: numpy.ndarray[numpy.int32], query_groups: list[numpy.ndarray[numpy.int32]], query_type: str = 'any', select: str = 'all', max_gap: int = -1, min_overlap: int = 1, num_threads: int = 1) → tuple¶: Finds overlaps between query and subject intervals, respecting group boundaries.

iranges.lib_iranges.gaps_ranges(starts: numpy.ndarray[numpy.int32], widths: numpy.ndarray[numpy.int32], restrict_start: object = None, restrict_end: object = None) → tuple[numpy.ndarray[numpy.int32], numpy.ndarray[numpy.int32]]¶: Find gaps between ranges

iranges.lib_iranges.get_order(starts: numpy.ndarray[numpy.int32], widths: numpy.ndarray[numpy.int32]) → list[int]¶: Get the order of genomic ranges

iranges.lib_iranges.reduce_ranges(starts: numpy.ndarray[numpy.int32], widths: numpy.ndarray[numpy.int32], drop_empty_ranges: bool = False, min_gapwidth: int = 0, with_revmap: bool = False, with_inframe_start: bool = False) → dict¶: Reduce ranges by merging overlapping or adjacent ranges

iranges.lib_iranges.shift_and_clip_ranges(starts: numpy.ndarray[numpy.int32], widths: numpy.ndarray[numpy.int32], shift: numpy.ndarray[numpy.int32], width: object, circle_len: object) → tuple[numpy.ndarray[numpy.int32], numpy.ndarray[numpy.int32], int, bool]¶: Shift and clip ranges

iranges.sew_handler module¶

class iranges.sew_handler.SEWWrangler(ref_widths, start=None, end=None, width=None, translate_negative=True, allow_nonnarrowing=False)[source]¶

Bases: object

Handler to resolve start/end/width parameters.

__init__(ref_widths, start=None, end=None, width=None, translate_negative=True, allow_nonnarrowing=False)[source]¶

Initialize SEW parameters.

Parameters:

ref_widths (ndarray) – Reference widths array.
start (Union[int, ndarray, None]) – Start positions.
end (Union[int, ndarray, None]) – End positions.
width (Union[int, ndarray, None]) – Widths.
translate_negative (bool) – Whether to translate negative coordinates.
allow_nonnarrowing (bool) – Whether to allow ranges wider than reference.

solve()[source]¶

Resolve Start/End/Width parameters to concrete ranges.

Return type:: Tuple[ndarray, ndarray]
Returns:: Tuple of resolved (starts, widths) ranges.

iranges.utils module¶

iranges.utils.calc_gap_and_overlap(start1, width1, start2, width2)[source]¶: Calculate gap, overlap and relative position between two intervals.

iranges.utils.clip_ranges(starts, widths, min_val=None, max_val=None)[source]¶

Clip ranges to specified bounds.

Parameters:

starts (ndarray) – Start positions.
widths (ndarray) – Widths.
min_val (Optional[int]) – Minimum allowed position. Defaults to None for no lower bound.
max_val (Optional[int]) – Maximum allowed position. Defaults to None for no upper bound.

Return type:

Tuple[ndarray, ndarray]

Returns:

Tuple of clipped (starts, widths) ranges.

iranges.utils.compute_up_down(starts, widths, upstream, downstream, site)[source]¶

Helper for promoters/terminators.

Return type:: Tuple[ndarray, ndarray]

iranges.utils.find_interval(x, vec)[source]¶

Python implementation of R’s findInterval function.

Parameters:

x (ndarray) – Values to find intervals for.
vec (ndarray) – Sorted vector to find intervals in.

Return type:

ndarray

Returns:

NumPy array of indices indicating which interval each x value falls into.
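For readers unfamiliar with R's findInterval, its default behaviour is to count, for each value, how many elements of the sorted vector are less than or equal to it. The plain-Python sketch below reproduces that with the standard bisect module; whether the function documented here returns exactly the same indexing is an assumption.

```python
from bisect import bisect_right


def find_interval(x, vec):
    """For each value in x, count the elements of sorted vec that are <= it."""
    return [bisect_right(vec, value) for value in x]


print(find_interval([0, 1, 2.5, 10], [1, 2, 3]))  # [0, 1, 2, 3]
```

A result of 0 means the value lies before the first element of vec, and a result of len(vec) means it lies at or beyond the last.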

iranges.utils.handle_negative_coords(coords, ref_len)[source]¶

Convert negative coordinates to positive using reference length.

Parameters:

coords (MaskedArray) – Coordinate array (can have negative values).
ref_len (ndarray) – Reference lengths for conversion.

Return type:

MaskedArray

Returns:

Array with negative coordinates converted to positive.

iranges.utils.normalize_array(x, length, dtype=<class 'numpy.int32'>)[source]¶

Normalize input to masked array with proper length and type.

Parameters:

x (Union[int, float, number, ndarray, None]) – Input value (scalar, array, or None).
length (int) – Expected length for output array.
dtype (dtype) – Expected numpy dtype.

Return type:

MaskedArray

Returns:

Normalized masked array.
