rds2py package

Submodules

rds2py.PyRdsReader module

Low-level interface for reading RDS file format.

This module provides the core functionality for parsing RDS files at a binary level and converting them into a dictionary representation that can be further processed by higher-level functions.

class rds2py.PyRdsReader.PyRdsParser(file_path)[source]

Bases: object

Parser for reading RDS files.

This class provides low-level access to RDS file contents, handling the binary format and converting it into Python data structures. It supports various R data types and handles special R cases like NA values, integer sequences and range functions.

R_MIN

Minimum integer value in R, used for handling NA values.

rds_object

Internal representation of the RDS file.

root_object

Root object of the parsed RDS file.

R_MIN: int = -2147483648
__annotations__ = {'R_MIN': <class 'int'>}
get_dimensions()[source]
Return type:

Optional[tuple]

parse()[source]

Parse the entire RDS file into a dictionary structure.

Returns:

  • ‘type’: The R object type

  • ’data’: The actual data (if applicable)

  • ’attributes’: R object attributes (if any)

  • ’class_name’: The R class name

  • Additional keys depending on the object type

Return type:

A dictionary containing the parsed data with keys

Raises:

PyRdsParserError – If there’s an error during parsing.

exception rds2py.PyRdsReader.PyRdsParserError[source]

Bases: Exception

Exception raised for errors during RDS parsing.

rds2py.generics module

Core functionality for reading RDS files in Python.

This module provides the main interface for reading RDS files and converting them to appropriate Python objects. It maintains a registry of supported R object types and their corresponding Python parser functions.

The module supports various R object types including vectors, matrices, data frames, and specialized Bioconductor objects like GenomicRanges and SummarizedExperiment.

Example

data = read_rds("example.rds")
print(type(data))
rds2py.generics.read_rds(path, **kwargs)[source]

Read an RDS file and convert it to an appropriate Python object.

Parameters:
  • path (str) – Path to the RDS file to be read.

  • **kwargs – Additional arguments passed to specific parser functions.

Returns:

A Python object representing the data in the RDS file. The exact type depends on the contents of the RDS file and the available parsers.

rds2py.lib_rds_parser module

class rds2py.lib_rds_parser.RdsObject

Bases: pybind11_object

get_robject(self: rds2py.lib_rds_parser.RdsObject) RdsReader
exception rds2py.lib_rds_parser.RdsParserError

Bases: Exception

class rds2py.lib_rds_parser.RdsReader

Bases: pybind11_object

__annotations__ = {}
get_attribute_names(self: rds2py.lib_rds_parser.RdsReader) list
get_class_name(self: rds2py.lib_rds_parser.RdsReader) str
get_dimensions(self: rds2py.lib_rds_parser.RdsReader) tuple[int, int]
get_numeric_data(self: rds2py.lib_rds_parser.RdsReader) numpy.ndarray
get_package_name(self: rds2py.lib_rds_parser.RdsReader) str
get_rsize(self: rds2py.lib_rds_parser.RdsReader) int
get_rtype(self: rds2py.lib_rds_parser.RdsReader) str
get_string_arr(self: rds2py.lib_rds_parser.RdsReader) list
load_attribute_by_name(self: rds2py.lib_rds_parser.RdsReader, arg0: str) object
load_vec_element(self: rds2py.lib_rds_parser.RdsReader, arg0: int) object

rds2py.rdsutils module

Utility functions for RDS file parsing and class inference.

This module provides helper functions for parsing RDS files and inferring the appropriate R class information from parsed objects.

rds2py.rdsutils.get_class(robj)[source]

Infer the R class name from a parsed RDS object.

Notes

  • Handles both S4 and non-S4 R objects

  • Special handling for vectors and matrices

  • Checks for class information in object attributes

Parameters:

robj (dict) – Dictionary containing parsed RDS data, typically the output of parse_rds().

Return type:

str

Returns:

The inferred R class name, or None if no class can be determined.

rds2py.rdsutils.parse_rds(path)[source]

Parse an RDS file into a dictionary representation.

Parameters:

path (str) – Path to the RDS file to be parsed.

Return type:

dict

Returns:

A dictionary containing the parsed contents of the RDS file. The structure depends on the type of R object stored in the file.

rds2py.read_atomic module

Functions for parsing atomic R vector types into Python objects.

This module provides parser functions for converting R’s atomic vector types (boolean, integer, string, and double) into appropriate Python objects using the biocutils package’s specialized list classes.

rds2py.read_atomic.read_boolean_vector(robject, **kwargs)[source]

Convert an R boolean vector to a Python BooleanList.

Parameters:
  • robject (dict) – Dictionary containing parsed R boolean vector data.

  • **kwargs – Additional arguments.

Return type:

BooleanList

Returns:

A BooleanList object containing the vector data and any associated names.

rds2py.read_atomic.read_double_vector(robject, **kwargs)[source]

Convert an R double vector to a Python FloatList.

Parameters:
  • robject (dict) – Dictionary containing parsed R double vector data.

  • **kwargs – Additional arguments.

Return type:

FloatList

Returns:

A FloatList object containing the vector data and any associated names.

rds2py.read_atomic.read_integer_vector(robject, **kwargs)[source]

Convert an R integer vector to a Python IntegerList.

Parameters:
  • robject (dict) – Dictionary containing parsed R integer vector data.

  • **kwargs – Additional arguments.

Return type:

IntegerList

Returns:

A IntegerList object containing the vector data and any associated names.

rds2py.read_atomic.read_string_vector(robject, **kwargs)[source]

Convert an R string vector to a Python StringList.

Parameters:
  • robject (dict) – Dictionary containing parsed R string vector data.

  • **kwargs – Additional arguments.

Return type:

StringList

Returns:

A StringList object containing the vector data and any associated names.

rds2py.read_delayed_matrix module

Functions and classes for parsing R delayed matrix objects from HDF5Array.

rds2py.read_delayed_matrix.read_hdf5_sparse(robject, **kwargs)[source]

Convert an R delayed sparse array (H5-backed).

Parameters:
  • robject (dict) – Dictionary containing parsed delayed sparse array.

  • **kwargs – Additional arguments.

Return type:

Hdf5CompressedSparseMatrix

Returns:

A Hdf5CompressedSparseMatrix from the ‘hdf5array’ package.

rds2py.read_dict module

Functions for parsing R vector and dictionary-like objects.

This module provides functionality to convert R named vectors and list objects into Python dictionaries or lists, maintaining the structure and names of the original R objects.

rds2py.read_dict.read_dict(robject, **kwargs)[source]

Convert an R named vector or list to a Python dictionary or list.

Parameters:
  • robject (dict) – Dictionary containing parsed R vector/list data.

  • **kwargs – Additional arguments.

Return type:

dict

Returns:

If the R object has names, returns a dictionary mapping names to values. Otherwise, returns a list of parsed values.

Example

>>> # For a named R vector c(a=1, b=2)
>>> result = read_dict(robject)
>>> print(result)
{'a': 1, 'b': 2}

rds2py.read_factor module

Functions for parsing R factor objects.

This module handles the conversion of R factors (categorical variables) into Python lists, preserving the levels and maintaining the order of the factor levels.

rds2py.read_factor.read_factor(robject, **kwargs)[source]

Convert an R factor to a Python list.

Parameters:
  • robject (dict) – Dictionary containing parsed R factor data.

  • **kwargs – Additional arguments.

Return type:

list

Returns:

A list containing the factor values, with each value repeated according to its length if specified.

rds2py.read_frame module

Functions for parsing R data frame objects.

This module provides parsers for converting both base R data.frame objects and Bioconductor DataFrame objects into Python BiocFrame objects, preserving row names, column names, and data types.

rds2py.read_frame.read_data_frame(robject, **kwargs)[source]

Convert an R data.frame to a BiocFrame object.

Parameters:
  • robject (dict) – Dictionary containing parsed R data.frame object.

  • **kwargs – Additional arguments.

Returns:

A BiocFrame object containing the data frame’s contents, with preserved column and row names.

rds2py.read_frame.read_dframe(robject, **kwargs)[source]

Convert an R DFrame (Bioconductor’s DataFrame) to a BiocFrame object.

Parameters:
  • robject (dict) – Dictionary containing parsed R DFrame object.

  • **kwargs – Additional arguments.

Returns:

A BiocFrame object containing the DataFrame’s contents, with preserved metadata and structure.

rds2py.read_granges module

Functions for parsing Bioconductor GenomicRanges objects.

This module provides parsers for converting Bioconductor’s GenomicRanges and GenomicRangesList objects into their Python equivalents, preserving all genomic coordinates and associated metadata.

rds2py.read_granges.read_genomic_ranges(robject, **kwargs)[source]

Convert an R GenomicRanges object to a Python GenomicRanges object.

Parameters:
  • robject (dict) – Dictionary containing parsed GenomicRanges data.

  • **kwargs – Additional arguments.

Return type:

GenomicRanges

Returns:

A Python GenomicRanges object containing genomic intervals with associated annotations.

rds2py.read_granges.read_granges_list(robject, **kwargs)[source]

Convert an R GenomicRangesList object to a Python GenomicRangesList.

Parameters:
  • robject (dict) – Dictionary containing parsed GenomicRangesList data.

  • **kwargs – Additional arguments.

Return type:

GenomicRangesList

Returns:

A Python GenomicRangesList object containing containing multiple GenomicRanges objects.

rds2py.read_mae module

Functions for parsing Bioconductor MultiAssayExperiment objects.

This module handles the conversion of Bioconductor’s MultiAssayExperiment container format into its Python equivalent, preserving the complex relationships between multiple experimental assays and sample metadata.

rds2py.read_mae.read_multi_assay_experiment(robject, **kwargs)[source]

Convert an R MultiAssayExperiment to a Python MultiAssayExperiment object.

Parameters:
  • robject (dict) – Dictionary containing parsed MultiAssayExperiment data.

  • **kwargs – Additional arguments.

Return type:

MultiAssayExperiment

Returns:

A Python MultiAssayExperiment object containing multiple experimental assays with associated metadata.

rds2py.read_matrix module

Functions and classes for parsing R matrix objects.

This module provides functionality to convert R matrix objects (both dense and sparse) into their Python equivalents using NumPy and SciPy sparse matrix formats. It handles various R matrix types including dgCMatrix, dgRMatrix, and dgTMatrix.

class rds2py.read_matrix.MatrixWrapper(matrix, dimnames=None)[source]

Bases: object

A simple wrapper class for matrices that preserves dimension names.

This class bundles a matrix (dense or sparse) with its dimension names, maintaining the R-style naming of rows and columns.

matrix

The underlying matrix object (numpy.ndarray or scipy.sparse matrix).

dimnames

A tuple of (row_names, column_names), each being a list of strings or None.

rds2py.read_matrix.read_dgcmatrix(robject, **kwargs)[source]

Parse an R dgCMatrix (sparse column matrix).

Parameters:
  • robject (dict) – Dictionary containing parsed dgCMatrix data.

  • **kwargs – Additional arguments.

Return type:

spmatrix

Returns:

Parsed sparse column matrix.

rds2py.read_matrix.read_dgrmatrix(robject, **kwargs)[source]

Parse an R dgRMatrix (sparse row matrix).

Parameters:
  • robject (dict) – Dictionary containing parsed dgRMatrix data.

  • **kwargs – Additional arguments.

Return type:

spmatrix

Returns:

Parsed sparse row matrix.

rds2py.read_matrix.read_dgtmatrix(robject, **kwargs)[source]

Parse an R dgTMatrix (sparse triplet matrix)..

Parameters:
  • robject (dict) – Dictionary containing parsed dgTMatrix data.

  • **kwargs – Additional arguments.

Return type:

spmatrix

Returns:

Parsed sparse matrix.

rds2py.read_matrix.read_ndarray(robject, order='F', **kwargs)[source]

Parse an R matrix as a NumPy array.

Parameters:
  • robject (dict) – Dictionary containing parsed dgCMatrix data.

  • order (Literal['C', 'F']) – Memory layout for the array.

  • **kwargs – Additional arguments.

Return type:

ndarray

Returns:

Parsed dense array.

rds2py.read_rle module

Functions for parsing R’s Rle (Run-length encoding) objects.

This module provides functionality to convert R’s Rle (Run-length encoding) objects into Python lists, expanding the compressed representation into its full form.

rds2py.read_rle.read_rle(robject, **kwargs)[source]

Convert an R Rle object to a Python list.

Parameters:
  • robject (dict) – Dictionary containing parsed Rle data.

  • **kwargs – Additional arguments.

Return type:

list

Returns:

Expanded list where each value is repeated according to its run length.

Example

>>> # For Rle with values=[1,2] and lengths=[3,2]
>>> result = read_rle(robject)
>>> print(result)
[1, 1, 1, 2, 2]

rds2py.read_sce module

Functions for parsing Bioconductor SingleCellExperiment objects.

This module provides parsers for converting Bioconductor’s SingleCellExperiment objects into their Python equivalents, handling the complex structure of single-cell data including multiple assays, reduced dimensions, and alternative experiments.

rds2py.read_sce.read_alts_summarized_experiment_by_column(robject, **kwargs)[source]

Parse alternative experiments in a SingleCellExperiment.

rds2py.read_sce.read_single_cell_experiment(robject, **kwargs)[source]

Convert an R SingleCellExperiment to Python SingleCellExperiment.

Parameters:
  • robject (dict) – Dictionary containing parsed SingleCellExperiment data.

  • **kwargs – Additional arguments.

Return type:

SingleCellExperiment

Returns:

A Python SingleCellExperiment object containing the assay data and associated metadata.

rds2py.read_se module

rds2py.read_se.read_ranged_summarized_experiment(robject, **kwargs)[source]

Convert an R RangedSummarizedExperiment to its Python equivalent.

Parameters:
  • robject (dict) – Dictionary containing parsed SummarizedExperiment data.

  • **kwargs – Additional arguments.

Return type:

RangedSummarizedExperiment

Returns:

A Python RangedSummarizedExperiment object.

rds2py.read_se.read_summarized_experiment(robject, **kwargs)[source]

Convert an R SummarizedExperiment to Python SummarizedExperiment.

Parameters:
  • robject (dict) – Dictionary containing parsed SummarizedExperiment data.

  • **kwargs – Additional arguments.

Return type:

SummarizedExperiment

Returns:

A SummarizedExperiment from the R object.

Module contents