rds2py package¶
Submodules¶
rds2py.PyRdsReader module¶
Low-level interface for reading RDS file format.
This module provides the core functionality for parsing RDS files at a binary level and converting them into a dictionary representation that can be further processed by higher-level functions.
- class rds2py.PyRdsReader.PyRdsParser(file_path: str)[source]¶
Bases:
object
Parser for reading RDS files.
This class provides low-level access to RDS file contents, handling the binary format and converting it into Python data structures. It supports various R data types and handles special R cases like NA values, integer sequences and range functions.
- rds_object¶
Internal representation of the RDS file.
- root_object¶
Root object of the parsed RDS file.
- __annotations__ = {'R_MIN': <class 'int'>}¶
- parse() Dict[str, Any] [source]¶
Parse the entire RDS file into a dictionary structure.
- Returns:
‘type’: The R object type
’data’: The actual data (if applicable)
’attributes’: R object attributes (if any)
’class_name’: The R class name
Additional keys depending on the object type
- Return type:
A dictionary containing the parsed data with keys
- Raises:
PyRdsParserError – If there’s an error during parsing.
rds2py.generics module¶
Core functionality for reading RDS files in Python.
This module provides the main interface for reading RDS files and converting them to appropriate Python objects. It maintains a registry of supported R object types and their corresponding Python parser functions.
The module supports various R object types including vectors, matrices, data frames, and specialized Bioconductor objects like GenomicRanges and SummarizedExperiment.
Example
data = read_rds("example.rds")
print(type(data))
- rds2py.generics.read_rds(path: str, **kwargs)[source]¶
Read an RDS file and convert it to an appropriate Python object.
- Parameters:
path – Path to the RDS file to be read.
**kwargs – Additional arguments passed to specific parser functions.
- Returns:
A Python object representing the data in the RDS file. The exact type depends on the contents of the RDS file and the available parsers.
rds2py.lib_rds_parser module¶
- class rds2py.lib_rds_parser.RdsObject¶
Bases:
pybind11_object
- get_robject(self: rds2py.lib_rds_parser.RdsObject) RdsReader ¶
- class rds2py.lib_rds_parser.RdsReader¶
Bases:
pybind11_object
- get_attribute_names(self: rds2py.lib_rds_parser.RdsReader) list ¶
- get_class_name(self: rds2py.lib_rds_parser.RdsReader) str ¶
- get_dimensions(self: rds2py.lib_rds_parser.RdsReader) tuple[int, int] ¶
- get_numeric_data(self: rds2py.lib_rds_parser.RdsReader) numpy.ndarray ¶
- get_package_name(self: rds2py.lib_rds_parser.RdsReader) str ¶
- get_rsize(self: rds2py.lib_rds_parser.RdsReader) int ¶
- get_rtype(self: rds2py.lib_rds_parser.RdsReader) str ¶
- get_string_arr(self: rds2py.lib_rds_parser.RdsReader) list ¶
- load_attribute_by_name(self: rds2py.lib_rds_parser.RdsReader, arg0: str) object ¶
- load_vec_element(self: rds2py.lib_rds_parser.RdsReader, arg0: int) object ¶
rds2py.rdsutils module¶
Utility functions for RDS file parsing and class inference.
This module provides helper functions for parsing RDS files and inferring the appropriate R class information from parsed objects.
- rds2py.rdsutils.get_class(robj: dict) str [source]¶
Infer the R class name from a parsed RDS object.
Notes
Handles both S4 and non-S4 R objects
Special handling for vectors and matrices
Checks for class information in object attributes
- Parameters:
robj – Dictionary containing parsed RDS data, typically the output of
parse_rds()
.- Returns:
The inferred R class name, or None if no class can be determined.
rds2py.read_atomic module¶
Functions for parsing atomic R vector types into Python objects.
This module provides parser functions for converting R’s atomic vector types (boolean, integer, string, and double) into appropriate Python objects using the biocutils package’s specialized list classes.
- rds2py.read_atomic.read_boolean_vector(robject: dict, **kwargs) BooleanList [source]¶
Convert an R boolean vector to a Python
BooleanList
.- Parameters:
robject – Dictionary containing parsed R boolean vector data.
**kwargs – Additional arguments.
- Returns:
A BooleanList object containing the vector data and any associated names.
- rds2py.read_atomic.read_double_vector(robject: dict, **kwargs) FloatList [source]¶
Convert an R double vector to a Python
FloatList
.- Parameters:
robject – Dictionary containing parsed R double vector data.
**kwargs – Additional arguments.
- Returns:
A FloatList object containing the vector data and any associated names.
- rds2py.read_atomic.read_integer_vector(robject: dict, **kwargs) IntegerList [source]¶
Convert an R integer vector to a Python
IntegerList
.- Parameters:
robject – Dictionary containing parsed R integer vector data.
**kwargs – Additional arguments.
- Returns:
A IntegerList object containing the vector data and any associated names.
- rds2py.read_atomic.read_string_vector(robject: dict, **kwargs) StringList [source]¶
Convert an R string vector to a Python
StringList
.- Parameters:
robject – Dictionary containing parsed R string vector data.
**kwargs – Additional arguments.
- Returns:
A StringList object containing the vector data and any associated names.
rds2py.read_delayed_matrix module¶
Functions and classes for parsing R delayed matrix objects from HDF5Array.
- rds2py.read_delayed_matrix.read_hdf5_sparse(robject: dict, **kwargs) Hdf5CompressedSparseMatrix [source]¶
Convert an R delayed sparse array (H5-backed).
- Parameters:
robject – Dictionary containing parsed delayed sparse array.
**kwargs – Additional arguments.
- Returns:
A Hdf5CompressedSparseMatrix from the ‘hdf5array’ package.
rds2py.read_dict module¶
Functions for parsing R vector and dictionary-like objects.
This module provides functionality to convert R named vectors and list objects into Python dictionaries or lists, maintaining the structure and names of the original R objects.
- rds2py.read_dict.read_dict(robject: dict, **kwargs) dict [source]¶
Convert an R named vector or list to a Python dictionary or list.
- Parameters:
robject – Dictionary containing parsed R vector/list data.
**kwargs – Additional arguments.
- Returns:
If the R object has names, returns a dictionary mapping names to values. Otherwise, returns a list of parsed values.
Example
>>> # For a named R vector c(a=1, b=2) >>> result = read_dict(robject) >>> print(result) {'a': 1, 'b': 2}
rds2py.read_factor module¶
Functions for parsing R factor objects.
This module handles the conversion of R factors (categorical variables) into Python lists, preserving the levels and maintaining the order of the factor levels.
- rds2py.read_factor.read_factor(robject: dict, **kwargs) list [source]¶
Convert an R factor to a Python list.
- Parameters:
robject – Dictionary containing parsed R factor data.
**kwargs – Additional arguments.
- Returns:
A list containing the factor values, with each value repeated according to its length if specified.
rds2py.read_frame module¶
Functions for parsing R data frame objects.
This module provides parsers for converting both base R data.frame objects and Bioconductor DataFrame objects into Python BiocFrame objects, preserving row names, column names, and data types.
- rds2py.read_frame.read_data_frame(robject: dict, **kwargs)[source]¶
Convert an R data.frame to a
BiocFrame
object.- Parameters:
robject – Dictionary containing parsed R data.frame object.
**kwargs – Additional arguments.
- Returns:
A BiocFrame object containing the data frame’s contents, with preserved column and row names.
- rds2py.read_frame.read_dframe(robject: dict, **kwargs)[source]¶
Convert an R DFrame (Bioconductor’s DataFrame) to a BiocFrame object.
- Parameters:
robject – Dictionary containing parsed R DFrame object.
**kwargs – Additional arguments.
- Returns:
A BiocFrame object containing the DataFrame’s contents, with preserved metadata and structure.
rds2py.read_granges module¶
Functions for parsing Bioconductor GenomicRanges objects.
This module provides parsers for converting Bioconductor’s GenomicRanges and GenomicRangesList objects into their Python equivalents, preserving all genomic coordinates and associated metadata.
- rds2py.read_granges.read_genomic_ranges(robject: dict, **kwargs) GenomicRanges [source]¶
Convert an R GenomicRanges object to a Python
GenomicRanges
object.- Parameters:
robject – Dictionary containing parsed GenomicRanges data.
**kwargs – Additional arguments.
- Returns:
A Python GenomicRanges object containing genomic intervals with associated annotations.
- rds2py.read_granges.read_granges_list(robject: dict, **kwargs) GenomicRangesList [source]¶
Convert an R GenomicRangesList object to a Python
GenomicRangesList
.- Parameters:
robject – Dictionary containing parsed GenomicRangesList data.
**kwargs – Additional arguments.
- Returns:
A Python GenomicRangesList object containing containing multiple GenomicRanges objects.
rds2py.read_mae module¶
Functions for parsing Bioconductor MultiAssayExperiment objects.
This module handles the conversion of Bioconductor’s MultiAssayExperiment container format into its Python equivalent, preserving the complex relationships between multiple experimental assays and sample metadata.
- rds2py.read_mae.read_multi_assay_experiment(robject: dict, **kwargs) MultiAssayExperiment [source]¶
Convert an R MultiAssayExperiment to a Python
MultiAssayExperiment
object.- Parameters:
robject – Dictionary containing parsed MultiAssayExperiment data.
**kwargs – Additional arguments.
- Returns:
A Python MultiAssayExperiment object containing multiple experimental assays with associated metadata.
rds2py.read_matrix module¶
Functions and classes for parsing R matrix objects.
This module provides functionality to convert R matrix objects (both dense and sparse) into their Python equivalents using NumPy and SciPy sparse matrix formats. It handles various R matrix types including dgCMatrix, dgRMatrix, and dgTMatrix.
- class rds2py.read_matrix.MatrixWrapper(matrix, dimnames=None)[source]¶
Bases:
object
A simple wrapper class for matrices that preserves dimension names.
This class bundles a matrix (dense or sparse) with its dimension names, maintaining the R-style naming of rows and columns.
- matrix¶
The underlying matrix object (numpy.ndarray or scipy.sparse matrix).
- dimnames¶
A tuple of (row_names, column_names), each being a list of strings or None.
- rds2py.read_matrix.read_dgcmatrix(robject: dict, **kwargs) spmatrix [source]¶
Parse an R dgCMatrix (sparse column matrix).
- Parameters:
robject – Dictionary containing parsed dgCMatrix data.
**kwargs – Additional arguments.
- Returns:
Parsed sparse column matrix.
- rds2py.read_matrix.read_dgrmatrix(robject: dict, **kwargs) spmatrix [source]¶
Parse an R dgRMatrix (sparse row matrix).
- Parameters:
robject – Dictionary containing parsed dgRMatrix data.
**kwargs – Additional arguments.
- Returns:
Parsed sparse row matrix.
- rds2py.read_matrix.read_dgtmatrix(robject: dict, **kwargs) spmatrix [source]¶
Parse an R dgTMatrix (sparse triplet matrix)..
- Parameters:
robject – Dictionary containing parsed dgTMatrix data.
**kwargs – Additional arguments.
- Returns:
Parsed sparse matrix.
- rds2py.read_matrix.read_ndarray(robject: dict, order: Literal['C', 'F'] = 'F', **kwargs) ndarray [source]¶
Parse an R matrix as a NumPy array.
- Parameters:
robject – Dictionary containing parsed dgCMatrix data.
order – Memory layout for the array.
**kwargs – Additional arguments.
- Returns:
Parsed dense array.
rds2py.read_rle module¶
Functions for parsing R’s Rle (Run-length encoding) objects.
This module provides functionality to convert R’s Rle (Run-length encoding) objects into Python lists, expanding the compressed representation into its full form.
- rds2py.read_rle.read_rle(robject: dict, **kwargs) list [source]¶
Convert an R Rle object to a Python list.
- Parameters:
robject – Dictionary containing parsed Rle data.
**kwargs – Additional arguments.
- Returns:
Expanded list where each value is repeated according to its run length.
Example
>>> # For Rle with values=[1,2] and lengths=[3,2] >>> result = read_rle(robject) >>> print(result) [1, 1, 1, 2, 2]
rds2py.read_sce module¶
Functions for parsing Bioconductor SingleCellExperiment objects.
This module provides parsers for converting Bioconductor’s SingleCellExperiment objects into their Python equivalents, handling the complex structure of single-cell data including multiple assays, reduced dimensions, and alternative experiments.
- rds2py.read_sce.read_alts_summarized_experiment_by_column(robject: dict, **kwargs)[source]¶
Parse alternative experiments in a SingleCellExperiment.
- rds2py.read_sce.read_single_cell_experiment(robject: dict, **kwargs) SingleCellExperiment [source]¶
Convert an R SingleCellExperiment to Python SingleCellExperiment.
- Parameters:
robject – Dictionary containing parsed SingleCellExperiment data.
**kwargs – Additional arguments.
- Returns:
A Python SingleCellExperiment object containing the assay data and associated metadata.
rds2py.read_se module¶
- rds2py.read_se.read_ranged_summarized_experiment(robject: dict, **kwargs) RangedSummarizedExperiment [source]¶
Convert an R RangedSummarizedExperiment to its Python equivalent.
- Parameters:
robject – Dictionary containing parsed SummarizedExperiment data.
**kwargs – Additional arguments.
- Returns:
A Python RangedSummarizedExperiment object.
- rds2py.read_se.read_summarized_experiment(robject: dict, **kwargs) SummarizedExperiment [source]¶
Convert an R SummarizedExperiment to Python
SummarizedExperiment
.- Parameters:
robject – Dictionary containing parsed SummarizedExperiment data.
**kwargs – Additional arguments.
- Returns:
A SummarizedExperiment from the R object.