ensembldb package

Submodules

ensembldb.ensdb module

class ensembldb.ensdb.EnsDb(dbpath)[source]

Bases: object

Interface to Ensembl SQLite annotations.

__enter__()[source]
__exit__(exc_type, exc_val, exc_tb)[source]
__init__(dbpath)[source]

Initialize the EnsDb object.

Parameters:

dbpath (str) – Path to the SQLite database file.
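The `__enter__`, `__exit__` and `close()` members listed for this class suggest it can be used in a `with` statement. The sketch below illustrates that pattern with a hypothetical stand-in class, `MiniDb`, wrapping a plain `sqlite3` connection. It is not the real `EnsDb` implementation, and the `conn` and `closed` attributes are assumptions made for illustration.

```python
import sqlite3


class MiniDb:
    """Hypothetical stand-in for a SQLite-backed annotation handle (not EnsDb itself)."""

    def __init__(self, dbpath: str) -> None:
        # Open the SQLite file (":memory:" gives a throwaway database).
        self.conn = sqlite3.connect(dbpath)
        self.closed = False

    def close(self) -> None:
        # Safe to call more than once.
        if not self.closed:
            self.conn.close()
            self.closed = True

    def __enter__(self) -> "MiniDb":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Always release the connection; returning None lets exceptions propagate.
        self.close()


with MiniDb(":memory:") as db:
    version = db.conn.execute("SELECT sqlite_version()").fetchone()[0]
```

Once the `with` block exits, the connection is closed even if the body raised, which is the main reason to prefer this form over calling `close()` by hand.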

close()[source]
exons(filter=None)[source]

Retrieve exons as GenomicRanges.

Parameters:

filter (Optional[Dict[str, Union[str, List[str]]]]) –

A dictionary defining filters to narrow down the result. Keys are column names (e.g., “exon_id”, “gene_id”, “tx_id”). Values can be a single string or a list of strings to match.

This allows filtering exons by associated gene or transcript IDs (e.g., {'gene_id': 'ENSG00000139618'}).

Return type:

GenomicRanges

Returns:

A GenomicRanges object containing exon coordinates and metadata.

genes(filter=None)[source]

Retrieve genes as GenomicRanges.

Parameters:

filter (Optional[Dict[str, Union[str, List[str]]]]) –

A dictionary defining filters to narrow down the result. Keys are column names (e.g., “gene_id”, “gene_name”, “gene_biotype”). Values can be a single string or a list of strings to match.

Example

{'gene_name': 'BRCA1'}

{'gene_biotype': ['protein_coding', 'lincRNA']}

Return type:

GenomicRanges

Returns:

A GenomicRanges object containing gene coordinates and metadata.
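To make the filter semantics concrete, here is a minimal sketch of how a filter dictionary of this shape could be turned into a parameterized SQL `WHERE` clause. The helper `build_where`, the toy `gene` table and its rows are all hypothetical. The library's actual query construction may differ, for example in how it combines keys or validates column names.

```python
import sqlite3
from typing import Dict, List, Tuple, Union

Filter = Dict[str, Union[str, List[str]]]


def build_where(filter: Filter) -> Tuple[str, List[str]]:
    """Turn a filter dict into a WHERE clause plus bound parameters (hypothetical helper)."""
    clauses: List[str] = []
    params: List[str] = []
    for column, value in filter.items():
        # A single string is treated as a one-element list.
        values = [value] if isinstance(value, str) else list(value)
        placeholders = ", ".join("?" for _ in values)
        # Column names cannot be bound as parameters, so real code should
        # check them against a known set of columns before interpolating.
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return where, params


# Toy in-memory table standing in for an annotation database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (gene_id TEXT, gene_name TEXT, gene_biotype TEXT)")
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [
        ("G1", "BRCA1", "protein_coding"),
        ("G2", "XIST", "lincRNA"),
        ("G3", "RNU6", "snRNA"),
    ],
)

where, params = build_where({"gene_biotype": ["protein_coding", "lincRNA"]})
rows = conn.execute(
    "SELECT gene_name FROM gene" + where + " ORDER BY gene_name", params
).fetchall()
conn.close()
```

In this sketch, multiple keys are combined with `AND` and multiple values for one key with `IN`; whether the library follows the same convention is an assumption.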

property metadata: BiocFrame

Get database metadata.

transcripts(filter=None)[source]

Retrieve transcripts as GenomicRanges.

Parameters:

filter (Optional[Dict[str, Union[str, List[str]]]]) –

A dictionary defining filters to narrow down the result. Keys are column names (e.g., “tx_id”, “gene_id”, “tx_biotype”). Values can be a single string or a list of strings to match.

Columns from the gene table (like “gene_name”) can also be used as keys since the query performs a join.

Return type:

GenomicRanges

Returns:

A GenomicRanges object containing transcript coordinates and metadata.
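The note about gene-table columns being usable as filter keys follows from the join. The sketch below shows the idea on a toy schema: filtering transcripts by `gene_name` works because each transcript row is joined to its gene row. The table names `tx` and `gene`, the join column and the sample rows are assumptions for illustration, not a description of the actual database layout.

```python
import sqlite3

# Toy schema: one gene table and one transcript table linked by gene_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gene (gene_id TEXT PRIMARY KEY, gene_name TEXT);
    CREATE TABLE tx (tx_id TEXT PRIMARY KEY, gene_id TEXT, tx_biotype TEXT);
    INSERT INTO gene VALUES ('G1', 'BRCA1'), ('G2', 'TP53');
    INSERT INTO tx VALUES
        ('T1', 'G1', 'protein_coding'),
        ('T2', 'G1', 'retained_intron'),
        ('T3', 'G2', 'protein_coding');
    """
)

# The filter column lives in the gene table, yet it restricts transcript rows
# because every transcript is joined to its parent gene.
query = """
SELECT tx.tx_id
FROM tx
JOIN gene ON tx.gene_id = gene.gene_id
WHERE gene.gene_name = ?
ORDER BY tx.tx_id
"""
tx_ids = [row[0] for row in conn.execute(query, ("BRCA1",))]
conn.close()
```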

ensembldb.record module

class ensembldb.record.EnsDbRecord(ensdb_id, title, species, taxonomy_id, genome, description, url, release_date, ensembl_version=None)[source]

Bases: object

Container for a single EnsDb entry.

Frozen dataclass (init=True, repr=True, eq=True, order=False, frozen=True). All fields except ensembl_version, which defaults to None, are required.
__delattr__(name)

Implement delattr(self, name).

__eq__(other)

Return self==value.

__hash__()

Return hash(self).

__init__(ensdb_id, title, species, taxonomy_id, genome, description, url, release_date, ensembl_version=None)
__match_args__ = ('ensdb_id', 'title', 'species', 'taxonomy_id', 'genome', 'description', 'url', 'release_date', 'ensembl_version')
__repr__()

Return repr(self).

__setattr__(name, value)

Implement setattr(self, name, value).

description: str | None
ensdb_id: str
ensembl_version: str | None = None
classmethod from_db_row(row)[source]

Build a record from a database query row.

Return type:

EnsDbRecord

genome: str | None
release_date: date | None
species: str | None
taxonomy_id: str | None
title: str
url: str
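Because the record is a frozen dataclass, instances are immutable and hashable. The following sketch mirrors the documented fields in a hypothetical stand-in class, `Record`, with a `from_row` constructor. The row layout (the eight required fields in declaration order, with the date stored as an ISO string) is an assumption, and the real `from_db_row` may parse rows differently.

```python
from dataclasses import FrozenInstanceError, dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in mirroring the documented EnsDbRecord fields."""

    ensdb_id: str
    title: str
    species: Optional[str]
    taxonomy_id: Optional[str]
    genome: Optional[str]
    description: Optional[str]
    url: str
    release_date: Optional[date]
    ensembl_version: Optional[str] = None

    @classmethod
    def from_row(cls, row: Sequence[Optional[str]]) -> "Record":
        # Assumed layout: seven text fields followed by an ISO date string or None.
        *head, raw_date = row
        parsed = date.fromisoformat(raw_date) if raw_date else None
        return cls(*head, release_date=parsed)


rec = Record.from_row(
    (
        "example-id",
        "Example title",
        None,
        None,
        None,
        None,
        "https://example.org/db.sqlite",
        "2024-01-15",
    )
)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    rec.title = "changed"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Immutability means a record can be used safely as a dictionary key or set member, for instance when caching downloaded files per record.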

ensembldb.registry module

class ensembldb.registry.EnsDbRegistry(cache_dir=None, force=False)[source]

Bases: object

Registry for EnsDb resources.

__init__(cache_dir=None, force=False)[source]

Initialize the EnsDb registry.

Parameters:
  • cache_dir (Union[str, Path, None]) – Path to cache directory.

  • force (bool) – Force re-download of metadata.

download(ensdb_id, force=False)[source]

Return type:

str

get_record(ensdb_id)[source]

Return type:

EnsDbRecord

list_ensdbs()[source]

List available EnsDb IDs.

Return type:

List[str]

load_db(ensdb_id, force=False)[source]

Return type:

EnsDb
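A typical workflow with the registry is to check that an ID is available and then load it. The sketch below expresses this against any object exposing `list_ensdbs()` and `load_db()` with the signatures documented above. `RegistryLike`, `load_if_available` and `FakeRegistry` are hypothetical names introduced for illustration, and how the real registry reacts to an unknown ID is not stated here, so the explicit check is an assumption.

```python
from typing import Any, List, Protocol


class RegistryLike(Protocol):
    """Structural type for the two registry methods this sketch relies on."""

    def list_ensdbs(self) -> List[str]: ...

    def load_db(self, ensdb_id: str, force: bool = False) -> Any: ...


def load_if_available(registry: RegistryLike, ensdb_id: str) -> Any:
    """Load a database only if the registry lists its ID (hypothetical helper)."""
    available = registry.list_ensdbs()
    if ensdb_id not in available:
        raise KeyError(f"{ensdb_id!r} not in registry ({len(available)} IDs known)")
    return registry.load_db(ensdb_id)


class FakeRegistry:
    """In-memory stand-in so the sketch runs without network access."""

    def list_ensdbs(self) -> List[str]:
        return ["demo-1", "demo-2"]

    def load_db(self, ensdb_id: str, force: bool = False) -> str:
        return f"<db {ensdb_id}>"


handle = load_if_available(FakeRegistry(), "demo-1")
```

With the real package, an `EnsDbRegistry` instance would presumably take the place of `FakeRegistry`, and the returned object would be an `EnsDb`.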

Module contents