genomicranges.io package

Submodules

genomicranges.io.gtf module

genomicranges.io.gtf.parse_gtf(path, compressed, skiprows=None, comment='#')

Read a GTF file as a pandas DataFrame.

Parameters:
  • path (str) – Path to the GTF file.

  • compressed (bool) – Whether the file is gzip compressed.

  • skiprows (Union[int, List[int], None]) – Rows to skip if the GTF file has a header.

  • comment (str) – Character indicating that the line should not be parsed. Defaults to “#”.

Returns:

A pandas DataFrame containing the annotations from the GTF file.
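To make the roles of compressed, skiprows and comment concrete, here is a minimal standard-library sketch of a GTF reader. It is not the library's implementation: the name parse_gtf_sketch, the column names in GTF_COLUMNS, and the pandas-like reading of skiprows (an int skips that many leading lines, a list skips those 0-based line numbers) are assumptions made for illustration. It returns a list of dicts rather than a DataFrame.

```python
import gzip
from typing import Dict, List, Optional, Union

# Conventional names for the nine tab-separated GTF fields (an assumption;
# the DataFrame returned by the library may label its columns differently).
GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
]


def parse_gtf_sketch(
    path: str,
    compressed: bool,
    skiprows: Optional[Union[int, List[int]]] = None,
    comment: str = "#",
) -> List[Dict[str, object]]:
    """Hypothetical stand-in for parse_gtf that returns a list of dicts."""
    # Normalise skiprows into a set of 0-based line numbers to ignore.
    if skiprows is None:
        skip = set()
    elif isinstance(skiprows, int):
        skip = set(range(skiprows))
    else:
        skip = set(skiprows)

    # gzip.open and open share the same text-mode interface.
    opener = gzip.open if compressed else open
    records = []
    with opener(path, "rt", encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            line = line.rstrip("\r\n")
            # Only lines that *start* with the comment string are dropped.
            if index in skip or not line or line.startswith(comment):
                continue
            fields = line.split("\t")
            if len(fields) != len(GTF_COLUMNS):
                raise ValueError(f"line {index + 1}: expected 9 fields, got {len(fields)}")
            record = dict(zip(GTF_COLUMNS, fields))
            record["start"] = int(record["start"])
            record["end"] = int(record["end"])
            records.append(record)
    return records
```

With the documented function, the corresponding call would look like parse_gtf("annotations.gtf.gz", compressed=True), where the file name is a placeholder.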

genomicranges.io.gtf.read_gtf(file, skiprows=None, comment='#')

Read a GTF file as a GenomicRanges object.

Parameters:
  • file (str) – Path to the GTF file.

  • skiprows (Union[int, List[int]]) – Rows to skip if the GTF file has a header.

  • comment (str) – Character indicating that the line should not be parsed. Defaults to “#”.

Return type:

GenomicRanges

Returns:

A GenomicRanges object with the annotations from the GTF file.
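Unlike parse_gtf, this signature has no compressed flag, so a caller working with both plain and gzipped files needs some way to tell them apart. How read_gtf itself decides is not stated here. The sketch below shows one common approach, checking the two-byte gzip magic number; the helper names looks_gzipped and open_gtf_text are hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(path: str) -> bool:
    """Return True if the file begins with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def open_gtf_text(path: str):
    """Open a GTF file for text reading, decompressing when needed."""
    if looks_gzipped(path):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")
```

Checking the content rather than the file extension also handles gzipped files that were saved without a .gz suffix.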

genomicranges.io.ucsc module

genomicranges.io.ucsc.access_gtf_ucsc(genome, type='refGene')

Generate the path to a genome GTF file hosted by UCSC, e.g. for the hg19 genome.

Parameters:
  • genome (str) – Genome shortcode, e.g. hg19, hg38 or mm10.

  • type (Literal['refGene', 'ensGene', 'knownGene', 'ncbiRefSeq']) – Gene annotation set to use. Defaults to “refGene”.

Raises:

Exception, ValueError – When type is not one of the valid options.

Return type:

str

Returns:

The URI to the file.
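As an illustration of what such a URI can look like, the sketch below validates the annotation type and assembles a download URL. The host name and directory layout follow the public UCSC goldenPath download area as an assumption; they are not taken from this page, and build_ucsc_gtf_url is a hypothetical helper rather than the library function.

```python
VALID_TYPES = ("refGene", "ensGene", "knownGene", "ncbiRefSeq")

# Assumed layout of the UCSC download server; verify before relying on it.
UCSC_BASE = "https://hgdownload.soe.ucsc.edu/goldenPath"


def build_ucsc_gtf_url(genome: str, annotation_type: str = "refGene") -> str:
    """Return a URL for a gzipped GTF file of the given genome and type."""
    if annotation_type not in VALID_TYPES:
        raise ValueError(
            f"type must be one of {VALID_TYPES}, got {annotation_type!r}"
        )
    return (
        f"{UCSC_BASE}/{genome}/bigZips/genes/"
        f"{genome}.{annotation_type}.gtf.gz"
    )


print(build_ucsc_gtf_url("hg19"))
```

Validating before building the string means a typo in the type is reported immediately instead of surfacing later as a failed download.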

genomicranges.io.ucsc.read_ucsc(genome, type='refGene')

Load a genome annotation from UCSC as a GenomicRanges object.

Parameters:
  • genome (str) – Genome shortcode, e.g. hg19, hg38 or mm10.

  • type (Literal['refGene', 'ensGene', 'knownGene', 'ncbiRefSeq']) – Gene annotation set to use. Defaults to “refGene”.

Return type:

GenomicRanges

Returns:

The gene model from UCSC.
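Loading an annotation from a remote server amounts to fetching a gzipped GTF file and parsing its lines. The sketch below covers only the fetching step with the standard library. It assumes the resource is gzip compressed and small enough to hold in memory; fetch_gtf_lines is a hypothetical helper and says nothing about how read_ucsc is implemented.

```python
import gzip
import urllib.request
from typing import List


def fetch_gtf_lines(url: str, comment: str = "#") -> List[str]:
    """Download a gzipped GTF file and return its non-comment lines."""
    # urlopen accepts http(s):// and file:// URLs alike.
    with urllib.request.urlopen(url) as response:
        raw = response.read()
    text = gzip.decompress(raw).decode("utf-8")
    # Drop blank lines and lines that start with the comment string.
    return [
        line for line in text.splitlines()
        if line and not line.startswith(comment)
    ]
```

Each returned line still holds the nine tab-separated GTF fields, so it can be split on tabs and converted into ranges in a later step.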

Module contents