transcript = Transcript(transcriptid,transcriptname="",strand = ".",chrom=None,exons = [], transcript_type=None,leftmost=None,rightmost=None,transcript_source=None,parsed=0,transcript_status=None)
Here, leftmost and rightmost are the leftmost and rightmost positions of the ORF on the genome, 0-based and half-open: [leftmost,rightmost). exons = [[s1,e1],[s2,e2],…], where each pair is also 0-based and half-open on the genome, e.g. [s1,e1).
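The half-open convention above can be illustrated with a small sketch. The helper names `exon_lengths` and `orf_within_exons` are hypothetical and are not part of this package; they only show how 0-based [start,end) intervals behave.

```python
def exon_lengths(exons):
    """Length of each exon given 0-based, half-open [start, end) pairs."""
    return [end - start for start, end in exons]


def orf_within_exons(exons, leftmost, rightmost):
    """Check that the ORF span [leftmost, rightmost) lies inside the exon span."""
    span_start = min(start for start, _ in exons)
    span_end = max(end for _, end in exons)
    return span_start <= leftmost < rightmost <= span_end


exons = [[100, 200], [300, 450]]
print(exon_lengths(exons))                # [100, 150]
print(orf_within_exons(exons, 150, 400))  # True
```

With half-open intervals the length is simply end - start, with no +1 correction.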
transcript.add_exon(start,end) # 0-based [start,end) on genome, start < end
transcript.parse_transcript()
After parsing, other information is available on the transcript instance.
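What parse_transcript() derives is not spelled out here. One kind of information that follows directly from exons, leftmost and rightmost is the coding part of each exon. The sketch below works under that assumption; `coding_segments` is a hypothetical helper, not the package's own code.

```python
def coding_segments(exons, leftmost, rightmost):
    """Clip each 0-based [start, end) exon to the ORF span [leftmost, rightmost)."""
    segments = []
    for start, end in sorted(exons):
        clipped_start = max(start, leftmost)
        clipped_end = min(end, rightmost)
        if clipped_start < clipped_end:  # keep only non-empty overlaps
            segments.append([clipped_start, clipped_end])
    return segments


print(coding_segments([[100, 200], [300, 450], [500, 600]], 150, 400))
# [[150, 200], [300, 400]]
```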
gene = Gene(geneid,genename="",gene_type=None,gene_source=None,gene_status=None)
gene.add_transcript(Transcript_instance) # returns the Transcript_instance
gene.get_transcript(tid,transcriptname="",strand = ".",chrom=None,exons = [], transcript_type=None,leftmost=None,rightmost=None,transcript_source=None,parsed=0,transcript_status=None) # if tid already exists in this gene, the transcript is fetched by id; if not, a Transcript instance is created with the optional parameters and added to this gene
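The get-or-create behaviour described for get_transcript can be sketched with a plain dict. `SimpleGene` and `SimpleTranscript` are hypothetical stand-ins written for this illustration, not the package's Gene and Transcript classes.

```python
class SimpleTranscript(object):
    """Hypothetical stand-in that only stores its id and optional parameters."""

    def __init__(self, transcriptid, **kwargs):
        self.transcriptid = transcriptid
        self.options = kwargs


class SimpleGene(object):
    """Hypothetical stand-in holding transcripts in a dict keyed by id."""

    def __init__(self, geneid):
        self.geneid = geneid
        self.transcripts = {}  # transcript id => transcript instance

    def get_transcript(self, tid, **kwargs):
        # Create and register the transcript only if the id is not known yet.
        if tid not in self.transcripts:
            self.transcripts[tid] = SimpleTranscript(tid, **kwargs)
        return self.transcripts[tid]


gene = SimpleGene("G1")
first = gene.get_transcript("T1", strand="+")
second = gene.get_transcript("T1")
print(first is second)  # True
```

In this sketch the optional parameters only take effect on the first call for a given id; later calls return the stored instance unchanged.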
for transcript in gene.transcripts: print transcript # gene.transcripts is a dict, so this loops over transcript ids
gene.parse_gene()
After parsing, the following attributes are available:
gene.geneid => gene id
gene.genename => gene name, official symbol
gene.gene_type => gene type
gene.gene_source => gene source
gene.gene_status => gene status
gene.transcripts => the gene's transcripts, a dict object, transcript id => transcript instance
gene.gene_start => leftmost position of the gene on the genome
gene.gene_stop => rightmost position of the gene on the genome
gene.chrom => chromosome the gene is located on
gene.strand => strand of the gene on the genome
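A gene's leftmost and rightmost positions can be derived from the exons of its transcripts. The sketch below assumes they are the minimum exon start and maximum exon end over all transcripts; whether parse_gene() computes them exactly this way is an assumption, and `gene_span` is a hypothetical helper.

```python
def gene_span(transcript_exons):
    """transcript_exons: dict of transcript id => list of 0-based [start, end) exons."""
    starts = [s for exons in transcript_exons.values() for s, _ in exons]
    ends = [e for exons in transcript_exons.values() for _, e in exons]
    return min(starts), max(ends)


print(gene_span({"T1": [[100, 200], [300, 450]], "T2": [[50, 120]]}))  # (50, 450)
```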
gene.togtf()
gene.torefgene()
Besides the above, we supply two useful functions to parse GTF files and refGene files.
annoregion2 = Annoregion2(fmt="gtf",gattr="gene_name",tattr="transcript_name",gidattr="gene_id",tidattr="transcript_id",genetypeattr="gene_type",transcripttypeattr="transcript_type",tanno="exon,CDS,UTR")
annoregion2.gtf2exons(gtf_filename)
# all genes are then in annoregion2.h, a dict that includes all genes: key is geneid, value is the corresponding Gene_instance
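To show what reading one GTF record into the 0-based convention used here involves, the following is a minimal sketch. It is not the implementation of gtf2exons; `parse_gtf_line` and `ATTR_PATTERN` are hypothetical names, and only quoted attribute values are collected.

```python
import re

# Matches attribute pairs such as: gene_id "G1";
ATTR_PATTERN = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gtf_line(line):
    """Parse one tab-separated GTF record into a dict with 0-based [start, end)."""
    fields = line.rstrip("\n").split("\t")
    return {
        "chrom": fields[0],
        "source": fields[1],
        "feature": fields[2],
        "start": int(fields[3]) - 1,  # GTF start is 1-based inclusive
        "end": int(fields[4]),        # inclusive 1-based end equals exclusive 0-based end
        "strand": fields[6],
        "attrs": dict(ATTR_PATTERN.findall(fields[8])),
    }


line = 'chr1\tHAVANA\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "ABC";'
record = parse_gtf_line(line)
print(record["start"])             # 100
print(record["attrs"]["gene_id"])  # G1
```

The attribute keys looked up afterwards would correspond to the gidattr, tidattr, gattr and tattr arguments shown above.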
hgene = readrefgene(refgene_filename) # returns a dict, {geneid => Gene_instance}
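The exact file layout readrefgene expects is not stated here. As a sketch, assuming the UCSC refGene table layout (a leading bin column, then name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, ...), one line maps onto the Transcript arguments roughly as follows. `parse_refgene_line` is a hypothetical helper, not the package's reader.

```python
def parse_refgene_line(line):
    """Parse one refGene row (UCSC layout assumed); coordinates are already 0-based [start, end)."""
    f = line.rstrip("\n").split("\t")
    # exonStarts and exonEnds are comma-separated and end with a trailing comma.
    starts = [int(x) for x in f[9].split(",") if x]
    ends = [int(x) for x in f[10].split(",") if x]
    return {
        "transcriptid": f[1],
        "chrom": f[2],
        "strand": f[3],
        "leftmost": int(f[6]),   # cdsStart
        "rightmost": int(f[7]),  # cdsEnd
        "exons": [[s, e] for s, e in zip(starts, ends)],
        "genename": f[12],       # name2
    }


row = "\t".join(["585", "NM_000001", "chr1", "+", "100", "450", "150", "400",
                 "2", "100,300,", "200,450,", "0", "ABC", "cmpl", "cmpl", "0,1,"])
print(parse_refgene_line(row)["exons"])  # [[100, 200], [300, 450]]
```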
We supply a script to output the hgene dict, where hgene is a dict of geneid ⇒ gene_instance.
gene2file(hgene,fmt="gtf",outputprefix = "test.") # fmt = 'gtf' or 'refgene'
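The lines that togtf() and gene2file write are not shown here. As a sketch of the coordinate conversion such output needs, the hypothetical helper `exon_to_gtf_line` below turns one 0-based [start,end) exon back into a 1-based, inclusive GTF record; the attribute set is reduced to gene_id and transcript_id for illustration.

```python
def exon_to_gtf_line(chrom, source, start, end, strand, gene_id, transcript_id):
    """Format one 0-based [start, end) exon as a tab-separated GTF exon record."""
    attrs = 'gene_id "%s"; transcript_id "%s";' % (gene_id, transcript_id)
    fields = [
        chrom, source, "exon",
        str(start + 1),  # back to 1-based inclusive start
        str(end),        # exclusive 0-based end equals inclusive 1-based end
        ".", strand, ".", attrs,
    ]
    return "\t".join(fields)


print(exon_to_gtf_line("chr1", "test", 100, 200, "+", "G1", "T1"))
```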