dammit.fileio package
Submodules
dammit.fileio.base module
dammit.fileio.base.convert_dtypes(df, dtypes)
Convert the columns of a DataFrame to the types specified in the given dictionary, in place.
Parameters:
- df (DataFrame) – The DataFrame to convert.
- dtypes (dict) – Dictionary mapping columns to types.
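As an illustration of the in-place conversion described above, here is a minimal sketch rather than dammit's actual implementation. The helper name `convert_dtypes_sketch` and the sample columns are hypothetical. It assumes that every dictionary key names an existing column and that pandas is installed.

```python
import pandas as pd


def convert_dtypes_sketch(df, dtypes):
    """Hypothetical stand-in: cast each listed column of df, mutating df."""
    for column, dtype in dtypes.items():
        # Assigning the cast column back modifies the caller's DataFrame.
        df[column] = df[column].astype(dtype)


# Sample data with numeric values stored as strings.
df = pd.DataFrame({"start": ["1", "5"], "score": ["0.5", "2.0"]})
convert_dtypes_sketch(df, {"start": int, "score": float})
```

Because the function returns nothing and works on the object it is given, callers keep using the same `df` afterwards.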
dammit.fileio.gff3 module
class dammit.fileio.gff3.GFF3Parser(filename, **kwargs)
Bases: dammit.fileio.base.ChunkParser
columns = [('seqid', <class 'str'>), ('source', <class 'str'>), ('type', <class 'str'>), ('start', <class 'int'>), ('end', <class 'int'>), ('score', <class 'float'>), ('strand', <class 'str'>), ('phase', <class 'float'>), ('attributes', <class 'str'>)]
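The column specification pairs each GFF3 field with a Python type. The sketch below shows one way such a list could drive per-field casting of a single tab-separated line, using only the standard library. The function `parse_gff3_line`, the sample line, and the treatment of `.` as a missing value are assumptions for illustration. The ChunkParser machinery itself is not documented in this reference and is not reproduced here.

```python
# Same names and types as GFF3Parser.columns, written as plain Python types.
GFF3_COLUMNS = [
    ("seqid", str), ("source", str), ("type", str),
    ("start", int), ("end", int), ("score", float),
    ("strand", str), ("phase", float), ("attributes", str),
]


def parse_gff3_line(line):
    """Hypothetical helper: split one GFF3 line and cast each field."""
    fields = line.rstrip("\n").split("\t")
    record = {}
    for (name, cast), raw in zip(GFF3_COLUMNS, fields):
        # Assumption: "." marks a missing value and is mapped to None.
        record[name] = None if raw == "." else cast(raw)
    return record


record = parse_gff3_line("tr1\tdammit\tgene\t1\t100\t.\t+\t.\tID=g1\n")
```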
class dammit.fileio.gff3.GFF3Writer(filename=None, converter=None, **converter_kwds)
Bases: object
static mangle_coordinates(gff3_df)
Although 1-based fully closed intervals are of the Beast, we will respect the convention in the interests of peace between worlds and compatibility. In other words, the coordinates are adjusted to the 1-based, fully closed interval convention that GFF3 uses.
Parameters: gff3_df (DataFrame) – The DataFrame to “fix”.
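The docstring does not spell out the input convention. Assuming the incoming DataFrame holds 0-based, half-open intervals, converting to 1-based, fully closed intervals only requires adding one to the start column, since `[start, end)` in 0-based terms covers the same bases as `[start + 1, end]` in 1-based terms. The helper `to_one_based_closed` below is a hypothetical sketch of that arithmetic, not the library's code, and it assumes pandas is installed.

```python
import pandas as pd


def to_one_based_closed(df):
    """Hypothetical helper: shift 0-based half-open starts, mutating df."""
    # Only the start moves; the exclusive 0-based end equals the
    # inclusive 1-based end.
    df["start"] += 1


gff3_df = pd.DataFrame({"start": [0, 9], "end": [10, 20]})
to_one_based_closed(gff3_df)
```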
version_line = '##gff-version 3.2.1'
write(data_df, version_line=True)
Write the given data to a GFF3 file, using the converter if given.
Generates an empty file if given an empty DataFrame.
Parameters: version_line (bool) – If True, write the GFF3 version line. Note that this will cause an existing file to be overwritten, but the version line will only be added in the first call to write.
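The behaviour described for `version_line` (overwrite and emit the header on the first call, then append afterwards) can be sketched with a small flag on the writer. The class `VersionedWriterSketch`, its `version_line_written` attribute, and the tuple-based rows are hypothetical. This is an illustration of the documented semantics under those assumptions, not GFF3Writer's actual code.

```python
import os
import tempfile


class VersionedWriterSketch:
    """Hypothetical writer: emits the version line only on the first write."""

    version_line = "##gff-version 3.2.1"

    def __init__(self, filename):
        self.filename = filename
        self.version_line_written = False

    def write(self, rows, version_line=True):
        # The first versioned write truncates the file; later writes append.
        first = version_line and not self.version_line_written
        mode = "w" if first else "a"
        with open(self.filename, mode) as handle:
            if first:
                handle.write(self.version_line + "\n")
                self.version_line_written = True
            for row in rows:
                handle.write("\t".join(str(v) for v in row) + "\n")


path = os.path.join(tempfile.mkdtemp(), "example.gff3")
writer = VersionedWriterSketch(path)
writer.write([("tr1", "dammit", "gene", 1, 100)])
writer.write([("tr2", "dammit", "gene", 5, 50)])
with open(path) as handle:
    lines = handle.read().splitlines()
```

Opening the file even when `rows` is empty also mirrors the documented behaviour of producing an empty file for an empty DataFrame.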
dammit.fileio.gff3.maf_to_gff3(maf_df, tag='', database='', ftype='translated_nucleotide_match')
Convert a MAF DataFrame to a GFF3 DataFrame ready to be written to disk.
Parameters:
- maf_df (pandas.DataFrame) – The MAF DataFrame. See dammit.fileio.maf.MafParser for column specs.
- tag (str) – Extra tag to add to the source column.
- database (str) – For the database entry in the attributes column.
- ftype (str) – The feature type; GMOD compliant if possible.
Returns: The GFF3 compliant DataFrame.
Return type: pandas.DataFrame
dammit.fileio.gff3.next_ID()
dammit.fileio.hmmer module
class dammit.fileio.hmmer.HMMerParser(filename, query_regex=None, query_basename='Transcript', **kwargs)
Bases: dammit.fileio.base.ChunkParser
columns = [('target_name', <class 'str'>), ('target_accession', <class 'str'>), ('tlen', <class 'int'>), ('query_name', <class 'str'>), ('query_accession', <class 'str'>), ('query_len', <class 'int'>), ('full_evalue', <class 'float'>), ('full_score', <class 'float'>), ('full_bias', <class 'float'>), ('domain_num', <class 'int'>), ('domain_total', <class 'int'>), ('domain_c_evalue', <class 'float'>), ('domain_i_evalue', <class 'float'>), ('domain_score', <class 'float'>), ('domain_bias', <class 'float'>), ('hmm_coord_from', <class 'int'>), ('hmm_coord_to', <class 'int'>), ('ali_coord_from', <class 'int'>), ('ali_coord_to', <class 'int'>), ('env_coord_from', <class 'int'>), ('env_coord_to', <class 'int'>), ('accuracy', <class 'float'>), ('description', <class 'str'>)]
dammit.fileio.infernal module
class dammit.fileio.infernal.InfernalParser(filename, **kwargs)
Bases: dammit.fileio.base.ChunkParser
columns = [('target_name', <class 'str'>), ('target_accession', <class 'str'>), ('query_name', <class 'str'>), ('query_accession', <class 'str'>), ('mdl', <class 'str'>), ('mdl_from', <class 'int'>), ('mdl_to', <class 'int'>), ('seq_from', <class 'int'>), ('seq_to', <class 'int'>), ('strand', <class 'str'>), ('trunc', <class 'str'>), ('pass', <class 'str'>), ('gc', <class 'float'>), ('bias', <class 'float'>), ('score', <class 'float'>), ('e_value', <class 'float'>), ('inc', <class 'str'>), ('description', <class 'str'>)]
dammit.fileio.maf module
class dammit.fileio.maf.MafParser(filename, aln_strings=False, chunksize=10000, **kwargs)
Bases: dammit.fileio.base.ChunkParser
columns = [('E', <class 'float'>), ('EG2', <class 'float'>), ('q_aln_len', <class 'int'>), ('q_len', <class 'int'>), ('q_name', <class 'str'>), ('q_start', <class 'int'>), ('q_strand', <class 'str'>), ('s_aln_len', <class 'int'>), ('s_len', <class 'int'>), ('s_name', <class 'str'>), ('s_start', <class 'int'>), ('s_strand', <class 'str'>), ('score', <class 'float'>), ('bitscore', <class 'float'>)]