7.1.1.3.1.4. cobramod.parsing.kegg

Data parsing for KEGG

This module handles the retrieval of data from KEGG into a local directory. The following types of data can be downloaded:

  • Metabolite: Identifiers that start with the letter C, e.g. C00001

  • Reactions: Identifiers that start with the letter R, e.g. R00001. The gene information for reactions is also included if the species is specified

  • Module Pathways: Identifiers that start with the letter M, e.g. M00001
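The prefix convention in the list above can be checked with a few lines of Python. This is a minimal sketch and not part of cobramod: the function name `classify_kegg_identifier` and the returned labels are assumptions made for illustration.

```python
import re

# Hypothetical mapping from identifier prefix to the data types listed above.
_PREFIXES = {"C": "Metabolite", "R": "Reaction", "M": "Module Pathway"}

# Identifiers such as C00001 consist of one letter followed by five digits.
_PATTERN = re.compile(r"^([CRM])(\d{5})$")


def classify_kegg_identifier(identifier: str) -> str:
    """Return the data type suggested by the prefix of a KEGG identifier."""
    match = _PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"Unrecognized KEGG identifier: {identifier!r}")
    return _PREFIXES[match.group(1)]


for example in ("C00001", "R00001", "M00001"):
    print(example, "is a", classify_kegg_identifier(example))
```

Raising on unknown prefixes keeps the sketch strict; a caller could catch the `ValueError` and fall back to another data source.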

7.1.1.3.1.4.1. Attributes

7.1.1.3.1.4.2. Exceptions

WrongParserError

Simple Error that should be raised if a method cannot handle the parsing.

7.1.1.3.1.4.3. Functions

parse_metabolite_attributes(data, entry)

parse_reaction_attributes(data, entry, genome, ...)

data_from_string(raw)

Formats most of the keys for KEGG data and returns a dictionary.

build_references(data_dict)

Return a dictionary, where the keys are the names of cross-references and the values their identifiers.

ko_generator(identifier)

Yields the corresponding KO entries for the given identifier.

parse_ko_to_genes(string, reaction, genome)

Returns a list with the corresponding genes for the given genome.

retrieve_kegg_genes(directory, identifier)

Stores the genes for the given reaction in the given directory.

parse_genes(directory, identifier, genome)

From the given KEGG dictionary, returns a dictionary with the keys “genes” and “rule”.

get_graph(kegg_dict)

Returns a dictionary with sequences for a graph, where the key is the prior reaction and the value its successor.

parse_pathway_attributes(data, entry)

7.1.1.3.1.4.4. Module Contents

cobramod.parsing.kegg.debug_log

exception cobramod.parsing.kegg.WrongParserError

Bases: Exception

Simple Error that should be raised if a method cannot handle the parsing.

cobramod.parsing.kegg.parse_metabolite_attributes(data, entry)
Parameters:
Return type:

dict[str, Any]

cobramod.parsing.kegg.parse_reaction_attributes(data, entry, genome, gene_directory)
Parameters:
Return type:

dict[str, Any]

cobramod.parsing.kegg.data_from_string(raw)

Formats most of the keys for KEGG data and returns a dictionary.

Parameters:

raw (str)

Return type:

dict[str, list[str]]
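To illustrate the kind of transformation described here, the following sketch turns a KEGG flat-file record into a `dict[str, list[str]]`. It is not the cobramod implementation: the function name `flat_file_to_dict` is hypothetical, and the sketch assumes the usual KEGG flat-file layout in which the field name occupies the first 12 columns, continuation lines leave those columns blank, and `///` closes the record.

```python
def flat_file_to_dict(raw: str) -> dict[str, list[str]]:
    """Group the lines of a KEGG flat-file record by field name."""
    data: dict[str, list[str]] = {}
    current_key = None
    for line in raw.splitlines():
        if line.startswith("///"):  # end-of-record marker
            break
        if not line.strip():
            continue
        # Assumption: the first 12 columns hold the field name (or are blank).
        key, value = line[:12].strip(), line[12:].strip()
        if key:
            current_key = key
            data.setdefault(current_key, [])
        if current_key is not None and value:
            data[current_key].append(value)
    return data


# Abridged record for C00001, used only as sample input.
RAW = """\
ENTRY       C00001                      Compound
NAME        H2O;
            Water
FORMULA     H2O
DBLINKS     CAS: 7732-18-5
            PubChem: 3303
///
"""

record = flat_file_to_dict(RAW)
print(record["NAME"])
print(record["DBLINKS"])
```

Keeping every value as a list, even for single-line fields such as FORMULA, gives all keys the same shape, which matches the documented return type.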

cobramod.parsing.kegg.build_references(data_dict)

Return a dictionary, where the keys are the names of cross-references and the values their identifiers. If nothing is found, it will return None.

Parameters:

data_dict (dict[str, list[str]])

Return type:

dict[str, str]
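A mapping of this shape can be sketched as follows. The example assumes that the cross-references sit under a `DBLINKS` key as strings of the form `Name: identifier`, which is how KEGG flat files list them; the function name `references_from_dblinks` is hypothetical, and unlike the behaviour described above the sketch returns an empty dictionary rather than None when nothing is found.

```python
def references_from_dblinks(data_dict: dict[str, list[str]]) -> dict[str, str]:
    """Map each cross-reference name to its identifier."""
    references: dict[str, str] = {}
    # Assumption: cross-references are stored under the "DBLINKS" key.
    for line in data_dict.get("DBLINKS", []):
        name, separator, identifier = line.partition(":")
        if separator:  # skip lines without a "Name: identifier" shape
            references[name.strip()] = identifier.strip()
    return references


sample = {"DBLINKS": ["CAS: 7732-18-5", "PubChem: 3303"]}
print(references_from_dblinks(sample))
```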

cobramod.parsing.kegg.ko_generator(identifier)

Yields the corresponding KO entries for the given identifier. If they cannot be retrieved, it raises an HTTPError.

Parameters:

identifier (str)

Return type:

Generator[str, None, None]
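The generator return type can be illustrated without any network access. The sketch below assumes that the raw response is a block of tab-separated lines pairing the queried entry with a `ko:`-prefixed entry, which is the shape returned by the KEGG REST `link` operation; whether cobramod uses that operation is an assumption, and both the function name `iter_ko_entries` and the sample text are made up for illustration.

```python
from typing import Generator


def iter_ko_entries(response_text: str) -> Generator[str, None, None]:
    """Yield KO identifiers found in a tab-separated link response."""
    for line in response_text.splitlines():
        if not line.strip():
            continue
        # Each line is expected to look like "<source>\t<target>".
        _, _, target = line.partition("\t")
        if target.startswith("ko:"):
            yield target[len("ko:"):].strip()


# Illustrative text only, not an actual KEGG response.
SAMPLE_RESPONSE = "rn:R00001\tko:K00001\nrn:R00001\tko:K00002\n"

for ko in iter_ko_entries(SAMPLE_RESPONSE):
    print(ko)
```

Yielding entries one at a time lets the caller stop early or request further data per entry without building the full list first.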

cobramod.parsing.kegg.parse_ko_to_genes(string, reaction, genome)

Returns a list with the corresponding genes for the given genome. The string is the raw text that includes the gene information. If the abbreviation or the genome is not present, an empty list is returned.

Parameters:
  • string (str)

  • reaction (str)

  • genome (Optional[str])

Return type:

list[str]
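The lookup described above can be sketched as a scan over the raw text. This is an illustration under stated assumptions, not the cobramod code: it assumes the gene information follows the KEGG KO layout, where each organism line reads `ABBREVIATION: gene(name) gene(name) ...` with the abbreviation in upper case, and it ignores gene lists that wrap onto further lines. The function name `genes_for_genome` and the sample text are hypothetical.

```python
from typing import Optional


def genes_for_genome(raw: str, genome: Optional[str]) -> list[str]:
    """Return the gene identifiers listed for one genome abbreviation."""
    if not genome:
        return []
    prefix = genome.upper() + ":"
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped.startswith("GENES"):  # first line carries the field name
            stripped = stripped[len("GENES"):].strip()
        if stripped.startswith(prefix):
            entries = stripped[len(prefix):].split()
            # Drop the optional "(name)" suffix from each entry.
            return [entry.split("(")[0] for entry in entries]
    return []  # abbreviation not present in the text


# Made-up sample in the assumed layout.
SAMPLE_GENES = """\
GENES       ABC: 101(geneA) 102(geneB)
            XYZ: X1G00010 X1G00020(geneC)
"""

print(genes_for_genome(SAMPLE_GENES, "xyz"))
```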

cobramod.parsing.kegg.retrieve_kegg_genes(directory, identifier)

Stores the genes for the given reaction in the given directory. The function raises an HTTPError if nothing is found.

Parameters:

cobramod.parsing.kegg.parse_genes(directory, identifier, genome)

From the given KEGG dictionary, returns a dictionary with the key “genes”, which includes a dictionary with the identifier and name of each gene, and the key “rule” for the COBRApy representation of the gene-reaction rule.

Parameters:
Return type:

dict[str, Any]
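The shape of the returned dictionary can be sketched as below. The helper name `build_gene_entry` is hypothetical, and joining the genes with `or` is an assumption about how alternative genes for one reaction would be combined; COBRApy gene-reaction rules are boolean strings built from gene identifiers with `and` and `or`.

```python
from typing import Any


def build_gene_entry(genes: dict[str, str]) -> dict[str, Any]:
    """Bundle gene identifiers/names with a COBRApy-style rule string."""
    return {
        "genes": dict(genes),  # identifier -> name
        # Assumption: any one of the listed genes is sufficient ("or").
        "rule": " or ".join(genes),
    }


entry = build_gene_entry({"gene_1": "first gene", "gene_2": "second gene"})
print(entry["rule"])
```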

cobramod.parsing.kegg.get_graph(kegg_dict)

Returns a dictionary with sequences for a graph, where the key is the prior reaction and the value is its successor, and a set with all participant reactions (vertices).

Parameters:

kegg_dict (dict)

Return type:

dict
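The predecessor-to-successor structure can be illustrated with a small standalone sketch. It does not reproduce how cobramod reads `kegg_dict`: the input here is assumed to be a list of linear, ordered reaction sequences, the function name `successor_map` is hypothetical, and branching (one reaction with several successors) is not handled.

```python
from typing import Optional


def successor_map(
    sequences: list[list[str]],
) -> tuple[dict[str, Optional[str]], set[str]]:
    """Return (prior reaction -> successor, set of all reactions)."""
    graph: dict[str, Optional[str]] = {}
    vertices: set[str] = set()
    for sequence in sequences:
        vertices.update(sequence)
        # Pair every reaction with the one that follows it.
        for prior, successor in zip(sequence, sequence[1:]):
            graph[prior] = successor
        if sequence:
            # The last reaction has no successor unless another sequence adds one.
            graph.setdefault(sequence[-1], None)
    return graph, vertices


graph, vertices = successor_map([["R1", "R2", "R3"]])
print(graph)
print(sorted(vertices))
```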

cobramod.parsing.kegg.parse_pathway_attributes(data, entry)
Parameters:
Return type:

dict[str, Any]