1. Functions¶
[1]:
from IPython.display import display
from cobramod import __version__
from cobra import __version__ as cobra_version
print(f'CobraMod version: {__version__}')
print(f'COBRApy version: {cobra_version}')
# From Escher:
# This option turns off the warning message if you leave or refresh this page
import escher
escher.rc['never_ask_before_quit'] = True
Scaling...
A: min|aij| = 1.000e+00 max|aij| = 1.000e+00 ratio = 1.000e+00
Problem data seem to be well scaled
CobraMod version: 0.5.5-alpha.1
COBRApy version: 0.22.1
1.1. Retrieving metabolic pathway information¶
CobraMod can retrieve metabolic pathway information (metabolites, reactions or pathways) from various databases by using database-specific identifiers. It supports all databases from the BioCyc collection, Plant Metabolic Network (PMN), the KEGG database and the BiGG Models repository. Call cobramod.available_databases
to see all supported databases.
[2]:
from cobramod import available_databases
available_databases
[2]:
CobraMod supports Biocyc, the Plant Metabolic Network (PMN), KEGG and BiGG Models repository. Biocyc includes around 18.000 sub-databases and a complete list for BioCyc can be found at 'https://biocyc.org/biocyc-pgdb-list.shtml'. The database-specific identifiers can be found in the URL of the respective data. For instance, for "diphosphate" this is:
Database | URL with identifier (bold) |
BioCyc, sub-database ECOLI | https://biocyc.org/compound?orgid=ECOLI&id=PPI |
Plant Metabolic Network, sub-database CORN | https://pmn.plantcyc.org/compound?orgid=CORN&id=PPI |
KEGG | https://www.genome.jp/entry/C00013 |
BiGG Models Repository, universal model | http://bigg.ucsd.edu/universal/metabolites/ppi |
CobraMod uses abbreviations to represent the databases or sub-databases:
Database | Abbreviation |
BioCyc | META or identifier of sub-database e.g: ECOLI, ARA, GCF_000010885 |
Plant Metabolic Network | Prefix "pmn:" with the sub-database identifier, e.g pmn:PLANT, pmn:ARA, pmn:CORN |
KEGG | KEGG |
BiGG Models Repository | BIGG |
The user can download the metabolic pathway information using the cobramod.get_data
function. In this example we download information from the BioCyc sub-database YEAST.
[3]:
from cobramod import get_data
from pathlib import Path
dir_data = Path.cwd().resolve().joinpath("data")
identifiers = [
"CPD-14074",
"CPD-14075",
"CPD-14076",
"CPD-14553",
"CPD-15317",
"CPD-15322",
"CPD-15323",
]
for metabolite in identifiers:
get_data(
directory=dir_data,
identifier=metabolite,
database="YEAST"
)
The first argument in cobramod.get_data() is the system path where CobraMod stores the metabolic pathway information. CobraMod uses pathlib for path representation. The second argument is the data identifier used in the respective database. In this example we retrieve data from the BioCyc sub-database YEAST
. The last argument is the abbreviation of the database.
CobraMod creates a directory with the name of the database and stores the metabolic pathway information in it:
data
`-- YEAST
|-- CPD-14074.xml
|-- CPD-14075.xml
|-- CPD-14076.xml
|-- CPD-14553.xml
|-- CPD-15317.xml
|-- CPD-15322.xml
`-- CPD-15323.xml
1.2. Converting stored-data into COBRApy objects¶
CobraMod can convert metabolic pathway information (metabolites and reactions) into COBRApy objects (cobra.Reaction and cobra.Metabolite). It can thus be seamlessly integrated with a COBRApy workflow.
The function cobramod.create_object() creates COBRApy objects from metabolic pathway data retrieved by cobramod.get_data(). If no pathway information was downloaded cobramod.create_object() retrieves it automatically.
In this example, we convert the metabolite 2-Oxoglutarate with the KEGG identifier C00026 into a COBRApy object. CobraMod automatically identifies the KEGG entry as a metabolite and converts it into the corresponding COBRApy metabolite.
The first argument is the database-specific identifier (C00026
) followed by the database abbreviation (KEGG
). The third argument is the path representation for the directory of the metabolic pathway information. CobraMod downloads the data into this directory and utilize it instead of downloading it again. The last argument is the compartment of the reaction (c
for cytosol).
[4]:
from cobramod import create_object
from pathlib import Path
# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")
new_object = create_object(
identifier="C00026",
database="KEGG",
directory=dir_data,
compartment="c"
)
print(type(new_object))
new_object
<class 'cobra.core.metabolite.Metabolite'>
[4]:
Metabolite identifier | C00026_c |
Name | 2-Oxoglutarate; |
Memory address | 0x07f52c74e1d10 |
Formula | C5H6O5 |
Compartment | c |
In 0 reaction(s) |
In the second example, we convert the reaction RXN-11502 from the PMN sub-database CORN into a COBRApy reaction. The first argument is the database-specific identifier (RXN-11502
) followed by the database identifier (pmn:CORN
). CobraMod automatically takes reversibility and gene information from the database entry and adds it to the reaction object.
[5]:
new_object = create_object(
identifier="RXN-11501",
database="pmn:CORN",
directory=dir_data,
compartment="c"
)
print(type(new_object))
display(new_object)
new_object.genes
<class 'cobra.core.reaction.Reaction'>
Reaction identifier | RXN_11501_c |
Name | alkaline α- galactosidase |
Memory address | 0x07f52c74f33d0 |
Stoichiometry |
CPD_170_c + WATER_c --> ALPHA_D_GALACTOSE_c + CPD_1099_c stachyose + H2O --> alpha-D-galactopyranose + raffinose |
GPR | ZM00001D031300 or ZM00001D031303 or ZM00001D003279 |
Lower bound | 0 |
Upper bound | 1000 |
[5]:
frozenset({<Gene ZM00001D003279 at 0x7f52c74e1f90>,
<Gene ZM00001D031300 at 0x7f52c74e1b10>,
<Gene ZM00001D031303 at 0x7f52c74e1050>})
1.3. Adding metabolites¶
The function cobramod.add_metabolites() extends the COBRApy function model.add_metabolites() and can be used with a simple syntax. It can utilize a single string, a list of strings, a file path or a COBRAPy metabolite object. In the next examples we showcase these options. We use the E. coli core model from COBRApy
as test model. This core model can be found under cobramod.test.textbook
. If the argument obj
is used as a string, it can be the database-specific identifier of the respective metabolite and its compartment or a user-defined metabolite. It uses the following syntax:
SYNTAX
To add metabolite information from a database:
database-specific_identifier, compartment
To add user-curated metabolites:
user-curated_identifier, name, compartment, chemical_formula, molecular_charge
In the first example, we add the metabolite L-methionine with the MetaCyc identifier MET to the test model. The first argument is the model name. The argument obj
contains the identifier MET
and the compartment c
. The second argument is the database identifier (META
) and the third argument is the directory where CobraMod stores the metabolite information.
[6]:
from cobramod import add_metabolites
from cobramod.test import textbook
from pathlib import Path
# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")
# Using copy
test_model = textbook.copy()
add_metabolites(
model=test_model,
obj="MET, c",
database="META",
directory=dir_data,
)
print(type(test_model.metabolites.get_by_id("MET_c")))
test_model.metabolites.get_by_id("MET_c")
<class 'cobra.core.metabolite.Metabolite'>
[6]:
Metabolite identifier | MET_c |
Name | L-methionine |
Memory address | 0x07f52c6dcbf50 |
Formula | C5H11N1O2S1 |
Compartment | c |
In 0 reaction(s) |
In the second example, we add two metabolites (acetyl-coa and sucrose) from MetaCyc. In the argument obj
we define a list with the database-specific metabolite identifiers and the respecitive compartments. The rest of the arguments remains the same as in the previous example. Before adding a metabolite, CobraMod tests if the given metabolite is already present in the model
(potentially with a different identifier) by checking the cross-reference database entries in the metabolite’s metadata against the model’s metabolite entries. If CobraMod identifies a metabolite as already present in the model it skips adding the metabolite, used the already present one, and prints a warning.
[7]:
add_metabolites(
model=test_model,
obj=["ACETYL-COA, c", "SUCROSE, c"],
database="META",
directory=dir_data,
)
# Show metabolites in jupyter
display(test_model.metabolites.get_by_id("accoa_c"))
test_model.metabolites.get_by_id("SUCROSE_c")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ACETYL-COA' was found as 'accoa_c'. Please curate if necessary.
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1016: UserWarning: Metabolite "accoa_c" is already present in the model. Skipping addition.
warn(message=msg, category=UserWarning)
Metabolite identifier | accoa_c |
Name | Acetyl-CoA |
Memory address | 0x07f52c70bdd10 |
Formula | C23H34N7O17P3S |
Compartment | c |
In 7 reaction(s) | CS, PFL, MALS, Biomass_Ecoli_core, PTAr, PDH, ACALD |
[7]:
Metabolite identifier | SUCROSE_c |
Name | sucrose |
Memory address | 0x07f52c748a390 |
Formula | C12H22O11 |
Compartment | c |
In 0 reaction(s) |
In the third example, we use a text file to add metabolites to the test model. We use the file metabolites.txt in the current working directory with the following content:
SUCROSE, c
MET, c
MALTOSE_c, MALTOSE[c], c, C12H22O11, 1
In this example, CobraMod downloads the first two metabolites from MetaCyc, while MALTOSE_c
is a user-defined metabolite. The user can specify the path to this file in the obj
argument. The remaining arguments are the same as in the previous examples. We added two print statements to show that CobraMod adds the metabolites to the model.
[8]:
from cobramod.test import textbook_biocyc
# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")
# This is our file
file = dir_data.joinpath("metabolites.txt")
# Using a copy
test_model = textbook_biocyc.copy()
print(f'Number of metabolites prior addition: {len(test_model.metabolites)}')
# Using CobraMod
add_metabolites(
model=test_model,
obj=file,
directory=dir_data,
database="META",
)
print(f'Number of metabolites after addition: {len(test_model.metabolites)}')
# Show metabolites in jupyter
display(test_model.metabolites.get_by_id("MET_c"))
display(test_model.metabolites.get_by_id("SUCROSE_c"))
test_model.metabolites.get_by_id("MALTOSE_c")
Number of metabolites prior addition: 72
Number of metabolites after addition: 75
Metabolite identifier | MET_c |
Name | L-methionine |
Memory address | 0x07f52c6d00c90 |
Formula | C5H11N1O2S1 |
Compartment | c |
In 0 reaction(s) |
Metabolite identifier | SUCROSE_c |
Name | sucrose |
Memory address | 0x07f52c6d76c50 |
Formula | C12H22O11 |
Compartment | c |
In 0 reaction(s) |
[8]:
Metabolite identifier | MALTOSE_c |
Name | MALTOSE[c] |
Memory address | 0x07f52c6d008d0 |
Formula | C12H22O11 |
Compartment | c |
In 0 reaction(s) |
Since cobramod.add_metabolites() is an extension of the COBRApy function model.add_metabolites() the user can also utilize COBRApy metabolites. In the following example, we use a variation of the test model (textbook_biocyc
) which uses BioCyc metabolite identifiers. We copy a COBRApy metabolite from the test model
and add it to the BioCyc test model.
[9]:
from cobramod import add_metabolites
from cobramod.test import textbook, textbook_biocyc
# Copying Metabolite from original model
metabolite = textbook.metabolites.get_by_id("xu5p__D_c")
# Using a copy
test_model = textbook_biocyc.copy()
add_metabolites(
model=test_model,
obj=metabolite
)
test_model.metabolites.get_by_id("xu5p__D_c")
[9]:
Metabolite identifier | xu5p__D_c |
Name | D-Xylulose 5-phosphate |
Memory address | 0x07f52c728a3d0 |
Formula | C5H9O8P |
Compartment | c |
In 3 reaction(s) | RPE, TKT1, TKT2 |
If CobraMod detects large molecules (e.g. enzymes) or if the metabolite information does not include a chemical formula the user receives a warning. In this example, we use the enzyme with the MetaCyc identifier Red-NADPH-Hemoprotein-Reductases
and add it to the test model. CobraMod raises a warning due to the missing chemical formula.
[10]:
# Using a copy
test_model = textbook.copy()
add_metabolites(
model=test_model,
obj="Red-NADPH-Hemoprotein-Reductases, c",
directory=dir_data,
database="META",
)
test_model.metabolites.get_by_id("Red_NADPH_Hemoprotein_Reductases_c")
/home/stefano/Documents/cobramod/src/cobramod/parsing/biocyc.py:110: UserWarning: Sum formula for the metabolite "Red-NADPH-Hemoprotein-Reductases" from BioCyc could not be found. Formula set to "X" and charge to 0. Please curate.
warn(msg)
[10]:
Metabolite identifier | Red_NADPH_Hemoprotein_Reductases_c |
Name | Red-NADPH-Hemoprotein-Reductases |
Memory address | 0x07f52c6d00f90 |
Formula | X |
Compartment | c |
In 0 reaction(s) |
NOTES
CobraMod replaces hyphens (
-
) with underscores (_
) in the identifiers when creating COBRApy metabolites.When adding several metabolites the user can only specify one database identifier (It is not possible to use two databases within the same function call) or alternatively can call the function twice.
CobraMod saves the curation process to the file
debug.log
.
1.4. Adding reactions¶
The function cobramod.add_reactions() extends the COBRApy function model.add_reactions() and can be used with a simple syntax. It can utilize a single string, a list of string, a file path or a COBRApy reaction object. In the next examples we showcase these options. Again, we use the E. coli core model from COBRApy as test model.
If the argument obj
is used as a string, it can be the database-specific identifier of the respective reaction and its compartment or a user-defined reaction. It uses the following syntax:
SYNTAX
To add reaction information from a database:
database-specific_identifier, compartment
To add user-defined reactions, the user can specify the identifier and name of the reaction following the COBRApy reaction string syntax:
user-curated_identifier, name | coefficient_1 metabolite_1 <-> coefficient_2 metabolite_2
Metabolites must contain a suffix which specifies the compartment. This is given by an underscore (_
) followed by the compartment-abbreviation. In this example, we create a transport reaction for oxygen between the external comparment (e
) and the cytosol (c
):
TRANS_H2O_ec, Oxygen Transport | 2 OXYGEN-MOLECULE_e <-> 2 OXYGEN_MOLECULE_c
In the first example, we add the BIGG reaction AKGDH to a test model. This time we use a text model which is a variaton of the original core model and uses KEGG identifiers (textbook_kegg
). The first argument is the model name. The obj
argument takes the identifier AKGDH
and the compartment c
. The next argument is the database identifier (BIGG
) and the last argument is the directory where CobraMod stores and retrieves the
metabolic pathway information from. The argument model_id
is a BIGG-specific argument. Please read the notes below for more information.
Because the test model uses KEGG identifiers, CobraMod finds the reaction AKGDH
with the KEGG identifier R08549_c
.
[11]:
from cobramod.test import textbook_kegg
from cobramod import add_reactions
from pathlib import Path
dir_data = Path.cwd().resolve().joinpath("data")
# Using copy
test_model = textbook_kegg.copy()
add_reactions(
model=test_model,
obj="AKGDH, c",
database="BIGG",
directory=dir_data,
model_id="e_coli_core"
)
display(test_model.reactions.get_by_id("R08549_c"))
/home/stefano/Documents/cobramod/src/cobramod/parsing/bigg.py:215: UserWarning: Reaction "AKGDH" from BIGG set to reversible. Please modify if necessary.
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:329: UserWarning: Reaction 'AKGDH' was found as 'R08549_c'. Please curate if necessary
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1128: UserWarning: Reaction "R08549_c" is already present in the model. Skipping addition.
warn(message=msg, category=UserWarning)
Reaction identifier | R08549_c |
Name | 2-Oxogluterate dehydrogenase |
Memory address | 0x07f52c6bd4450 |
Stoichiometry |
C00003_c + C00010_c + C00026_c --> C00004_c + C00011_c + C00091_c Nicotinamide adenine dinucleotide + Coenzyme A + 2-Oxoglutarate --> Nicotinamide adenine dinucleotide - reduced + CO2 + Succinyl-CoA |
GPR | b0726 and b0116 and b0727 |
Lower bound | 0.0 |
Upper bound | 1000.0 |
In the second example, we add two reactions (R00315 and R02736) from KEGG. The argument obj
is a list with the database-specific identifiers and the respective compartments of the reactions. The majority of the arguments remains the same as in the previous example. The argument genome
is a KEGG-specific argument. Please read the notes below for more information.
CobraMod parses the reaction information for gene identifiers and automatically adds them to the COBRApy reaction. In this example, CobraMod creates the genes 2296
, b3115
and b1849
, and adds it to the first reaction R00315
.
[12]:
add_reactions(
model=test_model,
obj=["R00315, c", "R02736 ,c"],
directory=dir_data,
database="KEGG",
genome="ecc"
)
display(test_model.reactions.get_by_id("R00315_c"))
print(test_model.reactions.get_by_id("R00315_c").genes)
test_model.reactions.get_by_id("R02736_c")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1128: UserWarning: Reaction "R00315_c" is already present in the model. Skipping addition.
warn(message=msg, category=UserWarning)
Reaction identifier | R00315_c |
Name | acetate kinase |
Memory address | 0x07f52c6bd42d0 |
Stoichiometry |
C00002_c + C00033_c <=> C00227_c + G11113_c ATP + Acetate <=> Acetyl phosphate + ADP |
GPR | b2296 or b3115 or b1849 |
Lower bound | -1000.0 |
Upper bound | 1000.0 |
frozenset({<Gene b1849 at 0x7f52c6c76810>, <Gene b2296 at 0x7f52c6c76890>, <Gene b3115 at 0x7f52c6c767d0>})
[12]:
Reaction identifier | R02736_c |
Name | beta-D-glucose-6-phosphate:NADP+ 1-oxoreductase |
Memory address | 0x07f52c4b61310 |
Stoichiometry |
C00006_c + C01172_c --> C00005_c + C00080_c + C01236_c Nicotinamide adenine dinucleotide phosphate + beta-D-Glucose 6-phosphate --> Nicotinamide adenine dinucleotide phosphate - reduced + H+ + 6-phospho-D-glucono-1,5-lactone |
GPR | c2265 |
Lower bound | 0 |
Upper bound | 1000 |
In the following example, we use a text file to add reactions to the test model. To this end, we use the file reactions.txt in the current working directory. This file has the following content:
R04382, c
R02736, c
C06118_ce, digalacturonate transport | 1 C06118_c <-> 1 C06118_e
CobraMod downloads the first two reactions from KEGG. C06118_ce
is a user-defined reaction. The user can specify the file path in the obj
argument. The next arguments are the same as in the previous examples. We added two print statements to show that CobraMod adds the reaction to the model.
[13]:
from cobramod.test import textbook_kegg
from cobramod import add_reactions
from pathlib import Path
dir_data = Path.cwd().resolve().joinpath("data")
test_model = textbook_kegg.copy()
# This is the file with text
file = dir_data.joinpath("reactions.txt")
print(f'Number of reactions prior addition: {len(test_model.reactions)}')
add_reactions(
model=test_model,
obj=file,
directory=dir_data,
database="KEGG",
genome="ecc"
)
print(f'Number of reactions after addition: {len(test_model.reactions)}')
# Show in jupyter
display(test_model.reactions.get_by_id("R04382_c"))
display(test_model.reactions.get_by_id("R02736_c"))
test_model.reactions.get_by_id("C06118_ce")
Number of reactions prior addition: 95
Number of reactions after addition: 98
Reaction identifier | R04382_c |
Name | 4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase |
Memory address | 0x07f52c6be34d0 |
Stoichiometry |
C06118_c <=> 2.0 C04053_c 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate; |
GPR | c0319 |
Lower bound | -1000 |
Upper bound | 1000 |
Reaction identifier | R02736_c |
Name | beta-D-glucose-6-phosphate:NADP+ 1-oxoreductase |
Memory address | 0x07f52c7067510 |
Stoichiometry |
C00006_c + C01172_c --> C00005_c + C00080_c + C01236_c Nicotinamide adenine dinucleotide phosphate + beta-D-Glucose 6-phosphate --> Nicotinamide adenine dinucleotide phosphate - reduced + H+ + 6-phospho-D-glucono-1,5-lactone |
GPR | c2265 |
Lower bound | 0 |
Upper bound | 1000 |
[13]:
Reaction identifier | C06118_ce |
Name | digalacturonate transport |
Memory address | 0x07f52c717b250 |
Stoichiometry |
C06118_c <=> C06118_e 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; |
GPR | |
Lower bound | -1000 |
Upper bound | 1000 |
Since cobramod.add_reactions() is an extension of the original COBRApy function model.add_reactions() the user can also utilize COBRApy reactions. In this example, we use the test model (textbook_kegg
). We copy a COBRApy reaction from the test model and add it to the KEGG test model.
[14]:
from cobramod.test import textbook_kegg, textbook
from cobramod import add_reactions
from pathlib import Path
# Using copy of test model
test_model = textbook_kegg.copy()
# Obtaining a reaction
reaction = textbook.reactions.get_by_id("ACALDt")
add_reactions(model=test_model, obj=reaction)
test_model.reactions.get_by_id("ACALDt")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1128: UserWarning: Reaction "ACALDt" is already present in the model. Skipping addition.
warn(message=msg, category=UserWarning)
[14]:
Reaction identifier | ACALDt |
Name | R acetaldehyde reversible - transport |
Memory address | 0x07f52c4a75cd0 |
Stoichiometry |
C00084_e <=> C00084_c Acetaldehyde <=> Acetaldehyde |
GPR | s0001 |
Lower bound | -1000.0 |
Upper bound | 1000.0 |
By default, COBRApy ignores metabolites that appear on both sides of a reaction equation. CobraMod identifies such reactions and assigns one of these metabolites to the extracellular compartment and raises a warning suggesting manual curation. In the following example, we add a transport reaction for acetic acid from BioCyc sub-database YEAST
to the test model.
[15]:
test_model = textbook_kegg.copy()
add_reactions(
model=test_model,
obj="TRANS-RXN-455, c",
database="YEAST",
directory=dir_data,
)
# Show in jupyter
test_model.reactions.get_by_id("TRANS_RXN_455_c")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:376: UserWarning: Reaction "TRANS_RXN_455_c" has metabolite "CPD_24335_e" on both sides of the equation (e.g transport reaction). COBRApy ignores these metabolites. To avoid this, by default, CobraMod will assign one metabolite to the extracellular compartment. Please curate the reaction if necessary.
warn(message=msg, category=UserWarning)
[15]:
Reaction identifier | TRANS_RXN_455_c |
Name | acetic acid uptake |
Memory address | 0x07f52c49e17d0 |
Stoichiometry |
CPD_24335_e --> CPD_24335_c acetic+acid --> acetic+acid |
GPR | G3O-32144 |
Lower bound | 0 |
Upper bound | 1000 |
NOTES
CobraMod replaces hyphens (
-
) to underscores (_
) in the identifiers when creating COBRApy reactions.When adding several reactions the user can only specify one database identifier (It is not possible to use two databases within the same function call) or alternatively can call the function twice.
CobraMod tries to identify if a given metabolites or reactions are already present in the model and used those instead of adding redundant entries.
CobraMod saves the curation process to the file
debug.log
.The argument
model_id
is specific for the BiGG Models repository. CobraMod can download metabolite and reaction information directly from the BIGG models. A list of all models can be found here.The argument
genome
can be used with the databaseKEGG
and specifies the genome for which gene information will be retrieved. The complete list of all available genomes can be found here. If no genome is specified, no gene information will be retrieved and a warning is printed as shown below:
[16]:
test_model = textbook_kegg.copy()
add_reactions(
model=test_model,
obj="R04382, c",
database="KEGG",
directory=dir_data,
)
test_model.reactions.get_by_id("R04382_c")
/home/stefano/Documents/cobramod/src/cobramod/parsing/kegg.py:196: UserWarning: Nothing was specified in argument "genome". Reaction "R04382" will not include genes. Please modify if necessary.
warn(message=msg, category=UserWarning)
[16]:
Reaction identifier | R04382_c |
Name | 4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase |
Memory address | 0x07f52c6c6db10 |
Stoichiometry |
C06118_c <=> 2.0 C04053_c 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate; |
GPR | |
Lower bound | -1000 |
Upper bound | 1000 |
1.5. Adding pathways¶
CobraMod can add metabolic pathways to a given model. The function cobramod.add_pathway() takes as an argument either a sequence of database-specific reaction identifiers or a single pathway identifier. CobraMod download the pathway information from the databases, creates the COBRApy objects and prints a short summary about the curation process when adding the respective pathway. Additionally, to ensure reproducibility and for tracking the curation process, the user can write changes to file. CobraMod creates the object cobramod.Pathway which contains the pathway information.
When adding serveral pathways, we recommend adding pathways one-by-one to perform neccessary curation steps inbetween. In the examples below we showcase two options. Again, we use the E. coli core model from COBRApy as test model.
In the first example, we add the acetoacetate degradation pathway from the BioCyc sub-database ECOLI
to the test model. This pathway contains two reactions and six metabolites.
In this example, the first argument is the model to extend. The pathway
argument uses the database-specific identifier ACETOACETATE-DEG-PWY
and the database identifier ECOLI
. We define the compartment as c
(cytosol), i.e. all metabolites and reactions will be assigned to this compartment.
The argument filename
specifies a file to which a summary of the changes is written. Additionally, add_pathway() prints a summary of the changes to the model. Calling the respective cobramod.Pathway outputs a table with the main attributes of the object.
All COBRApy reactions of the pathway are tested for their capacity to cary a non-zero flux. Read more about it in the non-zero flux test section.
[17]:
from pathlib import Path
from cobramod import add_pathway
from cobramod.test import textbook
# Defining directory
dir_data = Path.cwd().resolve().joinpath("data")
# Using copy of test model
test_model = textbook.copy()
add_pathway(
model=test_model,
pathway="ACETOACETATE-DEG-PWY",
database="ECOLI",
compartment="c",
filename="summary.txt",
directory=dir_data,
)
# Display in jupyter
test_model.groups.get_by_id("ACETOACETATE-DEG-PWY")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ACET' was found as 'ac_c'. Please curate if necessary.
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'CO-A' was found as 'coa_c'. Please curate if necessary.
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/extension.py:573: UserWarning: Auxiliary sink reaction for "SK_3_KETOBUTYRATE_c" created. Consider removing it and adding the synthesis reactions for the metabolite.
warn(message=msg, category=UserWarning)
Number of new | removed entities in
*=====================|===================*
Reactions 2 | 0
Metabolites 2 | 0
Exchange 0 | 0
Demand 0 | 0
Sinks 1 | 0
Genes 4 | 0
Groups 1 | 0
[17]:
Pathway identifier | ACETOACETATE-DEG-PWY |
Name | |
Memory address | 0x0139993464354896 |
Reactions involved | ACETOACETYL_COA_TRANSFER_RXN_c, ACETYL_COA_ACETYLTRANSFER_RXN_c |
Genes involved | EG12432, EG11670, EG11669, EG11672 |
Visualization attributes |
|
Below is an example of the summary in form of a text file. The first part lists names of all reactions, metabolites, exchange reactions, auxiliary demand and sink reactions, genes, and groups in the model. The second part of the summary lists all elements that were added or removed by the function call add_pathway()
.
[18]:
%cat summary.txt
Summary:
Model identifier: e_coli_core
Model name:
Reactions:
['ACALD', 'ACALDt', 'ACKr', 'ACONTa', 'ACONTb', 'ACt2r', 'ADK1', 'AKGDH', 'AKGt2r', 'ALCD2x', 'ATPM', 'ATPS4r', 'Biomass_Ecoli_core', 'CO2t', 'CS', 'CYTBD', 'D_LACt2', 'ENO', 'ETOHt2r', 'FBA', 'FBP', 'FORt2', 'FORti', 'FRD7', 'FRUpts2', 'FUM', 'FUMt2_2', 'G6PDH2r', 'GAPD', 'GLCpts', 'GLNS', 'GLNabc', 'GLUDy', 'GLUN', 'GLUSy', 'GLUt2r', 'GND', 'H2Ot', 'ICDHyr', 'ICL', 'LDH_D', 'MALS', 'MALt2_2', 'MDH', 'ME1', 'ME2', 'NADH16', 'NADTRHD', 'NH4t', 'O2t', 'PDH', 'PFK', 'PFL', 'PGI', 'PGK', 'PGL', 'PGM', 'PIt2r', 'PPC', 'PPCK', 'PPS', 'PTAr', 'PYK', 'PYRt2', 'RPE', 'RPI', 'SUCCt2_2', 'SUCCt3', 'SUCDi', 'SUCOAS', 'TALA', 'THD2', 'TKT1', 'TKT2', 'TPI', 'ACETOACETYL_COA_TRANSFER_RXN_c', 'ACETYL_COA_ACETYLTRANSFER_RXN_c']
Metabolites:
['13dpg_c', '2pg_c', '3pg_c', '6pgc_c', '6pgl_c', 'ac_c', 'ac_e', 'acald_c', 'acald_e', 'accoa_c', 'acon_C_c', 'actp_c', 'adp_c', 'akg_c', 'akg_e', 'amp_c', 'atp_c', 'cit_c', 'co2_c', 'co2_e', 'coa_c', 'dhap_c', 'e4p_c', 'etoh_c', 'etoh_e', 'f6p_c', 'fdp_c', 'for_c', 'for_e', 'fru_e', 'fum_c', 'fum_e', 'g3p_c', 'g6p_c', 'glc__D_e', 'gln__L_c', 'gln__L_e', 'glu__L_c', 'glu__L_e', 'glx_c', 'h2o_c', 'h2o_e', 'h_c', 'h_e', 'icit_c', 'lac__D_c', 'lac__D_e', 'mal__L_c', 'mal__L_e', 'nad_c', 'nadh_c', 'nadp_c', 'nadph_c', 'nh4_c', 'nh4_e', 'o2_c', 'o2_e', 'oaa_c', 'pep_c', 'pi_c', 'pi_e', 'pyr_c', 'pyr_e', 'q8_c', 'q8h2_c', 'r5p_c', 'ru5p__D_c', 's7p_c', 'succ_c', 'succ_e', 'succoa_c', 'xu5p__D_c', '3_KETOBUTYRATE_c', 'ACETOACETYL_COA_c']
Exchange:
['EX_ac_e', 'EX_acald_e', 'EX_akg_e', 'EX_co2_e', 'EX_etoh_e', 'EX_for_e', 'EX_fru_e', 'EX_fum_e', 'EX_glc__D_e', 'EX_gln__L_e', 'EX_glu__L_e', 'EX_h_e', 'EX_h2o_e', 'EX_lac__D_e', 'EX_mal__L_e', 'EX_nh4_e', 'EX_o2_e', 'EX_pi_e', 'EX_pyr_e', 'EX_succ_e']
Demand:
[]
Sinks:
['SK_3_KETOBUTYRATE_c']
Genes:
['b1241', 'b0351', 's0001', 'b3115', 'b1849', 'b2296', 'b1276', 'b0118', 'b0474', 'b0116', 'b0726', 'b0727', 'b2587', 'b0356', 'b1478', 'b3735', 'b3733', 'b3734', 'b3732', 'b3736', 'b3738', 'b3731', 'b3737', 'b3739', 'b0720', 'b0733', 'b0979', 'b0978', 'b0734', 'b2975', 'b3603', 'b2779', 'b1773', 'b2925', 'b2097', 'b4232', 'b3925', 'b0904', 'b2492', 'b4154', 'b4152', 'b4153', 'b4151', 'b1819', 'b1817', 'b2415', 'b1818', 'b2416', 'b1611', 'b4122', 'b1612', 'b3528', 'b1852', 'b1779', 'b1621', 'b1101', 'b2417', 'b3870', 'b1297', 'b0809', 'b0810', 'b0811', 'b1761', 'b1524', 'b1812', 'b0485', 'b3213', 'b3212', 'b4077', 'b2029', 'b0875', 'b1136', 'b4015', 'b2133', 'b1380', 'b4014', 'b2976', 'b3236', 'b1479', 'b2463', 'b2286', 'b2279', 'b2276', 'b2280', 'b2288', 'b2284', 'b2287', 'b2281', 'b2277', 'b2285', 'b2282', 'b2278', 'b2283', 'b3962', 'b1602', 'b1603', 'b0451', 'b0114', 'b0115', 'b3916', 'b1723', 'b0902', 'b0903', 'b2579', 'b3114', 'b3952', 'b3951', 'b4025', 'b2926', 'b0767', 'b0755', 'b3612', 'b4395', 'b3493', 'b2987', 'b3956', 'b3403', 'b1702', 'b2297', 'b2458', 'b1676', 'b1854', 'b4301', 'b3386', 'b2914', 'b4090', 'b0722', 'b0724', 'b0721', 'b0723', 'b0728', 'b0729', 'b2464', 'b0008', 'b2935', 'b2465', 'b3919', 'EG12432', 'EG11670', 'EG11669', 'EG11672']
Groups:
['ACETOACETATE-DEG-PWY']
New:
Reactions:
['ACETOACETYL_COA_TRANSFER_RXN_c', 'ACETYL_COA_ACETYLTRANSFER_RXN_c']
Metabolites:
['3_KETOBUTYRATE_c', 'ACETOACETYL_COA_c']
Exchange:
[]
Demand:
[]
Sinks:
['SK_3_KETOBUTYRATE_c']
Genes:
['EG12432', 'EG11670', 'EG11669', 'EG11672']
Groups:
['ACETOACETATE-DEG-PWY']
Removed:
Reactions:
[]
Metabolites:
[]
Exchange:
[]
Demand:
[]
Sinks:
[]
Genes:
[]
Groups:
[]
In the next example, we use a list of database-specific reaction identifiers as pathway
argument. We use the database identifier ECOLI
and the compartment c
(cytosol). Additionally, we define a pathway name by using the argument group
. The user can also use this argument to merge pathways by using the same group names.
[19]:
from pathlib import Path
from cobramod import add_pathway
from cobramod.test import textbook_biocyc
# Defining directory
dir_data = Path.cwd().resolve().joinpath("data")
test_model = textbook_biocyc.copy()
# Defining database-specific identifiers
sequence = ["PEPDEPHOS-RXN", "PYRUVFORMLY-RXN", "FHLMULTI-RXN"]
print(f'Number of reaction prior addition: {len(test_model.reactions)}')
add_pathway(
model=test_model,
pathway=sequence,
directory=dir_data,
database="ECOLI",
compartment="c",
group="curated_pathway"
)
print(f'Number of reactions after addition: {len(test_model.reactions)}')
# Display in jupyter
test_model.groups.get_by_id("curated_pathway")
Number of reaction prior addition: 95
Number of new | removed entities in
*=====================|===================*
Reactions 3 | 0
Metabolites 2 | 0
Exchange 0 | 0
Demand 0 | 0
Sinks 1 | 0
Genes 11 | 0
Groups 1 | 0
Number of reactions after addition: 99
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ATP' was found as 'ATP_c'. Please curate if necessary.
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ADP' was found as 'ADP_c'. Please curate if necessary.
warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/extension.py:685: UserWarning: Auxiliary sink reaction for "SK_HYDROGEN_MOLECULE_c" created. Consider removing it and adding the synthesis reactions for the metabolite.
warn(message=msg, category=UserWarning)
[19]:
Pathway identifier | curated_pathway |
Name | |
Memory address | 0x0139993460803984 |
Reactions involved | PEPDEPHOS_RXN_c, PYRUVFORMLY_RXN_c, FHLMULTI_RXN_c |
Genes involved | EG10803, EG10804, G7627, EG10701, EG10479, EG10480, EG10476, EG10478, EG10477, EG10475, EG10285 |
Visualization attributes |
|
NOTES
A pathway is a set of COBRApy reactions. All the notes listed for
add_metabolites()
andadd_reactions()
also apply to pathways, i. e., handling of duplicate elements, transport reactions and the argumentgenome
for KEGG.CobraMod saves the curation process to the file
debug.log
.
1.6. Non-zero flux test¶
When calling the function add_pathway()
, CobraMod tests each reaction of the Pathway
object for its capability to carry a non-zero flux, i.e., if the involved metabolites can be turned over. Additionally, the user can test individual COBRApy reactions for their capability to cary a non-zero flux by using the function cobramod.test_non_zero_flux().
When running these functions, CobraMod creates auxillary demand reactions for the reaction’s metabolites to enable a flux through the tested reaction. Note that demand reactions enable an efflux of the respective metabolite from the system but no uptake. If the test initially fails, auxiliary sink reactions are added to the model and CobraMod raises an error suggesting manual curation steps. These auxillary sink reactions enable an influx of the respective metabolite to the system. Based on this information the user can then manually curate the model and if necessary add the missing synthesis reactions for the respective metabolites. Otherwise, if a reaction passes the test, no message is printed and the auxilary reactions are removed from the model.
The user can also use the argument ignore_list
to specify metabolites for which no auxiliary sink reactions should be created. This can be for example ubiquitous metabololites such as water, ATP or ADP.
In the following artifical example (E. coli is a prokaryote without compartmentation), we introduce the glutathione synthase reaction (GLUTATHIONE-SYN-RXN
) into a new comparment (x
) and test the reaction for its capability to carry a non-zero flux. This reaction has the following equation:
ATP_x + GLY_x + L_GAMMA_GLUTAMYLCYSTEINE_x --> ADP_x + GLUTATHIONE_x + PROTON_x + Pi_x
Because we are testing a reaction in a new, empty compartment, CobraMod creates auxiliary demand and sink reactions for the involved metabolites. To showcase a scenario in which an error message is printed, we ignore the metabolite L_GAMMA_GLUTAMYLCYSTEINE_x
using the ignore_list
argument. Thus, CobraMod does not create an auxiliary sink reaction for this metabolite. We use the function cobramod.add_reactions()
to add the reaction and additional tranport reactions for all metabolites
but L_GAMMA_GLUTAMYLCYSTEINE_x
to the model. Then we run the test_non_zero_flux() with the argument ignore_list
. Because L_GAMMA_GLUTAMYLCYSTEINE_x
cannot be turnover (due to a missing transport reaction into ‘x’) the test fails and the user receives an error message with suggestions for manual curation steps.
[20]:
from cobramod import test_non_zero_flux, add_reactions
from cobramod.test import textbook_biocyc
test_model = textbook_biocyc.copy()
# Adding the reaction into the test model
add_reactions(
model=test_model,
# The list has 4 reactions
obj=[
"Redox_ADP_ATP_, Redox_ADP_ATP_x | "
+"ADP_x + Pi_x + PROTON_x <-> ATP_x + WATER_x",
"TRANS_Pi_cx, Transport Phosphate_cp | Pi_c <-> Pi_x",
"TRANS_GLUTATHIONE_cx, Transport GLUTATHIONE_cx | "
+ "GLUTATHIONE_c <-> GLUTATHIONE_x",
"GLUTATHIONE-SYN-RXN, x",
],
directory=dir_data,
database="ECOLI",
replacement={},
)
# Running test for GLUTATHIONE_SYN_RXN_x
test_non_zero_flux(
model=test_model,
reaction="GLUTATHIONE_SYN_RXN_x",
ignore_list=["L_GAMMA_GLUTAMYLCYSTEINE_x"],
)
---------------------------------------------------------------------------
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
Infeasible Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
352 try:
--> 353 model.slim_optimize(error_value=None)
354 # if works, pass and return old objective
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
1063 else:
-> 1064 assert_optimal(self, message)
1065
~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
589 exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590 raise exception_cls(f"{message} ({status}).")
591
Infeasible: None (infeasible).
During handling of the above exception, another exception occurred:
NotInRangeError Traceback (most recent call last)
/tmp/ipykernel_314184/1097592407.py in <module>
24 model=test_model,
25 reaction="GLUTATHIONE_SYN_RXN_x",
---> 26 ignore_list=["L_GAMMA_GLUTAMYLCYSTEINE_x"],
27 )
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
372 reaction=reaction,
373 times=times + 1,
--> 374 ignore_list=ignore_list,
375 )
376
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
347 # Setting maximum times for recursion
348 if times == len(model.reactions.get_by_id(reaction).metabolites):
--> 349 raise NotInRangeError(reaction=model.reactions.get_by_id(reaction))
350 # answer must be reasonable
351 # comparison must be using absolute values
NotInRangeError: The following reaction "GLUTATHIONE_SYN_RXN_x" failed the non-zero flux test multiple times. Flux values are below solver tolerance. Please curate manually by adding reactions that enable turnover of metabolites: L_GAMMA_GLUTAMYLCYSTEINE_x
1.7. Curation process¶
CobraMod automatically performs the following curation steps during the creation of COBRApy reaction and metabolite objects and CobraMod pathway objects:
If CobraMod encounters large molecules or data objects with missing entries, it prints a warning.
When adding a metabolite or reaction to the model CobraMod tries to identify if these elements are already present in the model (potentially with a differnt name) and used the exiting elements instead of creating redundant ones (by checking the cross-reference database entries in the elements’s metadata against the model entries).
CobraMod utilizes the COBRApy method cobra.Reaction.check_mass_balance() and returns a warning if imbalances are found.
This package uses the reaction reversibility information provided with the obtained reaction data. If reversibility information is missing, CobraMod raises a warning.
When CobraMod adds a pathway, every pathway reaction undergos a non-zero flux test. If a reaction cannot carry a non-zero flux, CobraMod adds auxiliary sink reactions to unblock the reaction and suggests manual curation steps based on these auxiliary modifications.
All information about downloads, the creation of objects, warnings and exceptions are written to the log file
debug.log
. As an example, below we show part of such a log file.
[21]:
!head debug.log -n 20
2021-07-02 12:49:56,077 INFO Data for "CPD-14074" retrieved.
2021-07-02 12:49:56,082 INFO Data for "CPD-14075" retrieved.
2021-07-02 12:49:56,086 INFO Data for "CPD-14076" retrieved.
2021-07-02 12:49:56,092 INFO Data for "CPD-14553" retrieved.
2021-07-02 12:49:56,096 INFO Data for "CPD-15317" retrieved.
2021-07-02 12:49:56,102 INFO Data for "CPD-15322" retrieved.
2021-07-02 12:49:56,108 INFO Data for "CPD-15323" retrieved.
2021-07-02 12:49:56,114 INFO Data for "CPD-15326" retrieved.
2021-07-02 12:49:56,142 INFO Object 'C00026_c' identified as a metabolite
2021-07-02 12:49:56,616 INFO Metabolite "MET_c" was added to model.
2021-07-02 12:49:56,635 WARNING Metabolite "MET_c" was already in model. Skipping.
2021-07-02 12:49:56,635 INFO Metabolite "SUCROSE_c" was added to model.
2021-07-02 12:49:56,688 INFO Metabolite "SUCROSE_c" was added to model.
2021-07-02 12:49:56,689 INFO Metabolite "MET_c" was added to model.
2021-07-02 12:49:56,689 INFO Metabolite "MALTOSE_c" was added to model.
2021-07-02 12:49:56,719 INFO Metabolite "xu5p__D_c" was added to model.
2021-07-02 12:49:56,758 WARNING Gene-reaction rule for reaction "R04382" assumed as "OR".
2021-07-02 12:49:56,772 INFO For reaction "R04382_c", gene "c0319" was created and its name changed to "c0319".
2021-07-02 12:49:56,774 INFO Reaction "R04382_c" was added to model.
2021-07-02 12:49:56,788 WARNING Gene-reaction rule for reaction "R04382" assumed as "OR".
1.8. Converting COBRApy groups back to CobraMod pathways¶
The COBRApy function cobra.io.write_sbml_model() writes cobra models to sbml files. If the user calls this function with a model that contains cobramod.Pathway elements these will be saved as COBRApy Group elements and the CobraMod pathway
objects are lost. To convert COBRApy groups back to CobraMod pathways (and use the associated functionalities), the user can call the function cobramod.model_convert(). In the following example, we create a Group
with four reactions and add it to the model. We then use the cobramod.model_convert() function with the argument
model
and can then call the newly converted respective CobraMod pathway objects.
[22]:
from cobramod import model_convert
from cobramod.test import textbook_biocyc
from cobra.core.group import Group
test_model = textbook_biocyc.copy()
# Creation of group
test_group = Group(id="curated_pathway")
for reaction in ("GLCpts", "G6PDH2r", "PGL", "GND"):
test_group.add_members([test_model.reactions.get_by_id(reaction)])
test_model.add_groups([test_group])
# Conversion to a Pathway
model_convert(model=test_model)
# Display to Jupyter
test_model.groups.get_by_id("curated_pathway")
[22]:
Pathway identifier | curated_pathway |
Name | |
Memory address | 0x0139993642208144 |
Reactions involved | GLCpts, G6PDH2r, PGL, GND |
Genes involved | b2417, b2416, b1101, b1818, b1621, b2415, b1817, b1819, b1852, b0767, b2029 |
Visualization attributes |
|
1.9. Visualization with Escher¶
CobraMod uses Escher to visualize pathways and fluxes. Each CobraMod pathway includes a visualization method Pathway.visualize() which automatically generates pathway maps of the respective set of reactions. These pathway maps can be easily customized to visualize flux distributions using default or user-defined colors and gradients (linear or quantile normalized).
In the following example, we call the function visualize
without any arguments.
[23]:
test_model.groups.get_by_id("curated_pathway").visualize()
We can modify the orientation of our pathway by changing the attribute vertical
to True
.
[24]:
test_model.groups.get_by_id("curated_pathway").vertical = True
test_model.groups.get_by_id("curated_pathway").visualize()
The visualization method can also be called with the argument solution_fluxes
. This argument can be a dictionary with the fluxes of the reactions or a COBRApy Solution. CobraMod assigns colors to the flux values based on the chosen normalizaton method.
By default, the visualization method uses the minimal and maximal values of the solution_flux
argument. The user can define the color of the values by changing the attributes color_negative
and color_positive
. CobraMod creates a linear color gradient with zero flux values colored in grey.
In the following example, we create a dictionary with fluxes and we pass it to the visualization method.
[25]:
# For flux visualization of the group
solution = {
"GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1
}
# Modifying attributes
test_model.groups.get_by_id("curated_pathway").visualize(
solution_fluxes=solution
)
We can change the colors of the fluxes by changing the attribute color_negative
and color_positive
. In this example, we use the red color for negative fluxes and green for positive fluxes.
[26]:
# Modifying attributes
test_model.groups.get_by_id("curated_pathway").color_negative = "red"
test_model.groups.get_by_id("curated_pathway").color_positive = "green"
test_model.groups.get_by_id("curated_pathway").visualize(
solution_fluxes=solution
)
The user can also manually set bounds for color gradients by modifying the CobraMod patway attribute color_min_max
. In this example we change the bounds to -10 and 10. This option is useful when comparing different flux simulations.
[27]:
test_model.groups.get_by_id("curated_pathway").color_min_max = [-10, 10]
test_model.groups.get_by_id("curated_pathway").visualize(
solution_fluxes=solution
)
In the next example, we use the default settings for the color_min_max
attribute by setting the respective entry to None
and change the color gradient to orange and light blue. A list of available colors can be found here.
[28]:
# New flux with high value
solution = {
"GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1, "Other": 1000
}
# Using defaults
test_model.groups.get_by_id("curated_pathway").color_min_max = None
test_model.groups.get_by_id("curated_pathway").color_negative = "orange"
test_model.groups.get_by_id("curated_pathway").color_positive = "lightskyblue"
test_model.groups.get_by_id("curated_pathway").visualize(
solution_fluxes=solution
)
The user can change the color gradient to a quantile normalization. When choosing this option, the color gradient is determined by the quantiles of thesolution_fluxes
argument and not by the minimal und minimal flux values. The user can specify this option by changing the attribute color_quantile
to True
. This option is useful when the fluxes values vary by several orders of magnitude. For instance, in the previous example, we added a reaction to the dictionary with a flux value of
1000. We can see that the positive colors are pale. Thus, in the next example we change the attribute color_quantile
and now the colors are much brighter.
[29]:
test_model.groups.get_by_id("curated_pathway").color_quantile = True
test_model.groups.get_by_id("curated_pathway").visualize(
solution_fluxes=solution
)
The user can call the Pathway
for a summary of the current attributes.
[30]:
test_model.groups.get_by_id("curated_pathway")
[30]:
Pathway identifier | curated_pathway |
Name | |
Memory address | 0x0139993642208144 |
Reactions involved | GLCpts, G6PDH2r, PGL, GND |
Genes involved | b2417, b2416, b1101, b1818, b1621, b2415, b1817, b1819, b1852, b0767, b2029 |
Visualization attributes |
|
CobraMod pathway maps are saved as HTML files with the default name pathway.html
. The user can specify the file name with the argument filename
. In the following example, we name the file curated_pathway.html
.
[31]:
test_model.groups.get_by_id("curated_pathway").visualize(
solution_fluxes=solution, filename = "curated_pathway.html"
)
We can verify that the file exists by using the ls
command.
[32]:
!ls curated_pathway.html
curated_pathway.html