1. Functions

[1]:
from IPython.display import display
from cobramod import __version__
from cobra import __version__ as cobra_version
print(f'CobraMod version: {__version__}')
print(f'COBRApy version: {cobra_version}')
# From Escher:
# This option turns off the warning message if you leave or refresh this page
import escher
escher.rc['never_ask_before_quit'] = True
Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  1.000e+00  ratio =  1.000e+00
Problem data seem to be well scaled
CobraMod version: 0.5.5-alpha.1
COBRApy version: 0.22.1

1.1. Retrieving metabolic pathway information

CobraMod can retrieve metabolic pathway information (metabolites, reactions or pathways) from various databases by using database-specific identifiers. It supports all databases from the BioCyc collection, Plant Metabolic Network (PMN), the KEGG database and the BiGG Models repository. Call cobramod.available_databases to see all supported databases.

[2]:
from cobramod import available_databases

available_databases
[2]:

CobraMod supports Biocyc, the Plant Metabolic Network (PMN), KEGG and BiGG Models repository. Biocyc includes around 18.000 sub-databases and a complete list for BioCyc can be found at 'https://biocyc.org/biocyc-pgdb-list.shtml'. The database-specific identifiers can be found in the URL of the respective data. For instance, for "diphosphate" this is:

Database

URL with identifier (bold)

BioCyc, sub-database ECOLI https://biocyc.org/compound?orgid=ECOLI&id=PPI
Plant Metabolic Network, sub-database CORN https://pmn.plantcyc.org/compound?orgid=CORN&id=PPI
KEGG https://www.genome.jp/entry/C00013
BiGG Models Repository, universal model http://bigg.ucsd.edu/universal/metabolites/ppi

CobraMod uses abbreviations to represent the databases or sub-databases:

Database

Abbreviation

BioCyc META or identifier of sub-database e.g: ECOLI, ARA, GCF_000010885
Plant Metabolic Network Prefix "pmn:" with the sub-database identifier, e.g pmn:PLANT, pmn:ARA, pmn:CORN
KEGG KEGG
BiGG Models Repository BIGG

The user can download the metabolic pathway information using the cobramod.get_data function. In this example we download information from the BioCyc sub-database YEAST.

[3]:
from cobramod import get_data
from pathlib import Path

dir_data = Path.cwd().resolve().joinpath("data")
identifiers = [
    "CPD-14074",
    "CPD-14075",
    "CPD-14076",
    "CPD-14553",
    "CPD-15317",
    "CPD-15322",
    "CPD-15323",
]

for metabolite in identifiers:
    get_data(
        directory=dir_data,
        identifier=metabolite,
        database="YEAST"
    )

The first argument in cobramod.get_data() is the system path where CobraMod stores the metabolic pathway information. CobraMod uses pathlib for path representation. The second argument is the data identifier used in the respective database. In this example we retrieve data from the BioCyc sub-database YEAST. The last argument is the abbreviation of the database.

CobraMod creates a directory with the name of the database and stores the metabolic pathway information in it:

data
`-- YEAST
    |-- CPD-14074.xml
    |-- CPD-14075.xml
    |-- CPD-14076.xml
    |-- CPD-14553.xml
    |-- CPD-15317.xml
    |-- CPD-15322.xml
    `-- CPD-15323.xml

1.2. Converting stored-data into COBRApy objects

CobraMod can convert metabolic pathway information (metabolites and reactions) into COBRApy objects (cobra.Reaction and cobra.Metabolite). It can thus be seamlessly integrated with a COBRApy workflow.

The function cobramod.create_object() creates COBRApy objects from metabolic pathway data retrieved by cobramod.get_data(). If no pathway information was downloaded cobramod.create_object() retrieves it automatically.

In this example, we convert the metabolite 2-Oxoglutarate with the KEGG identifier C00026 into a COBRApy object. CobraMod automatically identifies the KEGG entry as a metabolite and converts it into the corresponding COBRApy metabolite.

The first argument is the database-specific identifier (C00026) followed by the database abbreviation (KEGG). The third argument is the path representation for the directory of the metabolic pathway information. CobraMod downloads the data into this directory and utilize it instead of downloading it again. The last argument is the compartment of the reaction (c for cytosol).

[4]:
from cobramod import create_object
from pathlib import Path

# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")

new_object = create_object(
    identifier="C00026",
    database="KEGG",
    directory=dir_data,
    compartment="c"
)

print(type(new_object))
new_object
<class 'cobra.core.metabolite.Metabolite'>
[4]:
Metabolite identifierC00026_c
Name2-Oxoglutarate;
Memory address 0x07f52c74e1d10
FormulaC5H6O5
Compartmentc
In 0 reaction(s)

In the second example, we convert the reaction RXN-11502 from the PMN sub-database CORN into a COBRApy reaction. The first argument is the database-specific identifier (RXN-11502) followed by the database identifier (pmn:CORN). CobraMod automatically takes reversibility and gene information from the database entry and adds it to the reaction object.

[5]:
new_object = create_object(
    identifier="RXN-11501",
    database="pmn:CORN",
    directory=dir_data,
    compartment="c"
)

print(type(new_object))
display(new_object)
new_object.genes
<class 'cobra.core.reaction.Reaction'>
Reaction identifierRXN_11501_c
Namealkaline α- galactosidase
Memory address 0x07f52c74f33d0
Stoichiometry

CPD_170_c + WATER_c --> ALPHA_D_GALACTOSE_c + CPD_1099_c

stachyose + H2O --> alpha-D-galactopyranose + raffinose

GPRZM00001D031300 or ZM00001D031303 or ZM00001D003279
Lower bound0
Upper bound1000
[5]:
frozenset({<Gene ZM00001D003279 at 0x7f52c74e1f90>,
           <Gene ZM00001D031300 at 0x7f52c74e1b10>,
           <Gene ZM00001D031303 at 0x7f52c74e1050>})

1.3. Adding metabolites

The function cobramod.add_metabolites() extends the COBRApy function model.add_metabolites() and can be used with a simple syntax. It can utilize a single string, a list of strings, a file path or a COBRAPy metabolite object. In the next examples we showcase these options. We use the E. coli core model from COBRApy as test model. This core model can be found under cobramod.test.textbook. If the argument obj is used as a string, it can be the database-specific identifier of the respective metabolite and its compartment or a user-defined metabolite. It uses the following syntax:


SYNTAX

To add metabolite information from a database:

database-specific_identifier, compartment

To add user-curated metabolites:

user-curated_identifier, name, compartment, chemical_formula, molecular_charge

In the first example, we add the metabolite L-methionine with the MetaCyc identifier MET to the test model. The first argument is the model name. The argument obj contains the identifier MET and the compartment c. The second argument is the database identifier (META) and the third argument is the directory where CobraMod stores the metabolite information.

[6]:
from cobramod import add_metabolites
from cobramod.test import textbook
from pathlib import Path

# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")
# Using copy
test_model = textbook.copy()

add_metabolites(
    model=test_model,
    obj="MET, c",
    database="META",
    directory=dir_data,
)
print(type(test_model.metabolites.get_by_id("MET_c")))
test_model.metabolites.get_by_id("MET_c")
<class 'cobra.core.metabolite.Metabolite'>
[6]:
Metabolite identifierMET_c
NameL-methionine
Memory address 0x07f52c6dcbf50
FormulaC5H11N1O2S1
Compartmentc
In 0 reaction(s)

In the second example, we add two metabolites (acetyl-coa and sucrose) from MetaCyc. In the argument obj we define a list with the database-specific metabolite identifiers and the respecitive compartments. The rest of the arguments remains the same as in the previous example. Before adding a metabolite, CobraMod tests if the given metabolite is already present in the model (potentially with a different identifier) by checking the cross-reference database entries in the metabolite’s metadata against the model’s metabolite entries. If CobraMod identifies a metabolite as already present in the model it skips adding the metabolite, used the already present one, and prints a warning.

[7]:
add_metabolites(
    model=test_model,
    obj=["ACETYL-COA, c", "SUCROSE, c"],
    database="META",
    directory=dir_data,
)
# Show metabolites in jupyter
display(test_model.metabolites.get_by_id("accoa_c"))
test_model.metabolites.get_by_id("SUCROSE_c")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ACETYL-COA' was found as 'accoa_c'. Please curate if necessary.
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1016: UserWarning: Metabolite "accoa_c" is already present in the model. Skipping addition.
  warn(message=msg, category=UserWarning)
Metabolite identifieraccoa_c
NameAcetyl-CoA
Memory address 0x07f52c70bdd10
FormulaC23H34N7O17P3S
Compartmentc
In 7 reaction(s) CS, PFL, MALS, Biomass_Ecoli_core, PTAr, PDH, ACALD
[7]:
Metabolite identifierSUCROSE_c
Namesucrose
Memory address 0x07f52c748a390
FormulaC12H22O11
Compartmentc
In 0 reaction(s)

In the third example, we use a text file to add metabolites to the test model. We use the file metabolites.txt in the current working directory with the following content:

SUCROSE, c
MET, c
MALTOSE_c, MALTOSE[c], c, C12H22O11, 1

In this example, CobraMod downloads the first two metabolites from MetaCyc, while MALTOSE_c is a user-defined metabolite. The user can specify the path to this file in the obj argument. The remaining arguments are the same as in the previous examples. We added two print statements to show that CobraMod adds the metabolites to the model.

[8]:
from cobramod.test import textbook_biocyc
# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")
# This is our file
file = dir_data.joinpath("metabolites.txt")
# Using a copy
test_model = textbook_biocyc.copy()

print(f'Number of metabolites prior addition: {len(test_model.metabolites)}')
# Using CobraMod
add_metabolites(
    model=test_model,
    obj=file,
    directory=dir_data,
    database="META",
)
print(f'Number of metabolites after addition: {len(test_model.metabolites)}')
# Show metabolites in jupyter
display(test_model.metabolites.get_by_id("MET_c"))
display(test_model.metabolites.get_by_id("SUCROSE_c"))
test_model.metabolites.get_by_id("MALTOSE_c")
Number of metabolites prior addition: 72
Number of metabolites after addition: 75
Metabolite identifierMET_c
NameL-methionine
Memory address 0x07f52c6d00c90
FormulaC5H11N1O2S1
Compartmentc
In 0 reaction(s)
Metabolite identifierSUCROSE_c
Namesucrose
Memory address 0x07f52c6d76c50
FormulaC12H22O11
Compartmentc
In 0 reaction(s)
[8]:
Metabolite identifierMALTOSE_c
NameMALTOSE[c]
Memory address 0x07f52c6d008d0
FormulaC12H22O11
Compartmentc
In 0 reaction(s)

Since cobramod.add_metabolites() is an extension of the COBRApy function model.add_metabolites() the user can also utilize COBRApy metabolites. In the following example, we use a variation of the test model (textbook_biocyc) which uses BioCyc metabolite identifiers. We copy a COBRApy metabolite from the test model and add it to the BioCyc test model.

[9]:
from cobramod import add_metabolites
from cobramod.test import textbook, textbook_biocyc

# Copying Metabolite from original model
metabolite = textbook.metabolites.get_by_id("xu5p__D_c")
# Using a copy
test_model = textbook_biocyc.copy()
add_metabolites(
    model=test_model,
    obj=metabolite
)

test_model.metabolites.get_by_id("xu5p__D_c")
[9]:
Metabolite identifierxu5p__D_c
NameD-Xylulose 5-phosphate
Memory address 0x07f52c728a3d0
FormulaC5H9O8P
Compartmentc
In 3 reaction(s) RPE, TKT1, TKT2

If CobraMod detects large molecules (e.g. enzymes) or if the metabolite information does not include a chemical formula the user receives a warning. In this example, we use the enzyme with the MetaCyc identifier Red-NADPH-Hemoprotein-Reductases and add it to the test model. CobraMod raises a warning due to the missing chemical formula.

[10]:
# Using a copy
test_model = textbook.copy()

add_metabolites(
    model=test_model,
    obj="Red-NADPH-Hemoprotein-Reductases, c",
    directory=dir_data,
    database="META",
)
test_model.metabolites.get_by_id("Red_NADPH_Hemoprotein_Reductases_c")
/home/stefano/Documents/cobramod/src/cobramod/parsing/biocyc.py:110: UserWarning: Sum formula for the metabolite "Red-NADPH-Hemoprotein-Reductases" from BioCyc could not be found. Formula set to "X" and charge to 0. Please curate.
  warn(msg)
[10]:
Metabolite identifierRed_NADPH_Hemoprotein_Reductases_c
NameRed-NADPH-Hemoprotein-Reductases
Memory address 0x07f52c6d00f90
FormulaX
Compartmentc
In 0 reaction(s)

NOTES

  • CobraMod replaces hyphens (-) with underscores (_) in the identifiers when creating COBRApy metabolites.

  • When adding several metabolites the user can only specify one database identifier (It is not possible to use two databases within the same function call) or alternatively can call the function twice.

  • CobraMod saves the curation process to the file debug.log.


1.4. Adding reactions

The function cobramod.add_reactions() extends the COBRApy function model.add_reactions() and can be used with a simple syntax. It can utilize a single string, a list of string, a file path or a COBRApy reaction object. In the next examples we showcase these options. Again, we use the E. coli core model from COBRApy as test model.

If the argument obj is used as a string, it can be the database-specific identifier of the respective reaction and its compartment or a user-defined reaction. It uses the following syntax:


SYNTAX

To add reaction information from a database:

database-specific_identifier, compartment

To add user-defined reactions, the user can specify the identifier and name of the reaction following the COBRApy reaction string syntax:

user-curated_identifier, name | coefficient_1 metabolite_1 <-> coefficient_2 metabolite_2

Metabolites must contain a suffix which specifies the compartment. This is given by an underscore (_) followed by the compartment-abbreviation. In this example, we create a transport reaction for oxygen between the external comparment (e) and the cytosol (c):

TRANS_H2O_ec, Oxygen Transport | 2 OXYGEN-MOLECULE_e <-> 2 OXYGEN_MOLECULE_c

In the first example, we add the BIGG reaction AKGDH to a test model. This time we use a text model which is a variaton of the original core model and uses KEGG identifiers (textbook_kegg). The first argument is the model name. The obj argument takes the identifier AKGDH and the compartment c. The next argument is the database identifier (BIGG) and the last argument is the directory where CobraMod stores and retrieves the metabolic pathway information from. The argument model_id is a BIGG-specific argument. Please read the notes below for more information.

Because the test model uses KEGG identifiers, CobraMod finds the reaction AKGDH with the KEGG identifier R08549_c.

[11]:
from cobramod.test import textbook_kegg
from cobramod import add_reactions
from pathlib import Path

dir_data = Path.cwd().resolve().joinpath("data")
# Using copy
test_model = textbook_kegg.copy()

add_reactions(
    model=test_model,
    obj="AKGDH, c",
    database="BIGG",
    directory=dir_data,
    model_id="e_coli_core"
)

display(test_model.reactions.get_by_id("R08549_c"))
/home/stefano/Documents/cobramod/src/cobramod/parsing/bigg.py:215: UserWarning: Reaction "AKGDH" from BIGG set to reversible. Please modify if necessary.
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:329: UserWarning: Reaction 'AKGDH' was found as 'R08549_c'. Please curate if necessary
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1128: UserWarning: Reaction "R08549_c" is already present in the model. Skipping addition.
  warn(message=msg, category=UserWarning)
Reaction identifierR08549_c
Name2-Oxogluterate dehydrogenase
Memory address 0x07f52c6bd4450
Stoichiometry

C00003_c + C00010_c + C00026_c --> C00004_c + C00011_c + C00091_c

Nicotinamide adenine dinucleotide + Coenzyme A + 2-Oxoglutarate --> Nicotinamide adenine dinucleotide - reduced + CO2 + Succinyl-CoA

GPRb0726 and b0116 and b0727
Lower bound0.0
Upper bound1000.0

In the second example, we add two reactions (R00315 and R02736) from KEGG. The argument obj is a list with the database-specific identifiers and the respective compartments of the reactions. The majority of the arguments remains the same as in the previous example. The argument genome is a KEGG-specific argument. Please read the notes below for more information.

CobraMod parses the reaction information for gene identifiers and automatically adds them to the COBRApy reaction. In this example, CobraMod creates the genes 2296, b3115 and b1849, and adds it to the first reaction R00315.

[12]:
add_reactions(
    model=test_model,
    obj=["R00315, c", "R02736 ,c"],
    directory=dir_data,
    database="KEGG",
    genome="ecc"
)

display(test_model.reactions.get_by_id("R00315_c"))
print(test_model.reactions.get_by_id("R00315_c").genes)
test_model.reactions.get_by_id("R02736_c")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1128: UserWarning: Reaction "R00315_c" is already present in the model. Skipping addition.
  warn(message=msg, category=UserWarning)
Reaction identifierR00315_c
Nameacetate kinase
Memory address 0x07f52c6bd42d0
Stoichiometry

C00002_c + C00033_c <=> C00227_c + G11113_c

ATP + Acetate <=> Acetyl phosphate + ADP

GPRb2296 or b3115 or b1849
Lower bound-1000.0
Upper bound1000.0
frozenset({<Gene b1849 at 0x7f52c6c76810>, <Gene b2296 at 0x7f52c6c76890>, <Gene b3115 at 0x7f52c6c767d0>})
[12]:
Reaction identifierR02736_c
Namebeta-D-glucose-6-phosphate:NADP+ 1-oxoreductase
Memory address 0x07f52c4b61310
Stoichiometry

C00006_c + C01172_c --> C00005_c + C00080_c + C01236_c

Nicotinamide adenine dinucleotide phosphate + beta-D-Glucose 6-phosphate --> Nicotinamide adenine dinucleotide phosphate - reduced + H+ + 6-phospho-D-glucono-1,5-lactone

GPRc2265
Lower bound0
Upper bound1000

In the following example, we use a text file to add reactions to the test model. To this end, we use the file reactions.txt in the current working directory. This file has the following content:

R04382, c
R02736, c
C06118_ce, digalacturonate transport | 1 C06118_c <-> 1 C06118_e

CobraMod downloads the first two reactions from KEGG. C06118_ce is a user-defined reaction. The user can specify the file path in the obj argument. The next arguments are the same as in the previous examples. We added two print statements to show that CobraMod adds the reaction to the model.

[13]:
from cobramod.test import textbook_kegg
from cobramod import add_reactions
from pathlib import Path

dir_data = Path.cwd().resolve().joinpath("data")
test_model = textbook_kegg.copy()
# This is the file with text
file = dir_data.joinpath("reactions.txt")

print(f'Number of reactions prior addition: {len(test_model.reactions)}')

add_reactions(
    model=test_model,
    obj=file,
    directory=dir_data,
    database="KEGG",
    genome="ecc"
)

print(f'Number of reactions after addition: {len(test_model.reactions)}')
# Show in jupyter
display(test_model.reactions.get_by_id("R04382_c"))
display(test_model.reactions.get_by_id("R02736_c"))
test_model.reactions.get_by_id("C06118_ce")
Number of reactions prior addition: 95
Number of reactions after addition: 98
Reaction identifierR04382_c
Name4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase
Memory address 0x07f52c6be34d0
Stoichiometry

C06118_c <=> 2.0 C04053_c

4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate;

GPRc0319
Lower bound-1000
Upper bound1000
Reaction identifierR02736_c
Namebeta-D-glucose-6-phosphate:NADP+ 1-oxoreductase
Memory address 0x07f52c7067510
Stoichiometry

C00006_c + C01172_c --> C00005_c + C00080_c + C01236_c

Nicotinamide adenine dinucleotide phosphate + beta-D-Glucose 6-phosphate --> Nicotinamide adenine dinucleotide phosphate - reduced + H+ + 6-phospho-D-glucono-1,5-lactone

GPRc2265
Lower bound0
Upper bound1000
[13]:
Reaction identifierC06118_ce
Namedigalacturonate transport
Memory address 0x07f52c717b250
Stoichiometry

C06118_c <=> C06118_e

4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate;

GPR
Lower bound-1000
Upper bound1000

Since cobramod.add_reactions() is an extension of the original COBRApy function model.add_reactions() the user can also utilize COBRApy reactions. In this example, we use the test model (textbook_kegg). We copy a COBRApy reaction from the test model and add it to the KEGG test model.

[14]:
from cobramod.test import textbook_kegg, textbook
from cobramod import add_reactions
from pathlib import Path

# Using copy of test model
test_model = textbook_kegg.copy()
# Obtaining a reaction
reaction = textbook.reactions.get_by_id("ACALDt")

add_reactions(model=test_model, obj=reaction)

test_model.reactions.get_by_id("ACALDt")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:1128: UserWarning: Reaction "ACALDt" is already present in the model. Skipping addition.
  warn(message=msg, category=UserWarning)
[14]:
Reaction identifierACALDt
NameR acetaldehyde reversible - transport
Memory address 0x07f52c4a75cd0
Stoichiometry

C00084_e <=> C00084_c

Acetaldehyde <=> Acetaldehyde

GPRs0001
Lower bound-1000.0
Upper bound1000.0

By default, COBRApy ignores metabolites that appear on both sides of a reaction equation. CobraMod identifies such reactions and assigns one of these metabolites to the extracellular compartment and raises a warning suggesting manual curation. In the following example, we add a transport reaction for acetic acid from BioCyc sub-database YEAST to the test model.

[15]:
test_model = textbook_kegg.copy()

add_reactions(
    model=test_model,
    obj="TRANS-RXN-455, c",
    database="YEAST",
    directory=dir_data,
)
# Show in jupyter
test_model.reactions.get_by_id("TRANS_RXN_455_c")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:376: UserWarning: Reaction "TRANS_RXN_455_c" has metabolite "CPD_24335_e" on both sides of the equation (e.g transport reaction). COBRApy ignores these metabolites. To avoid this, by default, CobraMod will assign one metabolite to the extracellular compartment. Please curate the reaction if necessary.
  warn(message=msg, category=UserWarning)
[15]:
Reaction identifierTRANS_RXN_455_c
Nameacetic acid uptake
Memory address 0x07f52c49e17d0
Stoichiometry

CPD_24335_e --> CPD_24335_c

acetic+acid --> acetic+acid

GPRG3O-32144
Lower bound0
Upper bound1000

NOTES

  • CobraMod replaces hyphens (-) to underscores (_) in the identifiers when creating COBRApy reactions.

  • When adding several reactions the user can only specify one database identifier (It is not possible to use two databases within the same function call) or alternatively can call the function twice.

  • CobraMod tries to identify if a given metabolites or reactions are already present in the model and used those instead of adding redundant entries.

  • CobraMod saves the curation process to the file debug.log.

  • The argument model_id is specific for the BiGG Models repository. CobraMod can download metabolite and reaction information directly from the BIGG models. A list of all models can be found here.

  • The argument genome can be used with the database KEGG and specifies the genome for which gene information will be retrieved. The complete list of all available genomes can be found here. If no genome is specified, no gene information will be retrieved and a warning is printed as shown below:

[16]:
test_model = textbook_kegg.copy()

add_reactions(
    model=test_model,
    obj="R04382, c",
    database="KEGG",
    directory=dir_data,
)
test_model.reactions.get_by_id("R04382_c")
/home/stefano/Documents/cobramod/src/cobramod/parsing/kegg.py:196: UserWarning: Nothing was specified in argument "genome". Reaction "R04382" will not include genes. Please modify if necessary.
  warn(message=msg, category=UserWarning)
[16]:
Reaction identifierR04382_c
Name4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase
Memory address 0x07f52c6c6db10
Stoichiometry

C06118_c <=> 2.0 C04053_c

4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate;

GPR
Lower bound-1000
Upper bound1000

1.5. Adding pathways

CobraMod can add metabolic pathways to a given model. The function cobramod.add_pathway() takes as an argument either a sequence of database-specific reaction identifiers or a single pathway identifier. CobraMod download the pathway information from the databases, creates the COBRApy objects and prints a short summary about the curation process when adding the respective pathway. Additionally, to ensure reproducibility and for tracking the curation process, the user can write changes to file. CobraMod creates the object cobramod.Pathway which contains the pathway information.

When adding serveral pathways, we recommend adding pathways one-by-one to perform neccessary curation steps inbetween. In the examples below we showcase two options. Again, we use the E. coli core model from COBRApy as test model.

In the first example, we add the acetoacetate degradation pathway from the BioCyc sub-database ECOLI to the test model. This pathway contains two reactions and six metabolites.

fbe4f666d1964bec8cd95f628c695d0a

In this example, the first argument is the model to extend. The pathway argument uses the database-specific identifier ACETOACETATE-DEG-PWY and the database identifier ECOLI. We define the compartment as c (cytosol), i.e. all metabolites and reactions will be assigned to this compartment.

The argument filename specifies a file to which a summary of the changes is written. Additionally, add_pathway() prints a summary of the changes to the model. Calling the respective cobramod.Pathway outputs a table with the main attributes of the object.

All COBRApy reactions of the pathway are tested for their capacity to cary a non-zero flux. Read more about it in the non-zero flux test section.

[17]:
from pathlib import Path
from cobramod import add_pathway
from cobramod.test import textbook
# Defining directory
dir_data = Path.cwd().resolve().joinpath("data")

# Using copy of test model
test_model = textbook.copy()

add_pathway(
    model=test_model,
    pathway="ACETOACETATE-DEG-PWY",
    database="ECOLI",
    compartment="c",
    filename="summary.txt",
    directory=dir_data,
)

# Display in jupyter
test_model.groups.get_by_id("ACETOACETATE-DEG-PWY")
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ACET' was found as 'ac_c'. Please curate if necessary.
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'CO-A' was found as 'coa_c'. Please curate if necessary.
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/extension.py:573: UserWarning: Auxiliary sink reaction for "SK_3_KETOBUTYRATE_c" created. Consider removing it and adding the synthesis reactions for the metabolite.
  warn(message=msg, category=UserWarning)
Number of       new   | removed entities in
*=====================|===================*
Reactions        2    |    0
Metabolites      2    |    0
Exchange         0    |    0
Demand           0    |    0
Sinks            1    |    0
Genes            4    |    0
Groups           1    |    0

[17]:
Pathway identifier ACETOACETATE-DEG-PWY
Name
Memory address 0x0139993464354896
Reactions involved

ACETOACETYL_COA_TRANSFER_RXN_c, ACETYL_COA_ACETYLTRANSFER_RXN_c

Genes involved

EG12432, EG11670, EG11669, EG11672

Visualization attributes
  • vertical = False
  • color_negative = None
  • color_positive = None
  • color_quantile = False

 

Below is an example of the summary in form of a text file. The first part lists names of all reactions, metabolites, exchange reactions, auxiliary demand and sink reactions, genes, and groups in the model. The second part of the summary lists all elements that were added or removed by the function call add_pathway().

[18]:
%cat summary.txt
Summary:
Model identifier: e_coli_core
Model name:

Reactions:
['ACALD', 'ACALDt', 'ACKr', 'ACONTa', 'ACONTb', 'ACt2r', 'ADK1', 'AKGDH', 'AKGt2r', 'ALCD2x', 'ATPM', 'ATPS4r', 'Biomass_Ecoli_core', 'CO2t', 'CS', 'CYTBD', 'D_LACt2', 'ENO', 'ETOHt2r', 'FBA', 'FBP', 'FORt2', 'FORti', 'FRD7', 'FRUpts2', 'FUM', 'FUMt2_2', 'G6PDH2r', 'GAPD', 'GLCpts', 'GLNS', 'GLNabc', 'GLUDy', 'GLUN', 'GLUSy', 'GLUt2r', 'GND', 'H2Ot', 'ICDHyr', 'ICL', 'LDH_D', 'MALS', 'MALt2_2', 'MDH', 'ME1', 'ME2', 'NADH16', 'NADTRHD', 'NH4t', 'O2t', 'PDH', 'PFK', 'PFL', 'PGI', 'PGK', 'PGL', 'PGM', 'PIt2r', 'PPC', 'PPCK', 'PPS', 'PTAr', 'PYK', 'PYRt2', 'RPE', 'RPI', 'SUCCt2_2', 'SUCCt3', 'SUCDi', 'SUCOAS', 'TALA', 'THD2', 'TKT1', 'TKT2', 'TPI', 'ACETOACETYL_COA_TRANSFER_RXN_c', 'ACETYL_COA_ACETYLTRANSFER_RXN_c']
Metabolites:
['13dpg_c', '2pg_c', '3pg_c', '6pgc_c', '6pgl_c', 'ac_c', 'ac_e', 'acald_c', 'acald_e', 'accoa_c', 'acon_C_c', 'actp_c', 'adp_c', 'akg_c', 'akg_e', 'amp_c', 'atp_c', 'cit_c', 'co2_c', 'co2_e', 'coa_c', 'dhap_c', 'e4p_c', 'etoh_c', 'etoh_e', 'f6p_c', 'fdp_c', 'for_c', 'for_e', 'fru_e', 'fum_c', 'fum_e', 'g3p_c', 'g6p_c', 'glc__D_e', 'gln__L_c', 'gln__L_e', 'glu__L_c', 'glu__L_e', 'glx_c', 'h2o_c', 'h2o_e', 'h_c', 'h_e', 'icit_c', 'lac__D_c', 'lac__D_e', 'mal__L_c', 'mal__L_e', 'nad_c', 'nadh_c', 'nadp_c', 'nadph_c', 'nh4_c', 'nh4_e', 'o2_c', 'o2_e', 'oaa_c', 'pep_c', 'pi_c', 'pi_e', 'pyr_c', 'pyr_e', 'q8_c', 'q8h2_c', 'r5p_c', 'ru5p__D_c', 's7p_c', 'succ_c', 'succ_e', 'succoa_c', 'xu5p__D_c', '3_KETOBUTYRATE_c', 'ACETOACETYL_COA_c']
Exchange:
['EX_ac_e', 'EX_acald_e', 'EX_akg_e', 'EX_co2_e', 'EX_etoh_e', 'EX_for_e', 'EX_fru_e', 'EX_fum_e', 'EX_glc__D_e', 'EX_gln__L_e', 'EX_glu__L_e', 'EX_h_e', 'EX_h2o_e', 'EX_lac__D_e', 'EX_mal__L_e', 'EX_nh4_e', 'EX_o2_e', 'EX_pi_e', 'EX_pyr_e', 'EX_succ_e']
Demand:
[]
Sinks:
['SK_3_KETOBUTYRATE_c']
Genes:
['b1241', 'b0351', 's0001', 'b3115', 'b1849', 'b2296', 'b1276', 'b0118', 'b0474', 'b0116', 'b0726', 'b0727', 'b2587', 'b0356', 'b1478', 'b3735', 'b3733', 'b3734', 'b3732', 'b3736', 'b3738', 'b3731', 'b3737', 'b3739', 'b0720', 'b0733', 'b0979', 'b0978', 'b0734', 'b2975', 'b3603', 'b2779', 'b1773', 'b2925', 'b2097', 'b4232', 'b3925', 'b0904', 'b2492', 'b4154', 'b4152', 'b4153', 'b4151', 'b1819', 'b1817', 'b2415', 'b1818', 'b2416', 'b1611', 'b4122', 'b1612', 'b3528', 'b1852', 'b1779', 'b1621', 'b1101', 'b2417', 'b3870', 'b1297', 'b0809', 'b0810', 'b0811', 'b1761', 'b1524', 'b1812', 'b0485', 'b3213', 'b3212', 'b4077', 'b2029', 'b0875', 'b1136', 'b4015', 'b2133', 'b1380', 'b4014', 'b2976', 'b3236', 'b1479', 'b2463', 'b2286', 'b2279', 'b2276', 'b2280', 'b2288', 'b2284', 'b2287', 'b2281', 'b2277', 'b2285', 'b2282', 'b2278', 'b2283', 'b3962', 'b1602', 'b1603', 'b0451', 'b0114', 'b0115', 'b3916', 'b1723', 'b0902', 'b0903', 'b2579', 'b3114', 'b3952', 'b3951', 'b4025', 'b2926', 'b0767', 'b0755', 'b3612', 'b4395', 'b3493', 'b2987', 'b3956', 'b3403', 'b1702', 'b2297', 'b2458', 'b1676', 'b1854', 'b4301', 'b3386', 'b2914', 'b4090', 'b0722', 'b0724', 'b0721', 'b0723', 'b0728', 'b0729', 'b2464', 'b0008', 'b2935', 'b2465', 'b3919', 'EG12432', 'EG11670', 'EG11669', 'EG11672']
Groups:
['ACETOACETATE-DEG-PWY']

New:
Reactions:
['ACETOACETYL_COA_TRANSFER_RXN_c', 'ACETYL_COA_ACETYLTRANSFER_RXN_c']
Metabolites:
['3_KETOBUTYRATE_c', 'ACETOACETYL_COA_c']
Exchange:
[]
Demand:
[]
Sinks:
['SK_3_KETOBUTYRATE_c']
Genes:
['EG12432', 'EG11670', 'EG11669', 'EG11672']
Groups:
['ACETOACETATE-DEG-PWY']

Removed:
Reactions:
[]
Metabolites:
[]
Exchange:
[]
Demand:
[]
Sinks:
[]
Genes:
[]
Groups:
[]

In the next example, we use a list of database-specific reaction identifiers as pathway argument. We use the database identifier ECOLI and the compartment c (cytosol). Additionally, we define a pathway name by using the argument group. The user can also use this argument to merge pathways by using the same group names.

[19]:
from pathlib import Path
from cobramod import add_pathway
from cobramod.test import textbook_biocyc
# Defining directory
dir_data = Path.cwd().resolve().joinpath("data")

test_model = textbook_biocyc.copy()
# Defining database-specific identifiers
sequence = ["PEPDEPHOS-RXN", "PYRUVFORMLY-RXN", "FHLMULTI-RXN"]

print(f'Number of reaction prior addition: {len(test_model.reactions)}')

add_pathway(
    model=test_model,
    pathway=sequence,
    directory=dir_data,
    database="ECOLI",
    compartment="c",
    group="curated_pathway"
)

print(f'Number of reactions after addition: {len(test_model.reactions)}')
# Display in jupyter
test_model.groups.get_by_id("curated_pathway")
Number of reaction prior addition: 95
Number of       new   | removed entities in
*=====================|===================*
Reactions        3    |    0
Metabolites      2    |    0
Exchange         0    |    0
Demand           0    |    0
Sinks            1    |    0
Genes           11    |    0
Groups           1    |    0

Number of reactions after addition: 99
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ATP' was found as 'ATP_c'. Please curate if necessary.
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/creation.py:161: UserWarning: Metabolite 'ADP' was found as 'ADP_c'. Please curate if necessary.
  warn(message=msg, category=UserWarning)
/home/stefano/Documents/cobramod/src/cobramod/core/extension.py:685: UserWarning: Auxiliary sink reaction for "SK_HYDROGEN_MOLECULE_c" created. Consider removing it and adding the synthesis reactions for the metabolite.
  warn(message=msg, category=UserWarning)
[19]:
Pathway identifier curated_pathway
Name
Memory address 0x0139993460803984
Reactions involved

PEPDEPHOS_RXN_c, PYRUVFORMLY_RXN_c, FHLMULTI_RXN_c

Genes involved

EG10803, EG10804, G7627, EG10701, EG10479, EG10480, EG10476, EG10478, EG10477, EG10475, EG10285

Visualization attributes
  • vertical = False
  • color_negative = None
  • color_positive = None
  • color_quantile = False

 


NOTES

  • A pathway is a set of COBRApy reactions. All the notes listed for add_metabolites() and add_reactions() also apply to pathways, i. e., handling of duplicate elements, transport reactions and the argument genome for KEGG.

  • CobraMod saves the curation process to the file debug.log.


1.6. Non-zero flux test

When calling the function add_pathway(), CobraMod tests each reaction of the Pathway object for its capability to carry a non-zero flux, i.e., if the involved metabolites can be turned over. Additionally, the user can test individual COBRApy reactions for their capability to cary a non-zero flux by using the function cobramod.test_non_zero_flux().

When running these functions, CobraMod creates auxillary demand reactions for the reaction’s metabolites to enable a flux through the tested reaction. Note that demand reactions enable an efflux of the respective metabolite from the system but no uptake. If the test initially fails, auxiliary sink reactions are added to the model and CobraMod raises an error suggesting manual curation steps. These auxillary sink reactions enable an influx of the respective metabolite to the system. Based on this information the user can then manually curate the model and if necessary add the missing synthesis reactions for the respective metabolites. Otherwise, if a reaction passes the test, no message is printed and the auxilary reactions are removed from the model.

The user can also use the argument ignore_list to specify metabolites for which no auxiliary sink reactions should be created. This can be for example ubiquitous metabololites such as water, ATP or ADP.

In the following artifical example (E. coli is a prokaryote without compartmentation), we introduce the glutathione synthase reaction (GLUTATHIONE-SYN-RXN) into a new comparment (x) and test the reaction for its capability to carry a non-zero flux. This reaction has the following equation:

ATP_x + GLY_x + L_GAMMA_GLUTAMYLCYSTEINE_x --> ADP_x + GLUTATHIONE_x + PROTON_x + Pi_x

Because we are testing a reaction in a new, empty compartment, CobraMod creates auxiliary demand and sink reactions for the involved metabolites. To showcase a scenario in which an error message is printed, we ignore the metabolite L_GAMMA_GLUTAMYLCYSTEINE_x using the ignore_list argument. Thus, CobraMod does not create an auxiliary sink reaction for this metabolite. We use the function cobramod.add_reactions() to add the reaction and additional tranport reactions for all metabolites but L_GAMMA_GLUTAMYLCYSTEINE_x to the model. Then we run the test_non_zero_flux() with the argument ignore_list. Because L_GAMMA_GLUTAMYLCYSTEINE_x cannot be turnover (due to a missing transport reaction into ‘x’) the test fails and the user receives an error message with suggestions for manual curation steps.

[20]:
from cobramod import test_non_zero_flux, add_reactions
from cobramod.test import textbook_biocyc

test_model = textbook_biocyc.copy()

# Adding the reaction into the test model
add_reactions(
    model=test_model,
    # The list has 4 reactions
    obj=[
        "Redox_ADP_ATP_, Redox_ADP_ATP_x | "
        +"ADP_x + Pi_x + PROTON_x <-> ATP_x + WATER_x",
        "TRANS_Pi_cx, Transport Phosphate_cp | Pi_c <-> Pi_x",
        "TRANS_GLUTATHIONE_cx, Transport GLUTATHIONE_cx | "
        + "GLUTATHIONE_c <-> GLUTATHIONE_x",
        "GLUTATHIONE-SYN-RXN, x",
    ],
    directory=dir_data,
    database="ECOLI",
    replacement={},
)
# Running test for GLUTATHIONE_SYN_RXN_x
test_non_zero_flux(
    model=test_model,
    reaction="GLUTATHIONE_SYN_RXN_x",
    ignore_list=["L_GAMMA_GLUTAMYLCYSTEINE_x"],
)

---------------------------------------------------------------------------
Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

Infeasible                                Traceback (most recent call last)
~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    352     try:
--> 353         model.slim_optimize(error_value=None)
    354         # if works, pass and return old objective

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/core/model.py in slim_optimize(self, error_value, message)
   1063         else:
-> 1064             assert_optimal(self, message)
   1065

~/miniconda3/envs/cobramod/lib/python3.7/site-packages/cobra/util/solver.py in assert_optimal(model, message)
    589         exception_cls = OPTLANG_TO_EXCEPTIONS_DICT.get(status, OptimizationError)
--> 590         raise exception_cls(f"{message} ({status}).")
    591

Infeasible: None (infeasible).

During handling of the above exception, another exception occurred:

NotInRangeError                           Traceback (most recent call last)
/tmp/ipykernel_314184/1097592407.py in <module>
     24     model=test_model,
     25     reaction="GLUTATHIONE_SYN_RXN_x",
---> 26     ignore_list=["L_GAMMA_GLUTAMYLCYSTEINE_x"],
     27 )

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    372             reaction=reaction,
    373             times=times + 1,
--> 374             ignore_list=ignore_list,
    375         )
    376

~/Documents/cobramod/src/cobramod/core/extension.py in test_non_zero_flux(model, reaction, times, ignore_list)
    347     # Setting maximum times for recursion
    348     if times == len(model.reactions.get_by_id(reaction).metabolites):
--> 349         raise NotInRangeError(reaction=model.reactions.get_by_id(reaction))
    350     # answer must be reasonable
    351     # comparison must be using absolute values

NotInRangeError: The following reaction "GLUTATHIONE_SYN_RXN_x" failed the non-zero flux test multiple times. Flux values are below solver tolerance. Please curate manually by adding reactions that enable turnover of metabolites: L_GAMMA_GLUTAMYLCYSTEINE_x

1.7. Curation process

CobraMod automatically performs the following curation steps during the creation of COBRApy reaction and metabolite objects and CobraMod pathway objects:

  1. If CobraMod encounters large molecules or data objects with missing entries, it prints a warning.

  2. When adding a metabolite or reaction to the model CobraMod tries to identify if these elements are already present in the model (potentially with a differnt name) and used the exiting elements instead of creating redundant ones (by checking the cross-reference database entries in the elements’s metadata against the model entries).

  3. CobraMod utilizes the COBRApy method cobra.Reaction.check_mass_balance() and returns a warning if imbalances are found.

  4. This package uses the reaction reversibility information provided with the obtained reaction data. If reversibility information is missing, CobraMod raises a warning.

  5. When CobraMod adds a pathway, every pathway reaction undergos a non-zero flux test. If a reaction cannot carry a non-zero flux, CobraMod adds auxiliary sink reactions to unblock the reaction and suggests manual curation steps based on these auxiliary modifications.

  6. All information about downloads, the creation of objects, warnings and exceptions are written to the log file debug.log. As an example, below we show part of such a log file.

[21]:
!head debug.log -n 20
2021-07-02 12:49:56,077 INFO Data for "CPD-14074" retrieved.
2021-07-02 12:49:56,082 INFO Data for "CPD-14075" retrieved.
2021-07-02 12:49:56,086 INFO Data for "CPD-14076" retrieved.
2021-07-02 12:49:56,092 INFO Data for "CPD-14553" retrieved.
2021-07-02 12:49:56,096 INFO Data for "CPD-15317" retrieved.
2021-07-02 12:49:56,102 INFO Data for "CPD-15322" retrieved.
2021-07-02 12:49:56,108 INFO Data for "CPD-15323" retrieved.
2021-07-02 12:49:56,114 INFO Data for "CPD-15326" retrieved.
2021-07-02 12:49:56,142 INFO Object 'C00026_c' identified as a metabolite
2021-07-02 12:49:56,616 INFO Metabolite "MET_c" was added to model.
2021-07-02 12:49:56,635 WARNING Metabolite "MET_c" was already in model. Skipping.
2021-07-02 12:49:56,635 INFO Metabolite "SUCROSE_c" was added to model.
2021-07-02 12:49:56,688 INFO Metabolite "SUCROSE_c" was added to model.
2021-07-02 12:49:56,689 INFO Metabolite "MET_c" was added to model.
2021-07-02 12:49:56,689 INFO Metabolite "MALTOSE_c" was added to model.
2021-07-02 12:49:56,719 INFO Metabolite "xu5p__D_c" was added to model.
2021-07-02 12:49:56,758 WARNING Gene-reaction rule for reaction "R04382" assumed as "OR".
2021-07-02 12:49:56,772 INFO For reaction "R04382_c", gene "c0319" was created and its name changed to "c0319".
2021-07-02 12:49:56,774 INFO Reaction "R04382_c" was added to model.
2021-07-02 12:49:56,788 WARNING Gene-reaction rule for reaction "R04382" assumed as "OR".

1.8. Converting COBRApy groups back to CobraMod pathways

The COBRApy function cobra.io.write_sbml_model() writes cobra models to sbml files. If the user calls this function with a model that contains cobramod.Pathway elements these will be saved as COBRApy Group elements and the CobraMod pathway objects are lost. To convert COBRApy groups back to CobraMod pathways (and use the associated functionalities), the user can call the function cobramod.model_convert(). In the following example, we create a Group with four reactions and add it to the model. We then use the cobramod.model_convert() function with the argument model and can then call the newly converted respective CobraMod pathway objects.

[22]:
from cobramod import model_convert
from cobramod.test import textbook_biocyc
from cobra.core.group import Group

test_model = textbook_biocyc.copy()
# Creation of group
test_group = Group(id="curated_pathway")
for reaction in ("GLCpts", "G6PDH2r", "PGL", "GND"):
    test_group.add_members([test_model.reactions.get_by_id(reaction)])
test_model.add_groups([test_group])

# Conversion to a Pathway
model_convert(model=test_model)
# Display to Jupyter
test_model.groups.get_by_id("curated_pathway")
[22]:
Pathway identifier curated_pathway
Name
Memory address 0x0139993642208144
Reactions involved

GLCpts, G6PDH2r, PGL, GND

Genes involved

b2417, b2416, b1101, b1818, b1621, b2415, b1817, b1819, b1852, b0767, b2029

Visualization attributes
  • vertical = False
  • color_negative = None
  • color_positive = None
  • color_quantile = False

 

1.9. Visualization with Escher

CobraMod uses Escher to visualize pathways and fluxes. Each CobraMod pathway includes a visualization method Pathway.visualize() which automatically generates pathway maps of the respective set of reactions. These pathway maps can be easily customized to visualize flux distributions using default or user-defined colors and gradients (linear or quantile normalized).

In the following example, we call the function visualize without any arguments.

[23]:
test_model.groups.get_by_id("curated_pathway").visualize()

We can modify the orientation of our pathway by changing the attribute vertical to True.

[24]:
test_model.groups.get_by_id("curated_pathway").vertical = True
test_model.groups.get_by_id("curated_pathway").visualize()

The visualization method can also be called with the argument solution_fluxes. This argument can be a dictionary with the fluxes of the reactions or a COBRApy Solution. CobraMod assigns colors to the flux values based on the chosen normalizaton method.

By default, the visualization method uses the minimal and maximal values of the solution_flux argument. The user can define the color of the values by changing the attributes color_negative and color_positive. CobraMod creates a linear color gradient with zero flux values colored in grey.

In the following example, we create a dictionary with fluxes and we pass it to the visualization method.

[25]:
# For flux visualization of the group
solution =  {
    "GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1
}
# Modifying attributes
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

We can change the colors of the fluxes by changing the attribute color_negative and color_positive. In this example, we use the red color for negative fluxes and green for positive fluxes.

[26]:
# Modifying attributes
test_model.groups.get_by_id("curated_pathway").color_negative = "red"
test_model.groups.get_by_id("curated_pathway").color_positive = "green"
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

The user can also manually set bounds for color gradients by modifying the CobraMod patway attribute color_min_max. In this example we change the bounds to -10 and 10. This option is useful when comparing different flux simulations.

[27]:
test_model.groups.get_by_id("curated_pathway").color_min_max = [-10, 10]
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

In the next example, we use the default settings for the color_min_max attribute by setting the respective entry to None and change the color gradient to orange and light blue. A list of available colors can be found here.

[28]:
# New flux with high value
solution =  {
    "GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1, "Other": 1000
}
# Using defaults
test_model.groups.get_by_id("curated_pathway").color_min_max = None

test_model.groups.get_by_id("curated_pathway").color_negative = "orange"
test_model.groups.get_by_id("curated_pathway").color_positive = "lightskyblue"
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

The user can change the color gradient to a quantile normalization. When choosing this option, the color gradient is determined by the quantiles of thesolution_fluxes argument and not by the minimal und minimal flux values. The user can specify this option by changing the attribute color_quantile to True. This option is useful when the fluxes values vary by several orders of magnitude. For instance, in the previous example, we added a reaction to the dictionary with a flux value of 1000. We can see that the positive colors are pale. Thus, in the next example we change the attribute color_quantile and now the colors are much brighter.

[29]:
test_model.groups.get_by_id("curated_pathway").color_quantile = True
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

The user can call the Pathway for a summary of the current attributes.

[30]:
test_model.groups.get_by_id("curated_pathway")
[30]:
Pathway identifier curated_pathway
Name
Memory address 0x0139993642208144
Reactions involved

GLCpts, G6PDH2r, PGL, GND

Genes involved

b2417, b2416, b1101, b1818, b1621, b2415, b1817, b1819, b1852, b0767, b2029

Visualization attributes
  • vertical = True
  • color_negative = orange
  • color_positive = lightskyblue
  • color_quantile = True

 

CobraMod pathway maps are saved as HTML files with the default name pathway.html. The user can specify the file name with the argument filename. In the following example, we name the file curated_pathway.html.

[31]:
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution, filename = "curated_pathway.html"
)

We can verify that the file exists by using the ls command.

[32]:
!ls curated_pathway.html
curated_pathway.html