[1]:

from IPython.display import display, display_html
from cobramod import __version__
from cobra import __version__ as cobra_version
from pandas import set_option
#set_option('max_rows', None)
print(f'CobraMod version: {__version__}')
print(f'COBRApy version: {cobra_version}')
# From Escher:
# This option turns off the warning message if you leave or refresh this page
import escher
escher.rc['never_ask_before_quit'] = True

CobraMod version: 1.2.0
COBRApy version: 0.29.0

6.2. Adding a synthetic homoserine cycle to a genome-scale model of E. coli to reproduce in silico experiments¶

6.2.1. Introduction¶

In this tutorial, we use a genome-scale model of E. coli (iML1515) [1] and extend it with a synthetic homoserine cycle as described in this study [2]. The authors of the article describe current efforts to engineer microorganisms toward using methanol as a carbon source. Methanol can be oxidized into formaldehyde, which can be later utilized for the production of C3-compounds.

In this paper, the authors designed a synthetic cycle (homoserine cycle) that can outperform the natural serine cycle with respect to biomass yield. In this test case, we recreate the in silico experiments using CobraMod and visualize the fluxes of the homoserine pathway.

6.2.2. Loading CobraMod, the E. coli model and setting the growth medium¶

We use the BiGG Models repository to retrieve the iML1515 model and load it using COBRApy. The model includes 2712 reactions, 1877 metabolites, and 1516 genes. Using basic shell commands, we download the model into our current working directory.

[2]:

!wget http://bigg.ucsd.edu/static/models/iML1515.xml  -nc -q

We load the E. coli model using the COBRApy function read_sbml_model(). From CobraMod we load the function cobramod.add_pathway() and the Python module pathlib to define the system’s path where CobraMod stores the metabolic pathway information (dir_data). All the pathway information will be saved here.

[3]:

from pathlib import Path

from cobramod import add_pathway, add_reactions
from cobra.io import write_sbml_model, read_sbml_model

# Defining system path for data
dir_data = Path.cwd().joinpath("data")

# Loading model of iML1515
original = read_sbml_model("iML1515.xml")
model = original.copy()
model

[3]:

Name	iML1515
Memory address	7bf6506b7fe0
Number of metabolites	1877
Number of reactions	2712
Number of genes	1516
Number of groups	0
Objective expression	1.0BIOMASS_Ec_iML1515_core_75p37M - 1.0BIOMASS_Ec_iML1515_core_75p37M_reverse_35685
Compartments	cytosol, extracellular space, periplasm

To demonstrate CobraMod’s functionalities, we first repeat the model curation as listed in the original publication.

[4]:

model.reactions.THD2pp.add_metabolites({"h_p": 1, "h_c": -1})
model.reactions.HSDy.bounds = (-1000, 0)
model.reactions.MMM.bounds = (-1000, 1000)

for reaction in ["POR5", "GLYCK", "FDH4pp", "FDH5pp", "ATPM", "PFL", "OBTFL"]:
    model.reactions.get_by_id(reaction).knock_out()

for rxn in model.boundary:
    # knock out all other carbon-related transporters
    if "C" in rxn.check_mass_balance():
        rxn.bounds = (0.0, 1000)

model.reactions.get_by_id("EX_meoh_e").bounds = (-10.0, 1000)
model.reactions.get_by_id("EX_co2_e").bounds = (-1000, 1000)

6.2.3. Adding the homoserine cycle¶

In the next step, the authors used a function which loads a .csv file and creates the reactions to be added to the model based on the information given in the file. The cell below shows the content of this file.

[5]:

!wget https://github.com/he-hai/PubSuppl/raw/master/2020_Promiscuous%20aldolases/2_FBA/NewRxns4Full_HomSer.csv  -nc -q | cat NewRxns4Full_HomSer.csv

RxnID, RxnName, RxnFormula, Subsystem, LowerBound, UpperBound
MeDH, methanol dehydrogenase (NAD), meoh_c + nad_c <=> fald_c + nadh_c + h_c, MeDH, -1000, 1000
SAL, serine aldolase, gly_c + fald_c --> ser__L_c, HomSerCyc, 0, 1000
HAL, HOB aldolase, pyr_c + fald_c --> hob_c, HomSerCyc, 0, 1000
HAT, HOB amino-transferase, hob_c + glu__L_c <=> hom__L_c + akg_c, HomSerCyc, 0, 1000
FDH, NAD-dependent formate dehydrogenase, for_c + nad_c --> co2_c + nadh_c, SerCyc, 0, 1000

The synthetic cycle consists of seven reactions of which three are not yet in the model. Additionally, there are three other reactions which are required for the pathway to be functional: - ACDH (Acetaldehyde dehydrogenase) - MeDH (Methanol dehydrogenase, converts methanol to formaldehyde) - FDH (formate dehydrogenase, converts formate and NAD+ to CO2 and NADH.)

Of these reactions, MeDH and FDH are not yet included in the model. Here, we use CobraMod to add these new reactions and to define the set of new and already existing reactions that together form the pathway.

Homoserine cycle

The next step is to create a list of reactions that we want to include. We must consider their position in the pathway to fully utilize CobraMod’s capabilities. For each newly introduced reaction, CobraMod performs a non-zero flux test (see Non-zero flux test and Curation process). For the test to be successful, we need to make sure that precursor metabolites for each newly added reaction can be synthezised in the model.

There are two types of strings here: - Strings with the new reaction information. This type of reaction is referred to as a “user-curated reaction” (in contrast to a reaction information downloaded from a database). The syntax follows: > Reaction identifier, name | equation

Strings with the identifiers of those reactions that are part of the pathway and already in the model.

CobraMod tries to find the respective reaction’s metabolites in the model. If they are not present in the model, CobraMod creates and adds these metabolites.

[6]:

reactions = [
    "MeDH, methanol dehydrogenase (NAD) | meoh_c + nad_c <=> fald_c + nadh_c + h_c",
    "SAL, serie aldolase | gly_c + fald_c --> ser__L_c",
    "HAL, HOB aldolase | pyr_c + fald_c --> hob_c",
    "HAT, HOB amino-transferase  | hob_c + glu__L_c <=> hom__L_c + akg_c",
    "FDH, NAD-dependent formate dehydrogenase | for_c + nad_c --> co2_c + nadh_c"
]

[7]:

add_reactions(model=model, obj=reactions, directory=dir_data)

Reaction "MeDH" was added to model.
Reaction "SAL" was added to model.
Reaction "HAL" was added to model.
Reaction "HAT" was added to model.
Reaction "FDH" was added to model.

[8]:

reactions = [
    "MeDH",
    "SAL",
    "SERD_L", # already in model
    "HAL",
    "HAT",
    "HSK", # already in model
    "THRS", # already in model (TS)
    "THRA", # already in model (LTA)
    # Reactions below will be also included
    "ACALD", # already in model (ACDH)
    "FDH"
]

We then use the add_pathway() function to add this list of reactions as a pathway to our model. The first argument model specifies the model to be extended. The argument group assigns our cobramod.Pathway a name, which in this case is Homoserine-PWY. When the argument filename is specified, CobraMod creates a summary of the changes that were made to the model. Here, we save the summary as summary.txt and use it to ensure that the new reactions have been added to the model.

[9]:

add_pathway(
    model=model,
    pathway=reactions,
    compartment="c",
    directory=dir_data,
    group="Homoserine-PWY",
    filename="summary.txt",
)

Non-zero flux test for reaction 'MeDH' passed.
Reaction "MeDH" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'SAL' passed.
Reaction "SAL" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'SERD_L' passed.
Reaction "SERD_L" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'HAL' passed.
Reaction "HAL" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'HAT' passed.
Reaction "HAT" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'HSK' passed.
Reaction "HSK" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'THRS' passed.
Reaction "THRS" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'THRA' passed.
Reaction "THRA" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'ACALD' passed.
Reaction "ACALD" added to group "Homoserine-PWY".
Non-zero flux test for reaction 'FDH' passed.
Reaction "FDH" added to group "Homoserine-PWY".

Number of       new   | removed entities in
*=====================|===================*
Reactions        0    |    0
Metabolites      0    |    0
Exchange         0    |    0
Demand           0    |    0
Sinks            0    |    0
Genes            0    |    0
Groups           1    |    0

We get three warnings after running the cell. CobraMod informs the user about inconsistencies and missing information in the model. Because we manually added reactions that were created from an equation (curated reaction), CobraMod recognizes these reactions and tries to identify if the respective metabolites are already in the model. In our case, only one metabolite hob_c was not found in the model. If a metabolite is not found in the model CobraMod creates it without a chemical formula and charge and raises a warning suggesting the user manually curate this metabolite.

The other two warnings refer to two reactions (HAL and HAT) which involves hob_c. Since this metabolite does not have a chemical formula, the reactions are recognized as unbalanced.

At the end, we see a short summary of the changes made to the model. We also have the option of reading the full summary. This summary was created by CobraMod using the argument filename. This full summary includes all reactions, metabolites, genes, and groups in the model after using add_pathway(). Here, we show the last 30 lines of the file.

[10]:

!tail summary.txt -n 32

New:
Reactions:
[]
Metabolites:
[]
Exchange:
[]
Demand:
[]
Sinks:
[]
Genes:
[]
Groups:
['Homoserine-PWY']

Removed:
Reactions:
[]
Metabolites:
[]
Exchange:
[]
Demand:
[]
Sinks:
[]
Genes:
[]
Groups:
[]

We can use COBRApy to inspect the pathways, i.e. groups in the model. Using the method Model.groups.get_by_id() we can search for the new pathway using its identifier 'Homoserine-PWY. In the cell below, the reactions and genes of the pathway are listed. Moreover, we can see the attributes for the pathway visualization.

The method Pathway.visualize() generates pathway maps which can be easily customized with user-defined colors, flux ranges, and gradient settings using Escher. For instance, we can change the color of the reaction arrows to green for positive flux values and to red for negative flux values by setting the respective attributes of the pathway and we can pass on a COBRApy flux solution with the argument solution_fluxes.

[11]:

model.groups.get_by_id("Homoserine-PWY")

[11]:

Pathway identifier	Homoserine-PWY
Name
Memory address	0x0136297760899344
Reactions involved	SAL, MeDH, HAL, HSK, THRS, ACALD, SERD_L, THRA, FDH, HAT
Genes involved	b0003, b0004, b0351, b1241, b3117, b3708, b4471, b2797, b1814, b0870, b2551
Visualization attributes	vertical = False color_negative = None color_positive = None color_quantile = False

[12]:

solution01 = model.optimize()
print(solution01)
model.groups.get_by_id("Homoserine-PWY").color_negative = "red"
model.groups.get_by_id("Homoserine-PWY").color_positive = "green"
model.groups.get_by_id("Homoserine-PWY").visualize(solution_fluxes=solution01)

Visualization saved in "pathway.html"

<Solution 0.180 at 0x7bf64b8ce330>

6.2.4. Adding cross-references¶

CobraMod queries MetaNetx [4] and tries to find the cross-references of the metabolites and reactions in the model and adds them to the corresponding objects. To show this functionality, we show the annotation of the 300th reaction in the original model. As we can see, the reaction already has multiple annotations.

[13]:

# Using the original model without any modification
test_ = read_sbml_model("iML1515.xml")
test_.reactions[300].annotation

[13]:

{'sbo': 'SBO:0000176',
 'bigg.reaction': 'G3PT',
 'biocyc': 'META:RXN-14965',
 'ec-code': ['3.1.3.2', '3.1.3.21'],
 'kegg.reaction': 'R00841',
 'metanetx.reaction': 'MNXR99894',
 'sabiork': '1182',
 'seed.reaction': 'rxn00610'}

We can extend the cross-references of the model by using the function add_crossreferences() and passing our model in the argument object. In addition, we specify the directory where our data is saved (‘dir data’) in the argument directory. The function displays a progress bar and the amount of time it takes to query all entries (See Using the automatic cross-reference expansion in the documentation).

NOTE: It might take a while to finish gathering all the crossreferences if the model is very large

[14]:

from cobramod.core.crossreferences import add_crossreferences

add_crossreferences(object=model, directory=dir_data)

100%|█████████████████████████████████████████████| 4595/4595 [11:51<00:00,  6.46it/s]

Now, we can see the annotation of the 300th reaction of the model. As we can see, the list of cross-references has increased.

[15]:

model.reactions[300].annotation

[15]:

{'sbo': 'SBO:0000176',
 'bigg.reaction': ['R_G3PT', 'G3PT'],
 'biocyc': 'META:RXN-14965',
 'ec-code': ['3.1.3.21', '3.1.3.2'],
 'kegg.reaction': 'R00841',
 'metanetx.reaction': 'MNXR99894',
 'sabiork': '1182',
 'seed.reaction': 'rxn00610',
 'sabiorkr': '1182',
 'rhear': ['66374', '66372', '66375', '66373'],
 'biggr': ['R_G3PT', 'G3PT'],
 'keggr': 'R00841',
 'metacyc.reaction': 'RXN-14965',
 'seedr': 'rxn00610',
 'rhea': ['66374', '66372', '66375', '66373'],
 'metacycr': 'RXN-14965',
 'sabiork.reaction': '1182',
 'brenda': ['3.1.3.21', '3.1.3.2']}

6.2.5. Saving and reloading the model¶

After modifying the E. coli model we can save it in our current working directory using the native COBRApy function write_sbml_model.

[16]:

%%capture
write_sbml_model(cobra_model=model, filename="iML1515_homoserine.sbml")

For demonstration purposes we delete the previous model from our environment and load the E. coli model with the new homoserine cycle. To do so, we use the native COBRApy function read_sbml_model() with the path of the saved model.

[17]:

%%capture
del model
model = read_sbml_model("iML1515_homoserine.sbml")

[18]:

model

[18]:

Name	iML1515
Memory address	7bf633ddd250
Number of metabolites	1878
Number of reactions	2717
Number of genes	1516
Number of groups	1
Objective expression	1.0BIOMASS_Ec_iML1515_core_75p37M - 1.0BIOMASS_Ec_iML1515_core_75p37M_reverse_35685
Compartments	cytosol, extracellular space, periplasm

Additionally, we can see that the new annotations for the model are saved properly.

[19]:

model.reactions[300].annotation

[19]:

{'sbo': 'SBO:0000176',
 'bigg.reaction': ['R_G3PT', 'G3PT'],
 'biocyc': 'META:RXN-14965',
 'ec-code': ['3.1.3.21', '3.1.3.2'],
 'kegg.reaction': 'R00841',
 'metanetx.reaction': 'MNXR99894',
 'sabiork': '1182',
 'seed.reaction': 'rxn00610',
 'sabiorkr': '1182',
 'rhear': ['66374', '66372', '66375', '66373'],
 'biggr': ['R_G3PT', 'G3PT'],
 'keggr': 'R00841',
 'metacyc.reaction': 'RXN-14965',
 'seedr': 'rxn00610',
 'rhea': ['66374', '66372', '66375', '66373'],
 'metacycr': 'RXN-14965',
 'sabiork.reaction': '1182',
 'brenda': ['3.1.3.21', '3.1.3.2']}

The model was loaded correctly and now behaves like a regular COBRApy Model object. However, CobraMod uses a child-class of cobrapy.Group to create the corresponding pathways. When loaded with COBRApy, these pathways objects lose their functionality. To work around this, we must convert the groups back into Pathway objects to be able to use Escher. To do so, we run the CobraMod function model_convert() which restores all pathway’s functionalities. (See Converting COBRApy groups back to CobraMod pathways in the documentation).

[20]:

from cobramod import model_convert

# Before
model.groups.get_by_id("Homoserine-PWY")

model_convert(model)

# After transformation
display(model.groups.get_by_id("Homoserine-PWY"))

model.groups.get_by_id("Homoserine-PWY").visualize()

Group-object "Homoserine-PWY" was transformed to a Pathway-object.

Pathway identifier	Homoserine-PWY
Name
Memory address	0x0136297856290928
Reactions involved	MeDH, HAT, THRS, THRA, HSK, SAL, ACALD, SERD_L, FDH, HAL
Genes involved	b0004, b2551, b0870, b0003, b0351, b1241, b2797, b1814, b3708, b3117, b4471
Visualization attributes	vertical = False color_negative = None color_positive = None color_quantile = False

Visualization saved in "pathway.html"

6.2.6. Using Memote¶

To assess the quality of our model we use the Memote test suite [3] and run the memote diff command. This command generates a report that details the differences between multiple models, i.e. we compare the original model with the extended model that used CobraMod. The output is saved as diff.html. When we open the file, we can see the results for the different MEMOTE tests and the overall score of the model. In this case, we see that the model’s overall score improved.

[21]:

%%capture
!memote report diff --filename ./_static/diff.html iML1515.xml iML1515_homoserine.sbml

[22]:

from IPython.display import IFrame
IFrame(src='./_static/diff.html', width=900, height=600)

[22]:

6.2.7. Reading the debug file¶

When creating and extending metabolic models, it is important to document the curation process to facilitate easy understanding and reproducibility of the model. To this end, CobraMod writes in a log file documenting all curation steps that were performed. The cell below shows the last 20 lines of this log file (debug.log). We can see that a non-zero-flux test was performed for every newly added reaction. These reactions were then added to the pathway object, and this object was then added to the model. At the end, we can see the transformation of the COBRApy group into a Pathway-object.

[23]:

!tail logs/cobramod_20240219.log -n 20

[23:49:08] INFO Reaction "SAL" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'SERD_L' passed.
[23:49:08] INFO Reaction "SERD_L" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'HAL' passed.
[23:49:08] INFO Reaction "HAL" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'HAT' passed.
[23:49:08] INFO Reaction "HAT" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'HSK' passed.
[23:49:08] INFO Reaction "HSK" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'THRS' passed.
[23:49:08] INFO Reaction "THRS" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'THRA' passed.
[23:49:08] INFO Reaction "THRA" added to group "Homoserine-PWY".
[23:49:08] INFO Non-zero flux test for reaction 'ACALD' passed.
[23:49:08] INFO Reaction "ACALD" added to group "Homoserine-PWY".
[23:49:09] INFO Non-zero flux test for reaction 'FDH' passed.
[23:49:09] INFO Reaction "FDH" added to group "Homoserine-PWY".
[23:49:09] INFO Visualization saved in "pathway.html"
[00:01:10] INFO Group-object "Homoserine-PWY" was transformed to a Pathway-object.
[00:01:10] INFO Visualization saved in "pathway.html"

6.2.8. References¶

Monk, J. M., Lloyd, C. J., Brunk, E., Mih, N., Sastry, A., King, Z., … Palsson, B. O. (2017). IML1515, a knowledgebase that computes Escherichia coli traits. Nature Biotechnology, 35(10), 904–908. doi: 10.1038/nbt.3956
He, H., Höper, R., Dodenhöft, M., Marlière, P., & Bar-Even, A. (2020). An optimized methanol assimilation pathway relying on promiscuous formaldehyde-condensing aldolases in E. coli. Metabolic Engineering, 60, 1–13. doi: 10.1016/j.ymben.2020.03.002
Beber, M., & Lieven, C. (2020). memote—The genome-scale metabolic model test suite (Version 0.11.1). Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark.
Moretti, S., Tran, V. D. T., Mehl, F., Ibberson, M., & Pagni, M. (2021). MetaNetX/MNXref: Unified namespace for metabolites and biochemical reactions in the context of metabolic models. Nucleic Acids Research, 49(D1), D570–D574. doi: 10.1093/nar/gkaa992