Failure to extract Fruitfly-GEM.xml using COBRA #5

Closed
2 of 4 tasks
gmhhope opened this issue Mar 14, 2022 · 4 comments

gmhhope commented Mar 14, 2022

Description of the issue:

I am trying to parse Fruitfly-GEM.xml with cobrapy, but it fails with the following errors:
model = cobra.io.read_sbml_model(xmlFile)

---------------------------------------------------------------------------
CobraSBMLError                            Traceback (most recent call last)
File ~/.local/lib/python3.8/site-packages/cobra/io/sbml.py:249, in read_sbml_model(filename, number, f_replace, **kwargs)
    248     doc = _get_doc_from_filename(filename)
--> 249     return _sbml_to_model(doc, number=number, f_replace=f_replace, **kwargs)
    250 except IOError as e:

File ~/.local/lib/python3.8/site-packages/cobra/io/sbml.py:522, in _sbml_to_model(doc, number, f_replace, set_missing_bounds, **kwargs)
    519 for (
    520     gp
    521 ) in model_fbc.getListOfGeneProducts():  # noqa: E501 type: libsbml.GeneProduct
--> 522     gid = _check_required(gp, gp.getIdAttribute(), "id")
    523     if f_replace and F_GENE in f_replace:

File ~/.local/lib/python3.8/site-packages/cobra/io/sbml.py:1352, in _check_required(sbase, value, attribute)
   1351         msg += " with metaId '%s'" % sbase.getName()
-> 1352     raise CobraSBMLError(msg)
   1353 if attribute == "id":

CobraSBMLError: Required attribute 'id' cannot be found or parsed in '<GeneProduct>'.

The above exception was the direct cause of the following exception:

CobraSBMLError                            Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 model = cobra.io.read_sbml_model(xmlFile)

File ~/.local/lib/python3.8/site-packages/cobra/io/sbml.py:263, in read_sbml_model(filename, number, f_replace, **kwargs)
    253 except Exception as original_error:
    254     cobra_error = CobraSBMLError(
    255         "Something went wrong reading the SBML model. Most likely the SBML"
    256         " model is not valid. Please check that your model is valid using "
   (...)
    261         "at https://github.com/opencobra/cobrapy/issues ."
    262     )
--> 263     raise cobra_error from original_error

CobraSBMLError: Something went wrong reading the SBML model. Most likely the SBML model is not valid. Please check that your model is valid using the `cobra.io.sbml.validate_sbml_model` function or via the online validator at http://sbml.org/validator .
	`(model, errors) = validate_sbml_model(filename)`
If the model is valid and cannot be read please open an issue at https://github.com/opencobra/cobrapy/issues .

I then followed the instructions in the error message and ran the validation:

cobra.io.sbml.validate_sbml_model(xmlFile)
It returns the following:

(None,
 {'SBML_FATAL': [],
  'SBML_ERROR': ["E1 (Error): SBML component consistency (fbc, L299114); Allowed fbc attributes on <GeneProduct>; A <GeneProduct> object must have the required attributes 'fbc:id' and 'fbc:label' may have the optional attributes 'fbc:name' and 'fbc:associatedSpecies'. No other attributes from the SBML Level 3 Flux Balance Constraints namespace are permitted on a <GeneProduct> object. \nReference: L3V1 Fbc V3 Section 3.5\n Fbc attribute 'id' is missing from 'geneProduct' object.\n",
   "E3 (Error): SBML component consistency (fbc, L299114); Allowed fbc attributes on <GeneProduct>; A <GeneProduct> object must have the required attributes 'fbc:id' and 'fbc:label' may have the optional attributes 'fbc:name' and 'fbc:associatedSpecies'. No other attributes from the SBML Level 3 Flux Balance Constraints namespace are permitted on a <GeneProduct> object. \nReference: L3V1 Fbc V3 Section 3.5\n Fbc attribute 'id' is missing from 'geneProduct' object.\n"],
  'SBML_SCHEMA_ERROR': [],
  'SBML_WARNING': ['E0 (Warning): General SBML conformance (core, L3); RDF does not contain valid ModelHistory; LibSBML expected to read the annotation into a ModelHistory object. Unfortunately, some attributes were not present or correct and the resulting ModelHistory object will not correctly produce the annotation.  This functionality will be improved in later versions of libSBML. \nReference: L3V1 Section 6.3\n An invalid ModelHistory element has been stored.\n',
   'E2 (Warning): General SBML conformance (core, L3); RDF does not contain valid ModelHistory; LibSBML expected to read the annotation into a ModelHistory object. Unfortunately, some attributes were not present or correct and the resulting ModelHistory object will not correctly produce the annotation.  This functionality will be improved in later versions of libSBML. \nReference: L3V1 Section 6.3\n An invalid ModelHistory element has been stored.\n'],
  'COBRA_FATAL': [],
  'COBRA_ERROR': ["Required attribute 'id' cannot be found or parsed in '<GeneProduct>'."],
  'COBRA_WARNING': [],
  'COBRA_CHECK': []})
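
The report above is a dictionary that maps each category to a list of messages, so it can be condensed to the non-empty categories. A minimal sketch follows; `summarize_validation` is a hypothetical helper (not part of cobrapy), and `example_errors` is an abbreviated stand-in for the real report:

```python
def summarize_validation(errors):
    """Return (category, count, first_line) for every non-empty category."""
    summary = []
    for category, messages in errors.items():
        if not messages:
            continue
        # Keep only the first line of the first message as a short preview.
        lines = messages[0].strip().splitlines()
        summary.append((category, len(messages), lines[0] if lines else ""))
    return summary


# Hypothetical usage with cobrapy (not executed here):
#   model, errors = cobra.io.sbml.validate_sbml_model(xmlFile)

# Abbreviated stand-in for the report shown above (messages shortened).
example_errors = {
    "SBML_FATAL": [],
    "SBML_ERROR": [
        "E1 (Error): ... Fbc attribute 'id' is missing from 'geneProduct' object.\n"
    ],
    "COBRA_ERROR": [
        "Required attribute 'id' cannot be found or parsed in '<GeneProduct>'."
    ],
    "COBRA_WARNING": [],
}

for category, count, preview in summarize_validation(example_errors):
    print(f"{category}: {count} message(s), e.g. {preview}")
```

Here both the libSBML check (`SBML_ERROR`) and the cobrapy check (`COBRA_ERROR`) point at the same `<GeneProduct>` id problem.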

Expected feature/value/output:

I would expect to be able to parse Fruitfly-GEM.xml, with identifiers that are synchronized with the metabolites.tsv and yml files.

Reproducing these results:

# https://cobrapy.readthedocs.io/en/latest/io.html#SBML
import cobra
import os
import requests

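# Name of the GEM repository to download (e.g. Human-GEM); here Fruitfly-GEM
model_name = 'Fruitfly-GEM'

outputfdr = f"../../output/ATLAS_collection/{model_name}"
# Create the output folder if it does not exist yet
os.makedirs(outputfdr, exist_ok=True)

BRANCH = 'main'
XML = os.path.join(outputfdr, f'{model_name}.xml')
# Download the raw SBML file from GitHub and save it locally
r = requests.get(f'https://github.com/SysBioChalmers/{model_name}/blob/{BRANCH}/model/{model_name}.xml?raw=true')
r.raise_for_status()  # stop early if the download failed
with open(XML, 'w') as f:
    f.write(r.text)

# read_sbml_model accepts a file path or the SBML document as a string
with open(XML, 'r') as f:
    xmlFile = f.read()

model = cobra.io.read_sbml_model(xmlFile)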

I hereby confirm that I have:

  • Tested my code with all requirements for running the model
  • Done this analysis in the main branch of the repository
  • Checked that a similar issue does not exist already
  • If needed, asked first in the Gitter chat room about the issue

gmhhope commented Apr 25, 2022

Dear colleagues,

The issue still persists! Please see if you can help update it!

Many thanks,
Minghao Gong

gmhhope added a commit to shuzhao-li-lab/JMS that referenced this issue Apr 26, 2022
## yeast-GEM
yeast-GEM was built from a very different original model than Human/Rat/Mouse/Worm
- There is no `pathway` info in the xml. Also, there is no `metabolite.tsv` under the `model` folder from which to fetch or update identifiers (but the original identifiers in the model proved to be good enough, I guess)

## Fruitfly-GEM
- Cannot be read by cobra: see SysBioChalmers/Fruitfly-GEM#5
haowang-bioinfo self-assigned this Apr 27, 2022

haowang-bioinfo commented

> The issue still persists! Please see if you can help update it!

@gmhhope thanks for reporting this, will fix it asap

haowang-bioinfo commented May 3, 2022

The problems reported here are caused by gene ids (e.g. 5Ptasel) whose format is invalid under the identifier rules that libSBML enforces. Commit 1603557 addresses this by adding the prefix G_ to gene ids, though this might not be the optimal solution.
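
For illustration, SBML identifiers (the `SId` type) must start with a letter or underscore and may then contain only letters, digits, and underscores, so an id that begins with a digit is rejected. This is presumably why the validator reports the `id` attribute as missing. The sketch below checks that rule and applies a `G_` prefix; the helper names `is_valid_sid` and `add_gene_prefix` are hypothetical and not taken from the fixing commit:

```python
import re

# SBML SId syntax: a letter or underscore, then letters, digits, or underscores.
SID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_sid(identifier):
    """Return True if the whole string satisfies the SId syntax."""
    return SID_PATTERN.fullmatch(identifier) is not None


def add_gene_prefix(gene_id, prefix="G_"):
    """Prepend a prefix so that ids starting with a digit become valid.

    Assumption: only the leading character is a problem; any other
    disallowed characters (e.g. '-') would need separate handling.
    """
    return prefix + gene_id


gene_id = "5Ptasel"  # example id quoted in the comment above
print(gene_id, is_valid_sid(gene_id))
print(add_gene_prefix(gene_id), is_valid_sid(add_gene_prefix(gene_id)))
```

Prefixing changes the ids that appear in the model, so anything keyed on the original gene names would need the same mapping.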

haowang-bioinfo commented

This has been fixed in v1.2; feel free to reopen the issue if the problem persists.
