Human 1.17 #700

haowang-bioinfo · 2023-09-25T21:42:37Z

Main improvements in this PR:

Fixes (PR fix: update name for GAP in CellfieConsensus metabolic tasks #696)
- update name for GAP in Cellfie Consensus metabolic tasks
Features (PR fix: refine animal GEM updating pipeline #692)
- sync geneShortNames field (introduced by add gene name to Human-GEM #539) by padding with empty elements in function addMetabolicNetwork
- add an option to updateAnimalGEM for reusing the Human-GEM biomass equation
Fixes (PR Remove PPARA Reactions #678)
- remove 9 reactions involving MAM01268 or MAM02757 in all compartments, and 10 metabolites and 22 genes that only participated in these reactions, as discussed in the comments on Pairs of reactions with identical stoichiometry but different GPRs #580
Fixes (PR Merge Ammonium/Ammonia Duplicates Into Single Reactions #677)
- remove MAR01515, MAR00039, MAR00096, MAR02027, MAR02006 that are duplicates of MAR08529, MAR07992, MAR04819, MAR04302, MAR03873, respectively, as proposed in Ammonia/Ammonium Duplicate Reactions #665
- merge references from removed reactions into kept ones
Fixes (PR Make Citrate Synthase Irreversible #676)
- make MAR04145, the citrate synthase reaction, irreversible, as proposed in Citrate Synthase Is Not Reversible In Humans #673
- swap the products and reactants to have non-negative fluxes, and add textbook reference to rxnNotes
Features/Fixes (PR feat: add basic memote tests #672, fix: GH workflow test #635)
- fix GH workflow test of the import-export cycle for resolving bug identified in development branch #628
- introduce basic MEMOTE tests in the yaml-validation workflow, including checks for duplicate reactions and a check that all reactions have at least one metabolite
Fixes (PR Fix FAD/Oxygen-Peroxide Duplicate Reactions #670)
- remove MAR00059 for being a less accurate version of MAR01706, as discussed in FAD/Oxygen-Peroxide Duplicate Reactions #639
- change the GPR of MAR01706 to "ENSG00000113790 or ENSG00000133835" and add PMID:19357427 as reference
- remove MAR01639, MAR03301, MAR03306, MAR03311, MAR04966, MAR03364, MAR03369 that are duplicates of MAR01638, MAR02804, MAR02808, MAR02810, MAR02818, MAR03276, MAR03289, respectively
- create new reaction MAR20114 to represent reduction of O2 by FADH2: MAM02630x + MAM01803x -> MAM02041x + MAM01802x, GPR: ENSG00000161533 or ENSG00000168306 or ENSG00000087008 or ENSG00000110887 or ENSG00000203797 or ENSG00000007171 or ENSG00000148832 or ENSG00000179761 or ENSG00000158125, with reference: PMID:30378035
- replace MAM02630x (O2) with MAM01802x (FAD) and MAM02041x (H2O2) with MAM01803x (FADH2) in a total of 120 reactions, as detailed in FAD/Oxygen-Peroxide Duplicate Reactions #639
Fixes (PR Remove Mitochondrial Bile Acid Synthesis Pathway #669)
- remove the implausible mitochondrial version of the bile acid synthesis pathway (13 reactions and 8 metabolites), as outlined in Duplicate Bile Acid Synthesis Reactions #637
Fixes (PR Fix duplicate metabolites identified by ChEBI IDs(2) #667)
- merge MAM03231 into MAM03232, as reported in Suspected duplicated metabolites from ChEBI #627
- update ChEBI ID for MAM02679, MAM02579, MAM02043, MAM03887
Features/Fixes (PR feat: log duplicated ids to the column of rxnRetired #666)
- improve documentation of the reactions deprecated in Fix FAD/Ubiquinone Duplicates #613 by adding their IDs to the "rxnRetired" column of the corresponding rows in reactions.tsv, as a tentative implementation following the discussion in How to better document deprecated met/rxn ids #615
Fixes (PR add Alpha subunits of sodium channel genes to MAR01527 #664)
- update the GPR of MAR01527, the reversible transport of sodium across the cell membrane, by expanding it with 19 transporter genes based on TCDB and the literature, as discussed in Inaccurate GPR of MAR01527 #583
Fixes (PR Fix duplicate metabolites identified by HMDB IDs #663)
- remove mets MAM03318 and MAM03325 for being duplicates of MAM01435 and MAM00684, respectively, as discussed in Suspected duplicated metabolites from HMDB #617
- remove rxns MAR02276, MAR02277, MAR02384, MAR02390, MAR02392, MAR02394, MAR04144 and MAR10447 for being affected by duplicate mets
- change HMDB ID annotations for MAM02679 and MAM01090
Fixes (PR Fix duplicate metabolites identified by PubChem IDs(2) #662)
- remove met MAM03540, for being a duplicate of MAM01729, and rxn MAR04506, as discussed in Suspected duplicated metabolites from PubChem #622
- change KEGG ID annotations for MAM01230, MAM02676, MAM02647, MAM03021, MAM00531, MAM02678, MAM02679
Fixes (PR Fix duplicate metabolites identified by KEGG IDs(2) #661)
- remove mets MAM00591 and MAM03162 for being duplicates of MAM0270 and MAM1739, respectively, as discussed in Suspected duplicated metabolites from KEGG #621
- remove rxns MAR00941, MAR06142, MAR06143, MAR06144, MAR06220, MAR06252, MAR10025, MAR09733, MAR09734 for being affected by duplicate mets
- update annotation for MAM01322, MAM02043, MAM02579, MAM02679
Fixes (PR Remove MAR03767 #660)
- remove MAR03767 for being a duplicate of MAR06421+MAR06422, as proposed in Proposed Removal of MAR03767 #633
- update duplication info to "rxnRetired" column in reactions.tsv
Fixes (PR Remove lipid droplet breakdown reaction MAR00032 #658)
- remove lipid droplet reaction MAR00032 and associated non-metabolic genes ENSG00000072062, ENSG00000142875, ENSG00000165059, ENSG00000172531, ENSG00000186298, and ENSG00000213639, as proposed in Proposed removal of MAR00032 #631
Fixes (PR Merge Reactions With Same Stoichiometry But Different Genes #657)
- remove MAR02358, MAR20066, MAR20020, MAR20002, MAR00030 that are duplicates of MAR08482, MAR03984, MAR04521, MAR06631, MAR00025, respectively, as reported in Pairs of reactions with identical stoichiometry but different GPRs #580 and Suspected duplicated reactions #630
- update the pairing info to "rxnRetired" column for kept reactions in reactions.tsv
- merge annotations, references, E.C. codes, etc. from removed reactions into kept reactions
- make MAR08482 reversible
- revise compartment-specific GPRs for MAR08482 and MAR03931
Fixes (PR Fix Sodium Cotransport GPRs #655)
- remove MAR05310 for being a partial duplicate of MAR06380, and merge its annotation into that of MAR06380, as proposed in Proposed changes to Sodium Transport GPRs #569
- revise GPR of MAR06380 to ENSG00000017483 or ENSG00000188338
- convert MAR06380 to reversible
- remove MAR09887 for being a partial duplicate of MAR06896
Fixes (PR Fix duplicate metabolites identified by LipidMaps IDs #650)
- remove met MAM00759 for being a duplicate of MAM01314, as investigated in Suspected duplicated metabolites from LipidMaps #620
- update LipidMaps ID annotation for MAM00892, MAM00895, MAM00057, MAM02647
Fixes (PR Fix duplicate metabolites identified by EHMN IDs #649)
- remove mets MAM03354, MAM03332, MAM00029, MAM03368, MAM00902 for being duplicates of MAM00131, MAM01696, MAM00028, MAM00748, and MAM00853, respectively, as discovered in Suspected duplicated metabolites from EHMN #616
- remove rxns MAR03441, MAR02588, MAR10164, MAR10433 for being affected by duplicate mets
- change EHMN ID annotations for MAM01123
Fixes (PR Fix duplicate metabolites identified by BiGG IDs #646)
- remove mets MAM03226, MAM03884, MAM03554 for being duplicates of MAM00785, MAM02746, and MAM00933, respectively, as discussed in Suspected duplicated metabolites from Bigg #610
- remove rxns MAR00042, MAR01617, MAR00740, MAR00659, MAR05323, MAR01709 for being affected by duplicate mets
- change BiGG ID annotations for MAM01746, MAM00078, MAM02328, MAM02008
Fixes (PR Make Reversible PPi Reactions Irreversible, Part 2 #641)
- make the pyrophosphate-producing reactions MAR02786, MAR02802, MAR03388, MAR03390, MAR09508, MAR01845, MAR09523, and MAR09546 irreversible, as mentioned in (d)NTP <-> (d)NMP + PPi Reactions Should Be Irreversible #527
Fixes (PR Merge Pairs Of Irreversible Transport Reactions #625)
- merge 31 pairs of irreversible transport reactions, reported in Pairs of irreversible transport reactions that should probably be merged into single reversible reactions #562, into single reversible reactions
Fixes (PR Fix Acetyl-CoA Carboxylase Reactions #624)
- tidy up acetyl-CoA carboxylase reactions, as proposed in Problems With Acetyl-CoA Carboxylase Reactions #593
Fixes (PR Fix FAD/Ubiquinone Duplicates #613)
- remove 15 FAD/Ubiquinone duplicate reactions, as detailed in FAD/Ubiquinone Duplicate Reactions #607
Fixes (PR fix: adapt simplifyGrRules to updates in Matlab children function #494)
- modify simplifyGrRules function to fix compatibility with Matlab "children" function, which has been changed since 2020b, to resolve issue described in update use of function children in simplifyGrRules #493
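
Many of the fixes listed above remove reactions with identical stoichiometry, merge pairs of opposite irreversible transport reactions, or (in #672) add a check that every reaction has at least one metabolite. A minimal Python sketch of what such checks could look like is given below. It is not the repository's MEMOTE test code or its MATLAB functions; the reaction IDs `R1` to `R5`, the metabolite names, and the dictionary layout are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def stoich_key(stoich):
    """Hashable form of a stoichiometry dict (metabolite -> coefficient)."""
    return frozenset(stoich.items())


def find_identical_stoichiometry(reactions):
    """Group reaction IDs that share exactly the same stoichiometry."""
    groups = defaultdict(list)
    for rxn_id, rxn in reactions.items():
        if rxn["stoich"]:
            groups[stoich_key(rxn["stoich"])].append(rxn_id)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


def find_opposite_irreversible_pairs(reactions):
    """Find irreversible reactions whose stoichiometries are exact negatives."""
    irreversible = {
        rxn_id: stoich_key(rxn["stoich"])
        for rxn_id, rxn in reactions.items()
        if rxn["stoich"] and not rxn["reversible"]
    }
    pairs = []
    for rxn_id, key in irreversible.items():
        flipped = frozenset((met, -coef) for met, coef in key)
        pairs.extend(
            (rxn_id, other)
            for other, other_key in irreversible.items()
            if rxn_id < other and other_key == flipped
        )
    return sorted(pairs)


def find_empty_reactions(reactions):
    """List reactions that have no metabolites at all."""
    return sorted(rxn_id for rxn_id, rxn in reactions.items() if not rxn["stoich"])


# Hypothetical toy model, not taken from Human-GEM.
toy_reactions = {
    "R1": {"stoich": {"A[c]": -1, "B[c]": -1, "C[c]": 1}, "reversible": False},
    "R2": {"stoich": {"A[c]": -1, "B[c]": -1, "C[c]": 1}, "reversible": False},
    "R3": {"stoich": {"A[e]": -1, "A[c]": 1}, "reversible": False},
    "R4": {"stoich": {"A[c]": -1, "A[e]": 1}, "reversible": False},
    "R5": {"stoich": {}, "reversible": False},
}

duplicates = find_identical_stoichiometry(toy_reactions)
opposite_pairs = find_opposite_irreversible_pairs(toy_reactions)
empty = find_empty_reactions(toy_reactions)
print(duplicates, opposite_pairs, empty)
```

The sketch only compares reactions written in the same direction. A reversible reaction that duplicates another one written in reverse would need the flipped key to be checked as well, and deciding which reaction of a pair to keep (GPRs, references, annotations) remains a manual curation step, as in the PRs above.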

JonathanRob

Wow, a lot of changes in this update - great to see the ongoing progress and improvements!

mihai-sysbio

Very nice!
@haowang-bioinfo will we see the essentiality results, like last time?

feiranl

LGTM!

haowang-bioinfo · 2023-09-27T11:21:23Z

> @haowang-bioinfo will we see the essentiality results, like #645 (comment)?

yes, indeed!

haowang-bioinfo · 2023-09-28T07:22:23Z

updated Hart2015 essentiality results:

| cellLine | TP | TN | FP | FN | accuracy | sensitivity | specificity | F1 | MCC | Penr | logPenr | PenrAdj | logPenrAdj |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DLD1 | 105 | 2077 | 166 | 211 | 0.852676826885502 | 0.332278481012658 | 0.925991975033437 | 0.357751277683135 | 0.276134233976690 | 4.77088648129624e-33 | 32.3214009169421 | 1.43126594438887e-32 | 31.8442796622224 |
| GBM | 105 | 2061 | 166 | 226 | 0.846755277560594 | 0.317220543806647 | 0.925460260440054 | 0.348837209302326 | 0.264662129889304 | 5.62054079140837e-31 | 30.2502218959211 | 1.12410815828167e-30 | 29.9491919002571 |
| HCT116 | 135 | 2117 | 143 | 220 | 0.861185468451243 | 0.380281690140845 | 0.936725663716814 | 0.426540284360190 | 0.352278370608797 | 1.14499825836663e-52 | 51.9411951739200 | 6.86998955019978e-52 | 51.1630439235364 |
| HELA | 89 | 2143 | 189 | 195 | 0.853211009174312 | 0.313380281690141 | 0.918953687821612 | 0.316725978647687 | 0.234526274235307 | 5.49947761890298e-25 | 24.2596785610516 | 8.24921642835447e-25 | 24.0835873019959 |
| RPE1 | 70 | 2084 | 201 | 203 | 0.842064112587959 | 0.256410256410256 | 0.912035010940919 | 0.257352941176471 | 0.168991753181488 | 3.40109542793128e-14 | 13.4683811824623 | 4.08131451351753e-14 | 13.3891999364147 |
| all | 41 | 2263 | 237 | 75 | 0.880733944954129 | 0.353448275862069 | 0.905200000000000 | 0.208121827411168 | 0.172768250444715 | 2.39047070704732e-13 | 12.6215165738028 | 2.39047070704732e-13 | 12.6215165738028 |
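
The `logPenr` and `PenrAdj` columns are not explained in this thread. Judging only from the numbers above, `logPenr` looks like `-log10(Penr)`, and `PenrAdj` matches what a Benjamini-Hochberg adjustment over the six rows would give. The Python sketch below checks that reading against the table; it is an inference from the values, not a description of how the evaluation script actually computes them.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Penr and PenrAdj values copied from the table above.
penr = {
    "DLD1": 4.77088648129624e-33,
    "GBM": 5.62054079140837e-31,
    "HCT116": 1.14499825836663e-52,
    "HELA": 5.49947761890298e-25,
    "RPE1": 3.40109542793128e-14,
    "all": 2.39047070704732e-13,
}
penr_adj_reported = {
    "DLD1": 1.43126594438887e-32,
    "GBM": 1.12410815828167e-30,
    "HCT116": 6.86998955019978e-52,
    "HELA": 8.24921642835447e-25,
    "RPE1": 4.08131451351753e-14,
    "all": 2.39047070704732e-13,
}

names = list(penr)
adjusted = dict(zip(names, benjamini_hochberg([penr[name] for name in names])))

for name in names:
    matches = math.isclose(adjusted[name], penr_adj_reported[name], rel_tol=1e-9)
    print(f"{name:7s} -log10(Penr)={-math.log10(penr[name]):.4f} "
          f"BH={adjusted[name]:.6e} matches_table={matches}")
```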

haowang-bioinfo · 2023-09-28T07:32:59Z

Updated essentiality evaluation using combined (all) Hart2015 datasets:

| version | TP | TN | FP | FN | accuracy | sensitivity | specificity | F1 | MCC |
|---|---|---|---|---|---|---|---|---|---|
| v1.12 | 40 | 2333 | 175 | 77 | 0.904000000000000 | 0.341880341880342 | 0.930223285486443 | 0.240963855421687 | 0.204768560393159 |
| v1.13 | 40 | 2334 | 174 | 77 | 0.904380952380952 | 0.341880341880342 | 0.930622009569378 | 0.241691842900302 | 0.205504558241293 |
| v1.14 | 40 | 2334 | 175 | 77 | 0.904036557501904 | 0.341880341880342 | 0.930251096054205 | 0.240963855421687 | 0.204787829413435 |
| v1.15 | 40 | 2342 | 168 | 77 | 0.906737723639132 | 0.341880341880342 | 0.933067729083665 | 0.246153846153846 | 0.210053956889428 |
| v1.16 | 41 | 2308 | 202 | 76 | 0.89417587 | 0.35042735 | 0.91952191 | 0.22777778 | 0.19220101 |
| v1.17 | 41 | 2263 | 237 | 75 | 0.880733944954129 | 0.353448275862069 | 0.905200000000000 | 0.208121827411168 | 0.172768250444715 |

The trend of MCC is not very promising
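
For anyone who wants to follow the trend from the raw counts, here is a small Python sketch that recomputes the metrics from TP/TN/FP/FN and prints the release-to-release change in MCC. The counts are copied from the table above; the function and variable names are only illustrative, and the formulas are the standard textbook definitions rather than the evaluation script itself.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "F1": 2 * tp / (2 * tp + fp + fn),
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
    }


# (TP, TN, FP, FN) per release, combined Hart2015 datasets.
counts = {
    "v1.12": (40, 2333, 175, 77),
    "v1.13": (40, 2334, 174, 77),
    "v1.14": (40, 2334, 175, 77),
    "v1.15": (40, 2342, 168, 77),
    "v1.16": (41, 2308, 202, 76),
    "v1.17": (41, 2263, 237, 75),
}

metrics = {version: confusion_metrics(*c) for version, c in counts.items()}

previous_mcc = None
for version, m in metrics.items():
    delta = "" if previous_mcc is None else f"{m['MCC'] - previous_mcc:+.4f}"
    print(f"{version}  FP={counts[version][2]:>3}  MCC={m['MCC']:.4f}  {delta}")
    previous_mcc = m["MCC"]
```

Seen this way, TP and FN barely move between v1.15 and v1.17, while FP rises from 168 to 237, so the MCC drop is driven almost entirely by the extra false positives.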

mihai-sysbio · 2023-09-28T08:24:56Z

> The trend of MCC is not very promising

Indeed. How about we pause this release and investigate deeper?

JonathanRob · 2023-09-28T08:31:01Z

It looks like the number of false positives is increasing quite a bit. This could be the result of e.g., removal of duplicate pathways, removal of incorrect isozymes, or closing of some infeasible loops - all of which are generally good things.
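
To illustrate the mechanism with a toy case: the Python sketch below assumes that a "positive" is a gene predicted to be essential, and it uses a deliberately crude stand-in for the simulation (the model "grows" only if every reaction still has at least one intact enzyme complex). The reaction and gene names are hypothetical, and no real GPRs from Human-GEM are used.

```python
def reaction_active(gpr, knocked_out):
    """A GPR is a list of isozyme alternatives; each alternative is a set of
    genes that must all be present (an enzyme complex)."""
    return any(not (complex_genes & knocked_out) for complex_genes in gpr)


def predicted_essential(model, genes):
    """Genes whose single knockout disables at least one required reaction.
    Toy criterion: every reaction in the model is required for growth."""
    return {
        gene
        for gene in genes
        if not all(reaction_active(gpr, {gene}) for gpr in model.values())
    }


genes = {"G1", "G2", "G3"}

# Before curation: R1 is catalysed by G1 OR G2 (two isozymes).
model_before = {"R1": [{"G1"}, {"G2"}], "R2": [{"G3"}]}
# After curation: G2 is judged an incorrect isozyme and removed from R1.
model_after = {"R1": [{"G1"}], "R2": [{"G3"}]}

essential_before = predicted_essential(model_before, genes)
essential_after = predicted_essential(model_after, genes)
print(sorted(essential_before), sorted(essential_after))
```

In this toy, removing the alternative isozyme makes G1 newly "essential" in silico. If G1 is not essential in the experimental data, that single curation step adds one false positive, even though the GPR itself became more accurate.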

So I agree with @mihai-sysbio that it would be worth investigating, but I also wouldn't be too alarmed.

haowang-bioinfo · 2023-09-28T11:38:23Z

It's a good idea to investigate the declining MCC further; any attempt is welcome.

However, pausing this release doesn't make sense, because:

- Hart2015 is a limited dataset and essentiality is only one dimension of assessment;
- Every single change has been evidence-based, openly discussed, and transparently tracked; this means we should move on
- we may be choking on food, but still have to eat

mihai-sysbio · 2023-09-28T12:56:06Z

For the sake of the debate, I'm going to argue for the pause. Don't get me wrong, I am not entirely convinced pausing is the best way forward. I am just willing to engage in the debate.

A pause is not a cancellation, it's a delay, and the model can still be used. The model is openly available, even though not in the main branch, and it can still be uniquely identified with the commit hash, as visible in the permalink. However, it is only available in the yml format.

> Hart2015 is a limited dataset and essentiality is only one dimension of assessment;

Sure, Hart2015 is a limited dataset, but maybe it's time to make the time investment to set the essentiality to run with Depmap. In my mind this is an unavoidable progressive step, and the latest essentiality results are enough of a motivation.

> Every single change has been evidence-based, openly discussed, and transparently tracked; this means we should move on

If I am to extrapolate, for me it would not be acceptable to have an evidence-based model with MCC of -0.3. That's obviously an extreme (implausible?) scenario. What I'm trying to say is that the condition of evidence-based is necessary but not sufficient.

> we may be choking on food, but still have to eat

Sorry, I'm not sure what this means.

Moreover, the information is still relatively fresh now, so digging into this now might be more comfortable than in, say, 6 months. And what we learn from this process will also benefit the next release.

mihai-sysbio · 2023-09-28T14:07:00Z

Some more thoughts:

The way I've mapped the MCC score in my head is like this: the interval [-1, 0] is "worse than useless," and the interval [0, 1] corresponds to 0% to 100% usefulness. With this perspective, a reduction of 5% in usefulness from 21% usefulness is not good. Of course, this thinking really abuses the numbers.

Another thing: if a new release would be affecting the MCC by say 0.2 (instead of 0.05 like now) we would likely take it very seriously. So then the question is not if but how much of a MCC change is acceptable with a release.

haowang-bioinfo · 2023-09-28T20:09:26Z

Debate is good, it often makes things clearer. My previous message was probably a bit ambiguous, so let me try again.

My key point is that investigating the MCC and making this release can be separated and can proceed in parallel. They do not conflict with each other, so there is no need to pause one for the other.

> Sure, Hart2015 is a limited dataset, but maybe it's time to make the time investment to set the essentiality to run with Depmap. In my mind this is an unavoidable progressive step, and the latest essentiality results are enough of a motivation.

Yes, of course, I totally agree to check this out. Actually, this and all previous releases, together with the transparently tracked changes they contain, enable this kind of check.

> If I am to extrapolate, for me it would not be acceptable to have an evidence-based model with MCC of -0.3. That's obviously an extreme (implausible?) scenario. What I'm trying to say is that the condition of evidence-based is necessary but not sufficient.

Agree, literature evidence, MCC, and some other indicators may still not be sufficient. But this shouldn't stop rational changes with critical review.

> Moreover, now the information is still relatively fresh, so digging into this now might be more comfortable than in, say, 6 months. And what we will learn from this process will also be benefiting the next release.

yes, just go ahead please

> Another thing: if a new release would be affecting the MCC by say 0.2 (instead of 0.05 like now) we would likely take it very seriously. So then the question is not if but how much of a MCC change is acceptable with a release.

I'm not exactly sure what cutoff should be used for evaluating MCC, which is only one dimension of assessment, so any follow-up analysis that clarifies this is welcome.

JonathanRob and others added 30 commits September 15, 2022 14:13

fix: adapt simplifyGrRules to updates in Matlab children function

8e05f05

fix: remove ubiquinone-dependent versions of all duplicated reactions…

383970a

… but remove the FAD-dependent version of the SDH reaction

fix: update reactions data files

924dd73

fix: deleted extra omap line that I missed when removing reactions ea…

676eaf2

…rlier

fix: correct spelling errors in names of MAM00185c and MAM00185m

20076fc

fix: remove mitochondrial ACACB from GPRs of cytosolic MAR07672 and M…

75b0170

…AR07673

fix: change subsystems for MAR07672 and MAR07673 to Fatty acid biosyn…

1dfe54c

…thesis (even-chain)

fix: remove MAR04156 for being redundant with MAR07672+MAR07673

d4caa65

fix: add new metabolite MAM01422m

a5c4d76

fix: edited MAR04295 to become mitochondrial version of MAR07673

7965567

fix: made new reaction MAR20112 to represent mitochondrial version of…

c799c7d

… MAR07672

fix: made new reaction MAR20113 to transport MAM00185 between [c] and…

a52e66c

… [m]

fix: resolved merge conflict in deprecatedReactions.tsv

ad198f7

fix: fixed typo in MAR20111 that was causing automated YAML validatio…

6f266d4

…n to fail

fix: typo in model/reactions.tsv

dacbfca

fix: merge reactions

65bd88a

fix: add IVD to GPR of MAR03770

76edfde

fix: remove MAR11652, which I apparently forgot to do earlier

1b670e4

fix: moved MAR11651 from reactions.tsv to deprecatedReactions.tsv, wh…

37a92e5

…ich I apparently forgot to do earlier

fix: add references and EC codes from deleted reactions to duplicate …

f04966d

…kept reactions

fix: merge annotations from deleted reactions into annotations for ke…

9269485

…pt reactions in model/reactions.tsv

fix: remove MAR02369 for being redundant with MAR03149

f46d539

fix: merged references, EC codes, and GPRs of deleted reactions into …

24a00d2

…kept ones

fix: merged annotations of removed reactions with kept ones in reacti…

30a0613

…ons.tsv

feat: improve test output for unused entities

f7f31f5

test: introduce a bug to see if it is picked up by the workflow

db8ca8c

fix: typos introduced in #606

e118e1d

- This commit includes fixing typos, and reverting back to original way of extracting subsystem values

test: revert back testYamlConversion to the original

3be0333

refactor: test yaml conversion

5410ebf

fix: make reversible NTP <-> PPi reactions irreversible in PPi-produc…

7f4f02b

…tion direction

haowang-bioinfo and others added 10 commits August 21, 2023 22:50

Merge pull request #669 from SysBioChalmers/fix/remove_mito_bile_acid…

95d30b7

…_synth Remove Mitochondrial Bile Acid Synthesis Pathway

fix: update name for GAP

797b1b3

Merge pull request #696 from exaexa/mk-fix-gap

5bade46

fix: update name for GAP in CellfieConsensus metabolic tasks

Merge branch 'develop' into fix/FAD_O2_dupes

cc2b5ca

fix: re-remove duplicated reaction MAR01639

4fa0d28

fix: re-remove deprecated reactions from "reactions.tsv"

452c1e8

remove rows 'MAR01639', 'MAR01635', 'MAR01637', 'MAR01653', 'MAR01656' from "reactions.tsv"

fix: removed MAR00059 for being less accurate version of MAR01706

1f3527d

Fixed GPR of MAR01706 and added reference

5852d9f

chore: resolve merge conflicts

d5dbfe1

Merge pull request #670 from SysBioChalmers/fix/FAD_O2_dupes

be554ff

Fix FAD/Oxygen-Peroxide Duplicate Reactions

haowang-bioinfo requested review from feiranl, Devlin-Moyer, mihai-sysbio and JonathanRob September 26, 2023 14:45

JonathanRob approved these changes Sep 26, 2023

mihai-sysbio approved these changes Sep 27, 2023

feiranl approved these changes Sep 27, 2023

Devlin-Moyer approved these changes Sep 27, 2023

haowang-bioinfo merged commit 8643327 into main Sep 30, 2023
4 checks passed

mihai-sysbio mentioned this pull request Oct 6, 2023

feat: gene essentiality workflow #675

Merged

3 tasks

mihai-sysbio mentioned this pull request Nov 28, 2023

add qualitative workflows #732

Open
