user defined biomass function #156

Closed
hariszaf opened this issue Nov 21, 2022 · 10 comments

Comments

@hariszaf

Hello!

I noticed in your documentation that you're working on a
user-defined biomass reaction module.

Is there any progress with that?

If not, I would be very interested in contributing to it.

@hariszaf
Author

With that in mind, could you please explain what the link field in the biomass function stands for?
For example here.

@jotech
Owner

jotech commented Nov 22, 2022

Hi @hariszaf
The field 'link' specifies an additional substance that has to be added for each unit of the biomass component to balance the product side of the biomass reaction. The overall biomass reaction is not balanced to simulate the consumption by biosynthesis during growth. Nonetheless, some metabolites, like water and ppi, must be recycled.

Please correct me @Waschina, if I missed something.

From my side, you are most welcome to work on a user-defined biomass reaction; thank you! I will talk to Silvio, and we will come back to you soon :)

@hariszaf
Author

Hi @jotech,

thanks a lot for clarifying the link field.

I am now building a biomass function for one of my models. Once I have it, we could discuss, if you'd like, what one should keep in mind when using gapseq with a user-defined biomass function.

I just made a fork of the repo and I'd be happy to contribute (if I manage 😅).

@jotech
Owner

jotech commented Nov 23, 2022

sounds like a good plan :)

@Waschina
Collaborator

Waschina commented Dec 1, 2022

Hi @hariszaf

I guess the easiest and most intuitive way is to provide a path to the custom-biomass file in json format with the parameter -b in the module gapseq draft.

Currently, the option -b expects one of the values "pos", "neg", "archaea", or "auto". I can change the code so that gapseq recognizes when a path to a json file is provided instead of one of the preset biomasses.
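
The dispatch described above could look roughly like the following. This is a minimal Python sketch for illustration only, not gapseq's actual code: the function name `resolve_biomass_option` and the rule "anything ending in .json is a file path" are assumptions; only the four preset names come from the comment above.

```python
from pathlib import Path
from typing import Tuple

# Preset names as listed in the comment above; everything else is hypothetical.
PRESET_BIOMASSES = {"pos", "neg", "archaea", "auto"}


def resolve_biomass_option(value: str) -> Tuple[str, str]:
    """Classify the value passed to -b as a preset name or a JSON file path."""
    if value in PRESET_BIOMASSES:
        return ("preset", value)
    # Assumption: a custom biomass is recognized by its .json extension.
    if value.lower().endswith(".json"):
        if not Path(value).is_file():
            raise FileNotFoundError(f"biomass file not found: {value}")
        return ("file", value)
    raise ValueError(
        f"-b expects one of {sorted(PRESET_BIOMASSES)} or a path to a .json file"
    )
```

Checking the presets first keeps the existing behaviour unchanged for anyone who already passes "pos", "neg", "archaea", or "auto".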

@hariszaf
Author

hariszaf commented Dec 1, 2022

Hi @Waschina

provide a path to the custom-biomass file in json format with the parameter -b

Yes, that makes sense.

My question is whether there should be a validation check for the user-defined biomass function,
or some documentation of what one should consider when writing the .json file.

@Waschina
Collaborator

Waschina commented Dec 1, 2022

There are already some consistency checks included in gapseq. We might need to add more checks as we come across potential pitfalls.

A draft documentation for the json-file format for biomasses:

The main structure of the required json file should be:

{ "id" : "User-Biomass",
  "name" : "A user-defined biomass composition",
  "ref" : "optionally a reference", 
  "energy_GAM" :  30,
  "domain" : "Bacteria",
  "met_groups" : [
    {
      "group_name" : "DNA",
      "mass" : 0.05,
      "unit_group" : "g",
      "unit_components" : "MOLFRACTION",
      "components" : [
        
        # List of DNA component molecules
        
      ]
    },
    
    {
      "group_name" : "RNA",
      "mass" : 0.15,
      "unit_group" : "g",
      "unit_components" : "MOLFRACTION",
      "components" : [
        
        # List of RNA component molecules
        
      ]
    },
    
    {
      "group_name" : "Protein",
      "mass" : 0.65,
      "unit_group" : "g",
      "unit_components" : "MOLFRACTION",
      "components" : [
        
        # List of protein component molecules
        
      ]
    },
    
    {
      "group_name" : "An additional component group",
      "mass" : 0.15,
      "unit_group" : "g",
      "unit_components" : "MOLFRACTION",
      "components" : [
        
         # List of the additional group's component molecules
        
      ]
    }
  ]
}

Biomass components should be assigned to the different groups. Usually these groups are DNA, RNA, Protein, and Others (e.g. inorganics, lipids, cell wall components, etc.). The total mass of each group should be stated in the group parameters, in grams. The sum of all groups should be 1 g.
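
The 1 g rule is easy to check before handing a file to gapseq. The following is a minimal sketch of such a check, assuming the biomass definition has already been loaded into a Python dict with the structure shown above; the function name `check_group_masses` and the tolerance are hypothetical and not part of gapseq.

```python
import math


def check_group_masses(biomass: dict, tol: float = 1e-6) -> float:
    """Return the summed group mass; raise if it is not 1 g within `tol`."""
    total = 0.0
    for group in biomass["met_groups"]:
        if group.get("unit_group") != "g":
            raise ValueError(f"group {group['group_name']!r}: unit_group must be 'g'")
        total += group["mass"]
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"group masses sum to {total} g, expected 1 g")
    return total


# Group masses taken from the draft structure above.
example = {
    "met_groups": [
        {"group_name": "DNA", "mass": 0.05, "unit_group": "g"},
        {"group_name": "RNA", "mass": 0.15, "unit_group": "g"},
        {"group_name": "Protein", "mass": 0.65, "unit_group": "g"},
        {"group_name": "An additional component group", "mass": 0.15, "unit_group": "g"},
    ]
}
check_group_masses(example)  # passes: the four masses add up to 1 g
```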

The field unit_components can be either "MOLFRACTION" or "MOLSPLIT". Both have the same effect: the molar coefficients of the group's components are scaled so that the group has exactly the mass in grams that is specified for it. The only difference is that "MOLFRACTION" throws a warning if the summed coefficients of the group's components do not add up to 1.
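
To make the scaling concrete, here is a small Python sketch of one way it could be computed. It is an illustration under assumptions, not gapseq's implementation: molecular weights are passed in by the caller, linked metabolites are ignored, and the compound IDs "A" and "B" with their weights are placeholders rather than real ModelSEED entries.

```python
import warnings


def scale_group(mass_g, coefs, mol_weights, unit_components="MOLFRACTION"):
    """Scale relative molar coefficients so the group weighs exactly `mass_g`.

    coefs:       {compound_id: relative molar coefficient within the group}
    mol_weights: {compound_id: molecular weight in g/mol} (assumed input)
    Returns {compound_id: amount in mol} whose masses sum to `mass_g`.
    """
    total = sum(coefs.values())
    if unit_components == "MOLFRACTION" and abs(total - 1.0) > 1e-6:
        warnings.warn(f"MOLFRACTION coefficients sum to {total}, not 1")
    # Mass in grams of one "unit" of the unscaled coefficient mixture.
    mass_per_unit = sum(c * mol_weights[cid] for cid, c in coefs.items())
    factor = mass_g / mass_per_unit
    return {cid: c * factor for cid, c in coefs.items()}


# Placeholder compounds with made-up molecular weights.
weights = {"A": 100.0, "B": 200.0}
as_fraction = scale_group(0.15, {"A": 0.5, "B": 0.5}, weights, "MOLFRACTION")
as_split = scale_group(0.15, {"A": 1.0, "B": 1.0}, weights, "MOLSPLIT")
```

Because only the ratio between coefficients matters after rescaling, both calls yield the same amounts; "MOLFRACTION" would merely warn if given the second set of coefficients.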

Then we have component entries:

The format for each component is for instance:

        {
          "id"   : "cpd00161",
          "name" : "L-Threonine",
          "comp" : "c",
          "coef" : 0.0474316079511907,
          "link" : "cpd00001:-1"
        }, 
  • id - Refers to the metabolite ID in ModelSEED/gapseq
  • name - Provide a name for the metabolite
  • comp - From which compartment should the metabolite be taken for the biomass? Usually this should be "c" for cytosol.
  • coef - Molar coefficient of the metabolite relative to all components within the same group
  • link - When this metabolite is flowing into the biomass, should other metabolites also be consumed/produced with it?
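
A per-component sanity check along these lines can catch missing or mistyped fields early. This is a hypothetical sketch (the names `REQUIRED_FIELDS` and `validate_component` are illustrative, and gapseq's own consistency checks may differ); the field names and the example entry are the ones listed above.

```python
# Required fields and their expected Python types after loading the JSON.
REQUIRED_FIELDS = {"id": str, "name": str, "comp": str, "coef": (int, float)}


def validate_component(component: dict) -> list:
    """Return a list of human-readable problems; empty if the entry looks fine."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in component:
            problems.append(f"missing field {field!r}")
        elif not isinstance(component[field], expected):
            problems.append(f"field {field!r} has an unexpected type")
    # 'link' is optional, but must be a string when present.
    if "link" in component and not isinstance(component["link"], str):
        problems.append("field 'link' must be a string")
    return problems


threonine = {
    "id": "cpd00161",
    "name": "L-Threonine",
    "comp": "c",
    "coef": 0.0474316079511907,
    "link": "cpd00001:-1",
}
print(validate_component(threonine))  # prints []
```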

The link is optional. For instance, when amino acids are elongated to peptides, a water molecule is produced with each amino acid added. The format in this case is <compound_id>:<stoichiometry>. In the above example: for every threonine incorporated into the biomass, a water molecule (cpd00001) is released. In contrast, for nucleotides that flow into DNA or RNA, pyrophosphate (cpd00012) is produced with each nucleotide. Thus the link here is "link" : "cpd00012:-1".
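
For anyone reading these files programmatically, a link value of the form <compound_id>:<stoichiometry> can be split as in the sketch below. The helper `parse_link` is hypothetical and handles only the single-pair form described here; how gapseq itself parses the field is not shown in this thread.

```python
def parse_link(link: str):
    """Split '<compound_id>:<stoichiometry>' into (compound_id, stoichiometry)."""
    compound_id, sep, stoich = link.partition(":")
    if not sep or not compound_id or not stoich:
        raise ValueError(f"link must look like '<compound_id>:<stoichiometry>': {link!r}")
    return compound_id, float(stoich)


print(parse_link("cpd00001:-1"))  # prints ('cpd00001', -1.0)
```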

Hope this helps.

Best
Silvio

Waschina added a commit that referenced this issue Dec 6, 2022
The feature is considered 'advanced usage' as it requires users to
create a biomass reaction definition in JSON format.

See documentation: https://gapseq.readthedocs.io/en/latest/database/biomassReaction.html

Refers to #156
@Waschina
Collaborator

Waschina commented Dec 6, 2022

The new feature to provide a user-defined biomass reaction is now implemented. Also, the documentation (https://gapseq.readthedocs.io/en/latest/database/biomassReaction.html) provides details on how to use this feature.

@hariszaf
Author

hariszaf commented Dec 6, 2022

Oh that's great @Waschina !
Thanks a lot for providing this, and apologies; I could not work on it these past weeks.
In any case, I suppose this issue can be closed! 💯 🎉

@Waschina
Collaborator

Waschina commented Dec 6, 2022

No worries, we have had this half-way implemented in gapseq for a while now but were literally missing the final push :) Let us know if you come across problems/pitfalls with this feature.

@Waschina Waschina closed this as completed Dec 6, 2022