Database schema #3
Comments
Looks good. Here are some suggestions:
I think Infection# was listed as Disease, so I renamed that field to Infection#. I added type and subtype as optional fields. I noticed that at least one of the Influenza metadata files has a separate accession entry for each segment (see below). Is this the preferred way to store it, or should I also allow a single accession? It looks like I should maybe make two schemas, one for Dengue and one for Influenza, because they differ in a lot of ways, but that is really up to you.
Disease is a different field: for Dengue it could be DF, DHF1, DHF2, or DHF3, and for flu it could possibly be severe or mild. So Infection# and Disease are separate fields with different options. Yes, influenza is submitted to GenBank by segment, so each segment has its own accession number. This is different in the GISAID database (EpiFlu), where each virus has one unique number (the same for all segments).
Got it, thanks for clarifying.
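To make the points above concrete, here is a rough sketch of how one record could hold both cases: a single accession, or one accession per segment as with GenBank influenza submissions. It is only an illustration. The class name, field names, and the `"genome"` key are all hypothetical and not taken from schema.md, the accession strings are placeholders, and only the Dengue disease values come from the comment above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

# Dengue disease categories as listed in the thread. Flu values are left
# unchecked because they were only mentioned as a possibility.
DENGUE_DISEASES = {"DF", "DHF1", "DHF2", "DHF3"}


@dataclass
class SampleRecord:
    """One metadata row (hypothetical layout, not the actual schema.md)."""

    virus: str                               # e.g. "Dengue" or "Influenza"
    disease: str                             # separate field from Infection#
    infection_number: Optional[str] = None   # the Infection# column
    virus_type: Optional[str] = None         # optional, per the thread
    subtype: Optional[str] = None            # optional, per the thread
    # Either one accession for the whole virus, or a mapping from segment
    # name to accession (GenBank assigns one per influenza segment).
    accession: Union[str, Dict[str, str], None] = None

    def validate(self) -> None:
        """Reject Dengue rows whose disease is not a known category."""
        if self.virus == "Dengue" and self.disease not in DENGUE_DISEASES:
            raise ValueError(f"unexpected Dengue disease: {self.disease!r}")

    def accessions_by_segment(self) -> Dict[str, str]:
        """Return accessions as a dict, whichever form was supplied."""
        if self.accession is None:
            return {}
        if isinstance(self.accession, str):
            # Assumed convention: a single accession is filed under "genome".
            return {"genome": self.accession}
        return dict(self.accession)


# Placeholder accession strings, not real GenBank identifiers.
flu = SampleRecord(
    virus="Influenza",
    disease="mild",
    accession={"PB2": "ACC_PB2", "HA": "ACC_HA"},
)
dengue = SampleRecord(virus="Dengue", disease="DF", accession="ACC_1")
dengue.validate()
```

Keeping one record type with an optional per-segment mapping is just one option. Two separate schemas, one per virus, would make the differing Disease options easier to check.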
I propose the following schema for the input CSV files:
https://github.com/averagehat/pux-starter-app/blob/sequence-db/schema.md