Generalize data source #1775

Closed
lpatiny opened this issue Sep 26, 2022 · 20 comments

@lpatiny
Member

lpatiny commented Sep 26, 2022

Currently, if the data comes from the server, the only source that we allow in the .nmrium file is jcampURL.

We should make this more general and allow a FileCollection as source.

@lpatiny
Member Author

lpatiny commented Nov 17, 2022

This is related to this issue, in order to keep the data (and possibly the parsed data) in cache.

In the current situation we have:

{
  "version": 3,
  "spectra": [
    {
      "id": "53f1ff98-d146-4444-9a79-bbf8ee40102a",
      "source": {
        "jcampURL": "./data/cytisine/2d/COSY_Cytisin.dx",
        "jcampSpectrumIndex": 0
      }
    },
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "jcampURL": "./data/cytisine/2d/COSY_Cytisin.dx",
        "jcampSpectrumIndex": 1
      }
    }
  ],

In the future we should have something like:

{
  "version": 3,
  "spectra": [
    {
      "id": "53f1ff98-d146-4444-9a79-bbf8ee40102a",
      "source": {
        "fileCollection": [
          {
            "name": "COSY_Cytisin.dx",
            "size": 1234,
            "relativePath": "data/cytisine/2d/COSY_Cytisin.dx",
            "lastModified": 234234,
            "jcampSpectrumIndex": 0
          }
        ]
      }
    },
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "COSY_Cytisin.dx",
            "size": 1234,
            "relativePath": "data/cytisine/2d/COSY_Cytisin.dx",
            "lastModified": 234234,
            "jcampSpectrumIndex": 1
          }
        ]
      }
    }
  ]
}

The goal here is that the source could also be a list of files for a Bruker experiment.

Thinking about NMReData, we will also need to know the index of each spectrum.
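
In the proposed shape above, both spectra point to the same file, so a loader could fetch that file once and reuse it, which fits the caching goal mentioned earlier. A minimal TypeScript sketch of that idea, assuming the proposed JSON shape; `SourceFile`, `SpectrumEntry` and `uniqueRelativePaths` are hypothetical names used only for illustration, not part of NMRium or nmr-load-save:

```typescript
// Hypothetical types mirroring the proposed JSON (not an actual NMRium API).
interface SourceFile {
  name: string;
  size: number;
  relativePath: string;
  lastModified: number;
  jcampSpectrumIndex?: number;
}

interface SpectrumEntry {
  id: string;
  source: { fileCollection: SourceFile[] };
}

// Collect each distinct relativePath once, in first-seen order, so that a
// file shared by several spectra only needs to be fetched a single time.
function uniqueRelativePaths(spectra: SpectrumEntry[]): string[] {
  const paths: string[] = [];
  for (const spectrum of spectra) {
    for (const file of spectrum.source.fileCollection) {
      if (paths.indexOf(file.relativePath) === -1) {
        paths.push(file.relativePath);
      }
    }
  }
  return paths;
}
```

For the two COSY entries above, this would return a single path, so the file would be downloaded and parsed once and each spectrum would then be picked by its index.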

@hamed-musallam
Member

@lpatiny

This deals only with local files, which is not always the case. We should keep it general: instead of having jcampURL we have url, and it accepts a local or remote path.
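
One way a single field could accept both relative and remote paths is to resolve it against the location of the .nmrium file with the standard `URL` constructor. This is only a sketch under that assumption; `resolveSourceURL` and the example.org addresses are hypothetical and not taken from NMRium:

```typescript
// Resolve a source path against the URL of the .nmrium file that references it.
// Relative paths are resolved against that base; absolute URLs are returned as-is.
function resolveSourceURL(path: string, nmriumFileURL: string): string {
  return new URL(path, nmriumFileURL).href;
}
```

With a base of `https://example.org/project/sample.nmrium`, the path `./data/cytisine/2d/COSY_Cytisin.dx` resolves to `https://example.org/project/data/cytisine/2d/COSY_Cytisin.dx`, while an absolute URL on another host is left unchanged.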

@lpatiny
Member Author

lpatiny commented Nov 29, 2022

@jobo322 This feature will have to be implemented in nmr-load-save. Are there any obvious problems with this approach?

@hamed-musallam
Member

hamed-musallam commented Nov 29, 2022

@lpatiny

We need to consider the case where the remote file is a zip file.

@lpatiny
Member Author

lpatiny commented Nov 29, 2022

This is managed by filelist-utils. It should be transparent.

@hamed-musallam
Member

@lpatiny

What is going to be the behavior if you drag and drop a zip file? filelist-utils will unzip the files, and once you export as .nmrium you will get the unzipped files in the source object, which are not available on your local machine because they are compressed.

@lpatiny
Member Author

lpatiny commented Nov 29, 2022

If you drag and drop something, the source should stay empty. It is only when the data comes from a server that we can store a path relative to the '.nmrium' source file.

This could change in the future if we implement: https://developer.mozilla.org/en-US/docs/Web/API/FileSystem

@lpatiny
Member Author

lpatiny commented Dec 1, 2022

How to deal with zip files?

{
  "version": 3,
  "spectra": [
    {
      "id": "53f1ff98-d146-4444-9a79-bbf8ee40102a",
      "source": {
        "fileCollection": [
          {
            "name": "data.zip",
            "size": 1234,
            "relativePath": "data/data.zip",
            "lastModified": 234234
          }
        ],
        "filter": { "general": { "keepFID": true, "keepFT": false }, "bruker": { "expno": "10" } }
      }
    },
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "data.zip",
            "size": 1234,
            "relativePath": "data/data.zip",
            "lastModified": 234234
          }
        ],
        "filter": { "general": { "keepFID": false, "keepFT": true }, "bruker": { "expno": "10" } }
      }
    },
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "data.fid",
            "size": 1234,
            "relativePath": "data/data.fid",
            "lastModified": 234234
          },
          {
            "name": "data.acqu",
            "size": 1234,
            "relativePath": "data/data.acqu",
            "lastModified": 234234
          },
          {
            "name": "data.acqus",
            "size": 1234,
            "relativePath": "data/data.acqus",
            "lastModified": 234234
          }
        ],
        "filter": { "general": { "keepFID": 1 }, "bruker": { "expno": "11" } }
      }
    },
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "test.jdx",
            "size": 1234,
            "relativePath": "data/text.jdx",
            "lastModified": 234234
          }
        ],
        "filter": { "general": { "keepFID": 1 }, "jcamp": { "index": "2" } }
      }
    }
  ]
}
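
To illustrate how the `general` part of such a filter might be applied once a file collection has been parsed, here is a minimal TypeScript sketch. The thread only shows the JSON, so everything here is an assumption: `GeneralFilter`, `ParsedSpectrum`, the `isFid` flag and `applyGeneralFilter` are hypothetical names, and an omitted flag is treated as "do not keep", which may differ from what nmr-load-save actually does.

```typescript
// Hypothetical shapes: the proposal defines the JSON, not how a loader consumes it.
interface GeneralFilter {
  keepFID?: boolean | number; // the proposal uses both true/false and 1
  keepFT?: boolean | number;
}

interface ParsedSpectrum {
  id: string;
  isFid: boolean; // true for raw FID data, false for processed (FT) data
}

// Keep FID entries only when keepFID is truthy and FT entries only when
// keepFT is truthy. Assumption: an omitted flag means "do not keep".
function applyGeneralFilter(
  spectra: ParsedSpectrum[],
  filter: GeneralFilter,
): ParsedSpectrum[] {
  const keepFID = Boolean(filter.keepFID);
  const keepFT = Boolean(filter.keepFT);
  return spectra.filter((spectrum) => (spectrum.isFid ? keepFID : keepFT));
}
```

Under these assumptions, the first two entries above would select the FID and the FT spectrum of the same zip respectively, which is how one file can back two separate spectra.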


@lpatiny
Member Author

lpatiny commented Dec 1, 2022

@CS76 In the zip files you store on the server, do you have one or many experiments? Meaning, for Bruker, do you have 1h, cosy, tocsy in the same zip or one experiment per zip?

@CS76
Collaborator

CS76 commented Dec 1, 2022

@lpatiny Currently, we are loading a single experiment at a time through one zip file via a URL provided to nmrium. Zip files are generated by nmrXiv, so we have total flexibility in generating them: for example, one experiment per zip or multiple experiments (1h, cosy, tocsy) in a single zip (we also generate these zip files on the fly).

@hamed-musallam
Member

hamed-musallam commented Dec 1, 2022

@lpatiny

Then it would be OK to go with having a file collection inside the source, and it should be a rule to have a single experiment per zip file.

@CS76

What do you mean by a single experiment? Does it mean one spectrum?

@CS76
Collaborator

CS76 commented Dec 1, 2022

What do you mean by a single experiment? Does it mean one spectrum?

Yes, either 1h or 13c or cosy or tocsy.

@hamed-musallam
Member

@CS76

Sorry for asking one more time: does this also apply to Bruker? As far as I know, we could have multiple spectra inside the Bruker data.

@hamed-musallam
Member

hamed-musallam commented Dec 1, 2022

@CS76

This will be reflected in the shape of the nmrium object that you will get:

{
  "version": 3,
  "spectra": [
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "1h.zip",
            "size": 1234,
            "relativePath": "data/1h.zip",
            "lastModified": 234234
          }
        ]
      },
      "ranges": { "values": [], "options": {} },
      "peaks": { "values": [], "options": {} },
      "integrals": { "values": [], "options": {} },
      ...etc
    },
    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "13c.zip",
            "size": 1234,
            "relativePath": "data/13c.zip",
            "lastModified": 234234
          }
        ]
      },
      "ranges": { "values": [], "options": {} },
      "peaks": { "values": [], "options": {} },
      "integrals": { "values": [], "options": {} },
      ...etc
    }
  ]
}

@hamed-musallam
Member

@lpatiny

Are we going to assume that a Bruker or JCAMP file should include one spectrum, and that others who use NMRium have to make sure that one loaded file == one experiment?

@lpatiny
Member Author

lpatiny commented Dec 2, 2022

No, we cannot assume this, and we need this filter property.
Also, the zip file contains one experiment but actually often 2 spectra (FID + FT), so even in this 'simple' case we will need the filter.

@hamed-musallam
Copy link
Member

@lpatiny

But we cannot rely on jcampSpectrumIndex for the JCAMP files, and we should move to filter; otherwise it will be messy if the order of the spectra changes in the .nmrium file. Besides, we actually do change the order of the spectra from the multiple spectra analysis.

I also think this change has to be listed in the migration file.
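
As a rough illustration of what such a migration step could look like, the sketch below converts the legacy `jcampURL` / `jcampSpectrumIndex` pair into a `fileCollection` plus a `filter.jcamp.index`. It is not the migration that was eventually implemented: `LegacySource`, `MigratedSource` and `migrateSource` are hypothetical names, `size` and `lastModified` are left out because they cannot be derived from a URL alone, and the index is kept as a number although the proposal writes it as a string.

```typescript
// Hypothetical legacy and target shapes, based only on the JSON in this thread.
interface LegacySource {
  jcampURL: string;
  jcampSpectrumIndex?: number;
}

interface MigratedSource {
  fileCollection: Array<{ name: string; relativePath: string }>;
  filter: { jcamp: { index: number } };
}

function migrateSource(legacy: LegacySource): MigratedSource {
  // Drop a leading "./" so the path matches the relativePath style of the proposal.
  const relativePath = legacy.jcampURL.replace(/^\.\//, "");
  // The file name is whatever follows the last "/" (or the whole path if none).
  const name = relativePath.slice(relativePath.lastIndexOf("/") + 1);
  return {
    fileCollection: [{ name, relativePath }],
    // Assumption: a missing index means the first spectrum in the file.
    filter: { jcamp: { index: legacy.jcampSpectrumIndex ?? 0 } },
  };
}
```

Because the index travels inside the filter of each spectrum, it no longer depends on the position of that spectrum within the .nmrium file.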

@CS76
Collaborator

CS76 commented Dec 2, 2022

@CS76

Sorry for asking one more time: does this also apply to Bruker? As far as I know, we could have multiple spectra inside the Bruker data.

Sorry if I wasn't clear but Bruker's output (one experiment) often contains FID and FT spectra.

@lpatiny
Member Author

lpatiny commented Dec 2, 2022

But we cannot rely on jcampSpectrumIndex for the JCAMP files, and we should move to filter; otherwise it will be messy if the order of the spectra changes in the .nmrium file. Besides, we actually do change the order of the spectra from the multiple spectra analysis.

I also think this change has to be listed in the migration file.

But this was already done in the proposal I made:

    {
      "id": "f1ff98d1-46e4-44da-b9bb-f8ee40102a69",
      "source": {
        "fileCollection": [
          {
            "name": "test.jdx",
            "size": 1234,
            "relativePath": "data/text.jdx",
            "lastModified": 234234
          }
        ],
        "filter": { "general": { "keepFID": 1 }, "jcamp": { "index": "2" } }
      }
    }

The problem I see now is when we have a zip file containing many JCAMP files that each contain many spectra. We will probably also need a zip filter.

@lpatiny lpatiny added enhancement New feature or request and removed to discuss labels Dec 2, 2022
@jobo322 jobo322 moved this from Todo to In Progress in NMR and Cheminfo projects organisation Dec 16, 2022
jobo322 added a commit that referenced this issue Dec 21, 2022
close #2021 #1976 #1775 #2013

* fix: wrong scale on fid

* chore: fix migration 3 to 4

* chore: fix migration and filtering jcamp

* fix: export as JSON

* chore: migrate Source to nmr-load-save

* fix: pass a copy to multiplet-analysis close

* chore: temp fix to avoid load twice a molecule
@jobo322
Member

jobo322 commented Dec 22, 2022

done by 8aad5c6

@jobo322 jobo322 closed this as completed Dec 22, 2022
Repository owner moved this from In Progress to Done in NMR and Cheminfo projects organisation Dec 22, 2022