Receiving Import error #16

Open
Lukas67 opened this issue Feb 16, 2023 · 8 comments


Lukas67 commented Feb 16, 2023

Hi,

An error is returned when running the program:

ImportError: cannot import name 'gcd' from 'fractions' (/home/lukas/anaconda3/lib/python3.9/fractions.py)

According to Stack Overflow, the problem is caused by the networkx module, whose import statements stopped working after Python updates (I am using Python 3.9.7).

BR
Lukas
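
For context: `fractions.gcd` was deprecated in Python 3.5 and removed in Python 3.9, while `math.gcd` has been available since 3.5. The snippet below is a minimal sketch of a version-tolerant import, assuming you control the importing code. It is not a patch for networkx or DART-ID.

```python
# Version-tolerant import of gcd.
try:
    from math import gcd  # available on Python 3.5 and later
except ImportError:
    from fractions import gcd  # only reached on older interpreters

print(gcd(12, 18))
```

When the failing import lives in a third-party package, this pattern cannot be applied directly; pinning the interpreter or the package version is usually the practical route.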

Contributor

atc3 commented Feb 16, 2023

The DART-ID conda environment (https://github.com/SlavovLab/DART-ID/blob/master/environment.yml) is set up to run Python 3.7.6. Is there a specific reason you need to run Python 3.9.7?

Author

Lukas67 commented Feb 20, 2023

Hi,

thanks for your reply.

I need that particular version to run other programs.

In your description of the program it says that it runs on python >= 3.7.

BR
Lukas

Contributor

atc3 commented Feb 21, 2023

Thanks for letting me know about the description -- this program was released when python 3.7 was the latest and I was trying to communicate that it would work with any 3.7 version. I will update the description to be more explicit about this requirement.

In the meantime, you should be able to use python virtualenvs (https://docs.python.org/3/library/venv.html) or conda environments (https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) to run DART-ID in a separate python environment, so that you can still run your other programs on their other python versions.

Let me know if you need any help with this.
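
A quick way to confirm that the separate environment is the one actually in use is to check the interpreter version before launching the tool. This is a minimal sketch; `REQUIRED` and `is_supported` are hypothetical names, and the 3.7 target is taken from the environment.yml mentioned earlier in this thread.

```python
import sys

# Assumption: the major.minor version targeted by DART-ID's environment.yml.
REQUIRED = (3, 7)


def is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor matches `required`."""
    return tuple(version_info[:2]) == tuple(required)


if not is_supported():
    print(
        "Warning: running Python %d.%d, but the DART-ID environment targets %d.%d"
        % (sys.version_info[0], sys.version_info[1], REQUIRED[0], REQUIRED[1])
    )
```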

Author

Lukas67 commented Feb 27, 2023

Hi,

I have now run the program in a virtual environment, and the import error is gone.

I received an error while running my first evidence file:

dart_id -c /home/lukas/Desktop/MS-Data/Lukas/mq-run_150223/combined/txt/config_annotated.yaml -o /home/lukas/Desktop/MS-Data/Lukas/mq-run_150223/combined/txt/output_dart_id
2023-02-27 09:56:22 [ERROR] Number of experiments filter threshold 3 is greater than the number of experiments in the input list. Please provide an integer greater than or equal to 1 and less than the number of experiments with the "num_experiments" key.
Traceback (most recent call last):
  File "/home/lukas/anaconda3/envs/dart_env/bin/dart_id", line 8, in <module>
    sys.exit(main())
  File "/home/lukas/anaconda3/envs/dart_env/lib/python3.7/site-packages/dart_id/update.py", line 355, in main
    df, df_original = process_files(config)
  File "/home/lukas/anaconda3/envs/dart_env/lib/python3.7/site-packages/dart_id/converter.py", line 385, in process_files
    raise ConfigFileError('Number of experiments filter threshold {} is greater than the number of experiments in the input list. Please provide an integer greater than or equal to 1 and less than the number of experiments with the "num_experiments" key.'.format(config['num_experiments']))
dart_id.exceptions.ConfigFileError: Number of experiments filter threshold 3 is greater than the number of experiments in the input list. Please provide an integer greater than or equal to 1 and less than the number of experiments with the "num_experiments" key.

I used your default config file and changed only the input files to one evidence file from my first mq run. Where do I define the n-experiment argument?

Thanks for your help!

BR
Lukas

Contributor

atc3 commented Feb 27, 2023

Hi Lukas,

The num_experiments parameter can be found at the bottom of the config files. See here for example:

More importantly however -- DART-ID is only able to infer latent retention times by using data from multiple LCMS runs. If you only provide one experiment, there is no statistical power to be gained.

i.e., if there is a low-confidence peptide in run A, we can increase confidence in our observation in run A if we see the same peptide at the same RT in run B (and ideally, in runs C, ..., N -- the more experiments we use, the more power we have).
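
The intuition can be illustrated with a toy Bayes update. This is only a sketch under strong assumptions and not DART-ID's actual model: `updated_pep` is a hypothetical helper, the expected retention time stands in for what the other runs suggest, and both normal densities and their widths are made up for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def updated_pep(pep, observed_rt, expected_rt, sd_correct=0.5, sd_incorrect=20.0):
    """Bayes update of a posterior error probability (PEP) using retention time.

    A correct identification is assumed to elute close to expected_rt (narrow
    density); an incorrect one is assumed to elute almost anywhere (broad
    density). All widths are illustrative assumptions.
    """
    like_correct = normal_pdf(observed_rt, expected_rt, sd_correct)
    like_incorrect = normal_pdf(observed_rt, expected_rt, sd_incorrect)
    numerator = pep * like_incorrect
    return numerator / (numerator + (1.0 - pep) * like_correct)


# A PSM with prior PEP 0.3, observed close to versus far from the expected RT.
print(updated_pep(0.3, observed_rt=40.1, expected_rt=40.0))
print(updated_pep(0.3, observed_rt=45.0, expected_rt=40.0))
```

With a single run there is no independent estimate of the expected retention time, so an update of this kind has nothing to work with.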

I would strongly recommend not using DART-ID if you only have one run, and to only use this tool if you have multiple (and ideally many) similarly configured LCMS runs.

If you have any more questions let me know
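
The check behind the error above can be approximated before running the tool. The function below is a hypothetical helper that mirrors the wording of the error message; it is not DART-ID's own code, and only the `num_experiments` key name comes from this thread.

```python
def check_num_experiments(num_experiments, input_files):
    """Raise ValueError if the filter threshold cannot be met by the inputs."""
    n_inputs = len(input_files)
    if not isinstance(num_experiments, int) or num_experiments < 1:
        raise ValueError("num_experiments must be an integer >= 1")
    if num_experiments > n_inputs:
        raise ValueError(
            "num_experiments={} exceeds the {} input file(s) provided".format(
                num_experiments, n_inputs
            )
        )
    return num_experiments


# One evidence file with a threshold of 3 reproduces the situation reported above.
try:
    check_num_experiments(3, ["evidence.txt"])
except ValueError as err:
    print(err)
```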

Author

Lukas67 commented Feb 28, 2023

Hi Albert,

thank you for your assistance.

My data was acquired from single-cell monocytes. I think it would be a good idea to align retention times and include your program in my workflow. However, this is my 4th week in proteomics, and I have only acquired 1 run successfully with MaxQuant. So the program should still work, just without any improvement in PSM scores, right?
I hope the following error is not caused by supplying the same run twice:

  File "/home/lukas/anaconda3/envs/dart_env/lib/python3.7/shutil.py", line 104, in copyfile
    raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
shutil.SameFileError: 'config.yaml' and '/home/lukas/Desktop/MS-Data/Lukas/dart_id/config.yaml' are the same file

I appreciate your help.

BR
Lukas
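
A note on the traceback above: `shutil.copyfile` raises `SameFileError` whenever the source and destination resolve to the same file. The message suggests that config.yaml was being copied onto itself, possibly because the config file sits in the output folder, but that is a guess from the paths shown. The sketch below shows a defensive copy; `copy_if_different` is a hypothetical helper, not part of DART-ID.

```python
import os
import shutil


def copy_if_different(src, dst):
    """Copy src to dst unless both paths already refer to the same file.

    Returns True if a copy was made, False if it was skipped.
    """
    if os.path.exists(dst) and os.path.samefile(src, dst):
        return False
    shutil.copyfile(src, dst)
    return True
```

On the user side, keeping the config file outside the output folder would avoid the collision without changing any code, assuming the guess above is right.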

Author

Lukas67 commented Feb 28, 2023

Hi Albert,

I have one additional question:
How similar do the runs need to be? Does it depend solely on the labeling technique (such as TMT), or is it possible to align runs from different experimental designs?
If so, what are the constraints?

BR
Lukas

Contributor

atc3 commented Feb 28, 2023

> However, this is my 4th week in proteomics, and I have only acquired 1 run successfully with MaxQuant. So the program should still work, just without any improvement in PSM scores, right?

Do not run this program with just one run -- there is no improvement to be gained and the code relies on multiple experiments (and PSMs existing across n experiments as defined by the num_experiments param)

> How similar do the runs need to be? Does it depend solely on the labeling technique (such as TMT), or is it possible to align runs from different experimental designs? If so, what are the constraints?

There are no constraints on the chemistry of the labeling or the LC -- the liquid chromatography just has to be consistent across runs. In our paper we use DART-ID on both TMT-labeled and label-free runs. However, do not mix runs of different chemistries/chromatographies, or even runs that were acquired far apart in time (and are thus not reproducible). For example, do not mix label-free and TMT-labeled runs -- TMT labeling chemically modifies peptides and alters their retention times, which DART-ID assumes to be so consistent that each run only requires a small linear adjustment to hit the "true" retention time.

Hope this helps
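
The "small linear adjustment" idea can be illustrated with an ordinary least-squares fit between retention times of peptides shared by two runs. This is a simplified sketch with made-up numbers; `fit_linear` is a hypothetical helper, and the alignment model DART-ID actually uses is described in the paper and may differ.

```python
def fit_linear(reference_rts, observed_rts):
    """Least-squares fit of observed = intercept + slope * reference."""
    n = len(reference_rts)
    mean_x = sum(reference_rts) / n
    mean_y = sum(observed_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in reference_rts)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(reference_rts, observed_rts)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up retention times (minutes) for peptides seen in both runs.
reference = [10.0, 20.0, 30.0, 40.0]
observed = [10.5, 20.7, 30.9, 41.1]  # constructed as 0.3 + 1.02 * reference

intercept, slope = fit_linear(reference, observed)
print(intercept, slope)
```

If two runs come from different chemistries, a single straight line like this would fit poorly, which is one way to see why such runs should not be mixed.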
