Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opening discussion about samplesheet_generator #1

Merged
merged 19 commits into from
Mar 6, 2024

Conversation

tinavisnovska
Copy link
Collaborator

Hello everyone,

this is a first functional version of a samplesheet generator that can eventually become used by the nextflow pipeline in the situations when a samplesheet is not transferred from an instrument after a sequencing is performed.

In this specific pull request there are only some minor changes to the README.md file, but please have a look into the whole repo, any feedback and comments are more than welcome.

When we are done discussing about the feedback, please remember to approve the pull request so that it can get merged into the inpred/samplesheet_generator repository.

Copy link
Collaborator

@marrip marrip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the code yet but I think the README looks good and is very informative. I had some ideas and comments I included but nothing major except streamlining the use of the nextflow samplesheet a bit more. But we might need to have a discussion about that to consider every aspect. In general, good work, Tina! 👍

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
@marrip
Copy link
Collaborator

marrip commented Feb 9, 2024

TSO500_NextSeq_simple_indexes_legacy.tsv : Would it be possible to remove the commented lines there or are they necessary for something?

@marrip
Copy link
Collaborator

marrip commented Feb 9, 2024

TSO500_NextSeq_simple_indexes_legacy.tsv: I understand that it is convenient to include the commando you ran to generate the file but I would also exclude that to keep those files clean 😉

@marrip
Copy link
Collaborator

marrip commented Feb 9, 2024

LICENSE: very good that you included this! 🙌 only an idea, the GPL 3 guarantees that the code stays open source, I think with MIT people can reuse your code for commercial applications but maybe you don't mind?

@marrip
Copy link
Collaborator

marrip commented Feb 9, 2024

requirements.txt: good practice 👍

@marrip
Copy link
Collaborator

marrip commented Feb 9, 2024

samplesheet_generator.py: Very nicely written. You have refactored smaller functions which allows you to add unit tests without bigger problems, I believe. 🎉
One thing about the argparse parameters, you could specify options for the parameters that have only a small selection of options, e.g. index_type, just include:

	parser.add_argument('-t', '--index-type', help='Type of indexes, allowed values are \'dual\' and \'simple\'.', required=True, type=str, choices=['dual', 'single'])

or similar.

@marrip
Copy link
Collaborator

marrip commented Feb 9, 2024

Dockerfile: this would be my suggestion:

FROM python:3.11.4-slim
ENV PATH=$PATH:/opt
COPY requirements.txt /
RUN pip install --no-cache-dir -r requirements.txt \
    && rm requirements.txt
COPY samplesheet_generator.py /opt/
COPY indexes /opt/indexes

You don't need to update any packages if you don't install anything. And as I said earlier, the test data does not need to be there. I think it is good that you use the slim image to have a smaller size and also nicely done that you remove requirements.txt after installing them. 👍

@tinavisnovska
Copy link
Collaborator Author

Will be removed.

TSO500_NextSeq_simple_indexes_legacy.tsv : Would it be possible to remove the commented lines there or are they necessary for something?

@tinavisnovska
Copy link
Collaborator Author

Will be removed.

TSO500_NextSeq_simple_indexes_legacy.tsv: I understand that it is convenient to include the commando you ran to generate the file but I would also exclude that to keep those files clean 😉

@tinavisnovska
Copy link
Collaborator Author

Thanks, can try to add unit tests, can you just quicky point me to some sensible howto?

Choices will be added.

samplesheet_generator.py: Very nicely written. You have refactored smaller functions which allows you to add unit tests without bigger problems, I believe. 🎉 One thing about the argparse parameters, you could specify options for the parameters that have only a small selection of options, e.g. index_type, just include:

	parser.add_argument('-t', '--index-type', help='Type of indexes, allowed values are \'dual\' and \'simple\'.', required=True, type=str, choices=['dual', 'single'])

or similar.

@tinavisnovska
Copy link
Collaborator Author

Will be adjusted.

Dockerfile: this would be my suggestion:

FROM python:3.11.4-slim
ENV PATH=$PATH:/opt
COPY requirements.txt /
RUN pip install --no-cache-dir -r requirements.txt \
    && rm requirements.txt
COPY samplesheet_generator.py /opt/
COPY indexes /opt/indexes

You don't need to update any packages if you don't install anything. And as I said earlier, the test data does not need to be there. I think it is good that you use the slim image to have a smaller size and also nicely done that you remove requirements.txt after installing them. 👍

@tinavisnovska
Copy link
Collaborator Author

tinavisnovska commented Feb 14, 2024

I don't have too strong opinions on licensing, just wanted it to be open.

However, I feel like licensing is for a bit more general discussion and that it would make sense to have a common agreement on this in InPreD, but not sure how and who shall do it... maybe could be discussed during the workshop though.

LICENSE: very good that you included this! 🙌 only an idea, the GPL 3 guarantees that the code stays open source, I think with MIT people can reuse your code for commercial applications but maybe you don't mind?

@marrip
Copy link
Collaborator

marrip commented Feb 15, 2024

Thanks, can try to add unit tests, can you just quicky point me to some sensible howto?

Choices will be added.

samplesheet_generator.py: Very nicely written. You have refactored smaller functions which allows you to add unit tests without bigger problems, I believe. 🎉 One thing about the argparse parameters, you could specify options for the parameters that have only a small selection of options, e.g. index_type, just include:

	parser.add_argument('-t', '--index-type', help='Type of indexes, allowed values are \'dual\' and \'simple\'.', required=True, type=str, choices=['dual', 'single'])

or similar.

I think I like this one for unit testing: https://www.datacamp.com/tutorial/pytest-tutorial-a-hands-on-guide-to-unit-testing. Aparently, unit test classes are not so desirable due to a lack of debug information and more boiler plate code - I used them for the local app prepper but will change them there as well. You can check out the repo for where to put your tests at least so that we keep it consistent: https://github.com/InPreD/local_app_prepper/tree/develop/prepper/tests

I hope this helps and if you add them to this PR or create a new one, we can simply look at them and discuss anything in the PR directly. 😉

@marrip
Copy link
Collaborator

marrip commented Feb 15, 2024

I don't have too strong opinions on licensing, just wanted it to be open. Anyway, even if used for commercial applications, they should stay open too, don't they?

However, I feel like licensing is for a bit more general discussion and that it would make sense to have a common agreement on this in InPreD, but not sure how and who shall do it... maybe could be discussed during the workshop though.

LICENSE: very good that you included this! 🙌 only an idea, the GPL 3 guarantees that the code stays open source, I think with MIT people can reuse your code for commercial applications but maybe you don't mind?

I think it would be good to discuss that, for sure. 👍

samplesheet_generator.py Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
generator/tests/cli_test.py Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
@tinavisnovska
Copy link
Collaborator Author

Ok, so testing works, building does not due to authentication as was expected and would resolve after merging. I'lll have a look at the other pytest repo later today or so.

@tinavisnovska tinavisnovska merged commit 5f6c522 into InPreD:main Mar 6, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants