Pan draft #205

nicola-debernardini · 2024-02-08T10:31:00Z

Dear Johannes,

I hope this email finds you well.
It's been a while since our first meeting. I wanted to reach out regarding the new feature implemented in gapseq. Silvio mentioned that you would be interested in checking it out.

I'm getting close to the final version of the manuscript and thought it would be a good moment to make the pull request. I would like to submit within the next week if the last implementations work smoothly. I've also shared the manuscript with you via email ([email protected]).
The added code barely interferes with the original code, except for the "home" script. I've added a new script ("src/pan-draft.R") along with its supporting functions in "src/pan-draft_functions.R". While I tried to follow your layout, there may be some differences in the structure. Please let me know if there's anything that needs fixing.

One aspect I'm unsure about is the implementation of plots using the functions in "src/pan-draft_plotting.R". By default, graphs are not generated, but they can be useful for visualizing the results. I implemented them months ago when I was checking the features of the individual pan-Drafts that I was generating. However, depending on the number of genomes in the input, the plots may not scale well on the axes and could be slow to generate. There are two points in the script that make plots; one of them relies on the "micropan" library. If you have the time, I'd appreciate it if you could take a look and share your thoughts on whether to exclude them or not.

Additionally, I couldn't preview the documentation section because of access limitations. If there are any issues with it, please let me know, and I'll make the necessary adjustments.

Thank you for taking the time to review the code and documentation.
I will be available for anything at [email protected].

Best regards,
Nicola


jotech · 2024-03-22T18:35:33Z

Dear Nicola

Thank you very much for your work and pull request!
I'm happy to merge. Could you perhaps think about some smaller changes?

I get why you added txt files as lists of input files, but I think this is a bit confusing because it hides the nature of the input files. One alternative would be to allow the user to provide either comma-separated file names or a generic file name with an asterisk (*)
I would suggest putting all files into the toy folder without a subfolder
Should the plotting script be part of gapseq, or is it rather further analysis that could be put somewhere else? I'm asking because the other modules also do not provide plotting, and your script would introduce new dependencies (ggplot, etc) that are not taken care of
Is the gapseq version stored in the models? (there is a TODO at the beginning of pan-draft.R script)
Could you add an example to the documentation on continuing the pipeline after gapseq pan?


nicola-debernardini · 2024-03-27T11:09:08Z

Dear Johannes,

Thank you for your suggestions. I agree with them and have already implemented them. Let me know what you think.

To provide an overview, I made the following changes:

Removed the .txt files as input options and added three alternatives: comma-separated file names, file names with wildcards, paths to folders;
Removed the subfolder from toy/ for better organization;
Improved the documentation by adding input examples to the modules and further examples for connecting the output to the pipeline;
Modified pan-draft.R to include the gapseq version in the species-level model and removed the plotting options and plotting code.
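
To make the three input forms above concrete, here is a minimal sketch of how a single argument could be expanded into a list of files. It is written in Python purely for illustration and is not the code in this pull request (which lives in the gapseq scripts such as "src/pan-draft.R"); the function name `resolve_inputs` and the rule that the forms can be mixed within one comma-separated argument are assumptions made for this sketch.

```python
import glob
import os


def resolve_inputs(spec):
    """Expand an input specification into a sorted list of file paths.

    Hypothetical helper for illustration only; not taken from gapseq.
    Accepted forms: comma-separated file names, wildcard patterns,
    and paths to folders.
    """
    paths = []
    # Handle the argument entry by entry, so a comma-separated list
    # may itself contain folders or wildcard patterns (an assumption).
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray or trailing commas
        if os.path.isdir(entry):
            # Folder: take every regular file directly inside it.
            for name in os.listdir(entry):
                candidate = os.path.join(entry, name)
                if os.path.isfile(candidate):
                    paths.append(candidate)
        elif any(ch in entry for ch in "*?["):
            # Wildcard pattern: let glob find the matching files.
            paths.extend(glob.glob(entry))
        else:
            # Plain file name: keep as given; existence is not checked here.
            paths.append(entry)
    # Remove duplicates and return a stable, sorted order.
    return sorted(set(paths))
```

When a pattern like this is passed on a command line, it usually needs to be quoted (for example `"toy/*.RDS"`, a made-up path) if the program rather than the shell is supposed to expand it.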

Please let me know if you have any further feedback or suggestions.


jotech · 2024-03-27T17:44:00Z

Dear Nicola
The additional commits look very good! I modified the input file handling a bit. Could you check it out?
Many thanks for your efforts!

jotech · 2024-03-28T11:28:13Z

Great, thank you for the additional fix!

nicola-debernardini added 11 commits December 1, 2023 09:29

first triall addition of pan-draft.R

2e808c5

add the functions for pan-Draft.R

6255e07

add option to call the panDraft module

e050843

add files for panDraft example

511d94c

add plotting functions for panDraft extended output

d1951b6

add missing files for panDraft example, RDS were ignored

479126b

added documentation panDraft.md

ae391db

minor changes on index.rst

45ee2f0

micropan installation on gapseq environment

eda43b9

modified updating rxn weigths strategy: brought it back from varying threshold to median

503af0e

minor changes

97322c3

nicola-debernardini added 6 commits March 25, 2024 20:38

moved files from the subfolder into the toy folder

a3c2a3c

added the gapseq version to the description of pan.mod

cde9116

remove the plotting options and the associated libraries from the environment

b26b72c

added new input options (comma separated files, wildcards and directory path) and accordingly adjusted the documentation with new examples to continuing the pipeline

2890428

add files to toy/ excluded due to gitignore

222bdc9

.gitignore back to original

f272d60

minor changes to input file handling

56ec2ec

- input files defined by wildcards was not working correctly - simplified code - update example in help

minor changes: fix interpretation of extended options

c246303

jotech merged commit a354bbc into jotech:master Mar 28, 2024
