Currently, rsmpredict supports an undocumented option of specifying an output directory instead of a file if output_file does not have a .csv or .xlsx extension. However, there are several inconsistencies:
This option is not documented so the docstring is inaccurate.
The output file format is controlled by the file_format setting in the rsmpredict configuration file; the extension of the specified file, if any, is ignored entirely.
The directory behavior is untested as well as undocumented.
The .tsv file format is not represented in the check that determines whether the output is a file or a directory.
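To illustrate the last point, here is a minimal sketch of what an extension check that also covers .tsv might look like. The function name and the set of extensions are assumptions made for illustration; this is not the actual rsmpredict code.

```python
from pathlib import Path

# Hypothetical set of extensions that mark the argument as a file;
# the issue notes that .tsv is missing from the current check.
KNOWN_EXTENSIONS = {".csv", ".tsv", ".xlsx"}


def looks_like_output_file(output_path):
    """Return True if the path ends in a recognized tabular extension."""
    return Path(output_path).suffix.lower() in KNOWN_EXTENSIONS
```

With a check like this, "predictions.tsv" would be treated as a file, while a bare name such as "outdir" would still be treated as a directory.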
I think a reasonable solution would be to:
Get rid of the directory output option entirely.
Rename the output argument to output_prefix. If the prefix carries an extension that disagrees with the file format specified in the configuration file, the configuration file wins and an appropriate warning is generated.
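A rough sketch of how the output_prefix idea could work, assuming the configuration file's file_format is one of csv, tsv, or xlsx. The helper name resolve_output_path is hypothetical and only illustrates the proposal.

```python
import warnings
from pathlib import Path

# Assumed set of formats the configuration file may request.
SUPPORTED_FORMATS = {"csv", "tsv", "xlsx"}


def resolve_output_path(output_prefix, file_format="csv"):
    """Build the output path from a prefix and the configured format.

    If the prefix already ends in a known extension, that extension is
    dropped; a warning is issued when it disagrees with ``file_format``.
    """
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file format: {file_format}")

    path = Path(output_prefix)
    given = path.suffix.lower().lstrip(".")
    if given in SUPPORTED_FORMATS:
        if given != file_format:
            warnings.warn(
                f"Ignoring extension '.{given}'; the configuration file "
                f"specifies '{file_format}'."
            )
        path = path.with_suffix("")

    # Append rather than replace, so a prefix like "run.1" stays intact.
    return Path(f"{path}.{file_format}")
```

For example, resolve_output_path("preds.xlsx", "tsv") would warn and yield preds.tsv, while a prefix without a recognized extension is simply extended.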
Actually, now that I think a bit more about it, I think we should strive for consistency. So, here's an alternative I'd prefer:
Make rsmpredict also use an output directory.
Make --features into a boolean flag so that the pre-processed features are always saved in the given output directory with a fixed name, just like the predictions.
I think this is much simpler than what I had suggested above.
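For concreteness, the alternative could look roughly like the argparse sketch below. This is the proposed interface, not the current rsmpredict command line, and the fixed file names (predictions, preprocessed_features) are placeholders chosen for illustration.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser for the proposed directory-based interface."""
    parser = argparse.ArgumentParser(prog="rsmpredict")
    parser.add_argument("config_file")
    parser.add_argument("output_dir")
    # Boolean flag: when given, pre-processed features are also saved.
    parser.add_argument("--features", action="store_true")
    return parser


def planned_outputs(args, file_format="csv"):
    """List the fixed-name files that would be written to output_dir."""
    out = Path(args.output_dir)
    paths = [out / f"predictions.{file_format}"]
    if args.features:
        paths.append(out / f"preprocessed_features.{file_format}")
    return paths
```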
I see the point about consistency, although I can see myself being very annoyed as a user: if I am running multiple experiments, I will end up with lots of directories each containing a single file with the same name. I personally prefer to have one directory with many files.
How about we take consistency even further and add a new field, prediction_id, that will be used as a prefix for the predictions file and other output files? We could also make it optional and, by default, set it equal to experiment_id. Then, if we also add the -f option already available in other tools, I'll be able to continue doing what I want and we'll have a consistent approach across the tools.
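A small sketch of the prediction_id idea, assuming the configuration is available as a plain dictionary. The function names and the "_predictions" suffix are hypothetical; only the fallback to experiment_id comes from the suggestion above.

```python
def output_prefix_from_config(config):
    """Use prediction_id if set, otherwise fall back to experiment_id."""
    return config.get("prediction_id") or config["experiment_id"]


def predictions_filename(config, file_format="csv"):
    """Build a prefixed file name so several runs can share one directory."""
    return f"{output_prefix_from_config(config)}_predictions.{file_format}"
```

With this, two runs with different prediction_id values could write into the same directory without overwriting each other.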