Output file or directory in rsmpredict #396

desilinguist · 2020-03-09T17:37:52Z

Currently, rsmpredict has an undocumented behavior: if output_file does not have a .csv or .xlsx extension, it is treated as an output directory instead of a file. However, there are several inconsistencies:

This option is not documented, so the docstring is inaccurate.
The output file format is controlled by the file_format setting in the rsmpredict configuration file, and any extension on the specified file is ignored entirely.
The directory bit is untested in addition to being undocumented.
The .tsv file format is not represented in the check that determines whether the output is a file or a directory.
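To illustrate the last point, here is a minimal sketch of an extension-based file-or-directory check of the kind described above. It is not the actual rsmpredict code: the function name `looks_like_file` and both extension sets are hypothetical, and the "supported" set assumes .tsv is a valid file_format value alongside .csv and .xlsx.

```python
from pathlib import Path

# Hypothetical illustration only; not the actual rsmpredict implementation.
CHECKED_EXTENSIONS = {".csv", ".xlsx"}            # extensions the issue says are checked
SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".xlsx"}  # assumed full set of output formats


def looks_like_file(output, extensions=CHECKED_EXTENSIONS):
    """Return True if `output` would be treated as a file, False if as a directory."""
    return Path(output).suffix.lower() in extensions


print(looks_like_file("predictions.csv"))                        # True
print(looks_like_file("predictions.tsv"))                        # False
print(looks_like_file("predictions.tsv", SUPPORTED_EXTENSIONS))  # True
```

With only .csv and .xlsx in the check, a path ending in .tsv falls through to the directory branch, which is the inconsistency noted above.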

I think a reasonable solution would be to:

Get rid of the directory output option entirely.
Rename the output argument to output_prefix. The file format specified in the configuration file would override any file format given on the command line, with an appropriate warning generated.
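A rough sketch of how the output_prefix proposal could behave, under stated assumptions: `resolve_output_path` is a hypothetical helper, not an existing rsmpredict function, and the set of known formats is assumed to be csv, tsv, and xlsx.

```python
import warnings
from pathlib import Path

KNOWN_FORMATS = {"csv", "tsv", "xlsx"}  # assumed set of valid file_format values


def resolve_output_path(output_prefix, config_file_format="csv"):
    """Build the output path from a prefix; the configured format always wins."""
    prefix = Path(output_prefix)
    given = prefix.suffix.lower().lstrip(".")
    if given in KNOWN_FORMATS:
        if given != config_file_format:
            # The command-line extension disagrees with the configuration file.
            warnings.warn(
                f"Ignoring extension '.{given}'; using file_format "
                f"'{config_file_format}' from the configuration file."
            )
        # Drop the recognized extension so only the prefix remains.
        prefix = prefix.with_suffix("")
    return prefix.with_name(f"{prefix.name}.{config_file_format}")


print(resolve_output_path("predictions.tsv", "csv"))  # predictions.csv (with a warning)
```

Only recognized extensions are stripped, so a prefix such as `run.v2` would keep its dot and become `run.v2.csv`.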

desilinguist · 2020-03-09T22:11:19Z

Actually, now that I think a bit more about it, I think we should strive for consistency. So, here's an alternative I'd prefer:

Make rsmpredict also use an output directory.
Make --features into a boolean flag so that the pre-processed features are always saved in the given output directory with a fixed name, just like the predictions.

I think this is much simpler than what I had suggested above.

aloukina · 2020-03-10T00:30:43Z

I see the point about consistency, although I can see myself being very annoyed as a user: if I am running multiple experiments, I will end up with lots of directories each containing a single file with the same name. I personally prefer to have one directory with many files.
How about we take consistency even further and add a new field prediction_id that will be used as a prefix for the predictions file and other output files? (We could also make it optional and set it to experiment_id by default.) Then, if we also add the -f option already available in other tools, I'll be able to continue doing what I want and we'll have a consistent approach across the tools.
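As a sketch of the prefix idea, assuming the configuration is available as a plain dictionary: `prediction_output_paths` and the file name suffixes below are hypothetical and only illustrate how prediction_id could fall back to experiment_id while keeping many runs in one directory.

```python
from pathlib import Path


def prediction_output_paths(output_dir, config, file_format="csv"):
    """Return prefixed output paths; prediction_id defaults to experiment_id."""
    # Hypothetical fallback: use prediction_id when set, else experiment_id.
    prefix = config.get("prediction_id") or config["experiment_id"]
    out = Path(output_dir)
    return {
        # The file name suffixes are made up for this illustration.
        "predictions": out / f"{prefix}_predictions.{file_format}",
        "features": out / f"{prefix}_preprocessed_features.{file_format}",
    }


paths = prediction_output_paths("output", {"experiment_id": "exp1", "prediction_id": "run2"})
print(paths["predictions"].name)  # run2_predictions.csv
```

Because every file carries its own prefix, several experiments can share one output directory without overwriting each other.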

desilinguist · 2020-03-10T00:34:04Z

Hmm, yes I can see how that can be quite annoying. I like your suggestion! 👍

desilinguist added the bug label Mar 9, 2020 — with Slack

desilinguist added enhancement and removed bug labels Nov 3, 2020
