Metrics: separating scalar and continuous/graphs metrics #3409
Comments
From discussion: need to delete/hide
In order to make this scalar/continuous metrics separation, the current scalar metrics need to be redesigned.
I would call these "series", "array", or "track" metrics. Not sure about the term "continuous". Just a thought.
ini=YAML?
Probably not. DVCx then? In this case x can also mean "eXperiment mgmt". I have doubts about the command names though. So we're keeping
Sounds smart to leave this to a viz package like that and just refer users to their formatting guidelines.
Is there a decision on what the hyperparam metafiles are going to look like? It could be relevant to know for this point ^
What about metrics that just spit a single number into a text file? Is that not really something we want to support? (It could even be stored directly in the DVC-file in the future.)
Note: point 6 in #3409 (comment) above should be automatic with our process. It will be addressed in iterative/dvc.org/pull/1097 and in the docs PR derived from addressing point 2. 🙂
So we would remove
One last comment for now, sorry:
Keep in mind some of the text deleted in #1097 could be restored when writing docs for
It seems like we agreed on most of the items and can finalize the scalar and continuous metrics change. ToDo:
To discuss (we should decide after the first iteration is implemented):
On this I think maybe not, because metrics/plots support files tracked by Git (not outputs).
Nice to-do list! Please note there's a corresponding docs issue here: iterative/dvc.org#1175. Its checkbox list may need updating.
Follows `dvc params/metrics` convention and is needed in preparation for new commands and some refactoring. Part of iterative#3409
Kinda jumped the gun here with #3802, but it hurts my OCD, and it will simplify the new commands and set the terminology for the rest of the points.
* dvc: rename plot to plots
  Follows `dvc params/metrics` convention and is needed in preparation for new commands and some refactoring. Part of #3409
* completion: rename plot -> plots
More explicit CLI-like options `metrics/persist_no_cache` are still supported, but we use flags when generating dvc.yaml. Part of iterative#3409, prerequisite for `plot`.
* run: add --plot/--plot-no-cache
  Part of #3409
* plot: move show/diff to API
* plots: show marked plots by default
  Involves some refactoring to make plots comply with the rest of the code base.
* dvc.yaml: use top-level metrics and plots
* dvc.yaml: add support for plot template
* plots: add support for templates in dvc.yaml
* dvc.yaml: allow metrics/plots to have persist/cache
* plots: render single page by-default and introduce --show-json
* plot: use "working tree" instead of "workspace"
  Just to sync this with the rest of the codebase. We should consider renaming it to "workspace" everywhere, but for now I opt to not touch it to not break something that depends on it.
* plots: rename -f|--file to -o|--out
  Better corresponds to other commands like `import`.
* plots: use RepoTree and brancher instead of api
* stage: get rid of OutputParams
* tests: plots: add simple diff test
* plots: use `--targets` instead of `--datafile`
  Makes it comply with `dvc metrics/params` CLI options.
* tests: unit: plots: add --show-json test
* stage: cleanup exception for format error
  Based on feedback from the users.
* lockfile: use relpath in corruption error
Should we close this issue? It looks like everything is already implemented (probably except xpath/filter), but it's fine for now.
Closing in favor of #3945
Problem
There are two types of metrics in ML projects: scalar metrics or just a number (like AUC) and continuous metrics or a sequence of numbers (like ROC curve - an array of numbers).
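To make the distinction concrete, here is a minimal Python sketch of what the two kinds of metrics data might look like once serialized to JSON. The field names (`auc`, `fpr`, `tpr`), the values, and the helper `is_scalar_metrics` are illustrative assumptions, not a format defined by DVC.

```python
import json

# Scalar metric: one number per name (hypothetical value).
scalar_metrics = {"auc": 0.91}

# Continuous metric: a sequence of points, here an ROC curve
# (hypothetical field names and values).
roc_curve = [
    {"fpr": 0.0, "tpr": 0.0},
    {"fpr": 0.1, "tpr": 0.6},
    {"fpr": 0.4, "tpr": 0.9},
    {"fpr": 1.0, "tpr": 1.0},
]


def is_scalar_metrics(data):
    """Return True if data is a flat mapping of names to numbers."""
    return isinstance(data, dict) and all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in data.values()
    )


# Both kinds serialize to JSON, so the file extension alone
# does not tell a tool which kind of metric it is reading.
scalar_text = json.dumps(scalar_metrics)
roc_text = json.dumps(roc_curve)
```

A check like this could only guess the kind of metric from the shape of the data, which is one reason the issue asks for the two kinds to be handled separately.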
Today we support csv and json files as metrics without specifying what metrics are needed. In fact, DVC commands support only scalars: `dvc metrics diff`.
Solution
It is important to support both, understand the semantics of the metrics, and provide more value.
`dvc metrics diff` functionality. We just need to clearly describe the file format in the documentation to make sure only scalar metrics are supported. JSON and INI are both good formats for scalar metrics.
`dvc viz show roc.json` and `dvc viz diff roc.json HEAD HEAD^^`. Both commands should generate graphs. The second command should generate two graphs (not a diff).
Types
Types might improve visualization a lot. There are many possible types of graphs for `dvc viz`. It would be great to support a few common ones, like a regular plot or a confusion matrix.
For scalar types, it is also important to specify a type. A scalar can be the result of minimizing or maximizing some function. If the user can specify this information, it might help to show the proper color: red if a metric that should be maximized decreased, and green if it increased.
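The coloring rule for scalar metrics could be sketched roughly as follows. The function name `diff_color` and the `"max"`/`"min"` goal labels are assumptions made for illustration; nothing here is an existing DVC API.

```python
def diff_color(old, new, goal):
    """Pick a color for a change in a scalar metric.

    goal is "max" when higher values are better (e.g. AUC)
    and "min" when lower values are better (e.g. a loss).
    Returns "green" for an improvement, "red" for a regression,
    and None when the value did not change.
    """
    if goal not in ("max", "min"):
        raise ValueError(f"unknown goal: {goal!r}")

    delta = new - old
    if delta == 0:
        return None

    improved = delta > 0 if goal == "max" else delta < 0
    return "green" if improved else "red"
```

With this rule, a drop in a maximized metric is reported as red, while the same drop in a minimized metric is reported as green, which is why the direction has to be declared by the user rather than inferred.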
We should not make the metrics file format complicated. So, it might be a better solution to specify the metrics format in a separate file (like `Dvcfile` or a special metrics file).
Open questions
`dvc viz` in the core dvc package? Problem: most likely it will bring in dependencies on some graphics libraries, which can cause installation issues on OSes without graphics (like default EC2 instances). What options do we have?
Actions
`dvc viz` command.