html: How to track/ignore it in DVC #242

Closed
dberenbaum opened this issue Apr 18, 2022 · 6 comments

@dberenbaum
Collaborator

After iterative/dvc.org#3411, the easiest way to track plots in dvc.yaml is:

    plots:
      - dvclive

However, this will track dvclive/report.html, which is generally not desired. Some options to handle this better:

  • Automatically add dvclive/report.html to .dvcignore (should we check for the presence of DVC or just add it anyway?).
  • Move report.html outside of the dvclive dir.
  • Allow DVC to track it -- is it harmful?
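For illustration, the first option could look roughly like the sketch below. It is not DVCLive code: `ignore_report` and its `require_dvc` flag are hypothetical names, and treating a `.dvc` directory at the repo root as "DVC is present" is an assumption.

```python
from pathlib import Path

REPORT_PATTERN = "dvclive/report.html"


def ignore_report(repo_root, pattern=REPORT_PATTERN, require_dvc=True):
    """Append `pattern` to .dvcignore; return True if the file was changed.

    Hypothetical helper, not part of DVC or DVCLive.
    """
    root = Path(repo_root)
    # Assumption: a `.dvc` directory at the root means the repo uses DVC.
    if require_dvc and not (root / ".dvc").is_dir():
        return False
    ignore_file = root / ".dvcignore"
    text = ignore_file.read_text() if ignore_file.exists() else ""
    if pattern in text.splitlines():
        return False  # already ignored, so repeated calls change nothing
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    ignore_file.write_text(text + pattern + "\n")
    return True
```

Calling `ignore_report(".")` covers the "check for the presence of DVC" variant, while `ignore_report(".", require_dvc=False)` covers "just add it anyway".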
@dberenbaum
Collaborator Author

One more option would be to always suggest using cache: false for plots and capturing all dvclive outputs in git instead of dvc.

@daavoo
Contributor

daavoo commented Apr 19, 2022

> Automatically add dvclive/report.html to .dvcignore (should we check for the presence of DVC or just add it anyway?).

I wouldn't really like to add DVC-specific logic.

> Move report.html outside of the dvclive dir.

Not sure if it's a good idea. General feedback is that DVC/DVCLive already generates too many files at the root of the repo.

> Allow DVC to track it -- is it harmful?

I might be missing something, but it would just be wasted storage.


What if we encourage, through the docs, setting separate outputs for each data subfolder?

This way the report won't be tracked by DVC and allows for more flexibility. The most common setup, IMO, would be:

    plots:
      - dvclive/images
      - dvclive/scalars:
          cache: false

@dberenbaum
Collaborator Author

> What if we encourage, through the docs, setting separate outputs for each data subfolder?
>
> This way the report won't be tracked by DVC and allows for more flexibility. The most common setup, IMO, would be:
>
>     plots:
>       - dvclive/images
>       - dvclive/scalars:
>           cache: false

I think this makes sense for now. It feels a little burdensome to have to add all this to get dvclive outputs tracked properly, but at least it's explicit and flexible.

@daavoo daavoo added the A: dvc DVC integration label Sep 24, 2022
@daavoo
Contributor

daavoo commented Oct 5, 2022

Would #322 close this? Given that plots will be isolated in their own folder, the following won't track the report:

    plots:
      - dvclive/plots

@dberenbaum
Collaborator Author

🤔 It's definitely better. If we continue towards trying to auto-configure all DVCLive outputs in dvc.yaml, we would still want a way to specify the HTML as ignored, though, right?

@dberenbaum
Collaborator Author

If we are writing dvc.yaml from dvclive, is there any downside to adding the report file to dvcignore?

@dberenbaum dberenbaum reopened this May 1, 2023
@dberenbaum dberenbaum closed this as not planned May 1, 2023