Demo updates #1618
Conversation
@@ -1,3 +1,3 @@
seed: 473987
Q: why removing this? (to make it cleaner?)
> (to make it cleaner?)
Yep. It's just an additional column in the table, not really a parameter to configure.
Thanks and thanks, @daavoo , you saved me a lot of time! :)
It works great for me locally, and I've put some comments. The default (?) naming is confusing. Also, we need to get some images - I can even work on this, but I would appreciate some ideas.
return metrics
return metrics, predictions

def get_confusion_image(predictions, dataset):
Function get_confusion_image has a Cognitive Complexity of 18 (exceeds 5 allowed). Consider refactoring.
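One common way to bring cognitive complexity down is to split the collection of misclassified samples and the image assembly into small helpers. The sketch below is hypothetical: the signatures, the NumPy grid layout, and the assumption that predictions and dataset yield aligned (image, label) pairs are mine, not the PR's actual code.

```python
import numpy as np

def find_misclassified(predictions, dataset):
    """Yield the image for every prediction that does not match its label.

    Assumes `dataset` yields (image, label) pairs aligned with `predictions`.
    """
    for pred, (image, actual) in zip(predictions, dataset):
        if pred != actual:
            yield image

def build_grid(images, cols=10):
    """Tile same-sized grayscale images into a single grid array."""
    images = list(images)
    if not images:
        return None
    h, w = images[0].shape[:2]
    rows = -(-len(images) // cols)  # ceiling division
    grid = np.zeros((rows * h, cols * w), dtype=images[0].dtype)
    for i, img in enumerate(images):
        r, c = divmod(i, cols)
        grid[r * h:(r + 1) * h, c * w:(c + 1) * w] = img
    return grid

def get_confusion_image(predictions, dataset):
    """Compose the misclassified-digits image from the two helpers above."""
    return build_grid(find_misclassified(predictions, dataset))
```

Each helper now has a single, flat loop, which is usually enough to get a function back under Code Climate's complexity threshold.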
Code Climate has analyzed commit d91ec4d and detected 0 issues on this pull request. The test coverage on the diff in this pull request is 100.0% (85% is the threshold). This pull request will bring the total coverage in the repository to 96.7% (0.0% change). View more on Code Climate.
🙏🏻 @daavoo
md5: e04749646f33be203f210c5f1ea63a2a.dir
size: 10783
nfiles: 1
md5: c7a5760efd52d3759d8e546ab867f4a6
@daavoo can you push this to the remote if you still have it? Please and thank you.
for k, v in metrics.items():
    live.log(k, v)
live.next_step()

live.set_step(None)
missclassified = get_confusion_image(predictions, mnist_test)
live.log_image("missclassified.jpg", missclassified)
@daavoo there is a change in behaviour here for us. We used to get a single image for every checkpoint. It was saved when the checkpoint was written, which meant that we could have "live updates" in the comparison table. Due to the way that log_image works, if we log an image for each checkpoint we now see the following:
Is the old behaviour of being able to see only a single plot for each checkpoint desirable? Would a DS use that functionality? My feeling is that it would be more useful than having all previous images shown and/or only having a single image provided at the end of the epochs.
WDYT?
> Is the old behaviour of being able to see only a single plot for each checkpoint desirable? Would a DS use that functionality?
It's dependent on the use case. But I can say that it is more desirable to be able to see all the images from previous checkpoints in the UI, not only the latest.
We introduced the behavior in DVCLive to cover this, but still lack a UI component that makes sense for visualizing it.
Tensorboard implements this functionality with a slider, and it's a very popular feature:
For now, I will remove the log_image usage from DVCLive in favor of plain image saving.
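A minimal sketch of what "plain image saving" per checkpoint could look like. Everything here is an assumption rather than DVCLive's or the demo's actual code: the output directory name, the step-numbered filename, and the use of Pillow are illustrative choices.

```python
# Hypothetical sketch: save one image per checkpoint directly to disk
# instead of going through DVCLive's log_image. Directory and filename
# conventions below are assumptions, not the demo's real layout.
from pathlib import Path

import numpy as np
from PIL import Image

def save_step_image(array, step, out_dir="misclassified"):
    """Write a grayscale array as <out_dir>/step_<n>.jpg and return its path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"step_{step}.jpg"
    Image.fromarray(np.asarray(array, dtype=np.uint8)).save(path)
    return path
```

Keeping one file per step on disk is what would let a UI later show all checkpoints (e.g. with a slider) instead of only the latest image.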
I would say we can create a feature request out of this (we were discussing before what the best way is to present images from a checkpoint).
for k, v in metrics.items():
    live.log(k, v)
live.next_step()

live.set_step(None)
Taking a look. I wasn't sure about logging the image with DVCLive. We could remove it and just save the image directly.
@@ -0,0 +1,10 @@
timestamp step acc
[Q] After running an experiment, for dvc diff we now have:
/demo add-exp-data !7 ❯ dvc diff
Added:
training_metrics/images/missclassified.jpg
training_metrics/report.html
training_metrics/scalars/acc.tsv
training_metrics/scalars/loss.tsv
Modified:
model.pt
predictions.json
training_metrics.json
training_metrics/
files summary: 4 added, 3 modified
but from dvc list . -R --dvc-only we only get:
❯ dvc list . --dvc-only -R
data/MNIST/raw/t10k-images-idx3-ubyte
data/MNIST/raw/t10k-images-idx3-ubyte.gz
data/MNIST/raw/t10k-labels-idx1-ubyte
data/MNIST/raw/t10k-labels-idx1-ubyte.gz
data/MNIST/raw/train-images-idx3-ubyte
data/MNIST/raw/train-images-idx3-ubyte.gz
data/MNIST/raw/train-labels-idx1-ubyte
data/MNIST/raw/train-labels-idx1-ubyte.gz
model.pt
Should these files be tracked by DVC so that we can showcase the SCM view/decorations accordingly or is that bad practice?
full output from list:
❯ dvc list . -R
.DS_Store
.dvcignore
.gitignore
.vscode/extensions.json
.vscode/settings.json
data/MNIST/.gitignore
data/MNIST/raw.dvc
data/MNIST/raw/t10k-images-idx3-ubyte
data/MNIST/raw/t10k-images-idx3-ubyte.gz
data/MNIST/raw/t10k-labels-idx1-ubyte
data/MNIST/raw/t10k-labels-idx1-ubyte.gz
data/MNIST/raw/train-images-idx3-ubyte
data/MNIST/raw/train-images-idx3-ubyte.gz
data/MNIST/raw/train-labels-idx1-ubyte
data/MNIST/raw/train-labels-idx1-ubyte.gz
dvc.lock
dvc.yaml
dvclive.json
dvclive/scalars/acc.tsv
dvclive/scalars/loss.tsv
model.pt
params.yaml
predictions.json
requirements.txt
train.py
training_metrics.json
training_metrics/images/missclassified.jpg
training_metrics/report.html
training_metrics/scalars/acc.tsv
training_metrics/scalars/loss.tsv
It's completely use-case dependent. We usually set cache: false for files that are small enough to be tracked by Git.
If it makes more sense for the VS Code demo, there is nothing wrong with removing the cache: false lines and tracking everything with DVC.
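For reference, this is roughly where cache: false lives in a dvc.yaml; the fragment below is a hypothetical sketch, with stage name, command, and paths chosen for illustration rather than taken from this repo:

```yaml
# Hypothetical dvc.yaml sketch. `cache: false` keeps an output out of
# the DVC cache so the small file can be committed to Git instead;
# removing it makes DVC track (and cache) the output itself.
stages:
  train:
    cmd: python train.py
    deps:
      - train.py
    outs:
      - model.pt
      - training_metrics/scalars/acc.tsv:
          cache: false
```

Dropping the cache: false lines would make those outputs show up in dvc list . --dvc-only, which is what the SCM view/decorations in the VS Code demo key off.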
Pending on DVC and DVCLive releases