Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demo updates #1618

Merged
merged 5 commits into from
Apr 28, 2022
Merged

Demo updates #1618

merged 5 commits into from
Apr 28, 2022

Conversation

daavoo
Copy link
Contributor

@daavoo daavoo commented Apr 26, 2022

Pending on DVC and DVCLive releases

@mattseddon mattseddon added 🏠 housekeeping A: integration Area: DVC integration layer labels Apr 26, 2022
@@ -1,3 +1,3 @@
seed: 473987
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why removing this? (to make it cleaner?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(to make it cleaner?)

Yep. It's just additional column in table not really a parameter to configure

demo/dvc.yaml Outdated Show resolved Hide resolved
Copy link
Member

@shcheklein shcheklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and thanks, @daavoo , you saved me a lot of time! :)

I've put some comments, it works great for me locally. I've put some comments. Default (?) naming is confusing. And we need to get some images - I can even work on this, but I would appreciate some ideas.

@daavoo daavoo marked this pull request as ready for review April 28, 2022 18:29
return metrics
return metrics, predictions

def get_confusion_image(predictions, dataset):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_confusion_image has a Cognitive Complexity of 18 (exceeds 5 allowed). Consider refactoring.

@codeclimate
Copy link

codeclimate bot commented Apr 28, 2022

Code Climate has analyzed commit d91ec4d and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (85% is the threshold).

This pull request will bring the total coverage in the repository to 96.7% (0.0% change).

View more on Code Climate.

@mattseddon
Copy link
Member

🙏🏻 @daavoo

@mattseddon mattseddon merged commit efa7dfd into main Apr 28, 2022
@mattseddon mattseddon deleted the demo-updates branch April 28, 2022 23:59
md5: e04749646f33be203f210c5f1ea63a2a.dir
size: 10783
nfiles: 1
md5: c7a5760efd52d3759d8e546ab867f4a6
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daavoo can you push this to the remote if you still have it. Please and thank you.

for k, v in metrics.items():
live.log(k, v)
live.next_step()

live.set_step(None)
missclassified = get_confusion_image(predictions, mnist_test)
live.log_image("missclassified.jpg", missclassified)
Copy link
Member

@mattseddon mattseddon Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daavoo there is a change in behaviour here for us. We used to get a single image for every checkpoint. It was saved when the checkpoint was written and it meant that we could have "live updates" in the comparison table. Due to the way that log_image works if we log an image for each checkpoint we now see the following:

Screen Shot 2022-04-29 at 11 38 29 am

Is the old behaviour of being able to see only a single plot for each checkpoint desirable? Would a DS use that functionality? My feeling is that it would be more useful than having all previous images shown and/or only having a single image provided at the end of the epochs.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the old behaviour of being able to see only a single plot for each checkpoint desirable? Would a DS use that functionality?

It's dependent on the use case. But I can say that it is more desirable to be able to see all the images from previous checkpoints in the UI, not only the latest.

We introduced the behavior in DVCLive to cover this, but still lack a UI component that makes sense for visualizing it.

Tensorboard implements this functionality with a slider, and it's a very popular feature:

slider

For now, I will remove the log_image usage from DVCLive in favor of plain image saving.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we can create a feature request out of this (we were discussing before, what is the best way to present images from the checkpoint)

for k, v in metrics.items():
live.log(k, v)
live.next_step()

live.set_step(None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having this here also leads to the old "experiment is logged with the last epoch number" error

image

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a look. I wasn't sure about logging the image with dvclive. We could remove It and just save the image directly

@@ -0,0 +1,10 @@
timestamp step acc
Copy link
Member

@mattseddon mattseddon Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] After running an experiment for diff we now have:

/demo add-exp-data !7 ❯ dvc diff
Added:                                                                                                                                               
    training_metrics/images/missclassified.jpg
    training_metrics/report.html
    training_metrics/scalars/acc.tsv
    training_metrics/scalars/loss.tsv

Modified:
    model.pt
    predictions.json
    training_metrics.json
    training_metrics/

files summary: 4 added, 3 modified

but in dvc list . -R --dvc-only we only get:

❯ dvc list . --dvc-only -R
data/MNIST/raw/t10k-images-idx3-ubyte
data/MNIST/raw/t10k-images-idx3-ubyte.gz
data/MNIST/raw/t10k-labels-idx1-ubyte
data/MNIST/raw/t10k-labels-idx1-ubyte.gz
data/MNIST/raw/train-images-idx3-ubyte
data/MNIST/raw/train-images-idx3-ubyte.gz
data/MNIST/raw/train-labels-idx1-ubyte
data/MNIST/raw/train-labels-idx1-ubyte.gz
model.pt

Should these files be tracked by DVC so that we can showcase the SCM view/decorations accordingly or is that bad practice?

full output from list:

❯ dvc list . -R                                                                0.87841s  .env  3.0.0 12:43:12
.DS_Store
.dvcignore
.gitignore
.vscode/extensions.json
.vscode/settings.json
data/MNIST/.gitignore
data/MNIST/raw.dvc
data/MNIST/raw/t10k-images-idx3-ubyte
data/MNIST/raw/t10k-images-idx3-ubyte.gz
data/MNIST/raw/t10k-labels-idx1-ubyte
data/MNIST/raw/t10k-labels-idx1-ubyte.gz
data/MNIST/raw/train-images-idx3-ubyte
data/MNIST/raw/train-images-idx3-ubyte.gz
data/MNIST/raw/train-labels-idx1-ubyte
data/MNIST/raw/train-labels-idx1-ubyte.gz
dvc.lock
dvc.yaml
dvclive.json
dvclive/scalars/acc.tsv
dvclive/scalars/loss.tsv
model.pt
params.yaml
predictions.json
requirements.txt
train.py
training_metrics.json
training_metrics/images/missclassified.jpg
training_metrics/report.html
training_metrics/scalars/acc.tsv
training_metrics/scalars/loss.tsv

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its completely use case dependent. we usually set cache: false for files that are small enough to be tracked by git.

If It makes more sense for the VSCode demo, there is nothing wrong with removing the cache: false lines and track everything with DVC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: integration Area: DVC integration layer 🏠 housekeeping
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants