How is # iterations actually being calculated? #79

Closed
fedarko opened this issue Sep 20, 2019 · 2 comments

@fedarko

fedarko commented Sep 20, 2019

The README says that

Number of iterations = --epoch # multiplied by the --batch-size parameter

...But I'm testing out Songbird on the Red Sea dataset with --epochs 10000 and the default --batch-size 5, and instead of 50k iterations I'm seeing ~80k iterations. I do have --differential-prior 0.5; might that be a reason why there are more iterations than expected?

This isn't a big issue, but if the actual number of iterations is computed differently it'd be a good idea to update the README accordingly (at the very least, removing this equation if it is incorrect).

@fedarko

fedarko commented Sep 20, 2019

OOP I just read the part of the README that says

For example, if you have a 100 samples in your dataset and you specify --batch-size 5 and --epochs 200, then you will have (100/5)*200=4000 iterations total.

That'd make more sense I guess, since the Red Sea dataset has 45 samples (so (45/5)*10k = 90k iterations). I'm still not sure why both the TensorBoard and q2 summaries only go up to 80k iterations. Maybe it's summary-interval stuff, but I set that to 1, so that shouldn't have been a factor... hm.

@fedarko

fedarko commented Sep 23, 2019

Ah, so I think I understand why it's only at 80k iterations. There are 5 samples being held out for testing (since the default for --num-random-test-examples is 5), which leaves 45 - 5 = 40 training samples, so ((45 - 5)/5)*10k = 80k.

In lieu of trying to provide an accurate formula for # of iterations, my opinion is we should just loosely specify what params will influence this instead of trying to pin down an accurate number.
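
For reference, here is a rough sketch of the arithmetic that matches the numbers in this thread. It is not Songbird's actual code: `estimated_iterations` is a hypothetical helper, and the formula is just the README's (samples / batch size) * epochs with the held-out test samples subtracted first. The floor division is also an assumption, since none of the examples here involve a sample count that doesn't divide evenly by the batch size.

```python
def estimated_iterations(num_samples, epochs, batch_size=5, num_test_examples=5):
    """Estimate the iteration count from the parameters discussed above.

    Hypothetical helper, not part of Songbird. The defaults mirror the
    defaults mentioned in this thread (--batch-size 5 and
    --num-random-test-examples 5).
    """
    # Samples held out for testing are not used for training.
    num_train = num_samples - num_test_examples
    # Assumption: partial batches are dropped (floor division).
    return (num_train // batch_size) * epochs


# Red Sea dataset: 45 samples, --epochs 10000, defaults otherwise.
print(estimated_iterations(45, 10000))                        # 80000
# Same run if no samples were held out (the 90k guess from earlier).
print(estimated_iterations(45, 10000, num_test_examples=0))   # 90000
```

This only reproduces the 80k and 90k figures above; other parameters may also affect the real count, which is part of why a loose description in the README seems safer than an exact formula.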
