Repeatable containerized experiments #228

Closed
lewfish opened this issue Apr 20, 2018 · 5 comments
lewfish commented Apr 20, 2018

When running a job remotely, you have to specify which version of Raster Vision (RV) to run. We currently specify a branch of this repo in https://github.com/azavea/raster-vision/blob/develop/src/run_script.sh#L8

This makes it impossible to execute forks of this repo. Using branches is also problematic: if a commit is added while a chain of jobs is running, different jobs in that chain will run different versions of RV. A quick fix is to allow specifying the repo URI and the commit ID.
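
A minimal sketch of what pinning a run to a repo URI and commit ID could look like. The function name `build_checkout_commands`, the `dest` path, and the requirement for a full 40-character SHA are assumptions made for illustration; none of this is taken from `run_script.sh`.

```python
import re

# Full SHA-1 commit IDs are 40 hex characters. Requiring the full ID (an
# assumption of this sketch) rules out branch names, which can move.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def build_checkout_commands(repo_uri, commit_id, dest="/opt/src"):
    """Return the git commands that fetch repo_uri and pin it to commit_id."""
    if not FULL_SHA.fullmatch(commit_id):
        raise ValueError(
            f"expected a full 40-character commit ID, got {commit_id!r}"
        )
    return [
        ["git", "clone", repo_uri, dest],
        # Detached HEAD: every job in the chain checks out exactly this commit.
        ["git", "-C", dest, "checkout", "--detach", commit_id],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` inside the job. Because the URI is a parameter, the same code path would work for a fork.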

This approach is still problematic, though, because you could delete a commit on a branch while the jobs are running. For the sake of repeatability, we need some way of freezing the code that should be run and archiving it. We could do that by creating a zip file of the repo at the point of launching the Batch jobs and storing it along with the files for that run. I've seen this done before in https://github.com/openai/evolution-strategies-starter/blob/master/scripts/launch.py
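
A rough sketch of the zip-at-launch idea, assuming the archive is written outside the repo directory and that skipping `.git` is acceptable. The name `snapshot_source` is hypothetical, and this is not how the linked `launch.py` does it.

```python
import zipfile
from pathlib import Path


def snapshot_source(repo_dir, archive_path, exclude=(".git",)):
    """Zip every file under repo_dir, skipping paths inside excluded dirs.

    Returns the archived relative paths so they can be logged with the run.
    archive_path should live outside repo_dir so it is not archived itself.
    """
    repo_dir = Path(repo_dir)
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Sorted traversal keeps the member order stable between runs.
        for path in sorted(repo_dir.rglob("*")):
            rel = path.relative_to(repo_dir)
            if not path.is_file() or any(part in exclude for part in rel.parts):
                continue
            zf.write(path, rel.as_posix())
            names.append(rel.as_posix())
    return names
```

The resulting archive could then be uploaded next to the other files for that run, so the jobs unpack the frozen copy instead of checking out a branch.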


lewfish commented Jun 28, 2018

I think a good approach is to use a private fork of Raster Vision to store branches for experimental runs. These branches should never be deleted, and should follow some sensible naming convention.

lewfish added the queue label Jun 28, 2018

lewfish commented Jun 28, 2018

To support this (and to support other RV users) we should add an option for the repo URI to use when submitting batch jobs.
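
One way such an option might look, sketched with `argparse`. The flag names `--repo-uri` and `--branch`, the environment variable names, and the helper functions are all hypothetical; the defaults are only inferred from the `run_script.sh` link in the issue description.

```python
import argparse

# Default points at the main repo; a fork's URI can be passed instead.
DEFAULT_REPO_URI = "https://github.com/azavea/raster-vision.git"


def parse_submit_args(argv):
    """Parse the (hypothetical) options that select which code a job runs."""
    parser = argparse.ArgumentParser(description="Submit a batch job (sketch).")
    parser.add_argument(
        "--repo-uri",
        default=DEFAULT_REPO_URI,
        help="Git repo the job should check out.",
    )
    parser.add_argument(
        "--branch",
        default="develop",
        help="Branch of that repo to run.",
    )
    return parser.parse_args(argv)


def job_environment(args):
    """Environment variables a job startup script could read (assumed names)."""
    return {"RV_REPO_URI": args.repo_uri, "RV_BRANCH": args.branch}
```

Passing the values through environment variables is just one choice; they could equally be appended to the job's command line.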

lewfish self-assigned this Jun 28, 2018

lewfish commented Jul 3, 2018

To support this, we need to pass GitHub credentials to the Batch job so it can check out the code from a private repo.


lewfish commented Jul 5, 2018

We've discussed a different way of doing this that involves creating a Docker image for each experiment. I will write an ADR on it.
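
The ADR is not part of this thread, so the following is only a guess at the general shape of an image-per-experiment workflow. The repository name, the tag scheme (experiment ID plus short commit ID), and both function names are assumptions.

```python
import re


def image_tag(repository, experiment_id, commit_id):
    """Build an image reference like repository:experiment-abc1234."""
    # Docker tags may contain only ASCII letters, digits, '_', '.' and '-',
    # and may not start with '.' or '-'. Other characters become '-'.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "-", experiment_id).lstrip(".-")
    if not safe:
        raise ValueError("experiment_id has no characters usable in a tag")
    return f"{repository}:{safe}-{commit_id[:7]}"


def build_and_push_commands(tag, context_dir="."):
    """Return the docker commands that freeze the code into an image."""
    return [
        ["docker", "build", "-t", tag, context_dir],
        ["docker", "push", tag],
    ]
```

Since the image contains the code itself, the jobs would no longer depend on a branch or commit still existing in the repo.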

lewfish added priority and removed queue labels Jul 5, 2018
lewfish changed the title from "Improve way of specifying which code to run remotely" to "Repeatable containerized experiments" Jul 10, 2018
lewfish added queue and removed priority labels Jul 26, 2018
lewfish removed the queue label Sep 26, 2018
lewfish removed their assignment Sep 26, 2018
lewfish self-assigned this Oct 31, 2018

lewfish commented Nov 2, 2018

Subsumed by #512

lewfish closed this as completed Nov 2, 2018
lewfish removed the queue label Nov 2, 2018