Unify build and deploy processes across the various components of OpenPATH #1048
Shankari mentioned this: Does this mean:
|
The admin and public repos are built on top of the e-mission-server image: their Dockerfiles build off of the base image of e-mission-server. What we want is that when an e-mission-server PR is merged, we bump the dependency in the admin and public dashboard Dockerfiles to the latest tag, and then rebuild those images. The automation would cover only the Dockerfile changes that update the base server image to the latest tag. It does not include other code changes from PRs, as those would still go through the code review process we currently follow. The automated merges with Docker tag updates must occur only when the underlying e-mission-server has been updated. The automated builds of the latest merged or updated code in these repos (and not any open / un-merged PRs) can occur, if needed, on every merge. |
Suggestions from Thursday:
Reusable workflows syntax example:
Composite actions with repository dispatching
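For reference, a minimal sketch of the reusable-workflow (workflow_call) syntax mentioned above; the repo, file, and input names are placeholders rather than our actual workflows:

```yaml
# Hypothetical callee: .github/workflows/build-image.yml in a dashboard repo
on:
  workflow_call:
    inputs:
      server-image-tag:
        description: "Tag of the e-mission-server base image to build on"
        required: true
        type: string

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building against server image tag ${{ inputs.server-image-tag }}"

# Hypothetical caller: a job in the e-mission-server workflow
# jobs:
#   trigger-admin-dash:
#     uses: e-mission/op-admin-dashboard/.github/workflows/build-image.yml@master
#     with:
#       server-image-tag: "2023-01-01--00-00"
```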
Notes:
|
It is "sub-modules" not "some modules" 😄 |
So, I did take a look at git submodules, and it might not be a good option for our use case. Found some information here:
What it does:
Why not good?
|
Single Internal Repository
A possible redesign for the internal repositories I came up with is to have a single repository internally, with sub-directories for each repo, similar to how the server repo is currently used internally. For the server repo, the internal repo is named nrelopenpath. Similarly, we could refactor the join, admin-dash, and public-dash repos to be used the same way the server repo is used in the internal GitHub repos.
Need to see how to fit in admin-dash (external remote upstream -> main -> dev) and public-dash (notebook image).
Pros and Cons:
Cons:
|
Index for topic-wise updates
We have organized our findings into a series of topics / sections for ease of reading. |
Topic 1: Compiled list of related issues
We spent a lot of time scouring through the existing documentation in GitHub issues and PRs (both open and closed) spread throughout the repositories for our OpenPATH project. Notes:
|
Topic 2: Learnings from documentation across the OpenPATH project
While it has been a lot of information to digest, we have gained a much better understanding of the history behind OpenPATH deployment and have compiled an expansive list of issues that we believe relate to this. Some knowledge points from our exploration are mentioned below:
1. Admin-dashboard
B. Wrappers added in Dockerfile (for other repos too) [1]
2. Join
3. Public-dash A. Basic understanding of public-dash images [1]
B. AWS Codebuild [1]
C. Switched to pinned notebook Image [1, 2]
D. Point where we switched to using notebook image instead of server image [1]
These public-dash changes that commented out code (used in external, and ran notebooks directly for generating plots) were all done around the same time - Sep 2022.
4. e-mission-server
A. CI publishes multiple images from multiple branches + image_build_push.yml origin [1, 2]. Most important issue, as it is very close to the current task we're working on and highlights the redesign needed.
Looks like we also went from a monorepo / monolithic design to a split microservices design, especially for the server code:
B. Learnt about the origin of Dockerhub usage + automatic image build
C. Travis CI was tested by Shankari [1]
D. AWS costs detailed discussion [1]
E. Why or why not to use Dockerhub (costs, free tier, images removal) [1]
Dockerhub resources: [pricing, 6 months policy, policy delayed, retention policy + docker alternatives]
F. MongoDB to AWS DocumentDB switch [1]
5. Nrelopenpath [INTERNAL]
6. e-mission-docker A. Origin of multi-tier Docker compose [1]
|
Topic 3: Questions for the Cloud Team + key takeaways
We have been collaborating with cloud services back and forth.
Q1: We understand the current process to build and deploy images looks like this: Could the possible process be changed to only build once externally, like this?
A1:
Q2: A2:
Q3: A3:
Q4: A4:
Q5: A5:
Q6: A6:
Q7: Instead, in the post-build stage, just before “docker push” to AWS ECR, we can pull the required image from Dockerhub, tag it, then push to AWS ECR. A7:
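A rough sketch of the pull / tag / push sequence described in Q7; the Docker Hub image name, tag, and ECR registry URL are placeholders (and ECR login is assumed to have happened already):

```bash
# Pull the image that was already built and pushed externally to Docker Hub
docker pull emission/e-mission-server:master_2023-01-01--00-00

# Re-tag it for the internal AWS ECR registry instead of rebuilding it
docker tag emission/e-mission-server:master_2023-01-01--00-00 \
  123456789012.dkr.ecr.us-west-2.amazonaws.com/e-mission-server:master_2023-01-01--00-00

# Push the re-tagged image to ECR in the post-build stage
docker push 123456789012.dkr.ecr.us-west-2.amazonaws.com/e-mission-server:master_2023-01-01--00-00
```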
Some key takeaways:
|
Topic 4: Understanding of the internal repos (need + rebuild)
Two questions to answer; we have put forth our points after discussing with Jianli.
1. Do we need internal repos at all? Why?
2. Do we need to re-build images both externally and internally (two sets of Dockerfiles for each repo)?
A. Possibly not needed to rebuild:
B. Possibly need to rebuild:
To summarize:
|
Topic 5: Redesign plan updates
Redesign steps suggested in the previous meeting:
1. Add all four repos to the multi-tier docker structure
A. Current setup
B. New setup
C. Feedback from previous meeting:
2. Proposed deployment process
We are still considering the one-internal-repository structure (mentioned above in 1.) with related files for each repo inside just one subdirectory per repo.
A. Skipping the “build” job:
B. Streamlining repo-specific build processes
i. E-mission-server:
ii. Join
iii. Admin-dash:
iv. Public-dash:
Two possibilities:
|
One high-level comment before our next meeting:
Do we need all these conf directories? A lot of them have a single configuration. |
Table for Differences in External and Internal Repositories
|
High-level thoughts:
Microscopic details:
|
…to crontab folder from em-public dashboard. 2. Added note to update once issue e-mission#1048 is checked in. 3. Removed instructions to use mongodb directly.
Current dealings: @MukuFlash03 and I have been collaborating on minimizing the differences between the internal and external repositories.
I have been working on figuring out how to pass the image tag – created in the server action image-build-push.yml – between repositories. I’ve been attempting to use the upload-artifact/download-artifact method. It worked to upload the file, and we were able to retrieve the file in another repository, but we had to specify the
We also looked into GitHub release and return-dispatch as options, but decided they were not viable. There are ways to push files from one run to another repository, though we haven’t tried them yet. Write permissions might be a barrier to this, so creating tokens will be necessary. If we can get the file pushing to work, this is our intended workflow:
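For illustration, the upload / cross-repository download experiment described above would look roughly like this; the artifact name, file name, variable, and token secret are assumptions:

```yaml
# In the e-mission-server workflow: save the generated image tag as an artifact
- name: Save image tag
  run: echo "${DOCKER_IMAGE_TAG}" > image_tag.txt
- uses: actions/upload-artifact@v4
  with:
    name: server-image-tag
    path: image_tag.txt

# In a dashboard repo workflow: download the artifact from the other repository.
# Cross-repository downloads need the source repo, a specific run-id, and a token.
- uses: actions/download-artifact@v4
  with:
    name: server-image-tag
    repository: e-mission/e-mission-server
    run-id: ${{ inputs.server_run_id }}           # hypothetical input
    github-token: ${{ secrets.CROSS_REPO_TOKEN }} # hypothetical secret
```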
|
Created PRs for the implementation code changes here:
A. External PRs:
B. Internal PR |
Had a meeting with @MukuFlash03 to discuss some issues with testing. Made a plan for documentation of testing and outlined what all needs to be done.
|
Docker Commands for Testing Code Changes
Posting a list of the docker commands I used to verify whether the docker images were building successfully. I had to set up the configurations from the docker-compose files manually myself, since docker-compose is not used any more for the internal images.
Creating a network so containers can be connected to each other, plus a DB container for storage; data must be loaded into the DB (I did not load data when I did this testing initially). The commands are sketched below.
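A reconstruction of those setup commands (the network name and MongoDB version are assumptions, not necessarily exactly what was run):

```bash
# Create a shared network so the containers can reach each other
docker network create emission

# Start a DB container for storage (no data loaded at this point)
docker run -d --name db --network emission -p 27017:27017 mongo:4.4
```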
A. Internal Repo Images Checkout to multi-tier branch in internal repo -> nrelopenpath
Sometimes, during local testing, the join and public-dash frontend pages might load the same HTML file, as the port is still mapped to whichever of join or public-dash is run first.
B. External Repo Images
Initially, I used option 1.
|
For automatic updates of the tags, we have three options:
|
@MukuFlash03 See my comment above, quoted below, for an outline of the steps to try to get the file pushing method to work:
|
There is also |
Summary of approaches tried for automating docker image tags
Requirements:
Notes:
Current status: Implemented
For reference, matrix strategy for workflow dispatch events
Pending
Approaches planned to try out:
Approaches actually tested, implemented and verified to run successfully
Reason for not trying out Approach 2: I tried out and implemented Approaches 1 and 3 first. The next major task was to work on Req 3), which involves updating the Dockerfiles in the dashboard repos. |
Details of approaches
In my forked repositories for e-mission-server, join, admin-dash, and public-dash, there are three branches available for the tags:
tags-artifact branch in: e-mission-server, admin-dash, public-dash, join
tags-dispatch branch in: e-mission-server, admin-dash, public-dash, join
tags-matrix branch in: e-mission-server, admin-dash, public-dash, join
Approach 1: tags-artifact:
Cons:
Pros of workflow dispatch events and matrix strategy:
|
I don't think that the solution should be to update the Dockerfiles. That is overkill and is going to lead to merge conflicts. Instead, you should have the |
Our primary concern with this method was for users building locally. Is it acceptable to tell users to copy the latest image from the docker hub image library in the README? |
The comment around |
Docker tags automation working end-to-end!
Finally got the tags automation to work completely in one click, starting from the e-mission-server workflow, passing the latest timestamp used as the docker image tag suffix, and then triggering the workflows in admin-dashboard and public-dashboard.
The final approach taken involves a combination of the artifact and the matrix-dispatch methods discussed here. Additionally, as suggested by Shankari here, I changed the Dockerfiles to use environment variables set in the workflow runs themselves. Hence, we are no longer using / updating hardcoded timestamp values in the Dockerfiles.
There is still a manual element remaining; however, this has to do with any users or developers looking to work on the code base locally with the dashboard repositories. This is also what @nataliejschultz had mentioned here:
|
Implementation
Combined approach (artifact + matrix), tags-combo-approach branch: e-mission-server, admin-dash, public-dash
Successful workflow runs:
|
I decided to go ahead with the matrix-build strategy, which dispatches workflows to multiple repositories when triggered from one source repository. I had implemented this in the tags-matrix branches of the dashboard repos (the join repo as well, but that was just for initial testing purposes; the final changes are only in the dashboard repos). Initially, I only had a
Now, for the workflow dispatch event, I was able to pass the latest generated docker image timestamp directly from the e-mission-server workflow in the form of an input parameter.
This parameter was then accessible in the workflows of the dashboard repos:
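Roughly, the dispatching side in the e-mission-server workflow and the receiving side in a dashboard workflow look like the sketch below; the workflow file name, input name, job output, and token secret are placeholders for what is actually in the tags-matrix branches:

```yaml
jobs:
  # e-mission-server workflow: dispatch to each dashboard repo via a matrix
  dispatch-to-dashboards:
    needs: build
    runs-on: ubuntu-latest
    strategy:
      matrix:
        repo: [e-mission/op-admin-dashboard, e-mission/em-public-dashboard]
    steps:
      - name: Trigger dashboard workflow with the latest image timestamp
        run: |
          curl -X POST \
            -H "Authorization: Bearer ${{ secrets.DISPATCH_TOKEN }}" \
            -H "Accept: application/vnd.github+json" \
            https://api.github.com/repos/${{ matrix.repo }}/actions/workflows/image_build_push.yml/dispatches \
            -d '{"ref": "master", "inputs": {"docker_image_tag": "${{ needs.build.outputs.date }}"}}'

# Dashboard repo workflow: accept the timestamp as a workflow_dispatch input
# on:
#   workflow_dispatch:
#     inputs:
#       docker_image_tag:
#         description: "Timestamp suffix of the latest server image"
#         required: true
#         type: string
# ...and read it as ${{ github.event.inputs.docker_image_tag }}
```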
|
With these changes done, I believed I was finished, but then I came across some more issues. I have resolved them all now, but am mentioning them here.
Why did I choose to add the artifact method as well? The issue I was facing was with fetching the latest timestamp for the image tag in the case of a push event trigger. With the workflow dispatch, the server workflow itself triggers the dashboard workflows and is therefore directly connected to them. Push events, however, only trigger the specific workflow in that specific dashboard repository to build and push the image, and hence cannot retrieve the image tag directly. So, I utilized the artifact upload and download method to:
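One way to sketch the push-event path (not necessarily the exact implementation here) is to look up the most recent successful server run and pull its tag artifact with the gh CLI; the workflow name, artifact name, file name, and token are assumptions:

```yaml
- name: Get latest server image tag on push events
  if: github.event_name == 'push'
  env:
    GH_TOKEN: ${{ secrets.GH_FG_PAT }}   # hypothetical fine-grained token
  run: |
    # Find the most recent successful run of the server image workflow
    RUN_ID=$(gh run list --repo e-mission/e-mission-server \
      --workflow image_build_push.yml --status success \
      --limit 1 --json databaseId --jq '.[0].databaseId')
    # Download the artifact containing the timestamp and export it for later steps
    gh run download "$RUN_ID" --repo e-mission/e-mission-server --name server-image-tag
    echo "DOCKER_IMAGE_TAG=$(cat image_tag.txt)" >> "$GITHUB_ENV"
```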
Dockerfiles' FROM layer looks like:
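Something along these lines; the base image name, tag format, and ARG name are placeholders for the actual values:

```dockerfile
# Timestamp suffix of the e-mission-server base image, supplied at build time
ARG SERVER_IMAGE_TAG
FROM shankari/e-mission-server:master_${SERVER_IMAGE_TAG}
```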
Solution I implemented involves defining two
I then passed either of these as the --build-arg for the
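For example, in the dashboard workflow's build step, whichever value was selected (the dispatched input or the one recovered via the artifact) can be passed through as the build argument; the variable and image names are placeholders:

```yaml
- name: Build dashboard image against the selected server tag
  run: |
    docker build \
      --build-arg SERVER_IMAGE_TAG="${DOCKER_IMAGE_TAG}" \
      -t op-admin-dashboard:${DOCKER_IMAGE_TAG} .
```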
The README.md can contain information on how to fetch this tag, similar to how we ask users to manually set their study-config, DB host, and server host info, for instance. |
wrt merging, I am fine with either approach
|
Build completely automated! No manual intervention required, not even by developers using the code locally. Referring to this comment:
I've gone ahead and implemented the automated build workflow with the addition of the .env file in the dashboard repos, which just stores the latest timestamp from the last successfully completed server image build. Thus, the build is completely automated now, and users / developers who want to run the code locally will not have to manually feed in the timestamp from the Docker Hub images. The .env file is updated and committed in the GitHub Actions workflow automatically, and the changes are pushed to the dashboard repo by the GitHub Actions bot (a sketch of this commit step follows this comment).
Links to successful runs
A. Triggered by workflow_dispatch from e-mission-server
Automated commits to update .env file:
B. Triggered by push to remote dashboard repositories
Automated commits to update .env file: |
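A minimal sketch of the auto-commit step described above; the .env contents and variable name are assumptions, and a checkout with a token that can push is assumed to have run earlier in the job:

```yaml
- name: Update .env with the latest server image timestamp
  run: echo "DOCKER_IMAGE_TAG=${DOCKER_IMAGE_TAG}" > .env

- name: Commit and push the updated .env as the Actions bot
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add .env
    # Only commit if the tag actually changed
    git diff --cached --quiet || git commit -m "Update .env with latest server image tag"
    git push
```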
I also tested another scenario: say a developer changed the timestamp in the .env file to test an older server image.
Thus expected workflow steps in this case would be:
Some outputs from my testing of this scenario, where I manually entered an older timestamp (2024-05-02--16-40) but the workflow automatically updated it to the latest timestamp (2024-05-03--14-37).
A. Public-dash
B. Admin-dash
|
Also added TODOs to change from my repository and branches to the master branch and the e-mission-server repo. |
@shankari on the internal repo:
Initial thoughts about a script:
* Pull the image tags from the external repos (GitHub API?)
Where to run?
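One possible shape for such a script, sketched below, is to read the committed .env tag from each external dashboard repo over the raw GitHub endpoint; the repo list, branch, and file location are assumptions:

```bash
#!/usr/bin/env bash
# Sketch: collect the current image tags from the external repos
# (assumes each repo's default branch has a .env containing DOCKER_IMAGE_TAG=<timestamp>)
set -e

for repo in e-mission/op-admin-dashboard e-mission/em-public-dashboard; do
    tag=$(curl -s "https://raw.githubusercontent.com/${repo}/master/.env" \
          | grep '^DOCKER_IMAGE_TAG=' | cut -d= -f2)
    echo "${repo} -> ${tag}"
done
```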
|
I can see that we have used This is bad. We use I have already commented on this before: I will not review or approve any further changes that use |
I got docker compose to work in actions for our process, but had to do it in a roundabout way.
Originally, I had planned to use an environment variable in my compose call
and then set the name of the image to ${ADMIN_DASH_IMAGE_TAG}. However, this does not seem ideal for people running locally. I found a way around this by adding a renaming step in the build process:
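The renaming step boils down to re-tagging whatever name the compose build produces before pushing; the compose file and image names below are placeholders for the ones in the linked runs:

```bash
# Build with the default image name from the compose file
docker compose -f docker-compose-prod.yml build

# Rename the built image to the name we actually publish, then push it
docker tag op-admin-dashboard-dashboard:latest \
  emission/op-admin-dashboard:${DOCKER_IMAGE_TAG}
docker push emission/op-admin-dashboard:${DOCKER_IMAGE_TAG}
```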
This way we can keep the names of the images the same and push them correctly. I tested the environment variable version here, and the renaming version here. Both worked! |
Merged all the related PRs; we can now create a new server build, have it generate cascading images, and then run a local script that pulls all the tags and builds internal images based on them. Yay! @MukuFlash03 @nataliejschultz However, this does not take backwards compatibility into account. The internal build process has not yet been changed to use the new method of building internal images. But to test this fully, we need to launch it using the internal build process. So we need to have a bridge that uses the new method of pulling an external image, but does so in a way that is compatible with the multiple repos that the current internal build process uses. Fixing that now... |
Backwards compat hack checked in. Let us now merge the changes to get the correct certs (i.e. e-mission/e-mission-server#976), run the script to verify that it works, and deploy to staging! Here's what we expect will happen:
let's see if it works. |
One of the GitHub access tokens has expired. Writing down the permissions needed for the record:
|
Spot checking some of the internal repos
We can consider fixing that if it persists (are we just adding a new line every time?) |
Declaring this closed for now; cleanup is in #1082 |
OpenPATH currently has four main server-side components:
The webapp and analysis containers are launched from e-mission-server; the others are in separate repos that build on e-mission-server.
There are also additional analysis-only repos (e-mission-eval-private-data and mobility-scripts) that build on e-mission-server but are never deployed directly to production.
In addition, there are internal versions of all the deployable containers that essentially configure them to meet the NREL hosting needs.
We want to unify our build and deploy processes such that: