RFC: Naming conventions for the documentation containers #3
Comments
In fact, using a single repository for multiple, loosely related images is not that uncommon: I personally like the idea of keeping as few repositories as possible and using tags to organize the different images, especially when they're for the same project or share a common nexus, like a series of examples or a single program with different but related capabilities. I would even go a step further and suggest something like
It looks like Note: just being nosy, like the uninvited wicked fairy of the tale; please disregard my comments if they are untimely. 😸
Hi @iesahin, what other docs besides Katacoda scenarios do you intend https://github.com/iterative/dvc-doc-containers to have containers for? Any specifics? Also, since this is about Docker container naming conventions, I'd keep it in that other repo. It's not really about documentation but about a Docker implementation detail. As for the naming questions, I think whoever has more experience publishing Docker images can better decide. The one thing you could take from dvc.org/doc is the ideal URL paths, which I would reshape into:
Ah, no, certainly not, this is the kind of comment I would like to have. Thank you very much.
In the end, what I would like to have is also as few repositories as possible, but I prefer to have one repository for one Number of repositories disturbs me too, but I would like to use this as motivation to reduce the number of different setups. We may need 10-12 containers in the beginning; after some time all data/code examples should be merged into 3-4. Actually, the reason for this endeavor is to merge all datasets and examples into as few setups as possible, but keeping these in tags would be like sweeping them under the rug. For example, we would like to have no
This is a good point. It seems better to use Thank you @0x2b3bfa0
Hi @jorgeorpinel. What I have in mind is to have an associated container for every page that has a code example. This doesn't mean all of them should use separate containers; what we actually need is to stabilize all the examples and datasets into as few moving parts as possible. Containers are a way to test this goal, not a goal in themselves. A side benefit may be to link these containers in the documentation so that users can test the commands right away, after a
You're right, but that repository is so new that I hesitated to invite comments there. Thank you.
Agreed! That's a good constraint for most of the usual containerized applications, though a repository with a sequential collection of loosely related examples could be a good fit for a single repository approach. In fact, the custom
What do you think about automating both building and publishing with GitHub Actions for continuous delivery? This would eliminate both the permission requirements and the possible human errors derived from manual pushes, similarly to the current
That's a good point. I lack enough context about the current needs to give any valuable suggestion in that sense, but, definitely, tags should not be used just as a patch to hide a still evolving project structure.
I wrote that script because I was too lazy to check the repository names (and they were changing), but, yes, we can put `katacoda:gs-versioning` into a `Dockertag` file and it will be fine.
We already need that, or a `cron` job somewhere, to push the images daily. DVC seems to have a new release every day, so the images become stale quickly. Other than CI, however, we'll probably update some images frequently. On second thought, though, frequent image updates are mostly needed for Katacoda, where I had to update the images to see the effects on the platform. Other images can be developed locally and pushed once they are done. So human error seems to be a minor issue to me now. Thank you. @0x2b3bfa0 Today, when I was checking dvc-checkpoints-mnist, I thought the branches could correspond to tags and we could have a single repository to run the example code. (We plan to base the new examples on top of MNIST and this repository.) So a hybrid approach may also be feasible. Something along the lines of:
If a page doesn't need a specialized container, it can use the `base` version; otherwise it can derive from it and add any specialized setup. Docker repository names can also reflect the URLs of dvc.org, like `doc-start` or `doc-command-reference`. It'll be easier to relate them to the documentation pages.
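The fallback rule described here could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the idea of one repository per documentation section with one tag per page, and the example tag names are hypothetical; only the `base` tag and the `doc-start` repository name come from the discussion.

```python
from typing import Collection


def tag_for_page(page: str, specialized_tags: Collection[str]) -> str:
    """Return the page's own tag if one was built, otherwise the shared 'base' tag."""
    return page if page in specialized_tags else "base"


def image_for_page(
    repository: str,
    page: str,
    specialized_tags: Collection[str],
    org: str = "dvcorg",
) -> str:
    """Build an image reference such as 'dvcorg/doc-start:base'.

    Assumption: `repository` mirrors a dvc.org section (e.g. 'doc-start') and
    `specialized_tags` lists the pages that needed extra setup on top of base.
    """
    return f"{org}/{repository}:{tag_for_page(page, specialized_tags)}"
```

With a hypothetical specialized tag `experiments`, `image_for_page("doc-start", "experiments", {"experiments"})` resolves to the specialized image, while any other page in the same section falls back to the `base` tag.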
I think it's better to have a naming scheme similar to the dvc.org documentation organization, with the repositories
as the repositories. Each repository will have a I'm transferring this issue to Thank you 🤝 @0x2b3bfa0 @jorgeorpinel @shcheklein
This is actually related to the Doc Containers Repository but here seems more appropriate to discuss:
I read the discussion in iterative/cml#217 and it may be worthwhile to discuss the naming convention for documentation containers.
@shcheklein and I discussed having a single (Docker) repository and maintaining containers for different Katacoda scenarios with tags. My points about having different containers for each scenario, in summary:

- `docker pull dvcorg/katacoda-gs-versioning` is cleaner than `docker pull dvcorg/katacoda:gs-versioning`. Repositories have easy-to-remember URLs.
- Different scenarios have different `Dockerfile`s. Although we can push `dvc-doc-containers/katacoda/get-started/01-initialize` to `dvcorg/katacoda:gs-initialize` and `.../06-experiments/` to `dvcorg/katacoda:gs-experiments`, this is usually not expected. It's a bit like `debian` and `ubuntu` sharing the same repository and differing only by tags.
- There is a `build-all.zsh` script to build and push, but manually pushing the containers becomes error-prone. It's very easy to push to the `latest` tag or to have a typo in the tag. Asking developers to build and push the containers using only the script seems unfeasible to me.

Anyhow, I can update the script to push to a single Docker repository if you prefer a single repo with multiple tags.
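To make the two options concrete, here is a minimal sketch that derives both image references from a scenario directory. It is not what `build-all.zsh` does; the function names and the abbreviation table are assumptions for illustration, and only the `gs` abbreviation for Get Started and the example paths come from this issue.

```python
import re
from pathlib import PurePosixPath

# Assumption: course directories are abbreviated via a lookup table.
# Only "get-started" -> "gs" is taken from the discussion.
COURSE_ABBREVIATIONS = {"get-started": "gs"}


def scenario_slug(scenario_dir: str) -> str:
    """Turn '.../get-started/01-initialize' into 'gs-initialize'."""
    parts = PurePosixPath(scenario_dir).parts
    course, scenario = parts[-2], parts[-1]
    # Drop the ordering prefix such as '01-' from the scenario directory name.
    scenario = re.sub(r"^\d+-", "", scenario)
    return f"{COURSE_ABBREVIATIONS.get(course, course)}-{scenario}"


def image_per_repository(scenario_dir: str, org: str = "dvcorg") -> str:
    """One Docker repository per scenario: 'dvcorg/katacoda-gs-initialize'."""
    return f"{org}/katacoda-{scenario_slug(scenario_dir)}"


def image_per_tag(scenario_dir: str, org: str = "dvcorg") -> str:
    """One shared repository, one tag per scenario: 'dvcorg/katacoda:gs-initialize'."""
    return f"{org}/katacoda:{scenario_slug(scenario_dir)}"
```

Deriving the name from the directory in one place would also avoid hand-typed tags, which is where the typo and accidental `latest` pushes mentioned above tend to come from.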
I haven't pushed the images to https://hub.docker.com/u/dvcorg/ yet. Currently, they all reside in https://hub.docker.com/u/emresult/

When pushed they will look like:

- `dvcorg/katacoda-base` (the base image, which installs DVC and other requirements; all Katacoda images derive from this image)
- `dvcorg/katacoda-gs-initialize` (`gs` = Get Started)

For tutorials in https://katacoda.com/dvc/courses/tutorials, naming will be like:
and for the examples in https://katacoda.com/dvc/courses/examples
For the containers which will run the examples in the documentation, I plan a naming convention similar to the URLs in the dvc.org site. So, a container that runs the examples in https://dvc.org/doc/start/ will be named `dvcorg/doc-start`, and a container that replays the commands in https://dvc.org/doc/use-cases/data-registries will be named `dvcorg/doc-use-cases-data-registries`.

Our goal is to keep the number of containers low by reusing them for all the documentation. So REF pages will share a single container in their examples and at the end, (hopefully) we'll have a few images that contain all the example code and data for the whole documentation.
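The URL-based convention can be expressed as a small mapping. The following is a sketch only; the function name `doc_image_name` is hypothetical, and the rule (join the URL path segments with hyphens) is inferred from the two examples given above.

```python
from urllib.parse import urlparse


def doc_image_name(url: str, org: str = "dvcorg") -> str:
    """Map a dvc.org documentation URL to a Docker repository name.

    'https://dvc.org/doc/start/' becomes 'dvcorg/doc-start'.
    """
    # Split the path into non-empty segments, ignoring any trailing slash.
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments or segments[0] != "doc":
        raise ValueError(f"not a documentation URL: {url}")
    return f"{org}/" + "-".join(segments)
```

Pages that share a container, such as the REF pages, would need a lookup on top of this mapping, since the rule produces one name per URL.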
I can update the naming convention to have a `dvc` prefix, like `dvcorg/dvc-katacoda-gs-versioning` or `dvcorg/dvc-doc-start`. CML containers have `cml-` as their prefix, so this may be clearer, or repeating `dvc` may be unnecessary.

Any comments, questions? Thank you.
@shcheklein @DavidGOrtega @jorgeorpinel @dberenbaum @dmpetrov
Related: iterative/dvc.org#2318
Related: iterative/dvc.org#2355