Implement support for job sandboxes #13

Open
5 of 9 tasks
chrisburr opened this issue Jun 27, 2023 · 0 comments

chrisburr commented Jun 27, 2023

The aim of this should be to be able to submit a job to DiracX with an input sandbox and be able to run it with the existing DIRAC infrastructure.

There are two parts to this, as uploaded sandboxes need to be available via both DISET and DiracX.

How do sandboxes currently work?

  • Files are uploaded with DISET and the path is SE:ProductionSandboxSE:/Sandbox/u/username.group/hash.tar.gz where ProductionSandboxSE comes from the CS (LocalSEName) and /Sandbox/ is hard coded.
  • Data is stored on the service's local disk.
  • Data is deduplicated for a given user+group using the file's MD5 hash.
  • Files are downloaded with DISET from local storage.
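As a rough illustration of the current scheme, the sketch below builds a legacy-style path from a tarball's MD5 digest and shows how a repeated upload by the same user and group lands on the same path. The function names and the in-memory `store` are hypothetical stand-ins, and it assumes the single-letter directory is the first letter of the username; none of this is the actual SandboxStoreHandler code.

```python
import hashlib

SE_NAME = "ProductionSandboxSE"  # in a real installation this comes from the CS (LocalSEName)
PREFIX = "/Sandbox"              # hard coded, per the description above


def legacy_sandbox_path(tarball: bytes, username: str, group: str) -> str:
    """Build a legacy-style sandbox path keyed on the tarball's MD5 digest."""
    digest = hashlib.md5(tarball).hexdigest()
    # Assumption: the one-letter directory is the first letter of the username.
    return f"SE:{SE_NAME}:{PREFIX}/{username[0]}/{username}.{group}/{digest}.tar.gz"


def upload(store: dict, tarball: bytes, username: str, group: str):
    """Store the tarball unless an identical one exists for this user+group.

    `store` is a hypothetical stand-in for the service's local disk.
    Returns (path, already_stored).
    """
    path = legacy_sandbox_path(tarball, username, group)
    already_stored = path in store
    if not already_stored:
        store[path] = tarball
    return path, already_stored
```

Because the user and group are part of the path, the same tarball uploaded under a different group is stored again rather than shared.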

How will sandboxes work in DiracX?

  • Data will be stored on S3-compatible storage (the default is MinIO from the Helm chart).
  • Sandbox paths will be SE:ProductionSandboxSE:/S3/u/username.group/hash.tar.gz
  • POST /jobs/sandbox returns a response containing the pre-signed URL to which the data is sent. The body of the POST request includes the SHA-256 of the payload and the payload size.
  • If the data already exists a JSON blob is returned containing the sandbox ID which should be included in the JDL.
  • If the data doesn't exist, a JSON blob including an S3 pre-signed URL is returned. This pre-signed URL should use the x-amz-content-sha256 header so that the storage verifies the hash of the sandbox, and it should limit the Content-Length.
  • Data is downloaded with GET /jobs/sandbox/SE:ProductionSandboxSE:/S3/u/username.group/hash.tar.gz, which responds with an HTTP temporary redirect to a pre-signed URL.
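The client side of the proposed upload flow could look roughly like the sketch below. It only prepares the request body and interprets the reply; no HTTP call is made. The JSON field names (`checksum_algorithm`, `checksum`, `size`, `pfn`, `url`) are assumptions for illustration, since the issue does not define the schema, and it assumes the client itself sends the headers that the pre-signed URL was signed with.

```python
import hashlib


def build_upload_request(tarball: bytes) -> dict:
    """Body for POST /jobs/sandbox: SHA-256 of the payload and its size.

    The field names are hypothetical; the issue does not specify them.
    """
    return {
        "checksum_algorithm": "sha256",
        "checksum": hashlib.sha256(tarball).hexdigest(),
        "size": len(tarball),
    }


def plan_upload(request: dict, response: dict) -> dict:
    """Decide the client's next step from the server's JSON reply."""
    if not response.get("url"):
        # The data already exists: only the sandbox ID comes back,
        # and it goes straight into the JDL.
        return {"action": "skip", "sandbox_id": response["pfn"]}
    # The data is new: send it to the pre-signed URL with the headers that
    # let the storage verify the hash and bound the upload size.
    return {
        "action": "upload",
        "url": response["url"],
        "headers": {
            "x-amz-content-sha256": request["checksum"],
            "Content-Length": str(request["size"]),
        },
    }
```

Keeping the decision in a pure function like `plan_upload` makes the two server outcomes easy to test without a running storage backend.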

Tasks to implement this in DiracX:

For assigning sandboxes to job IDs (implementation TBD):

  • Input sandboxes are handled by the JobSanity executor
  • Output sandboxes

Migration

The existing SandboxStoreHandler needs to be modified to:

  • Proxy data from S3 if the path starts with /S3/.
  • Support a new flag (UseS3Backend) that makes it upload to S3 instead of the local disk and return the appropriate path.
  • Clean up sandboxes on S3
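The routing decision during the migration can be sketched as follows: downloads are dispatched on the path prefix, while uploads are dispatched on the UseS3Backend flag. The helper names are hypothetical and this is not the real SandboxStoreHandler interface; it only shows the prefix and flag logic described above.

```python
S3_PREFIX = "/S3/"
LOCAL_PREFIX = "/Sandbox/"


def split_pfn(pfn: str):
    """Split 'SE:<se name>:<path>' into (se_name, path)."""
    parts = pfn.split(":", 2)
    if len(parts) != 3 or parts[0] != "SE":
        raise ValueError(f"Not a sandbox path: {pfn!r}")
    return parts[1], parts[2]


def download_backend(pfn: str) -> str:
    """Pick where to read from: proxy from S3 if the path starts with /S3/."""
    _, path = split_pfn(pfn)
    return "s3" if path.startswith(S3_PREFIX) else "local"


def upload_prefix(use_s3_backend: bool) -> str:
    """Pick the path prefix for new uploads based on the UseS3Backend flag."""
    return S3_PREFIX if use_s3_backend else LOCAL_PREFIX
```

Dispatching downloads on the path rather than on the flag means sandboxes uploaded before the flag was enabled stay readable from the local disk.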

Once all sandboxes exist on S3, the legacy adaptor mechanism can be used to make the SandboxStoreClient talk directly to DiracX. At this point the SandboxStoreHandler can be removed.

We don't need to expose old sandboxes via DiracX as we can rely on the UseDiracXBackend flag having been set for a while.
If an installation cares about keeping older sandboxes, a migration script can be created to move them from the local disk to S3.
