make preparelocal use S3 for tarball and unify with --dryrun #6544
Comments
should be enough to go in this order:
|
this relates with #7461 |
|
First I would like to explain how we move input files (wrapper, scripts, and user sandbox) around in CRAB. In normal job submission:
|
So, to unify This is why I said earlier that I want to deprecate the behavior, not remove the command. This is possible and easy to do on the client side. Thanks to Dario again for Basically, For the server side, well... a lot of code changes are needed, obviously the I will write the details down tomorrow. |
And of course thanks to Stefano for the original ideas on how to unify both commands and the attempt at improving the |
sounds like you know more than me about this matter now :-)
|
Here is the pointer to the code: This PR changes 3 things:
Because I separated sandbox from |
@belforte I have moved this task to Todo in "CRAB Work Planner". |
I looked at the code and have only some minor questions.
|
OK. Let's deploy the change to DagmanCreator ASAP so we can push new client in production. |
doing 4. I found a bug in #8740 which was only affecting the sandbox existence check in the new code. But I have also found that we still need |
let's go with a:
|
at the moment |
I decided to go in steps.
@aspiringmind-code @novicecpp I will gladly get your advice if you feel like suggesting something different |
time to look at
so much for a "smooth transition" :-) At least desperate users have a fall-back until we update TW. [1]
[2]
traceback
|
so currently OTOH if we deploy a client which works with current TW, nobody can complain ! So let's see if I can achieve that. |
New client requires to update REST. |
REST updated :-) |
I tried various things, but too much gets broken. Better break dryrun for Main problem is that the current DryRunUploader action creates as The best solution seems to me to modify In other words:
|
I believe I converged on implementation. Then I am going for:
DagmanCreator creates InputFiles.tar.gz and Uploader uploads it to S3.
DagmanCreator also creates splitter-summary.json, used for --dryrun, and adds it to InputFiles.tar.gz.
submit --dryrun is modified to use the new JSON way to pass arguments to CMSRunAnalysis.py; it gets simpler and more readable, and keeps working.
My current branch https://github.com/belforte/CRABServer/tree/make-dryrun-work-again should be all that's needed TW-side.
I will further simplify by removing the --jobid option for crab preparelocal. Users will have these possibilities:
|
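For readers following along, the packing step described above (InputFiles.tar.gz also carrying splitter-summary.json) could look roughly like the sketch below. This is not the actual DagmanCreator code: the function name `build_input_tarball`, its parameters, and the contents of the JSON summary are assumptions made up for illustration. Only the two file names come from this thread.

```python
import json
import os
import tarfile


def build_input_tarball(workdir, input_files, splitter_summary,
                        tarball_name="InputFiles.tar.gz",
                        summary_name="splitter-summary.json"):
    """Hypothetical helper: write the summary JSON, then pack everything.

    input_files is a list of paths to existing files (wrapper, scripts, ...).
    splitter_summary is any JSON-serializable object; its real schema is
    not described in this thread, so it is treated as opaque here.
    """
    # Write the summary next to the other files so it can be added to the tarball.
    summary_path = os.path.join(workdir, summary_name)
    with open(summary_path, "w", encoding="utf-8") as fd:
        json.dump(splitter_summary, fd)

    tarball_path = os.path.join(workdir, tarball_name)
    with tarfile.open(tarball_path, "w:gz") as tar:
        for path in list(input_files) + [summary_path]:
            # Store each file under its base name, so that unpacking
            # recreates a single flat directory.
            tar.add(path, arcname=os.path.basename(path))
    return tarball_path
```

The resulting single file is what would then be uploaded to S3 and later unpacked on the client side, so both preparelocal and --dryrun can start from the same content.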
time for a new, extended, compatibility matrix
Error legend: |
so maybe I do not need "my new" TW. But it is not acceptable that new client will fail to do Let's see if I can enhance https://github.com/belforte/CRABClient/tree/new-preparelocal-dryrun to solve [nt] |
[nt] failure is due to running This means that when we deploy this (after TW v3.241018 or later) there will be no problem in running I verified this on a recently submitted user task https://cmsweb.cern.ch/crabserver/ui/task/241106_185715%3Askeshri_crab_370355_Muon0 |
conclusion: #8767 will close this from TW side. |
And of course there's no point in making |
The only remaining issue is that the new client with the new TW prints an annoying message for The noise comes from CRABServer/src/python/ServerUtilities.py Line 269 in 483adb2
which in this case uses But why do we get HTTP 500 instead of a simple 'None' as webdir? Maybe something to be improved in REST? REST log has
Which comes from CRABServer/src/python/CRABInterface/RESTTask.py Lines 218 to 224 in 483adb2
Let's address this in an ad-hoc issue for the REST |
All OK now! Last bit of cosmetics could be to try the "new" way first in Well, we can simply remove the msg, but all in all trying the new way first is better and will make it clearer which code to remove later on. |
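A minimal sketch of the "try the new way first, then fall back" pattern mentioned above. The two callables are placeholders: this thread does not show what the new and old retrieval paths actually call, so `fetch_new`, `fetch_old`, and `fetch_with_fallback` are hypothetical names and not part of CRABClient.

```python
import logging

logger = logging.getLogger("preparelocal-sketch")


def fetch_with_fallback(fetch_new, fetch_old):
    """Try the new retrieval path first; use the old one only if it fails.

    fetch_new and fetch_old are hypothetical zero-argument callables that
    return a result or raise an exception. Returns (result, which_way).
    """
    try:
        return fetch_new(), "new"
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        # Log at debug level so the user does not see noise when the
        # fallback is expected (e.g. task submitted to an older TW).
        logger.debug("new way failed (%s), falling back to old way", exc)
    return fetch_old(), "old"
```

Keeping the old path confined to the fallback branch also makes it obvious which code can be deleted once every task in the system has been created by the new TW.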
test of the new client as per Client Pull Request PR 5347 after changing it to use "new way first"
[nowb] : |
and of course the status for TW v3.241018 is only for information during this development, we are not going to deploy that, we'll go straight for a new tag with the fix from #8767 |
closed via #8767 and dmwm/CRABClient#5347 |
Currently preparelocal relies on a tarball with everything needed to execute the job wrapper,
created inside DagmanCreator and called InputFiles.tar.gz, which is sent to the schedd; crab preparelocal
fetches it from there to create a local directory where the job can be run.
This tarball should be transferred via the S3 cache instead, and possibly with the same code as for --dryrun.
Even better, --dryrun should be executed inside the directory created by preparelocal, and should not be part of the submit command.
something like:
Also, currently in the schedd there are both InputFiles.tar.gz and input_files.tar.gz! Pretty damn confusing.
The difficulty is to have a way to implement a piece at a time, w/o breaking things.