Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v8.0] fix: AREX out and err need to exist before file integrity check #7588

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

aldbr
Copy link
Contributor

@aldbr aldbr commented Apr 22, 2024

The Solution introduced in #7549 does not work: RemoteRunner._checkOutputIntegrity() checks the integrity of the output, but the .out and .err files are removed by the AREXCE before being checked. Therefore, the output is considered corrupted.

Here we remove the .out and .err in the RemoteRunner after the integrity check, following the AREXCE format. I don't think it is a problem because:

  • the RemoteRunner works only with AREX/ARC CEs
  • the RemoteRunner is going to disappear at some point.

BEGINRELEASENOTES
*WorkloadManagement
FIX: AREX "out" and "err" need to exist before file integrity check
ENDRELEASENOTES

@DIRACGridBot DIRACGridBot added the alsoTargeting:integration Cherry pick this PR to integration after merge label Apr 22, 2024
@@ -898,10 +898,8 @@ def getJobOutput(self, jobID, workingDirectory=None):
if remoteOutput == f"{stamp}.out":
with open(localOutput) as f:
stdout = f.read()
os.unlink(localOutput)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is possible that we are not using the RemoteRunner with AREX, right? So who cleans up these files then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the PilotManager is also using <CE>.getJobOutput() and removes the WorkingDirectory (containing the outputs) at the end of the operation:

shutil.rmtree(ce.ceParameters["WorkingDirectory"])

I agree that this is not ideal, but IIUC, at some point, pilots will report their outputs by themselves and I guess that the getJobOutput method would only be used for such a use case(?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then there is no problem!

@fstagni fstagni merged commit e57a310 into DIRACGrid:rel-v8r0 Apr 22, 2024
26 checks passed
@DIRACGridBot DIRACGridBot added the sweep:done All sweeping actions have been done for this PR label Apr 22, 2024
DIRACGridBot pushed a commit to DIRACGridBot/DIRAC that referenced this pull request Apr 22, 2024
@DIRACGridBot
Copy link

Sweep summary

Sweep ran in https://github.com/DIRACGrid/DIRAC/actions/runs/8786397925

Successful:

  • integration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alsoTargeting:integration Cherry pick this PR to integration after merge sweep:done All sweeping actions have been done for this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants