3.6.8.29 #103
Merged
PalNilsson (Collaborator) commented on Sep 27, 2023
- Redirecting stdout/stderr to files for trace service curls
  - This could prevent thread deadlocks in the standard Python subprocess.communicate() function when there is an overwhelming amount of stdout/stderr. The subprocess.communicate() function is no longer used, which also means that the internal timeout capability in subprocess can no longer be used; it had to be reimplemented with a threading timer which sets the relevant error code if necessary
  - The 'last' output from curl is stored in trace_curl_last.stdout/stderr, and gets appended to trace_curl.stdout/stderr
  - The trace_curl_last.stdout/stderr files are searched for any curl errors (the curl command always returns exit code 0 even when there was an error, so the output has to be processed)
  - A failed rucio trace curl operation is now reported with job metrics
    - Example: rucioTraceError=N
  - Increased connection timeout from 20s to 100s to be in line with panda server curl operations (where we don't see any problems)
  - Related JIRA ticket: https://its.cern.ch/jira/browse/ATLASPANDA-835
- Reporting prmon read_bytes/total_input_size with job metrics ('readbyterate')
  - Information to be used for optimizing brokerage
  - Requested by J. Elmsheuser, R. Walker
- Extended usage of psutil
  - Job monitoring now uses psutil to discover the prmon pid
  - If psutil is not available (as is the case on e.g. marenostrum), the code falls back to the old ps command usage
- Added protection against expired job objects in job_monitor loop
  - Reported by W. Guan/Z. Yang
- Updated GitHub Action workflows
  - Unit tests and flake8 are now independent workflows
  - Moved to the latest flake8 version, 6.1.0, for flake8 verification
  - All tests are run for python versions 3.8, 3.9, 3.10 and 3.11
- Tested pilot running under python versions 3.9.18 and 3.11.5
  - Grid jobs are currently running under python version 3.9.14 but will soon switch to 3.9.18 to be in line with user tools (like rucio)
  - Python version 3.11.5 will be the default version on EL9
  - Requested by A. De Silva
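
For readers unfamiliar with the trace curl change, the following is a minimal sketch of the general technique described above, not the pilot's actual code. It assumes a hypothetical helper `run_with_file_output()`, reuses the `trace_curl` file names from the notes, and assumes curl failures can be spotted by the `curl: (` prefix that curl prints before its error messages. The process writes straight to files, a `threading.Timer` stands in for the subprocess timeout, and the 'last' output is appended to the cumulative files before being searched.

```python
import os
import subprocess
import sys
import tempfile
import threading


def run_with_file_output(cmd, workdir, timeout, name="trace_curl"):
    """Run cmd with stdout/stderr sent to files; hypothetical helper.

    Returns (exit_code, timed_out, curl_error_found).
    """
    last = {s: os.path.join(workdir, f"{name}_last.{s}") for s in ("stdout", "stderr")}
    timed_out = threading.Event()

    with open(last["stdout"], "w") as out, open(last["stderr"], "w") as err:
        # No pipes are involved, so there is no pipe buffer that can fill up.
        proc = subprocess.Popen(cmd, stdout=out, stderr=err)

        def on_timeout():
            # Only flag a timeout if the process is still running.
            if proc.poll() is None:
                timed_out.set()
                proc.kill()

        timer = threading.Timer(timeout, on_timeout)
        timer.start()
        try:
            exit_code = proc.wait()
        finally:
            timer.cancel()

    # Append the 'last' output to the cumulative files and collect it for scanning.
    text = ""
    for stream, path in last.items():
        with open(path) as src:
            content = src.read()
        with open(os.path.join(workdir, f"{name}.{stream}"), "a") as dst:
            dst.write(content)
        text += content

    # Assumption: an error is recognized by curl's "curl: (NN) ..." message prefix.
    return exit_code, timed_out.is_set(), "curl: (" in text


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_cmd = [sys.executable, "-c", "print('hello')"]
    print(run_with_file_output(demo_cmd, demo_dir, timeout=10))
```

In a real caller the `timed_out` flag would be translated into the relevant error code, and the error flag into the job metric.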
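
The psutil fallback mentioned above can be pictured roughly as follows. This is an illustrative sketch only: `find_pids()` and `parse_ps_output()` are hypothetical names, and matching on the command line is an assumption about how the prmon process might be identified. psutil is used when it can be imported, otherwise the output of the `ps` command is parsed.

```python
import subprocess

try:
    import psutil
except ImportError:  # e.g. on systems where psutil is not installed
    psutil = None


def parse_ps_output(text, name):
    """Extract PIDs from 'ps -eo pid,args' output whose command mentions name."""
    pids = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        # The header line is skipped because its first field is not a number.
        if len(parts) == 2 and parts[0].isdigit() and name in parts[1]:
            pids.append(int(parts[0]))
    return pids


def find_pids(name):
    """Return PIDs of processes whose command line mentions name."""
    if psutil is not None:
        pids = []
        for proc in psutil.process_iter(["pid", "cmdline"]):
            cmdline = proc.info["cmdline"] or []  # None if access was denied
            if any(name in arg for arg in cmdline):
                pids.append(proc.info["pid"])
        return pids

    # Fallback: ask the ps command and parse its text output.
    try:
        result = subprocess.run(["ps", "-eo", "pid,args"],
                                capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return []
    return parse_ps_output(result.stdout, name)


if __name__ == "__main__":
    print(find_pids("prmon"))
```

Keeping the ps parsing in its own function makes the fallback path testable on machines where psutil is installed.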
Initial Pilot 2 to Pilot 3 changes
3.3.4.118
…ded additional exception handling. Cleanup
…ary. Add read_bytes