-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grd failure #161
Grd failure #161
Conversation
* Address issue gwpy#126. Allow pyomicron to run from a frame cache without accessing dqsegdb. Add documentation for this * Do not merge files if they overlap "metric days" * Do not cross metric day boundaries. Co-authored-by: Joseph Areeda <[email protected]>
* Do not cross metric day boundaries. * add log file arg delete empty directories when done * Tweak remove empty dir removal * Tweak remove empty dir removal again * Merge day boundary (gwpy#146) * Address issue gwpy#126. Allow pyomicron to run from a frame cache without accessing dqsegdb. Add documentation for this * Do not merge files if they overlap "metric days" * Do not cross metric day boundaries. Co-authored-by: Joseph Areeda <[email protected]> * rebase agaist last approved PR * rebase against last approved PR * rebase against last approved PR again, fix flake8 * Fix a bug in remove empty directories. Co-authored-by: Joseph Areeda <[email protected]>
* Address issue gwpy#126. Allow pyomicron to run from a frame cache without accessing dqsegdb. Add documentation for this * Do not merge files if they overlap "metric days" * Do not cross metric day boundaries. Co-authored-by: Joseph Areeda <[email protected]>
* Do not cross metric day boundaries. * Merge day boundary (gwpy#146) * Address issue gwpy#126. Allow pyomicron to run from a frame cache without accessing dqsegdb. Add documentation for this * Do not merge files if they overlap "metric days" * Do not cross metric day boundaries. Co-authored-by: Joseph Areeda <[email protected]> * Check point merge (gwpy#147) * Do not cross metric day boundaries. * add log file arg delete empty directories when done * Tweak remove empty dir removal * Tweak remove empty dir removal again * Merge day boundary (gwpy#146) * Address issue gwpy#126. Allow pyomicron to run from a frame cache without accessing dqsegdb. Add documentation for this * Do not merge files if they overlap "metric days" * Do not cross metric day boundaries. Co-authored-by: Joseph Areeda <[email protected]> * rebase agaist last approved PR * rebase against last approved PR * rebase against last approved PR again, fix flake8 * Fix a bug in remove empty directories. Co-authored-by: Joseph Areeda <[email protected]> * Merge day boundary (gwpy#146) * Address issue gwpy#126. Allow pyomicron to run from a frame cache without accessing dqsegdb. Add documentation for this * Do not merge files if they overlap "metric days" * Do not cross metric day boundaries. Co-authored-by: Joseph Areeda <[email protected]> * minor doc changes * Fix a bug where an xml.gz file could get compressed again in merge-with-gaps * Implement a periodic vacate to address permanent D-state (uninterupptible wait) causing jobs to fail to complete * Always create a log file. If not specified put one in the output directory * Fix a problem with periodic vacate. * Up the periodic vacate time to 3 hrs * Found a job killing typo * Add time limits to post processing also * Don't save segments.txt file if no sgments founds because we don't know if it's an issue of not finding them or a valid not analyzable state. * disable periodic vacate to demo the problem. * Fix reported version in some utilities. Only update segments.txt if omicron is actually run. * Clarify relative imports. and add details to a few log messages * Resolve flake8 issues --------- Co-authored-by: Joseph Areeda <[email protected]>
tweak logging to better underst guardian channel usage-
omicron/cli/process.py
Outdated
logger.critical(f'configuration file does not exist: {config_file.absolute()}') | ||
exit(4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this using sys.exit
and not raising an exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, how about FileNotFoundError wit the same text?
if ((online and statechannel) or (statechannel and not stateflag) or (statechannel and args.no_segdb)) and \ | ||
dataduration <= 3600: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be easier to read (and maintain) by splitting over many lines
if ((online and statechannel) or (statechannel and not stateflag) or (statechannel and args.no_segdb)) and \ | |
dataduration <= 3600: | |
if dataduration <= 3600 and ( | |
(online and statechannel) | |
or (statechannel and not stateflag) | |
or (statechannel and args.no_segdb) | |
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But also, why 3600 seconds, why not 1800, or 7200?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this effectively forbid using a statevector when processing more than 1 hour of data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been a troublesome issue that I'm still searching for a solution.
The original problem was that the segment file was updated regardless of whether the online job succeeded. So when we had delays we didn't process anything. Now the problem is if we get too far behind, a way too common occurrence we get stuck reading hours of raw frame files until evicted with no checkpointing.
I'm guessing at a robust limit so perhaps the "solution" is to make it another command line argument.
from .. import io | ||
from .. import const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this change made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember exactly but that looks like how I'd respond to a flake8 complaint
…st processing jobs. Also implement that the "right" way instead of my hacky paren-child model. Verify we work with dqsegdb 1.2.1
Codecov Report
@@ Coverage Diff @@
## master #161 +/- ##
==========================================
+ Coverage 35.16% 35.23% +0.07%
==========================================
Files 30 30
Lines 2944 2941 -3
==========================================
+ Hits 1035 1036 +1
+ Misses 1909 1905 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 3 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Update default concurrency to 25 per group Verify we work with dqsegdb 1.2.1, after changes
Hello @areeda! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2023-10-12 14:48:43 UTC |
…s pyomicron as dependency
Limit merges to metric day boundaries
Don't use guardian channels from raw frames for more than 1 hour
get flake8 and pytests to pass