-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: increase the default timeout to help TMO load properly #112
Conversation
timeout = 10 | ||
EpicsSignalBase.set_defaults( | ||
timeout=timeout, | ||
connection_timeout=timeout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd recommend the auto monitor option as well:
And we have this in so many repos... probably time for a pcdsutils inclusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have it in a shared spot like pcdsutils, we can look at general env vars for tweakability:
PCDS_GUI_CA_TIMEOUT
(PCDSUTILS_GUI_CA_TIMEOUT
?) or something similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm 👍 on a pcdsutils
inclusion.
I can put on the auto_monitor
option but I'm actually not sure the purpose/result of that. Specifically, we'll be hitting ophyd any time any PV updates? That might be a lot of updates. Something to keep in mind is that the PyDM widgets are separately monitoring as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For background: auto_monitor
is a pyepics thing that sets up its subscription mechanism automatically: https://pyepics.github.io/pyepics/pv.html#pv-automonitor-label
I had found that it reduced the amount of timeouts I got from interacting with ophyd signals/devices via .get()
.put()
and .set()
in the above project. You'll also see it used in EpicsMotor, avoiding different-but-similar timeouts that Pete was seeing (https://github.com/bluesky/ophyd/blob/0e8acb3df3d17e0ab7aa2cc924831f4a0c580449/ophyd/epics_motor.py#L42-L64)
You're definitely right that it could affect performance. Perhaps it has no place in the general LUCID config at the moment, unless we continue to have timeout issues after this PR. That then leads to Robert's point as to "LUCID wants this EpicsSignal
configuration but las-dispersion-scan and btms-ui want a different configuration"... Which then leads me to think we shouldn't have this in pcdsutils 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to try it and see if it feels any different at all (and check cpu usage etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto_monitor=True
had the following affects on the TMO lucid:
- Slightly smaller cpu usage spike during startup
- Higher cpu usage in general after startup (4% -> 7% with no typhos open on psbuild-rhel7-03)
I'm going to err on the side of leaving it as-is because we haven't done a lot of typhos/lucid testing with auto_monitor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to say: the app was exactly as responsive as before RE button clicks etc., but keep in mind that a lot of the PyDM stuff sidesteps ophyd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the testing. I agree with your assessment 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A thought I had is that the timeout settings could be stuffed into an env variable, but then they might conflict across apps (maybe lightpath wants a different timeout than lucid? 🤷 )
This seems fine as a band-aid to me. My lucid dots seemed to fill in fine, but maybe I'm looking at the wrong thing.
In lightpath we also solved some of these issues by punting some of the requests to a run-later queue. I'm not sure if that pattern would be good here but it's a (more involved) thought
TMO was complaining about their typhos screens launched from lucid not being quite right. The enum value selection on some of the state devices would often come up with no options at all.
It looked similar to issues we saw in lightpath so I decided to boost the EpicsSignal timeouts for the application, like we did in lightpath. This does fix the issue and you can make the issue worse (more devices effected) if you drop the timeout to be arbitrarily low.
This probably shouldn't be the final form of this PR- I think timeout should be configurable, right? I'm not sure if it's best as a cli arg as part of the toolbar yaml or somewhere else. This is a reasonable band-aid though.
REF https://jira.slac.stanford.edu/browse/LCLSECSD-1659