Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(metrics): handle the case where the worker is already assigned to a thread #1171
fix(metrics): handle the case where the worker is already assigned to a thread #1171
Changes from 9 commits
941506e
ffebef3
1899b34
3a35c42
d225d58
d4bb687
93bada2
5f92d99
cb7dc2e
0fe81b4
4023b59
290d65c
32330b4
3eff85e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is the correct approach. What if the worker script takes longer than 200ms to start and then fails? Additionally, this will unnecessarily delay startup in most cases.
This line should probably be moved somewhere to
metrics.ReadyWorker
, right afterfrankenphp_handle_request()
was called.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I'm not sure why we need to wait for them to be fully ready other than for ensuring that tests can be run. This implementation ensures that we can continue and can be reasonably confident that the workers are executing and not immediately crashing. Whether the application is ready to serve or not is in the domain of the application through liveness and health checks.
In fact, in the current implementation, it is continuing before all threads are ready anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it will definitely wait for all threads to be ready via the
workersReadyWG
, which only reaches 0 after each thread has calledfrankenphp_handle_request()
. With this implementation we could reach a 'ready' state even if the worker script doesn't containfrankenphp_handle_request()
I guess it depends on when you would consider the worker thread to be 'ready', but why go with 'reasonable confidence' when you can have absolute confidence instead? (and be faster in most cases)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly don't remember what I ran into last night that specifically triggered this rabbit hole, other than turning on the watcher caused all kinds of problems with the metrics being output. TL;DR: I was just trying to simplify things to understand why it is causing issues, then running into other issues as I went, yak shaving my way through it all.
I ran into a number of minor issues that sometimes made it past the atomics, like restarting during shutdown. Atomics are terrible unless you are using CAS, and you are not guaranteed to get the latest value ... so I deleted them. From there, it was just figuring out what was going wrong with the
WorkersReadyWG
because as mentioned, the current architecture doesn't really make sense for them, but we don't want to run tests until workers are started.So, most of the refactoring is just in getting rid of atomics. This then uncovered new failing tests 100% of the time instead of randomly, particularly when restarting workers... and so on.
After all this, I still don't know why watchers specifically caused my issues 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it.
I don't think I'll merge this PR, but continue working on a PR for your PR. This metric is fubar for now, but I think it is worth focusing on the work there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I originally intended my refactor PR to just be a small step towards more control over PHP threads. But then I realized it could potentially also fix Fibers and it kinda ended up doing more than I wanted it to.
Problem with these big PRs is that they kind of block other PRs (like this one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should take smaller steps in our PR. Unfortunately, I never did figure out what was broken here, short of refactoring the entire worker stuff. :(
It might be worth working on an actual design in a Discussion? That way we'd be able to work on different parts independently and break it down into smaller PRs. Maybe we could take a 'strangler fig' approach and reimplement it separate from the current code. So, we could just use build tags to switch between worker implementations (at least while we are working on it), and people could test it more easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1175