-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use TimerTask in the 3scale batcher policy #786
Conversation
local function ensure_report_timer_on(self, service_id, backend) | ||
local check_timer = self.semaphore_report_timer:wait(0) | ||
local function ensure_timer_task_created(self, service_id, backend) | ||
local check_timer_task = self.semaphore_report_timer:wait(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this to .new
? This policy should not be initialized in init
phase, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't because we don't have access to the service ID in .new()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Maybe better to leave it for later, butI think we should think about changing the data structure to remove the service_id dependency. It should be as simple as create timer in new that gets reports from reports batcher. There should be no need for service_id because reports batcher instance is specific to this policy and timer, so it will ever contain only reports from this service. I see the reports batcher actually uses that service_id to do a lock on shmem. Hopefully we can solve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. That part can be simplified a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we generate some UUID instead of using service id ? Then every policy would take care just of its own data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to an idea that we discussed some time ago.
Right now, pending reports are stored in a shared dictionary and every instance of the policy creates a timer (per worker) to report all the pending reports for its service ID.
After introducing TimerTask
I see it more clearly that it might be better to adopt a different strategy. We could store the pending reports in a table of the instance, as we know that there'll only be one instance of the policy for a given service ID per Apicast worker. Pros: no locks, simpler code. Cons: possibility of losing reports if a worker dies, which in practice I don't think it'll be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave this for a future PR.
In this one, I'd like to focus on switching to using TimerTask
to be able to cancel timers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, definitely 👍
11ccd3d
to
511cce3
Compare
511cce3
to
bc4c74a
Compare
0b1cddd
to
74ba6b1
Compare
local usage = context.usage | ||
local service = context.service | ||
local service_id = service.id | ||
self.service_id = service_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it is being assigned to self ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this is a leftover from some test. Same with self.backend
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
function _M:schedule_one(delay) | ||
-- Need to wrap the task in a function that discards the first param (the | ||
-- "premature" one sent by ngx.timer.at) | ||
ngx.timer.at(delay or 0, function(_, ...) self.task(...) end, unpack(self.args)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to allocate a function for each call. Is that necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does no longer apply. This method has been replaced.
|
||
-- Can't check all the arguments of ngx.timer.at because it calls an | ||
-- private function but at least we can check the interval (first arg) | ||
assert.equals(interval, ngx_timer_stub.calls[1].vals[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can change this to use matchers and verify it was called with some function (matcher.function
) and exact second parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the function is private, so I don't think we can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matcher can match the type like:
assert.spy(ngx_timer_stub).was_called_with(matcher.is_function(), 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm familiar with matchers, but I misunderstood you.
I thought that you wanted to check that ngx_timer_stub was called with a specific function and I was saying that it was not possible because it's a private one. Now I see that you suggested just to check that it's a function no matter which one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a check for this 👍
|
||
if self.timer_task then | ||
self.timer_task:schedule_one(1) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we will rely on cancelling the timer by __gc callbacks ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works, but let's cancel the timer_task here to be explicit about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 👍
-- run before that so we do not leave any pending reports. | ||
|
||
if self.timer_task then | ||
self.timer_task:schedule_one(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why delay 1 and not 0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
From the point of view of the policy, I think it does not matter. However, busted crashes with a segfault when this is 0. Not sure why exactly. It might be related to how busted collects the garbage after all the tests. It does not crash with a delay of 0.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then this is definitely worth a comment :-D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end I changed this and implemented it as we discussed off-line: ba883c1
f3d949e
to
5398758
Compare
Unfortunately, the specs were mocking ReportsBatcher and that was hiding a bug. ReportsBatcher:add() was not being called with the correct parameters.
…er to real usage Before, the tests set a high reporting interval so a report was not triggered in the middle of the test. The tests relied on calling a specific endpoint to force a report to backend. Now, instead of doing that, we let the policy to run report jobs as it would normally do, we aggregate the data we receive in the reporting endpoint, and at the end of the test, we verify it. This approach is closer to a real usage of the policy.
5398758
to
b04d254
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR replaces the usages of
ngx.timer.every
withTimerTask
instances.This solves the issue of not being able to cancel timers. Recurrent tasks created using
TimerTask
can be canceled, but we cannot cancel timers created withngx.timer.every
. This was causing a bug, Timer would continue running even after a config reload.