Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frontend acknowledge backfill #1293

Merged
merged 24 commits into from
Dec 28, 2020

Conversation

aLekSer
Copy link
Collaborator

@aLekSer aLekSer commented Nov 25, 2020

Add AcknowledgeBackfill() request handler.

What this PR does / Why we need it:
Create AssignmentGroup for all associated tickets and use Redis stateStore, not backend to assign all tickets.

Which issue(s) this PR fixes:
Work on #1240

Special notes for your reviewer:
Contains part of #1279 as well as #1288 , so check only latest commit when viewing files changes. Namely:
9b5cfde
AcknowledgeBackfill do not update Generation, unlike other UpdateBackfill methods (from GS and Frontend).

Store Backfills in a sorted set to retrieve all expired backfills by time.
It has the same input as output of GetExpiredBackfillIDs function.
Some backfillIDs can be acknowledged in between,But this should not affect the work of cleanup service.
Used CreateBackfill implementation from another PR.
@google-cla google-cla bot added the cla: yes label Nov 25, 2020
@aLekSer
Copy link
Collaborator Author

aLekSer commented Nov 26, 2020

CreateBackfill() should also query AcknowledgeBackfill in Redis, just to set initial initial timestamp.
Frontend AcknowledgeBackfill request, should first check if Backfill exists in Redis, otherwise AcknowledgeBackfill can create a new record in sorted set for non-existent Backfill.

@aLekSer aLekSer force-pushed the frontend/acknowledge-backfill branch from c9c664e to ec34b3b Compare December 8, 2020 19:22
@aLekSer aLekSer marked this pull request as ready for review December 8, 2020 20:58
panic(err)
}
return result
actual, err := om.Frontend().GetBackfill(ctx, &pb.GetBackfillRequest{BackfillId: bfID})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you test GetBackfill here?
The name of the function is createMatchWithBackfill and you are passing backfill as a parameter.
I think if you want to test GetBackfill you should do it out of this method in order to not overload one with a lot of code. createMatchWithBackfill does more than a simple create match step here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a refactor of @yeukovichd test. Now it is used in 3 place, so that we can associate ticket with backfill using a match proposal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more additional check is not bad. So I will not remove this line.

@aLekSer aLekSer force-pushed the frontend/acknowledge-backfill branch 2 times, most recently from c93f0b1 to 151e7a8 Compare December 9, 2020 11:16
test fixes.
@aLekSer aLekSer force-pushed the frontend/acknowledge-backfill branch from 151e7a8 to f43894c Compare December 9, 2020 11:50
if req.GetBackfillId() == "" {
return nil, status.Errorf(codes.InvalidArgument, ".BackfillId is required")
}
if req.GetAssignment().Connection == "" {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Assignment is nil you will fail here with invalid memory address or nil pointer dereference error


ticketID := createMatchWithBackfill(ctx, om, createdBf, t)
conn := "127.0.0.1:4242"
getBF, err := om.Frontend().AcknowledgeBackfill(ctx, &pb.AcknowledgeBackfillRequest{BackfillId: createdBf.Id, Assignment: &pb.Assignment{Connection: conn, Extensions: map[string]*any.Any{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests with parameters validation are missing. seems you are testing only a happy path.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I think parameters validation can form a new test - frontend_service_test.go

@aLekSer aLekSer force-pushed the frontend/acknowledge-backfill branch from 051c741 to 7b4f7db Compare December 9, 2020 16:27
@@ -132,20 +131,39 @@ func TestBackfillFrontendLifecycle(t *testing.T) {
require.Nil(t, get)
}

func TestProposedBackfillCreate(t *testing.T) {
ctx := context.Background()
func TestAcknowledgeBackfill(t *testing.T) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this test does not cover AcknowledgeBackfill at all, otherwise, you would notice that you do not call the Acknowledge backfill method from a statestore. This test should be completely rewritten.

Copy link
Collaborator Author

@aLekSer aLekSer Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is E2E test, so it is hard to test this.
I know that there is an additional Redis state change - this would be covered in a Unit test.
It covers one of main effects of acknowledgment - all tickets get assigned.

Copy link

@akremsa akremsa Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see why you can not cover AcknowledgeBackfill step in e2e, you still can do that in unit test to check the redis state, that is what I meant. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added UT.

@aLekSer
Copy link
Collaborator Author

aLekSer commented Dec 9, 2020

@akremsa Addressed your comments. Tests which are flaky on CI are TestProposedBackfillCreate and TestProposedBackfillUpdate

@aLekSer aLekSer force-pushed the frontend/acknowledge-backfill branch from 9f7440c to c0b35d8 Compare December 9, 2020 23:17
When FetchMatches takes more than 200 ms pendingReleaseTimeout Ticket
got expired and no Tickets can be queried thereafter.
if req.GetAssignment() == nil {
return nil, status.Errorf(codes.InvalidArgument, ".Assignment is required")
}
if req.GetAssignment().Connection == "" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connection is not a required field on assignments.

internal/app/frontend/frontend_service.go Show resolved Hide resolved
return nil, err
}
}
return bf, nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the tickets are assigned, they need to be removed from the associated ticket list? Otherwise they'll be assigned every acknowledge call?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated: now removing only assigned tickets from Associated tickets list.

Copy link
Collaborator Author

@aLekSer aLekSer Dec 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And inside the mutex section

Add Mutex lock, UpdateBackfill accordingly after UpdateAssignments call.
@aLekSer
Copy link
Collaborator Author

aLekSer commented Dec 11, 2020

All comments are resolved

@thisisnotapril
Copy link
Collaborator

@HazWard @calebatwd would love to have y'all review

Copy link

@davidedmonds davidedmonds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good from over here - seems all the comments have been addressed as well.

return nil, status.Errorf(codes.InvalidArgument, ".Assignment is required")
}

m := s.store.NewMutex(req.GetBackfillId())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am curious what this mutex is protecting? I assume it protects the backfill structure itself from getting updated when we are doing this. However, I see that the UpdateAssignments call is independent of the backfill structure - so you could potentially have another parallel UpdateAssignments stomp on the tickets when this method is doing so too. Is that OK?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that on line 406 we execute UpdateBackfill and Backfill internal protobuf format includes also ticketed associated with this Backfill. So this lock is necessary here.
https://github.com/googleforgames/open-match/pull/1293/files#diff-fbf6a8aad66158048250071262ad0c5fd8c08c9034840c5802784888408881f3R406

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree that the lock is necessary here prevent the backfill from being modified from concurrent calls. I am not sure if it is sufficient. Can you explain how it protects the UpdateAssignments call from modifying the tickets while they are being updated from the AcknowledgeBackfill call?

Copy link
Collaborator Author

@aLekSer aLekSer Dec 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tickets are immutable, so we don't need to keep track of their state. They can only be assigned, but searchFields could not be updated, this is the main difference with Backfill which has generation (version) number.
That's why no changes to that end is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not mention updating the searchFields at all.

TicketA is a part of BackfillA. You receive a call to AcknowledgeBackfill for BackfillA - and are updating the assignment on the TicketA by making UpdateAssignments call - Line 389.

Concurrently, the Backend receives AssignTickets method setting Assignment for TicketA to a different assignment.

Is there anything that prevents the AssignTickets method from calling UpdateAssignments call on TicketA concurrently with AcknowledgeBackfill calliong UpdateAssignments for TicketA?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is not possible to have one ticket being associated to two backfills or used in a normal match. Evaluator would not make this happen. Only one match would be selected with a ticketA and later on it would be moved into pendingRelease tickets not visible by QueryTickets later on. We are working on scale tests it will let us know if concurrent issues like this would arise.

return nil, err
}

// all assigned tickets should be removed from the associatedTickets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For my understanding)
I see us select all tickets on the backfill in associatedTickets (from call to s.store.GetBackfill), We then assign all those tickets the assignment in s.store.UpdateAssignments - which returns us tickets. We later update the backfill with the delta between associatedTickets and this 'tickets' which are essentially unassigned.

Why does this 'unassigned tickets' exist? didnt we intend to update assignment on all the tickets when calling the UpdateAssignment call?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that UpdateAssignment can return not the full list and some tickets are left unassigned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code in UpdateAssignment, any ticket left unassigned is because it was not found. (See: AssignmentFailure_TICKET_NOT_FOUND). So currently, any ticket that is returned as not found from UpdateAssignment is still left on the Backfill? We should atleast log the tickets that we leave unassigned here as UpdateAssignment returns a list of errors along with a list of assigned tickets (currently we ignore the errors list)

Copy link
Collaborator Author

@aLekSer aLekSer Dec 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added logging of the errors on updateAssignment request.
We decided to remove all tickets after UpdateAssignment() call, since those unassignedTickets are not found.
We can add this unassigned tickets back in a separate PR if a need arise.

internal/app/frontend/frontend_service.go Show resolved Hide resolved
@aLekSer
Copy link
Collaborator Author

aLekSer commented Dec 23, 2020

Actually I have chosen probably not the best name for statestore function AcknowledgeBackfill(). I will rename statestore function to UpdateAcknowledgmentTimestamp().

New function name seems more reasonable as it do only Redis timestamp
updates, updateAcknowledgmentTimestamp.
@aLekSer aLekSer requested a review from sawagh December 23, 2020 14:07
@aLekSer
Copy link
Collaborator Author

aLekSer commented Dec 23, 2020

Simplified the e2e/backfill_test.go review process, refactoring of the same code which is part of createMatchWithBackfill() I decided to move out of this PR.

@aLekSer
Copy link
Collaborator Author

aLekSer commented Dec 23, 2020

@sawagh I have applied all your comments. PTAL

We need to deindex tickets and add error logging for all NotFound
tickets.
@aLekSer
Copy link
Collaborator Author

aLekSer commented Dec 24, 2020

@sawagh I have added error logging with Test case containing one of two tickets got deleted before the AcknowledgeBackfill call.

Copy link
Contributor

@sawagh sawagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we dont resolve the concurrency comment, I am ok checking this in and tracking that in a separate issue.

@aLekSer aLekSer merged commit defac90 into googleforgames:master Dec 28, 2020
@syntxerror syntxerror added this to the v1.2.0 milestone Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants