-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[api] Implement replace endpoint #1162
Conversation
Adds an endpoint for replacing an instance across all services.
Codecov Report
@@ Coverage Diff @@
## master #1162 +/- ##
========================================
- Coverage 71.5% 71.5% -0.1%
========================================
Files 726 727 +1
Lines 60839 60905 +66
========================================
+ Hits 43540 43566 +26
- Misses 14523 14547 +24
- Partials 2776 2792 +16
Continue to review full report at Codecov.
|
acb76a8
to
fda5c4b
Compare
fda5c4b
to
e132f84
Compare
@@ -126,16 +126,31 @@ After sending the add command you will need to wait for the M3DB cluster to reac | |||
|
|||
#### Removing a Node | |||
|
|||
Send a DELETE request to the `/api/v1/placement/<NODE_ID>` endpoint. | |||
Send a DELETE request to the `/api/v1/services/m3db/placement/<NODE_ID>` endpoint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice thanks, I think the legacy stuff still works but good to update the docs to the new stuff
|
||
```bash | ||
curl -X POST <M3_COORDINATOR_HOST_NAME>:<M3_COORDINATOR_PORT(default 7201)>/api/v1/services/m3db/placement/replace -d '{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this all work out correctly with various numbers of inputs and outputs? I.E if I want to replace one node with two, or remove two and add one etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should so long as that algorithm considers that change valid, since we don't do much rather than be a glorified wrapper around that. @cw9 can chime in on the algo stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine, the algo basically calls the add()
and then call the remove()
on non sharded placements
@@ -154,7 +154,7 @@ func TestPlacementAddHandler_SafeOK(t *testing.T) { | |||
) | |||
switch serviceName { | |||
case M3AggregatorServiceName: | |||
req = httptest.NewRequest(AddHTTPMethod, M3DBAddURL, strings.NewReader(`{"instances":[{"id": "host1","isolation_group": "rack1","zone": "test","weight": 1,"endpoint": "http://host1:1234","hostname": "host1","port": 1234}]}`)) | |||
req = httptest.NewRequest(AddHTTPMethod, M3AggAddURL, strings.NewReader(`{"instances":[{"id": "host1","isolation_group": "rack1","zone": "test","weight": 1,"endpoint": "http://host1:1234","hostname": "host1","port": 1234}]}`)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 thanks
} | ||
} | ||
|
||
newPlacement, err := algo.ReplaceInstances(curPlacement, req.LeavingInstanceIDs, candidates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment about why you use the algo / check and set stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
return nil, err | ||
} | ||
|
||
return newPlacement.SetVersion(version + 1), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little annoying that CheckAndSet doesn't return the new version, but this is guaranteed to be correct right (version +1)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It currently is but admittedly that's tied to the etcd implementation I think. The API CheckAndSet uses under the hood propagates the version but we drop that when the placement service does CheckAndSet.
We could (and probably should) refactor that but it would likely require updating a bunch of interface implementations. I can open an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, you could just do a get placement operation to avoid this possibly (i.e. just read what you wrote)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey can you return perhaps just:
p, _, err := service.Placement()
return p, err
Just so before we have some wrapper that does this for you in a way that can be reused, you at least just read back what you wrote?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That read after CAS breaks the atomicity, the current implementation is the only way to get atomicity in placement change with validation, a more proper solution is this
tags: | ||
- "M3DB Placement" | ||
- "M3DB" | ||
summary: "Replace an M3DBNode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we concatenate M3DB and Node together in the other summaries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, will fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other M3DB ones:
summary: "Add an instance to the placement"
Maybe "Replace an instance in the placement"? shrug
|
||
var ( | ||
// M3DBReplaceURL is the url for the m3db replace handler (method POST). | ||
M3DBReplaceURL = path.Join(handler.RoutePrefixV1, M3DBServicePlacementPathName, replacePathName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You wrote documentation for m3db/m3coordiantor/m3agg but only created a route for M3DB. Wanna create the other routes? Also we should get @cw9 to confirm there are no other special changes we need to make to support M3Agg replaces (when I did the m3agg stuff I looked at the r2admin code to see if there was anything special but Chao can probably just answer this question quickly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might need to change ServiceWithAlgo
function to make sure it's using the same logic with r2admin, in there we had functions like GetOptionsForReplace
, Richies work probably only moved GetOptions
over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cw9 do we need more logic past what's here?
case M3AggregatorServiceName: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM w nits
@@ -64,10 +64,10 @@ The instructions below all contain sample curl commands, but you can always revi | |||
|
|||
#### Placement Initialization | |||
|
|||
Send a POST request to the `/api/v1/placement/init` endpoint | |||
Send a POST request to the `/api/v1/services/m3db/placement/init` endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think right now it's a little ambiguous as to when to use /api/v1/...
and when to use others like /api/v1/services/m3db/...
. There are other mentions of /api/v1/placement
like in docs/how_to
(Github docs) and in our docker integration tests. It would be great to get some clarification on when to use which, but perhaps that's for a larger discussion in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinjc It should always be the /services/ ones, we just kept the old names in to not make a breaking change. Anywhere we have a URL without the service name should just be updated to include it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than minor comment about perhaps just returning latest placement instead of manual incrementing the version?
Adds an endpoint for replacing an instance across all services.
Closes #1095