Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: migrate grpc-gateway from v1.16.0 to v2.7.0 #16049

Closed

Conversation

fuweid
Copy link
Member

@fuweid fuweid commented Jun 11, 2023

The PR is trying to migrate the api package from gogo to
google.golang.org/protobuf because the grpc-gateway runtime has been upgraded
to use protobuf-go v2 API. It's impossible to use gogo with new version
grpc-gateway and we can't re-generate all the proto codes by protobuf-go
because the raftpb uses gogoproto.nullable to use non-pointer for message
fileds, which protobuf-go doesn't support (golang/protobuf@1225). Fortunately,
both api/etcdserverpb/rpc.proto and
server/etcdserver/api/v3electionpb/v3election.proto doesn't use the
gogoproto.nullable features to message filed. For the migration, we can
just re-generate code for the following the protos as the approach:

  • api/**/*.proto
  • server/lease/leasepb/*.proto
  • server/etcdserver/api/snap/snappb/*.proto
  • server/etcdserver/api/v3election/v3electionpb/*.proto
  • server/etcdserver/api/v3lock/v3lockpb/*.proto

Highlight the change for proto:

  • Use protoc-gen-go-vtproto to generate Marshal/Unmarshal/Size code
    for gogoproto.{marshaler,unmarshaler,size,}_all interface, instead of
    using proto package;

  • There is no way to support gogoproto.goproto_enum_prefix_all. So in
    this patch, I add *.pb.ext.go to add the const alias without prefix
    manually. Maybe we can build a proto binary as generator in the
    future;

  • Both api/etcdserverpb/etcdserver.proto and server/etcdserver/api/snap/snappb/snap.proto
    use gogoproto.nullable = true for builtin type. In this patch, I upgrade
    the proto syntax from 2 to 3 with optional features. We can re-generate
    the optional fileds as pointer by --experimental_allow_proto3_optional.
    Since it's syntax upgrade, the marshal methods is also changed. For instance,
    in proto3 syntax, if the bool is empty, the marshaler won't encode it 2.
    It's compatible.

Highlight the change for grpc-gateway:

  • The grpc-gateway/third_party/googleapis has been removed. We need to
    download the https://github.com/googleapis/googleapis.git to get the
    target proto.

  • The protobuf-go won't generate json tag for oneOf filed 3. We have
    to use sed in scripts/genproto.sh to append the tag before
    upstream accepts the change (golang/protobuf@140).

  • Both Hash and HashKV shares the one URL /v3/maintenance/hash.
    New grpc-gateway openapi can detect the error. So in this patch, we
    should use /v3/maintenance/hashkv for HashKV.

  • New grpc-gateway always renders default value in json payload. In this
    patch, we use set the EmitUnpopulated: false.

  • The error message of grpc-gateway has been changed. In v1.16.0, the
    error message is rendered by 6. In v2.7.0, the error string has
    been removed 7.

Signed-off-by: Wei Fu [email protected]

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@serathius
Copy link
Member

I would be on side for total removal of grpc-gateway for v3.6. It's worthless to have API without proper support, client or any tests, when there is a identical API with all of those things. It was a good idea to provide a JSON API for a transition between JSON V2 API to GRPC V3 API. However, now there is no reason to keep it.

@fuweid
Copy link
Member Author

fuweid commented Jun 12, 2023

I would be on side for total removal of grpc-gateway for v3.6

+1. It seems reasonable to provide the protobuf file only and anyone can use the proto to generate the gw pacakge and run it as standalone process. So etcd doesn't need to support cmux any more after that.

@ahrtr
Copy link
Member

ahrtr commented Jun 12, 2023

Removing grpc-gateway

Good side

It can greatly simplify the code base.

Also It has no any impact on K8s either (not a good side, just to explicitly mention it).

Bad side

I am not sure how many users are using the RESTful API (HTTP + JSON) provided by etcd. Also note that not all languages are well supported by gRPC.

See also grpc-gateway#background

gRPC is great -- it generates API clients and server stubs in many programming languages, 
it is fast, easy-to-use, bandwidth-efficient and its design is combat-proven by Google. However, 
you might still want to provide a traditional RESTful JSON API as well. Reasons can range from 
maintaining backward-compatibility, supporting languages or clients that are not well supported 
by gRPC, to simply maintaining the aesthetics and tooling involved with a RESTful JSON architecture.

This project aims to provide that HTTP+JSON interface to your gRPC service. A small amount of 
configuration in your service to attach HTTP semantics is all that's needed to generate a reverse-proxy 
with this library.

cc @mitake @ptabor @spzala for opinions.
Also cc @liggitt @dims

@serathius
Copy link
Member

serathius commented Jun 12, 2023

See also grpc-gateway#background

This is a statement by the grpc-gateway project to justify itself from 5 years ago. I would not want to make our decisions on project self build assumptions of other projects. I don't think those assumptions hold for etcd as of 2023. I think there is a benefit to be gained from streamlining the API and removing subpar options.

As a midground I would propose to make grpc-gateway deprecated and disabled by default for v3.6 with optional flag to re-enable. We would see how many users complain, and use it to decide whether we should totally remove grpc-gateway support or we should move it to separate project that runs as a sidecar to etcdserver. No matter the usage of grpc-gateway I would still want to remove it from etcdserver.

@ahrtr
Copy link
Member

ahrtr commented Jun 12, 2023

separate project is also a potential option. In that case, the users (no such usage data for now), who are using the RESTful API, have an alternative. K8s will also be happy because #14499 can be resolved.

@dims
Copy link
Contributor

dims commented Jun 12, 2023

make grpc-gateway deprecated and disabled by default for v3.6 with optional flag to re-enable +100 (would also add a time period of say 6 months or number of releases to get rid of the code)

@jberkus
Copy link

jberkus commented Jun 12, 2023

I am not sure how many users are using the RESTful API (HTTP + JSON) provided by etcd. Also note that not all languages are well supported by gRPC.

Adding to user survey draft in process.

The PR is trying to migrate the api package from gogo to
google.golang.org/protobuf because the grpc-gateway runtime has been upgraded
to use protobuf-go v2 API. It's impossible to use gogo with new version
grpc-gateway and we can't re-generate all the proto codes by protobuf-go
because the raftpb uses gogoproto.nullable to use non-pointer for message
fileds, which protobuf-go doesn't support (golang/protobuf@1225). Fortunately,
both `api/etcdserverpb/rpc.proto` and
`server/etcdserver/api/v3electionpb/v3election.proto` doesn't use the
`gogoproto.nullable` features to message filed. For the migration, we can
just re-generate code for the following the protos as the approach:

* api/**/*.proto
* server/lease/leasepb/*.proto
* server/etcdserver/api/snap/snappb/*.proto
* server/etcdserver/api/v3election/v3electionpb/*.proto
* server/etcdserver/api/v3lock/v3lockpb/*.proto

Highlight the change for proto:

* Use `protoc-gen-go-vtproto` to generate Marshal/Unmarshal/Size code
  for `gogoproto.{marshaler,unmarshaler,size,}_all` interface, instead of
  using proto package;

* There is no way to support `gogoproto.goproto_enum_prefix_all`. So in
  this patch, I add `*.pb.ext.go` to add the const alias without prefix
  manually. Maybe we can build a proto binary as generator in the
  future;

* Both `api/etcdserverpb/etcdserver.proto` and `server/etcdserver/api/snap/snappb/snap.proto`
  use `gogoproto.nullable = true` for builtin type. In this patch, I upgrade
  the proto syntax from `2` to `3` with `optional` features. We can re-generate
  the optional fileds as pointer by `--experimental_allow_proto3_optional`.
  Since it's syntax upgrade, the marshal methods is also changed. For instance,
  in proto3 syntax, if the bool is empty, the marshaler won't encode it [2].
  It's compatible.

Highlight the change for grpc-gateway:

* The grpc-gateway/third_party/googleapis has been removed. We need to
  download the https://github.com/googleapis/googleapis.git to get the
  target proto.

* The protobuf-go won't generate json tag for `oneOf` filed [3]. We have
  to use `sed` in `scripts/genproto.sh` to append the `tag` before
  upstream accepts the change (golang/protobuf@140).

* Both `Hash` and `HashKV` shares the one URL `/v3/maintenance/hash`.
  New grpc-gateway openapi can detect the error. So in this patch, we
  should use `/v3/maintenance/hashkv` for `HashKV`.

* New grpc-gateway always renders default value in json payload. In this
  patch, we use set the `EmitUnpopulated: false`.

* The error message of grpc-gateway has been changed. In v1.16.0, the
  error message is rendered by [6]. In v2.7.0, the `error` string has
  been removed [7].

[2]: <https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/internal/impl/codec_gen.go#L74>
[3]: <https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/cmd/protoc-gen-go/internal_gengo/main.go#L814>
[5]: <https://github.com/grpc-ecosystem/grpc-gateway/blob/40ee80cbcc0751e0c943d4abc99ca4bd196bec3b/docs/docs/development/grpc-gateway_v2_migration_guide.md?plain=1#L92>
[6]: <https://github.com/grpc-ecosystem/grpc-gateway/blob/v1.16.0/internal/errors.proto#L9>
[7]: <https://github.com/googleapis/go-genproto/blob/e85fd2cbaebc/googleapis/rpc/status/status.pb.go#L46>

Signed-off-by: Wei Fu <[email protected]>
@fuweid fuweid force-pushed the migrate-grpc-gateway-from-v1-to-v2 branch from a922e4a to b97a5ec Compare June 19, 2023 02:34
@fuweid
Copy link
Member Author

fuweid commented Jun 19, 2023

Hi @ahrtr @serathius @ptabor @chaochn47 @jmhbnz

I have updated the PR for migration and seek a way to upgrade the protobuf-go without breaking existing package API.
The most detail has been updated in the pull request's description. And I want to highlight one issue here.

The pb message struct generated by new protobuf-go introduces an embed [pragma.DoNotCopy] with lock. And in etcd codebase, we didn't use pointer for pb value. It seems like we can use value for immutable reason. So there are a lot of go vet errors, like the following errror.

concurrency/election.go:116:28: call of client.Txn(ctx).If copies lock value: go.etcd.io/etcd/client/v3.Cmp contains google.golang.org/protobuf/internal/impl.MessageState contains sync.Mutex

For the embed field, it's unlikely to disable for this. https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/cmd/protoc-gen-go/internal_gengo/main.go#LL349C6-L349C30

If we have to upgrade protobuf-go, we have to consider to use pointer for the value as far as I known. Or introduce proto-generator for alias type without lock. Looking forward to your all thoughts. Thanks!

cc @pchan @Azanul

@fuweid fuweid marked this pull request as ready for review June 19, 2023 03:09
@ahrtr
Copy link
Member

ahrtr commented Jul 14, 2023

Adding to user survey draft in process.

Are there any data to share? @jberkus

If we have to upgrade protobuf-go, we have to consider to use pointer for the value as far as I known. Or introduce proto-generator for alias type without lock. Looking forward to your all thoughts. Thanks!

I do not get time to dig into it so far. Let's revisit this after probably we completely remove the grpc-gateway.

@fuweid
Copy link
Member Author

fuweid commented Jul 14, 2023

Let's revisit this after probably we completely remove the grpc-gateway.

Thanks @ahrtr ! SGTM

@liggitt
Copy link
Contributor

liggitt commented Jul 14, 2023

Let's revisit this after probably we completely remove the grpc-gateway.

is there a link to that effort? curious how close that is or what the blockers are

@ahrtr
Copy link
Member

ahrtr commented Jul 14, 2023

is there a link to that effort? curious how close that is or what the blockers are

It's a priority now. I will raise an umbrella ticket, and breakdown it soon (no latter than next week). The decision hasn't been made yet. Roughly, there are two choice,

  • Completely remove the grpc-gateway from etcd 3.6.0.
  • Deprecate and disable (by default) grpc-gateway in 3.6.0, but add a flag to re-enable it. Let's see the feedback after 3.6.0 is released; most likely we will completely remove grpc-gateway in etcd 3.7.0 or 4.0 if we follow this approach.

Looks like option 2 is more reasonable.

@liggitt
Copy link
Contributor

liggitt commented Jul 14, 2023

  • most likely we will completely remove grpc-gateway in etcd 3.7.0 or 4.0 if we follow this approach.

it's unfortunate to have a target ~years away before this dependency issue can be resolved :-/

@CyberDem0n
Copy link

I am really oppose of this change.
There is simply no good python gRPC based etcd client.
By good I mean:

  • flexible and configurable retries
  • possibility to reconnect to other etcd node and retry the request
  • auto-discovery of etcd client topology

As a result Patroni uses gRPC-gateway and for us it works good.

@tao12345666333
Copy link
Contributor

tao12345666333 commented Jul 19, 2023

FYI: Apache APISIX uses etcd as the only storage, which is written in Lua. Due to the lack of grpc support in the Lua ecosystem, only grpc-gateway can be used at present. The ecology cannot be completed in a short time.

Apache APISIX currently has a large number of users and continues to grow. Please let me know if etcd-gateway will be deprecated. Thanks

https://github.com/apache/apisix

@ahrtr
Copy link
Member

ahrtr commented Jul 19, 2023

thx for the feedback. It seems that we need to revisit #16049 (comment)

@ahrtr
Copy link
Member

ahrtr commented Aug 21, 2023

It's impossible to use gogo with new version
grpc-gateway and we can't re-generate all the proto codes by protobuf-go
because the raftpb uses gogoproto.nullable to use non-pointer for message
fileds, which protobuf-go doesn't support (golang/protobuf@1225).

Have you tried to only upgrade grpc-gateway from v1 to v2 [keep the gogo/protobuf as it's for now]?

@fuweid
Copy link
Member Author

fuweid commented Aug 21, 2023

Have you tried to only upgrade grpc-gateway from v1 to v2 [keep the gogo/protobuf as it's for now]?

Let me revisit my solution. Need to take times on this. Will update it later.

@ahrtr
Copy link
Member

ahrtr commented Aug 21, 2023

Have you tried to only upgrade grpc-gateway from v1 to v2 [keep the gogo/protobuf as it's for now]?

Let me revisit my solution. Need to take times on this. Will update it later.

First attempt in #16454. The good side of only bumping grpc-gateway is that it won't cause any compatibility issue (e.g. etcd 3.6 vs 3.5) for now, but there are some workflow failures...

@fuweid
Copy link
Member Author

fuweid commented Aug 23, 2023

Close it because #16454 is good patch.

@fuweid fuweid closed this Aug 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

8 participants