Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deps: update abseil and cel-cpp #31456

Merged
merged 42 commits into from
Jan 10, 2024
Merged

Conversation

kyessenov
Copy link
Contributor

@kyessenov kyessenov commented Dec 20, 2023

Commit Message: Update CEL CPP to latest, and abseil to 9/18/23. Abseil seems coupled with protobuf, so we'll have to update them together later.
Additional Description:
Risk Level: low
Testing: regression
Docs Changes: none
Release Notes: none

Fix #29731
Fix #27607

Signed-off-by: Kuat Yessenov <[email protected]>
@repokitteh-read-only repokitteh-read-only bot added the deps Approval required for changes to Envoy's external dependencies label Dec 20, 2023
Copy link

CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @mattklein123

🐱

Caused by: #31456 was opened by kyessenov.

see: more, trace.

@kyessenov
Copy link
Contributor Author

kyessenov commented Dec 20, 2023

Windows failure:

external/com_google_cel_cpp\base/internal/value.h(58): error C2338: absl::Duration must be trivially copyable.
external/com_google_cel_cpp\base/internal/value.h(63): error C2338: absl::Time must be trivially copyable.
external/com_google_cel_cpp\base/internal/value.h(95): warning C4624: 'cel::base_internal::InlineValue': destructor was implicitly defined as deleted
external/com_google_cel_cpp\base/values/duration_value.h(78): error C2338: DurationValue must be trivially copyable
external/com_google_cel_cpp\base/values/timestamp_value.h(77): error C2338: TimestampValue must be trivially copyable

which seems to be caused by this interesting snippet from abseil:

https://github.com/abseil/abseil-cpp/blob/5afcd807d08d89cd1abfd2ed0090796460db6b19/absl/time/time.h#L172-L176

I think we still need to update MSVC to get past this hurdle?

version = "da0aba702f44a41ec6d2eb4bbf6a9f01efc2746d",
sha256 = "d62b93fd07c6151749e83855157f3f2778d62c168318f9c40dfcfe1c336c496f",
version = "e14d338d0f1bc64e54c4275ac5034fe5dedba969",
sha256 = "a0fcf8126d4fb8e0e00be79e626cac741e40b4dcef7d549e99f7a7ef6d7c5443",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out-of-curiosity, how do you get this sha256 number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's probably a better way but I just put something wrong and then copy the right one from bazel error message :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for ref we have a tool that can update this (the sha is for the downloaded tarball)

bazel run  //bazel:\($target) \($dependency) \($version)

$target can be update or api-update

there is also a bot for this here but you need to be a maintainer to use it https://github.com/envoyproxy/envoy/actions/workflows/envoy-dependency.yml

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info!

Copy link
Contributor

@jbohanon jbohanon Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way (just for reference):

curl -L https://github.com/google/cel-cpp/archive/e14d338d0f1bc64e54c4275ac5034fe5dedba969.tar.gz | sha256sum

Admittedly this requires trusting the version you downloaded is not yet compromised

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - this is what the tool does

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Linux I download the binary and run sha256sum against it

Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
@phlax phlax self-assigned this Dec 20, 2023
@@ -162,12 +162,12 @@ REPOSITORY_LOCATIONS_SPEC = dict(
project_name = "Abseil",
project_desc = "Open source collection of C++ libraries drawn from the most fundamental pieces of Google’s internal codebase",
project_url = "https://abseil.io/",
version = "c8b33b0191a2db8364cacf94b267ea8a3f20ad83",
sha256 = "a7803eac00bf68eae1a84ee3b9fcf0c1173e8d9b89b2cee92c7b487ea65be2a9",
version = "1adf896ec842bd9788a1bbede94a33e1402b8ecb",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason we cant use release versions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

abseil "lives at head" and we don't seem to use LTS branches. See https://abseil.io/about/releases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure im following this exactly - isnt that how all repos work ?

their release strategy seems a bit confusing - but afaict the "LTS branches" are the releases

if there is a reason not to use the releases - then i think its fine to use HEAD but generally its preferred to use published releases where possible/available

cc @htuch @moderation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW grpc does use abseil LTS releases https://github.com/grpc/grpc/blob/master/bazel/grpc_deps.bzl#L344C9-L344C9. I'd be fine instituting this as long as we're OK with some delay in picking up changes. I see that protoc uses some internal abseil detail that don't compile on every version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to use LTS releases by default and bump to HEAD as needed. I think this break glass might be needed as people use new features and we wouldn't artifically push back on that. This is consistent with other dependencies.

@@ -1201,8 +1201,8 @@ REPOSITORY_LOCATIONS_SPEC = dict(
project_name = "Common Expression Language (CEL) C++ library",
project_desc = "Common Expression Language (CEL) C++ library",
project_url = "https://opensource.google/projects/cel",
version = "da0aba702f44a41ec6d2eb4bbf6a9f01efc2746d",
sha256 = "d62b93fd07c6151749e83855157f3f2778d62c168318f9c40dfcfe1c336c496f",
version = "e14d338d0f1bc64e54c4275ac5034fe5dedba969",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same principle as abseil. We can ask them to cut a release, however.

Signed-off-by: Kuat Yessenov <[email protected]>
@phlax
Copy link
Member

phlax commented Dec 20, 2023

the absl update seems to be hitting the same issue i hit previously attempting this (#30394) - something to do with the hot restart test (cc @ravenblackx ) https://dev.azure.com/cncf/envoy/_build/results?buildId=158291&view=logs&j=767be981-567e-57d8-68c3-2140ede0a0bd&t=2181edf2-f610-59f2-c43a-04bb9d0bca00&l=2608

updating tcmalloc with absl appears to work for both on x64 but unfortunately that also breaks arm

cc @RyanTheOptimist i think we will need to update quite a few things (probs together) before we can update the gcc toolchain

Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
@kyessenov
Copy link
Contributor Author

@phlax ARM build is fixed by rolling one patch release of abseil, but coverage fails with this:

ld.lld: error: /opt/llvm/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu/libclang_rt.profile.a(InstrProfilingFile.c.o):(function initializeProfileForContinuousMode: .text.initializeProfileForContinuousMode+0x26): relocation R_X86_64_REX_GOTPCRELX out of range: 2170722148 is not in [-2147483648, 2147483647]; references __llvm_profile_counter_bias_default

Signed-off-by: Kuat Yessenov <[email protected]>
@phlax
Copy link
Member

phlax commented Dec 20, 2023

coverage fails with this:

i hit this previously attempting to update protobuf - not sure what is apropos of but seems relevant

#29238

@kyessenov
Copy link
Contributor Author

Windows failures are frustrating. Should we disable CEL extensions on Windows? There's only so much I can patch to help the compiler. These are all extensions that transitively depend on CEL:

            "envoy.access_loggers.extension_filters.cel",
            "envoy.access_loggers.wasm",
            "envoy.bootstrap.wasm",
            "envoy.rate_limit_descriptors.expr",
            "envoy.filters.http.rate_limit_quota",
            "envoy.filters.http.rbac",
            "envoy.filters.http.wasm",
            "envoy.filters.network.rbac",
            "envoy.filters.network.wasm",
            "envoy.stat_sinks.wasm",
            "envoy.rbac.matchers.upstream_ip_port",
            "envoy.formatter.cel",
            "envoy.matching.inputs.cel_data_input",
            "envoy.matching.matchers.cel_matcher",

Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
@alyssawilk
Copy link
Contributor

/wait

@repokitteh-read-only repokitteh-read-only bot added deps Approval required for changes to Envoy's external dependencies and removed waiting labels Jan 8, 2024
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
@phlax phlax added this to the v1.29.0 milestone Jan 9, 2024
@alyssawilk alyssawilk removed their assignment Jan 9, 2024
@alyssawilk alyssawilk dismissed their stale review January 9, 2024 13:26

issue fixed

@kyessenov
Copy link
Contributor Author

@phlax coverage is restored, so I think this is ready to be merged - beware that abseil update will bust the build cache.

Copy link
Member

@phlax phlax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @kyessenov

@repokitteh-read-only repokitteh-read-only bot removed the deps Approval required for changes to Envoy's external dependencies label Jan 10, 2024
@kyessenov kyessenov merged commit bd1dca3 into envoyproxy:main Jan 10, 2024
54 checks passed
@kyessenov kyessenov deleted the cel_cpp_up branch January 10, 2024 20:02
@moderation
Copy link
Contributor

moderation commented Jan 11, 2024

@kyessenov anyone else seeing build failures on MacOS? Like #31760 (comment) I cleared Bazel and still hit this issue

ERROR: /Users/moderation/Applications/envoyproxy/envoy/source/exe/BUILD:25:16: Linking source/exe/envoy-static failed: (Exit 1): cc_wrapper.sh failed: error executing command (from target //source/exe:envoy-static) external/local_config_cc/cc_wrapper.sh @bazel-out/darwin_arm64-opt/bin/source/exe/envoy-static-2.params

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
ld: warning: -undefined error is deprecated
ld: warning: ignoring duplicate libraries: '-lc++', '-lm', '-lpthread'
ld: Undefined symbols:
  cel::interop_internal::ProtoStructValueToMessageWrapper(cel::Value const&), referenced from:
      cel::interop_internal::ToLegacyValue(google::protobuf::Arena*, cel::Handle<cel::Value> const&, bool) in libinterop.a[2](interop.o)
      cel::interop_internal::ToLegacyValue(google::protobuf::Arena*, cel::Handle<cel::Value> const&, bool) in libinterop.a[2](interop.o)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error in child process '/usr/bin/xcrun'. 1
Target //source/exe:envoy-static.stripped failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 13.700s, Critical Path: 1.48s
INFO: 3 processes: 3 internal.
FAILED: Build did NOT complete successfully

@ramaraochavali
Copy link
Contributor

I also run in to the same issue as @moderation. Any suggestions for fixing this?

@phlax
Copy link
Member

phlax commented Jan 16, 2024

ensuring that versions - esp compiler version - match the envoy build env (ie container) is i think the only ~guaranteed way it will work

@moderation
Copy link
Contributor

moderation commented Jan 17, 2024

@ramaraochavali I haven't found a way to fix this. At work building on RHEL 8 with clang 16.0.6 I've confirmed that after git checkout 914ab6a7908466726cc51d6f6da666729ba7122d which is the last commit before the cel update, I can build. So basically, I can't produce custom (extensions compiled out, build options etc.) any more.

As per @phlax's comment about only guaranteeing the build container version of gcc which is 11, I think we should be more explicit in the documentation.

I can still build on:

  • Xubuntu 23.10 x86_64 with clang 17.0.6
  • Xubuntu 22.04 aarch64 with clang 17.0.6
  • OpenSUSE Tumbleweed x86_64 with clang 17.0.6

In addition to RHEL 8, MacOS latest fails for me using latest Sonoma and xcode.

I suspect the issue is more complex than - "you are only guaranteed is using the build compiler version". gcc 11 is pretty old and the latest gcc is 13. Google is using clang - #31760 (comment)

I might be able to reconfigure my work environment to use gcc 11 but I'm prepared to bet it won't work.

I suspect that this change will break many people / platforms and we are just the tip of the spear. Before the changes causes multiple issues, maybe we should document that post 914ab6a7908466726cc51d6f6da666729ba7122d you can no longer build on MacOS or Red Hat Linux 8, and that you likely need a very modern version of Linux to build.

Thoughts?

@phlax
Copy link
Member

phlax commented Jan 17, 2024

As per @phlax's comment about only guaranteeing the build container version of gcc which is 11, I think we should be more explicit in the documentation.

i think that is a good idea - i have a feeling there are some docs somewhere that specify (almost certainly) incorrect versions - updating/collating dev docs is pretty high on my priority list fwiw

what we definitely cannot do is say "you can build with any versions between x and z" if we arent testing those versions - i would hope/expect that its possible to build with much more than we test but that is not something we can "guarantee".

i think its helpful for discussions/threads here on what does/not work especially wrt to release branches. main is changing all the time - its difficult enough keep everything working with very specific versions

I suspect that this change will break many people

not sure tbh. i would guess that a lot of people use containers one way/other - personally i build my own with tooling etc that i use frequently. Ultimately its based on the official container as that has the authoritative list of host requirements.

gcc 11 is pretty old and the latest gcc is 13

we tried to get the build working with gcc 13, but it didnt work out of the box. I dont remember the specifics but my impression is that whatever issues were probably not so hard to resolve if someone spends some time doing it. cc @RyanTheOptimist

you likely need a very modern version of Linux to build

not sure this is correct - but again this is what the build image is for

this image OS is not very current for glibc reasons, although it backports/installs compilers/extras as required.

ultimately i think the solution is to make the build fully hermetic and not rely on any host utils. That is easier for llvm than gcc which is kinda the blocker to doing that

i think this conversation probably warrants its own ticket - i think this is a general problem and is not really specific to @kyessenov 's PR.

ggreenway added a commit to ggreenway/envoy that referenced this pull request Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
10 participants