Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recent commit about CEL causes GCC contrib/golang build failure #31760

Closed
zhxie opened this issue Jan 11, 2024 · 24 comments · Fixed by #31814
Closed

Recent commit about CEL causes GCC contrib/golang build failure #31760

zhxie opened this issue Jan 11, 2024 · 24 comments · Fixed by #31814

Comments

@zhxie
Copy link
Contributor

zhxie commented Jan 11, 2024

Title: Recent commit causes GCC build failure

Description:

After bd1dca3 (#31456), GCC within Envoy build docker fails to build some CEL-related components and extensions. An example is given below.

git rev-parse HEAD
# bd1dca3cf366bc5287295de73e4354d2f5d1f1d9
cat user.bazelrc
# build --config=gcc
./ci/run_envoy_docker.sh 'bash'
bazel build //contrib/golang/filters/http/source:golang_filter_lib
# ERROR: /source/contrib/golang/filters/http/source/BUILD:15:17: Compiling contrib/golang/filters/http/source/golang_filter.cc failed: (Exit 1): gcc failed: error executing command (from target //contrib/golang/filters/http/source:golang_filter_lib) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 221 arguments skipped)
#
# Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
# In file included from ./source/extensions/filters/common/expr/context.h:11,
#                  from ./source/extensions/filters/common/expr/evaluator.h:7,
#                  from ./contrib/golang/filters/http/source/golang_filter.h:14,
#                  from contrib/golang/filters/http/source/golang_filter.cc:1:
# external/com_google_cel_cpp/eval/public/cel_value.h:621:42: error: 'virtual absl::lts_20230802::StatusOr<const google::api::expr::runtime::CelList*> google::api::expr::runtime::CelMap::ListKeys(google::protobuf::Arena*) const' was hidden [-Werror=overloaded-virtual]
#   621 |   virtual absl::StatusOr<const CelList*> ListKeys(google::protobuf::Arena* arena) const {
#       |                                          ^~~~~~~~
# In file included from external/com_google_cel_cpp/eval/public/containers/field_backed_map_impl.h:8,
#                  from contrib/golang/filters/http/source/golang_filter.cc:26:
# external/com_google_cel_cpp/eval/public/containers/internal_field_backed_map_impl.h:46:34: note:   by 'virtual absl::lts_20230802::StatusOr<const google::api::expr::runtime::CelList*> google::api::expr::runtime::internal::FieldBackedMapImpl::ListKeys() const'
#    46 |   absl::StatusOr<const CelList*> ListKeys() const override;
#       |                                  ^~~~~~~~
# cc1plus: all warnings being treated as errors

The command succeeds in the previous commit.

Relevant Links:
#31456 (comment)

@zhxie zhxie added the triage Issue requires triage label Jan 11, 2024
@zhxie
Copy link
Contributor Author

zhxie commented Jan 11, 2024

CC @kyessenov

@kyessenov
Copy link
Contributor

There's a patch applied in the PR to fix this error. Can you clean the cache and rebuild? I'm not sure how CI let it in.

@zhxie
Copy link
Contributor Author

zhxie commented Jan 11, 2024

No, I tried bazel clean but it still fails to build.
Actually, I also do not know how CI works now. ./ci/run_envoy_docker.sh './ci/do_ci.sh gcc' just gives me a confusing cannot find 'ld'. Also CC @phlax for help.

@phlax
Copy link
Member

phlax commented Jan 11, 2024

I'm not sure how CI let it in.

im struggling to follow here - CI is passing after this landed

@zhxie given that ci is passing im wondering what is different - most likely the gcc version im thinking

@moderation
Copy link
Contributor

MacOS not building for me as per #31456 (comment). Linux x64_64 and aarch64 are ok

@phlax
Copy link
Member

phlax commented Jan 11, 2024

@moderation can you compare the gcc versions

@kyessenov
Copy link
Contributor

@zhxie If you look at sources in bazel-envoy/external, do you see using CelMap::ListKeys there? There's a GCC bug that causes it to fail to dis-ambiguate overloads, so it could be due to GCC version.

@moderation I don't have a mac, but it's possibly related to weak symbols, and needs newer linkers. This commit might help b5da585.

@moderation
Copy link
Contributor

I'm running standard latest xcode on MacOS Sonoma 14.2.1. It is an Apple Silicon M1

gcc --version; g++ --version
Apple clang version 15.0.0 (clang-1500.1.0.2.5)
Target: arm64-apple-darwin23.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Apple clang version 15.0.0 (clang-1500.1.0.2.5)
Target: arm64-apple-darwin23.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

@phlax
Copy link
Member

phlax commented Jan 11, 2024

our current/tested build version is 11

https://github.com/envoyproxy/envoy-build-tools/blob/7cd964feb564419532ec8a825031e4e1b7b6974f/docker/linux/ubuntu/fun.sh#L39

we tested other versions when updating and pinned to that one due to various issues - ymmv with any other gcc version

@moderation
Copy link
Contributor

That is for Linux. The current issue is latest MacOS. I suspect there aren't many folks building on this platform. It has been successful for me for ages up until this CEL change. Will keep looking

@kyessenov
Copy link
Contributor

@moderation I don't have any macs to try it on, and Google doesn't easily give me machines either anymore. I suspect it's something to do with the weak linkage, so maybe changing LD can help.

@phlax
Copy link
Member

phlax commented Jan 11, 2024

@zhxie what platform are you hitting this on?

@moderation it would be great to resolve this issue so it can build on mac but i think the warning above still stands - i assume that if nothing else is breaking its because its not being built/tested

@zhxie
Copy link
Contributor Author

zhxie commented Jan 12, 2024

I am using Envoy build docker to build, so I guess it is aligned to the CI.

envoybuild@1ce4c6523b42:/source$ gcc --version
# gcc (Ubuntu 11.4.0-2ubuntu1~20.04) 11.4.0
# Copyright (C) 2021 Free Software Foundation, Inc.
# This is free software; see the source for copying conditions.  There is NO
# warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

envoybuild@1ce4c6523b42:/source$ env
# SHELL=/bin/sh
# SUDO_GID=0
# HOSTNAME=1ce4c6523b42
# SUDO_COMMAND=/bin/bash -c bash -c cd\ /source\ &&\ bash
# SUDO_USER=root
# ENVOY_BUILD_IMAGE=envoyproxy/envoy-build-ubuntu:0ca52447572ee105a4730da5e76fe47c9c5a7c64@sha256:d736c58f06f36848e7966752cc7e01519cc1b5101a178d5c6634807e8ac3deab
# PWD=/source
# LOGNAME=envoybuild
# HOME=/build
# LANG=en_US.utf8
# TERM=xterm
# USER=envoybuild
# GOPROXY=
# SHLVL=1
# BUILD_DIR=/build
# PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
# SUDO_UID=0
# OLDPWD=/
# _=/usr/bin/env

@phlax
Copy link
Member

phlax commented Jan 12, 2024

I am using Envoy build docker to build

hmm, so that rules out gcc version in your case i think - the container is up-to-date - wondering what else could be different

@phlax
Copy link
Member

phlax commented Jan 12, 2024

k - i see - original error is building contrib

@phlax
Copy link
Member

phlax commented Jan 12, 2024

yeah - original bug is easily reproducible

diff --git a/ci/do_ci.sh b/ci/do_ci.sh
index e500dd8792..3948a5a773 100755
--- a/ci/do_ci.sh
+++ b/ci/do_ci.sh
@@ -756,14 +756,7 @@ case $CI_TARGET in
     gcc)
         BAZEL_BUILD_OPTIONS+=("--test_env=HEAPCHECK=")
         setup_gcc_toolchain
-        echo "Testing ${TEST_TARGETS[*]}"
-        bazel_with_collection \
-            test "${BAZEL_BUILD_OPTIONS[@]}" \
-            -c fastbuild  \
-            --remote_download_minimal \
-            -- "${TEST_TARGETS[@]}"
-        echo "bazel release build with gcc..."
-        bazel_envoy_binary_build fastbuild
+        bazel build ${BAZEL_BUILD_OPTIONS[@]} //contrib/golang/filters/http/source:golang_filter_lib
         ;;
 
     info)
# Configuration: f3b5920441742ff2e9ce1de338d1e21350b3861db729239d6673cebfb08df02f
# Execution platform: @envoy_build_tools//toolchains:rbe_linux_gcc_platform
In file included from ./source/extensions/filters/common/expr/context.h:11,
                 from ./source/extensions/filters/common/expr/evaluator.h:7,
                 from ./contrib/golang/filters/http/source/golang_filter.h:14,
                 from contrib/golang/filters/http/source/golang_filter.cc:1:
external/com_google_cel_cpp/eval/public/cel_value.h:621:42: error: ‘virtual absl::lts_20230802::StatusOr<const google::api::expr::runtime::CelList*> google::api::expr::runtime::CelMap::ListKeys(google::protobuf::Arena*) const’ was hidden [-Werror=overloaded-virtual]
  621 |   virtual absl::StatusOr<const CelList*> ListKeys(google::protobuf::Arena* arena) const {
      |                                          ^~~~~~~~
In file included from external/com_google_cel_cpp/eval/public/containers/field_backed_map_impl.h:8,
                 from contrib/golang/filters/http/source/golang_filter.cc:26:
external/com_google_cel_cpp/eval/public/containers/internal_field_backed_map_impl.h:46:34: note:   by ‘virtual absl::lts_20230802::StatusOr<const google::api::expr::runtime::CelList*> google::api::expr::runtime::internal::FieldBackedMapImpl::ListKeys() const’
   46 |   absl::StatusOr<const CelList*> ListKeys() const override;
      |                                  ^~~~~~~~
cc1plus: all warnings being treated as errors
Target //contrib/golang/filters/http/source:golang_filter_lib failed to build
INFO: Elapsed time: 79.923s, Critical Path: 77.39s
INFO: 1389 processes: 1386 remote cache hit, 3 inte

fwiw - trying to build a contrib bin with gcc fails with

/usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -std=c++0x -fno-canonical-system-headers -Wno-builtin-macro-redefined -D__DATE__=redacted -D__TIMESTAMP__=redacted -D__TIME__=redacted -DABSL_MIN_LOG_LEVEL=4 -fdebug-types-section -fPIC -Wno-deprecated-declarations -std=c++17 -fPIC -DU_CHARSET_IS_UTF8=1 -DU_USING_ICU_NAMESPACE=0 -DUCONFIG_ONLY_HTML_CONVERSION=1 -DUCONFIG_NO_LEGACY_CONVERSION=1 -DUCONFIG_NO_BREAK_ITERATION=1 -DUCONFIG_NO_COLLATION=1 -DUCONFIG_NO_FORMATTING=1 -DUCONFIG_NO_TRANSLITERATION=1 -DUCONFIG_NO_REGULAR_EXPRESSIONS=1 -W -Wall -pedantic -Wpointer-arith -Wwrite-strings -Wno-long-long   -fuse-ld=lld -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -pass-exit-codes -lm -fuse-ld=gold -l:libstdc++.a -Wl,--gc-sections   -o ../../bin/makeconv gencnvex.o genmbcs.o makeconv.o ucnvstat.o -L../../lib -licutu -L../../lib -licui18n -L../../lib -licuuc -L../../stubdata -licudata -lpthread -lm                                                                            makeconv.o:makeconv.cpp:DW.ref.__gxx_personality_v0: error: undefined reference to '__gxx_personality_v0'                                                                         ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::mutex::lock(): error: undefined reference to 'std::__throw_system_error(int)'                                            ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::once_flag::_Prepare_execution::~_Prepare_execution(): error: undefined reference to 'std::__once_callable'               ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::once_flag::_Prepare_execution::~_Prepare_execution(): error: undefined reference to 'std::__once_call'                   ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function umtx_cleanup: error: undefined reference to 'std::condition_variable::~condition_variable()'                                  ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function umtx_init::{lambda()#2}::operator()() const: error: undefined reference to 'std::condition_variable::condition_variable()'    ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function icu_72::umtx_initImplPreInit(icu_72::UInitOnce&): error: undefined reference to 'std::condition_variable::wait(std::unique_lock<std::mutex>&)'                                                                                                                                                                  ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function icu_72::umtx_initImplPostInit(icu_72::UInitOnce&): error: undefined reference to 'std::condition_variable::notify_all()'      ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function void std::call_once<void (&)()>(std::once_flag&, void (&)()): error: undefined reference to '__once_proxy'                    ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function void std::call_once<void (&)()>(std::once_flag&, void (&)()): error: undefined reference to 'std::__throw_system_error(int)'  ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<void (&)()>(std::once_flag&, void (&)())::{lambda()#1}>(void (&)())::{lambda()#1}::operator()() const: error: undefined reference to 'std::__once_callable'                                                                               ../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<void (&)()>(std::once_flag&, void (&)())::{lambda()#1}>(
void (&)()): error: undefined reference to 'std::__once_callable'                                                                                                                 
../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<void (&)()>(std::once_flag&, void (&)())::{lambda()#1}>(
void (&)()): error: undefined reference to 'std::__once_call'                                                                                                                     
../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::unique_lock<std::mutex>::lock(): error: undefined reference to 'std::__throw_system_error(int)'                          
../../lib/libicuuc.a(umutex.ao):umutex.cpp:function std::unique_lock<std::mutex>::lock(): error: undefined reference to 'std::__throw_system_error(int)'                          
collect2: error: ld returned 1 exit status                                                                                                                                        
make[2]: *** [Makefile:81: ../../bin/makeconv] Error 1                                                                                                                            make[2]: Leaving directory '/b/f/w/bazel-out/k8-fastbuild/bin/bazel/foreign_cc/unicode_icu_build.build_tmpdir/tools/makeconv'                                                     
make[1]: *** [Makefile:47: all-recursive] Error 2                                                                                                                                 
make[1]: Leaving directory '/b/f/w/bazel-out/k8-fastbuild/bin/bazel/foreign_cc/unicode_icu_build.build_tmpdir/tools'                                                              
make: *** [Makefile:153: all-recursive] Error 2   

not sure if related or a new issue

@phlax
Copy link
Member

phlax commented Jan 12, 2024

yep - this is new - 1.28 can build gcc contrib fine - i think this is a release blocker testing properly ICU issue is present in 1.28

@moderation im tempted to say that your issue is probably a different one (if related)

@phlax phlax changed the title Recent commit about CEL causes GCC build failure Recent commit about CEL causes GCC contrib/golang build failure Jan 12, 2024
@phlax
Copy link
Member

phlax commented Jan 12, 2024

#31807
#31806

@moderation
Copy link
Contributor

I just tried to build like I normally do on a Red Hat Linux 8 work environment and it is failing now too on the CEL changes with both clang 16 and gcc 13. This build has been successful for a long period of time, probably close to a year. So the blast radius for me so far is broken MacOS aarch64 / Apple Silicon and Red Hat Linux 8, both of which worked for long periods of time prior to this change

@phlax
Copy link
Member

phlax commented Jan 12, 2024

it would be good if we could isolate an exact problem - it works for our CI - which is all we can guarantee - but obv there are some issues - my guess is that you need to use the tested compilers - esp given its known that our CI cant pass with at least some other versions

in terms of blocking release i think if it wasnt broken in 1.28 - ie you could build contrib with gcc - then its a blocker - but if it has just broke we need to fix (edited again to make sense)

@phlax
Copy link
Member

phlax commented Jan 12, 2024

testing that here #31812

@phlax
Copy link
Member

phlax commented Jan 12, 2024

fails in 1.28 - defo we should address this but i dont think it should block release

@kyessenov
Copy link
Contributor

I understand the frustration with CEL but it's not easy to detect all compiler issues in Envoy because the upstream is strictly focused on the clang variant. We'd probably need to more requirements on the upstream CEL project to detect the issues earlier rather than trying to patch them up here.

@ramaraochavali
Copy link
Contributor

@kyessenov It could be related to "LD", Can you elaborate more on your comment #31760 (comment) ? What do you want to change with LD?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants