Crash when updating UDP clusters through CDS #14866

Closed

bartebor opened this issue Jan 29, 2021 · 19 comments · Fixed by #15195

@bartebor (Contributor)

Title: Envoy crashes after CDS update of multiple clusters

Description:
I have an Envoy server working with our custom control plane, based on the current (0.1.27) java-control-plane, via ADS (LDS, RDS, CDS, EDS). When there is a sudden change in all of our clusters (500+), Envoy crashes. I have checked v1.15.3, v1.16.2, and v1.17.0 - all of them crash.

Repro steps:
I have no isolated steps to reproduce this, but it is easy to reproduce in our environment:

  • start the control plane with many (?) clusters and make Envoy connect to it
  • change some field in the cluster template, such as connect_timeout, and trigger a CDS update for all clusters
  • Envoy crashes on the first or second try

There is no crash when only a small number of clusters is updated.

I don't know what information could be useful here, so if you find something missing please let me know.

Call Stack:
Data for envoy v1.17.0-debug follows:

[2021-01-29 19:41:06.476][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x55f55af285fd
[2021-01-29 19:41:06.476][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2021-01-29 19:41:06.476][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 5c801b25cae04f06bf48248c90e87d623d7a6283/1.17.0/Clean/RELEASE/BoringSSL
[2021-01-29 19:41:06.476][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #0: __restore_rt [0x7fc4fdf50140]
[2021-01-29 19:41:06.487][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: Envoy::Upstream::PrioritySetImpl::updateHosts() [0x55f55d31e624]
[2021-01-29 19:41:06.497][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl::updateClusterMembership() [0x55f55d18b9dc]
[2021-01-29 19:41:06.508][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: std::__1::__function::__func<>::operator()() [0x55f55d195c43]
[2021-01-29 19:41:06.519][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: std::__1::__function::__func<>::operator()() [0x55f55d1932a3]
[2021-01-29 19:41:06.528][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: std::__1::__function::__func<>::operator()() [0x55f55d114a4c]
[2021-01-29 19:41:06.538][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: Envoy::Event::DispatcherImpl::runPostCallbacks() [0x55f55d13fb4d]
[2021-01-29 19:41:06.550][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: event_process_active_single_queue [0x55f55d5afe48]
[2021-01-29 19:41:06.559][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: event_base_loop [0x55f55d5aeb0e]
[2021-01-29 19:41:06.569][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: Envoy::Server::WorkerImpl::threadRoutine() [0x55f55d131eb8]
[2021-01-29 19:41:06.581][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::{lambda()#1}::__invoke() [0x55f55d77c573]
[2021-01-29 19:41:06.581][1017802][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: start_thread [0x7fc4fdf44ea7]
Segmentation fault (core dumped)

GDB output:

#0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000055f55d5ee296 in Envoy::SignalAction::sigHandler(int, siginfo_t*, void*) () at source/common/signal/signal_action.cc:53
#2  <signal handler called>
#3  0x000055f55af285fd in typeinfo name for std::__1::__shared_ptr_pointer<Envoy::Upstream::ClusterInfoImpl*, std::__1::default_delete<Envoy::Upstream::ClusterInfoImpl>, std::__1::allocator<Envoy::Upstream::ClusterInfoImpl> > ()
#4  0x000055f55d1b3824 in Envoy::Upstream::HostSetImpl::runUpdateCallbacks(std::__1::vector<std::__1::shared_ptr<Envoy::Upstream::Host>, std::__1::allocator<std::__1::shared_ptr<Envoy::Upstream::Host> > > const&, std::__1::vector<std::__1::shared_ptr<Envoy::Upstream::Host>, std::__1::allocator<std::__1::shared_ptr<Envoy::Upstream::Host> > > const&) () at /opt/llvm/bin/../include/c++/v1/functional:1867
#5  0x000055f55d31e624 in Envoy::Upstream::PrioritySetImpl::updateHosts(unsigned int, Envoy::Upstream::PrioritySet::UpdateHostsParams&&, std::__1::shared_ptr<std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const>, std::__1::vector<std::__1::shared_ptr<Envoy::Upstream::Host>, std::__1::allocator<std::__1::shared_ptr<Envoy::Upstream::Host> > > const&, std::__1::vector<std::__1::shared_ptr<Envoy::Upstream::Host>, std::__1::allocator<std::__1::shared_ptr<Envoy::Upstream::Host> > > const&, std::__1::optional<unsigned int>) () at source/common/upstream/upstream_impl.cc:570
#6  0x000055f55d18b9dc in Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl::updateClusterMembership(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, Envoy::Upstream::PrioritySet::UpdateHostsParams, std::__1::shared_ptr<std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const>, std::__1::vector<std::__1::shared_ptr<Envoy::Upstream::Host>, std::__1::allocator<std::__1::shared_ptr<Envoy::Upstream::Host> > > const&, std::__1::vector<std::__1::shared_ptr<Envoy::Upstream::Host>, std::__1::allocator<std::__1::shared_ptr<Envoy::Upstream::Host> > > const&, unsigned long) () at source/common/upstream/cluster_manager_impl.cc:1202
#7  0x000055f55d195c43 in std::__1::__function::__func<Envoy::Upstream::ClusterManagerImpl::postThreadLocalClusterUpdate(Envoy::Upstream::ClusterManagerCluster&, Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterUpdateParams&&)::$_19, std::__1::allocator<Envoy::Upstream::ClusterManagerImpl::postThreadLocalClusterUpdate(Envoy::Upstream::ClusterManagerCluster&, Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterUpdateParams&&)::$_19>, void (Envoy::OptRef<Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl>)>::operator()(Envoy::OptRef<Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl>&&) () at source/common/upstream/cluster_manager_impl.cc:956
#8  0x000055f55d1932a3 in std::__1::__function::__func<Envoy::ThreadLocal::TypedSlot<Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl>::makeSlotUpdateCb(std::__1::function<void (Envoy::OptRef<Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl>)>)::{lambda(std::__1::shared_ptr<Envoy::ThreadLocal::ThreadLocalObject>)#1}, std::__1::allocator<Envoy::ThreadLocal::TypedSlot<Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl>::makeSlotUpdateCb(std::__1::function<void (Envoy::OptRef<Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl>)>)::{lambda(std::__1::shared_ptr<Envoy::ThreadLocal::ThreadLocalObject>)#1}>, void (std::__1::shared_ptr<Envoy::ThreadLocal::ThreadLocalObject>)>::operator()(std::__1::shared_ptr<Envoy::ThreadLocal::ThreadLocalObject>&&) () at /opt/llvm/bin/../include/c++/v1/functional:1867
#9  0x000055f55d114a4c in std::__1::__function::__func<Envoy::ThreadLocal::InstanceImpl::SlotImpl::dataCallback(std::__1::function<void (std::__1::shared_ptr<Envoy::ThreadLocal::ThreadLocalObject>)> const&)::$_1, std::__1::allocator<Envoy::ThreadLocal::InstanceImpl::SlotImpl::dataCallback(std::__1::function<void (std::__1::shared_ptr<Envoy::ThreadLocal::ThreadLocalObject>)> const&)::$_1>, void ()>::operator()() () at /opt/llvm/bin/../include/c++/v1/functional:1867
#10 0x000055f55d13fb4d in Envoy::Event::DispatcherImpl::runPostCallbacks() () at /opt/llvm/bin/../include/c++/v1/functional:1867
#11 0x000055f55d5afe48 in event_process_active_single_queue ()
#12 0x000055f55d5aeb0e in event_base_loop ()
#13 0x000055f55d131eb8 in Envoy::Server::WorkerImpl::threadRoutine(Envoy::Server::GuardDog&) () at source/server/worker_impl.cc:134
#14 0x000055f55d77c573 in Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::__1::function<void ()>, std::__1::optional<Envoy::Thread::Options> const&)::{lambda(void*)#1}::__invoke(void*) () at /opt/llvm/bin/../include/c++/v1/functional:1867
#15 0x00007fc4fdf44ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#16 0x00007fc4fde74def in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
bartebor added the bug and triage (Issue requires triage) labels on Jan 29, 2021
mattklein123 added the area/cluster_manager label and removed the triage (Issue requires triage) label on Jan 31, 2021
@mattklein123 (Member)

cc @lambdai @snowp

Can you provide more info about your setup? Cluster types? Config? Is there any way to make a self-contained repro of the issue?

mattklein123 added the help wanted (Needs help!) label on Jan 31, 2021
@lambdai (Contributor) commented Feb 2, 2021

Hmm, this stack looks legit. It's probably a data race.

Also, I realize the cluster destroy on the worker thread is not fully resolved by recent changes, so I am going to pick up my leftover #14089.

Once it is done, let's see if that PR helps.

@bartebor (Contributor, Author) commented Feb 2, 2021

I have spent more time investigating this and it turns out that my initial assumptions were wrong. It is not the CDS size that triggers the crash, but the existence of UDP proxies/clusters. I got to the point where updating a single UDP listener/cluster is enough to crash Envoy. I'm preparing a test environment for you to investigate.

@bartebor (Contributor, Author) commented Feb 2, 2021

I created a simple management server capable of crashing Envoy on request. You will need to build it, run it, and connect an Envoy instance to it. Having done that, a few "Trigger" button clicks should crash Envoy - at least it crashes my instance :)
I hope this will help you diagnose the problem.

https://github.com/bartebor/crash-management-server

bartebor changed the title from "Crash while updating many clusters at once through CDS" to "Crash when updating UDP clusters through CDS" on Feb 2, 2021
@mattklein123 (Member)

@bartebor thanks, is there any chance you could wire this up using docker compose so we don't have to figure out how to build things, etc.? Then I think someone can look at this.

mattklein123 self-assigned this on Feb 3, 2021
@bartebor (Contributor, Author) commented Feb 3, 2021

Oh, I thought it was easy enough to use - the build process is dockerized, and there are no requirements other than Docker itself to run it. I also thought developers would run it against their own locally built Envoy, so I did not see a reason to use docker-compose. The README has all the commands ready to copy & paste:

# clone
git clone https://github.com/bartebor/crash-management-server.git
cd crash-management-server

# build
docker build -t c-m-s:latest .

# run
docker run --rm -p8080:8080 -p12345:12345 c-m-s:latest

# in other terminal, start your custom-built envoy using sample config file
envoy -c envoy/envoy-dynamic-v3.yaml --concurrency 1

# ... or use the official docker image
docker run --rm --net=host -v $(pwd)/envoy/envoy-dynamic-v3.yaml:/etc/envoy.yaml:ro envoyproxy/envoy-debug:v1.17.0 -c /etc/envoy.yaml

# trigger change - open in browser http://127.0.0.1:8080 or use curl:
curl -s -XPOST 127.0.0.1:8080/triggerChange

@mattklein123 (Member)

Ah ok, perfect, thanks. I will take a look.

@jamesbattersby

I believe I am having the same issue. I have a UDP proxy, and updating the cluster definition causes a crash most of the time. I see this in the logs:

[2021-02-20 12:33:01.493][1][info][upstream] [source/common/upstream/cds_api_impl.cc:71] cds: add 1 cluster(s), remove 0 cluster(s)
[2021-02-20 12:33:01.495][1][info][upstream] [source/common/upstream/cds_api_impl.cc:86] cds: add/update cluster 'my_cluster'
[2021-02-20 12:33:01.495][18][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2021-02-20 12:33:01.495][18][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2021-02-20 12:33:01.495][18][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 5c801b25cae04f06bf48248c90e87d623d7a6283/1.17.0/Clean/RELEASE/BoringSSL

The output does vary; sometimes I get more of the backtrace. I see that this is already being looked at. If it's useful, I can try to pull together the minimum required to trigger the issue.

@mattklein123 (Member)

I just tried to repro this on the current main branch and I can't. @bartebor, can you run your repro instructions against the current main branch to double-check me and see if I'm maybe doing something wrong? Thank you.

@mattklein123 (Member)

I just tried the repro instructions on 1.17.0 and I also can't repro, so I'm probably doing something wrong?

@bartebor (Contributor, Author) commented Feb 23, 2021

Hmm, I just tried the above instructions (simply copying them into a terminal) and Envoy (docker image envoy-debug:1.17.0) crashed on the first try:

[2021-02-23 20:21:12.061][1][info][upstream] [source/common/upstream/cds_api_impl.cc:71] cds: add 1 cluster(s), remove 1 cluster(s)
[2021-02-23 20:21:12.063][1][info][upstream] [source/common/upstream/cds_api_impl.cc:86] cds: add/update cluster 'cluster-1'
[2021-02-23 20:21:12.065][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x56429504b5fd
[2021-02-23 20:21:12.065][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2021-02-23 20:21:12.065][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 5c801b25cae04f06bf48248c90e87d623d7a6283/1.17.0/Clean/RELEASE/BoringSSL
[2021-02-23 20:21:12.065][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #0: __restore_rt [0x7fbf3e6cb980]
[2021-02-23 20:21:12.112][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: Envoy::Upstream::PrioritySetImpl::updateHosts() [0x564297441624]
[2021-02-23 20:21:12.128][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Upstream::ClusterManagerImpl::ThreadLocalClusterManagerImpl::updateClusterMembership() [0x5642972ae9dc]
[2021-02-23 20:21:12.146][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: std::__1::__function::__func<>::operator()() [0x5642972b8c43]
[2021-02-23 20:21:12.165][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: std::__1::__function::__func<>::operator()() [0x5642972b62a3]
[2021-02-23 20:21:12.184][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: std::__1::__function::__func<>::operator()() [0x564297237a4c]
[2021-02-23 20:21:12.196][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: Envoy::Event::DispatcherImpl::runPostCallbacks() [0x564297262b4d]
[2021-02-23 20:21:12.210][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: event_process_active_single_queue [0x5642976d2e48]
[2021-02-23 20:21:12.222][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: event_base_loop [0x5642976d1b0e]
[2021-02-23 20:21:12.233][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: Envoy::Server::WorkerImpl::threadRoutine() [0x564297254eb8]
[2021-02-23 20:21:12.245][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::{lambda()#1}::__invoke() [0x56429789f573]
[2021-02-23 20:21:12.245][16][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: start_thread [0x7fbf3e6c06db]

@mattklein123 we need to make sure that:

  1. Envoy connects to the crash management server and creates listener cluster-1:

[2021-02-23 20:28:25.498][1][info][upstream] [source/server/lds_api.cc:79] lds: add/update listener 'cluster-1'

  2. when calling the /triggerChange endpoint, the management server emits a confirmation:

2021-02-23 20:29:52.599  INFO 1 --- [nio-8080-exec-4] p.w.c.SnapshotGenerator                  : Triggering change in CDS

and Envoy reloads cluster cluster-1:

[2021-02-23 20:29:52.606][1][info][upstream] [source/common/upstream/cds_api_impl.cc:71] cds: add 1 cluster(s), remove 1 cluster(s)
[2021-02-23 20:29:52.611][1][info][upstream] [source/common/upstream/cds_api_impl.cc:86] cds: add/update cluster 'cluster-1'

You sometimes need to call the endpoint a few times in a row, without restarting anything, to trigger a crash. I'm using an i7-5600U laptop with security fixes for numerous CPU vulnerabilities.

-- EDIT
I currently have no sandbox to build envoy on my laptop, so I am unable to test the main branch now. I can build it tomorrow if needed.

@mattklein123 (Member)

OK, thanks. I thought it was an instant crash with a single repro. I can repro if I do the update over and over again. I will take a look.

@lambdai (Contributor) commented Feb 23, 2021

Is UDP in the title a red herring? Are TCP clusters impacted as well?

@mattklein123 I am not sure if #14954 fixes it... I remember that fix was initially aimed at addressing the callback data race in #13209.

@mattklein123 (Member)

I don't know if it's a red herring or not. It still repros on the current main branch. I'm debugging now and will report back.

@mattklein123 (Member) commented Feb 23, 2021

The problem is that

member_update_cb_handle_->remove();

is no longer valid, due to the order of the update callbacks when a cluster is updated. It's possible that the redis proxy also has this problem, but I'm not sure yet. I will work on fixing this. I think the best thing to do would be to move member_update_cb_handle_ to some type of RAII wrapper that uses weak_ptr internally (@htuch, I think this came up in one of your recent PRs). Unless @htuch is actively working on this, I will sort it out.
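
For illustration, a minimal sketch of that RAII + weak_ptr idea - hypothetical types, not Envoy's actual callback-manager classes nor the exact change that landed in #15195. The handle holds only a weak_ptr to its registration, so either the handle or the manager can be destroyed first without a use-after-free:

// Hypothetical sketch only - not Envoy's real classes.
#include <functional>
#include <list>
#include <memory>

class CallbackManager {
public:
  // RAII handle: unregisters the callback on destruction, but only if the
  // manager's entry is still alive (checked through the weak_ptr).
  class Handle {
  public:
    explicit Handle(std::weak_ptr<std::function<void()>> entry) : entry_(std::move(entry)) {}
    ~Handle() {
      if (auto cb = entry_.lock()) {
        *cb = nullptr; // mark as removed; runCallbacks() skips empty entries
      }
      // If lock() fails, the manager is already gone and removal is a safe no-op.
    }

  private:
    std::weak_ptr<std::function<void()>> entry_;
  };

  [[nodiscard]] Handle add(std::function<void()> cb) {
    auto entry = std::make_shared<std::function<void()>>(std::move(cb));
    entries_.push_back(entry);
    return Handle(entry);
  }

  void runCallbacks() {
    for (const auto& entry : entries_) {
      if (*entry) {
        (*entry)();
      }
    }
  }

private:
  std::list<std::shared_ptr<std::function<void()>>> entries_;
};

With this shape, destroying the handle after the callback manager has already been torn down (or vice versa) degrades to a no-op instead of touching freed memory, which is the destruction-order problem described above.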

@lambdai (Contributor) commented Feb 24, 2021

Maybe the ~ClusterInfo wrapping this line should be executed on the worker thread as well.

@lambdai (Contributor) commented Feb 24, 2021

Maybe the ~ClusterInfo wrapping this line should be executed on the worker thread as well.

I take it back.

@L3o-pold commented Feb 25, 2021

I think we have the same issue with a TCP redis cluster. The crash occurs when we update the cds.yaml file with new endpoints.

node:
  id: id_1
  cluster: test

static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 6379
      filter_chains:
        - filters:
          - name: envoy.filters.network.redis_proxy
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.redis_proxy.v3.RedisProxy
              stat_prefix: redis_proxy
              settings:
                op_timeout: 1s
                enable_hashtagging: false
              prefix_routes:
                catch_all_route:
                  cluster: redis

dynamic_resources:
  cds_config:
    path: /var/lib/envoy/cds.yaml

admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
resources:
  - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
    name: redis
    connect_timeout: 1s
    type: STRICT_DNS
    lb_policy: RING_HASH
    dns_lookup_family: V4_ONLY
    health_checks:
      timeout: 1s
      interval: 1s
      unhealthy_threshold: 3
      healthy_threshold: 1
      custom_health_check:
        name: envoy.health_checkers.redis
        typed_config: {}
    load_assignment:
      cluster_name: redis
      
      endpoints:
        - lb_endpoints:
          
          - endpoint:
              address:
                socket_address:
                  address: 10.42.0.37
                  port_value: 6379
          
          - endpoint:
              address:
                socket_address:
                  address: 10.42.0.80
                  port_value: 6379
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:21:00.194][1][info][main] [source/server/server.cc:731] starting main dispatch loop
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:21:00.194][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:191] cm init: all clusters initialized
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:21:00.194][1][info][main] [source/server/server.cc:712] all clusters initialized. initializing init manager
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:21:00.194][1][info][config] [source/server/listener_manager_impl.cc:888] all dependencies initialized. starting workers
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.851][1][info][upstream] [source/common/upstream/cds_api_impl.cc:71] cds: add 1 cluster(s), remove 0 cluster(s)
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.854][1][info][upstream] [source/common/upstream/cds_api_impl.cc:86] cds: add/update cluster 'redis'
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x18
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 5c801b25cae04f06bf48248c90e87d623d7a6283/1.17.0/Clean/RELEASE/BoringSSL
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #0: __restore_rt [0x7fb7c5224980]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #1: [0x5568bac2b7f4]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #2: [0x5568bac35c43]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #3: [0x5568bac332a3]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #4: [0x5568bac52b59]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #5: [0x5568bac2b9dc]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #6: [0x5568bac35c43]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #7: [0x5568bac332a3]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #8: [0x5568babb4a4c]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #9: [0x5568babb3843]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #10: [0x5568babb370b]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #11: [0x5568bac26545]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #12: [0x5568bac2837e]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #13: [0x5568bac314bb]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #14: [0x5568bac52bc4]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #15: [0x5568bac53824]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #16: [0x5568badbe624]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #17: [0x5568badc58b4]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #18: [0x5568bae0354d]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #19: [0x5568b94cdb26]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #20: [0x5568b9bdb9f8]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #21: [0x5568b9bdfb4b]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #22: [0x5568b9bdea4b]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #23: [0x5568b9bdb4d7]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #24: [0x5568b9bdc6ed]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #25: [0x5568babf388f]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #26: [0x5568babee00d]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #27: [0x5568babebcb9]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #28: [0x5568babe0fd1]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #29: [0x5568babe1dbc]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #30: [0x5568bb050138]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.860][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #31: [0x5568bb04eb0e]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.861][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #32: [0x5568babc1bff]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.861][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #33: [0x5568b93d7e28]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.861][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #34: [0x5568b93d8627]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.861][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #35: [0x5568b93d69dc]
envoy-7d56b8b77f-8796f envoy [2021-02-25 18:22:55.861][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #36: __libc_start_main [0x7fb7c4e42bf7]
envoy-7d56b8b77f-8796f envoy ConnectionImpl 0x47643f340000, connecting_: 0, bind_error_: 0, state(): Open, read_buffer_limit_: 1048576
envoy-7d56b8b77f-8796f envoy socket_: 
envoy-7d56b8b77f-8796f envoy   ListenSocketImpl 0x47643f55d170, transport_protocol_: , server_name_: 
envoy-7d56b8b77f-8796f envoy   address_provider_: 
envoy-7d56b8b77f-8796f envoy     SocketAddressSetterImpl 0x47643f54a1f8, remote_address_: 10.42.0.37:6379, direct_remote_address_: 10.42.0.37:6379, local_address_: 10.42.0.82:59920

mattklein123 added a commit that referenced this issue Feb 25, 2021
This changes the callback code to use RAII with weak
pointers. This allows both the callee and the callback
manager to be safely destructed in different orders which
does happen during normal operation, for example with cluster
and listener changes.

Fixes #14866

Signed-off-by: Matt Klein <[email protected]>
mattklein123 added a commit that referenced this issue Feb 26, 2021
This changes the callback code to use RAII with weak
pointers. This allows both the callee and the callback
manager to be safely destructed in different orders which
does happen during normal operation, for example with cluster
and listener changes.

Fixes #14866

Signed-off-by: Matt Klein <[email protected]>
rexengineering pushed a commit to rexengineering/istio-envoy that referenced this issue Oct 15, 2021
This changes the callback code to use RAII with weak
pointers. This allows both the callee and the callback
manager to be safely destructed in different orders which
does happen during normal operation, for example with cluster
and listener changes.

Fixes envoyproxy/envoy#14866

Signed-off-by: Matt Klein <[email protected]>
@aojea commented May 16, 2024

I have a crash when using CDS with UDP; it is completely reproducible in Kubernetes CI:

kubernetes/kubernetes#124729

Crash dump on CDS update:

[2024-05-16 07:11:34.464][1][info][upstream] [source/common/upstream/cds_api_helper.cc:32] cds: add 1 cluster(s), remove 0 cluster(s)
[2024-05-16 07:11:34.465][1][info][upstream] [source/common/upstream/cds_api_helper.cc:71] cds: added/updated 1 cluster(s), skipped 0 unmodified cluster(s)
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:127] Caught Segmentation fault, suspect faulting address 0x18
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:111] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:112] Envoy version: 816188b86a0a52095b116b107f576324082c7c02/1.30.1/Clean/RELEASE/BoringSSL
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:114] Address mapping: 5585541ca000-558556b72000 /usr/local/bin/envoy
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #0: [0x7f9a5f4e1520]
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #1: [0x5585548d4caa]
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #2: [0x5585548d2bbf]
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #3: [0x5585561ed36d]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #4: [0x5585561f1600]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #5: [0x55855651bcc1]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #6: [0x55855651998f]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #7: [0x55855651ab2f]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #8: [0x5585561f0fa6]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #9: [0x5585561f0d4e]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #10: [0x5585562ea081]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #11: [0x5585562eb62d]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #12: [0x55855653bd40]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #13: [0x55855653a681]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #14: [0x558555b79f9f]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #15: [0x5585565b7d03]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #16: [0x7f9a5f533ac3]

Config applied
LDS

resources:
- "@type": type.googleapis.com/envoy.config.listener.v3.Listener
  name: listener_IPv4_80_UDP
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 80
      protocol: UDP
  udp_listener_config:
    downstream_socket_config:
      max_rx_datagram_size: 9000
  listener_filters:
  - name: envoy.filters.udp_listener.udp_proxy
    typed_config:
      '@type': type.googleapis.com/envoy.extensions.filters.udp.udp_proxy.v3.UdpProxyConfig
      access_log:
      - name: envoy.file_access_log
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
      stat_prefix: udp_proxy
      matcher:
        on_no_match:
          action:
            name: route
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.filters.udp.udp_proxy.v3.Route
              cluster: cluster_IPv4_80_UDP
      upstream_socket_config:
        max_rx_datagram_size: 9000

CDS

resources:
- "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
  name: cluster_IPv4_80_UDP
  connect_timeout: 5s
  type: STATIC
  lb_policy: RANDOM
  health_checks:
  - timeout: 5s
    interval: 3s
    unhealthy_threshold: 2
    healthy_threshold: 1
    no_traffic_interval: 5s
    always_log_health_check_failures: true
    always_log_health_check_success: true
    event_log_path: /dev/stdout
    http_health_check:
      path: /healthz
  load_assignment:
    cluster_name: cluster_IPv4_80_UDP
    endpoints:
      - lb_endpoints:
        - endpoint:
            health_check_config:
              port_value: 10256
            address:
              socket_address:
                address: 192.168.8.4
                port_value: 32557
                protocol: UDP
      - lb_endpoints:
        - endpoint:
            health_check_config:
              port_value: 10256
            address:
              socket_address:
                address: 192.168.8.2
                port_value: 32557
                protocol: UDP
