[202012][teammgrd]: Improve LAGs cleanup on shutdown #1916

Merged


nazariig
Collaborator

Signed-off-by: Nazarii Hnydyn [email protected]

This PR fixes the LAG cleanup degradation caused by the python2.7 -> python3 migration.
The approach is to replace the teamd -k -t call with a raw SIGTERM and to add a PID alive check.
This ensures that teammgrd stops only after all managed processes have been killed.

resolves: sonic-net/sonic-buildimage#8071

What I did

  • Replaced teamd -k -t call with raw SIGTERM
  • Added PID alive check
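
The two steps above can be sketched as follows. The actual change lives in teammgrd's C++ code; this Python sketch only illustrates the pattern (send SIGTERM, then poll each PID with signal 0 until it disappears). The function names, the timeout, and the polling interval are hypothetical and not taken from the PR.

```python
import os
import signal
import time


def is_alive(pid, kill=os.kill):
    """Return True if a process with this PID still exists."""
    try:
        kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_and_wait(pids, timeout=30.0, poll=0.1, kill=os.kill, sleep=time.sleep):
    """Send SIGTERM to every PID, then poll until all of them have exited.

    Returns the set of PIDs that are still alive when the timeout expires.
    """
    pending = set()
    for pid in pids:
        try:
            kill(pid, signal.SIGTERM)
            pending.add(pid)
        except ProcessLookupError:
            pass  # already gone, nothing to wait for
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        pending = {pid for pid in pending if is_alive(pid, kill)}
        if pending:
            sleep(poll)
    return pending
```

An empty return value means every process exited within the timeout. Signal 0 also succeeds for a zombie child that has not been reaped, so this check assumes the target daemons are not unreaped children of the caller.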

Why I did it

  • To fix LAGs cleanup timeout issue caused by python2.7 -> python3 upgrade

How I verified it

  1. Configure 64 LAG RIFs
  2. Reload config

Details if related

  • N/A

@nazariig nazariig requested a review from judyjoseph September 17, 2021 09:11
@nazariig
Collaborator Author

A cherry-pick of: #1841 for 202012

@nazariig nazariig changed the title [teammgrd]: Improve LAGs cleanup on shutdown [202012][teammgrd]: Improve LAGs cleanup on shutdown Sep 17, 2021
@nazariig
Collaborator Author

/azpw run

@mssonicbld
Collaborator

/AzurePipelines run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@qiluo-msft
Contributor

@judyjoseph Could you check?

@judyjoseph
Contributor

@judyjoseph Could you check?

I don't think the failure is related to this change. Triggering an azp run again.

@judyjoseph
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@nazariig
Collaborator Author

/azpw run

@mssonicbld
Collaborator

/AzurePipelines run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@nazariig
Collaborator Author

/azpw run

@mssonicbld
Collaborator

/AzurePipelines run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@nazariig
Collaborator Author

Only LGTM check is failing:

[2021-09-17 15:46:48] [build-stderr] main.cpp:2:10: fatal error: sai.h: No such file or directory
[2021-09-17 15:46:48] [build-stderr]     2 | #include "sai.h"
[2021-09-17 15:46:48] [build-stderr]       |          ^~~~~~~
[2021-09-17 15:46:48] [build-stderr] compilation terminated.
[2021-09-17 15:46:48] [build-stderr] make[2]: *** [Makefile:623: orchagent-main.o] Error 1
[2021-09-17 15:46:48] [build-stdout] make[2]: Leaving directory '/opt/src/orchagent'
[2021-09-17 15:46:48] [build-stderr] make[1]: *** [Makefile:410: all-recursive] Error 1
[2021-09-17 15:46:48] [build-stdout] make[1]: Leaving directory '/opt/src'
[2021-09-17 15:46:48] [build-stderr] make: *** [Makefile:342: all] Error 2
[2021-09-17 15:46:48] [build-stderr] + '[' -f build.ninja ']'
[2021-09-17 15:46:48] [build-stderr] + '[' -d ../_lgtm_build_dir ']'
[2021-09-17 15:46:48] [build-stdout] Semmle autobuild: no supported build system detected.
[2021-09-17 15:46:48] [build-stderr] + for f in build build.sh
[2021-09-17 15:46:48] [ERROR] Spawned process exited abnormally (code 1; tried to run: [/opt/dist/tools/linux64/preload_tracer, /opt/dist/cpp/tools/do-build])
[2021-09-17 15:46:48] [build-stderr] + '[' -x build ']'
[2021-09-17 15:46:48] [build-stderr] + for f in build build.sh
[2021-09-17 15:46:48] [build-stderr] + '[' -x build.sh ']'
[2021-09-17 15:46:48] [build-stderr] + '[' -f setup.py ']'
[2021-09-17 15:46:48] [build-stderr] + echo 'Semmle autobuild: no supported build system detected.'
[2021-09-17 15:46:48] [build-stderr] + exit 1
[2021-09-17 15:46:48] [build-stderr] A fatal error occurred: Exit status 1 from command: [/opt/dist/cpp/tools/do-build]
[2021-09-17 15:46:48] [build-stderr] deptrace-server: received exit command
[2021-09-17 15:46:48] [ERROR] Spawned process exited abnormally (code 2; tried to run: [/opt/work/lgtm-workspace/lgtm/extract.sh])
A fatal error occurred: Exit status 2 from command: [/opt/work/lgtm-workspace/lgtm/extract.sh]

Potential fix:
#1247
sonic-net/sonic-sairedis#595

@judyjoseph / @qiluo-msft taking into consideration that this PR is a cherry-pick and LGTM failures are not relevant, can we proceed with the merge?

@qiluo-msft
Contributor

Override lgtm on 202012, which is still under investigation.

@qiluo-msft qiluo-msft merged commit 5a4678e into sonic-net:202012 Sep 20, 2021
@kcudnik
Contributor

kcudnik commented Sep 20, 2021

The directory for the 202012 SAI headers should be
/opt/work/lgtm-workspace/usr/include/sai, not /opt/work/lgtm-workspace/usr/include

@kcudnik
Contributor

kcudnik commented Sep 20, 2021

Somehow the SAI headers are not installed in that directory for 202012

@nazariig
Collaborator Author

Somehow the SAI headers are not installed in that directory for 202012

@kcudnik please have a look at:

[2021-09-17 15:46:48] [build-stdout] make[2]: Entering directory '/opt/src/orchagent'
[2021-09-17 15:46:48] [build-stdout] g++ -DHAVE_CONFIG_H -I. -I.. -I ../lib -I .. -I ../warmrestart -I flex_counter -I debug_counter -g -DNDEBUG  -std=c++14 -Wall -fPIC -Wno-write-strings -I/usr/include/libnl3 -I/usr/include/swss -Werror -Wno-reorder -Wcast-align -Wcast-qual -Wconversion -Wdisabled-optimization -Wextra -Wfloat-equal -Wformat=2 -Wformat-nonliteral -Wformat-security -Wformat-y2k -Wimport -Winit-self -Winvalid-pch -Wlong-long -Wmissing-field-initializers -Wmissing-format-attribute -Wno-aggregate-return -Wno-padded -Wno-switch-enum -Wno-unused-parameter -Wpacked -Wpointer-arith -Wredundant-decls -Wstack-protector -Wstrict-aliasing=3 -Wswitch -Wswitch-default -Wunreachable-code -Wunused -Wvariadic-macros -Wno-switch-default -Wno-long-long -Wno-redundant-decls -I /usr/include/sai -I/opt/work/lgtm-workspace/usr/include -I/opt/work/lgtm-workspace/usr/include/swss -I/opt/work/lgtm-workspace/usr/include/sai  -g -O2 -MT orchagent-main.o -MD -MP -MF .deps/orchagent-main.Tpo -c -o orchagent-main.o `test -f 'main.cpp' || echo './'`main.cpp
[2021-09-17 15:46:48] [build-stderr] main.cpp:2:10: fatal error: sai.h: No such file or directory
[2021-09-17 15:46:48] [build-stderr]     2 | #include "sai.h"
[2021-09-17 15:46:48] [build-stderr]       |          ^~~~~~~
[2021-09-17 15:46:48] [build-stderr] compilation terminated.
[2021-09-17 15:46:48] [build-stderr] make[2]: *** [Makefile:623: orchagent-main.o] Error 1
[2021-09-17 15:46:48] [build-stdout] make[2]: Leaving directory '/opt/src/orchagent'
[2021-09-17 15:46:48] [build-stderr] make[1]: *** [Makefile:410: all-recursive] Error 1
[2021-09-17 15:46:48] [build-stdout] make[1]: Leaving directory '/opt/src'
[2021-09-17 15:46:48] [build-stderr] make: *** [Makefile:342: all] Error 2
[2021-09-17 15:46:48] [build-stderr] + '[' -f build.ninja ']'
[2021-09-17 15:46:48] [build-stderr] + '[' -d ../_lgtm_build_dir ']'
[2021-09-17 15:46:48] [build-stdout] Semmle autobuild: no supported build system detected.
[2021-09-17 15:46:48] [build-stderr] + for f in build build.sh
[2021-09-17 15:46:48] [ERROR] Spawned process exited abnormally (code 1; tried to run: [/opt/dist/tools/linux64/preload_tracer, /opt/dist/cpp/tools/do-build])
[2021-09-17 15:46:48] [build-stderr] + '[' -x build ']'
[2021-09-17 15:46:48] [build-stderr] + for f in build build.sh
[2021-09-17 15:46:48] [build-stderr] + '[' -x build.sh ']'
[2021-09-17 15:46:48] [build-stderr] + '[' -f setup.py ']'
[2021-09-17 15:46:48] [build-stderr] + echo 'Semmle autobuild: no supported build system detected.'
[2021-09-17 15:46:48] [build-stderr] + exit 1
[2021-09-17 15:46:48] [build-stderr] A fatal error occurred: Exit status 1 from command: [/opt/dist/cpp/tools/do-build]
[2021-09-17 15:46:48] [build-stderr] deptrace-server: received exit command
[2021-09-17 15:46:48] [ERROR] Spawned process exited abnormally (code 2; tried to run: [/opt/work/lgtm-workspace/lgtm/extract.sh])
A fatal error occurred: Exit status 2 from command: [/opt/work/lgtm-workspace/lgtm/extract.sh]

Basically we have:

-I/usr/include/swss
-I /usr/include/sai

-I/opt/work/lgtm-workspace/usr/include

-I/opt/work/lgtm-workspace/usr/include/sai
-I/opt/work/lgtm-workspace/usr/include/swss

@kcudnik
Contributor

kcudnik commented Sep 20, 2021

Yes, I noticed that, but none of those directories contains sai.h; otherwise gcc would pick up that include. I don't know yet why that happens.
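
A quick way to confirm this kind of finding is to pull the -I directories out of the compile command and test each one for the header. This is a minimal sketch: the helper names are hypothetical, and it handles only the `-Idir` and `-I dir` spellings seen in the log above (not `-isystem` or similar).

```python
import os
import shlex


def include_dirs(compile_command):
    """Extract the -I directories from a compiler command line, in order."""
    tokens = shlex.split(compile_command)
    dirs = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-I" and i + 1 < len(tokens):
            # "-I dir": the directory is the next token
            dirs.append(tokens[i + 1])
            i += 2
            continue
        if tok.startswith("-I") and len(tok) > 2:
            # "-Idir": the directory is glued to the flag
            dirs.append(tok[2:])
        i += 1
    return dirs


def find_header(compile_command, header="sai.h"):
    """Return the include directories that actually contain the header."""
    return [
        d for d in include_dirs(compile_command)
        if os.path.isfile(os.path.join(d, header))
    ]
```

An empty list from `find_header` matches the situation described here: every search path is present on the command line, but none of them holds sai.h.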

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

I'm testing this issue on this branch: #1921
Also, I want to note that none of the builds in lgtm.yml actually takes the 202012 branch into account. Take a look:

git clone https://github.com/Azure/sonic-swss-common; pushd sonic-swss-common; ./autogen.sh; fakeroot dpkg-buildpackage -us -uc -b; popd
git clone --recursive https://github.com/Azure/sonic-sairedis; pushd sonic-sairedis; ./autogen.sh; DEB_BUILD_OPTIONS=nocheck SWSS_COMMON_INC="$LGTM_WORKSPACE/usr/include" SWSS_COMMON_LIB="$LGTM_WORKSPACE/usr/lib/x86_64-linux-gnu" fakeroot debian/rules binary-syncd-vs; popd

this builds master binaries, and probably only swss is compiled on 202012 branch PR

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

output:

[2021-09-21 08:56:54] [build-stderr] ls: cannot access '/opt/work/lgtm-workspace/usr/include/sai': No such file or directory
[2021-09-21 08:56:54] [build-stderr] + true
[2021-09-21 08:56:54] [build-stderr] + ls -al /usr/include/sai
[2021-09-21 08:56:54] [build-stderr] ls: cannot access '/usr/include/sai': No such file or directory
[2021-09-21 08:56:54] [build-stderr] + true
[2021-09-21 08:56:54] [build-stderr] + find / -name sai.h -ls
[2021-09-21 08:59:36] [build-stdout]   2107297     12 -rw-rw-r--   1 semmle-build semmle-build    10107 Sep 21 08:56 /opt/src/sonic-sairedis/SAI/inc/sai.h
[2021-09-21 08:59:36] [build-stdout]   2107282     12 -rw-rw-r--   1 semmle-build semmle-build     8496 Sep 21 08:56 /opt/src/sonic-sairedis/SAI/flexsai/p4/backend/output_stage/SAI_templates/sai.h

The sai directories don't even exist after installing libsaivs-dev, and the headers don't seem to be installed at all, since the only sai.h is in the sairedis source

@nazariig
Copy link
Collaborator Author

@kcudnik -> this builds master binaries, and probably only swss is compiled on 202012 branch PR
Can you please fix it?

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

don't know how to force lgtm to use different commands on different branches :(

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

[2021-09-21 09:26:21] [build-stderr] ++ dpkg-deb -vx libsaivs-dev_1.0.0_amd64.deb /opt/work/lgtm-workspace
[2021-09-21 09:26:21] [build-stdout] ./
[2021-09-21 09:26:21] [build-stdout] ./usr/
[2021-09-21 09:26:21] [build-stdout] ./usr/include/
[2021-09-21 09:26:21] [build-stdout] ./usr/lib/
[2021-09-21 09:26:21] [build-stdout] ./usr/lib/x86_64-linux-gnu/
[2021-09-21 09:26:21] [build-stdout] ./usr/share/
[2021-09-21 09:26:21] [build-stdout] ./usr/share/doc/
[2021-09-21 09:26:21] [build-stdout] ./usr/share/doc/libsaivs-dev/
[2021-09-21 09:26:21] [build-stdout] ./usr/share/doc/libsaivs-dev/changelog.gz
[2021-09-21 09:26:21] [build-stdout] ./usr/lib/x86_64-linux-gnu/libsaivs.so

libsaivs-dev doesn't contain any SAI headers :(
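
The same check can be done mechanically on the extracted file list. This is a small sketch with a hypothetical helper name; the listing is copied from the dpkg-deb output above.

```python
def headers_in_listing(paths, include_prefix="./usr/include/"):
    """Return the header files found under the include prefix in a package listing."""
    return [
        p for p in paths
        if p.startswith(include_prefix) and p.endswith((".h", ".hpp"))
    ]


# File list printed by dpkg-deb for libsaivs-dev_1.0.0_amd64.deb in the log above.
listing = [
    "./",
    "./usr/",
    "./usr/include/",
    "./usr/lib/",
    "./usr/lib/x86_64-linux-gnu/",
    "./usr/share/",
    "./usr/share/doc/",
    "./usr/share/doc/libsaivs-dev/",
    "./usr/share/doc/libsaivs-dev/changelog.gz",
    "./usr/lib/x86_64-linux-gnu/libsaivs.so",
]

print(headers_in_listing(listing))  # []
```

The package ships an empty usr/include/ directory, so the result is an empty list.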

@nazariig
Collaborator Author

@kcudnik nice catch! Seems to be a buildsystem issue

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

[2021-09-21 09:26:10] [build-stderr] + dh build -N syncd -N syncd-dbg -N syncd-rpc -N syncd-rpc-dbg --with autotools-dev
[2021-09-21 09:26:10] [build-stderr] dh: The autotools-dev sequence is deprecated and replaced by dh in debhelper (>= 9.20160115)
[2021-09-21 09:26:10] [build-stderr] dh: This feature will be removed in compat 12.
[2021-09-21 09:26:10] [build-stdout]    dh_testdir -O-Nsyncd -O-Nsyncd-dbg -O-Nsyncd-rpc -O-Nsyncd-rpc-dbg
[2021-09-21 09:26:10] [build-stdout]    dh_update_autotools_config -O-Nsyncd -O-Nsyncd-dbg -O-Nsyncd-rpc -O-Nsyncd-rpc-dbg
[2021-09-21 09:26:10] [build-stdout]    dh_autoreconf -O-Nsyncd -O-Nsyncd-dbg -O-Nsyncd-rpc -O-Nsyncd-rpc-dbg
[2021-09-21 09:26:12] [build-stderr] configure.ac:32: error: AM_COND_IF: no such condition "CODE_COVERAGE_ENABLED"
[2021-09-21 09:26:12] [build-stderr] /usr/share/aclocal-1.16/cond-if.m4:23: AM_COND_IF is expanded from...
[2021-09-21 09:26:12] [build-stderr] /usr/share/aclocal-1.16/cond-if.m4:23: AM_COND_IF is expanded from...
[2021-09-21 09:26:12] [build-stderr] configure.ac:32: the top level

Seems like autogen fails and doesn't even configure/build the required libraries. Not sure how this passes on master.

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

https://lgtm.com/help/lgtm/lgtm.yml-configuration-file does not specify any conditions for branches :(

@nazariig
Collaborator Author

nazariig commented Sep 21, 2021

@kcudnik can you please check with autoconf-archive?

master:

    prepare:
      packages:
      - libxml-simple-perl
      - aspell
      - aspell-en
      - libhiredis-dev
      - libnl-3-dev
      - libnl-genl-3-dev
      - libnl-route-3-dev
      - libnl-nf-3-dev
      - libzmq3-dev
      - libzmq5
      - swig3.0
      - libpython2.7-dev
      - libgtest-dev
      - dh-exec
      - doxygen
      - cdbs
      - bison
      - flex
      - graphviz
      - autoconf-archive

202012:

    prepare:
      packages:
      - libxml-simple-perl
      - aspell
      - aspell-en
      - libhiredis-dev
      - libnl-3-dev
      - libnl-genl-3-dev
      - libnl-route-3-dev
      - libnl-nf-3-dev
      - libzmq3-dev
      - libzmq5
      - swig3.0
      - libpython2.7-dev
      - libgtest-dev
      - dh-exec
      - doxygen
      - graphviz
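
For reference, the difference between the two lists can be computed directly. This is a small sketch with a hypothetical helper name; the package names are copied from the two `prepare` sections above.

```python
MASTER = [
    "libxml-simple-perl", "aspell", "aspell-en", "libhiredis-dev",
    "libnl-3-dev", "libnl-genl-3-dev", "libnl-route-3-dev", "libnl-nf-3-dev",
    "libzmq3-dev", "libzmq5", "swig3.0", "libpython2.7-dev", "libgtest-dev",
    "dh-exec", "doxygen", "cdbs", "bison", "flex", "graphviz",
    "autoconf-archive",
]

BRANCH_202012 = [
    "libxml-simple-perl", "aspell", "aspell-en", "libhiredis-dev",
    "libnl-3-dev", "libnl-genl-3-dev", "libnl-route-3-dev", "libnl-nf-3-dev",
    "libzmq3-dev", "libzmq5", "swig3.0", "libpython2.7-dev", "libgtest-dev",
    "dh-exec", "doxygen", "graphviz",
]


def missing_packages(reference, candidate):
    """Return packages listed in reference but absent from candidate, in reference order."""
    present = set(candidate)
    return [pkg for pkg in reference if pkg not in present]


print(missing_packages(MASTER, BRANCH_202012))
# ['cdbs', 'bison', 'flex', 'autoconf-archive']
```

The 202012 list is a strict subset of the master list, and autoconf-archive is among the four packages it lacks.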

@nazariig
Collaborator Author

@kcudnik it seems that the cherry-pick of #1868 was forgotten

@kcudnik
Contributor

kcudnik commented Sep 21, 2021

sure
