[201911] Warmboot fails on Arista 7050 #5255
abdosi
changed the title
Warmboot fails with 201911 on Arista 7050
[201911] Warmboot fails on Arista 7050
Aug 29, 2020
Looking into fix.
abdosi
added a commit
to abdosi/sonic-buildimage
that referenced
this issue
Sep 4, 2020
sonic-net#5255 Root cause: waiting on restore count != 0 can lead to a race condition between the orchagent process and swssconfig.sh. Ideally, the restore count != 0 check is not needed: the State DB cannot have been flushed, because if it had been, the warm-restart or swss-restart flag would not be true either.
Issue:
Warmboot from a 201911 image to the same image fails on Arista 7050 with the logs below.
Root Cause
The issue is generic and can happen on any platform; it is more of a timing issue. With the new supervisor-based way of starting processes,
swssconfig.sh starts after orchagent enters the running state:
[program:swssconfig]
command=/usr/bin/swssconfig.sh
priority=6
autostart=false
autorestart=unexpected
startretries=0
startsecs=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=orchagent:running
Meanwhile, because of this delay (timing issue), the swssconfig.sh check that warm restart is enabled may run before orchagent has updated the State DB, so the check may not take effect.
In that case APPL_DB is loaded again with the config files (00-copp.config.json ipinip.json ports.json switch.json),
which can cause the errors seen in the logs below.
A possible fix is to add some delay so that orchagent can do its initial processing
and update the State DB.
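if [[ "$SYSTEM_WARM_START" == "true" ]] || [[ "$SWSS_WARM_START" == "true" ]]; then
    # Read orchagent's warm-restart restore counter from the State DB
    RESTORE_COUNT=$(sonic-db-cli STATE_DB hget "WARM_RESTART_TABLE|orchagent" restore_count)
    # Skip loading the config files only if the counter is set and non-zero
    if [[ -n "$RESTORE_COUNT" ]] && [[ "$RESTORE_COUNT" != "0" ]]; then
        exit 0
    fi
fi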
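The "add some delay" idea could be sketched as a bounded poll rather than a fixed sleep. This is only an illustration under stated assumptions, not the change that was merged: the function names `read_restore_count` and `wait_for_restore_count`, the attempt count, and the interval are all hypothetical. Only the `sonic-db-cli STATE_DB hget` call is taken from the snippet above. Note that the commit message referenced in this issue argues for dropping the restore count check altogether instead of waiting on it.

```shell
#!/bin/sh
# Sketch only: both function names below are hypothetical and are not part
# of swssconfig.sh. Written in portable sh so it can be sourced anywhere.

# Read orchagent's restore counter from the State DB (same call as above).
# Kept in its own function so it can be replaced when no database is running.
read_restore_count() {
    sonic-db-cli STATE_DB hget "WARM_RESTART_TABLE|orchagent" restore_count
}

# Poll until restore_count is non-empty and non-zero.
# $1: maximum number of attempts (assumed default: 10)
# $2: seconds to sleep between attempts (assumed default: 1)
# Returns 0 once the counter is set, 1 if the attempts run out.
wait_for_restore_count() {
    wrc_attempts="${1:-10}"
    wrc_interval="${2:-1}"
    wrc_i=0
    while [ "$wrc_i" -lt "$wrc_attempts" ]; do
        # Treat a failed read the same as an empty counter.
        wrc_count="$(read_restore_count 2>/dev/null)" || wrc_count=""
        if [ -n "$wrc_count" ] && [ "$wrc_count" != "0" ]; then
            return 0
        fi
        sleep "$wrc_interval"
        wrc_i=$((wrc_i + 1))
    done
    return 1
}

# Possible use inside the warm-start branch (illustrative only):
#   if [[ "$SYSTEM_WARM_START" == "true" ]] || [[ "$SWSS_WARM_START" == "true" ]]; then
#       wait_for_restore_count 10 1 && exit 0
#   fi
```

A bounded poll keeps the worst-case wait explicit, but it only narrows the race window: if orchagent takes longer than the chosen limit, the config files would still be loaded again.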
Logs
Aug 26 22:06:23.358466 str-a7050-acs-1 ERR swss#orchagent: :- processCoppRule: Failed to apply attribute[2].id=0 to policer for trap group:default, error:-5
Aug 26 22:06:23.358466 str-a7050-acs-1 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.
Aug 26 22:06:23.358642 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_POLICER_ATTR_METER_TYPE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.358693 str-a7050-acs-1 ERR swss#orchagent: :- processCoppRule: Failed to apply attribute[2].id=0 to policer for trap group:trap.group.arp, error:-5
Aug 26 22:06:23.358693 str-a7050-acs-1 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.
Aug 26 22:06:23.359964 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_TUNNEL_ATTR_DECAP_DSCP_MODE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.359964 str-a7050-acs-1 ERR swss#orchagent: :- setTunnelAttribute: Failed to set attribute dscp_mode with value pipe
Aug 26 22:06:23.359964 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: 192.168.0.1 already exists. Did not create entry.
Aug 26 22:06:23.359964 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: 10.1.0.32 already exists. Did not create entry.
Aug 26 22:06:23.359964 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: 10.0.0.56 already exists. Did not create entry.
Aug 26 22:06:23.359964 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: 10.0.0.58 already exists. Did not create entry.
Aug 26 22:06:23.360052 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: 10.0.0.60 already exists. Did not create entry.
Aug 26 22:06:23.360052 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: 10.0.0.62 already exists. Did not create entry.
Aug 26 22:06:23.360066 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_TUNNEL_ATTR_DECAP_ECN_MODE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.360093 str-a7050-acs-1 ERR swss#orchagent: :- setTunnelAttribute: Failed to set attribute ecn_mode with value copy_from_outer
Aug 26 22:06:23.360093 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_TUNNEL_ATTR_DECAP_TTL_MODE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.360940 str-a7050-acs-1 ERR swss#orchagent: :- setTunnelAttribute: Failed to set attribute ttl_mode with value pipe
Aug 26 22:06:23.360940 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_TUNNEL_ATTR_DECAP_DSCP_MODE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.360940 str-a7050-acs-1 ERR swss#orchagent: :- setTunnelAttribute: Failed to set attribute dscp_mode with value pipe
Aug 26 22:06:23.361022 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: fc00::71 already exists. Did not create entry.
Aug 26 22:06:23.361022 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: fc00::75 already exists. Did not create entry.
Aug 26 22:06:23.361039 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: fc00::79 already exists. Did not create entry.
Aug 26 22:06:23.361062 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: fc00::7d already exists. Did not create entry.
Aug 26 22:06:23.361062 str-a7050-acs-1 ERR swss#orchagent: :- addDecapTunnelTermEntries: fc00:1::32 already exists. Did not create entry.
Aug 26 22:06:23.361080 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_TUNNEL_ATTR_DECAP_ECN_MODE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.361092 str-a7050-acs-1 ERR swss#orchagent: :- setTunnelAttribute: Failed to set attribute ecn_mode with value copy_from_outer
Aug 26 22:06:23.361121 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_TUNNEL_ATTR_DECAP_TTL_MODE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.361252 str-a7050-acs-1 ERR swss#orchagent: :- setTunnelAttribute: Failed to set attribute ttl_mode with value pipe
Aug 26 22:06:23.361359 str-a7050-acs-1 NOTICE swss#orchagent: :- processCoppRule: Set trap group trap.group.bgp.lacp to host interface
Aug 26 22:06:23.361406 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_create: attribute key SAI_HOSTIF_TRAP_ATTR_TRAP_TYPE:16387; already exists, can't create
Aug 26 22:06:23.361406 str-a7050-acs-1 ERR swss#orchagent: :- applyAttributesToTrapIds: Failed to create trap 16387, rv:-5
Aug 26 22:06:23.361423 str-a7050-acs-1 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.
Aug 26 22:06:23.361880 str-a7050-acs-1 ERR swss#orchagent: :- meta_generic_validation_set: SAI_POLICER_ATTR_METER_TYPE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Aug 26 22:06:23.361880 str-a7050-acs-1 ERR syncd#syncd: [none] brcm_sai_set_policer_attribute:470 policer create failed with error Operation still running (0xfffffff6).
Aug 26 22:06:23.361880 str-a7050-acs-1 ERR syncd#syncd: :- processEvent: VID: oid:0x120000000007f8 RID: oid:0x41200000002
Aug 26 22:06:23.361880 str-a7050-acs-1 ERR syncd#syncd: :- processEvent: attr: SAI_POLICER_ATTR_CBS: 600
Aug 26 22:06:23.361880 str-a7050-acs-1 ERR syncd#syncd: :- processEvent: failed to execute api: set, key: SAI_OBJECT_TYPE_POLICER:oid:0x120000000007f8, status: SAI_STATUS_OBJECT_IN_USE
Aug 26 22:06:23.361931 str-a7050-acs-1 ERR syncd#syncd: :- syncd_main: Runtime error: :- processEvent: failed to execute api: set, key: SAI_OBJECT_TYPE_POLICER:oid:0x120000000007f8, status: SAI_STATUS_OBJECT_IN_USE
Aug 26 22:06:23.361931 str-a7050-acs-1 NOTICE syncd#syncd: :- notify_OA_about_syncd_exception: sending switch_shutdown_request notification to OA