Sort the system port ID generation in minigraph parser for Chassis #20075

Merged (1 commit) on Oct 3, 2024


judyjoseph
Contributor

@judyjoseph judyjoseph commented Aug 29, 2024

Why I did it

The system_port_id values were generated by the loop below:

for interface in interface_metadata.findall(str(QName(ns1, "DeviceInterfaceMetadata"))):

"DeviceInterfaceMetadata" is defined per interface in the DeviceInfo section of the minigraph. The loop increments system_port_id on each iteration so that every interface gets a unique ID.

The loop iterates over the list that findall() returns for elements matching the tag "DeviceInterfaceMetadata" (lxml.etree._Element). findall() doesn't guarantee document order, so the interface list, and therefore the generated system_port_ids, may not match across the config_db instances on different linecards of a chassis.

When SYSTEM_PORT table entries mismatch across linecards, a few linecards behave erratically: packet error interrupts fire continuously, and IBGP sessions fail to establish with peer ASICs on other linecards.
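
To illustrate the failure mode, here is a minimal sketch (not the parser's actual code) of assigning IDs purely in iteration order. The function name, the starting ID of 1, and the second key are hypothetical; only the key format is taken from the example in this PR.

```python
def assign_ids_in_iteration_order(keys, start_id=1):
    """Give each key the next ID in whatever order the keys arrive.

    start_id=1 is an assumption made for this illustration.
    """
    return {key: start_id + offset for offset, key in enumerate(keys)}


# Two linecards see the same interfaces, but enumerate them in a different order.
lc_a_order = [
    "str-sonic-lc03|ASIC0|Ethernet120",
    "str-sonic-lc03|ASIC0|Ethernet0",  # hypothetical second interface
]
lc_b_order = list(reversed(lc_a_order))

ids_a = assign_ids_in_iteration_order(lc_a_order)
ids_b = assign_ids_in_iteration_order(lc_b_order)

# Keys whose ID differs between the two linecards.
mismatched = sorted(key for key in ids_a if ids_a[key] != ids_b[key])
```

With the orders reversed, both keys end up in `mismatched`, which is the kind of SYSTEM_PORT disagreement described above.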

Work item tracking
  • Microsoft ADO 29260084:

How I did it

Add logic to sort the system_ports dictionary by key (e.g. "str-sonic-lc03|ASIC0|Ethernet120") and assign system_port_id incrementally in that sorted order.

This ensures that the system_port_ids in the SYSTEM_PORT table in config_db match on all linecards and ASICs.
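
The approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged diff: the function name, the starting ID of 1, storing the ID as a string, and the sample keys other than the one quoted above are all hypothetical.

```python
def assign_sorted_system_port_ids(system_ports, start_id=1):
    """Assign system_port_id to each entry in sorted-key order.

    system_ports maps keys such as "str-sonic-lc03|ASIC0|Ethernet120" to a
    dict of attributes. start_id=1 and the string-valued ID are assumptions.
    """
    for offset, key in enumerate(sorted(system_ports)):
        system_ports[key]["system_port_id"] = str(start_id + offset)
    return system_ports


# Same keys, inserted in different orders, as two linecards might see them.
ports_a = {
    "str-sonic-lc03|ASIC0|Ethernet120": {},
    "str-sonic-lc03|ASIC0|Ethernet0": {},
    "str-sonic-lc04|ASIC0|Ethernet0": {},
}
ports_b = {key: {} for key in reversed(list(ports_a))}

result_a = assign_sorted_system_port_ids(ports_a)
result_b = assign_sorted_system_port_ids(ports_b)
```

The sort here is a plain string sort, so "Ethernet120" sorts before "Ethernet4". That is fine for this purpose: the ordering only has to be identical on every linecard, not numerically natural.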

Thanks to @abdosi and @vmittal-msft for triaging and coming up with this solution.

How to verify it

Verified by manually patching this logic into the minigraph parser on a SONiC T2 chassis and confirming that the dockers, interfaces, IBGP, and EBGP come up on all linecards across the chassis.
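
A complementary offline check, sketched here as an assumption rather than something that was run for this PR, is to compare the SYSTEM_PORT tables taken from each linecard's config_db. The helper name and the input shape (linecard name mapped to its SYSTEM_PORT table as a dict) are hypothetical.

```python
def find_system_port_id_mismatches(tables_by_linecard):
    """Return the SYSTEM_PORT keys whose system_port_id differs between linecards.

    tables_by_linecard: {linecard_name: {system_port_key: {"system_port_id": ...}}}
    The result maps each inconsistent key to {linecard_name: system_port_id}.
    """
    seen = {}
    for linecard, table in tables_by_linecard.items():
        for key, entry in table.items():
            seen.setdefault(key, {})[linecard] = entry.get("system_port_id")
    return {
        key: ids for key, ids in seen.items() if len(set(ids.values())) > 1
    }


# Hypothetical sample data: lc04 disagrees with lc03 about one port.
tables = {
    "lc03": {
        "str-sonic-lc03|ASIC0|Ethernet0": {"system_port_id": "1"},
        "str-sonic-lc03|ASIC0|Ethernet120": {"system_port_id": "2"},
    },
    "lc04": {
        "str-sonic-lc03|ASIC0|Ethernet0": {"system_port_id": "1"},
        "str-sonic-lc03|ASIC0|Ethernet120": {"system_port_id": "3"},
    },
}
mismatches = find_system_port_id_mismatches(tables)
```

An empty result means every linecard agrees on every system_port_id it has in common with the others.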

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305


@judyjoseph judyjoseph requested a review from arlakshm August 29, 2024 06:15
@judyjoseph
Contributor Author

Validated on Nokia/Arista VOQ chassis: IBGP/EBGP are up. @arlakshm @rlhui, please review and merge.

@lguohan
Collaborator

lguohan commented Aug 31, 2024

@qiluo-msft, this is the problem we discussed: people keep adding logic to the minigraph parser, and this is going to break your flow. We need to catch such cases. #Resolved

@qiluo-msft
Collaborator

For this one, it is good to merge. We will model the config generation later for this table.


In reply to: 2323018102

@qiluo-msft qiluo-msft merged commit e18cecb into sonic-net:master Oct 3, 2024
23 checks passed
saksarav-nokia pushed a commit to saksarav-nokia/sonic-buildimage that referenced this pull request Oct 9, 2024
sschlafman pushed a commit to sschlafman/sonic-buildimage that referenced this pull request Oct 15, 2024
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Nov 6, 2024
@mssonicbld
Collaborator

Cherry-pick PR to 202405: #20721

mssonicbld pushed a commit that referenced this pull request Nov 14, 2024
aidan-gallagher pushed a commit to aidan-gallagher/sonic-buildimage that referenced this pull request Nov 16, 2024

7 participants