This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

Ping test between ACA containers fail with core dump #216

Closed
kiran1048 opened this issue Feb 18, 2021 · 3 comments · Fixed by #218

Comments

@kiran1048
Contributor

The unit test for running ping between two ACA containers (aca_CHILD and aca_PARENT) fails.
The gtest filter is: DISABLED_2_ports_CREATE_test_traffic_PARENT

The commands run on the child and parent containers are:
./build/tests/aca_tests --gtest_also_run_disabled_tests --gtest_filter=*DISABLED_2_ports_CREATE_test_traffic_CHILD -p <parent_IP>
./build/tests/aca_tests --gtest_also_run_disabled_tests --gtest_filter=*DISABLED_2_ports_CREATE_test_traffic_PARENT -c <child_IP>

The command-line output from the child and parent containers is attached.
child_output.txt
parent_output.txt

@er1cthe0ne er1cthe0ne added bug Something isn't working P0 Drop everything and fix it now labels Feb 18, 2021
@er1cthe0ne
Contributor

There are two problems in the test run, likely after the ARP responder change. @lly00 - can you look into this?

  1. Ping failed after the ARP responder returned success.
  2. Core dump at the end of the parent test (not 100% reproducible); logs below:
    Executing command: ping -I 10.10.1.102 -c1 10.10.1.104
    PING 10.10.1.104 (10.10.1.104) from 10.10.1.102 : 56(84) bytes of data.
    2021-02-18 18:54:03.542: NXT_PACKET_IN2 (OF1.3) (xid=0x0): table_id=classifier cookie=0x0 total_len=46 in_port="patch-int" (via action) data_len=46 (unbuffered)
    arp,dl_vlan=2,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=fa:16:3e:d7:f2:6d,dl_dst=fa:16:3e:d7:f2:6f,arp_spa=10.10.1.102,arp_tpa=10.10.1.104,arp_op=1,arp_sha=fa:16:3e:d7:f2:6d,arp_tha=00:00:00:00:00:00
    00000000 fa 16 3e d7 f2 6f fa 16-3e d7 f2 6d 81 00 00 02
    00000010 08 06 00 01 08 00 06 04-00 01 fa 16 3e d7 f2 6d
    00000020 0a 0a 01 66 00 00 00 00-00 00 0a 0a 01 68
    Source Mac: fa:16:3e:d7:f2:6d
    Ethernet Type: 802.1Q VLAN tagging (0x8100)
    Ethernet Type: ARP (0x0806)
    From: 10.10.1.102
    to: 10.10.1.104
    Receiving arp message from inport=1
    ARP entry does not exist! (ip = 10.10.1.104 and vlan id = 2)
    munmap_chunk(): invalid pointer
    Aborted
    root@2ea2866724fe:/mnt/host/code/aca-dev#
    --- 10.10.1.104 ping statistics ---
    1 packets transmitted, 0 received, 100% packet loss, time 0ms
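
To cross-check the decoded fields in the log above, the 46-byte hex dump can be parsed by hand. The following is only a minimal sketch: the function name `parse_vlan_arp` and the returned dictionary keys are hypothetical and not part of ACA. It assumes a single 802.1Q tag followed by an IPv4-over-Ethernet ARP payload, which is what the dump shows.

```python
import struct

# Frame bytes copied from the three hex-dump rows in the log above.
FRAME = bytes.fromhex(
    "fa 16 3e d7 f2 6f fa 16 3e d7 f2 6d 81 00 00 02"
    "08 06 00 01 08 00 06 04 00 01 fa 16 3e d7 f2 6d"
    "0a 0a 01 66 00 00 00 00 00 00 0a 0a 01 68"
)


def _mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def _ip(raw):
    """Format 4 raw bytes as a dotted-quad IPv4 address."""
    return ".".join(str(b) for b in raw)


def parse_vlan_arp(frame):
    """Parse an 802.1Q-tagged ARP frame (hypothetical helper, not ACA code)."""
    if len(frame) < 46:
        raise ValueError("frame too short for a VLAN-tagged ARP packet")
    dst, src = frame[0:6], frame[6:12]
    # TPID (0x8100), tag control information, then the inner EtherType.
    tpid, tci, ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100 or ethertype != 0x0806:
        raise ValueError("not an 802.1Q-tagged ARP frame")
    # ARP fixed header: hardware type, protocol type, lengths, opcode.
    htype, ptype, hlen, plen, op = struct.unpack("!HHBBH", frame[18:26])
    if (hlen, plen) != (6, 4):
        raise ValueError("only Ethernet/IPv4 ARP is handled in this sketch")
    return {
        "dl_dst": _mac(dst),
        "dl_src": _mac(src),
        "vlan_id": tci & 0x0FFF,  # low 12 bits of the TCI
        "arp_op": op,             # 1 = request, 2 = reply
        "arp_sha": _mac(frame[26:32]),
        "arp_spa": _ip(frame[32:36]),
        "arp_tha": _mac(frame[36:42]),
        "arp_tpa": _ip(frame[42:46]),
    }


if __name__ == "__main__":
    for key, value in parse_vlan_arp(FRAME).items():
        print(f"{key}: {value}")
```

The parsed values agree with the log: an ARP request (`arp_op=1`) on VLAN 2 from 10.10.1.102 asking for 10.10.1.104, with an all-zero target hardware address.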

@luyaoluo
Contributor

luyaoluo commented Mar 2, 2021

@er1cthe0ne
I ran the test in my own environment and it passes. The child_output.txt file above shows that ARP packets were received but the ARP responder did not answer them. This may be what causes the ping tests to fail, but I still cannot figure out how it happens.

As for the core dump: when we receive an ARP request and the corresponding MAC address is not in the cache, the ARP responder resubmits the packet to table 22. The most likely reason for the failure is that this table does not exist.
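
The lookup-then-resubmit behavior described above can be modeled roughly as follows. This is only an illustrative sketch, not the ACA implementation: `handle_arp_request`, `arp_cache`, and `installed_tables` are hypothetical names, and the explicit check for the fallback table is an assumption about how the suspected failure could be guarded against.

```python
ARP_FALLBACK_TABLE = 22  # table number taken from the comment above


def handle_arp_request(arp_cache, installed_tables, target_ip, vlan_id):
    """Decide what to do with an ARP request (hypothetical model).

    arp_cache maps (ip, vlan_id) -> MAC address string.
    installed_tables is the set of flow-table IDs known to exist.
    """
    mac = arp_cache.get((target_ip, vlan_id))
    if mac is not None:
        # Cache hit: the responder can answer directly.
        return ("reply", mac)
    if ARP_FALLBACK_TABLE not in installed_tables:
        # Cache miss and no fallback table: report an error
        # instead of resubmitting to a table that is not there.
        return ("error", f"table {ARP_FALLBACK_TABLE} is not installed")
    # Cache miss: hand the packet to the fallback table.
    return ("resubmit", ARP_FALLBACK_TABLE)


if __name__ == "__main__":
    cache = {("10.10.1.102", 2): "fa:16:3e:d7:f2:6d"}
    # Same lookup that the log reports as missing (ip 10.10.1.104, vlan 2).
    print(handle_arp_request(cache, {0, 22}, "10.10.1.104", 2))
    print(handle_arp_request(cache, {0}, "10.10.1.104", 2))
```

In this model, a cache miss with table 22 absent yields an explicit error rather than a resubmit, which is the case suspected in this issue.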

(My child output is shown below:)
Screenshot from 2021-03-02 20-43-06

@er1cthe0ne
Contributor

@lly00 - got it. Thanks for sharing your output. We can reproduce the success you see in a fresh environment, and we have also reproduced the failures mentioned in this issue. We are investigating now and will update you.
