
Unstable packetdrill tests #26

Closed

matttbe opened this issue May 21, 2020 · 13 comments

Comments

matttbe (Member) commented May 21, 2020

Hello,

I just noticed that in recent builds from my CI, the packetdrill tests were unstable: sometimes OK, sometimes not, and always with the same error:

00:16:27.237 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:16:27.237 script packet:  0.420431 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:16:27.237 actual packet:  0.410428 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 511 ecr 700,dss dack4 3007449509 dsn8 2402075853973337702 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:16:27.237 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:16:27.237 stdout: 
00:16:27.237 stderr: 
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:16:27.237 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:16:27.237 Ran   12 tests:   11 passing,    1 failing,    0 timed out (4.08 sec): mptcp/dss
00:31:41.454 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:31:41.454 script packet:  0.421795 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:31:41.454 actual packet:  0.411773 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 509 ecr 700,dss dack4 3007449509 dsn8 1816876830602220548 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:31:41.454 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:31:41.454 stdout: 
00:31:41.454 stderr: 
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:31:41.454 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:31:41.454 Ran   12 tests:   11 passing,    1 failing,    0 timed out (9.67 sec): mptcp/dss

So far, I have not been able to reproduce it locally.

dcaratti (Contributor) commented:

The numbers in the "script packet" are not realistic, while the actual packet contains meaningful values. I suspect the DSS option parser in the packetdrill code has a bug (see dcaratti/packetdrill#12).

matttbe (Member, Author) commented May 27, 2020

(Just to track the frequency, not to push or bring new info: here is the error from yesterday's build.)

00:17:33.923 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:17:33.923 script packet:  0.421244 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:17:33.923 actual packet:  0.411222 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 511 ecr 700,dss dack4 3007449509 dsn8 14189444129784724388 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:17:33.923 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:17:33.923 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:17:33.923 stdout: 
00:17:33.923 stderr: 
00:17:33.923 Ran   12 tests:   11 passing,    1 failing,    0 timed out (9.67 sec): mptcp/dss

matttbe (Member, Author) commented May 28, 2020

(Just to track the frequency, not to push or bring new info: here is the error from yesterday's build.)

00:19:19.471 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:19:19.471 script packet:  0.420301 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:19:19.471 actual packet:  0.410297 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 509 ecr 700,dss dack4 3007449509 dsn8 12089840182925113239 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:19:19.471 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:19:19.471 stdout: 
00:19:19.471 stderr: 
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:19:19.471 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:19:19.471 Ran   12 tests:   11 passing,    1 failing,    0 timed out (4.08 sec): mptcp/dss

dcaratti (Contributor) commented Jun 1, 2020

@matttbe: while adding support for ADD_ADDR_V1, @malsbat found and fixed a bug that probably has something to do with the reported problem. With commit dcaratti/packetdrill@b3ef5ef I fixed a spot that used htonl() where htonll() was needed in the DSS inbound parser for DSN8 + DACK8, but I omitted to propagate the fix to the inbound parser for DSN8 + DACK4. So we might try a one-line fix that uses htonll() in place of htonl() in mptcp.c at line 1189.
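
For illustration, a minimal sketch (illustrative names only, not the actual packetdrill parser; assumes a little-endian host) of why a 32-bit swap on a 64-bit DSS field produces garbage:

    #include <stdint.h>
    #include <arpa/inet.h>

    /* 64-bit byte swap, like packetdrill's htonll(); little-endian host */
    static uint64_t swap64(uint64_t v)
    {
            return ((uint64_t)htonl(v & 0xffffffffULL) << 32) | htonl(v >> 32);
    }

    /* buggy: htonl() silently truncates the 64-bit wire value to 32 bits */
    static uint64_t parse_dsn8_buggy(uint64_t wire) { return htonl(wire); }

    /* fixed: full 64-bit conversion */
    static uint64_t parse_dsn8_fixed(uint64_t wire) { return swap64(wire); }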

matttbe (Member, Author) commented Jun 5, 2020

@dcaratti

I just got the same error:

00:29:59.056 Error when launching packetdrill
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagB.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagB.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagB_3rd_ack.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH_3rd_ack.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_wrongver.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_wrongver_3rd_ack.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_wrongver_3rd_ack.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_flagB.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_flagH.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_flagH.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_wrongver.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_mp_capable_bind_no_cs.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_mp_capable_bind_no_cs.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_mp_capable_connect_no_cs.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagB.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagB_3rd_ack.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH_3rd_ack.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_wrongver.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_wrongver_3rd_ack.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_flagB.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_wrongver.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_mp_capable_bind_no_cs.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_mp_capable_connect_no_cs.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagB_3rd_ack.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH_3rd_ack.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_flagB.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_wrongver.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_connect_tcpfallback_flagH.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_wrongver.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_mp_capable_connect_no_cs.pkt (ipv4)]
00:29:59.056 Ran   33 tests:   33 passing,    0 failing,    0 timed out (8.07 sec): mptcp/mp_capable
00:29:59.056 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:29:59.056 script packet:  0.420603 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:29:59.056 actual packet:  0.410600 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 511 ecr 700,dss dack4 3007449509 dsn8 15209662116814822082 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:29:59.056 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:29:59.056 stdout: 
00:29:59.056 stderr: 
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:29:59.056 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:29:59.056 Ran   12 tests:   11 passing,    1 failing,    0 timed out (4.08 sec): mptcp/dss

Can we add your fix to https://github.com/multipath-tcp/packetdrill? :)

matttbe (Member, Author) commented Jun 9, 2020

(Just to track the frequency, not to push or bring new info: here is the error from yesterday's build.)

01:22:06.011 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
01:22:06.011 script packet:  0.421686 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
01:22:06.011 actual packet:  0.411663 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 511 ecr 700,dss dack4 3007449509 dsn8 1745351718947042804 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
01:22:06.011 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
01:22:06.011 stdout: 
01:22:06.011 stderr: 
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
01:22:06.011 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
01:22:06.011 Ran   12 tests:   11 passing,    1 failing,    0 timed out (9.68 sec): mptcp/dss

dcaratti (Contributor) commented:

> Can we add your fix to https://github.com/multipath-tcp/packetdrill? :)

Added, but I still see the problem. I fear the time has come to check dcaratti/packetdrill#12 :)

dcaratti (Contributor) commented Jun 11, 2020

We are probably seeing a combination of bugs:

01:22:06.011 script packet: 0.421686 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>

The DACK4 value parsed from the script is 16777216, that is htonl(1). The DACK4 value at line 24 of mpc_with_data_client.pkt is 1, so at least this part is probably unrelated to the binary layout of the DSS option mentioned in dcaratti/packetdrill#12.
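
As a sanity check, a tiny program (assuming a little-endian host) reproduces every "unrealistic" script-side value via the same byte-swap reading:

    #include <stdio.h>
    #include <arpa/inet.h>

    int main(void)
    {
            unsigned long long v = 1001, hi = htonl(v & 0xffffffff);

            printf("%u\n", htonl(1));                    /* 16777216   -> dack4 */
            printf("%u\n", htonl(1001));                 /* 3909287936 -> ssn   */
            printf("%u\n", (unsigned)htons(500));        /* 62465      -> dll   */
            printf("%llu\n", hi << 32 | htonl(v >> 32)); /* 16790263835767341056 -> dsn8 */
            return 0;
    }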

matttbe (Member, Author) commented Jun 15, 2020

@dcaratti: the CI reported this error during the weekend:

00:35:08.620 + ./packetdrill/run_all.py -l -v mptcp/dss
00:35:12.965 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:35:12.966 script packet:  0.420661 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:35:12.969 actual packet:  0.410657 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 511 ecr 700,dss dack4 3007449509 dsn8 11117460021725776823 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:35:12.973 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:35:12.974 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:35:12.975 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:35:12.976 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:35:12.977 stdout: 
00:35:12.978 stderr: 
00:35:12.978 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:35:12.979 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:35:12.980 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:35:12.981 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:35:12.982 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:35:12.983 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:35:12.983 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:35:12.984 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:35:12.985 Ran   12 tests:   11 passing,    1 failing,    0 timed out (4.08 sec): mptcp/dss

So you were right, there is another issue :-)

matttbe (Member, Author) commented Jun 15, 2020

(Just to track the frequency, and maybe to add some details in the numbers, but not to push: here is the error from today's build.)

00:27:51.259 + ./packetdrill/run_all.py -l -v mptcp/dss
00:27:55.613 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:27:55.613 script packet:  0.420405 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:27:55.616 actual packet:  0.410400 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 510 ecr 700,dss dack4 3007449509 dsn8 3882161713089481225 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:27:55.620 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:27:55.621 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:27:55.622 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:27:55.623 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:27:55.624 stdout: 
00:27:55.625 stderr: 
00:27:55.625 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:27:55.626 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:27:55.626 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:27:55.627 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:27:55.628 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:27:55.629 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:27:55.630 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:27:55.631 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:27:55.632 Ran   12 tests:   11 passing,    1 failing,    0 timed out (4.08 sec): mptcp/dss

matttbe (Member, Author) commented Jun 18, 2020

(Just to track the frequency, and maybe to add some details in the numbers, but not to push: here is the error from today's build.)

00:51:34.789 + ./packetdrill/run_all.py -l -v mptcp/dss
00:51:36.618 [  290.854261] ip (4492) used greatest stack depth: 23064 bytes left
00:51:46.666 mpc_with_data_client.pkt:24: error handling packet: MPTCP option mismatch: 30
00:51:46.667 script packet:  0.421903 P. 1001:1501(500) ack 1 <nop,nop,TS val 100 ecr 700,dss dack4 16777216 dsn8 16790263835767341056 ssn 3909287936 dll 62465 no_checksum flags: MmA,nop,nop>
00:51:46.670 actual packet:  0.411877 P. 1001:1501(500) ack 1 win 256 <nop,nop,TS val 511 ecr 700,dss dack4 3007449509 dsn8 1728093687051474061 ssn 1001 dll 500 no_checksum flags: MmA,nop,nop>
00:51:46.675 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4)]
00:51:46.680 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4)]
00:51:46.682 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv4-mapped-v6)]
00:51:46.687 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv4-mapped-v6)]
00:51:46.691 FAIL [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4)]
00:51:46.694 stdout: 
00:51:46.694 stderr: 
00:51:46.694 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv6)]
00:51:46.696 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4-mapped-v6)]
00:51:46.699 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_client.pkt (ipv6)]
00:51:46.700 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_client.pkt (ipv4-mapped-v6)]
00:51:46.702 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv6)]
00:51:46.703 OK   [/opt/packetdrill/gtests/net/mptcp/dss/dss_ssn_specified_server.pkt (ipv6)]
00:51:46.705 OK   [/opt/packetdrill/gtests/net/mptcp/dss/mpc_with_data_server.pkt (ipv4)]
00:51:46.706 Ran   12 tests:   11 passing,    1 failing,    0 timed out (10.70 sec): mptcp/dss

dcaratti (Contributor) commented:

I don't see it anymore after the merge of @malsbat's PR #6. Since the problem is in MP_CAPABLE with data, it can probably be traced back to this commit:

commit a254118e4fd2c2e728144d1ab6122dc7d36a4eef (malsbat/get_mptcp_option)
Author: Todd Malsbary <[email protected]>
Date:   Wed May 27 14:30:03 2020 -0700

    packetdrill: use get_mptcp_option instead of get_tcp_option

@matttbe, can you please verify again?

matttbe (Member, Author) commented Jun 19, 2020

I did a few tests in my VM (100 iterations) and on my CI (10 iterations) and everything is OK!
Good job finding the bug and reviewing the code!

matttbe closed this as completed Jun 19, 2020
jenkins-tessares pushed a commit that referenced this issue Aug 3, 2020
I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.
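
A minimal sketch of the fix pattern (simplified from the actual change in process_dynamic_array_len()):

    /* token still points at the buffer allocated via read_expect_type();
     * release it before read_token() replaces the pointer */
    free_token(token);
    type = read_token(&token);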

Signed-off-by: Philippe Duplessis-Guindon <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Link: https://lore.kernel.org/linux-trace-devel/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Sep 7, 2020
Netpoll can try to poll napi as soon as napi_enable() is called.
It crashes trying to access a doorbell which is still NULL:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 CPU: 59 PID: 6039 Comm: ethtool Kdump: loaded Tainted: G S                5.9.0-rc1-00469-g5fd99b5d9950-dirty #26
 RIP: 0010:bnxt_poll+0x121/0x1c0
 Code: c4 20 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 41 8b 86 a0 01 00 00 41 23 85 18 01 00 00 49 8b 96 a8 01 00 00 0d 00 00 00 24 <89> 02 41 f6 45 77 02 74 cb 49 8b ae d8 01 00 00 31 c0 c7 44 24 1a
  netpoll_poll_dev+0xbd/0x1a0
  __netpoll_send_skb+0x1b2/0x210
  netpoll_send_udp+0x2c9/0x406
  write_ext_msg+0x1d7/0x1f0
  console_unlock+0x23c/0x520
  vprintk_emit+0xe0/0x1d0
  printk+0x58/0x6f
  x86_vector_activate.cold+0xf/0x46
  __irq_domain_activate_irq+0x50/0x80
  __irq_domain_activate_irq+0x32/0x80
  __irq_domain_activate_irq+0x32/0x80
  irq_domain_activate_irq+0x25/0x40
  __setup_irq+0x2d2/0x700
  request_threaded_irq+0xfb/0x160
  __bnxt_open_nic+0x3b1/0x750
  bnxt_open_nic+0x19/0x30
  ethtool_set_channels+0x1ac/0x220
  dev_ethtool+0x11ba/0x2240
  dev_ioctl+0x1cf/0x390
  sock_do_ioctl+0x95/0x130
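
The fix itself is not quoted here, but the usual cure for this class of bug is ordering: make everything the poll handler touches (doorbells included) valid before napi_enable(), since netpoll may call into the handler immediately. A hypothetical sketch, with illustrative names:

    static int example_open(struct net_device *dev)
    {
            struct example_priv *priv = netdev_priv(dev);
            int err;

            err = example_setup_rings(priv);  /* doorbells valid after this */
            if (err)
                    return err;
            err = example_request_irqs(priv);
            if (err)
                    return err;
            napi_enable(&priv->napi);         /* only now may poll run */
            return 0;
    }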

Reported-by: Rob Sherwood <[email protected]>
Fixes: c0c050c ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
matttbe pushed a commit that referenced this issue Jun 19, 2021
[BUG]
When running btrfs/027 with "-o compress" mount option, it always
crashes with the following call trace:

  BTRFS critical (device dm-4): mapping failed logical 298901504 bio len 12288 len 8192
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/volumes.c:6651!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 5 PID: 31089 Comm: kworker/u24:10 Tainted: G           OE     5.13.0-rc2-custom+ #26
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
  RIP: 0010:btrfs_map_bio.cold+0x58/0x5a [btrfs]
  Call Trace:
   btrfs_submit_compressed_write+0x2d7/0x470 [btrfs]
   submit_compressed_extents+0x3b0/0x470 [btrfs]
   ? mark_held_locks+0x49/0x70
   btrfs_work_helper+0x131/0x3e0 [btrfs]
   process_one_work+0x28f/0x5d0
   worker_thread+0x55/0x3c0
   ? process_one_work+0x5d0/0x5d0
   kthread+0x141/0x160
   ? __kthread_bind_mask+0x60/0x60
   ret_from_fork+0x22/0x30
  ---[ end trace 63113a3a91f34e68 ]---

[CAUSE]
The critical message before the crash means we have a bio at logical
bytenr 298901504 length 12288, but only 8192 bytes can fit into one
stripe, the remaining 4096 bytes go to another stripe.

In btrfs, all bios are properly split to avoid cross stripe boundary,
but commit 764c7c9 ("btrfs: zoned: fix parallel compressed writes")
changed the behavior for compressed writes.

Previously if we find our new page can't be fitted into current stripe,
ie. "submit == 1" case, we submit current bio without adding current
page.

       submit = btrfs_bio_fits_in_stripe(page, PAGE_SIZE, bio, 0);

   page->mapping = NULL;
   if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
       PAGE_SIZE) {

But after the modification, we will add the page no matter if it crosses
stripe boundary, leading to the above crash.

       submit = btrfs_bio_fits_in_stripe(page, PAGE_SIZE, bio, 0);

   if (pg_index == 0 && use_append)
           len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
   else
           len = bio_add_page(bio, page, PAGE_SIZE, 0);

   page->mapping = NULL;
   if (submit || len < PAGE_SIZE) {

[FIX]
It's no longer possible to revert to the original code style as we have
two different bio_add_*_page() calls now.

The new fix is to skip the bio_add_*_page() call if @submit is true.

Also, to avoid @len being uninitialized, always initialize it to zero.

If @submit is true, @len will not be checked.
If @submit is not true, @len will be the return value of the
bio_add_*_page() call.
Either way, the behavior is still the same as the old code.
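
Put together, the fixed logic looks roughly like this (a sketch following the description above, not the verbatim patch):

    submit = btrfs_bio_fits_in_stripe(page, PAGE_SIZE, bio, 0);

    len = 0;
    if (!submit) {
            if (pg_index == 0 && use_append)
                    len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
            else
                    len = bio_add_page(bio, page, PAGE_SIZE, 0);
    }

    page->mapping = NULL;
    if (submit || len < PAGE_SIZE) {
            /* submit the current bio and retry the page with a fresh one */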

Reported-by: Josef Bacik <[email protected]>
Fixes: 764c7c9 ("btrfs: zoned: fix parallel compressed writes")
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Jul 24, 2021
To quote Alexey[1]:

    I was adding custom tracepoint to the kernel, grabbed full F34 kernel
    .config, disabled modules and booted whole shebang as VM kernel.

    Then did

	perf record -a -e ...

    It crashed:

	general protection fault, probably for non-canonical address 0x435f5346592e4243: 0000 [#1] SMP PTI
	CPU: 1 PID: 842 Comm: cat Not tainted 5.12.6+ #26
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
	RIP: 0010:t_show+0x22/0xd0

    Then reproducer was narrowed to

	# cat /sys/kernel/tracing/printk_formats

    Original F34 kernel with modules didn't crash.

    So I started to disable options and after disabling AFS everything
    started working again.

    The root cause is that AFS was placing char arrays content into a
    section full of _pointers_ to strings with predictable consequences.

    Non canonical address 435f5346592e4243 is "CB.YFS_" which came from
    CM_NAME macro.

    Steps to reproduce:

	CONFIG_AFS=y
	CONFIG_TRACING=y

	# cat /sys/kernel/tracing/printk_formats

Fix this by the following means:

 (1) Add enum->string translation tables in the event header with the AFS
     and YFS cache/callback manager operations listed by RPC operation ID
     (see the sketch after this list).

 (2) Modify the afs_cb_call tracepoint to print the string from the
     translation table rather than using the string at the afs_call name
     pointer.

 (3) Switch translation table depending on the service we're being accessed
     as (AFS or YFS) in the tracepoint print clause.  Will this cause
     problems to userspace utilities?

     Note that the symbolic representation of the YFS service ID isn't
     available to this header, so I've put it in as a number.  I'm not sure
     if this is the best way to do this.

 (4) Remove the name wrangling (CM_NAME) macro and put the names directly
     into the afs_call_type structs in cmservice.c.
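
The enum-to-string translation in (1) and (2) is typically done with __print_symbolic() in the trace event header; an illustrative sketch (not the actual afs tracepoint, and with the usual TRACE_SYSTEM/#include boilerplate omitted):

    #define example_cb_ops                          \
            { 204, "CB.CallBack" },                 \
            { 205, "CB.InitCallBackState" }

    TRACE_EVENT(example_cb_call,
            TP_PROTO(u32 op),
            TP_ARGS(op),
            TP_STRUCT__entry(__field(u32, op)),
            TP_fast_assign(__entry->op = op;),
            TP_printk("op=%s",
                      __print_symbolic(__entry->op, example_cb_ops))
    );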

Fixes: 8e8d7f1 ("afs: Add some tracepoints")
Reported-by: Alexey Dobriyan (SK hynix) <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Marc Dionne <[email protected]>
cc: Andrew Morton <[email protected]>
cc: [email protected]
Link: https://lore.kernel.org/r/YLAXfvZ+rObEOdc%[email protected]/ [1]
Link: https://lore.kernel.org/r/[email protected]/
Link: https://lore.kernel.org/r/162430903582.2896199.6098150063997983353.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162609463957.3133237.15916579353149746363.stgit@warthog.procyon.org.uk/ # v1 (repost)
Link: https://lore.kernel.org/r/162610726860.3408253.445207609466288531.stgit@warthog.procyon.org.uk/ # v2
matttbe pushed a commit that referenced this issue Aug 2, 2021
The memory reserved by console/PALcode or non-volatile memory is not added
to memblock.memory.

Since commit fa3354e (mm: free_area_init: use maximal zone PFNs rather
than zone sizes) the initialization of the memory map relies on the
accuracy of memblock.memory to properly calculate zone sizes. The holes in
memblock.memory caused by absent regions reserved by the firmware cause
incorrect initialization of struct pages which leads to BUG() during the
initial page freeing:

BUG: Bad page state in process swapper  pfn:2ffc53
page:fffffc000ecf14c0 refcount:0 mapcount:1 mapping:0000000000000000 index:0x0
flags: 0x0()
raw: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
raw: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-03841-gfa3354e4ea39-dirty #26
       fffffc0001b5bd68 fffffc0001b5be80 fffffc00011cd148 fffffc000ecf14c0
       fffffc00019803df fffffc0001b5be80 fffffc00011ce340 fffffc000ecf14c0
       0000000000000000 fffffc0001b5be80 fffffc0001b482c0 fffffc00027d6618
       fffffc00027da7d0 00000000002ff97a 0000000000000000 fffffc0001b5be80
       fffffc00011d1abc fffffc000ecf14c0 fffffc0002d00000 fffffc0001b5be80
       fffffc0001b2350c 0000000000300000 fffffc0001b48298 fffffc0001b482c0
Trace:
[<fffffc00011cd148>] bad_page+0x168/0x1b0
[<fffffc00011ce340>] free_pcp_prepare+0x1e0/0x290
[<fffffc00011d1abc>] free_unref_page+0x2c/0xa0
[<fffffc00014ee5f0>] cmp_ex_sort+0x0/0x30
[<fffffc00014ee5f0>] cmp_ex_sort+0x0/0x30
[<fffffc000101001c>] _stext+0x1c/0x20

Fix this by registering the reserved ranges in memblock.memory.
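
A sketch of the idea (illustrative variables; memblock_add() puts a range in memblock.memory, memblock_reserve() keeps it away from the page allocator):

    /* for each range the firmware reserved but left out of the memory map */
    memblock_add(base, size);
    memblock_reserve(base, size);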

Link: https://lore.kernel.org/lkml/20210726192311.uffqnanxw3ac5wwi@ivybridge
Fixes: fa3354e ("mm: free_area_init: use maximal zone PFNs rather than zone sizes")
Reported-by: Matt Turner <[email protected]>
Cc: <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Matt Turner <[email protected]>
matttbe pushed a commit that referenced this issue Nov 2, 2021
The perf_buffer test fails on systems with offline cpus:

  # test_progs -t perf_buffer
  test_perf_buffer:PASS:nr_cpus 0 nsec
  test_perf_buffer:PASS:nr_on_cpus 0 nsec
  test_perf_buffer:PASS:skel_load 0 nsec
  test_perf_buffer:PASS:attach_kprobe 0 nsec
  test_perf_buffer:PASS:perf_buf__new 0 nsec
  test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #24
  skipping offline CPU #25
  skipping offline CPU #26
  skipping offline CPU #27
  skipping offline CPU #28
  skipping offline CPU #29
  skipping offline CPU #30
  skipping offline CPU #31
  test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  test_perf_buffer:FAIL:buf_cnt got 24, expected 32
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Change the test to check online CPUs instead of possible ones.
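
A self-contained sketch of counting online CPUs by parsing /sys/devices/system/cpu/online (entries look like "0-23,32-47"):

    #include <stdio.h>

    static int nr_online_cpus(void)
    {
            FILE *f = fopen("/sys/devices/system/cpu/online", "r");
            int lo, hi, n = 0;

            if (!f)
                    return -1;
            while (fscanf(f, "%d", &lo) == 1) {
                    hi = lo;
                    fscanf(f, "-%d", &hi); /* optional end of a range */
                    n += hi - lo + 1;
                    if (fgetc(f) != ',')
                            break;
            }
            fclose(f);
            return n;
    }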

Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
jenkins-tessares pushed a commit that referenced this issue Jan 28, 2022
struct rpmsg_ctrldev contains a struct cdev. The current code frees
the rpmsg_ctrldev struct in rpmsg_ctrldev_release_device(), but the
cdev is a managed object, therefore its release is not predictable
and the rpmsg_ctrldev could be freed before the cdev is entirely
released, as in the backtrace below.

[   93.625603] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
[   93.636115] WARNING: CPU: 0 PID: 12 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
[   93.644799] Modules linked in: veth xt_cgroup xt_MASQUERADE rfcomm algif_hash algif_skcipher af_alg uinput ip6table_nat fuse uvcvideo videobuf2_vmalloc venus_enc venus_dec videobuf2_dma_contig hci_uart btandroid btqca snd_soc_rt5682_i2c bluetooth qcom_spmi_temp_alarm snd_soc_rt5682v
[   93.715175] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G    B             5.4.163-lockdep #26
[   93.723855] Hardware name: Google Lazor (rev3 - 8) with LTE (DT)
[   93.730055] Workqueue: events kobject_delayed_cleanup
[   93.735271] pstate: 60c00009 (nZCv daif +PAN +UAO)
[   93.740216] pc : debug_print_object+0x13c/0x1b0
[   93.744890] lr : debug_print_object+0x13c/0x1b0
[   93.749555] sp : ffffffacf5bc7940
[   93.752978] x29: ffffffacf5bc7940 x28: dfffffd000000000
[   93.758448] x27: ffffffacdb11a800 x26: dfffffd000000000
[   93.763916] x25: ffffffd0734f856c x24: dfffffd000000000
[   93.769389] x23: 0000000000000000 x22: ffffffd0733c35b0
[   93.774860] x21: ffffffd0751994a0 x20: ffffffd075ec27c0
[   93.780338] x19: ffffffd075199100 x18: 00000000000276e0
[   93.785814] x17: 0000000000000000 x16: dfffffd000000000
[   93.791291] x15: ffffffffffffffff x14: 6e6968207473696c
[   93.796768] x13: 0000000000000000 x12: ffffffd075e2b000
[   93.802244] x11: 0000000000000001 x10: 0000000000000000
[   93.807723] x9 : d13400dff1921900 x8 : d13400dff1921900
[   93.813200] x7 : 0000000000000000 x6 : 0000000000000000
[   93.818676] x5 : 0000000000000080 x4 : 0000000000000000
[   93.824152] x3 : ffffffd0732a0fa4 x2 : 0000000000000001
[   93.829628] x1 : ffffffacf5bc7580 x0 : 0000000000000061
[   93.835104] Call trace:
[   93.837644]  debug_print_object+0x13c/0x1b0
[   93.841963]  __debug_check_no_obj_freed+0x25c/0x3c0
[   93.846987]  debug_check_no_obj_freed+0x18/0x20
[   93.851669]  slab_free_freelist_hook+0xbc/0x1e4
[   93.856346]  kfree+0xfc/0x2f4
[   93.859416]  rpmsg_ctrldev_release_device+0x78/0xb8
[   93.864445]  device_release+0x84/0x168
[   93.868310]  kobject_cleanup+0x12c/0x298
[   93.872356]  kobject_delayed_cleanup+0x10/0x18
[   93.876948]  process_one_work+0x578/0x92c
[   93.881086]  worker_thread+0x804/0xcf8
[   93.884963]  kthread+0x2a8/0x314
[   93.888303]  ret_from_fork+0x10/0x18

The cdev_device_add/del() API was created to address this issue (see
commit '233ed09d7fda ("chardev: add helper function to register char
devs with a struct device")'), use it instead of cdev add/del().
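
A sketch of the switch (cdev_device_add()/cdev_device_del() tie both lifetimes together; the surrounding structure follows the driver described above):

    /* before: registered and torn down separately, so the cdev can
     * outlive the rpmsg_ctrldev that embeds it */
    cdev_add(&ctrldev->cdev, ctrldev->dev.devt, 1);
    device_add(&ctrldev->dev);

    /* after: the cdev is kept alive until the device is released */
    ret = cdev_device_add(&ctrldev->cdev, &ctrldev->dev);
    ...
    cdev_device_del(&ctrldev->cdev, &ctrldev->dev);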

Fixes: c0cdc19 ("rpmsg: Driver for user space endpoint interface")
Signed-off-by: Sujit Kautkar <[email protected]>
Signed-off-by: Matthias Kaehlcke <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/20220110104706.v6.1.Iaac908f3e3149a89190ce006ba166e2d3fd247a3@changeid
jenkins-tessares pushed a commit that referenced this issue Feb 25, 2022
When bringing down the netdevice or during system shutdown, a panic can
be triggered while accessing the sysfs path because the device has
already been removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.
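
A sketch of the guard (netif_device_present() is the standard check; the surrounding function is illustrative):

    static int example_get_link_ksettings(struct net_device *netdev,
                                          struct ethtool_link_ksettings *ks)
    {
            /* the device may already be detached during shutdown */
            if (!netif_device_present(netdev))
                    return -ENODEV;
            /* ... only now is it safe to issue firmware commands ... */
            return 0;
    }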

Signed-off-by: suresh kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
matttbe pushed a commit that referenced this issue Mar 23, 2022
Add test for percpu btf_type_tag. Similar to the "user" tag, we test
the following cases:

 1. __percpu struct field.
 2. __percpu as function parameter.
 3. per_cpu_ptr() accepts dynamically allocated __percpu memory.

Because the test for "user" and the test for "percpu" are very similar,
a little bit of refactoring has been done in btf_tag.c. Basically, both
tests share the same function for loading vmlinux and module btf.

Example output from log:

 > ./test_progs -v -t btf_tag

 libbpf: prog 'test_percpu1': BPF program load failed: Permission denied
 libbpf: prog 'test_percpu1': -- BEGIN PROG LOAD LOG --
 ...
 ; g = arg->a;
 1: (61) r1 = *(u32 *)(r1 +0)
 R1 is ptr_bpf_testmod_btf_type_tag_1 access percpu memory: off=0
 ...
 test_btf_type_tag_mod_percpu:PASS:btf_type_tag_percpu 0 nsec
 #26/6 btf_tag/btf_type_tag_percpu_mod1:OK

 libbpf: prog 'test_percpu2': BPF program load failed: Permission denied
 libbpf: prog 'test_percpu2': -- BEGIN PROG LOAD LOG --
 ...
 ; g = arg->p->a;
 2: (61) r1 = *(u32 *)(r1 +0)
 R1 is ptr_bpf_testmod_btf_type_tag_1 access percpu memory: off=0
 ...
 test_btf_type_tag_mod_percpu:PASS:btf_type_tag_percpu 0 nsec
 #26/7 btf_tag/btf_type_tag_percpu_mod2:OK

 libbpf: prog 'test_percpu_load': BPF program load failed: Permission denied
 libbpf: prog 'test_percpu_load': -- BEGIN PROG LOAD LOG --
 ...
 ; g = (__u64)cgrp->rstat_cpu->updated_children;
 2: (79) r1 = *(u64 *)(r1 +48)
 R1 is ptr_cgroup_rstat_cpu access percpu memory: off=48
 ...
 test_btf_type_tag_vmlinux_percpu:PASS:btf_type_tag_percpu_load 0 nsec
 #26/8 btf_tag/btf_type_tag_percpu_vmlinux_load:OK

 load_btfs:PASS:could not load vmlinux BTF 0 nsec
 test_btf_type_tag_vmlinux_percpu:PASS:btf_type_tag_percpu 0 nsec
 test_btf_type_tag_vmlinux_percpu:PASS:btf_type_tag_percpu_helper 0 nsec
 #26/9 btf_tag/btf_type_tag_percpu_vmlinux_helper:OK

Signed-off-by: Hao Luo <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
matttbe pushed a commit that referenced this issue Nov 14, 2022
The following panic is observed when bringing up (veth_open) a veth device
that has an XDP program attached.

[   61.519185] kernel BUG at net/core/dev.c:6442!
[   61.519456] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[   61.519752] CPU: 0 PID: 408 Comm: ip Tainted: G        W          6.1.0-rc2-185930-gd9095f92950b-dirty #26
[   61.520288] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   61.520806] RIP: 0010:napi_enable+0x3d/0x40
[   61.521077] Code: f6 f6 80 61 08 00 00 02 74 0d 48 83 bf 88 01 00 00 00 74 03 80 cd 01 48 89 d0 f0 48 0f b1 4f 10 48 39 c2 75 c8 c3 cc cc cc cc <0f> 0b 90 48 8b 87 b0 00 00 00 48 81 c7 b0 00 00 00 45 31 c0 48 39
[   61.522226] RSP: 0018:ffffbc9800cc36f8 EFLAGS: 00010246
[   61.522557] RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000001
[   61.523004] RDX: 0000000000000010 RSI: ffffffff8b0de852 RDI: ffff9f03848e5000
[   61.523452] RBP: 0000000000000000 R08: 0000000000000800 R09: 0000000000000000
[   61.523899] R10: ffff9f0384a96800 R11: ffffffffffa48061 R12: ffff9f03849c3000
[   61.524345] R13: 0000000000000300 R14: ffff9f03848e5000 R15: 0000001000000100
[   61.524792] FS:  00007f58cb64d2c0(0000) GS:ffff9f03bbc00000(0000) knlGS:0000000000000000
[   61.525301] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   61.525673] CR2: 00007f6cc629b498 CR3: 000000010498c000 CR4: 00000000000006f0
[   61.526121] Call Trace:
[   61.526284]  <TASK>
[   61.526425]  __veth_napi_enable_range+0xd6/0x230
[   61.526723]  veth_enable_xdp+0xd0/0x160
[   61.526969]  veth_open+0x2e/0xc0
[   61.527180]  __dev_open+0xe2/0x1b0
[   61.527405]  __dev_change_flags+0x1a1/0x210
[   61.527673]  dev_change_flags+0x1c/0x60

This happens because we are calling veth_napi_enable() on already enabled
queues. The root cause is that commit 2e0de63 changed the control logic,
dropping this case:

        if (priv->_xdp_prog) {
                err = veth_enable_xdp(dev);
                if (err)
                        return err;
-       } else if (veth_gro_requested(dev)) {
+               /* refer to the logic in veth_xdp_set() */
+               if (!rtnl_dereference(peer_rq->napi)) {
+                       err = veth_napi_enable(peer);
+                       if (err)
+                               return err;
+               }

so that now veth_napi_enable() is called if the peer has not yet
initialized its peer_rq->napi. The issue is that this happens
even if the NIC is not up. Then in veth_enable_xdp just above
we have a similar path,

  veth_enable_xdp
   napi_already_on = (dev->flags & IFF_UP) && rcu_access_pointer(rq->napi)
    err = veth_enable_xdp_range(dev, 0, dev->real_num_rx_queues, napi_already_on);

The trouble is that when an xdp prog is assigned before bringing the device
up, each veth_open path will enable the peer's xdp napi structs. But then
when we bring the peer up, it will similarly try to enable them again,
because from veth_open the IFF_UP flag is not set until after the op in
__dev_open, so we believe napi_already_on = false.

To fix this just drop the IFF_UP test and rely on checking if the napi
struct is enabled. This also matches the peer check in veth_xdp for
disabling.
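
In code form, the change amounts to dropping the flag test and trusting the napi pointer (a sketch, not the verbatim patch):

    /* before: wrongly reports "off" while the device is not yet IFF_UP */
    napi_already_on = (dev->flags & IFF_UP) && rcu_access_pointer(rq->napi);

    /* after: the napi pointer alone says whether it is enabled */
    napi_already_on = !!rcu_access_pointer(rq->napi);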

To reproduce, run ./test_xdp_meta.sh; I found this while adding
Cilium/Tetragon tests for XDP.

Fixes: 2e0de63 ("veth: Avoid drop packets when xdp_redirect performs")
Signed-off-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Jan 29, 2023
Driver's probe allocates memory for RX FIFO (port->rx_fifo) based on
default RX FIFO depth, e.g. 16.  Later during serial startup the
qcom_geni_serial_port_setup() updates the RX FIFO depth
(port->rx_fifo_depth) to match real device capabilities, e.g. to 32.

The RX UART handle code will read "port->rx_fifo_depth" number of words
into "port->rx_fifo" buffer, thus exceeding the bounds.  This can be
observed in certain configurations with Qualcomm Bluetooth HCI UART
device and KASAN:

  Bluetooth: hci0: QCA Product ID   :0x00000010
  Bluetooth: hci0: QCA SOC Version  :0x400a0200
  Bluetooth: hci0: QCA ROM Version  :0x00000200
  Bluetooth: hci0: QCA Patch Version:0x00000d2b
  Bluetooth: hci0: QCA controller version 0x02000200
  Bluetooth: hci0: QCA Downloading qca/htbtfw20.tlv
  bluetooth hci0: Direct firmware load for qca/htbtfw20.tlv failed with error -2
  Bluetooth: hci0: QCA Failed to request file: qca/htbtfw20.tlv (-2)
  Bluetooth: hci0: QCA Failed to download patch (-2)
  ==================================================================
  BUG: KASAN: slab-out-of-bounds in handle_rx_uart+0xa8/0x18c
  Write of size 4 at addr ffff279347d578c0 by task swapper/0/0

  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rt5-00350-gb2450b7e00be-dirty #26
  Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
  Call trace:
   dump_backtrace.part.0+0xe0/0xf0
   show_stack+0x18/0x40
   dump_stack_lvl+0x8c/0xb8
   print_report+0x188/0x488
   kasan_report+0xb4/0x100
   __asan_store4+0x80/0xa4
   handle_rx_uart+0xa8/0x18c
   qcom_geni_serial_handle_rx+0x84/0x9c
   qcom_geni_serial_isr+0x24c/0x760
   __handle_irq_event_percpu+0x108/0x500
   handle_irq_event+0x6c/0x110
   handle_fasteoi_irq+0x138/0x2cc
   generic_handle_domain_irq+0x48/0x64

If the RX FIFO depth changes after probe, be sure to resize the buffer.
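
A sketch of the resize along the lines described above (variable names assumed from the report):

    port->rx_fifo = devm_krealloc(uport->dev, port->rx_fifo,
                                  port->rx_fifo_depth * sizeof(u32),
                                  GFP_KERNEL);
    if (!port->rx_fifo)
            return -ENOMEM;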

Fixes: f9d690b ("tty: serial: qcom_geni_serial: Allocate port->rx_fifo buffer in probe")
Cc: <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Jiri Slaby <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
jenkins-tessares pushed a commit that referenced this issue May 26, 2023
The cited commit adds a completion to remove the dependency on the rtnl
lock. But it causes a deadlock for multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3    COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c
 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0    COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, the flow will be added to the encap
entry's flows list. If a neigh update is running at this time, the
following encaps of the flow can't hold the encap_tbl_lock and
sleep. If the neigh update thread is waiting for that flow's init_done,
a deadlock happens.

Fix it by holding the lock outside of the for loop. If a neigh update is
running, prevent encap flows from offloading. Since the lock is held
outside of the for loop, concurrent creation of encap entries is not
allowed. So remove the unnecessary wait_for_completion call for res_ready.
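
Schematically, the locking change looks like this (a sketch with an illustrative helper, not the verbatim patch):

    mutex_lock(&esw->offloads.encap_tbl_lock);  /* held across all encaps */
    for (i = 0; i < num_encaps; i++) {
            err = attach_one_encap(priv, flow, i);
            if (err)
                    break;
    }
    mutex_unlock(&esw->offloads.encap_tbl_lock);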

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Sep 5, 2023
Commit bf5c25d ("skbuff: in skb_segment, call zerocopy functions
once per nskb") added the call to zero copy functions in skb_segment().
The change introduced a bug in skb_segment() because skb_orphan_frags()
may change the number of fragments or allocate new fragments altogether,
leaving nrfrags and frag pointing to the old values. This can
cause a panic with a stacktrace like the one below.

[  193.894380] BUG: kernel NULL pointer dereference, address: 00000000000000bc
[  193.895273] CPU: 13 PID: 18164 Comm: vh-net-17428 Kdump: loaded Tainted: G           O      5.15.123+ #26
[  193.903919] RIP: 0010:skb_segment+0xb0e/0x12f0
[  194.021892] Call Trace:
[  194.027422]  <TASK>
[  194.072861]  tcp_gso_segment+0x107/0x540
[  194.082031]  inet_gso_segment+0x15c/0x3d0
[  194.090783]  skb_mac_gso_segment+0x9f/0x110
[  194.095016]  __skb_gso_segment+0xc1/0x190
[  194.103131]  netem_enqueue+0x290/0xb10 [sch_netem]
[  194.107071]  dev_qdisc_enqueue+0x16/0x70
[  194.110884]  __dev_queue_xmit+0x63b/0xb30
[  194.121670]  bond_start_xmit+0x159/0x380 [bonding]
[  194.128506]  dev_hard_start_xmit+0xc3/0x1e0
[  194.131787]  __dev_queue_xmit+0x8a0/0xb30
[  194.138225]  macvlan_start_xmit+0x4f/0x100 [macvlan]
[  194.141477]  dev_hard_start_xmit+0xc3/0x1e0
[  194.144622]  sch_direct_xmit+0xe3/0x280
[  194.147748]  __dev_queue_xmit+0x54a/0xb30
[  194.154131]  tap_get_user+0x2a8/0x9c0 [tap]
[  194.157358]  tap_sendmsg+0x52/0x8e0 [tap]
[  194.167049]  handle_tx_zerocopy+0x14e/0x4c0 [vhost_net]
[  194.173631]  handle_tx+0xcd/0xe0 [vhost_net]
[  194.176959]  vhost_worker+0x76/0xb0 [vhost]
[  194.183667]  kthread+0x118/0x140
[  194.190358]  ret_from_fork+0x1f/0x30
[  194.193670]  </TASK>

In this case, calling skb_orphan_frags() updated nr_frags, leaving
the nrfrags local variable in skb_segment() stale. As a result, the
code hit i >= nrfrags prematurely and tried to move to the next
frag_skb via the list_skb pointer, which was NULL, causing a kernel
panic. Move the call to the zerocopy functions before frags and
nr_frags are used.
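
A minimal sketch of the reordering, with simplified locals (the real
skb_segment() caches several more fields); the point is only the
order: run the zerocopy copy-out before caching fragment state.

  /* skb_orphan_frags() may copy out userspace pages, growing or
   * replacing the fragment array. */
  if (skb_orphan_frags(frag_skb, GFP_ATOMIC))
          goto err;
  /* Safe to cache now: nr_frags can no longer change under us. */
  nfrags = skb_shinfo(frag_skb)->nr_frags;
  frag = skb_shinfo(frag_skb)->frags;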

Fixes: bf5c25d ("skbuff: in skb_segment, call zerocopy functions once per nskb")
Signed-off-by: Mohamed Khalfella <[email protected]>
Reported-by: Amit Goyal <[email protected]>
Cc: [email protected]
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Sep 18, 2023
The KASAN stack instrumentation, when CONFIG_KASAN_STACK is enabled,
poisons the stack of a function when it is entered and unpoisons it
when leaving. However, in the case of bpf_throw, we never return,
since we switch our stack frame to the BPF exception callback. This
discrepancy later leads to confusing KASAN splats when the kernel
resumes execution on return from the BPF program.

Fix this by unpoisoning everything below the stack pointer of the BPF
program, which should cover the range that would otherwise stay
poisoned. An example splat is below, followed by a sketch of the
unpoisoning approach:

BUG: KASAN: stack-out-of-bounds in stack_trace_consume_entry+0x14e/0x170
Write of size 8 at addr ffffc900013af958 by task test_progs/227

CPU: 0 PID: 227 Comm: test_progs Not tainted 6.5.0-rc2-g43f1c6c9052a-dirty #26
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-2.fc39 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4a/0x80
 print_report+0xcf/0x670
 ? arch_stack_walk+0x79/0x100
 kasan_report+0xda/0x110
 ? stack_trace_consume_entry+0x14e/0x170
 ? stack_trace_consume_entry+0x14e/0x170
 ? __pfx_stack_trace_consume_entry+0x10/0x10
 stack_trace_consume_entry+0x14e/0x170
 ? __sys_bpf+0xf2e/0x41b0
 arch_stack_walk+0x8b/0x100
 ? __sys_bpf+0xf2e/0x41b0
 ? bpf_prog_test_run_skb+0x341/0x1c70
 ? bpf_prog_test_run_skb+0x341/0x1c70
 stack_trace_save+0x9b/0xd0
 ? __pfx_stack_trace_save+0x10/0x10
 ? __kasan_slab_free+0x109/0x180
 ? bpf_prog_test_run_skb+0x341/0x1c70
 ? __sys_bpf+0xf2e/0x41b0
 ? __x64_sys_bpf+0x78/0xc0
 ? do_syscall_64+0x3c/0x90
 ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
 kasan_save_stack+0x33/0x60
 ? kasan_save_stack+0x33/0x60
 ? kasan_set_track+0x25/0x30
 ? kasan_save_free_info+0x2b/0x50
 ? __kasan_slab_free+0x109/0x180
 ? kmem_cache_free+0x191/0x460
 ? bpf_prog_test_run_skb+0x341/0x1c70
 kasan_set_track+0x25/0x30
 kasan_save_free_info+0x2b/0x50
 __kasan_slab_free+0x109/0x180
 kmem_cache_free+0x191/0x460
 bpf_prog_test_run_skb+0x341/0x1c70
 ? __pfx_bpf_prog_test_run_skb+0x10/0x10
 ? __fget_light+0x51/0x220
 __sys_bpf+0xf2e/0x41b0
 ? __might_fault+0xa2/0x170
 ? __pfx___sys_bpf+0x10/0x10
 ? lock_release+0x1de/0x620
 ? __might_fault+0xcd/0x170
 ? __pfx_lock_release+0x10/0x10
 ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
 __x64_sys_bpf+0x78/0xc0
 ? syscall_enter_from_user_mode+0x20/0x50
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f0fbb38880d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d
89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 45 12 00 f7 d8 64
89 01 48
RSP: 002b:00007ffe13907de8 EFLAGS: 00000206 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007ffe13908708 RCX: 00007f0fbb38880d
RDX: 0000000000000050 RSI: 00007ffe13907e20 RDI: 000000000000000a
RBP: 00007ffe13907e00 R08: 0000000000000000 R09: 00007ffe13907e20
R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000003
R13: 0000000000000000 R14: 00007f0fbb532000 R15: 0000000000cfbd90
 </TASK>

The buggy address belongs to stack of task test_progs/227
KASAN internal error: frame info validation failed; invalid marker: 0

The buggy address belongs to the virtual mapping at
 [ffffc900013a8000, ffffc900013b1000) created by:
 kernel_clone+0xcd/0x600

The buggy address belongs to the physical page:
page:00000000b70f4332 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11418f
flags: 0x2fffe0000000000(node=0|zone=2|lastcpupid=0x7fff)
page_type: 0xffffffff()
raw: 02fffe0000000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffffc900013af800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc900013af880: 00 00 00 f1 f1 f1 f1 00 00 00 f3 f3 f3 f3 f3 00
>ffffc900013af900: 00 00 00 00 00 00 00 00 00 00 00 f1 00 00 00 00
                                                    ^
 ffffc900013af980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc900013afa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Disabling lock debugging due to kernel taint
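
A hedged sketch of the unpoisoning idea:
kasan_unpoison_task_stack_below() is an existing KASAN helper (used,
for example, by hibernation); the wrapper function name here is
hypothetical.

  static void bpf_stack_unpoison(unsigned long bpf_stack_ptr)
  {
  #ifdef CONFIG_KASAN_STACK
          /* Everything below the BPF program's stack pointer belongs
           * to frames that will never return normally, so clear the
           * poison KASAN left behind for them. */
          kasan_unpoison_task_stack_below((void *)bpf_stack_ptr);
  #endif
  }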

Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
Acked-by: Andrey Konovalov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Sep 22, 2023
The following processes ran into a deadlock: CPU 41 was waiting for
CPU 29 to handle a CSD request while holding the spinlock
"crashdump_lock", but CPU 29 was stuck spinning on that same lock with
IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations; it
doesn't protect any resource that is touched by an interrupt, so there
is no need to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.
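
A minimal sketch of the conversion, assuming a hypothetical
crashdump_mutex field; the actual patch converts the existing
crashdump_lock in struct megasas_instance.

  static ssize_t fw_crash_state_store_sketch(struct megasas_instance *instance,
                                             const char *buf, size_t count)
  {
          /* was: spin_lock_irqsave(&instance->crashdump_lock, flags) */
          mutex_lock(&instance->crashdump_mutex);
          /* vfree() in here can trigger an IPI-based TLB flush, which
           * deadlocks if another CPU spins on the lock with IRQs off;
           * with a mutex, waiters sleep and keep IRQs enabled. */
          megasas_free_host_crash_buffer(instance);
          mutex_unlock(&instance->crashdump_mutex);
          return count;
  }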

Signed-off-by: Junxiao Bi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Mike Christie <[email protected]>
Cc: [email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
matttbe pushed a commit that referenced this issue Nov 28, 2023
As of commit b92143d ("net: dsa: mv88e6xxx: add infrastructure for
phylink_pcs"), probing of a Marvell 88E6350 switch causes a NULL
pointer dereference, as in this example:

    ...
    mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
    8<--- cut here ---
    Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read
    [00000000] *pgd=00000000
    Internal error: Oops: 5 [#1] ARM
    Modules linked in:
    CPU: 0 PID: 8 Comm: kworker/u2:0 Not tainted 6.7.0-rc2-dirty #26
    Hardware name: Marvell Armada 370/XP (Device Tree)
    Workqueue: events_unbound deferred_probe_work_func
    PC is at mv88e6xxx_port_setup+0x1c/0x44
    LR is at dsa_port_devlink_setup+0x74/0x154
    pc : [<c057ea24>]    lr : [<c0819598>]    psr: a0000013
    sp : c184fce0  ip : c542b8f4  fp : 00000000
    r10: 00000001  r9 : c542a540  r8 : c542bc00
    r7 : c542b838  r6 : c5244580  r5 : 00000005  r4 : c5244580
    r3 : 00000000  r2 : c542b840  r1 : 00000005  r0 : c1a02040
    ...

The Marvell 88E6350 switch has no SERDES interface and therefore has
no corresponding pcs_ops defined for it. But during probing a call is
made to mv88e6xxx_port_setup(), which unconditionally expects pcs_ops
to exist, even though the presence of the pcs_ops->pcs_init function
is optional.

Modify the code to check for pcs_ops first, before checking for and
calling pcs_ops->pcs_init. Apply the same checking to the use of
pcs_ops->pcs_teardown, which could suffer from the same problem.
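
A minimal sketch of the guard, mirroring the fix described above; the
structure and helper names are simplified from the real
mv88e6xxx_port_setup().

  static int port_setup_pcs_sketch(struct mv88e6xxx_chip *chip, int port)
  {
          const struct mv88e6xxx_pcs_ops *pcs_ops = chip->info->ops->pcs_ops;

          /* 88E6350 has no SERDES, hence no pcs_ops: nothing to do. */
          if (pcs_ops && pcs_ops->pcs_init)
                  return pcs_ops->pcs_init(chip, port);

          return 0;
  }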

Fixes: b92143d ("net: dsa: mv88e6xxx: add infrastructure for phylink_pcs")
Signed-off-by: Greg Ungerer <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
matttbe pushed a commit that referenced this issue Dec 19, 2023
With the previous patch, one of the subtests in test_btf_id becomes
flaky and may fail. The following is a failing example:

  Error: #26 btf
  Error: #26/174 btf/BTF ID
    Error: #26/174 btf/BTF ID
    btf_raw_create:PASS:check 0 nsec
    btf_raw_create:PASS:check 0 nsec
    test_btf_id:PASS:check 0 nsec
    ...
    test_btf_id:PASS:check 0 nsec
    test_btf_id:FAIL:check BTF lingersdo_test_get_info:FAIL:check failed: -1

The test tries to prove that a btf_id is no longer available after
the map is closed. But the btf_id is now freed only after a workqueue
run plus an RCU grace period, whereas previously it was freed right
after an RCU grace period. Depending on system workload, the workqueue
could take quite some time to execute bpf_map_free_deferred(), which
may cause the test failure. Instead of adding arbitrary delays, remove
the logic that checks btf_id availability after the map is closed.
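
A hedged sketch of why the check is racy: map teardown is deferred to
a workqueue, so the BTF ID can outlive close() by an arbitrary amount
of time. Names below are simplified from kernel/bpf/syscall.c.

  static void bpf_map_put_sketch(struct bpf_map *map)
  {
          if (atomic64_dec_and_test(&map->refcnt)) {
                  INIT_WORK(&map->work, bpf_map_free_deferred);
                  /* Runs whenever the workqueue gets around to it;
                   * only after this work plus an RCU grace period
                   * does the BTF ID actually disappear. */
                  schedule_work(&map->work);
          }
  }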

Signed-off-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
matttbe pushed a commit that referenced this issue Nov 12, 2024
Syzkaller reported this warning:
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 16 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x1c5/0x1e0
 Modules linked in:
 CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc5 #26
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
 RIP: 0010:inet_sock_destruct+0x1c5/0x1e0
 Code: 24 12 4c 89 e2 5b 48 c7 c7 98 ec bb 82 41 5c e9 d1 18 17 ff 4c 89 e6 5b 48 c7 c7 d0 ec bb 82 41 5c e9 bf 18 17 ff 0f 0b eb 83 <0f> 0b eb 97 0f 0b eb 87 0f 0b e9 68 ff ff ff 66 66 2e 0f 1f 84 00
 RSP: 0018:ffffc9000008bd90 EFLAGS: 00010206
 RAX: 0000000000000300 RBX: ffff88810b172a90 RCX: 0000000000000007
 RDX: 0000000000000002 RSI: 0000000000000300 RDI: ffff88810b172a00
 RBP: ffff88810b172a00 R08: ffff888104273c00 R09: 0000000000100007
 R10: 0000000000020000 R11: 0000000000000006 R12: ffff88810b172a00
 R13: 0000000000000004 R14: 0000000000000000 R15: ffff888237c31f78
 FS:  0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffc63fecac8 CR3: 000000000342e000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  ? __warn+0x88/0x130
  ? inet_sock_destruct+0x1c5/0x1e0
  ? report_bug+0x18e/0x1a0
  ? handle_bug+0x53/0x90
  ? exc_invalid_op+0x18/0x70
  ? asm_exc_invalid_op+0x1a/0x20
  ? inet_sock_destruct+0x1c5/0x1e0
  __sk_destruct+0x2a/0x200
  rcu_do_batch+0x1aa/0x530
  ? rcu_do_batch+0x13b/0x530
  rcu_core+0x159/0x2f0
  handle_softirqs+0xd3/0x2b0
  ? __pfx_smpboot_thread_fn+0x10/0x10
  run_ksoftirqd+0x25/0x30
  smpboot_thread_fn+0xdd/0x1d0
  kthread+0xd3/0x100
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x34/0x50
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>
 ---[ end trace 0000000000000000 ]---

It's possible for two threads to call
tcp_v6_do_rcv()/sk_forward_alloc_add() concurrently when sk->sk_state
== TCP_LISTEN with sk->sk_lock unlocked, which triggers a data race
around sk->sk_forward_alloc:
tcp_v6_rcv
    tcp_v6_do_rcv
        skb_clone_and_charge_r
            sk_rmem_schedule
                __sk_mem_schedule
                    sk_forward_alloc_add()
            skb_set_owner_r
                sk_mem_charge
                    sk_forward_alloc_add()
        __kfree_skb
            skb_release_all
                skb_release_head_state
                    sock_rfree
                        sk_mem_uncharge
                            sk_forward_alloc_add()
                            sk_mem_reclaim
                                // set local var reclaimable
                                __sk_mem_reclaim
                                    sk_forward_alloc_add()

In this syzkaller test case, two threads call tcp_v6_do_rcv() with
skb->truesize=768, and sk_forward_alloc changes like this:
 (cpu 1)             | (cpu 2)             | sk_forward_alloc
 ...                 | ...                 | 0
 __sk_mem_schedule() |                     | +4096 = 4096
                     | __sk_mem_schedule() | +4096 = 8192
 sk_mem_charge()     |                     | -768  = 7424
                     | sk_mem_charge()     | -768  = 6656
 ...                 |    ...              |
 sk_mem_uncharge()   |                     | +768  = 7424
 reclaimable=7424    |                     |
                     | sk_mem_uncharge()   | +768  = 8192
                     | reclaimable=8192    |
 __sk_mem_reclaim()  |                     | -4096 = 4096
                     | __sk_mem_reclaim()  | -8192 = -4096 != 0

skb_clone_and_charge_r() should not be called in tcp_v6_do_rcv() when
sk->sk_state is TCP_LISTEN; the charging happens later, in
tcp_v6_syn_recv_sock(). Fix the same issue in dccp_v6_do_rcv().
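
A minimal sketch of the guard, with heavily simplified surroundings;
the real diff reworks the corresponding branch of tcp_v6_do_rcv().

  static struct sk_buff *charge_unless_listener(struct sock *sk,
                                                struct sk_buff *skb)
  {
          /* A listener is processed without the socket lock, so it
           * must not touch sk_forward_alloc here; the child socket is
           * charged later in tcp_v6_syn_recv_sock(). */
          if (sk->sk_state == TCP_LISTEN)
                  return skb;

          return skb_clone_and_charge_r(skb, sk);
  }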

Suggested-by: Eric Dumazet <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Fixes: e994b2f ("tcp: do not lock listener to process SYN packets")
Signed-off-by: Wang Liang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>