Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BPF: packet scheduler #75

Open
2 of 9 tasks
Tracked by #350
matttbe opened this issue Aug 7, 2020 · 11 comments
Open
2 of 9 tasks
Tracked by #350

BPF: packet scheduler #75

matttbe opened this issue Aug 7, 2020 · 11 comments
Assignees
Labels

Comments

@matttbe
Copy link
Member

matttbe commented Aug 7, 2020

Extending MPTCP with BPF is clearly something we want.

It looks like extending the Upstream MPTCP kernel to allow taking some packet scheduling decisions with BPF will be needed and would be needed in priority to #74.

I think the implementation would be similar to what is done in the kernel with BPF TCP CC: the ability to write a congestion control protocol in BPF with BPF_STRUCT_OPS, see: https://linuxplumbersconf.org/event/7/contributions/687/

Or check these file:

  • BPF "kernelspace": net/ipv4/bpf_tcp_ca.c
  • BPF "userspace": tools/testing/selftests/bpf/progs/bpf_cubic.c

From what I saw, the kernel side is a bit tricky. Here, it looks like this solution with TCP CC is designed like that because adding a new TCP CC is done by adding a new TCP CC kernel module. For BPF TCP CC, this module can be controlled via BPF.

On our side with MPTCP, we currently don't have the ability to create other packet schedulers (or path managers).

  • Maybe a first step would be to add the ability to select different packets schedulers implemented in the kernel.
  • Or maybe we could have the current scheduler having this ability to be controlled via BPF. But in this case, can we easily have both: a single packet scheduler that can do the job with and without a BPF program controlling it?

Issues:

Linked to #350:

jenkins-tessares pushed a commit that referenced this issue Jan 29, 2021
…abled

When booting a kernel which has been built with CONFIG_AMD_MEM_ENCRYPT
enabled as a Xen pv guest a warning is issued for each processor:

[    5.964347] ------------[ cut here ]------------
[    5.968314] WARNING: CPU: 0 PID: 1 at /home/gross/linux/head/arch/x86/xen/enlighten_pv.c:660 get_trap_addr+0x59/0x90
[    5.972321] Modules linked in:
[    5.976313] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.11.0-rc5-default #75
[    5.980313] Hardware name: Dell Inc. OptiPlex 9020/0PC5F7, BIOS A05 12/05/2013
[    5.984313] RIP: e030:get_trap_addr+0x59/0x90
[    5.988313] Code: 42 10 83 f0 01 85 f6 74 04 84 c0 75 1d b8 01 00 00 00 c3 48 3d 00 80 83 82 72 08 48 3d 20 81 83 82 72 0c b8 01 00 00 00 eb db <0f> 0b 31 c0 c3 48 2d 00 80 83 82 48 ba 72 1c c7 71 1c c7 71 1c 48
[    5.992313] RSP: e02b:ffffc90040033d38 EFLAGS: 00010202
[    5.996313] RAX: 0000000000000001 RBX: ffffffff82a141d0 RCX: ffffffff8222ec38
[    6.000312] RDX: ffffffff8222ec38 RSI: 0000000000000005 RDI: ffffc90040033d40
[    6.004313] RBP: ffff8881003984a0 R08: 0000000000000007 R09: ffff888100398000
[    6.008312] R10: 0000000000000007 R11: ffffc90040246000 R12: ffff8884082182a8
[    6.012313] R13: 0000000000000100 R14: 000000000000001d R15: ffff8881003982d0
[    6.016316] FS:  0000000000000000(0000) GS:ffff888408200000(0000) knlGS:0000000000000000
[    6.020313] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.024313] CR2: ffffc900020ef000 CR3: 000000000220a000 CR4: 0000000000050660
[    6.028314] Call Trace:
[    6.032313]  cvt_gate_to_trap.part.7+0x3f/0x90
[    6.036313]  ? asm_exc_double_fault+0x30/0x30
[    6.040313]  xen_convert_trap_info+0x87/0xd0
[    6.044313]  xen_pv_cpu_up+0x17a/0x450
[    6.048313]  bringup_cpu+0x2b/0xc0
[    6.052313]  ? cpus_read_trylock+0x50/0x50
[    6.056313]  cpuhp_invoke_callback+0x80/0x4c0
[    6.060313]  _cpu_up+0xa7/0x140
[    6.064313]  cpu_up+0x98/0xd0
[    6.068313]  bringup_nonboot_cpus+0x4f/0x60
[    6.072313]  smp_init+0x26/0x79
[    6.076313]  kernel_init_freeable+0x103/0x258
[    6.080313]  ? rest_init+0xd0/0xd0
[    6.084313]  kernel_init+0xa/0x110
[    6.088313]  ret_from_fork+0x1f/0x30
[    6.092313] ---[ end trace be9ecf17dceeb4f3 ]---

Reason is that there is no Xen pv trap entry for X86_TRAP_VC.

Fix that by adding a generic trap handler for unknown traps and wire all
unknown bare metal handlers to this generic handler, which will just
crash the system in case such a trap will ever happen.

Fixes: 0786138 ("x86/sev-es: Add a Runtime #VC Exception Handler")
Cc: <[email protected]> # v5.10
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Andrew Cooper <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
@matttbe
Copy link
Member Author

matttbe commented Apr 19, 2021

@geliangtang I just updated the description following our discussion we had.

@geliangtang
Copy link
Member

Round-robin packet scheduler support #194

@geliangtang
Copy link
Member

Hi Matt, I just assigned this issue to myself. I'll dry to implement the Round-robin scheduler using BPF.

@matttbe
Copy link
Member Author

matttbe commented Oct 7, 2021

(PS: I don't know if notifications are sent when I move items in Github Project but just in case: I'm moving all assigned tickets from "Future" to "Next". It doesn't mean it has to be implemented for the next version, just easier for the tracking to generate a changelog ;-) )

jenkins-tessares pushed a commit that referenced this issue Nov 19, 2021
…fails

Check for a valid hv_vp_index array prior to derefencing hv_vp_index when
setting Hyper-V's TSC change callback.  If Hyper-V setup failed in
hyperv_init(), the kernel will still report that it's running under
Hyper-V, but will have silently disabled nearly all functionality.

  BUG: kernel NULL pointer dereference, address: 0000000000000010
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP
  CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc2+ #75
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:set_hv_tscchange_cb+0x15/0xa0
  Code: <8b> 04 82 8b 15 12 17 85 01 48 c1 e0 20 48 0d ee 00 01 00 f6 c6 08
  ...
  Call Trace:
   kvm_arch_init+0x17c/0x280
   kvm_init+0x31/0x330
   vmx_init+0xba/0x13a
   do_one_initcall+0x41/0x1c0
   kernel_init_freeable+0x1f2/0x23b
   kernel_init+0x16/0x120
   ret_from_fork+0x22/0x30

Fixes: 9328626 ("x86/hyperv: Reenlightenment notifications support")
Cc: [email protected]
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wei Liu <[email protected]>
@matttbe
Copy link
Member Author

matttbe commented Sep 8, 2022

Status update:

  • some patches are already in our 'export' branch
  • but still in development, e.g. patches

jenkins-tessares pushed a commit that referenced this issue Sep 1, 2023
With latest clang18, I hit test_progs failures for the following test:

  #13/2    bpf_cookie/multi_kprobe_link_api:FAIL
  #13/3    bpf_cookie/multi_kprobe_attach_api:FAIL
  #13      bpf_cookie:FAIL
  #75      fentry_fexit:FAIL
  #76/1    fentry_test/fentry:FAIL
  #76      fentry_test:FAIL
  #80/1    fexit_test/fexit:FAIL
  #80      fexit_test:FAIL
  #110/1   kprobe_multi_test/skel_api:FAIL
  #110/2   kprobe_multi_test/link_api_addrs:FAIL
  #110/3   kprobe_multi_test/link_api_syms:FAIL
  #110/4   kprobe_multi_test/attach_api_pattern:FAIL
  #110/5   kprobe_multi_test/attach_api_addrs:FAIL
  #110/6   kprobe_multi_test/attach_api_syms:FAIL
  #110     kprobe_multi_test:FAIL

For example, for #13/2, the error messages are:

  [...]
  kprobe_multi_test_run:FAIL:kprobe_test7_result unexpected kprobe_test7_result: actual 0 != expected 1
  [...]
  kprobe_multi_test_run:FAIL:kretprobe_test7_result unexpected kretprobe_test7_result: actual 0 != expected 1

clang17 does not have this issue.

Further investigation shows that kernel func bpf_fentry_test7(), used in
the above tests, is inlined by the compiler although it is marked as
noinline.

  int noinline bpf_fentry_test7(struct bpf_fentry_test_t *arg)
  {
        return (long)arg;
  }

It is known that for simple functions like the above (e.g. just returning
a constant or an input argument), the clang compiler may still do inlining
for a noinline function. Adding 'asm volatile ("")' in the beginning of the
bpf_fentry_test7() can prevent inlining.

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.7) Sep 25, 2023
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.8) Nov 7, 2023
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.9) Jan 8, 2024
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.10) Mar 11, 2024
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.11) May 10, 2024
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.12) Aug 5, 2024
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.13) Sep 26, 2024
@matttbe matttbe moved this to In Progress in MPTCP Upstream: Next (6.14) Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: In Progress
Development

No branches or pull requests

3 participants