Skip to content

Commit

Permalink
bpf, x64: fix JIT emission for dead code
Browse files Browse the repository at this point in the history
Commit 2a5418a ("bpf: improve dead code sanitizing") replaced dead
code with a series of ja-1 instructions, for safety. That made JIT
compilation much more complex for some BPF programs. One instance of such
programs is, for example:

bool flag = false
...
/* A bunch of other code */
...
if (flag)
        do_something()

In some cases llvm is not able to remove at compile time the code for
do_something(), so the generated BPF program ends up with a large amount
of dead instructions. In one specific real life example, there are two
series of ~500 and ~1000 dead instructions in the program. When the
verifier replaces them with a series of ja-1 instructions, it causes an
interesting behavior at JIT time.

During the first pass, since all the instructions are estimated at 64
bytes, the ja-1 instructions end up being translated as 5 bytes JMP
instructions (0xE9), since the jump offsets become increasingly large (>
127) as each instruction gets discovered to be 5 bytes instead of the
estimated 64.

Starting from the second pass, the first N instructions of the ja-1
sequence get translated into 2 bytes JMPs (0xEB) because the jump offsets
become <= 127 this time. In particular, N is defined as roughly 127 / (5
- 2) ~= 42. So, each further pass will make the subsequent N JMP
instructions shrink from 5 to 2 bytes, making the image shrink every time.
This means that in order to have the entire program converge, there need
to be, in the real example above, at least ~1000 / 42 ~= 24 passes just
for translating the dead code. If we add this number to the passes needed
to translate the other non dead code, it brings such program to 40+
passes, and JIT doesn't complete. Ultimately the userspace loader fails
because such BPF program was supposed to be part of a prog array owner
being JITed.

While it is certainly possible to try to refactor such programs to help
the compiler remove dead code, the behavior is not really intuitive and it
puts further burden on the BPF developer who is not expecting such
behavior. To make things worse, such programs are working just fine in all
the kernel releases prior to the ja-1 fix.

A possible approach to mitigate this behavior consists into noticing that
for ja-1 instructions we don't really need to rely on the estimated size
of the previous and current instructions, we know that a -1 BPF jump
offset can be safely translated into a 0xEB instruction with a jump offset
of -2.

Such fix brings the BPF program in the previous example to complete again
in ~9 passes.

Fixes: 2a5418a ("bpf: improve dead code sanitizing")
Signed-off-by: Gianluca Borello <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
  • Loading branch information
gianlucaborello authored and borkmann committed Apr 25, 2018
1 parent 5540fbf commit 1612a98
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion arch/x86/net/bpf_jit_comp.c
Original file line number Diff line number Diff line change
Expand Up @@ -1027,7 +1027,17 @@ xadd: if (is_imm8(insn->off))
break;

case BPF_JMP | BPF_JA:
jmp_offset = addrs[i + insn->off] - addrs[i];
if (insn->off == -1)
/* -1 jmp instructions will always jump
* backwards two bytes. Explicitly handling
* this case avoids wasting too many passes
* when there are long sequences of replaced
* dead code.
*/
jmp_offset = -2;
else
jmp_offset = addrs[i + insn->off] - addrs[i];

if (!jmp_offset)
/* optimize out nop jumps */
break;
Expand Down

0 comments on commit 1612a98

Please sign in to comment.