diff options
author | Daniel Borkmann <daniel@iogearbox.net> | 2018-05-14 23:22:32 +0200 |
---|---|---|
committer | Alexei Starovoitov <ast@kernel.org> | 2018-05-14 19:11:45 -0700 |
commit | 6d2eea6fb0791bf191043a68937c9aa59c5ea678 (patch) | |
tree | 24500a6739e431df2dab4aadc262cac4233a25cb /samples/hidraw | |
parent | 09ece3d0f2ee524b8d1d3c775d9eeb50c40558d8 (diff) | |
download | linux-6d2eea6fb0791bf191043a68937c9aa59c5ea678.tar.gz linux-6d2eea6fb0791bf191043a68937c9aa59c5ea678.tar.bz2 linux-6d2eea6fb0791bf191043a68937c9aa59c5ea678.zip |
bpf, arm64: optimize 32/64 immediate emission
Improve the JIT to emit 64 and 32 bit immediates, the current
algorithm is not optimal and we often emit more instructions
than actually needed. arm64 has movz, movn, movk variants but
for the current 64 bit immediates we only use movz with a
series of movk when needed.
For example loading ffffffffffffabab emits the following 4
instructions in the JIT today:
* movz: abab, shift: 0, result: 000000000000abab
* movk: ffff, shift: 16, result: 00000000ffffabab
* movk: ffff, shift: 32, result: 0000ffffffffabab
* movk: ffff, shift: 48, result: ffffffffffffabab
Whereas after the patch the same load only needs a single
instruction:
* movn: 5454, shift: 0, result: ffffffffffffabab
Another example where two extra instructions can be saved:
* movz: abab, shift: 0, result: 000000000000abab
* movk: 1f2f, shift: 16, result: 000000001f2fabab
* movk: ffff, shift: 32, result: 0000ffff1f2fabab
* movk: ffff, shift: 48, result: ffffffff1f2fabab
After the patch:
* movn: e0d0, shift: 16, result: ffffffff1f2fffff
* movk: abab, shift: 0, result: ffffffff1f2fabab
Another example with movz, before:
* movz: 0000, shift: 0, result: 0000000000000000
* movk: fea0, shift: 32, result: 0000fea000000000
After:
* movz: fea0, shift: 32, result: 0000fea000000000
Moreover, reuse emit_a64_mov_i() for 32 bit immediates that
are loaded via emit_a64_mov_i64() which is a similar optimization
as done in 6fe8b9c1f41d ("bpf, x64: save several bytes by using
mov over movabsq when possible"). On arm64, the latter allows to
use a single instruction with movn due to zero extension where
otherwise two would be needed. And last but not least add a
missing optimization in emit_a64_mov_i() where movn is used but
the subsequent movk not needed. With some of the Cilium programs
in use, this shrinks the needed instructions by about three
percent. Tested on Cavium ThunderX CN8890.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'samples/hidraw')
0 files changed, 0 insertions, 0 deletions