Commit message  Author  Age  Files  Lines
* llc: fix netdevice reference leaks in llc_ui_bind()  Eric Dumazet  2022-03-23  1  -0/+8
|   Whenever llc_ui_bind() and/or llc_ui_autobind() take a reference on a netdevice but subsequently fail, they must properly release that reference or risk the infamous message from unregister_netdevice() at device dismantle:
|
|       unregister_netdevice: waiting for eth0 to become free. Usage count = 3
|
|   Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Reported-by: 赵子轩 <beraphin@gmail.com>
|   Reported-by: Stoyan Manolov <smanolov@suse.de>
|   Link: https://lore.kernel.org/r/20220323004147.1990845-1-eric.dumazet@gmail.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
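The rule stated in this message (every failure path must drop the reference it took) can be sketched in userspace C. This is only an illustration: `mock_netdev`, `mock_dev_hold()`, `mock_dev_put()` and `mock_bind()` are hypothetical names, not the kernel's llc or netdevice API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a net device with a reference count. */
struct mock_netdev {
    int refcnt;
};

static void mock_dev_hold(struct mock_netdev *dev)
{
    assert(dev != NULL);
    dev->refcnt++;
}

static void mock_dev_put(struct mock_netdev *dev)
{
    assert(dev != NULL && dev->refcnt > 0);
    dev->refcnt--;
}

/*
 * Hypothetical bind routine: takes a reference first, then performs a
 * check that may fail. Every failure path drops the reference again.
 */
static int mock_bind(struct mock_netdev *dev, int addr_in_use)
{
    mock_dev_hold(dev);
    if (addr_in_use) {
        mock_dev_put(dev); /* without this put, the reference leaks */
        return -EADDRINUSE;
    }
    return 0; /* success: the bound socket keeps the reference */
}
```

After a failed `mock_bind()` the count is back where it started, so a later teardown would not wait on a stale reference.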
* drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool  Sondhauß, Jan  2022-03-23  1  -4/+2
|   cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync when successful. pm_runtime_get_sync returns a negative error code on failure, 0 on successful resume, and 1 when the device is already active. So the common case for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls to pm_runtime_put in the call chain, so that pm_runtime_put is called one too many times, which leaves the cpsw device suspended.
|
|   The suspended cpsw device leads to an access violation later on in different parts of the cpsw driver.
|
|   Fix this by calling the return-friendly pm_runtime_resume_and_get function.
|
|   Fixes: d43c65b05b84 ("ethtool: runtime-resume netdev parent in ethnl_ops_begin")
|   Signed-off-by: Jan Sondhauss <jan.sondhauss@wago.com>
|   Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com>
|   Link: https://lore.kernel.org/r/20220323084725.65864-1-jan.sondhauss@wago.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
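The return-value difference described in this message can be modeled in a few lines of userspace C. The `mock_*` functions and `struct mock_dev` are hypothetical stand-ins, not the kernel's runtime PM API; they only mimic the convention the message states (1 for "already active" versus 0 for any success), and the mock never fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a runtime-PM managed device. */
struct mock_dev {
    bool active;     /* true once the device has been resumed */
    int usage_count; /* number of outstanding "get" references */
};

/* Mimics the pm_runtime_get_sync() convention: 0 after resuming, 1 if already active. */
static int mock_get_sync(struct mock_dev *dev)
{
    assert(dev != NULL);
    dev->usage_count++;
    if (dev->active)
        return 1;
    dev->active = true;
    return 0;
}

/* Mimics the pm_runtime_resume_and_get() convention: 0 on any success. */
static int mock_resume_and_get(struct mock_dev *dev)
{
    int ret = mock_get_sync(dev);

    return ret > 0 ? 0 : ret; /* fold "already active" into plain success */
}
```

A caller that treats any nonzero return as an error misreads `mock_get_sync()` on an already-active device, but handles `mock_resume_and_get()` correctly.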
* net/sched: fix incorrect vlan_push_eth dest field  Louis Peens  2022-03-23  1  -1/+1
|   Seems like a potential copy-paste bug slipped in here, the second memcpy should of course be populating src and not dest.
|
|   Fixes: ab95465cde23 ("net/sched: add vlan push_eth and pop_eth action to the hardware IR")
|   Signed-off-by: Louis Peens <louis.peens@corigine.com>
|   Link: https://lore.kernel.org/r/20220323092506.21639-1-louis.peens@corigine.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
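A minimal sketch of the copy-paste pattern this message describes, in userspace C. The struct and function names (`push_eth_params`, `copy_push_eth()`) and `MAC_LEN` are hypothetical and do not reproduce the kernel's net/sched code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6 /* length of an Ethernet address in bytes */

/* Hypothetical parameter block holding the addresses to push. */
struct push_eth_params {
    unsigned char dst[MAC_LEN];
    unsigned char src[MAC_LEN];
};

/* Copy both addresses out; each memcpy must target its own buffer. */
static void copy_push_eth(unsigned char *src, unsigned char *dest,
                          const struct push_eth_params *p)
{
    assert(src != NULL && dest != NULL && p != NULL);
    memcpy(dest, p->dst, MAC_LEN);
    memcpy(src, p->src, MAC_LEN); /* a buggy variant would write to dest again */
}
```

With the second `memcpy()` aimed at `dest`, the destination address would be overwritten by the source address and `src` would stay unset, which is the kind of error the one-line fix addresses.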
* net: bridge: mst: Restrict info size queries to bridge ports  Tobias Waldekranz  2022-03-23  1  -1/+1
|   Ensure that no bridge masters are ever considered for MST info dumping. MST states are only supported on bridge ports, not bridge masters - which br_mst_info_size relies on.
|
|   Fixes: 122c29486e1f ("net: bridge: mst: Support setting and reporting MST port states")
|   Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
|   Link: https://lore.kernel.org/r/20220322133001.16181-1-tobias@waldekranz.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init()  Yang Yingliang  2022-03-23  1  -1/+3
|   Add the missing destroy_workqueue() before return from prestera_module_init() in the error handling case.
|
|   Fixes: 4394fbcb78cf ("net: marvell: prestera: handle fib notifications")
|   Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
|   Link: https://lore.kernel.org/r/20220322090236.1439649-1-yangyingliang@huawei.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* drivers: net: xgene: Fix regression in CRC stripping  Stephane Graber  2022-03-23  1  -5/+7
|   All packets on ingress (except for jumbo) are terminated with a 4-byte CRC checksum. It's the responsibility of the driver to strip those 4 bytes. Unfortunately a change dating back to March 2017 re-shuffled some code and made the CRC stripping code effectively dead.
|
|   This change re-orders that part a bit such that the datalen is immediately altered if needed.
|
|   Fixes: 4902a92270fb ("drivers: net: xgene: Add workaround for errata 10GE_8/ENET_11")
|   Cc: stable@vger.kernel.org
|   Signed-off-by: Stephane Graber <stgraber@ubuntu.com>
|   Tested-by: Stephane Graber <stgraber@ubuntu.com>
|   Link: https://lore.kernel.org/r/20220322224205.752795-1-stgraber@ubuntu.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
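The length adjustment this message refers to amounts to simple arithmetic. The helper below is a hypothetical userspace sketch (`payload_len()` and `SKETCH_CRC_LEN` are not xgene driver names); it assumes, as the message states, that non-jumbo frames end in a 4-byte CRC.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_CRC_LEN 4u /* trailing CRC size stated in the commit message */

/* Return the data length with the trailing CRC removed where one is present. */
static unsigned int payload_len(unsigned int datalen, bool is_jumbo)
{
    if (is_jumbo)
        return datalen; /* per the message, jumbo frames are the exception */
    assert(datalen >= SKETCH_CRC_LEN);
    return datalen - SKETCH_CRC_LEN;
}
```

The point of the fix is ordering: the adjusted length has to be computed before anything downstream consumes `datalen`.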
* net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT  Eyal Birger  2022-03-22  1  -0/+3
|   Add missing netlink attribute policy and size calculation. Also enable strict validation from this new attribute onwards.
|
|   Fixes: 435fe1c0c1f7 ("net: geneve: support IPv4/IPv6 as inner protocol")
|   Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
|   Link: https://lore.kernel.org/r/20220322043954.3042468-1-eyal.birger@gmail.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: dsa: fix missing host-filtered multicast addresses  Vladimir Oltean  2022-03-22  1  -10/+34
|   DSA ports are stacked devices, so they use dev_mc_add() to sync their address list to their lower interface (DSA master). But they are also hardware devices, so they program those addresses to hardware using the __dev_mc_add() sync and unsync callbacks.
|
|   Unfortunately both cannot work at the same time, and it seems that the multicast addresses which are already present on the DSA master, like 33:33:00:00:00:01 (added by addrconf.c as in6addr_linklocal_allnodes) are synced to the master via dev_mc_sync(), but not to hardware by __dev_mc_sync().
|
|   This happens because both the dev_mc_sync() -> __hw_addr_sync_one() code path, as well as __dev_mc_sync() -> __hw_addr_sync_dev(), operate on the same variable: ha->sync_cnt, in a way that causes the "sync" method (dsa_slave_sync_mc) to no longer be called.
|
|   To fix the issue we need to work with the API in the way in which it was intended to be used, and therefore, call dev_uc_add() and friends for each individual hardware address, from the sync and unsync callbacks.
|
|   Fixes: 5e8a1e03aa4d ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB")
|   Link: https://lore.kernel.org/netdev/20220321163213.lrn5sk7m6grighbl@skbuf/
|   Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
|   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|   Link: https://lore.kernel.org/r/20220322003701.2056895-1-vladimir.oltean@nxp.com
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net/mlx5e: Fix build warning, detected write beyond size of field  Saeed Mahameed  2022-03-22  2  -2/+6
|   When merged with Linus tree, the cited patch below will cause the following build warning:
|
|       In function 'fortify_memset_chk',
|           inlined from 'mlx5e_xmit_xdp_frame' at drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c:438:3:
|       include/linux/fortify-string.h:242:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
|         242 | __write_overflow_field(p_size_field, size);
|             | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|   Fix that by grouping the fields to memset in struct_group() to avoid the false alarm.
|
|   Fixes: 9ded70fa1d81 ("net/mlx5e: Don't prefill WQEs in XDP SQ in the multi buffer mode")
|   Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
|   Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
|   Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|   Link: https://lore.kernel.org/r/20220322172224.31849-1-saeed@kernel.org
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* iwlwifi: mvm: Don't fail if PPAG isn't supported  Miri Korenblit  2022-03-22  1  -1/+2
|   When we're copying the PPAG table into the cmd structure we're failing if the table doesn't exist in ACPI or is invalid, or if the FW doesn't support PPAG setting etc. This is wrong because those are valid scenarios. Fix this by not failing in those cases.
|
|   Fixes: e8e10a37c51c ("iwlwifi: acpi: move ppag code from mvm to fw/acpi")
|   Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
|   Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
|   Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|   Acked-by: Kalle Valo <kvalo@kernel.org>
|   Link: https://lore.kernel.org/r/iwlwifi.20220322173828.fa47f369b717.I6a9c65149c2c3c11337f3a802dff22f514a3a436@changeid
|   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  Jakub Kicinski  2022-03-22  143  -1092/+7123
|\
| |   Alexei Starovoitov says:
| |
| |   ====================
| |   pull-request: bpf-next 2022-03-21 v2
| |
| |   We've added 137 non-merge commits during the last 17 day(s) which contain a total of 143 files changed, 7123 insertions(+), 1092 deletions(-).
| |
| |   The main changes are:
| |
| |   1) Custom SEC() handling in libbpf, from Andrii.
| |   2) subskeleton support, from Delyan.
| |   3) Use btf_tag to recognize __percpu pointers in the verifier, from Hao.
| |   4) Fix net.core.bpf_jit_harden race, from Hou.
| |   5) Fix bpf_sk_lookup remote_port on big-endian, from Jakub.
| |   6) Introduce fprobe (multi kprobe) _without_ arch bits, from Masami. The arch specific bits will come later.
| |   7) Introduce multi_kprobe bpf programs on top of fprobe, from Jiri.
| |   8) Enable non-atomic allocations in local storage, from Joanne.
| |   9) Various var_off ptr_to_btf_id fixes, from Kumar.
| |   10) bpf_ima_file_hash helper, from Roberto.
| |   11) Add "live packet" mode for XDP in BPF_PROG_RUN, from Toke.
| |
| |   * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (137 commits)
| |     selftests/bpf: Fix kprobe_multi test.
| |     Revert "rethook: x86: Add rethook x86 implementation"
| |     Revert "arm64: rethook: Add arm64 rethook implementation"
| |     Revert "powerpc: Add rethook support"
| |     Revert "ARM: rethook: Add rethook arm implementation"
| |     bpftool: Fix a bug in subskeleton code generation
| |     bpf: Fix bpf_prog_pack when PMU_SIZE is not defined
| |     bpf: Fix bpf_prog_pack for multi-node setup
| |     bpf: Fix warning for cast from restricted gfp_t in verifier
| |     bpf, arm: Fix various typos in comments
| |     libbpf: Close fd in bpf_object__reuse_map
| |     bpftool: Fix print error when show bpf map
| |     bpf: Fix kprobe_multi return probe backtrace
| |     Revert "bpf: Add support to inline bpf_get_func_ip helper on x86"
| |     bpf: Simplify check in btf_parse_hdr()
| |     selftests/bpf/test_lirc_mode2.sh: Exit with proper code
| |     bpf: Check for NULL return from bpf_get_btf_vmlinux
| |     selftests/bpf: Test skipping stacktrace
| |     bpf: Adjust BPF stack helper functions to accommodate skip > 0
| |     bpf: Select proper size for bpf_prog_pack
| |     ...
| |   ====================
| |
| |   Link: https://lore.kernel.org/r/20220322050159.5507-1-alexei.starovoitov@gmail.com
| |   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests/bpf: Fix kprobe_multi test.  Alexei Starovoitov  2022-03-22  1  -1/+3
| |   When the compiler emits an endbr insn, the function address could be different from what bpf_get_func_ip() reports. This is a short term workaround. bpf_get_func_ip() will be fixed later.
| |
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * Revert "rethook: x86: Add rethook x86 implementation"  Alexei Starovoitov  2022-03-22  5  -129/+1
| |   This reverts commit 75caf33eda242e2f34f61e475d666359749ae5ff.
| |
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * Revert "arm64: rethook: Add arm64 rethook implementation"  Alexei Starovoitov  2022-03-22  6  -121/+2
| |   This reverts commit 83acdce6894908337ca82973149d9709d28204d7.
| |
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * Revert "powerpc: Add rethook support"  Alexei Starovoitov  2022-03-22  3  -74/+0
| |   This reverts commit 02752bd99dc2daae05c12f7063bf0632e22b4c1c.
| |
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * Revert "ARM: rethook: Add rethook arm implementation"  Alexei Starovoitov  2022-03-22  5  -113/+2
| |   This reverts commit 515a49173b80a4aabcbad9a4fa2a247042378ea1.
| |
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * bpftool: Fix a bug in subskeleton code generation  Yonghong Song  2022-03-21  1  -1/+0
| |   Compiling both the kernel and selftests/bpf with clang (by adding LLVM=1), I hit the following compilation error:
| |
| |       In file included from /.../tools/testing/selftests/bpf/prog_tests/subskeleton.c:6:
| |       ./test_subskeleton_lib.subskel.h:168:6: error: variable 'err' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
| |               if (!s->progs)
| |                   ^~~~~~~~~
| |       ./test_subskeleton_lib.subskel.h:181:11: note: uninitialized use occurs here
| |               errno = -err;
| |                        ^~~
| |       ./test_subskeleton_lib.subskel.h:168:2: note: remove the 'if' if its condition is always false
| |               if (!s->progs)
| |               ^~~~~~~~~~~~~~
| |
| |   The compilation error is triggered by the following code
| |
| |       ...
| |       int err;
| |
| |       obj = (struct test_subskeleton_lib *)calloc(1, sizeof(*obj));
| |       if (!obj) {
| |               errno = ENOMEM;
| |               goto err;
| |       }
| |       ...
| |
| |       err:
| |       test_subskeleton_lib__destroy(obj);
| |       errno = -err;
| |       ...
| |
| |   in test_subskeleton_lib__open(). The 'err' is not initialized, yet it is used in 'errno = -err' later.
| |
| |   The fix is to remove 'errno = -err' since errno has been set properly in all incoming branches.
| |
| |   Fixes: 00389c58ffe9 ("bpftool: Add support for subskeletons")
| |   Signed-off-by: Yonghong Song <yhs@fb.com>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220320032009.3106133-1-yhs@fb.com
| * bpf: Fix bpf_prog_pack when PMU_SIZE is not defined  Song Liu  2022-03-21  1  -2/+13
| |   PMD_SIZE is not available in some special config, e.g. ARCH=arm with CONFIG_MMU=n. Use bpf_prog_pack of PAGE_SIZE in these cases.
| |
| |   Fixes: ef078600eec2 ("bpf: Select proper size for bpf_prog_pack")
| |   Reported-by: kernel test robot <lkp@intel.com>
| |   Signed-off-by: Song Liu <song@kernel.org>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220321180009.1944482-3-song@kernel.org
| * bpf: Fix bpf_prog_pack for multi-node setup  Song Liu  2022-03-21  1  -2/+7
| |   module_alloc requires num_online_nodes * PMD_SIZE to allocate huge pages. bpf_prog_pack uses pack of size num_online_nodes * PMD_SIZE. OTOH, module_alloc returns addresses that are PMD_SIZE aligned (instead of num_online_nodes * PMD_SIZE aligned). Therefore, PMD_MASK should be used to calculate pack_ptr in bpf_prog_pack_free().
| |
| |   Fixes: ef078600eec2 ("bpf: Select proper size for bpf_prog_pack")
| |   Reported-by: syzbot+c946805b5ce6ab87df0b@syzkaller.appspotmail.com
| |   Signed-off-by: Song Liu <song@kernel.org>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220321180009.1944482-2-song@kernel.org
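The alignment argument in this message can be checked with plain arithmetic. The sketch below is not the kernel's bpf_prog_pack code: `align_down()` is a hypothetical helper, `SKETCH_PMD_SIZE` assumes a 2 MiB PMD (as on x86-64), and the example only covers an address inside the first PMD_SIZE chunk of a two-node pack.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PMD_SIZE ((uintptr_t)2 << 20) /* assumed 2 MiB, as on x86-64 */

/* Round addr down to a multiple of size; size must be a power of two. */
static uintptr_t align_down(uintptr_t addr, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return addr & ~(size - 1);
}
```

Take a pack of size `2 * SKETCH_PMD_SIZE` whose base is `3 * SKETCH_PMD_SIZE`, which is PMD-aligned but not pack-size aligned. For an address 4096 bytes into that pack, `align_down(addr, 2 * SKETCH_PMD_SIZE)` yields `2 * SKETCH_PMD_SIZE`, which is not the base, while `align_down(addr, SKETCH_PMD_SIZE)` recovers it.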
| * bpf: Fix warning for cast from restricted gfp_t in verifier  Joanne Koong  2022-03-21  1  -2/+2
| |   This fixes the sparse warning reported by the kernel test robot:
| |
| |       kernel/bpf/verifier.c:13499:47: sparse: warning: cast from restricted gfp_t
| |       kernel/bpf/verifier.c:13501:47: sparse: warning: cast from restricted gfp_t
| |
| |   This fix can be verified locally by running:
| |
| |   1) wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O make.cross
| |   2) chmod +x ~/bin/make.cross
| |   3) COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 ./make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
| |
| |   Fixes: b00fa38a9c1c ("bpf: Enable non-atomic allocations in local storage")
| |   Reported-by: kernel test robot <lkp@intel.com>
| |   Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220321185802.824223-1-joannekoong@fb.com
| * bpf, arm: Fix various typos in comments  Julia Lawall  2022-03-21  1  -2/+2
| |   Various spelling mistakes in comments. Detected with the help of Coccinelle.
| |
| |   Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Link: https://lore.kernel.org/bpf/20220318103729.157574-9-Julia.Lawall@inria.fr
| * libbpf: Close fd in bpf_object__reuse_map  Hengqi Chen  2022-03-21  1  -1/+1
| |   pin_fd is dup-ed and assigned in bpf_map__reuse_fd. Close it in bpf_object__reuse_map after reuse.
| |
| |   Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Link: https://lore.kernel.org/bpf/20220319030533.3132250-1-hengqi.chen@gmail.com
| * bpftool: Fix print error when show bpf map  Yafang Shao  2022-03-21  1  -5/+2
| |   If there is no btf_id or frozen, it will not show the pids, but the pids don't depend on any one of them.
| |
| |   Below is the result after this change:
| |
| |       $ ./bpftool map show
| |       2: lpm_trie flags 0x1
| |               key 8B value 8B max_entries 1 memlock 4096B
| |               pids systemd(1)
| |       3: lpm_trie flags 0x1
| |               key 20B value 8B max_entries 1 memlock 4096B
| |               pids systemd(1)
| |
| |   While before this change, the 'pids systemd(1)' can't be displayed.
| |
| |   Fixes: 9330986c0300 ("bpf: Add bloom filter map implementation")
| |   Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Link: https://lore.kernel.org/bpf/20220320060815.7716-1-laoar.shao@gmail.com
| * bpf: Fix kprobe_multi return probe backtrace  Jiri Olsa  2022-03-21  1  -30/+37
| |   Andrii reported that backtraces from kprobe_multi programs attached as return probes are not complete and show just the initial entry [1].
| |
| |   It's caused by changing registers to have the original function ip address as instruction pointer even for return probes, which breaks the backtrace from a return probe.
| |
| |   This change keeps registers intact and stores the original entry ip and link address on the stack in the bpf_kprobe_multi_run_ctx struct, where the bpf_get_func_ip and bpf_get_attach_cookie helpers for kprobe_multi programs can find it.
| |
| |   [1] https://lore.kernel.org/bpf/CAEf4BzZDDqK24rSKwXNp7XL3ErGD4bZa1M6c_c4EvDSt3jrZcg@mail.gmail.com/T/#m8d1301c0ea0892ddf9dc6fba57a57b8cf11b8c51
| |
| |   Fixes: ca74823c6e16 ("bpf: Add cookie support to programs attached with kprobe multi link")
| |   Reported-by: Andrii Nakryiko <andrii@kernel.org>
| |   Signed-off-by: Jiri Olsa <jolsa@kernel.org>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Acked-by: Andrii Nakryiko <andrii@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220321070113.1449167-3-jolsa@kernel.org
| * Revert "bpf: Add support to inline bpf_get_func_ip helper on x86"  Jiri Olsa  2022-03-21  2  -21/+1
| |   This reverts commit 97ee4d20ee67eb462581a7af01442de6586e390b.
| |
| |   The following change adds more complexity to the bpf_get_func_ip helper for kprobe_multi programs, which can't be inlined easily.
| |
| |   Signed-off-by: Jiri Olsa <jolsa@kernel.org>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Link: https://lore.kernel.org/bpf/20220321070113.1449167-2-jolsa@kernel.org
| * bpf: Simplify check in btf_parse_hdr()  Yuntao Wang  2022-03-21  1  -2/+1
| |   Replace offsetof(hdr_len) + sizeof(hdr_len) with offsetofend(hdr_len) to simplify the check for correctness of btf_data_size in btf_parse_hdr()
| |
| |   Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Link: https://lore.kernel.org/bpf/20220320075240.1001728-1-ytcoode@gmail.com
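The equivalence behind this simplification can be shown in userspace C. The macro below is a local re-definition written for this sketch, and `struct sketch_hdr` is a hypothetical struct that only mirrors the leading fields of a BTF header; `covers_hdr_len()` is likewise an illustrative name, not the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local userspace definition: offset of the first byte past MEMBER. */
#define offsetofend(TYPE, MEMBER) \
    (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical header layout used only for this sketch. */
struct sketch_hdr {
    uint16_t magic;
    uint8_t version;
    uint8_t flags;
    uint32_t hdr_len;
};

/* Both spellings name the same offset, checked at compile time. */
static_assert(offsetofend(struct sketch_hdr, hdr_len) ==
              offsetof(struct sketch_hdr, hdr_len) + sizeof(uint32_t),
              "offsetofend must equal offsetof + sizeof");

/* Return nonzero if data_size is large enough to contain hdr_len. */
static int covers_hdr_len(size_t data_size)
{
    return data_size >= offsetofend(struct sketch_hdr, hdr_len);
}
```

The single-macro form states the intent ("is the buffer long enough to include this field?") without repeating the member name twice.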
| * selftests/bpf/test_lirc_mode2.sh: Exit with proper code  Hangbin Liu  2022-03-21  1  -1/+4
| |   When executing test_lirc_mode2_user failed, the test reported the failure but still exited with 0. Fix it by exiting with an error code.
| |
| |   Another issue is the LIRCDEV check. With bash -n, we need to quote the variable, or the test will always be true. So if test_lirc_mode2_user was not run, just exit with the skip code.
| |
| |   Fixes: 6bdd533cee9a ("bpf: add selftest for lirc_mode2 type program")
| |   Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
| |   Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| |   Link: https://lore.kernel.org/bpf/20220321024149.157861-1-liuhangbin@gmail.com
| * bpf: Check for NULL return from bpf_get_btf_vmlinux  Kumar Kartikeya Dwivedi  2022-03-20  2  -1/+7
| |   When CONFIG_DEBUG_INFO_BTF is disabled, bpf_get_btf_vmlinux can return a NULL pointer. Check for it in btf_get_module_btf to prevent a NULL pointer dereference.
| |
| |   While kernel test robot only complained about this specific case, let's also check for NULL in other call sites of bpf_get_btf_vmlinux.
| |
| |   Fixes: 9492450fd287 ("bpf: Always raise reference in btf_get_module_btf")
| |   Reported-by: kernel test robot <oliver.sang@intel.com>
| |   Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220320143003.589540-1-memxor@gmail.com
| * selftests/bpf: Test skipping stacktrace  Namhyung Kim  2022-03-20  2  -0/+131
| |   Add a test case for stacktrace with skip > 0 using a small sized buffer. It didn't support skipping entries greater than or equal to the size of buffer and filled the skipped part with 0.
| |
| |   Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Acked-by: Yonghong Song <yhs@fb.com>
| |   Link: https://lore.kernel.org/bpf/20220314182042.71025-2-namhyung@kernel.org
| * bpf: Adjust BPF stack helper functions to accommodate skip > 0  Namhyung Kim  2022-03-20  2  -36/+28
| |   Let's say that the caller has storage for num_elem stack frames. Then, the BPF stack helper functions walk the stack for only num_elem frames. This means that if skip > 0, one keeps only 'num_elem - skip' frames.
| |
| |   This is because it sets init_nr in the perf_callchain_entry to the end of the buffer to save num_elem entries only. I believe it was because the perf callchain code unwound the stack frames until it reached the global max size (sysctl_perf_event_max_stack).
| |
| |   However it now has perf_callchain_entry_ctx.max_stack to limit the iteration locally. This simplifies the code to handle init_nr in the BPF callstack entries and removes the confusion with the perf_event's __PERF_SAMPLE_CALLCHAIN_EARLY which sets init_nr to 0.
| |
| |   Also change the comment on bpf_get_stack() in the header file to be more explicit what the return value means.
| |
| |   Fixes: c195651e565a ("bpf: add bpf_get_stack helper")
| |   Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Acked-by: Yonghong Song <yhs@fb.com>
| |   Link: https://lore.kernel.org/bpf/30a7b5d5-6726-1cc2-eaee-8da2828a9a9c@oracle.com
| |   Link: https://lore.kernel.org/bpf/20220314182042.71025-1-namhyung@kernel.org
| |   Based-on-patch-by: Eugene Loh <eugene.loh@oracle.com>
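The frame-count arithmetic in this message can be modeled with two small functions. This is a hypothetical userspace model, not the kernel helpers: `frames_kept_old()` follows the behavior the message describes, while `frames_kept_new()` assumes the adjusted helpers walk up to `skip + num_elem` frames, which the message implies but does not spell out.

```c
#include <assert.h>
#include <limits.h>

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Old behavior as described: walk num_elem frames, then drop the first skip. */
static unsigned int frames_kept_old(unsigned int depth, unsigned int num_elem,
                                    unsigned int skip)
{
    unsigned int walked = min_uint(depth, num_elem);

    return walked > skip ? walked - skip : 0;
}

/* Assumed new behavior: walk skip + num_elem frames, then drop the first skip. */
static unsigned int frames_kept_new(unsigned int depth, unsigned int num_elem,
                                    unsigned int skip)
{
    unsigned int walked;

    assert(skip <= UINT_MAX - num_elem); /* keep skip + num_elem from wrapping */
    walked = min_uint(depth, skip + num_elem);
    return walked > skip ? walked - skip : 0;
}
```

For a stack 20 frames deep, a buffer of 8 entries and skip = 3, the old model keeps 5 frames and the new one fills all 8. With skip equal to the buffer size, the old model keeps nothing.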
| * bpf: Select proper size for bpf_prog_pack  Song Liu  2022-03-20  1  -23/+47
| |   Using HPAGE_PMD_SIZE as the size for bpf_prog_pack is not ideal in some cases. Specifically, for NUMA systems, __vmalloc_node_range requires PMD_SIZE * num_online_nodes() to allocate huge pages. Also, if the system does not support huge pages (i.e., with cmdline option nohugevmalloc), it is better to use PAGE_SIZE packs.
| |
| |   Add logic to select proper size for bpf_prog_pack. This solution is not ideal, as it makes assumptions about the behavior of module_alloc and __vmalloc_node_range. However, it appears to be the easiest solution as it doesn't require changes in module_alloc and vmalloc code.
| |
| |   Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator")
| |   Signed-off-by: Song Liu <song@kernel.org>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Link: https://lore.kernel.org/bpf/20220311201135.3573610-1-song@kernel.org
| * Merge branch 'Make 2-byte access to bpf_sk_lookup->remote_port endian-agnostic'  Alexei Starovoitov  2022-03-20  2  -6/+27
| |\
| | |   Jakub Sitnicki says:
| | |
| | |   ====================
| | |   This patch set is a result of a discussion we had around the RFC patchset from Ilya [1]. The fix for the narrow loads from the RFC series is still relevant, but this series does not depend on it. Nor is it required to unbreak sk_lookup tests on BE, if this series gets applied.
| | |
| | |   To summarize the takeaways from [1]:
| | |
| | |   1) we want to make 2-byte load from ctx->remote_port portable across LE and BE,
| | |   2) we keep the 4-byte load from ctx->remote_port as it is today - result varies on endianness of the platform.
| | |
| | |   [1] https://lore.kernel.org/bpf/20220222182559.2865596-2-iii@linux.ibm.com/
| | |
| | |   v1 -> v2:
| | |   - Remove needless check that 4-byte load is from &ctx->remote_port offset (Martin)
| | |
| | |   [v1]: https://lore.kernel.org/bpf/20220317165826.1099418-1-jakub@cloudflare.com/
| | |   ====================
| | |
| | |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * selftests/bpf: Fix test for 4-byte load from remote_port on big-endian  Jakub Sitnicki  2022-03-20  1  -2/+8
| | |   The context access converter rewrites the 4-byte load from bpf_sk_lookup->remote_port to a 2-byte load from bpf_sk_lookup_kern structure.
| | |
| | |   It means that we cannot treat the destination register contents as a 32-bit value, or the code will not be portable across big- and little-endian architectures.
| | |
| | |   This is exactly the same case as with 4-byte loads from bpf_sock->dst_port so follow the approach outlined in [1] and treat the register contents as a 16-bit value in the test.
| | |
| | |   [1]: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com/
| | |
| | |   Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup")
| | |   Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| | |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | |   Acked-by: Martin KaFai Lau <kafai@fb.com>
| | |   Link: https://lore.kernel.org/bpf/20220319183356.233666-4-jakub@cloudflare.com
| | * selftests/bpf: Fix u8 narrow load checks for bpf_sk_lookup remote_port  Jakub Sitnicki  2022-03-20  1  -2/+1
| | |   In commit 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide") ->remote_port field changed from __u32 to __be16.
| | |
| | |   However, narrow load tests which exercise 1-byte sized loads from offsetof(struct bpf_sk_lookup, remote_port) were not adapted to reflect the change.
| | |
| | |   As a result, on little-endian we continue testing loads from addresses:
| | |
| | |    - (__u8 *)&ctx->remote_port + 3
| | |    - (__u8 *)&ctx->remote_port + 4
| | |
| | |   which map to the zero padding following the remote_port field, and don't break the tests because there is no observable change.
| | |
| | |   While on big-endian, we observe breakage because tests expect to see zeros for values loaded from:
| | |
| | |    - (__u8 *)&ctx->remote_port - 1
| | |    - (__u8 *)&ctx->remote_port - 2
| | |
| | |   Above addresses map to ->remote_ip6 field, which precedes ->remote_port, and are populated during the bpf_sk_lookup IPv6 tests.
| | |
| | |   Unsurprisingly, on s390x we observe:
| | |
| | |       #136/38 sk_lookup/narrow access to ctx v4:OK
| | |       #136/39 sk_lookup/narrow access to ctx v6:FAIL
| | |
| | |   Fix it by removing the checks for 1-byte loads from offsets outside of the ->remote_port field.
| | |
| | |   Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
| | |   Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
| | |   Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| | |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | |   Acked-by: Martin KaFai Lau <kafai@fb.com>
| | |   Link: https://lore.kernel.org/bpf/20220319183356.233666-3-jakub@cloudflare.com
| | * bpf: Treat bpf_sk_lookup remote_port as a 2-byte field  Jakub Sitnicki  2022-03-20  1  -2/+18
| |/
| |   In commit 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide") the remote_port field has been split up and re-declared from u32 to be16.
| |
| |   However, the accompanying changes to the context access converter have not been well thought through when it comes to big-endian platforms.
| |
| |   Today 2-byte wide loads from offsetof(struct bpf_sk_lookup, remote_port) are handled as narrow loads from a 4-byte wide field.
| |
| |   This by itself is not enough to create a problem, but when we combine
| |
| |    1. 32-bit wide access to ->remote_port backed by a 16-bit wide load, with
| |    2. inherent difference between little- and big-endian in how narrow loads have to be handled (see bpf_ctx_narrow_access_offset),
| |
| |   we get inconsistent results for 2-byte loads from &ctx->remote_port on LE and BE architectures.
| |
| |   This in turn makes BPF C code for the common case of 2-byte load from ctx->remote_port not portable.
| |
| |   To rectify it, inform the context access converter that remote_port is a 2-byte wide field, and only 1-byte loads need to be treated as narrow loads.
| |
| |   At the same time, we special-case the 4-byte load from &ctx->remote_port to continue handling it the same way as we do today, in order to keep the existing BPF programs working.
| |
| |   Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
| |   Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Acked-by: Martin KaFai Lau <kafai@fb.com>
| |   Link: https://lore.kernel.org/bpf/20220319183356.233666-2-jakub@cloudflare.com
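Why a 2-byte load is portable while a 4-byte load is not can be seen in a small userspace model. It does not touch the BPF context or the access converter: the 4-byte `slot` layout (a big-endian 16-bit port followed by two zero bytes) and the function names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint32_t) == 4, "the slot is modeled as four bytes");

/* Hypothetical 4-byte slot: a big-endian 16-bit port followed by zero padding. */
static void fill_slot(unsigned char slot[4], uint16_t port)
{
    slot[0] = (unsigned char)(port >> 8); /* network byte order: high byte first */
    slot[1] = (unsigned char)(port & 0xff);
    slot[2] = 0;
    slot[3] = 0;
}

/* Read the first two bytes and decode them as big-endian. */
static uint16_t load_port16(const unsigned char slot[4])
{
    return (uint16_t)((slot[0] << 8) | slot[1]);
}

/* Read all four bytes in host byte order; the result depends on the platform. */
static uint32_t load_slot32(const unsigned char slot[4])
{
    uint32_t v;

    memcpy(&v, slot, sizeof(v));
    return v;
}
```

For port 0x1234, `load_port16()` yields 0x1234 on any host, whereas `load_slot32()` yields 0x00003412 on a little-endian host and 0x12340000 on a big-endian one.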
| * Merge branch 'Enable non-atomic allocations in local storage'  Alexei Starovoitov  2022-03-20  7  -41/+103
| |\
| | |   Joanne Koong says:
| | |
| | |   ====================
| | |   From: Joanne Koong <joannelkoong@gmail.com>
| | |
| | |   Currently, local storage memory can only be allocated atomically (GFP_ATOMIC). This restriction is too strict for sleepable bpf programs.
| | |
| | |   In this patchset, sleepable programs can allocate memory in local storage using GFP_KERNEL, while non-sleepable programs always default to GFP_ATOMIC.
| | |
| | |   v3 <- v2:
| | |   * Add extra case to local_storage.c selftest to test associating multiple elements with the local storage, which triggers a GFP_KERNEL allocation in local_storage_update().
| | |   * Cast gfp_t to __s32 in verifier to fix the sparse warnings
| | |
| | |   v2 <- v1:
| | |   * Allocate the memory before/after the raw_spin_lock_irqsave, depending on the gfp flags
| | |   * Rename mem_flags to gfp_flags
| | |   * Reword the comment "*mem_flags* is set by the bpf verifier" to "*gfp_flags* is a hidden argument provided by the verifier"
| | |   * Add a sentence to the commit message about existing local storage selftests covering both the GFP_ATOMIC and GFP_KERNEL paths in bpf_local_storage_update.
| | |   ====================
| | |
| | |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * selftests/bpf: Test for associating multiple elements with the local storage  Joanne Koong  2022-03-20  1  -0/+19
| | |   This patch adds a few calls to the existing local storage selftest to test that we can associate multiple elements with the local storage.
| | |
| | |   The sleepable program's call to bpf_sk_storage_get with sk_storage_map2 will lead to an allocation of a new selem under the GFP_KERNEL flag.
| | |
| | |   Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
| | |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | |   Link: https://lore.kernel.org/bpf/20220318045553.3091807-3-joannekoong@fb.com
| | * bpf: Enable non-atomic allocations in local storage  Joanne Koong  2022-03-20  6  -41/+84
| |/
| |   Currently, local storage memory can only be allocated atomically (GFP_ATOMIC). This restriction is too strict for sleepable bpf programs.
| |
| |   In this patch, the verifier detects whether the program is sleepable, and passes the corresponding GFP_KERNEL or GFP_ATOMIC flag as a 5th argument to bpf_task/sk/inode_storage_get. This flag will propagate down to the local storage functions that allocate memory.
| |
| |   Please note that bpf_task/sk/inode_storage_update_elem functions are invoked by userspace applications through syscalls. Preemption is disabled before bpf_task/sk/inode_storage_update_elem is called, which means they will always have to allocate memory atomically.
| |
| |   Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
| |   Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| |   Acked-by: KP Singh <kpsingh@kernel.org>
| |   Acked-by: Martin KaFai Lau <kafai@fb.com>
| |   Link: https://lore.kernel.org/bpf/20220318045553.3091807-2-joannekoong@fb.com
| * libbpf: Avoid NULL deref when initializing map BTF info  Andrii Nakryiko  2022-03-20  1  -0/+3
| | If a BPF object doesn't have BTF info, don't attempt to search for BTF
| | types describing BPF map key or value layout.
| |
| | Fixes: 262cfb74ffda ("libbpf: Init btf_{key,value}_type_id on internal map open")
| | Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
| | Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | Acked-by: Yonghong Song <yhs@fb.com>
| | Link: https://lore.kernel.org/bpf/20220320001911.3640917-1-andrii@kernel.org
| * bpf: Always raise reference in btf_get_module_btf  Kumar Kartikeya Dwivedi  2022-03-19  1  -10/+11
| | Align it with helpers like bpf_find_btf_id, so all functions returning
| | BTF in an out parameter follow the same rule of raising the reference
| | consistently, regardless of module or vmlinux BTF.
| |
| | Adjust existing callers to handle the change accordingly.
| |
| | Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
| | Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | Link: https://lore.kernel.org/bpf/20220317115957.3193097-10-memxor@gmail.com
| * bpf: Factor out fd returning from bpf_btf_find_by_name_kind  Kumar Kartikeya Dwivedi  2022-03-18  1  -37/+53
| | In the next few patches, we need a helper that searches all kernel
| | BTFs (vmlinux and module BTFs), and finds the type denoted by 'name'
| | and 'kind'. Turns out bpf_btf_find_by_name_kind already does the same
| | thing, but it instead returns a BTF ID and optionally an fd (if module
| | BTF). This is used for relocating ksyms in BPF loader code (bpftool
| | gen skel -L).
| |
| | We extract the core code out into a new helper bpf_find_btf_id, which
| | returns the BTF ID in the return value, and the BTF pointer in an out
| | parameter. The reference for the returned BTF pointer is always
| | raised, hence the user must either transfer it (e.g. to a fd), or
| | release it after use.
| |
| | Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
| | Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | Link: https://lore.kernel.org/bpf/20220317115957.3193097-2-memxor@gmail.com
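
The ownership rule stated above (ID in the return value, BTF pointer in an out parameter, reference always raised) can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `toy_btf`, `toy_btf_get`, `toy_btf_put` and `toy_find_btf_id` are hypothetical names, each toy BTF describes a single type, and the 'kind' match of the real helper is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified BTF object: one type per object. */
struct toy_btf {
	const char *type_name;
	int type_id;
	int refcnt;
};

void toy_btf_get(struct toy_btf *btf) { btf->refcnt++; }
void toy_btf_put(struct toy_btf *btf) { btf->refcnt--; }

/* Search every BTF for 'name'. On success, return the type ID and hand
 * back the owning BTF through *btf_p with its reference raised; the
 * caller must transfer or drop that reference. On failure, return
 * -ENOENT and touch no reference counts. */
int toy_find_btf_id(struct toy_btf *btfs, size_t cnt, const char *name,
		    struct toy_btf **btf_p)
{
	for (size_t i = 0; i < cnt; i++) {
		if (strcmp(btfs[i].type_name, name) == 0) {
			toy_btf_get(&btfs[i]);
			*btf_p = &btfs[i];
			return btfs[i].type_id;
		}
	}
	return -ENOENT;
}
```

A caller that only needs the ID would call `toy_btf_put()` on the returned pointer right away; a caller that wants to keep the BTF alive keeps the reference and releases it later. Raising the reference unconditionally means callers never have to ask which kind of BTF they were given.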
| * bpftool: Add BPF_TRACE_KPROBE_MULTI to attach type names table  Andrii Nakryiko  2022-03-18  1  -1/+1
| | BPF_TRACE_KPROBE_MULTI is a new attach type name; add it to bpftool's
| | table. This fixes a currently failing CI bpftool check.
| |
| | Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
| | Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | Acked-by: Yonghong Song <yhs@fb.com>
| | Link: https://lore.kernel.org/bpf/20220318150106.2933343-1-andrii@kernel.org
| * Merge branch 'bpf-fix-sock-field-tests'  Daniel Borkmann  2022-03-18  1  -7/+17
| |\
| | | Jakub Sitnicki says:
| | |
| | | ====================
| | |
| | | I think we have reached a consensus [1] on how the test for the 4-byte
| | | load from bpf_sock->dst_port and bpf_sk_lookup->remote_port should
| | | look, so here goes v3.
| | |
| | | I will submit a separate set of patches for bpf_sk_lookup->remote_port
| | | tests.
| | |
| | | This series has been tested on x86_64 and s390 on top of recent
| | | bpf-next - ad13baf45691 ("selftests/bpf: Test subprog jit when toggle
| | | bpf_jit_harden repeatedly").
| | |
| | | [1] https://lore.kernel.org/bpf/87k0cwxkzs.fsf@cloudflare.com/
| | |
| | | v2 -> v3:
| | | - Split what was previously patch 2 which was doing two things
| | | - Use BPF_TCP_* constants (Martin)
| | | - Treat the result of 4-byte load from dst_port as a 16-bit value
| | |   (Martin)
| | | - Typo fixup and some rewording in patch 4 description
| | |
| | | v1 -> v2:
| | | - Limit read_sk_dst_port only to client traffic (patch 2)
| | | - Make read_sk_dst_port pass on little- and big-endian (patch 3)
| | |
| | | v1: https://lore.kernel.org/bpf/20220225184130.483208-1-jakub@cloudflare.com/
| | | v2: https://lore.kernel.org/bpf/20220227202757.519015-1-jakub@cloudflare.com/
| | | ====================
| | |
| | | Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * selftests/bpf: Fix test for 4-byte load from dst_port on big-endian  Jakub Sitnicki  2022-03-18  1  -1/+7
| | | The check for a 4-byte load from the dst_port offset into bpf_sock is
| | | failing on a big-endian architecture, s390. The bpf access converter
| | | rewrites the 4-byte load to a 2-byte load from sock_common at the
| | | skc_dport offset, as shown below.
| | |
| | |   * s390 / llvm-objdump -S --no-show-raw-insn
| | |
| | |   00000000000002a0 <sk_dst_port__load_word>:
| | |         84:       r1 = *(u32 *)(r1 + 48)
| | |         85:       w0 = 1
| | |         86:       if w1 == 51966 goto +1 <LBB5_2>
| | |         87:       w0 = 0
| | |   00000000000002c0 <LBB5_2>:
| | |         88:       exit
| | |
| | |   * s390 / bpftool prog dump xlated
| | |
| | |   _Bool sk_dst_port__load_word(struct bpf_sock * sk):
| | |     35: (69) r1 = *(u16 *)(r1 +12)
| | |     36: (bc) w1 = w1
| | |     37: (b4) w0 = 1
| | |     38: (16) if w1 == 0xcafe goto pc+1
| | |     39: (b4) w0 = 0
| | |     40: (95) exit
| | |
| | |   * x86_64 / llvm-objdump -S --no-show-raw-insn
| | |
| | |   00000000000002a0 <sk_dst_port__load_word>:
| | |         84:       r1 = *(u32 *)(r1 + 48)
| | |         85:       w0 = 1
| | |         86:       if w1 == 65226 goto +1 <LBB5_2>
| | |         87:       w0 = 0
| | |   00000000000002c0 <LBB5_2>:
| | |         88:       exit
| | |
| | |   * x86_64 / bpftool prog dump xlated
| | |
| | |   _Bool sk_dst_port__load_word(struct bpf_sock * sk):
| | |     33: (69) r1 = *(u16 *)(r1 +12)
| | |     34: (b4) w0 = 1
| | |     35: (16) if w1 == 0xfeca goto pc+1
| | |     36: (b4) w0 = 0
| | |     37: (95) exit
| | |
| | | This leads to surprises if we treat the destination register contents
| | | as a 32-bit value, ignoring the fact that in reality it contains a
| | | 16-bit value.
| | |
| | | On little-endian the register contents reflect the bpf_sock struct
| | | definition, where the lower 16 bits contain the port number:
| | |
| | |   struct bpf_sock {
| | |           ...
| | |           __be16 dst_port;        /* offset 48 */
| | |           __u16 :16;
| | |           ...
| | |   };
| | |
| | | However, on big-endian the register contents suggest that the field
| | | layout of the bpf_sock struct is as follows:
| | |
| | |   struct bpf_sock {
| | |           ...
| | |           __u16 :16;              /* offset 48 */
| | |           __be16 dst_port;
| | |           ...
| | |   };
| | |
| | | Account for this quirky access conversion in the test case exercising
| | | the 4-byte load by treating the result as 16-bit wide.
| | |
| | | Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| | | Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | | Acked-by: Martin KaFai Lau <kafai@fb.com>
| | | Link: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com
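
The effect of the converted access can be reproduced in userspace. The sketch below is an assumption-laden model, not the selftest itself: `load_dst_port_word` and `dst_port_matches` are hypothetical names, and the 2-byte load is emulated with `memcpy` on a buffer holding the port in network byte order followed by two padding bytes.

```c
#include <arpa/inet.h>	/* htons */
#include <stdint.h>
#include <string.h>

/* Models the converted access: the 4-byte load becomes a 2-byte load
 * that is zero-extended into a 32-bit register. */
uint32_t load_dst_port_word(const unsigned char *field)
{
	uint16_t half;

	memcpy(&half, field, sizeof(half));
	return half;
}

/* Treat the loaded word as 16 bits wide and compare it against the
 * expected port converted to network byte order. Both sides change
 * with host endianness in the same way, so the check is portable. */
int dst_port_matches(const unsigned char *field, uint16_t host_port)
{
	return (uint16_t)load_dst_port_word(field) == htons(host_port);
}
```

With the field bytes `ca fe 00 00`, the loaded word is 0xfeca on a little-endian host and 0xcafe on a big-endian one, matching the two xlated dumps above. Comparing it with a 32-bit constant that assumes the port sits in a particular half of the word would pass on only one of the two.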
| | * selftests/bpf: Use constants for socket states in sock_fields test  Jakub Sitnicki  2022-03-18  1  -5/+5
| | | Replace magic numbers in BPF code with constants from bpf.h, so that
| | | they don't require an explanation in the comments.
| | |
| | | Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| | | Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | | Acked-by: Martin KaFai Lau <kafai@fb.com>
| | | Link: https://lore.kernel.org/bpf/20220317113920.1068535-4-jakub@cloudflare.com
| | * selftests/bpf: Check dst_port only on the client socket  Jakub Sitnicki  2022-03-18  1  -0/+4
| | | The cgroup_skb/egress programs that the sock_fields test installs
| | | process packets flying in both directions, from the client to the
| | | server, and in the reverse direction.
| | |
| | | The recently added dst_port check relies on the fact that the
| | | destination port (remote peer port) of the socket which sends the
| | | packet is known ahead of time. This holds true only for the client
| | | socket, which connects to the known server port.
| | |
| | | Filter out any traffic that is not egressing from the client socket in
| | | the BPF program that tests reading the dst_port.
| | |
| | | Fixes: 8f50f16ff39d ("selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads")
| | | Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| | | Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | | Acked-by: Martin KaFai Lau <kafai@fb.com>
| | | Link: https://lore.kernel.org/bpf/20220317113920.1068535-3-jakub@cloudflare.com
| | * selftests/bpf: Fix error reporting from sock_fields programs  Jakub Sitnicki  2022-03-18  1  -1/+1
| |/
| | The helper macro that records an error in BPF programs that exercise
| | sock fields access has been inadvertently broken by adaptation work
| | that happened in commit b18c1f0aa477 ("bpf: selftest: Adapt sock_fields
| | test to use skel and global variables").
| |
| | BPF_NOEXIST flag cannot be used to update BPF_MAP_TYPE_ARRAY. The
| | operation always fails with -EEXIST, which in turn means the error
| | never gets recorded, and the checks for errors always pass.
| |
| | Revert the change in update flags.
| |
| | Fixes: b18c1f0aa477 ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
| | Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
| | Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | Acked-by: Martin KaFai Lau <kafai@fb.com>
| | Link: https://lore.kernel.org/bpf/20220317113920.1068535-2-jakub@cloudflare.com
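
Why the error was silently lost can be seen in a toy model of array-map update semantics. This is a sketch under assumptions, not kernel code: `toy_array_map` and `toy_array_update` are hypothetical, the map has a fixed size of four slots, and the `TOY_*` flags merely mirror the roles of BPF_ANY, BPF_NOEXIST and BPF_EXIST.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical flags mirroring BPF_ANY, BPF_NOEXIST and BPF_EXIST. */
enum { TOY_ANY = 0, TOY_NOEXIST = 1, TOY_EXIST = 2 };

#define TOY_MAX_ENTRIES 4

/* Every slot of an array map exists from creation time onwards. */
struct toy_array_map {
	int values[TOY_MAX_ENTRIES];
};

int toy_array_update(struct toy_array_map *map, size_t idx, int value,
		     int flags)
{
	if (idx >= TOY_MAX_ENTRIES)
		return -E2BIG;
	/* "Create only if absent" can never succeed: no slot is absent. */
	if (flags == TOY_NOEXIST)
		return -EEXIST;
	map->values[idx] = value;
	return 0;
}
```

An error-recording macro that ignores the return value of such an update and passes the create-only flag never stores anything, which is the failure mode the commit describes.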
| * Merge branch 'Subskeleton support for BPF librariesThread-Topic: [PATCH bpf-next v4 0/5'  Andrii Nakryiko  2022-03-17  13  -109/+910
| |\
| | | Delyan Kratunov says:
| | |
| | | ====================
| | |
| | | In the quest for ever more modularity, a new need has arisen - the
| | | ability to access data associated with a BPF library from a
| | | corresponding userspace library. The catch is that we don't want the
| | | userspace library to know about the structure of the final BPF object
| | | that the BPF library is linked into.
| | |
| | | In pursuit of this modularity, this patch series introduces
| | | *subskeletons.* Subskeletons are similar in use and design to
| | | skeletons with a couple of differences:
| | |
| | | 1. The generated storage types do not rely on contiguous storage for
| | |    the library's variables because they may be interspersed randomly
| | |    throughout the final BPF object's sections.
| | |
| | | 2. Subskeletons do not own objects and instead require a loaded
| | |    bpf_object* to be passed at runtime in order to be initialized. By
| | |    extension, symbols are resolved at runtime by parsing the final
| | |    object's BTF.
| | |
| | | 3. Subskeletons allow access to all global variables, programs, and
| | |    custom maps. They also expose the internal maps *of the final
| | |    object*. This allows bpf_var_skeleton objects to contain a
| | |    bpf_map** instead of a section name.
| | |
| | | Changes since v3:
| | | - Re-add key/value type lookup for legacy user maps (fixing btf test)
| | | - Minor cleanups (missed sanitize_identifier call, error messages,
| | |   formatting)
| | |
| | | Changes since v2:
| | | - Reuse SEC_NAME strict mode flag
| | | - Init bpf_map->btf_value_type_id on open for internal maps *and* user
| | |   BTF maps
| | | - Test custom section names (.data.foo) and overlapping kconfig
| | |   externs between the final object and the library
| | | - Minor review comments in gen.c & libbpf.c
| | |
| | | Changes since v1:
| | | - Introduced new strict mode knob for single-routine-in-.text
| | |   compatibility behavior, which disproportionately affects library
| | |   objects. bpftool works in 1.0 mode so subskeleton generation doesn't
| | |   have to worry about this now.
| | | - Made bpf_map_btf_value_type_id available earlier and used it
| | |   wherever applicable.
| | | - Refactoring in bpftool gen.c per review comments.
| | | - Subskels now use typeof() for array and func proto globals to avoid
| | |   the need for runtime split btf.
| | | - Expanded the subskeleton test to include arrays, custom maps, extern
| | |   maps, weak symbols, and kconfigs.
| | | - selftests/bpf/Makefile now generates a subskel.h for every skel.h it
| | |   would make.
| | |
| | | For reference, here is a shortened subskeleton header:
| | |
| | |   #ifndef __TEST_SUBSKELETON_LIB_SUBSKEL_H__
| | |   #define __TEST_SUBSKELETON_LIB_SUBSKEL_H__
| | |
| | |   struct test_subskeleton_lib {
| | |           struct bpf_object *obj;
| | |           struct bpf_object_subskeleton *subskel;
| | |           struct {
| | |                   struct bpf_map *map2;
| | |                   struct bpf_map *map1;
| | |                   struct bpf_map *data;
| | |                   struct bpf_map *rodata;
| | |                   struct bpf_map *bss;
| | |                   struct bpf_map *kconfig;
| | |           } maps;
| | |           struct {
| | |                   struct bpf_program *lib_perf_handler;
| | |           } progs;
| | |           struct test_subskeleton_lib__data {
| | |                   int *var6;
| | |                   int *var2;
| | |                   int *var5;
| | |           } data;
| | |           struct test_subskeleton_lib__rodata {
| | |                   int *var1;
| | |           } rodata;
| | |           struct test_subskeleton_lib__bss {
| | |                   struct {
| | |                           int var3_1;
| | |                           __s64 var3_2;
| | |                   } *var3;
| | |                   int *libout1;
| | |                   typeof(int[4]) *var4;
| | |                   typeof(int (*)()) *fn_ptr;
| | |           } bss;
| | |           struct test_subskeleton_lib__kconfig {
| | |                   _Bool *CONFIG_BPF_SYSCALL;
| | |           } kconfig;
| | |   };
| | |
| | |   static inline struct test_subskeleton_lib *
| | |   test_subskeleton_lib__open(const struct bpf_object *src)
| | |   {
| | |           struct test_subskeleton_lib *obj;
| | |           struct bpf_object_subskeleton *s;
| | |           int err;
| | |
| | |           ...
| | |           s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));
| | |           ...
| | |
| | |           s->var_cnt = 9;
| | |           ...
| | |
| | |           s->vars[0].name = "var6";
| | |           s->vars[0].map = &obj->maps.data;
| | |           s->vars[0].addr = (void**) &obj->data.var6;
| | |           ...
| | |
| | |           /* maps */
| | |           ...
| | |
| | |           /* programs */
| | |           s->prog_cnt = 1;
| | |           ...
| | |
| | |           err = bpf_object__open_subskeleton(s);
| | |           ...
| | |           return obj;
| | |   }
| | |   #endif /* __TEST_SUBSKELETON_LIB_SUBSKEL_H__ */
| | |
| | | ====================
| | |
| | | Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
| | * selftests/bpf: Test subskeleton functionality  Delyan Kratunov  2022-03-17  6  -2/+194
| | | This patch changes the selftests/bpf Makefile to also generate
| | | a subskel.h for every skel.h it would have normally generated.
| | |
| | | Separately, it also introduces a new subskeleton test which tests
| | | library objects, externs, weak symbols, kconfigs, and user maps.
| | |
| | | Signed-off-by: Delyan Kratunov <delyank@fb.com>
| | | Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
| | | Link: https://lore.kernel.org/bpf/1bd24956940bbbfe169bb34f7f87b11df52ef011.1647473511.git.delyank@fb.com
| | * bpftool: Add support for subskeletons  Delyan Kratunov  2022-03-17  3  -85/+542
| | | Subskeletons are headers which require an already loaded program to
| | | operate.
| | |
| | | For example, when a BPF library is linked into a larger BPF object
| | | file, the library userspace needs a way to access its own global
| | | variables without requiring knowledge about the larger program at
| | | build time.
| | |
| | | As a result, subskeletons require a loaded bpf_object to open().
| | | Further, they find their own symbols in the larger program by walking
| | | BTF type data at run time.
| | |
| | | At this time, programs, maps, and globals are supported through
| | | non-owning pointers.
| | |
| | | Signed-off-by: Delyan Kratunov <delyank@fb.com>
| | | Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
| | | Link: https://lore.kernel.org/bpf/ca8a48b4841c72d285ecce82371bef4a899756cb.1647473511.git.delyank@fb.com
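
The idea of non-owning pointers resolved against an already loaded object can be sketched without libbpf. The model below is an illustration under assumptions only: `toy_symbol`, `toy_object`, `toy_lib_subskel`, `toy_lookup` and `toy_lib_subskel_open` are hypothetical names, and a flat name table stands in for the BTF walk that real subskeletons perform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of the final, already loaded object: a flat table
 * of global variables with their run-time addresses. */
struct toy_symbol {
	const char *name;
	int *addr;
};

struct toy_object {
	const struct toy_symbol *syms;
	size_t sym_cnt;
};

/* The library's subskeleton: non-owning pointers, one per variable,
 * with no assumption that the variables are stored contiguously. */
struct toy_lib_subskel {
	int *var1;
	int *libout1;
};

int *toy_lookup(const struct toy_object *obj, const char *name)
{
	for (size_t i = 0; i < obj->sym_cnt; i++)
		if (strcmp(obj->syms[i].name, name) == 0)
			return obj->syms[i].addr;
	return NULL;
}

/* Resolve the library's own symbols inside the larger object at run
 * time. Returns 0 on success, -1 if a symbol is missing. */
int toy_lib_subskel_open(struct toy_lib_subskel *s,
			 const struct toy_object *obj)
{
	s->var1 = toy_lookup(obj, "var1");
	s->libout1 = toy_lookup(obj, "libout1");
	return (s->var1 && s->libout1) ? 0 : -1;
}
```

Because the subskeleton only borrows addresses, the library never needs to know what else the application linked into the final object, and the application keeps ownership of the object's lifetime.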