summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* fpga: dfl: pci: add device id for Intel FPGA PAC N3000Xu Yilun2020-07-121-0/+2
| | | | | | | | | | | | Add PCIe Device ID for Intel FPGA PAC N3000. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com> Reviewed-by: Tom Rix <trix@redhat.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* Documentation: fpga: dfl: add descriptions for interrupt related interfaces.Xu Yilun2020-07-061-0/+19
| | | | | | | | | | | | This patch adds introductions of interrupt related interfaces for FME error reporting, port error reporting and AFU user interrupts features. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: afu: add AFU interrupt supportXu Yilun2020-07-062-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | AFU (Accelerated Function Unit) is dynamic region of the DFL based FPGA, and always defined by users. Some DFL based FPGA cards allow users to implement their own interrupts in AFU. In order to support this, hardware implements a new UINT (AFU Interrupt) private feature with related capability register which describes the number of supported AFU interrupts as well as the local index of the interrupts for software enumeration, and from software side, driver follows the common DFL interrupt notification and handling mechanism, and it implements two ioctls below for user to query number of irqs supported and set/unset interrupt triggers. Ioctls: * DFL_FPGA_PORT_UINT_GET_IRQ_NUM get the number of irqs, which is used to determine how many interrupts UINT feature supports. * DFL_FPGA_PORT_UINT_SET_IRQ set/unset eventfds as AFU interrupt triggers. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: fme: add interrupt support for global error reportingXu Yilun2020-07-063-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | Error reporting interrupt is very useful to notify users that some errors are detected by the hardware. Once users are notified, they could query hardware logged error states, no need to continuously poll on these states. This patch adds interrupt support for fme global error reporting sub feature. It follows the common DFL interrupt notification and handling mechanism. And it implements two ioctls below for user to query number of irqs supported, and set/unset interrupt triggers. Ioctls: * DFL_FPGA_FME_ERR_GET_IRQ_NUM get the number of irqs, which is used to determine whether/how many interrupts fme error reporting feature supports. * DFL_FPGA_FME_ERR_SET_IRQ set/unset given eventfds as fme error reporting interrupt triggers. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: afu: add interrupt support for port error reportingXu Yilun2020-07-063-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | Error reporting interrupt is very useful to notify users that some errors are detected by the hardware. Once users are notified, they could query hardware logged error states, no need to continuously poll on these states. This patch adds interrupt support for port error reporting sub feature. It follows the common DFL interrupt notification and handling mechanism, implements two ioctl commands below for user to query number of irqs supported, and set/unset interrupt triggers. Ioctls: * DFL_FPGA_PORT_ERR_GET_IRQ_NUM get the number of irqs, which is used to determine whether/how many interrupts error reporting feature supports. * DFL_FPGA_PORT_ERR_SET_IRQ set/unset given eventfds as error interrupt triggers. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: introduce interrupt trigger setting APIXu Yilun2020-06-283-0/+186
| | | | | | | | | | | | | | | | | | | | | | | | | | | | FPGA user applications may be interested in interrupts generated by DFL features. For example, users can implement their own FPGA logics with interrupts enabled in AFU (Accelerated Function Unit, dynamic region of DFL based FPGA). So user applications need to be notified to handle these interrupts. In order to allow userspace applications to monitor interrupts, driver requires userspace to provide eventfds as interrupt notification channels. Applications then poll/select on the eventfds to get notified. This patch introduces a generic helper functions to do eventfds binding with given interrupts. Sub feature drivers are expected to use XXX_GET_IRQ_NUM to query irq info, and XXX_SET_IRQ to set eventfds for interrupts. This patch also introduces helper functions for these 2 ioctls. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: pci: add irq info for feature devices enumerationXu Yilun2020-06-281-9/+67
| | | | | | | | | | | | | | | | Some DFL FPGA PCIe cards (e.g. Intel FPGA Programmable Acceleration Card) support MSI-X based interrupts. This patch allows PCIe driver to prepare and pass interrupt resources to DFL via enumeration API. These interrupt resources could then be assigned to actual features which use them. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: parse interrupt info for feature devices on enumerationXu Yilun2020-06-282-0/+194
| | | | | | | | | | | | | | | | | | | | | | DFL based FPGA devices could support interrupts for different purposes, but current DFL framework only supports feature device enumeration with given MMIO resources information via common DFL headers. This patch introduces one new API dfl_fpga_enum_info_add_irq for low level bus drivers (e.g. PCIe device driver) to pass its interrupt resources information to DFL framework for enumeration, and also adds interrupt enumeration code in framework to parse and assign interrupt resources for enumerated feature devices and their own sub features. With this patch, DFL framework enumerates interrupt resources for core features, including PORT Error Reporting, FME (FPGA Management Engine) Error Reporting and also AFU User Interrupts. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga manager: xilinx-spi: check INIT_B pin during write_initLuca Ceresoli2020-06-261-1/+54
| | | | | | | | | | | | | | | The INIT_B pin reports the status during startup and after the end of the programming process. However the current driver completely ignores it. Check the pin status during startup to make sure programming is never started too early and also to detect any hardware issues in the FPGA connection. This is optional for backward compatibility. If INIT_B is not passed by device tree, just fallback to the old udelays. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* dt-bindings: fpga: xilinx-slave-serial: add optional INIT_B GPIOLuca Ceresoli2020-06-261-1/+6
| | | | | | | | The INIT_B is used by the 6 and 7 series to report the programming status, providing more control and information about programming errors. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: Fix dead store in fpga-bridge.cTom Rix2020-06-181-4/+2
| | | | | | | | | | | Using clang's scan-build/view this issue was flagged a dead store issue in fpga-bridge.c warning: Value stored to 'ret' is never read [deadcode.DeadStores] ret = id; Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: Fix dead store fpga-mgr.cTom Rix2020-06-181-3/+1
| | | | | | | | | | Using clang's scan-build/view this issue was flagged in fpga-mgr.c drivers/fpga/fpga-mgr.c:585:3: warning: Value stored to 'ret' is never read [deadcode.DeadStores] ret = id; Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: Use struct_size() in kzalloc()Gustavo A. R. Silva2020-06-182-8/+1
| | | | | | | | | | | | Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. Also, remove unnecessary function dfl_feature_platform_data_size(). This code was detected with the help of Coccinelle and, audited and fixed manually. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga manager: xilinx-spi: remove unneeded, mistyped variablesLuca Ceresoli2020-06-181-4/+2
| | | | | | | | | | | Using variables does not add readability here: parameters passed to udelay*() are obviously in microseconds and their meaning is clear from the context. The type is also wrong, udelay expects an unsigned long. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga manager: xilinx-spi: valid for the 7 Series tooLuca Ceresoli2020-06-181-1/+1
| | | | | | | The Xilinx 7-series uses the same protocol, mention that. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* dt-bindings: fpga: xilinx-slave-serial: valid for the 7 Series tooLuca Ceresoli2020-06-181-3/+6
| | | | | | | | | The Xilinx 7-series uses the same protocol, mention that. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Acked-by: Moritz Fischer <mdf@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* fpga: dfl: afu: convert get_user_pages() --> pin_user_pages()John Hubbard2020-06-181-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | This code was using get_user_pages_fast(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages_fast() + put_page() calls to pin_user_pages_fast() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Cc: Xu Yilun <yilun.xu@intel.com> Cc: Wu Hao <hao.wu@intel.com> Cc: Moritz Fischer <mdf@kernel.org> Cc: linux-fpga@vger.kernel.org Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>
* Linux 5.8-rc1v5.8-rc1Linus Torvalds2020-06-141-2/+2
|
* Merge tag 'LSM-add-setgid-hook-5.8-author-fix' of ↵Linus Torvalds2020-06-145-1/+40
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://github.com/micah-morton/linux Pull SafeSetID update from Micah Morton: "Add additional LSM hooks for SafeSetID SafeSetID is capable of making allow/deny decisions for set*uid calls on a system, and we want to add similar functionality for set*gid calls. The work to do that is not yet complete, so probably won't make it in for v5.8, but we are looking to get this simple patch in for v5.8 since we have it ready. We are planning on the rest of the work for extending the SafeSetID LSM being merged during the v5.9 merge window" * tag 'LSM-add-setgid-hook-5.8-author-fix' of git://github.com/micah-morton/linux: security: Add LSM hooks to set*gid syscalls
| * security: Add LSM hooks to set*gid syscallsThomas Cedeno2020-06-145-1/+40
| | | | | | | | | | | | | | | | | | | | | | | | The SafeSetID LSM uses the security_task_fix_setuid hook to filter set*uid() syscalls according to its configured security policy. In preparation for adding analagous support in the LSM for set*gid() syscalls, we add the requisite hook here. Tested by putting print statements in the security_task_fix_setgid hook and seeing them get hit during kernel boot. Signed-off-by: Thomas Cedeno <thomascedeno@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
* | Merge tag 'for-5.8-part2-tag' of ↵Linus Torvalds2020-06-147-234/+286
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "This reverts the direct io port to iomap infrastructure of btrfs merged in the first pull request. We found problems in invalidate page that don't seem to be fixable as regressions or without changing iomap code that would not affect other filesystems. There are four reverts in total, but three of them are followup cleanups needed to revert a43a67a2d715 cleanly. The result is the buffer head based implementation of direct io. Reverts are not great, but under current circumstances I don't see better options" * tag 'for-5.8-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Revert "btrfs: switch to iomap_dio_rw() for dio" Revert "fs: remove dio_end_io()" Revert "btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK" Revert "btrfs: split btrfs_direct_IO to read and write part"
| * | Revert "btrfs: switch to iomap_dio_rw() for dio"David Sterba2020-06-144-166/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit a43a67a2d715540c1368b9501a22b0373b5874c0. This patch reverts the main part of switching direct io implementation to iomap infrastructure. There's a problem in invalidate page that couldn't be solved as regression in this development cycle. The problem occurs when buffered and direct io are mixed, and the ranges overlap. Although this is not recommended, filesystems implement measures or fallbacks to make it somehow work. In this case, fallback to buffered IO would be an option for btrfs (this already happens when direct io is done on compressed data), but the change would be needed in the iomap code, bringing new semantics to other filesystems. Another problem arises when again the buffered and direct ios are mixed, invalidation fails, then -EIO is set on the mapping and fsync will fail, though there's no real error. There have been discussions how to fix that, but revert seems to be the least intrusive option. Link: https://lore.kernel.org/linux-btrfs/20200528192103.xm45qoxqmkw7i5yl@fiona/ Signed-off-by: David Sterba <dsterba@suse.com>
| * | Revert "fs: remove dio_end_io()"David Sterba2020-06-092-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit b75b7ca7c27dfd61dba368f390b7d4dc20b3a8cb. The patch restores a helper that was not necessary after direct IO port to iomap infrastructure, which gets reverted. Signed-off-by: David Sterba <dsterba@suse.com>
| * | Revert "btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK"David Sterba2020-06-092-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 5f008163a559d566a0ee1190a0a24f3eec6f1ea7. The patch is a simplification after direct IO port to iomap infrastructure, which gets reverted. Signed-off-by: David Sterba <dsterba@suse.com>
| * | Revert "btrfs: split btrfs_direct_IO to read and write part"David Sterba2020-06-093-83/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d8f3e73587ce574f7a9bc165e0db69b0b148f6f8. The patch is a cleanup of direct IO port to iomap infrastructure, which gets reverted. Signed-off-by: David Sterba <dsterba@suse.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2020-06-13111-647/+1344
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Fix cfg80211 deadlock, from Johannes Berg. 2) RXRPC fails to send norigications, from David Howells. 3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from Geliang Tang. 4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu. 5) The ucc_geth driver needs __netdev_watchdog_up exported, from Valentin Longchamp. 6) Fix hashtable memory leak in dccp, from Wang Hai. 7) Fix how nexthops are marked as FDB nexthops, from David Ahern. 8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni. 9) Fix crashes in tipc_disc_rcv(), from Tuong Lien. 10) Fix link speed reporting in iavf driver, from Brett Creeley. 11) When a channel is used for XSK and then reused again later for XSK, we forget to clear out the relevant data structures in mlx5 which causes all kinds of problems. Fix from Maxim Mikityanskiy. 12) Fix memory leak in genetlink, from Cong Wang. 13) Disallow sockmap attachments to UDP sockets, it simply won't work. From Lorenz Bauer. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) net: ethernet: ti: ale: fix allmulti for nu type ale net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init net: atm: Remove the error message according to the atomic context bpf: Undo internal BPF_PROBE_MEM in BPF insns dump libbpf: Support pre-initializing .bss global variables tools/bpftool: Fix skeleton codegen bpf: Fix memlock accounting for sock_hash bpf: sockmap: Don't attach programs to UDP sockets bpf: tcp: Recv() should return 0 when the peer socket is closed ibmvnic: Flush existing work items before device removal genetlink: clean up family attributes allocations net: ipa: header pad field only valid for AP->modem endpoint net: ipa: program upper nibbles of sequencer type net: ipa: fix modem LAN RX endpoint id net: ipa: program metadata mask differently ionic: add pcie_print_link_status rxrpc: Fix race between incoming ACK parser and retransmitter net/mlx5: E-Switch, Fix some error pointer dereferences net/mlx5: Don't fail driver on failure to create debugfs net/mlx5e: CT: Fix ipv6 nat header rewrite actions ...
| * | | net: ethernet: ti: ale: fix allmulti for nu type aleGrygorii Strashko2020-06-131-9/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS allmulti setting does not allow unregistered mcast packets to pass. This happens, because ALE VLAN entries on these SoCs do not contain port masks for reg/unreg mcast packets, but instead store indexes of ALE_VLAN_MASK_MUXx_REG registers which intended for store port masks for reg/unreg mcast packets. This path was missed by commit 9d1f6447274f ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled"). Hence, fix it by taking into account ALE type in cpsw_ale_set_allmulti(). Fixes: 9d1f6447274f ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: ethernet: ti: am65-cpsw-nuss: fix ale parameters initGrygorii Strashko2020-06-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ALE parameters structure is created on stack, so it has to be reset before passing to cpsw_ale_create() to avoid garbage values. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2020-06-1327-93/+348
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexei Starovoitov says: ==================== pull-request: bpf 2020-06-12 The following pull-request contains BPF updates for your *net* tree. We've added 26 non-merge commits during the last 10 day(s) which contain a total of 27 files changed, 348 insertions(+), 93 deletions(-). The main changes are: 1) sock_hash accounting fix, from Andrey. 2) libbpf fix and probe_mem sanitizing, from Andrii. 3) sock_hash fixes, from Jakub. 4) devmap_val fix, from Jesper. 5) load_bytes_relative fix, from YiFei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | bpf: Undo internal BPF_PROBE_MEM in BPF insns dumpAndrii Nakryiko2020-06-121-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BPF_PROBE_MEM is kernel-internal implmementation details. When dumping BPF instructions to user-space, it needs to be replaced back with BPF_MEM mode. Fixes: 2a02759ef5f8 ("bpf: Add support for BTF pointers to interpreter") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200613002115.1632142-1-andriin@fb.com
| | * | | libbpf: Support pre-initializing .bss global variablesAndrii Nakryiko2020-06-123-13/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove invalid assumption in libbpf that .bss map doesn't have to be updated in kernel. With addition of skeleton and memory-mapped initialization image, .bss doesn't have to be all zeroes when BPF map is created, because user-code might have initialized those variables from user-space. Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200612194504.557844-1-andriin@fb.com
| | * | | tools/bpftool: Fix skeleton codegenAndrii Nakryiko2020-06-121-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove unnecessary check at the end of codegen() routine which makes codegen() to always fail and exit bpftool with error code. Positive value of variable n is not an indicator of a failure. Fixes: 2c4779eff837 ("tools, bpftool: Exit on error in function codegen") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/bpf/20200612201603.680852-1-andriin@fb.com
| | * | | bpf: Fix memlock accounting for sock_hashAndrey Ignatov2020-06-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add missed bpf_map_charge_init() in sock_hash_alloc() and correspondingly bpf_map_charge_finish() on ENOMEM. It was found accidentally while working on unrelated selftest that checks "map->memory.pages > 0" is true for all map types. Before: # bpftool m l ... 3692: sockhash name m_sockhash flags 0x0 key 4B value 4B max_entries 8 memlock 0B After: # bpftool m l ... 84: sockmap name m_sockmap flags 0x0 key 4B value 4B max_entries 8 memlock 4096B Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200612000857.2881453-1-rdna@fb.com
| | * | | bpf: sockmap: Don't attach programs to UDP socketsLorenz Bauer2020-06-121-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stream parser infrastructure isn't set up to deal with UDP sockets, so we mustn't try to attach programs to them. I remember making this change at some point, but I must have lost it while rebasing or something similar. Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20200611172520.327602-1-lmb@cloudflare.com
| | * | | bpf: tcp: Recv() should return 0 when the peer socket is closedSabrina Dubroca2020-06-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the peer is closed, we will never get more data, so tcp_bpf_wait_data will get stuck forever. In case we passed MSG_DONTWAIT to recv(), we get EAGAIN but we should actually get 0. >From man 2 recv: RETURN VALUE When a stream socket peer has performed an orderly shutdown, the return value will be 0 (the traditional "end-of-file" return). This patch makes tcp_bpf_wait_data always return 1 when the peer socket has been shutdown. Either we have data available, and it would have returned 1 anyway, or there isn't, in which case we'll call tcp_recvmsg which does the right thing in this situation. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/26038a28c21fea5d04d4bd4744c5686d3f2e5504.1591784177.git.sd@queasysnail.net
| | * | | tools, bpftool: Exit on error in function codegenTobias Klauser2020-06-111-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the codegen function might fail and return an error. But its callers continue without checking its return value. Since codegen can fail only in the unlikely case of the system running out of memory or the static template being malformed, just exit(-1) directly from codegen and make it void-returning. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200611103341.21532-1-tklauser@distanz.ch
| | * | | xdp: Fix xsk_generic_xmit errnoLi RongQing2020-06-111-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Propagate sock_alloc_send_skb error code, not set it to EAGAIN unconditionally, when fail to allocate skb, which might cause that user space unnecessary loops. Fixes: 35fcde7f8deb ("xsk: support for Tx") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1591852266-24017-1-git-send-email-lirongqing@baidu.com
| | * | | tools, bpftool: Fix memory leak in codegen error casesTobias Klauser2020-06-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Free the memory allocated for the template on error paths in function codegen. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200610130804.21423-1-tklauser@distanz.ch
| | * | | selftests/bpf: Add cgroup_skb/egress test for load_bytes_relativeYiFei Zhu2020-06-112-0/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When cgroup_skb/egress triggers the MAC header is not set. Added a test that asserts reading MAC header is a -EFAULT but NET header succeeds. The test result from within the eBPF program is stored in an 1-element array map that the userspace then reads and asserts on. Another assertion is added that reading from a large offset, past the end of packet, returns -EFAULT. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/9028ccbea4385a620e69c0a104f469ffd655c01e.1591812755.git.zhuyifei@google.com
| | * | | net/filter: Permit reading NET in load_bytes_relative when MAC not setYiFei Zhu2020-06-111-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added a check in the switch case on start_header that checks for the existence of the header, and in the case that MAC is not set and the caller requests for MAC, -EFAULT. If the caller requests for NET then MAC's existence is completely ignored. There is no function to check NET header's existence and as far as cgroup_skb/egress is concerned it should always be set. Removed for ptr >= the start of header, considering offset is bounded unsigned and should always be true. len <= end - mac is redundant to ptr + len <= end. Fixes: 3eee1f75f2b9 ("bpf: fix bpf_skb_load_bytes_relative pkt length check") Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/76bb820ddb6a95f59a772ecbd8c8a336f646b362.1591812755.git.zhuyifei@google.com
| | * | | tools, bpf: Do not force gcc as CCBrett Mastbergen2020-06-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows transparent cross-compilation with CROSS_COMPILE by relying on 7ed1c1901fe5 ("tools: fix cross-compile var clobbering"). Same change was applied to tools/bpf/bpftool/Makefile in 9e88b9312acb ("tools: bpftool: do not force gcc as CC"). Signed-off-by: Brett Mastbergen <brett.mastbergen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200609213506.3299-1-brett.mastbergen@gmail.com
| | * | | libbpf: Handle GCC noreturn-turned-volatile quirkAndrii Nakryiko2020-06-101-9/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle a GCC quirk of emitting extra volatile modifier in DWARF (and subsequently preserved in BTF by pahole) for function pointers marked as __attribute__((noreturn)). This was the way to mark such functions before GCC 2.5 added noreturn attribute. Drop such func_proto modifiers, similarly to how it's done for array (also to handle GCC quirk/bug). Such volatile attribute is emitted by GCC only, so existing selftests can't express such test. Simple repro is like this (compiled with GCC + BTF generated by pahole): struct my_struct { void __attribute__((noreturn)) (*fn)(int); }; struct my_struct a; Without this fix, output will be: struct my_struct { voidvolatile (*fn)(int); }; With the fix: struct my_struct { void (*fn)(int); }; Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/bpf/20200610052335.2862559-1-andriin@fb.com
| | * | | libbpf: Define __WORDSIZE if not availableArnaldo Carvalho de Melo2020-06-101-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some systems, such as Android, don't have a define for __WORDSIZE, do it in terms of __SIZEOF_LONG__, as done in perf since 2012: http://git.kernel.org/torvalds/c/3f34f6c0233ae055b5 For reference: https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html I build tested it here and Andrii did some Travis CI build tests too. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200608161150.GA3073@kernel.org
| | * | | bpf: Selftests and tools use struct bpf_devmap_val from uapiJesper Dangaard Brouer2020-06-094-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sync tools uapi bpf.h header file and update selftests that use struct bpf_devmap_val. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159170951195.2102545.1833108712124273987.stgit@firesoul
| | * | | bpf: Devmap adjust uapi for attach bpf programJesper Dangaard Brouer2020-06-092-13/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | V2: - Defer changing BPF-syscall to start at file-descriptor 1 - Use {} to zero initialise struct. The recent commit fbee97feed9b ("bpf: Add support to attach bpf program to a devmap entry"), introduced ability to attach (and run) a separate XDP bpf_prog for each devmap entry. A bpf_prog is added via a file-descriptor. As zero were a valid FD, not using the feature requires using value minus-1. The UAPI is extended via tail-extending struct bpf_devmap_val and using map->value_size to determine the feature set. This will break older userspace applications not using the bpf_prog feature. Consider an old userspace app that is compiled against newer kernel uapi/bpf.h, it will not know that it need to initialise the member bpf_prog.fd to minus-1. Thus, users will be forced to update source code to get program running on newer kernels. This patch remove the minus-1 checks, and have zero mean feature isn't used. Followup patches either for kernel or libbpf should handle and avoid returning file-descriptor zero in the first place. Fixes: fbee97feed9b ("bpf: Add support to attach bpf program to a devmap entry") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159170950687.2102545.7235914718298050113.stgit@firesoul
| | * | | bpf: cgroup: Allow multi-attach program to replace itselfLorenz Bauer2020-06-092-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using BPF_PROG_ATTACH to attach a program to a cgroup in BPF_F_ALLOW_MULTI mode, it is not possible to replace a program with itself. This is because the check for duplicate programs doesn't take the replacement program into account. Replacing a program with itself might seem weird, but it has some uses: first, it allows resetting the associated cgroup storage. Second, it makes the API consistent with the non-ALLOW_MULTI usage, where it is possible to replace a program with itself. Third, it aligns BPF_PROG_ATTACH with bpf_link, where replacing itself is also supported. Sice this code has been refactored a few times this change will only apply to v5.7 and later. Adjustments could be made to commit 1020c1f24a94 ("bpf: Simplify __cgroup_bpf_attach") and commit d7bf2c10af05 ("bpf: allocate cgroup storage entries on attaching bpf programs") as well as commit 324bda9e6c5a ("bpf: multi program support for cgroup+bpf") Fixes: af6eea57437a ("bpf: Implement bpf_link-based cgroup BPF program attachment") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200608162202.94002-1-lmb@cloudflare.com
| | * | | bpf: Reset data_meta before running programs attached to devmap entryDavid Ahern2020-06-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a new context that does not handle metadata at the moment, so mark data_meta invalid. Fixes: fbee97feed9b ("bpf: Add support to attach bpf program to a devmap entry") Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200608151723.9539-1-dsahern@kernel.org
| | * | | tracing/probe: Fix bpf_task_fd_query() for kprobes and uprobesJean-Philippe Brucker2020-06-092-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 60d53e2c3b75 ("tracing/probe: Split trace_event related data from trace_probe") removed the trace_[ku]probe structure from the trace_event_call->data pointer. As bpf_get_[ku]probe_info() were forgotten in that change, fix them now. These functions are currently only used by the bpf_task_fd_query() syscall handler to collect information about a perf event. Fixes: 60d53e2c3b75 ("tracing/probe: Split trace_event related data from trace_probe") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20200608124531.819838-1-jean-philippe@linaro.org
| | * | | scripts: Require pahole v1.16 when generating BTFLorenz Bauer2020-06-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bpf_iter requires the kernel BTF to be generated with pahole >= 1.16, since otherwise the function definitions that the iterator attaches to are not included. This failure mode is indistiguishable from trying to attach to an iterator that really doesn't exist. Since it's really easy to miss this requirement, bump the pahole version check used at build time to at least 1.16. Fixes: 15d83c4d7cef ("bpf: Allow loading of a bpf_iter program") Suggested-by: Ivan Babrou <ivan@cloudflare.com> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200608094257.47366-1-lmb@cloudflare.com
| | * | | bpf, sockhash: Synchronize delete from bucket list on map freeJakub Sitnicki2020-06-091-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can end up modifying the sockhash bucket list from two CPUs when a sockhash is being destroyed (sock_hash_free) on one CPU, while a socket that is in the sockhash is unlinking itself from it on another CPU it (sock_hash_delete_from_link). This results in accessing a list element that is in an undefined state as reported by KASAN: | ================================================================== | BUG: KASAN: wild-memory-access in sock_hash_free+0x13c/0x280 | Write of size 8 at addr dead000000000122 by task kworker/2:1/95 | | CPU: 2 PID: 95 Comm: kworker/2:1 Not tainted 5.7.0-rc7-02961-ge22c35ab0038-dirty #691 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 | Workqueue: events bpf_map_free_deferred | Call Trace: | dump_stack+0x97/0xe0 | ? sock_hash_free+0x13c/0x280 | __kasan_report.cold+0x5/0x40 | ? mark_lock+0xbc1/0xc00 | ? sock_hash_free+0x13c/0x280 | kasan_report+0x38/0x50 | ? sock_hash_free+0x152/0x280 | sock_hash_free+0x13c/0x280 | bpf_map_free_deferred+0xb2/0xd0 | ? bpf_map_charge_finish+0x50/0x50 | ? rcu_read_lock_sched_held+0x81/0xb0 | ? rcu_read_lock_bh_held+0x90/0x90 | process_one_work+0x59a/0xac0 | ? lock_release+0x3b0/0x3b0 | ? pwq_dec_nr_in_flight+0x110/0x110 | ? rwlock_bug.part.0+0x60/0x60 | worker_thread+0x7a/0x680 | ? _raw_spin_unlock_irqrestore+0x4c/0x60 | kthread+0x1cc/0x220 | ? process_one_work+0xac0/0xac0 | ? kthread_create_on_node+0xa0/0xa0 | ret_from_fork+0x24/0x30 | ================================================================== Fix it by reintroducing spin-lock protected critical section around the code that removes the elements from the bucket on sockhash free. To do that we also need to defer processing of removed elements, until out of atomic context so that we can unlink the socket from the map when holding the sock lock. Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200607205229.2389672-3-jakub@cloudflare.com