summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* net: constify some inline functions in sock.hPeter Lafreniere2022-06-081-3/+3
| | | | | | | | | | | Despite these inline functions having full visibility to the compiler at compile time, they still strip const from passed pointers. This change allows for functions in various network drivers to be marked as const that could not be marked const before. Signed-off-by: Peter Lafreniere <pjlafren@mtu.edu> Link: https://lore.kernel.org/r/20220606113458.35953-1-pjlafren@mtu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: dropreason: reformat the comment fo skb drop reasonsMenglong Dong2022-06-071-156/+226
| | | | | | | | | | | | | To make the code clear, reformat the comment in dropreason.h to k-doc style. Now, the comment can pass the check of kernel-doc without warnning: $ ./scripts/kernel-doc -v -none include/linux/dropreason.h include/linux/dropreason.h:7: info: Scanning doc for enum skb_drop_reason Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* net: skb: use auto-generation to convert skb drop reason to stringMenglong Dong2022-06-072-88/+3
| | | | | | | | | | | | | | | | | | | It is annoying to add new skb drop reasons to 'enum skb_drop_reason' and TRACE_SKB_DROP_REASON in trace/event/skb.h, and it's easy to forget to add the new reasons we added to TRACE_SKB_DROP_REASON. TRACE_SKB_DROP_REASON is used to convert drop reason of type number to string. For now, the string we passed to user space is exactly the same as the name in 'enum skb_drop_reason' with a 'SKB_DROP_REASON_' prefix. Therefore, we can use 'auto-generation' to generate these drop reasons to string at build time. The new source 'dropreason_str.c' will be auto generated during build time, which contains the string array 'const char * const drop_reasons[]'. Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* net: skb: move enum skb_drop_reason to standalone header fileMenglong Dong2022-06-072-178/+185
| | | | | | | | | As the skb drop reasons are getting more and more, move the enum 'skb_drop_reason' and related function to the standalone header 'dropreason.h', as Jakub Kicinski suggested. Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* Merge tag 'net-5.19-rc1' of ↵Linus Torvalds2022-06-0210-37/+41
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Current release - new code bugs: - af_packet: make sure to pull the MAC header, avoid skb panic in GSO - ptp_clockmatrix: fix inverted logic in is_single_shot() - netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag - dt-bindings: net: adin: fix adi,phy-output-clock description syntax - wifi: iwlwifi: pcie: rename CAUSE macro, avoid MIPS build warning Previous releases - regressions: - Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process" - tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd - nf_tables: disallow non-stateful expression in sets earlier - nft_limit: clone packet limits' cost value - nf_tables: double hook unregistration in netns path - ping6: fix ping -6 with interface name Previous releases - always broken: - sched: fix memory barriers to prevent skbs from getting stuck in lockless qdiscs - neigh: set lower cap for neigh_managed_work rearming, avoid constantly scheduling the probe work - bpf: fix probe read error on big endian in ___bpf_prog_run() - amt: memory leak and error handling fixes Misc: - ipv6: expand & rename accept_unsolicited_na to accept_untracked_na" * tag 'net-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits) net/af_packet: make sure to pull mac header net: add debug info to __skb_pull() net: CONFIG_DEBUG_NET depends on CONFIG_NET stmmac: intel: Add RPL-P PCI ID net: stmmac: use dev_err_probe() for reporting mdio bus registration failure tipc: check attribute length for bearer name ice: fix access-beyond-end in the switch code nfp: remove padding in nfp_nfdk_tx_desc ax25: Fix ax25 session cleanup problems net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels sfc/siena: fix considering that all channels have TX queues socket: Don't use u8 type in uapi socket.h net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6() net: ping6: Fix ping -6 with interface name macsec: fix UAF bug for real_dev octeontx2-af: fix error code in is_valid_offset() wifi: mac80211: fix use-after-free in chanctx code bonding: guard ns_targets by CONFIG_IPV6 tcp: tcp_rtx_synack() can be called from process context ...
| * net: add debug info to __skb_pull()Eric Dumazet2022-06-021-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | While analyzing yet another syzbot report, I found the following patch very useful. It allows to better understand what went wrong. This debug info is only enabled if CONFIG_DEBUG_NET=y, which is the case for syzbot builds. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * ax25: Fix ax25 session cleanup problemsDuoming Zhou2022-06-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are session cleanup problems in ax25_release() and ax25_disconnect(). If we setup a session and then disconnect, the disconnected session is still in "LISTENING" state that is shown below. Active AX.25 sockets Dest Source Device State Vr/Vs Send-Q Recv-Q DL9SAU-4 DL9SAU-3 ??? LISTENING 000/000 0 0 DL9SAU-3 DL9SAU-4 ??? LISTENING 000/000 0 0 The first reason is caused by del_timer_sync() in ax25_release(). The timers of ax25 are used for correct session cleanup. If we use ax25_release() to close ax25 sessions and ax25_dev is not null, the del_timer_sync() functions in ax25_release() will execute. As a result, the sessions could not be cleaned up correctly, because the timers have stopped. In order to solve this problem, this patch adds a device_up flag in ax25_dev in order to judge whether the device is up. If there are sessions to be cleaned up, the del_timer_sync() in ax25_release() will not execute. What's more, we add ax25_cb_del() in ax25_kill_by_device(), because the timers have been stopped and there are no functions that could delete ax25_cb if we do not call ax25_release(). Finally, we reorder the position of ax25_list_lock in ax25_cb_del() in order to synchronize among different functions that call ax25_cb_del(). The second reason is caused by improper check in ax25_disconnect(). The incoming ax25 sessions which ax25->sk is null will close heartbeat timer, because the check "if(!ax25->sk || ..)" is satisfied. As a result, the session could not be cleaned up properly. In order to solve this problem, this patch changes the improper check to "if(ax25->sk && ..)" in ax25_disconnect(). What`s more, the ax25_disconnect() may be called twice, which is not necessary. For example, ax25_kill_by_device() calls ax25_disconnect() and sets ax25->state to AX25_STATE_0, but ax25_release() calls ax25_disconnect() again. In order to solve this problem, this patch add a check in ax25_release(). If the flag of ax25->sk equals to SOCK_DEAD, the ax25_disconnect() in ax25_release() should not be executed. Fixes: 82e31755e55f ("ax25: Fix UAF bugs in ax25 timers") Fixes: 8a367e74c012 ("ax25: Fix segfault after sock connection timeout") Reported-and-tested-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Link: https://lore.kernel.org/r/20220530152158.108619-1-duoming@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * Merge tag 'mlx5-fixes-2022-05-31' of ↵Jakub Kicinski2022-06-011-3/+2
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-05-31 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. * tag 'mlx5-fixes-2022-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Fix mlx5_get_next_dev() peer device matching net/mlx5e: Update netdev features after changing XDP state net/mlx5: correct ECE offset in query qp output net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition net/mlx5: CT: Fix header-rewrite re-use for tupels net/mlx5e: TC NIC mode, fix tc chains miss table net/mlx5: Don't use already freed action pointer ==================== Link: https://lore.kernel.org/r/20220531205447.99236-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * net/mlx5: correct ECE offset in query qp outputChangcheng Liu2022-05-311-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | ECE field should be after opt_param_mask in query qp output. Fixes: 6b646a7e4af6 ("net/mlx5: Add ability to read and write ECE options") Signed-off-by: Changcheng Liu <jerrliu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | socket: Don't use u8 type in uapi socket.hTobias Klauser2022-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use plain 255 instead, which also avoid introducing an additional header dependency on <linux/types.h> Fixes: 26859240e4ee ("txhash: Add socket option to control TX hash rethink behavior") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/r/20220531094345.13801-1-tklauser@distanz.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | bonding: guard ns_targets by CONFIG_IPV6Hangbin Liu2022-06-011-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guard ns_targets in struct bond_params by CONFIG_IPV6, which could save 256 bytes if IPv6 not configed. Also add this protection for function bond_is_ip6_target_ok() and bond_get_targets_ip6(). Remove the IS_ENABLED() check for bond_opts[] as this will make BOND_OPT_NS_TARGETS uninitialized if CONFIG_IPV6 not enabled. Add a dummy bond_option_ns_ip6_targets_set() for this situation. Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Link: https://lore.kernel.org/r/20220531063727.224043-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * | net: sched: add barrier to fix packet stuck problem for lockless qdiscGuoju Fang2022-05-311-0/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In qdisc_run_end(), the spin_unlock() only has store-release semantic, which guarantees all earlier memory access are visible before it. But the subsequent test_bit() has no barrier semantics so may be reordered ahead of the spin_unlock(). The store-load reordering may cause a packet stuck problem. The concurrent operations can be described as below, CPU 0 | CPU 1 qdisc_run_end() | qdisc_run_begin() . | . ----> /* may be reorderd here */ | . | . | . | spin_unlock() | set_bit() | . | smp_mb__after_atomic() ---- test_bit() | spin_trylock() . | . Consider the following sequence of events: CPU 0 reorder test_bit() ahead and see MISSED = 0 CPU 1 calls set_bit() CPU 1 calls spin_trylock() and return fail CPU 0 executes spin_unlock() At the end of the sequence, CPU 0 calls spin_unlock() and does nothing because it see MISSED = 0. The skb on CPU 1 has beed enqueued but no one take it, until the next cpu pushing to the qdisc (if ever ...) will notice and dequeue it. This patch fix this by adding one explicit barrier. As spin_unlock() and test_bit() ordering is a store-load ordering, a full memory barrier smp_mb() is needed here. Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc") Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220528101628.120193-1-gjfang@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net/ipv6: Expand and rename accept_unsolicited_na to accept_untracked_naArun Ajith S2022-05-312-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC 9131 changes default behaviour of handling RX of NA messages when the corresponding entry is absent in the neighbour cache. The current implementation is limited to accept just unsolicited NAs. However, the RFC is more generic where it also accepts solicited NAs. Both types should result in adding a STALE entry for this case. Expand accept_untracked_na behaviour to also accept solicited NAs to be compliant with the RFC and rename the sysctl knob to accept_untracked_na. Fixes: f9a2fb73318e ("net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131") Signed-off-by: Arun Ajith S <aajith@arista.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220530101414.65439-1-aajith@arista.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * netfilter: conntrack: re-fetch conntrack after insertionFlorian Westphal2022-05-271-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case the conntrack is clashing, insertion can free skb->_nfct and set skb->_nfct to the already-confirmed entry. This wasn't found before because the conntrack entry and the extension space used to free'd after an rcu grace period, plus the race needs events enabled to trigger. Reported-by: <syzbot+793a590957d9c1b96620@syzkaller.appspotmail.com> Fixes: 71d8c47fc653 ("netfilter: conntrack: introduce clash resolution on insertion race") Fixes: 2ad9d7747c10 ("netfilter: conntrack: free extension area immediately") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * net: sched: fixed barrier to prevent skbuff sticking in qdisc backlogVincent Ray2022-05-261-28/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit() does not provide any ordering guarantee as test_bit() is not an atomic operation. This, added to the fact that the spin_trylock() call at the beginning of qdisc_run_begin() does not guarantee acquire semantics if it does not grab the lock, makes it possible for the following statement : if (test_bit(__QDISC_STATE_MISSED, &qdisc->state)) to be executed before an enqueue operation called before qdisc_run_begin(). As a result the following race can happen : CPU 1 CPU 2 qdisc_run_begin() qdisc_run_begin() /* true */ set(MISSED) . /* returns false */ . . /* sees MISSED = 1 */ . /* so qdisc not empty */ . __qdisc_run() . . . pfifo_fast_dequeue() ----> /* may be done here */ . | . clear(MISSED) | . . | . smp_mb __after_atomic(); | . . | . /* recheck the queue */ | . /* nothing => exit */ | enqueue(skb1) | . | qdisc_run_begin() | . | spin_trylock() /* fail */ | . | smp_mb__before_atomic() /* not enough */ | . ---- if (test_bit(MISSED)) return false; /* exit */ In the above scenario, CPU 1 and CPU 2 both try to grab the qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the bypass code path, where it emits its skb then calls __qdisc_run(). CPU1 fails, sets MISSED and goes down the traditionnal enqueue() + dequeue() code path. But when executing qdisc_run_begin() for the second time, after enqueuing its skbuff, it sees the MISSED bit still set (by itself) and consequently chooses to exit early without setting it again nor trying to grab the spinlock again. Meanwhile CPU2 has seen MISSED = 1, cleared it, checked the queue and found it empty, so it returned. At the end of the sequence, we end up with skb1 enqueued in the backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set, and no __netif_schedule() called made. skb1 will now linger in the qdisc until somebody later performs a full __qdisc_run(). Associated to the bypass capacity of the qdisc, and the ability of the TCP layer to avoid resending packets which it knows are still in the qdisc, this can lead to serious traffic "holes" in a TCP connection. We fix this by replacing the smp_mb__before_atomic() / test_bit() / set_bit() / smp_mb__after_atomic() sequence inside qdisc_run_begin() by a single test_and_set_bit() call, which is more concise and enforces the needed memory barriers. Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc") Signed-off-by: Vincent Ray <vray@kalrayinc.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220526001746.2437669-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * amt: fix typo in amtTaehee Yoo2022-05-251-1/+1
| | | | | | | | | | | | | | | | | | AMT_MSG_TEARDOWM is defined, But it should be AMT_MSG_TEARDOWN. Fixes: b9022b53adad ("amt: add control plane of amt interface") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge tag 'livepatching-for-5.19' of ↵Linus Torvalds2022-06-021-2/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching cleanup from Petr Mladek: - Remove duplicated livepatch code [Christophe] * tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: livepatch: Remove klp_arch_set_pc() and asm/livepatch.h
| * | livepatch: Remove klp_arch_set_pc() and asm/livepatch.hChristophe Leroy2022-05-241-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All three versions of klp_arch_set_pc() do exactly the same: they call ftrace_instruction_pointer_set(). Call ftrace_instruction_pointer_set() directly and remove klp_arch_set_pc(). As klp_arch_set_pc() was the only thing remaining in asm/livepatch.h on x86 and s390, remove asm/livepatch.h livepatch.h remains on powerpc but its content is exclusively used by powerpc specific code. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Petr Mladek <pmladek@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Petr Mladek <pmladek@suse.com>
* | | Merge tag 'linux-watchdog-5.19-rc1' of ↵Linus Torvalds2022-06-012-0/+91
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://www.linux-watchdog.org/linux-watchdog Pull watchdog updates from Wim Van Sebroeck: - Add MediaTek MT8186 support - Add Mediatek MT7986 reset-controller support - Add i.MX93 support - Add watchdog driver for Sunplus SP7021 - Add SC8180X and SC8280XP compatibles - Add Renesas RZ/N1 Watchdog driver and support for RZ/N1 - rzg2l_wdt improvements and fixes - Several other improvements and fixes * tag 'linux-watchdog-5.19-rc1' of git://www.linux-watchdog.org/linux-watchdog: (38 commits) watchdog: ts4800_wdt: Fix refcount leak in ts4800_wdt_probe dt-bindings: watchdog: renesas,wdt: R-Car V3U is R-Car Gen4 watchdog: Add Renesas RZ/N1 Watchdog driver dt-bindings: watchdog: renesas,wdt: Add support for RZ/N1 watchdog: wdat_wdt: Stop watchdog when uninstalling module watchdog: wdat_wdt: Stop watchdog when rebooting the system watchdog: wdat_wdt: Using the existing function to check parameter timeout dt-bindings: watchdog: da9062: add watchdog timeout mode dt-bindings: watchdog: renesas,wdt: Document RZ/G2UL SoC watchdog: iTCO_wdt: Using existing macro define covers more scenarios watchdog: rti-wdt: Fix pm_runtime_get_sync() error checking dt-bindings: watchdog: Add SC8180X and SC8280XP compatibles watchdog: rti_wdt: Fix calculation and evaluation of preset heartbeat dt-bindings: watchdog: uniphier: Use unevaluatedProperties watchdog: sp805: disable watchdog on remove watchdog: da9063: optionally disable watchdog during suspend dt-bindings: mfd: da9063: watchdog: add suspend disable option dt-bindings: watchdog: sunxi: clarify clock support dt-bindings: watchdog: sunxi: fix F1C100s compatible watchdog: Add watchdog driver for Sunplus SP7021 ...
| * | | dt-bindings: reset: mt7986: Add reset-controller header fileSam Shih2022-05-081-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add infracfg, toprgu, and ethsys reset-controller header file for MT7986 platform. Signed-off-by: Sam Shih <sam.shih@mediatek.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220105100456.7126-2-sam.shih@mediatek.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
| * | | dt-bindings: reset: mt8186: add reset-controller header fileRunyang Chen2022-05-081-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Add toprgu reset-controller header file for MT8186. 2. Add DSI software reset bit which is controlled in MMSYS for MT8186. Signed-off-by: Runyang Chen <runyang.chen@mediatek.com> Signed-off-by: Rex-BC Chen <rex-bc.chen@mediatek.com> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220301054405.25021-3-rex-bc.chen@mediatek.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
* | | | Merge tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds2022-06-014-34/+29
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull vfio updates from Alex Williamson: - Improvements to mlx5 vfio-pci variant driver, including support for parallel migration per PF (Yishai Hadas) - Remove redundant iommu_present() check (Robin Murphy) - Ongoing refactoring to consolidate the VFIO driver facing API to use vfio_device (Jason Gunthorpe) - Use drvdata to store vfio_device among all vfio-pci and variant drivers (Jason Gunthorpe) - Remove redundant code now that IOMMU core manages group DMA ownership (Jason Gunthorpe) - Remove vfio_group from external API handling struct file ownership (Jason Gunthorpe) - Correct typo in uapi comments (Thomas Huth) - Fix coccicheck detected deadlock (Wan Jiabing) - Use rwsem to remove races and simplify code around container and kvm association to groups (Jason Gunthorpe) - Harden access to devices in low power states and use runtime PM to enable d3cold support for unused devices (Abhishek Sahu) - Fix dma_owner handling of fake IOMMU groups (Jason Gunthorpe) - Set driver_managed_dma on vfio-pci variant drivers (Jason Gunthorpe) - Pass KVM pointer directly rather than via notifier (Matthew Rosato) * tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio: (38 commits) vfio: remove VFIO_GROUP_NOTIFY_SET_KVM vfio/pci: Add driver_managed_dma to the new vfio_pci drivers vfio: Do not manipulate iommu dma_owner for fake iommu groups vfio/pci: Move the unused device into low power state with runtime PM vfio/pci: Virtualize PME related registers bits and initialize to zero vfio/pci: Change the PF power state to D0 before enabling VFs vfio/pci: Invalidate mmaps and block the access in D3hot power state vfio: Change struct vfio_group::container_users to a non-atomic int vfio: Simplify the life cycle of the group FD vfio: Fully lock struct vfio_group::container vfio: Split up vfio_group_get_device_fd() vfio: Change struct vfio_group::opened from an atomic to bool vfio: Add missing locking for struct vfio_group::kvm kvm/vfio: Fix potential deadlock problem in vfio include/uapi/linux/vfio.h: Fix trivial typo - _IORW should be _IOWR instead vfio/pci: Use the struct file as the handle not the vfio_group kvm/vfio: Remove vfio_group from kvm vfio: Change vfio_group_set_kvm() to vfio_file_set_kvm() vfio: Change vfio_external_check_extension() to vfio_file_enforced_coherent() vfio: Remove vfio_external_group_match_file() ...
| * | | | vfio: remove VFIO_GROUP_NOTIFY_SET_KVMMatthew Rosato2022-05-241-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than relying on a notifier for associating the KVM with the group, let's assume that the association has already been made prior to device_open. The first time a device is opened associate the group KVM with the device. This fixes a user-triggerable oops in GVT. Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://lore.kernel.org/r/20220519183311.582380-2-mjrosato@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | include/uapi/linux/vfio.h: Fix trivial typo - _IORW should be _IOWR insteadThomas Huth2022-05-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no macro called _IORW, so use _IOWR in the comment instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220516101202.88373-1-thuth@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | vfio/pci: Use the struct file as the handle not the vfio_groupJason Gunthorpe2022-05-131-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VFIO PCI does a security check as part of hot reset to prove that the user has permission to manipulate all the devices that will be impacted by the reset. Use a new API vfio_file_has_dev() to perform this security check against the struct file directly and remove the vfio_group from VFIO PCI. Since VFIO PCI was the last user of vfio_group_get_external_user() and vfio_group_put_external_user() remove it as well. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | vfio: Change vfio_group_set_kvm() to vfio_file_set_kvm()Jason Gunthorpe2022-05-131-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just change the argument from struct vfio_group to struct file *. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | vfio: Change vfio_external_check_extension() to vfio_file_enforced_coherent()Jason Gunthorpe2022-05-131-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of a general extension check change the function into a limited test if the iommu_domain has enforced coherency, which is the only thing kvm needs to query. Make the new op self contained by properly refcounting the container before touching it. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | vfio: Remove vfio_external_group_match_file()Jason Gunthorpe2022-05-131-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vfio_group_fops_open() ensures there is only ever one struct file open for any struct vfio_group at any time: /* Do we need multiple instances of the group open? Seems not. */ opened = atomic_cmpxchg(&group->opened, 0, 1); if (opened) { vfio_group_put(group); return -EBUSY; Therefor the struct file * can be used directly to search the list of VFIO groups that KVM keeps instead of using the vfio_external_group_match_file() callback to try to figure out if the passed in FD matches the list or not. Delete vfio_external_group_match_file(). Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | vfio: Change vfio_external_user_iommu_id() to vfio_file_iommu_group()Jason Gunthorpe2022-05-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only caller wants to get a pointer to the struct iommu_group associated with the VFIO group file. Instead of returning the group ID then searching sysfs for that string to get the struct iommu_group just directly return the iommu_group pointer already held by the vfio_group struct. It already has a safe lifetime due to the struct file kref, the vfio_group and thus the iommu_group cannot be destroyed while the group file is open. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | Merge remote-tracking branch 'iommu/vfio-notifier-fix' into v5.19/vfio/nextAlex Williamson2022-05-136-25/+66
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge IOMMU dependencies for vfio. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | | vfio/pci: Remove vfio_device_get_from_dev()Jason Gunthorpe2022-05-112-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last user of this function is in PCI callbacks that want to convert their struct pci_dev to a vfio_device. Instead of searching use the vfio_device available trivially through the drvdata. When a callback in the device_driver is called, the caller must hold the device_lock() on dev. The purpose of the device_lock is to prevent remove() from being called (see __device_release_driver), and allow the driver to safely interact with its drvdata without races. The PCI core correctly follows this and holds the device_lock() when calling error_detected (see report_error_detected) and sriov_configure (see sriov_numvfs_store). Further, since the drvdata holds a positive refcount on the vfio_device any access of the drvdata, under the device_lock(), from a driver callback needs no further protection or refcounting. Thus the remark in the vfio_device_get_from_dev() comment does not apply here, VFIO PCI drivers all call vfio_unregister_group_dev() from their remove callbacks under the device_lock() and cannot race with the remaining callers. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-c841817a0349+8f-vfio_get_from_dev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | | vfio: Remove dead codeJason Gunthorpe2022-05-111-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that callers have been updated to use the vfio_device APIs the driver facing group interface is no longer used, delete it: - vfio_group_get_external_user_from_dev() - vfio_group_pin_pages() - vfio_group_unpin_pages() - vfio_group_iommu_domain() -- Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | | vfio/mdev: Pass in a struct vfio_device * to vfio_dma_rw()Jason Gunthorpe2022-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. Change vfio_dma_rw() to take in the struct vfio_device and move the container users that would have been held by vfio_group_get_external_user_from_dev() to vfio_dma_rw() directly, like vfio_pin/unpin_pages(). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | | vfio/mdev: Pass in a struct vfio_device * to vfio_pin/unpin_pages()Jason Gunthorpe2022-05-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every caller has a readily available vfio_device pointer, use that instead of passing in a generic struct device. The struct vfio_device already contains the group we need so this avoids complexity, extra refcountings, and a confusing lifecycle model. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | | vfio: Make vfio_(un)register_notifier accept a vfio_deviceJason Gunthorpe2022-05-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All callers have a struct vfio_device trivially available, pass it in directly and avoid calling the expensive vfio_group_get_from_dev(). Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * | | | | Merge tag 'gvt-next-2022-04-29' into v5.19/vfio/nextAlex Williamson2022-05-111-78/+4
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge GVT-g dependencies for vfio. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * \ \ \ \ \ Merge tag 'mlx5-lm-parallel' of ↵Alex Williamson2022-05-115-242/+29
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into v5.19/vfio/next Improve mlx5 live migration driver From Yishai: This series improves mlx5 live migration driver in few aspects as of below. Refactor to enable running migration commands in parallel over the PF command interface. To achieve that we exposed from mlx5_core an API to let the VF be notified before that the PF command interface goes down/up. (e.g. PF reload upon health recovery). Once having the above functionality in place mlx5 vfio doesn't need any more to obtain the global PF lock upon using the command interface but can rely on the above mechanism to be in sync with the PF. This can enable parallel VFs migration over the PF command interface from kernel driver point of view. In addition, Moved to use the PF async command mode for the SAVE state command. This enables returning earlier to user space upon issuing successfully the command and improve latency by let things run in parallel. Alex, as this series touches mlx5_core we may need to send this in a pull request format to VFIO to avoid conflicts before acceptance. Link: https://lore.kernel.org/all/20220510090206.90374-1-yishaih@nvidia.com Signed-of-by: Leon Romanovsky <leonro@nvidia.com>
| | * | | | | | net/mlx5: Expose mlx5_sriov_blocking_notifier_register / unregister APIsYishai Hadas2022-05-101-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose mlx5_sriov_blocking_notifier_register / unregister APIs to let a VF register to be notified for its enablement / disablement by the PF. Upon VF probe it will call mlx5_sriov_blocking_notifier_register() with its notifier block and upon VF remove it will call mlx5_sriov_blocking_notifier_unregister() to drop its registration. This can give a VF the ability to clean some resources upon disable before that the command interface goes down and on the other hand sets some stuff before that it's enabled. This may be used by a VF which is migration capable in few cases.(e.g. PF load/unload upon an health recovery). Link: https://lore.kernel.org/r/20220510090206.90374-2-yishaih@nvidia.com Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
* | | | | | | | Merge tag 'pwm/for-5.19-rc1' of ↵Linus Torvalds2022-06-012-0/+91
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "Quite a large number of conversions this time around, courtesy of Uwe who has been working tirelessly on these. No drivers of the legacy API are left at this point, so as a next step the old API can be removed. Support is added for a few new devices such as the Xilinx AXI timer- based PWMs and the PWM IP found on Sunplus SoCs. Other than that, there's a number of fixes, cleanups and optimizations" * tag 'pwm/for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (43 commits) pwm: pwm-cros-ec: Add channel type support dt-bindings: google,cros-ec-pwm: Add the new -type compatible dt-bindings: Add mfd/cros_ec definitions pwm: Document that the pinstate of a disabled PWM isn't reliable pwm: twl-led: Implement .apply() callback pwm: lpc18xx: Implement .apply() callback pwm: mediatek: Implement .apply() callback pwm: lpc32xx: Implement .apply() callback pwm: tegra: Implement .apply() callback pwm: stmpe: Implement .apply() callback pwm: sti: Implement .apply() callback pwm: pwm-mediatek: Add support for MediaTek Helio X10 MT6795 dt-bindings: pwm: pwm-mediatek: Add documentation for MT6795 SoC pwm: tegra: Optimize period calculation pwm: renesas-tpu: Improve precision of period and duty_cycle calculation pwm: renesas-tpu: Improve maths to compute register settings pwm: renesas-tpu: Rename variables to match the usual naming pwm: renesas-tpu: Implement .apply() callback pwm: renesas-tpu: Make use of devm functions pwm: renesas-tpu: Make use of dev_err_probe() ...
| * | | | | | | | dt-bindings: Add mfd/cros_ec definitionsFabio Baltieri2022-05-201-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a dt-bindings include file for cros_ec devicetree definition, define a pair of special purpose PWM channels in it. Signed-off-by: Fabio Baltieri <fabiobaltieri@chromium.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * | | | | | | | pwm: Add support for Xilinx AXI TimerSean Anderson2022-04-221-0/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds PWM support for Xilinx LogiCORE IP AXI soft timers commonly found on Xilinx FPGAs. At the moment clock control is very basic: we just enable the clock during probe and pin the frequency. In the future, someone could add support for disabling the clock when not in use. Some common code has been specially demarcated. While currently only used by the PWM driver, it is anticipated that it may be split off in the future to be used by the timer driver as well. This driver was written with reference to Xilinx DS764 for v1.03.a [1]. [1] https://www.xilinx.com/support/documentation/ip_documentation/axi_timer/v1_03_a/axi_timer_ds764.pdf Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
* | | | | | | | | Merge tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2022-05-313-1/+14
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Anna Schumaker: "New Features: - Add support for 'dacl' and 'sacl' attributes Bugfixes and Cleanups: - Fixes for reporting mapping errors - Fixes for memory allocation errors - Improve warning message when locks are lost - Update documentation for the nfs4_unique_id parameter - Add an explanation of NFSv4 client identifiers - Ensure the i_size attribute is written to the fscache storage - Fix freeing uninitialized nfs4_labels - Better handling when xprtrdma bc_serv is NULL - Mark qualified async operations as MOVEABLE tasks" * tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.1 mark qualified async operations as MOVEABLE tasks xprtrdma: treat all calls not a bcall when bc_serv is NULL NFSv4: Fix free of uninitialized nfs4_label on referral lookup. NFS: Pass i_size to fscache_unuse_cookie() when a file is released Documentation: Add an explanation of NFSv4 client identifiers NFS: update documentation for the nfs4_unique_id parameter NFS: Improve warning message when locks are lost. NFSv4.1: Enable access to the NFSv4.1 'dacl' and 'sacl' attributes NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributes NFSv4: Specify the type of ACL to cache NFSv4: Don't hold the layoutget locks across multiple RPC calls pNFS/files: Fall back to I/O through the MDS on non-fatal layout errors NFS: Further fixes to the writeback error handling NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout NFS: Memory allocation failures are not server fatal errors NFS: Don't report errors from nfs_pageio_complete() more than once NFS: Do not report flush errors in nfs_write_end() NFS: Don't report ENOSPC write errors twice NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS NFS: Do not report EINTR/ERESTARTSYS as mapping errors
| * | | | | | | | | NFSv4.1 mark qualified async operations as MOVEABLE tasksOlga Kornievskaia2022-05-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark async operations such as RENAME, REMOVE, COMMIT MOVEABLE for the nfsv4.1+ sessions. Fixes: 85e39feead948 ("NFSv4.1 identify and mark RPC tasks that can move between transports") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | | | | | NFSv4: Fix free of uninitialized nfs4_label on referral lookup.Benjamin Coddington2022-05-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Send along the already-allocated fattr along with nfs4_fs_locations, and drop the memcpy of fattr. We end up growing two more allocations, but this fixes up a crash as: PID: 790 TASK: ffff88811b43c000 CPU: 0 COMMAND: "ls" #0 [ffffc90000857920] panic at ffffffff81b9bfde #1 [ffffc900008579c0] do_trap at ffffffff81023a9b #2 [ffffc90000857a10] do_error_trap at ffffffff81023b78 #3 [ffffc90000857a58] exc_stack_segment at ffffffff81be1f45 #4 [ffffc90000857a80] asm_exc_stack_segment at ffffffff81c009de #5 [ffffc90000857b08] nfs_lookup at ffffffffa0302322 [nfs] #6 [ffffc90000857b70] __lookup_slow at ffffffff813a4a5f #7 [ffffc90000857c60] walk_component at ffffffff813a86c4 #8 [ffffc90000857cb8] path_lookupat at ffffffff813a9553 #9 [ffffc90000857cf0] filename_lookup at ffffffff813ab86b Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Fixes: 9558a007dbc3 ("NFS: Remove the label from the nfs4_lookup_res struct") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | | | | | NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributesTrond Myklebust2022-05-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the ability to set or retrieve the acl using the NFSv4.1 'dacl' and 'sacl' attributes to the NFSv4 xdr encoders/decoders. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | | | | | NFSv4: Specify the type of ACL to cacheTrond Myklebust2022-05-172-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When caching a NFSv4 ACL, we want to specify whether we are caching an NFSv4.0 type acl, the NFSv4.1 dacl or the NFSv4.1 sacl. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | | | | | | | | | Merge tag 'f2fs-for-5.19-rc1' of ↵Linus Torvalds2022-05-311-32/+109
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've refactored the existing atomic write support implemented by in-memory operations to have storing data in disk temporarily, which can give us a benefit to accept more atomic writes. At the same time, we removed the existing volatile write support. We've also revisited the file pinning and GC flows and found some corner cases which contributeed abnormal system behaviours. As usual, there're several minor code refactoring for readability, sanity check, and clean ups. Enhancements: - allow compression for mmap files in compress_mode=user - kill volatile write support - change the current atomic write way - give priority to select unpinned section for foreground GC - introduce data read/write showing path info - remove unnecessary f2fs_lock_op in f2fs_new_inode Bug fixes: - fix the file pinning flow during checkpoint=disable and GCs - fix foreground and background GCs to select the right victims and get free sections on time - fix GC flags on defragmenting pages - avoid an infinite loop to flush node pages - fix fallocate to use file_modified to update permissions consistently" * tag 'f2fs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits) f2fs: fix to tag gcing flag on page during file defragment f2fs: replace F2FS_I(inode) and sbi by the local variable f2fs: add f2fs_init_write_merge_io function f2fs: avoid unneeded error handling for revoke_entry_slab allocation f2fs: allow compression for mmap files in compress_mode=user f2fs: fix typo in comment f2fs: make f2fs_read_inline_data() more readable f2fs: fix to do sanity check for inline inode f2fs: fix fallocate to use file_modified to update permissions consistently f2fs: don't use casefolded comparison for "." and ".." f2fs: do not stop GC when requiring a free section f2fs: keep wait_ms if EAGAIN happens f2fs: introduce f2fs_gc_control to consolidate f2fs_gc parameters f2fs: reject test_dummy_encryption when !CONFIG_FS_ENCRYPTION f2fs: kill volatile write support f2fs: change the current atomic write way f2fs: don't need inode lock for system hidden quota f2fs: stop allocating pinned sections if EAGAIN happens f2fs: skip GC if possible when checkpoint disabling f2fs: give priority to select unpinned section for foreground GC ...
| * | | | | | | | | | f2fs: do not stop GC when requiring a free sectionJaegeuk Kim2022-05-171-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The f2fs_gc uses a bitmap to indicate pinned sections, but when disabling chckpoint, we call f2fs_gc() with NULL_SEGNO which selects the same dirty segment as a victim all the time, resulting in checkpoint=disable failure, for example. Let's pick another one, if we fail to collect it. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | | | | | | | | f2fs: introduce f2fs_gc_control to consolidate f2fs_gc parametersJaegeuk Kim2022-05-121-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No functional change. - remove checkpoint=disable check for f2fs_write_checkpoint - get sec_freed all the time Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | | | | | | | | f2fs: change the current atomic write wayDaeho Jeong2022-05-121-22/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current atomic write has three major issues like below. - keeps the updates in non-reclaimable memory space and they are even hard to be migrated, which is not good for contiguous memory allocation. - disk spaces used for atomic files cannot be garbage collected, so this makes it difficult for the filesystem to be defragmented. - If atomic write operations hit the threshold of either memory usage or garbage collection failure count, All the atomic write operations will fail immediately. To resolve the issues, I will keep a COW inode internally for all the updates to be flushed from memory, when we need to flush them out in a situation like high memory pressure. These COW inodes will be tagged as orphan inodes to be reclaimed in case of sudden power-cut or system failure during atomic writes. Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>