summaryrefslogtreecommitdiffstats
path: root/include/linux/netpoll.h
Commit message (Collapse)AuthorAgeFilesLines
* netpoll: add net device refcount tracker to struct netpollEric Dumazet2021-12-061-0/+1
| | | | | Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* netpoll: Remove unused inline function netpoll_netdev_init()YueHaibing2020-07-151-3/+0
| | | | | | | | commit d565b0a1a9b6 ("net: Add Generic Receive Offload infrastructure") left behind this, remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: netpoll_send_skb() returns transmit statusEric Dumazet2020-05-071-1/+1
| | | | | | | | Some callers want to know if the packet has been sent or dropped, to inform upper stacks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: move netpoll_send_skb() out of lineEric Dumazet2020-05-071-8/+1
| | | | | | | | There is no need to inline this helper, as we intend to add more code in this function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: remove dev argument from netpoll_send_skb_on_dev()Eric Dumazet2020-05-071-3/+2
| | | | | | | | | | | netpoll_send_skb_on_dev() can get the device pointer directly from np->dev Rename it to __netpoll_send_skb() Following patch will move netpoll_send_skb() out-of-line. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: allow cleanup to be synchronousDebabrata Banerjee2018-10-191-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This fixes a problem introduced by: commit 2cde6acd49da ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock") When using netconsole on a bond, __netpoll_cleanup can asynchronously recurse multiple times, each __netpoll_free_async call can result in more __netpoll_free_async's. This means there is now a race between cleanup_work queues on multiple netpoll_info's on multiple devices and the configuration of a new netpoll. For example if a netconsole is set to enable 0, reconfigured, and enable 1 immediately, this netconsole will likely not work. Given the reason for __netpoll_free_async is it can be called when rtnl is not locked, if it is locked, we should be able to execute synchronously. It appears to be locked everywhere it's called from. Generalize the design pattern from the teaming driver for current callers of __netpoll_free_async. CC: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: make ndo_poll_controller() optionalEric Dumazet2018-09-231-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. It seems that all networking drivers that do use NAPI for their TX completions, should not provide a ndo_poll_controller(). NAPI drivers have netpoll support already handled in core networking stack, since netpoll_poll_dev() uses poll_napi(dev) to iterate through registered NAPI contexts for a device. This patch allows netpoll_poll_dev() to process NAPI contexts even for drivers not providing ndo_poll_controller(), allowing for following patches in NAPI drivers. Also we export netpoll_poll_dev() so that it can be called by bonding/team drivers in following patches. Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman2017-11-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* net: convert netpoll_info.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-011-1/+2
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: more efficient lockingEric Dumazet2016-11-161-6/+7
| | | | | | | | | | | | | | | | | Callers of netpoll_poll_lock() own NAPI_STATE_SCHED Callers of netpoll_poll_unlock() have BH blocked between the NAPI_STATE_SCHED being cleared and poll_lock is released. We can avoid the spinlock which has no contention, and use cmpxchg() on poll_owner which we need to set anyway. This removes a possible lockdep violation after the cited commit, since sk_busy_loop() re-enables BH before calling busy_poll_stop() Fixes: 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enableEric W. Biederman2014-03-291-4/+4
| | | | | | | | | | | The netpoll_rx_enable and netpoll_rx_disable functions have always controlled polling the network drivers transmit and receive queues. Rename them to netpoll_poll_enable and netpoll_poll_disable to make their functionality clear. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Remove gfp parameter from __netpoll_setupEric W. Biederman2014-03-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfp parameter was added in: commit 47be03a28cc6c80e3aa2b3e8ed6d960ff0c5c0af Author: Amerigo Wang <amwang@redhat.com> Date: Fri Aug 10 01:24:37 2012 +0000 netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup() slave_enable_netpoll() and __netpoll_setup() may be called with read_lock() held, so should use GFP_ATOMIC to allocate memory. Eric suggested to pass gfp flags to __netpoll_setup(). Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> The reason for the gfp parameter was removed in: commit c4cdef9b7183159c23c7302aaf270d64c549f557 Author: dingtianhong <dingtianhong@huawei.com> Date: Tue Jul 23 15:25:27 2013 +0800 bonding: don't call slave_xxx_netpoll under spinlocks The slave_xxx_netpoll will call synchronize_rcu_bh(), so the function may schedule and sleep, it should't be called under spinlocks. bond_netpoll_setup() and bond_netpoll_cleanup() are always protected by rtnl lock, it is no need to take the read lock, as the slave list couldn't be changed outside rtnl lock. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Nothing else that calls __netpoll_setup or ndo_netpoll_setup requires a gfp paramter, so remove the gfp parameter from both of these functions making the code clearer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)Eric W. Biederman2014-03-171-84/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | The netpoll packet receive code only becomes active if the netpoll rx_skb_hook is implemented, and there is not a single implementation of the netpoll rx_skb_hook in the kernel. All of the out of tree implementations I have found all call netpoll_poll which was removed from the kernel in 2011, so this change should not add any additional breakage. There are problems with the netpoll packet receive code. __netpoll_rx does not call dev_kfree_skb_irq or dev_kfree_skb_any in hard irq context. netpoll_neigh_reply leaks every skb it receives. Reception of packets does not work successfully on stacked devices (aka bonding, team, bridge, and vlans). Given that the netpoll packet receive code is buggy, there are no out of tree users that will be merged soon, and the code has not been used for in tree for a decade let's just remove it. Reverting this commit can server as a starting point for anyone who wants to resurrect netpoll packet reception support. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Move all receive processing under CONFIG_NETPOLL_TRAPEric W. Biederman2014-03-171-33/+40
| | | | | | | | | | | | | | | | | | | | | Make rx_skb_hook, and rx in struct netpoll depend on CONFIG_NETPOLL_TRAP Make rx_lock, rx_np, and neigh_tx in struct netpoll_info depend on CONFIG_NETPOLL_TRAP Make the functions netpoll_rx_on, netpoll_rx, and netpoll_receive_skb no-ops when CONFIG_NETPOLL_TRAP is not set. Only build netpoll_neigh_reply, checksum_udp service_neigh_queue, pkt_is_ns, and __netpoll_rx when CONFIG_NETPOLL_TRAP is defined. Add helper functions netpoll_trap_setup, netpoll_trap_setup_info, netpoll_trap_cleanup, and netpoll_trap_cleanup_info that initialize and cleanup the struct netpoll and struct netpoll_info receive specific fields when CONFIG_NETPOLL_TRAP is enabled and do nothing otherwise. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAPEric W. Biederman2014-03-171-2/+9
| | | | | | | | | | | | Now that we no longer need to receive packets to safely drain the network drivers receive queue move netpoll_trap and netpoll_set_trap under CONFIG_NETPOLL_TRAP Making netpoll_trap and netpoll_set_trap noop inline functions when CONFIG_NETPOLL_TRAP is not set. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Don't drop all received packets.Eric W. Biederman2014-03-171-2/+1
| | | | | | | | | | | | | | | | | | | | Change the strategy of netpoll from dropping all packets received during netpoll_poll_dev to calling napi poll with a budget of 0 (to avoid processing drivers rx queue), and to ignore packets received with netif_rx (those will safely be placed on the backlog queue). All of the netpoll supporting drivers have been reviewed to ensure either thay use netif_rx or that a budget of 0 is supported by their napi poll routine and that a budget of 0 will not process the drivers rx queues. Not dropping packets makes NETPOLL_RX_DROP unnecesary so it is removed. npinfo->rx_flags is removed as rx_flags with just the NETPOLL_RX_ENABLED flag becomes just a redundant mirror of list_empty(&npinfo->rx_np). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Add netpoll_rx_processingEric W. Biederman2014-03-171-4/+14
| | | | | | | | Add a helper netpoll_rx_processing that reports when netpoll has receive side processing to perform. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: fix rx_hook() interface by passing the skbAntonio Quartulli2013-10-251-2/+3
| | | | | | | | | | | | | | | | | | Right now skb->data is passed to rx_hook() even if the skb has not been linearised and without giving rx_hook() a way to linearise it. Change the rx_hook() interface and make it accept the skb and the offset to the UDP payload as arguments. rx_hook() is also renamed to rx_skb_hook() to ensure that out of the tree users notice the API change. In this way any rx_skb_hook() implementation can perform all the needed operations to properly (and safely) access the skb data. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: remove return value from netpoll_rx_disable()dingtianhong2013-05-271-2/+2
| | | | | | | | The netpoll_rx_disable() will always return 0, it is no use and looks wordy, so remove the unnecessary code and get rid of it in _dev_open and _dev_close. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: convert mutex into a semaphoreNeil Horman2013-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bart Van Assche recently reported a warning to me: <IRQ> [<ffffffff8103d79f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8103d7fa>] warn_slowpath_null+0x1a/0x20 [<ffffffff814761dd>] mutex_trylock+0x16d/0x180 [<ffffffff813968c9>] netpoll_poll_dev+0x49/0xc30 [<ffffffff8136a2d2>] ? __alloc_skb+0x82/0x2a0 [<ffffffff81397715>] netpoll_send_skb_on_dev+0x265/0x410 [<ffffffff81397c5a>] netpoll_send_udp+0x28a/0x3a0 [<ffffffffa0541843>] ? write_msg+0x53/0x110 [netconsole] [<ffffffffa05418bf>] write_msg+0xcf/0x110 [netconsole] [<ffffffff8103eba1>] call_console_drivers.constprop.17+0xa1/0x1c0 [<ffffffff8103fb76>] console_unlock+0x2d6/0x450 [<ffffffff8104011e>] vprintk_emit+0x1ee/0x510 [<ffffffff8146f9f6>] printk+0x4d/0x4f [<ffffffffa0004f1d>] scsi_print_command+0x7d/0xe0 [scsi_mod] This resulted from my commit ca99ca14c which introduced a mutex_trylock operation in a path that could execute in interrupt context. When mutex debugging is enabled, the above warns the user when we are in fact exectuting in interrupt context interrupt context. After some discussion, It seems that a semaphore is the proper mechanism to use here. While mutexes are defined to be unusable in interrupt context, no such condition exists for semaphores (save for the fact that the non blocking api calls, like up and down_trylock must be used when in irq context). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Bart Van Assche <bvanassche@acm.org> CC: Bart Van Assche <bvanassche@acm.org> CC: David Miller <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lockNeil Horman2013-02-111-2/+2
| | | | | | | | | | | | | | | | | | | | | __netpoll_rcu_free is used to free netpoll structures when the rtnl_lock is already held. The mechanism is used to asynchronously call __netpoll_cleanup outside of the holding of the rtnl_lock, so as to avoid deadlock. Unfortunately, __netpoll_cleanup modifies pointers (dev->np), which means the rtnl_lock must be held while calling it. Further, it cannot be held, because rcu callbacks may be issued in softirq contexts, which cannot sleep. Fix this by converting the rcu callback to a work queue that is guaranteed to get scheduled in process context, so that we can hold the rtnl properly while calling __netpoll_cleanup Tested successfully by myself. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> CC: Cong Wang <amwang@redhat.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: protect napi_poll and poll_controller during dev_[open|close]Neil Horman2013-02-061-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ivan Vercera was recently backporting commit 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that, while this patch protects the tg3 driver from having its ndo_poll_controller routine called during device initalization, it does nothing for the driver during shutdown. I.e. it would be entirely possible to have the ndo_poll_controller method (or subsequently the ndo_poll) routine called for a driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or ndo_open routine could be called. Given that the two latter routines tend to initizlize and free many data structures that the former two rely on, the result can easily be data corruption or various other crashes. Furthermore, it seems that this is potentially a problem with all net drivers that support netpoll, and so this should ideally be fixed in a common path. As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic context, so I've come up with this solution. We can use a mutex to sleep in open/close paths and just do a mutex_trylock in the napi poll path and abandon the poll attempt if we're locked, as we'll just retry the poll on the next send anyway. I've tested this here by flooding netconsole with messages on a system whos nic driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx workqueue would be forced to send frames and poll the device. While this was going on I rapidly ifdown/up'ed the interface and watched for any problems. I've not found any. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Ivan Vecera <ivecera@redhat.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Francois Romieu <romieu@fr.zoreil.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: prepare for ipv6Cong Wang2013-01-081-2/+11
| | | | | | | | | This patch adjusts some struct and functions, to prepare for supporting IPv6. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: convert several functions to boolAmerigo Wang2012-08-141-7/+7
| | | | | | | | | These functions are just boolean, let them return bool instead of int. Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: take rcu_read_lock_bh() in netpoll_send_skb_on_dev()Amerigo Wang2012-08-141-0/+3
| | | | | | | | | | | | | | | | This patch fixes several problems in the call path of netpoll_send_skb_on_dev(): 1. Disable IRQ's before calling netpoll_send_skb_on_dev(). 2. All the callees of netpoll_send_skb_on_dev() should use rcu_dereference_bh() to dereference ->npinfo. 3. Rename arp_reply() to netpoll_arp_reply(), the former is too generic. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: use netpoll_rx_on() in netpoll_rx()Amerigo Wang2012-08-141-9/+9
| | | | | | | | The logic of the code is same, just call netpoll_rx_on(). Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: take rcu_read_lock_bh() in netpoll_rx()Amerigo Wang2012-08-141-2/+2
| | | | | | | | | | In __netpoll_rx(), it dereferences ->npinfo without rcu_dereference_bh(), this patch fixes it by using the 'npinfo' passed from netpoll_rx() where it is already dereferenced with rcu_dereference_bh(). Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: make __netpoll_cleanup non-blockAmerigo Wang2012-08-141-0/+3
| | | | | | | | | | Like the previous patch, slave_disable_netpoll() and __netpoll_cleanup() may be called with read_lock() held too, so we should make them non-block, by moving the cleanup and kfree() to call_rcu_bh() callbacks. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup()Amerigo Wang2012-08-141-1/+1
| | | | | | | | | | | | | slave_enable_netpoll() and __netpoll_setup() may be called with read_lock() held, so should use GFP_ATOMIC to allocate memory. Eric suggested to pass gfp flags to __netpoll_setup(). Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: move np->dev and np->dev_name init into __netpoll_setup()Jiri Pirko2012-07-171-1/+1
| | | | | Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Remove unused EXPORT_SYMBOLs of netpoll_poll and netpoll_poll_devJoe Perches2011-07-031-2/+0
| | | | | | | | | | | | | | | Unused symbols waste space. Commit 0e34e93177fb "(netpoll: add generic support for bridge and bonding devices)" added the symbol more than a year ago with the promise of "future use". Because it is so far unused, remove it for now. It can be easily readded if or when it actually needs to be used. cc: WANG Cong <amwang@redhat.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Fix bonding drivers improper modification of netpoll structureNeil Horman2010-10-181-2/+7
| | | | | | | | | | | | | | | | The bonding driver currently modifies the netpoll structure in its xmit path while sending frames from netpoll. This is racy, as other cpus can access the netpoll structure in parallel. Since the bonding driver points np->dev to a slave device, other cpus can inadvertently attempt to send data directly to slave devices, leading to improper locking with the bonding master, lost frames, and deadlocks. This patch fixes that up. This patch also removes the real_dev pointer from the netpoll structure as that data is really only used by bonding in the poll_controller, and we can emulate its behavior by check each slave for IS_UP. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Disable IRQ around RCU dereference in netpoll_rxHerbert Xu2010-09-171-4/+4
| | | | | | | | | | | | | We cannot use rcu_dereference_bh safely in netpoll_rx as we may be called with IRQs disabled. We could however simply disable IRQs as that too causes BH to be disabled and is safe in either case. Thanks to John Linville for discovering this bug and providing a patch. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: make netpoll_rx return bool for !CONFIG_NETPOLLJohn W. Linville2010-08-101-1/+1
| | | | | | | | "netpoll: Use 'bool' for netpoll_rx() return type." missed the case when CONFIG_NETPOLL is disabled. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Use correct primitives for RCU dereferencingHerbert Xu2010-06-151-2/+2
| | | | | | | | | Now that RCU debugging checks for matching rcu_dereference calls and rcu_read_lock, we need to use the correct primitives or face nasty warnings. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Add netpoll_tx_runningHerbert Xu2010-06-151-0/+9
| | | | | | | | | | | | | | This patch adds the helper netpoll_tx_running for use within ndo_start_xmit. It returns non-zero if ndo_start_xmit is being invoked by netpoll, and zero otherwise. This is currently implemented by simply looking at the hardirq count. This is because for all non-netpoll uses of ndo_start_xmit, IRQs must be enabled while netpoll always disables IRQs before calling ndo_start_xmit. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Allow netpoll_setup/cleanup recursionHerbert Xu2010-06-151-0/+2
| | | | | | | | | | | This patch adds the functions __netpoll_setup/__netpoll_cleanup which is designed to be called recursively through ndo_netpoll_seutp. They must be called with RTNL held, and the caller must initialise np->dev and ensure that it has a valid reference count. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Fix RCU usageHerbert Xu2010-06-151-5/+8
| | | | | | | | | | | | | | | | | The use of RCU in netpoll is incorrect in a number of places: 1) The initial setting is lacking a write barrier. 2) The synchronize_rcu is in the wrong place. 3) Read barriers are missing. 4) Some places are even missing rcu_read_lock. 5) npinfo is zeroed after freeing. This patch fixes those issues. As most users are in BH context, this also converts the RCU usage to the BH variant. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Use 'bool' for netpoll_rx() return type.David S. Miller2010-05-061-4/+4
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: add generic support for bridge and bonding devicesWANG Cong2010-05-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This whole patchset is for adding netpoll support to bridge and bonding devices. I already tested it for bridge, bonding, bridge over bonding, and bonding over bridge. It looks fine now. To make bridge and bonding support netpoll, we need to adjust some netpoll generic code. This patch does the following things: 1) introduce two new priv_flags for struct net_device: IFF_IN_NETPOLL which identifies we are processing a netpoll; IFF_DISABLE_NETPOLL is used to disable netpoll support for a device at run-time; 2) introduce one new method for netdev_ops: ->ndo_netpoll_cleanup() is used to clean up netpoll when a device is removed. 3) introduce netpoll_poll_dev() which takes a struct net_device * parameter; export netpoll_send_skb() and netpoll_poll_dev() which will be used later; 4) hide a pointer to struct netpoll in struct netpoll_info, ditto. 5) introduce ->real_dev for struct netpoll. 6) introduce a new status NETDEV_BONDING_DESLAE, which is used to disable netconsole before releasing a slave, to avoid deadlocks. Cc: David Miller <davem@davemloft.net> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: allow execution of multiple rx_hooks per interfaceDaniel Borkmann2010-01-131-3/+8
| | | | | Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: store local and remote ip in net-endianHarvey Harrison2009-03-281-1/+1
| | | | | | | | Allows for the removal of byteswapping in some places and the removal of HIPQUAD (replaced by %pI4). Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* GRO: Move netpoll checks to correct locationHerbert Xu2009-03-161-0/+11
| | | | | | | | | | | | | | | | | | | | | As my netpoll fix for net doesn't really work for net-next, we need this update to move the checks into the right place. As it stands we may pass freed skbs to netpoll_receive_skb. This patch also introduces a netpoll_rx_on function to avoid GRO completely if we're invoked through netpoll. This might seem paranoid but as netpoll may have an external receive hook it's better to be safe than sorry. I don't think we need this for 2.6.29 though since there's nothing immediately broken by it. This patch also moves the GRO_* return values to netdevice.h since VLAN needs them too (I tried to avoid this originally but alas this seems to be the easiest way out). This fixes a bug in VLAN where it continued to use the old return value 2 instead of the correct GRO_DROP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add Generic Receive Offload infrastructureHerbert Xu2008-12-151-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the top-level GRO (Generic Receive Offload) infrastructure. This is pretty similar to LRO except that this is protocol-independent. Instead of holding packets in an lro_mgr structure, they're now held in napi_struct. For drivers that intend to use this, they can set the NETIF_F_GRO bit and call napi_gro_receive instead of netif_receive_skb or just call netif_rx. The latter will call napi_receive_skb automatically. When napi_gro_receive is used, the driver must either call napi_complete/napi_rx_complete, or call napi_gro_flush in softirq context if the driver uses the primitives __napi_complete/__napi_rx_complete. Protocols will set the gro_receive and gro_complete function pointers in order to participate in this scheme. In addition to the packet, gro_receive will get a list of currently held packets. Each packet in the list has a same_flow field which is non-zero if it is a potential match for the new packet. For each packet that may match, they also have a flush field which is non-zero if the held packet must not be merged with the new packet. Once gro_receive has determined that the new skb matches a held packet, the held packet may be processed immediately if the new skb cannot be merged with it. In this case gro_receive should return the pointer to the existing skb in gro_list. Otherwise the new skb should be merged into the existing packet and NULL should be returned, unless the new skb makes it impossible for any further merges to be made (e.g., FIN packet) where the merged skb should be returned. Whenever the skb is merged into an existing entry, the gro_receive function should set NAPI_GRO_CB(skb)->same_flow. Note that if an skb merely matches an existing entry but can't be merged with it, then this shouldn't be set. If gro_receive finds it pointless to hold the new skb for future merging, it should set NAPI_GRO_CB(skb)->flush. Held packets will be flushed by napi_gro_flush which is called by napi_complete and napi_rx_complete. Currently held packets are stored in a singly liked list just like LRO. The list is limited to a maximum of 8 entries. In future, this may be expanded to use a hash table to allow more flows to be held for merging. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Revert two bogus cleanups that broke netconsole.David S. Miller2008-03-041-3/+4
| | | | | | | | | | | | | | | Based upon a report by Andrew Morton and code analysis done by Jarek Poplawski. This reverts 33f807ba0d9259e7c75c7a2ce8bd2787e5b540c7 ("[NETPOLL]: Kill NETPOLL_RX_DROP, set but never tested.") and c7b6ea24b43afb5749cb704e143df19d70e23dea ("[NETPOLL]: Don't need rx_flags."). The rx_flags did get tested for zero vs. non-zero and therefore we do need those tests and that code which sets NETPOLL_RX_DROP et al. Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Don't need rx_flags.Stephen Hemminger2008-01-281-4/+3
| | | | | | | | The rx_flags variable is redundant. Turning rx on/off is done via setting the rx_np pointer. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: no need to store local_macStephen Hemminger2008-01-281-1/+1
| | | | | | | | The local_mac is managed by the network device, no need to keep a spare copy and all the management problems that could cause. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] netconsole: Support dynamic reconfiguration using configfsSatyam Sharma2007-10-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based upon initial work by Keiichi Kii <k-keiichi@bx.jp.nec.com>. This patch introduces support for dynamic reconfiguration (adding, removing and/or modifying parameters of netconsole targets at runtime) using a userspace interface exported via configfs. Documentation is also updated accordingly. Issues and brief design overview: (1) Kernel-initiated creation / destruction of kernel objects is not possible with configfs -- the lifetimes of the "config items" is managed exclusively from userspace. But netconsole must support boot/module params too, and these are parsed in kernel and hence netpolls must be setup from the kernel. Joel Becker suggested to separately manage the lifetimes of the two kinds of netconsole_target objects -- those created via configfs mkdir(2) from userspace and those specified from the boot/module option string. This adds complexity and some redundancy here and also means that boot/module param-created targets are not exposed through the configfs namespace (and hence cannot be updated / destroyed dynamically). However, this saves us from locking / refcounting complexities that would need to be introduced in configfs to support kernel-initiated item creation / destroy there. (2) In configfs, item creation takes place in the call chain of the mkdir(2) syscall in the driver subsystem. If we used an ioctl(2) to create / destroy objects from userspace, the special userspace program is able to fill out the structure to be passed into the ioctl and hence specify attributes such as local interface that are required at the time we set up the netpoll. For configfs, this information is not available at the time of mkdir(2). So, we keep all newly-created targets (via configfs) disabled by default. The user is expected to set various attributes appropriately (including the local network interface if required) and then write(2) "1" to the "enabled" attribute. Thus, netpoll_setup() is then called on the set parameters in the context of _this_ write(2) on the "enabled" attribute itself. This design enables the user to reconfigure existing netconsole targets at runtime to be attached to newly-come-up interfaces that may not have existed when netconsole was loaded or when the targets were actually created. All this effectively enables us to get rid of custom ioctls. (3) Ultra-paranoid configfs attribute show() and store() operations, with sanity and input range checking, using only safe string primitives, and compliant with the recommendations in Documentation/filesystems/sysfs.txt. (4) A new function netpoll_print_options() is created in the netpoll API, that just prints out the configured parameters for a netpoll structure. netpoll_parse_options() is modified to use that and it is also exported to be used from netconsole. Signed-off-by: Satyam Sharma <satyam@infradead.org> Acked-by: Keiichi Kii <k-keiichi@bx.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Make NAPI polling independent of struct net_device objects.Stephen Hemminger2007-10-101-14/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several devices have multiple independant RX queues per net device, and some have a single interrupt doorbell for several queues. In either case, it's easier to support layouts like that if the structure representing the poll is independant from the net device itself. The signature of the ->poll() call back goes from: int foo_poll(struct net_device *dev, int *budget) to int foo_poll(struct napi_struct *napi, int budget) The caller is returned the number of RX packets processed (or the number of "NAPI credits" consumed if you want to get abstract). The callee no longer messes around bumping dev->quota, *budget, etc. because that is all handled in the caller upon return. The napi_struct is to be embedded in the device driver private data structures. Furthermore, it is the driver's responsibility to disable all NAPI instances in it's ->stop() device close handler. Since the napi_struct is privatized into the driver's private data structures, only the driver knows how to get at all of the napi_struct instances it may have per-device. With lots of help and suggestions from Rusty Russell, Roland Dreier, Michael Chan, Jeff Garzik, and Jamal Hadi Salim. Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra, Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan. [ Ported to current tree and all drivers converted. Integrated Stephen's follow-on kerneldoc additions, and restored poll_list handling to the old style to fix mutual exclusion issues. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* WorkQueue: Fix up arch-specific work items where possibleDavid Howells2006-12-051-1/+1
| | | | | | | | | | Fix up arch-specific work items where possible to use the new work_struct and delayed_work structs. Three places that enqueue bits of their stack and then return have been marked with #error as this is not permitted. Signed-Off-By: David Howells <dhowells@redhat.com>