summaryrefslogtreecommitdiffstats
path: root/net/core/netpoll.c
Commit message (Collapse)AuthorAgeFilesLines
* NET: netpoll, fix potential NULL ptr dereferenceJiri Slaby2010-03-161-2/+2
| | | | | | | | | | | | | Stanse found that one error path in netpoll_setup dereferences npinfo even though it is NULL. Avoid that by adding new label and go to that instead. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Daniel Borkmann <danborkmann@googlemail.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: chavey@google.com Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: allow execution of multiple rx_hooks per interfaceDaniel Borkmann2010-01-131-63/+106
| | | | | Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-09-021-0/+5
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/yellowfin.c
| * netpoll: warning for ndo_start_xmit returns with interrupts enabledDongdong Deng2009-08-231-0/+5
| | | | | | | | | | | | | | | | | | | | WARN_ONCE for ndo_start_xmit() enable interrupts in netpoll_send_skb(), because the NETPOLL API requires that interrupts remain disabled in netpoll_send_skb(). Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-07-091-1/+1
|\| | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * netpoll: Fix carrier detection for drivers that are using phylibAnton Vorontsov2009-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using early netconsole and gianfar driver this error pops up: netconsole: timeout waiting for carrier It appears that net/core/netpoll.c:netpoll_setup() is using cond_resched() in a loop waiting for a carrier. The thing is that cond_resched() is a no-op when system_state != SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never scheduled, therefore link detection doesn't work. I belive that the main problem is in cond_resched()[1], but despite how the cond_resched() story ends, it might be a good idea to call msleep(1) instead of cond_resched(), as suggested by Andrew Morton. [1] http://lkml.org/lkml/2009/7/7/463 Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netpoll: Introduce netpoll_carrier_timeout kernel optionAnton Vorontsov2009-07-081-1/+5
|/ | | | | | | | | | | | | | Some PHYs require longer timeouts for carrier detection, and auto-negotiation process may take indefinite amount of time. It may be inconvenient to force longer timeouts for sane PHYs, so let's introduce a kernel command line option. Since we're using module_param(), the option also can be changed in runtime. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-06-151-1/+1
|\ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/scsi/fcoe/fcoe.c net/core/drop_monitor.c net/core/net-traces.c
* | net: txq_trans_update() helperEric Dumazet2009-05-251-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We would like to get rid of netdev->trans_start = jiffies; that about all net drivers have to use in their start_xmit() function, and use txq->trans_start instead. This can be done generically in core network, as suggested by David. Some devices, (particularly loopback) dont need trans_start update, because they dont have transmit watchdog. We could add a new device flag, or rely on fact that txq->tran_start can be updated is txq->xmit_lock_owner is different than -1. Use a helper function to hide our choice. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Fix arg to trace_napi_poll() in netpoll.David S. Miller2009-05-211-1/+1
| | | | | | | | | | | | Reproted by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
* | dropmon: add ability to detect when hardware dropsrxpacketsNeil Horman2009-05-211-0/+2
|/ | | | | | | | | | | | | | | | | | | | Patch to add the ability to detect drops in hardware interfaces via dropwatch. Adds a tracepoint to net_rx_action to signal everytime a napi instance is polled. The dropmon code then periodically checks to see if the rx_frames counter has changed, and if so, adds a drop notification to the netlink protocol, using the reserved all-0's vector to indicate the drop location was in hardware, rather than somewhere in the code. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/net_dropmon.h | 8 ++ include/trace/napi.h | 11 +++ net/core/dev.c | 5 + net/core/drop_monitor.c | 124 ++++++++++++++++++++++++++++++++++++++++++-- net/core/net-traces.c | 4 + net/core/netpoll.c | 2 6 files changed, 149 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: don't dereference NULL dev from npPavel Emelyanov2009-05-171-2/+6
| | | | | | | | | It looks like the dev in netpoll_poll can be NULL - at lease it's checked at the function beginning. Thus the dev->netde_ops dereference looks dangerous. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: store local and remote ip in net-endianHarvey Harrison2009-03-281-16/+15
| | | | | | | | Allows for the removal of byteswapping in some places and the removal of HIPQUAD (replaced by %pI4). Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2008-12-151-0/+2
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c
| * netpoll: fix race on poll_list resulting in garbage entryNeil Horman2008-12-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A few months back a race was discused between the netpoll napi service path, and the fast path through net_rx_action: http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470 A patch was submitted for that bug, but I think we missed a case. Consider the following scenario: INITIAL STATE CPU0 has one napi_struct A on its poll_list CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same napi_struct A that CPU0 has on its list CPU0 CPU1 net_rx_action poll_napi !list_empty (returns true) locks poll_lock for A poll_one_napi napi->poll netif_rx_complete __napi_complete (removes A from poll_list) list_entry(list->next) In the above scenario, net_rx_action assumes that the per-cpu poll_list is exclusive to that cpu. netpoll of course violates that, and because the netpoll path can dequeue from the poll list, its possible for CPU0 to detect a non-empty list at the top of the while loop in net_rx_action, but have it become empty by the time it calls list_entry. Since the poll_list isn't surrounded by any other structure, the returned data from that list_entry call in this situation is garbage, and any number of crashes can result based on what exactly that garbage is. Given that its not fasible for performance reasons to place exclusive locks arround each cpus poll list to provide that mutal exclusion, I think the best solution is modify the netpoll path in such a way that we continue to guarantee that the poll_list for a cpu is in fact exclusive to that cpu. To do this I've implemented the patch below. It adds an additional bit to the state field in the napi_struct. When executing napi->poll from the netpoll_path, this bit will be set. When a driver calls netif_rx_complete, if that bit is set, it will not remove the napi_struct from the poll_list. That work will be saved for the next iteration of net_rx_action. I've tested this and it seems to work well. About the biggest drawback I can see to it is the fact that it might result in an extra loop through net_rx_action in the event that the device is actually contended for (i.e. the netpoll path actually preforms all the needed work no the device, and the call to net_rx_action winds up doing nothing, except removing the napi_struct from the poll_list. However I think this is probably a small price to pay, given that the alternative is a crash. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netdev: add more functions to netdevice opsStephen Hemminger2008-11-201-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | This patch moves neigh_setup and hard_start_xmit into the network device ops structure. For bisection, fix all the previously converted drivers as well. Bonding driver took the biggest hit on this. Added a prefetch of the hard_start_xmit in the fast path to try and reduce any impact this would have. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netdev: network device operations infrastructureStephen Hemminger2008-11-191-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the network device internal API to move adminstrative operations out of the network device structure and into a separate structure. This patch involves some hackery to maintain compatablity between the new and old model, so all 300+ drivers don't have to be changed at once. For drivers that aren't converted yet, the netdevice_ops virt function list still resides in the net_device structure. For old protocols, the new net_device_ops are copied out to the old net_device pointers. After the transistion is completed the nag message can be changed to an WARN_ON, and the compatiablity code can be made configurable. Some function pointers aren't moved: * destructor can't be in net_device_ops because it may need to be referenced after the module is unloaded. * neighbor setup is manipulated in a couple of places that need special consideration * hard_start_xmit is in the fast path for transmit. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | include/net net/ - csum_partial - remove unnecessary castsJoe Perches2008-11-191-1/+1
| | | | | | | | | | | | | | | | The first argument to csum_partial is const void * casts to char/u8 * are not necessary Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: convert print_mac to %pMJohannes Berg2008-10-271-3/+2
|/ | | | | | | | | | | | This converts pretty much everything to print_mac. There were a few things that had conflicts which I have just dropped for now, no harm done. I've built an allyesconfig with this and looked at the files that weren't built very carefully, but it's a huge patch. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* netdev: Fix lockdep warnings in multiqueue configurations.David S. Miller2008-07-311-0/+1
| | | | | | | | | | | | | | When support for multiple TX queues were added, the netif_tx_lock() routines we converted to iterate over all TX queues and grab each queue's spinlock. This causes heartburn for lockdep and it's not a healthy thing to do with lots of TX queues anyways. So modify this to use a top-level lock and a "frozen" state for the individual TX queues. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Use queue aware tests throughout.David S. Miller2008-07-171-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This effectively "flips the switch" by making the core networking and multiqueue-aware drivers use the new TX multiqueue structures. Non-multiqueue drivers need no changes. The interfaces they use such as netif_stop_queue() degenerate into an operation on TX queue zero. So everything "just works" for them. Code that really wants to do "X" to all TX queues now invokes a routine that does so, such as netif_tx_wake_all_queues(), netif_tx_stop_all_queues(), etc. pktgen and netpoll required a little bit more surgery than the others. In particular the pktgen changes, whilst functional, could be largely improved. The initial check in pktgen_xmit() will sometimes check the wrong queue, which is mostly harmless. The thing to do is probably to invoke fill_packet() earlier. The bulk of the netpoll changes is to make the code operate solely on the TX queue indicated by by the SKB queue mapping. Setting of the SKB queue mapping is entirely confined inside of net/core/dev.c:dev_pick_tx(). If we end up needing any kind of special semantics (drops, for example) it will be implemented here. Finally, we now have a "real_num_tx_queues" which is where the driver indicates how many TX queues are actually active. With IGB changes from Jeff Kirsher. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Allow netdevices to specify needed head/tailroomJohannes Berg2008-05-121-1/+1
| | | | | | | | | | This patch adds needed_headroom/needed_tailroom members to struct net_device and updates many places that allocate sbks to use them. Not all of them can be converted though, and I'm sure I missed some (I mostly grepped for LL_RESERVED_SPACE) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2008-03-211-2/+4
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
| * netpoll: zap_completion_queue: adjust skb->users counterJarek Poplawski2008-03-201-2/+4
| | | | | | | | | | | | | | | | | | | | | | zap_completion_queue() retrieves skbs from completion_queue where they have zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero yet, so it's increased now. Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2008-03-051-4/+8
|\| | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/rc80211_pid_algo.c
| * [NETPOLL]: Revert two bogus cleanups that broke netconsole.David S. Miller2008-03-041-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based upon a report by Andrew Morton and code analysis done by Jarek Poplawski. This reverts 33f807ba0d9259e7c75c7a2ce8bd2787e5b540c7 ("[NETPOLL]: Kill NETPOLL_RX_DROP, set but never tested.") and c7b6ea24b43afb5749cb704e143df19d70e23dea ("[NETPOLL]: Don't need rx_flags."). The rx_flags did get tested for zero vs. non-zero and therefore we do need those tests and that code which sets NETPOLL_RX_DROP et al. Signed-off-by: David S. Miller <davem@davemloft.net>
* | [ARP]: Introduce the arp_hdr_len helper.Pavel Emelyanov2008-03-031-4/+2
|/ | | | | | | | | | | | | | There are some place, that calculate the ARP header length. These calculations are correct, but a) some operate with "magic" constants, b) enlarge the code length (sometimes at the cost of coding style), c) are not informative from the first glance. The proposal is to introduce a helper, that includes all the good sides of these calculations. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] net/core: Use ipv4_is_<type>Joe Perches2008-01-281-1/+2
| | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Don't need rx_flags.Stephen Hemminger2008-01-281-4/+0
| | | | | | | | The rx_flags variable is redundant. Turning rx on/off is done via setting the rx_np pointer. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Kill NETPOLL_RX_DROP, set but never tested.Stephen Hemminger2008-01-281-4/+4
| | | | | Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: no need to store local_macStephen Hemminger2008-01-281-6/+3
| | | | | | | | The local_mac is managed by the network device, no need to keep a spare copy and all the management problems that could cause. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: netpoll_poll() cleanupStephen Hemminger2008-01-281-18/+14
| | | | | | | | | | Restructure code slightly to improve readability: * dereference device once * change obvious while() loop * let poll_napi() handle null list itself Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Use skb_queue_purge().Stephen Hemminger2008-01-281-5/+1
| | | | | | | Use standard routine for flushing queue. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Fix race between poll_napi() and net_rx_action()David S. Miller2007-10-291-9/+28
| | | | | | | | | | | | | | | | | | | | | | netpoll_poll_lock() synchronizes the ->poll() invocation code paths, but once we have the lock we have to make sure that NAPI_STATE_SCHED is still set. Otherwise we get: cpu 0 cpu 1 net_rx_action() poll_napi() netpoll_poll_lock() ... spin on ->poll_lock ->poll() netif_rx_complete netpoll_poll_unlock() acquire ->poll_lock() ->poll() netif_rx_complete() CRASH Based upon a bug report from Tina Yang. Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Hide the queue_mapping field inside netif_subqueue_stoppedPavel Emelyanov2007-10-221-2/+2
| | | | | | | | | | Many places get the queue_mapping field from skb to pass it to the netif_subqueue_stopped() which will be 0 in any case. Make the helper that works with sk_buff Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Wrap netdevice hardware header creation.Stephen Hemminger2007-10-101-5/+3
| | | | | | | | | | Add inline for common usage of hardware header creation, and fix bug in IPV6 mcast where the assumption about negative return is an errno. Negative return from hard_header means not enough space was available,(ie -N bytes). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Introduce and use print_mac() and DECLARE_MAC_BUF()Joe Perches2007-10-101-9/+3
| | | | | | | This is nicer than the MAC_FMT stuff. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Make the device list and device lookups per namespace.Eric W. Biederman2007-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] netconsole: Support dynamic reconfiguration using configfsSatyam Sharma2007-10-101-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based upon initial work by Keiichi Kii <k-keiichi@bx.jp.nec.com>. This patch introduces support for dynamic reconfiguration (adding, removing and/or modifying parameters of netconsole targets at runtime) using a userspace interface exported via configfs. Documentation is also updated accordingly. Issues and brief design overview: (1) Kernel-initiated creation / destruction of kernel objects is not possible with configfs -- the lifetimes of the "config items" is managed exclusively from userspace. But netconsole must support boot/module params too, and these are parsed in kernel and hence netpolls must be setup from the kernel. Joel Becker suggested to separately manage the lifetimes of the two kinds of netconsole_target objects -- those created via configfs mkdir(2) from userspace and those specified from the boot/module option string. This adds complexity and some redundancy here and also means that boot/module param-created targets are not exposed through the configfs namespace (and hence cannot be updated / destroyed dynamically). However, this saves us from locking / refcounting complexities that would need to be introduced in configfs to support kernel-initiated item creation / destroy there. (2) In configfs, item creation takes place in the call chain of the mkdir(2) syscall in the driver subsystem. If we used an ioctl(2) to create / destroy objects from userspace, the special userspace program is able to fill out the structure to be passed into the ioctl and hence specify attributes such as local interface that are required at the time we set up the netpoll. For configfs, this information is not available at the time of mkdir(2). So, we keep all newly-created targets (via configfs) disabled by default. The user is expected to set various attributes appropriately (including the local network interface if required) and then write(2) "1" to the "enabled" attribute. Thus, netpoll_setup() is then called on the set parameters in the context of _this_ write(2) on the "enabled" attribute itself. This design enables the user to reconfigure existing netconsole targets at runtime to be attached to newly-come-up interfaces that may not have existed when netconsole was loaded or when the targets were actually created. All this effectively enables us to get rid of custom ioctls. (3) Ultra-paranoid configfs attribute show() and store() operations, with sanity and input range checking, using only safe string primitives, and compliant with the recommendations in Documentation/filesystems/sysfs.txt. (4) A new function netpoll_print_options() is created in the netpoll API, that just prints out the configured parameters for a netpoll structure. netpoll_parse_options() is modified to use that and it is also exported to be used from netconsole. Signed-off-by: Satyam Sharma <satyam@infradead.org> Acked-by: Keiichi Kii <k-keiichi@bx.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Make NAPI polling independent of struct net_device objects.Stephen Hemminger2007-10-101-14/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several devices have multiple independant RX queues per net device, and some have a single interrupt doorbell for several queues. In either case, it's easier to support layouts like that if the structure representing the poll is independant from the net device itself. The signature of the ->poll() call back goes from: int foo_poll(struct net_device *dev, int *budget) to int foo_poll(struct napi_struct *napi, int budget) The caller is returned the number of RX packets processed (or the number of "NAPI credits" consumed if you want to get abstract). The callee no longer messes around bumping dev->quota, *budget, etc. because that is all handled in the caller upon return. The napi_struct is to be embedded in the device driver private data structures. Furthermore, it is the driver's responsibility to disable all NAPI instances in it's ->stop() device close handler. Since the napi_struct is privatized into the driver's private data structures, only the driver knows how to get at all of the napi_struct instances it may have per-device. With lots of help and suggestions from Rusty Russell, Roland Dreier, Michael Chan, Jeff Garzik, and Jamal Hadi Salim. Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra, Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan. [ Ported to current tree and all drivers converted. Integrated Stephen's follow-on kerneldoc additions, and restored poll_list handling to the old style to fix mutual exclusion issues. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert "[NET]: Fix races in net_rx_action vs netpoll."Linus Torvalds2007-07-161-8/+0
| | | | | | | | | | | | | | This reverts commit 29578624e354f56143d92510fff33a8b2aaa2c03. Ingo Molnar reports complete breakage with his e1000 card (no networking, card reports transmit timeouts), and bisected it down to this commit. Let's figure out what went wrong, but not keep breaking machines until we do. Cc: Ingo Molnar <mingo@elte.hu> Cc: Olaf Kirch <olaf.kirch@oracle.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [NET]: Fix races in net_rx_action vs netpoll.Olaf Kirch2007-07-111-0/+8
| | | | | | | | Keep netpoll/poll_napi from messing with the poll_list. Only net_rx_action is allowed to manipulate the list. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Fix a leak-n-bug in netpoll_cleanup()Satyam Sharma2007-07-101-1/+1
| | | | | | | | | | | | | | | | | | | 93ec2c723e3f8a216dde2899aeb85c648672bc6b applied excessive duct tape to the netpoll beast's netpoll_cleanup(), thus substituting one leak with another, and opening up a little buglet :-) net_device->npinfo (netpoll_info) is a shared and refcounted object and cannot simply be set NULL the first time netpoll_cleanup() is called. Otherwise, further netpoll_cleanup()'s see np->dev->npinfo == NULL and become no-ops, thus leaking. And it's a bug too: the first call to netpoll_cleanup() would thus (annoyingly) "disable" other (still alive) netpolls too. Maybe nobody noticed this because netconsole (only user of netpoll) never supported multiple netpoll objects earlier. This is a trivial and obvious one-line fixlet. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CORE] Stack changes to add multiqueue hardware support APIPeter P Waskiewicz Jr2007-07-101-3/+5
| | | | | | | | | | | | | Add the multiqueue hardware device support API to the core network stack. Allow drivers to allocate multiple queues and manage them at the netdev level if they choose to do so. Added a new field to sk_buff, namely queue_mapping, for drivers to know which tx_ring to select based on OS classification of the flow. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: Fixups for 'fix soft lockup when removing module'Jarek Poplawski2007-07-051-4/+2
| | | | | | | | | | | | | | | | | | | | | | | >From my recent patch: > > #1 > > Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work() > > required a work function should always (unconditionally) rearm with > > delay > 0 - otherwise it would endlessly loop. This patch replaces > > this function with cancel_delayed_work(). Later kernel versions don't > > require this, so here it's only for uniformity. But Oleg Nesterov <oleg@tv-sign.ru> found: > But 2.6.22 doesn't need this change, why it was merged? > > In fact, I suspect this change adds a race, ... His description was right (thanks), so this patch reverts #1. Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL] netconsole: fix soft lockup when removing moduleJarek Poplawski2007-06-281-2/+9
| | | | | | | | | | | | | | | | | | | | #1 Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work() required a work function should always (unconditionally) rearm with delay > 0 - otherwise it would endlessly loop. This patch replaces this function with cancel_delayed_work(). Later kernel versions don't require this, so here it's only for uniformity. #2 After deleting a timer in cancel_[rearming_]delayed_work() there could stay a last skb queued in npinfo->txq causing a memory leak after kfree(npinfo). Initial patch & testing by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETPOLL]: tx lock deadlock fixStephen Hemminger2007-06-271-9/+10
| | | | | | | | | | | | | | If sky2 device poll routine is called from netpoll_send_skb, it would deadlock. The netpoll_send_skb held the netif_tx_lock, and the poll routine could acquire it to clean up skb's. Other drivers might use same locking model. The driver is correct, netpoll should not introduce more locking problems than it causes already. So change the code to drop lock before calling poll handler. Signed-off-by: Stephen Hemminger <shemminger@linux.foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* header cleaning: don't include smp_lock.h when not usedRandy Dunlap2007-05-081-1/+0
| | | | | | | | | | | | Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARYHerbert Xu2007-04-251-1/+1
| | | | | | | | | When a transmitted packet is looped back directly, CHECKSUM_PARTIAL maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should treat it as such in the stack. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [SK_BUFF]: Introduce skb_copy_to_linear_data{_offset}Arnaldo Carvalho de Melo2007-04-251-1/+1
| | | | | | | To clearly state the intent of copying to linear sk_buffs, _offset being a overly long variant but interesting for the sake of saving some bytes. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>