summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* net: addr_list: add exclusive dev_uc_add and dev_mc_addJohn Fastabend2012-04-152-16/+83
| | | | | | | | | | | | | | | | This adds a dev_uc_add_excl() and dev_mc_add_excl() calls similar to the original dev_{uc|mc}_add() except it sets the global bit and returns -EEXIST for duplicat entires. This is useful for drivers that support SR-IOV, macvlan devices and any other devices that need to manage the unicast and multicast lists. v2: fix typo UNICAST should be MULTICAST in dev_mc_add_excl() CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: add generic PF_BRIDGE:RTM_ FDB hooksJohn Fastabend2012-04-158-112/+228
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds two new flags NTF_MASTER and NTF_SELF that can now be used to specify where PF_BRIDGE netlink commands should be sent. NTF_MASTER sends the commands to the 'dev->master' device for parsing. Typically this will be the linux net/bridge, or open-vswitch devices. Also without any flags set the command will be handled by the master device as well so that current user space tools continue to work as expected. The NTF_SELF flag will push the PF_BRIDGE commands to the device. In the basic example below the commands are then parsed and programmed in the embedded bridge. Note if both NTF_SELF and NTF_MASTER bits are set then the command will be sent to both 'dev->master' and 'dev' this allows user space to easily keep the embedded bridge and software bridge in sync. There is a slight complication in the case with both flags set when an error occurs. To resolve this the rtnl handler clears the NTF_ flag in the netlink ack to indicate which sets completed successfully. The add/del handlers will abort as soon as any error occurs. To support this new net device ops were added to call into the device and the existing bridging code was refactored to use these. There should be no required changes in user space to support the current bridge behavior. A basic setup with a SR-IOV enabled NIC looks like this, veth0 veth2 | | ------------ | bridge0 | <---- software bridging ------------ / / ethx.y ethx VF PF \ \ <---- propagate FDB entries to HW \ \ -------------------- | Embedded Bridge | <---- hardware offloaded switching -------------------- In this case the embedded bridge must be managed to allow 'veth0' to communicate with 'ethx.y' correctly. At present drivers managing the embedded bridge either send frames onto the network which then get dropped by the switch OR the embedded bridge will flood these frames. With this patch we have a mechanism to manage the embedded bridge correctly from user space. This example is specific to SR-IOV but replacing the VF with another PF or dropping this into the DSA framework generates similar management issues. Examples session using the 'br'[1] tool to add, dump and then delete a mac address with a new "embedded" option and enabled ixgbe driver: # br fdb add 22:35:19:ac:60:59 dev eth3 # br fdb port mac addr flags veth0 22:35:19:ac:60:58 static veth0 9a:5f:81:f7:f6:ec local eth3 00:1b:21:55:23:59 local eth3 22:35:19:ac:60:59 static veth0 22:35:19:ac:60:57 static #br fdb add 22:35:19:ac:60:59 embedded dev eth3 #br fdb port mac addr flags veth0 22:35:19:ac:60:58 static veth0 9a:5f:81:f7:f6:ec local eth3 00:1b:21:55:23:59 local eth3 22:35:19:ac:60:59 static veth0 22:35:19:ac:60:57 static eth3 22:35:19:ac:60:59 local embedded #br fdb del 22:35:19:ac:60:59 embedded dev eth3 I added a couple lines to 'br' to set the flags correctly is all. It is my opinion that the merit of this patch is now embedded and SW bridges can both be modeled correctly in user space using very nearly the same message passing. [1] 'br' tool was published as an RFC here and will be renamed 'bridge' http://patchwork.ozlabs.org/patch/117664/ Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for valuable feedback, suggestions, and review. v2: fixed api descriptions and error case with both NTF_SELF and NTF_MASTER set plus updated patch description. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: do not drop rx/tx interrupts before they are scheduledTony Zelenoff2012-04-151-4/+10
| | | | | | | | | | | | | To prevent interrupts lost they should be dropped only if they are scheduled via napi interfaces. In other case, there is exists situation when napi handler process TX interrupt, stay in RX processing and in that moment any other interrupt received. Then before this patch TX bit in ISR will be cleaned, napi schedule will not occur in case of currently processing event and TX interrupt definitely will be lost. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: do not process interrupts in cycle in handlerTony Zelenoff2012-04-152-57/+52
| | | | | | | | | As the rx/tx handled inside napi handler, the cycle is not needed now, because only the rx/tx need such kind of processing. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: enable errors and link ints when rx/tx scheduledTony Zelenoff2012-04-152-13/+29
| | | | | Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: add value to check ability of reenabling IRQsTony Zelenoff2012-04-152-0/+8
| | | | | | | | | | Unfortunately it is not clear from code is usage of IMR register possible or not. So, to prevent possible side-effects of reading this register i prefer store interrupts enable flag separately. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: make function to set imr of cardTony Zelenoff2012-04-151-4/+9
| | | | | | | | This function should be used later to set/remove proper bits in imr to disable only rx ints. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: use defined functions to disable irqTony Zelenoff2012-04-151-3/+3
| | | | | | | | Looks like direct writes to IMR register is not good idea, because there are exist functions to make this work. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: add napi process of tx interruptsTony Zelenoff2012-04-151-11/+16
| | | | | | | | | Make the tx ints processing same as rx ones via napi. The idea got from e1000. The interrupt disabling is still not fine grained. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: make driver napi compatibleTony Zelenoff2012-04-152-6/+41
| | | | | | | | | This is first step, here there is no fine interrupt disabling which cause TX/ERR interrupts stalling when RX scheduled ints processed. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1: handle rx in separate conditionTony Zelenoff2012-04-151-9/+10
| | | | | | | | Remove rx from unlikely optimization in case of rx is very likely thing for network card. This also reduce code a bit. Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Add multicast_querier toggle and disable queries by defaultHerbert Xu2012-04-153-0/+40
| | | | | | | | | | | | | | | | | | | Sending general queries was implemented as an optimisation to speed up convergence on start-up. In order to prevent interference with multicast routers a zero source address has to be used. Unfortunately these packets appear to cause some multicast-aware switches to misbehave, e.g., by disrupting multicast packets to us. Since the multicast snooping feature still functions without sending our own queries, this patch will change the default to not send queries. For those that need queries in order to speed up convergence on start-up, a toggle is provided to restore the previous behaviour. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Restart queries when last querier expiresHerbert Xu2012-04-151-1/+18
| | | | | | | | | | | | | As it stands when we discover that a real querier (one that queries with a non-zero source address) we stop querying. However, even after said querier has fallen off the edge of the earth, we will never restart querying (unless the bridge itself is restarted). This patch fixes this by kicking our own querier into gear when the timer for other queriers expire. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Add br_multicast_start_querierHerbert Xu2012-04-151-9/+16
| | | | | | | | | This patch adds the helper br_multicast_start_querier so that the code which starts the queriers in br_multicast_toggle can be reused elsewhere. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: cleanup unsigned to unsigned intEric Dumazet2012-04-15167-455/+461
| | | | | | | Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: fix checkpatch errorsDaniel Baluta2012-04-1516-49/+49
| | | | | | | | | Fix checkpatch errors of the following type: * ERROR: "foo * bar" should be "foo *bar" * ERROR: "(foo*)" should be "(foo *)" Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio-net: send gratuitous packets when neededJason Wang2012-04-152-5/+73
| | | | | | | | | | | | | | | | | | | As hypervior does not have the knowledge of guest network configuration, it's better to ask guest to send gratuitous packets when needed. This patch implements VIRTIO_NET_F_GUEST_ANNOUNCE feature: hypervisor would notice the guest when it thinks it's time for guest to announce the link presnece. Guest tests VIRTIO_NET_S_ANNOUNCE bit during config change interrupt and woule send gratuitous packets through netif_notify_peers() and ack the notification through ctrl vq. We need to make sure the atomicy of read and ack in guest otherwise we may ack more times than being notified. This is done through handling the whole config change interrupt in an non-reentrant workqueue. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* isdn/hysdn: Convert to kstrtoul_from_userPeter Hüwe2012-04-151-9/+1
| | | | | | | | | This patch replaces the code for getting an number from a userspace buffer by a simple call to kstroul_from_user. This makes it easier to read and less error prone. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Remove unused argument to addrconf_dad_start().David S. Miller2012-04-141-6/+6
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Fix spelling typo in netMasanari Iida2012-04-1411-12/+12
| | | | | | | Correct spelling typo within drivers/net. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Remove redundant code entering quickack modeVijay Subramanian2012-04-141-2/+0
| | | | | | | | | tcp_enter_quickack_mode() already calls tcp_incr_quickack() and sets icsk->icsk_ack.ato to TCP_ATO_MIN. This patch removes the duplication. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: bind() use stronger condition for bind_conflictAlex Copot2012-04-144-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We must try harder to get unique (addr, port) pairs when doing port autoselection for sockets with SO_REUSEADDR option set. We achieve this by adding a relaxation parameter to inet_csk_bind_conflict. When 'relax' parameter is off we return a conflict whenever the current searched pair (addr, port) is not unique. This tries to address the problems reported in patch: 8d238b25b1ec22a73b1c2206f111df2faaff8285 Revert "tcp: bind() fix when many ports are bound" Tests where ran for creating and binding(0) many sockets on 100 IPs. The results are, on average: * 60000 sockets, 600 ports / IP: * 0.210 s, 620 (IP, port) duplicates without patch * 0.219 s, no duplicates with patch * 100000 sockets, 1000 ports / IP: * 0.371 s, 1720 duplicates without patch * 0.373 s, no duplicates with patch * 200000 sockets, 2000 ports / IP: * 0.766 s, 6900 duplicates without patch * 0.768 s, no duplicates with patch * 500000 sockets, 5000 ports / IP: * 2.227 s, 41500 duplicates without patch * 2.284 s, no duplicates with patch Signed-off-by: Alex Copot <alex.mihai.c@gmail.com> Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: makes syn_ack_timeout mandatoryEric Dumazet2012-04-144-2/+10
| | | | | | | | | | | | There are two struct request_sock_ops providers, tcp and dccp. inet_csk_reqsk_queue_prune() can avoid testing syn_ack_timeout being NULL if we make it non NULL like syn_ack_timeout Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: RFC6298 supersedes RFC2988bisEric Dumazet2012-04-143-4/+4
| | | | | | | | | Updates some comments to track RFC6298 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/wan: use module_pci_driverAxel Lin2012-04-142-24/+2
| | | | | | | | | | This patch converts the drivers in drivers/net/wan/* to use module_pci_driver() macro which makes the code smaller and a bit simpler. Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/tokenring: use module_pci_driverAxel Lin2012-04-144-48/+4
| | | | | | | | | | This patch converts the drivers in drivers/net/tokenring/* to use module_pci_driver() macro which makes the code smaller and a bit simpler. Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2012-04-1412-58/+62
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
| * ixgbe: add I2C clock stretchingDon Skidmore2012-04-142-6/+15
| | | | | | | | | | | | | | | | | | | | This patch adds support for I2C clock stretching which is required per SFF-8636. Customers with passive DA cables implement clock stretching would fail without this patch. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * igb: Update version to 3.4.7.Carolyn Wyborny2012-04-141-2/+2
| | | | | | | | | | | | Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * e1000e: cleanup boolean logicBruce Allan2012-04-146-35/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace occurrences of 'if (<bool expr> == <1|0>)' with 'if ([!]<bool expr>)' Replace occurrences of '<bool var> = (<non-bool expr>) ? true : false' with '<bool var> = <non-bool expr>'. Replace occurrence of '<bool var> = <non-bool expr>' with '<bool var> = !!<non-bool expr>' While the latter replacement is not really necessary, it is done here for consistency and clarity. No functional changes. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * e1000e: cleanup remaining strings split across multiple linesBruce Allan2012-04-142-15/+12
| | | | | | | | | | | | | | | | | | | | Now that split strings generate checkpatch warnings (per Chapter 2 of Documentation/CodingStyle to make it easier to grep the code for the string) cleanup the remaining instances of them in the driver. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * e100: enable transmit time stamping.Richard Cochran2012-04-141-0/+1
| | | | | | | | | | | | | | | | | | This patch enables software (and phy device) transmit time stamping. Tested on an old PIII laptop with built in NIC. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * e100: Support the get_ts_info ethtool method.Richard Cochran2012-04-141-0/+1
| | | | | | | | | | | | Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | tunnel: implement 64 bits statisticsstephen hemminger2012-04-144-55/+111
|/ | | | | | | | Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels to use 64 bit statistics. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Fixup get_tx_queue() op second arg type.David S. Miller2012-04-131-1/+1
| | | | | | I missed this when fixing up the warning in the previous commit. Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: ops->get_tx_queue() cannot take a const 'tb'.David S. Miller2012-04-131-1/+1
| | | | | | | | net/core/rtnetlink.c: In function ‘rtnl_create_link’: net/core/rtnetlink.c:1645:3: warning: passing argument 2 of ‘ops->get_tx_queues’ from incompatible pointer type [enabled by default] net/core/rtnetlink.c:1645:3: note: expected ‘const struct nlattr **’ but argument is of type ‘struct nlattr **’ Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: change maintainer addressRémi Denis-Courmont2012-04-131-1/+1
| | | | | | | nokia.com MX does not cope well with kernel.org. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: missing headers (sparse)Rémi Denis-Courmont2012-04-131-0/+4
| | | | | Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: phonet_net_id can be static (sparse)Rémi Denis-Courmont2012-04-131-1/+1
| | | | | Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* neighbour: Make neigh_table_init_no_netlink() static.Hiroaki SHIMODA2012-04-132-3/+1
| | | | | | | neigh_table_init_no_netlink() is only used in net/core/neighbour.c file. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* udp: intoduce udp_encap_needed static_keyEric Dumazet2012-04-133-1/+13
| | | | | | | | | | | | | | | | | | | | Most machines dont use UDP encapsulation (L2TP) Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a test if L2TP never setup the encap_rcv on a socket. Idea of this patch came after Simon Horman proposal to add a hook on TCP as well. If static_key is not yet enabled, the fast path does a single JMP . When static_key is enabled, JMP destination is patched to reach the real encap_type/encap_rcv logic, possibly adding cache misses. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Simon Horman <horms@verge.net.au> Cc: dev@openvswitch.org Signed-off-by: David S. Miller <davem@davemloft.net>
* net: WIZnet drivers: fix possible NULL dereferenceMike Sinkovsky2012-04-132-2/+2
| | | | | | | | | This fixes possible null dereference in probe() function: when both .mac_addr and .link_gpio are unknown, dev.platform_data may be NULL Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Sinkovsky <msink@permonline.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: fix commentsstephen hemminger2012-04-131-2/+4
| | | | | | | Fix spelling and references in rtnetlink. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink & bonding: change args got get_tx_queuesstephen hemminger2012-04-133-12/+8
| | | | | | | | | | | Change get_tx_queues, drop unsused arg/return value real_tx_queues, and use return by value (with error) rather than call by reference. Probably bonding should just change to LLTX and the whole get_tx_queues API could disappear! Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet: replace open-coded ARRAY_SIZE with macroJim Cromie2012-04-131-1/+1
| | | | | Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: replace open-coded ARRAY_SIZE with macroJim Cromie2012-04-131-1/+1
| | | | | Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* broadcom: replace open-coded ARRAY_SIZE with macroJim Cromie2012-04-131-2/+1
| | | | | Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Remove redundant spi driver bus initializationLars-Peter Clausen2012-04-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | In ancient times it was necessary to manually initialize the bus field of an spi_driver to spi_bus_type. These days this is done in spi_driver_register() so we can drop the manual assignment. The patch was generated using the following coccinelle semantic patch: // <smpl> @@ identifier _driver; @@ struct spi_driver _driver = { .driver = { - .bus = &spi_bus_type, }, }; // </smpl> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Gabor Juhos <juhosg@openwrt.org> Cc: Frederic Lambert <frdrc66@gmail.com> Cc: netdev@vger.kernel.org Acked-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/garp: fix GID rbtree orderingDavid Ward2012-04-131-4/+4
| | | | | | | | | | | The comparison operators were backwards in both garp_attr_lookup and garp_attr_create, so the entire GID rbtree was in reverse order. (There was no practical side effect to this though, except that PDUs were sent with attributes listed in reverse order, which is still valid by the protocol. This change is only for clarity.) Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
* pppoatm: Fix excessive queue bloatDavid Woodhouse2012-04-131-10/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We discovered that PPPoATM has an excessively deep transmit queue. A queue the size of the default socket send buffer (wmem_default) is maintained between the PPP generic core and the ATM device. Fix it to queue a maximum of *two* packets. The one the ATM device is currently working on, and one more for the ATM driver to process immediately in its TX done interrupt handler. The PPP core is designed to feed packets to the channel with minimal latency, so that really ought to be enough to keep the ATM device busy. While we're at it, fix the fact that we were triggering the wakeup tasklet on *every* pppoatm_pop() call. The comment saying "this is inefficient, but doing it right is too hard" turns out to be overly pessimistic... I think :) On machines like the Traverse Geos, with a slow Geode CPU and two high-speed ADSL2+ interfaces, there were reports of extremely high CPU usage which could partly be attributed to the extra wakeups. (The wakeup handling could actually be made a whole lot easier if we stop checking sk->sk_sndbuf altogether. Given that we now only queue *two* packets ever, one wonders what the point is. As it is, you could already deadlock the thing by setting the sk_sndbuf to a value lower than the MTU of the device, and it'd just block for ever.) Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>