summaryrefslogtreecommitdiffstats
path: root/drivers/net/virtio_net.c
Commit message (Collapse)AuthorAgeFilesLines
* virtio: add virtio IDs fileFernando Luis Vazquez Cao2009-09-231-0/+1
| | | | | | | | Virtio IDs are spread all over the tree which makes assigning new IDs bothersome. Putting them together should make the process less error-prone. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio: make add_buf return capacity remainingRusty Russell2009-09-231-7/+7
| | | | | | | | | | This API change means that virtio_net can tell how much capacity remains for buffers. It's necessarily fuzzy, since VIRTIO_RING_F_INDIRECT_DESC means we can fit any number of descriptors in one, *if* we can kmalloc. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dinesh Subhraveti <dineshs@us.ibm.com>
* netdev: drivers should make ethtool_ops constStephen Hemminger2009-09-021-1/+1
| | | | | | | No need to put ethtool_ops in data, they should be const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-09-021-15/+46
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/yellowfin.c
| * virtio: net refill on out-of-memoryRusty Russell2009-08-261-15/+46
| | | | | | | | | | | | | | | | | | If we run out of memory, use keventd to fill the buffer. There's a report of this happening: "Page allocation failures in guest", Message-ID: <20090713115158.0a4892b0@mjolnir.ossman.eu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netdev: convert pseudo drivers to netdev_tx_tStephen Hemminger2009-09-011-1/+1
| | | | | | | | | | | | | | These are all drivers that don't touch real hardware. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio-net: Allow UFO feature to be set and advertised.Sridhar Samudrala2009-07-171-1/+2
|/ | | | | | | | - Allow setting UFO on virtio-net and advertise to host. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: group address list and its countJiri Pirko2009-06-181-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | This patch is inspired by patch recently posted by Johannes Berg. Basically what my patch does is to group list and a count of addresses into newly introduced structure netdev_hw_addr_list. This brings us two benefits: 1) struct net_device becames a bit nicer. 2) in the future there will be a possibility to operate with lists independently on netdevices (with exporting right functions). I wanted to introduce this patch before I'll post a multicast lists conversion. Signed-off-by: Jiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c | 4 +- drivers/net/e1000/e1000_main.c | 4 +- drivers/net/ixgbe/ixgbe_main.c | 6 +- drivers/net/mv643xx_eth.c | 2 +- drivers/net/niu.c | 4 +- drivers/net/virtio_net.c | 10 ++-- drivers/s390/net/qeth_l2_main.c | 2 +- include/linux/netdevice.h | 17 +++-- net/core/dev.c | 130 ++++++++++++++++++-------------------- 9 files changed, 89 insertions(+), 90 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-06-151-27/+18
|\ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/scsi/fcoe/fcoe.c net/core/drop_monitor.c net/core/net-traces.c
| * virtio: find_vqs/del_vqs virtio operationsMichael S. Tsirkin2009-06-121-27/+18
| | | | | | | | | | | | | | | | | | This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations, and updates all drivers. This is needed for MSI support, because MSI needs to know the total number of vectors upfront. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)
| * virtio: add names to virtqueue struct, mapping from devices to queues.Rusty Russell2009-06-121-3/+3
| | | | | | | | | | | | | | | | | | Add a linked list of all virtqueues for a virtio device: this helps for debugging and is also needed for upcoming interface change. Also, add a "name" field for clearer debug messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | virtio_net: Fix IP alignment on non-mergeable RX pathHerbert Xu2009-06-111-1/+2
| | | | | | | | | | | | | | | | | | | | We need to enforce the IP alignment on the non-mergeable RX path just like the other RX path. Not doing so results in misaligned IP headers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: Set correct gso->hdr_lenHerbert Xu2009-06-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Through a bug in the tun driver, I noticed that virtio_net is producing bogus hdr_len values. In particular, it only includes the IP header in the linear area, and excludes the entire TCP header. This causes the TCP header to be copied twice for each packet. (The bug omitted the second copy :) This patch corrects this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: convert unicast addr listJiri Pirko2009-05-291-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts unicast address list to standard list_head using previously introduced struct netdev_hw_addr. It also relaxes the locking. Original spinlock (still used for multicast addresses) is not needed and is no longer used for a protection of this list. All reading and writing takes place under rtnl (with no changes). I also removed a possibility to specify the length of the address while adding or deleting unicast address. It's always dev->addr_len. The convertion touched especially e1000 and ixgbe codes when the change is not so trivial. Signed-off-by: Jiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c | 13 +-- drivers/net/e1000/e1000_main.c | 24 +++-- drivers/net/ixgbe/ixgbe_common.c | 14 ++-- drivers/net/ixgbe/ixgbe_common.h | 4 +- drivers/net/ixgbe/ixgbe_main.c | 6 +- drivers/net/ixgbe/ixgbe_type.h | 4 +- drivers/net/macvlan.c | 11 +- drivers/net/mv643xx_eth.c | 11 +- drivers/net/niu.c | 7 +- drivers/net/virtio_net.c | 7 +- drivers/s390/net/qeth_l2_main.c | 6 +- drivers/scsi/fcoe/fcoe.c | 16 ++-- include/linux/netdevice.h | 18 ++-- net/8021q/vlan.c | 4 +- net/8021q/vlan_dev.c | 10 +- net/core/dev.c | 195 +++++++++++++++++++++++++++----------- net/dsa/slave.c | 10 +- net/packet/af_packet.c | 4 +- 18 files changed, 227 insertions(+), 137 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-05-031-10/+14
|\| | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * virtio_net: Fix function name typoAlex Williamson2009-05-011-4/+4
| | | | | | | | | | Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio_net: Cleanup command queue scatterlist usageAlex Williamson2009-05-011-6/+10
| | | | | | | | | | | | | | | | | | | | We were avoiding calling sg_init* on scatterlists passed into virtnet_send_command to prevent extraneous end markers. This caused build warnings for uninitialized variables. Cleanup the code to create proper scatterlists. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | drivers/net: replace BUG() with BUG_ON() if possibleAlexander Beregalov2009-04-131-8/+4
|/ | | | | Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio_net: Set the mac config only when VIRITO_NET_F_MACAlex Williamson2009-04-041-6/+4
| | | | | | | | | | | VIRTIO_NET_F_MAC indicates the presence of the mac field in config space, not the validity of the value it contains. Allow the mac to be changed at runtime, but only push the change into config space with the VIRTIO_NET_F_MAC feature present. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-03-201-0/+1
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/virtio_net.c
| * virtio_net: Make virtio_net support carrier detectionPantelis Koukousoulas2009-03-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Make NetworkManager work with virtio_net For now the semantics are simple: There is always carrier. This allows a seamless experience with e.g., qemu/kvm where NetworkManager just configures and sets up everything automagically. If/when a generally agreed-upon way to control carrier on/off in the emulator/hypervisor level emerges, it will be trivial to extend the driver to support that too, but for now even this 2-liner makes user experience that much better. Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: Allow setting the MAC address of the NICAlex Williamson2009-02-041-2/+21
| | | | | | | | | | | | | | | | | | Many physical NICs let the OS re-program the "hardware" MAC address. Virtual NICs should allow this too. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: Add support for VLAN filtering in the hypervisorAlex Williamson2009-02-041-1/+30
| | | | | | | | | | | | | | | | | | | | | | | | VLAN filtering allows the hypervisor to drop packets from VLANs that we're not a part of, further reducing the number of extraneous packets recieved. This makes use of the VLAN virtqueue command class. The CTRL_VLAN feature bit tells us whether the backend supports VLAN filtering. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: Add a MAC filter tableAlex Williamson2009-02-041-8/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | Make use of the MAC control virtqueue class to support a MAC filter table. The filter table is managed by the hypervisor. We consider the table to be available if the CTRL_RX feature bit is set. We leave it to the hypervisor to manage the table and enable promiscuous or all-multi mode as necessary depending on the resources available to it. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: Add a set_rx_mode interfaceAlex Williamson2009-02-041-1/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | Make use of the RX_MODE control virtqueue class to enable the set_rx_mode netdev interface. This allows us to selectively enable/disable promiscuous and allmulti mode so we don't see packets we don't want. For now, we automatically enable these as needed if additional unicast or multicast addresses are requested. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: Add a virtqueue for outbound control commandsAlex Williamson2009-02-041-3/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will be used for RX mode, MAC filter table, VLAN filtering, etc... The control transaction consists of one or more "out" sg entries and one or more "in" sg entries. The first out entry contains a header defining the class and command. Additional out entries may provide data for the command. The last in entry provides a status response back from the command. Virtqueues typically run asynchronous, running a callback function when there's data in the channel. We can't readily make use of this in the command paths where we need to use this. Instead, we kick the virtqueue and spin. The kick causes an I/O write, triggering an immediate trap into the hypervisor. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-01-301-3/+3
|\| | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000/e1000_main.c
| * virtio_net: use correct accessors for scatterlistsIra W. Snyder2009-01-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Without this fix, virtio_net makes incorrect usage of scatterlists. It sets the end of the scatterlist chain after the first element, despite the fact that more entries come after it. If you try to run dma_map_sg() on one of the scatterlists given to you by add_buf(), you will get a null pointer oops. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-01-261-1/+2
|\| | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * virtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANsAlex Williamson2009-01-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | 802.1Q expanded the maximum ethernet frame size by 4 bytes for the VLAN tag. We're not taking this into account in virtio_net, which means the buffers we provide to the backend in the virtqueue RX ring aren't big enough to hold a full MTU VLAN packet. For QEMU/KVM, this results in the backend exiting with a packet truncation error. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: add link status handlingMark McLoughlin2009-01-211-1/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow the host to inform us that the link is down by adding a VIRTIO_NET_F_STATUS which indicates that device status is available in virtio_net config. This is currently useful for simulating link down conditions (e.g. using proposed qemu 'set_link' monitor command) but would also be needed if we were to support device assignment via virtio. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking) Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Remove redundant NAPI functionsBen Hutchings2009-01-211-6/+6
|/ | | | | | | | | | | Following the removal of the unused struct net_device * parameter from the NAPI functions named *netif_rx_* in commit 908a7a1, they are exactly equivalent to the corresponding *napi_* functions and are therefore redundant. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio: convert to net_device_opsStephen Hemminger2009-01-061-7/+13
| | | | | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Remove unused netdev arg from some NAPI interfaces.Neil Horman2008-12-221-6/+6
| | | | | | | | | | When the napi api was changed to separate its 1:1 binding to the net_device struct, the netif_rx_[prep|schedule|complete] api failed to remove the now vestigual net_device structure parameter. This patch cleans up that api by properly removing it.. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio_net: large tx MTU supportMark McLoughlin2008-12-021-0/+12
| | | | | | | | We don't really have a max tx packet size limit, so allow configuring the device with up to 64k tx MTU. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio_net: VIRTIO_NET_F_MSG_RXBUF (imprive rcv buffer allocation)Mark McLoughlin2008-11-161-20/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If segmentation offload is enabled by the host, we currently allocate maximum sized packet buffers and pass them to the host. This uses up 20 ring entries, allowing us to supply only 20 packet buffers to the host with a 256 entry ring. This is a huge overhead when receiving small packets, and is most keenly felt when receiving MTU sized packets from off-host. The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support using receive buffers which are smaller than the maximum packet size. In order to transfer large packets to the guest, the host merges together multiple receive buffers to form a larger logical buffer. The number of merged buffers is returned to the guest via a field in the virtio_net_hdr. Make use of this support by supplying single page receive buffers to the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of the payload to the skb's linear data buffer and adjust the fragment offset to point to the remaining data. This ensures proper alignment and allows us to not use any paged data for small packets. If the payload occupies multiple pages, we simply append those pages as fragments and free the associated skbs. This scheme allows us to be efficient in our use of ring entries while still supporting large packets. Benchmarking using netperf from an external machine to a guest over a 10Gb/s network shows a 100% improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark with GSO disabled on the host side, throughput was seen to increase from 700Mb/s to 1.7Gb/s. Based on a patch from Herbert Xu. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio_net: hook up the set-tso ethtool opMark McLoughlin2008-11-161-0/+1
| | | | | | | | | | | | Seems like an oversight that we have set-tx-csum and set-sg hooked up, but not set-tso. Also leads to the strange situation that if you e.g. disable tx-csum, then tso doesn't get disabled. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio_net: Recycle some more rx buffer pagesMark McLoughlin2008-11-161-9/+13
| | | | | | | | | | | | | | Each time we re-fill the recv queue with buffers, we allocate one too many skbs and free it again when adding fails. We should recycle the pages allocated in this case. A previous version of this patch made trim_pages() trim trailing unused pages from skbs with some paged data, but this actually caused a barely measurable slowdown. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>
* netdevice: safe convert to netdev_priv() #part-3Wang Chen2008-11-121-1/+2
| | | | | | | | | | | | | | | | | | We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously netdev_priv() is more flexible than netdev->priv. But we cann't kill netdev->priv, because so many drivers reference to it directly. This patch is a safe convert for netdev->priv to netdev_priv(netdev). Since all of the netdev->priv is only for read. But it is too big to be sent in one mail. I split it to 4 parts and make every part smaller than 100,000 bytes, which is max size allowed by vger. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: convert print_mac to %pMJohannes Berg2008-10-271-3/+1
| | | | | | | | | | | | This converts pretty much everything to print_mac. There were a few things that had conflicts which I have just dropped for now, no harm done. I've built an allyesconfig with this and looked at the files that weren't built very carefully, but it's a huge patch. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* virtio: Recycle unused recv buffer pages for large skbs in net driverRusty Russell2008-07-251-1/+35
| | | | | | | | | | | | | | | | | | | | | | If we hack the virtio_net driver to always allocate full-sized (64k+) skbuffs, the driver slows down (lguest numbers): Time to receive 1GB (small buffers): 10.85 seconds Time to receive 1GB (64k+ buffers): 24.75 seconds Of course, large buffers use up more space in the ring, so we increase that from 128 to 2048: Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds If we recycle pages rather than using alloc_page/free_page: Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds This demonstrates that with efficient allocation, we don't need to have a separate "small buffer" queue. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio net: Allow receiving SG packetsHerbert Xu2008-07-251-5/+39
| | | | | | | | | | Finally this patch lets virtio_net receive GSO packets in addition to sending them. This can definitely be optimised for the non-GSO case. For comparison the Xen approach stores one page in each skb and uses subsequent skb's pages to construct an SG skb instead of preallocating the maximum amount of pages per skb. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added feature bits)
* virtio net: Add ethtool ops for SG/GSOHerbert Xu2008-07-251-0/+18
| | | | | | | | This patch adds some basic ethtool operations to virtio_net so I could test SG without GSO (which was really useful because TSO turned out to be buggy :) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (remove MTU setting)
* virtio: fix virtio_net xmit of freed skb bugMark McLoughlin2008-07-251-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Mon, 2008-05-26 at 17:42 +1000, Rusty Russell wrote: > If we fail to transmit a packet, we assume the queue is full and put > the skb into last_xmit_skb. However, if more space frees up before we > xmit it, we loop, and the result can be transmitting the same skb twice. > > Fix is simple: set skb to NULL if we've used it in some way, and check > before sending. ... > diff -r 564237b31993 drivers/net/virtio_net.c > --- a/drivers/net/virtio_net.c Mon May 19 12:22:00 2008 +1000 > +++ b/drivers/net/virtio_net.c Mon May 19 12:24:58 2008 +1000 > @@ -287,21 +287,25 @@ again: > free_old_xmit_skbs(vi); > > /* If we has a buffer left over from last time, send it now. */ > - if (vi->last_xmit_skb) { > + if (unlikely(vi->last_xmit_skb)) { > if (xmit_skb(vi, vi->last_xmit_skb) != 0) { > /* Drop this skb: we only queue one. */ > vi->dev->stats.tx_dropped++; > kfree_skb(skb); > + skb = NULL; > goto stop_queue; > } > vi->last_xmit_skb = NULL; With this, may drop an skb and then later in the function discover that we could have sent it after all. Poor wee skb :) How about the incremental patch below? Cheers, Mark. Subject: [PATCH] virtio_net: Delay dropping tx skbs Currently we drop the skb in start_xmit() if we have a queued buffer and fail to transmit it. However, if we delay dropping it until we've stopped the queue and enabled the tx notification callback, then there is a chance space might become available for it. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio_net: Set VIRTIO_NET_F_GUEST_CSUM featureMark McLoughlin2008-07-111-1/+2
| | | | | | | | | We can handle receiving partial csums, so set the appropriate feature bit. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* virtio: use callback on empty in virtio_netRusty Russell2008-06-101-5/+17
| | | | | | | | | | virtio_net uses a timer to free old transmitted packets, rather than leaving callbacks enabled all the time. If the host promises to always notify us when the transmit ring is empty, we can free packets at that point and avoid the timer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* virtio: virtio_net free transmit skbs in a timerMark McLoughlin2008-06-101-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | virtio_net currently only frees old transmit skbs just before queueing new ones. If the queue is full, it then enables interrupts and waits for notification that more work has been performed. However, a side-effect of this scheme is that there are always xmit skbs left dangling when no new packets are sent, against the Documentation/networking/driver.txt guideline: "... it is not allowed for your TX mitigation scheme to let TX packets "hang out" in the TX ring unreclaimed forever if no new TX packets are sent." Add a timer to ensure that any time we queue new TX skbs, we will shortly free them again. This fixes an easily reproduced hang at shutdown where iptables attempts to unload nf_conntrack and nf_conntrack waits for an skb it is tracking to be freed, but virtio_net never frees it. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* virtio_net: Fix skb->csum_start computationMark McLoughlin2008-06-101-3/+5
| | | | | | | | | | | | | | | | | | | | | | hdr->csum_start is the offset from the start of the ethernet header to the transport layer checksum field. skb->csum_start is the offset from skb->head. skb_partial_csum_set() assumes that skb->data points to the ethernet header - i.e. it computes skb->csum_start by adding the headroom to hdr->csum_start. Since eth_type_trans() skb_pull()s the ethernet header, skb_partial_csum_set() should be called before eth_type_trans(). (Without this patch, GSO packets from a guest to the world outside the host are corrupted). Signed-off-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* virtio: fix delayed xmit of packet and freeing of old packets.Rusty Russell2008-05-301-0/+22
| | | | | | | | | | Because we cache the last failed-to-xmit packet, if there are no packets queued behind that one we may never send it (reproduced here as TCP stalls, "cured" by an outgoing ping). Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* virtio: fix virtio_net xmit of freed skb bugRusty Russell2008-05-301-5/+9
| | | | | | | | | | | | If we fail to transmit a packet, we assume the queue is full and put the skb into last_xmit_skb. However, if more space frees up before we xmit it, we loop, and the result can be transmitting the same skb twice. Fix is simple: set skb to NULL if we've used it in some way, and check before sending. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>