linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	xfrm: remove useless hash_resize_mutex locks	Ying Xue	2014-08-29	1	-10/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In xfrm_state.c, hash_resize_mutex is defined as a local variable and only used in xfrm_hash_resize() which is declared as a work handler of xfrm.state_hash_work. But when the xfrm.state_hash_work work is put in the global workqueue(system_wq) with schedule_work(), the work will be really inserted in the global workqueue if it was not already queued, otherwise, it is still left in the same position on the the global workqueue. This means the xfrm_hash_resize() work handler is only executed once at any time no matter how many times its work is scheduled, that is, xfrm_hash_resize() is not called concurrently at all, so hash_resize_mutex is redundant for us. Cc: Christophe Gouault <christophe.gouault@6wind.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
*	neigh: document gc_thresh2	stephen hemminger	2014-08-25	1	-0/+6
\| \| \| \| \| \| \|	Missing documentation for gc_thresh2 sysctl. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: bnx2x: fix build error with ptp	Randy Dunlap	2014-08-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bnx2x uses ptp functions, so it should select the provider of those functions (PTP_1588_CLOCK). Fixes these build errors: drivers/built-in.o: In function `__bnx2x_remove': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13409: undefined reference to `ptp_clock_unregister' drivers/built-in.o: In function `bnx2x_register_phc': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13202: undefined reference to `ptp_clock_register' drivers/built-in.o: In function `bnx2x_get_ts_info': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3498: undefined reference to `ptp_clock_index' Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	team: set IFF_TEAM_PORT priv_flag after rx_handler is registered	Jiri Pirko	2014-08-25	1	-14/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When one tries to add eth as a port into team and that eth is already in use by other rx_handler device (macvlan, bond, bridge, ...) a bug in team_port_add() causes that IFF_TEAM_PORT flag is set before rx_handler is registered. In between, netdev nofifier is called and team_device_event() sees IFF_TEAM_PORT and thinks that rx_handler_data pointer is set to team_port. But it isn't. Fix this by reordering rx_handler register and IFF_TEAM_PORT priv flag set so it is very similar to how bonding does this. Reported-by: Erik Hugne <erik.hugne@ericsson.com> Fixes: 3d249d4ca7 "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bpf: x86: add missing 'shift by register' instructions to x64 eBPF JIT	Alexei Starovoitov	2014-08-25	2	-0/+80
\| \| \| \| \| \| \| \| \| \|	'shift by register' operations are supported by eBPF interpreter, but were accidently left out of x64 JIT compiler. Fix it and add a testcase. Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'bnx2x-next'	David S. Miller	2014-08-25	5	-18/+44
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yuval Mintz says: ==================== bnx2x: `fixes' patch-series This series contains mostly bug fixes, but never the less is intended for `net-next' and not `net', as: - Some of the fixes are quite insignificant [`VF clean statistics', `ethtool -d might cause timeout in log']. - Some only recently were submitted to `net-next' [`Fix timesync endianity']. - Some are not usually compiled as part of the kernel [`Fix stop-on-error']. Dave - please consider applying this series to `net-next'; If you prefer, I can break this series into 2 parts [one for `net' and the other for `net-next'] - but personally I don't see much benefit in it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: Fix timesync endianity	Michal Kalderon	2014-08-25	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support") has a missing conversion to LE32, which will prevent the feature from working on big endian machines. Signed-off-by: Michal Kalderon <Michal.Kalderon@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: Be more forgiving toward SW GRO	Dmitry Kravkov	2014-08-25	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces 2 new relaxations in the bnx2x driver regarding GRO: 1. Don't prevent SW GRO if HW GRO is disabled. 2. If all aggregations are disabled, when GRO configuration changes there's no need to perform an inner-reload [since it will have no actual effect]. Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: VF clean statistics	Yuval Mintz	2014-08-25	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During statistics initialization of a VF we need to clean its statistics. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: Fix stop-on-error	Yuval Mintz	2014-08-25	2	-6/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When STOP_ON_ERROR is set driver will not compile. Even if it did, traffic will not pass without this patch as several fields which are verified by FW/HW on the Tx path are not properly set. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: ethtool -d might cause timeout in log	Dmitry Kravkov	2014-08-25	1	-9/+5
\|/ \| \| \| \| \| \| \| \| \| \|	This changes slightly the set of registers read during `ethtool -d'. Without this change, it's possible the HW will generate a grc Attention which will be logged into system logs as `grc timeout'. Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: make skb an optional parameter for__skb_flow_dissect()	WANG Cong	2014-08-25	2	-5/+17
\| \| \| \| \| \| \|	Fixes: commit 690e36e726d00d2 (net: Allow raw buffers to be passed into the flow dissector) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fix comments for __skb_flow_get_ports()	WANG Cong	2014-08-25	1	-2/+4
\| \| \| \| \| \| \|	Fixes: commit 690e36e726d00d2 (net: Allow raw buffers to be passed into the flow dissector) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ixgbe: support skb->xmit_more in netdev_ops->ndo_start_xmit()	Daniel Borkmann	2014-08-25	1	-3/+4
\| \| \| \| \| \| \| \|	This implements the deferred tail pointer flush API for the ixgbe driver. Similar version also proposed longer time ago by Alexander Duyck. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Remove ndo_xmit_flush netdev operation, use signalling instead.	David S. Miller	2014-08-25	4	-56/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As reported by Jesper Dangaard Brouer, for high packet rates the overhead of having another indirect call in the TX path is non-trivial. There is the indirect call itself, and then there is all of the reloading of the state to refetch the tail pointer value and then write the device register. Move to a more passive scheme, which requires very light modifications to the device drivers. The signal is a new skb->xmit_more value, if it is non-zero it means that more SKBs are pending to be transmitted on the same queue as the current SKB. And therefore, the driver may elide the tail pointer update. Right now skb->xmit_more is always zero. Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'is_kdump_kernel'	David S. Miller	2014-08-25	4	-3/+7
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Amir Vadai says: ==================== Make is_kdump_kernel() accessible from modules I'm re-spinning this patchset. At the begining it was suggested to use a different name for the parameter, but at the end [3] the resolution was to leave it as it is in this patch. Drivers need to know if running from kdump kernel in order to change their memory profile - since kdump environment is limited by available memory. Currently there are drivers that are using reset_devices as suggested in [2]. In [2] it was suggested to use reset_devices, but the context was, to enable driver to know when the hardware device is needed to be reset, and not if this is a kdump environment. We think that is_kdump_kernel() is better suited to select between different memory profiles. The first patch in this patchset exports a needed symbol in order to make is_kdump_kernel() accessible from the drivers. The rest of the patches change from reset_devices to is_kdump_kernel() in 2 networking drivers. The idea of this patchset was suggested by Vivek Goyal. Tested (only build) and applied on top of commit 8fc54f6: ("net: use reciprocal_scale() helper") [1] - ea1c1af: ("net/mlx4_en: Reduce memory consumption on kdump kernel") [2] - https://lkml.org/lkml/2011/1/27/341 [3] - http://www.spinics.net/lists/netdev/msg291492.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/bnx2x: Use is_kdump_kernel() to detect kdump kernel	Amir Vadai	2014-08-25	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. CC: Ariel Elior <ariel.elior@qlogic.com> CC: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4: Use is_kdump_kernel() to detect kdump kernel	Amir Vadai	2014-08-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	crash_dump: Make is_kdump_kernel() accessible from modules	Amir Vadai	2014-08-25	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to make is_kdump_kernel() accessible from modules, need to make elfcorehdr_addr exported. This was rejected in the past [1] because reset_devices was prefered in that context (reseting the device in kdump kernel), but now there are some network drivers that need to reduce memory usage when loaded from a kdump kernel. And in that context, is_kdump_kernel() suits better. [1] - https://lkml.org/lkml/2011/1/27/341 CC: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	stmmac: simple cleanups	Pavel Machek	2014-08-25	3	-12/+10
\| \| \| \| \| \| \| \|	This adds simple cleanups for stmmac, removing test we know is always true, fixing whitespace, and moving code out of if(). Signed-off-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
*	r8152: check code with checkpatch.pl	hayeswang	2014-08-25	1	-31/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	626: CHECK: Alignment should match open parenthesis 646: CHECK: Alignment should match open parenthesis 655: CHECK: Alignment should match open parenthesis 695: CHECK: Alignment should match open parenthesis 729: CHECK: Alignment should match open parenthesis 739: CHECK: Alignment should match open parenthesis 976: WARNING: externs should be avoided in .c files 1314: CHECK: Alignment should match open parenthesis 1358: WARNING: networking block comments don't use an empty /* line, use /* Comment... 1402: WARNING: networking block comments don't use an empty /* line, use /* Comment... 1521: CHECK: multiple assignments should be avoided 1775: CHECK: Alignment should match open parenthesis 1838: CHECK: multiple assignments should be avoided 1843: CHECK: multiple assignments should be avoided 1847: CHECK: multiple assignments should be avoided 1850: WARNING: Missing a blank line after declarations 1864: CHECK: Alignment should match open parenthesis 1872: CHECK: braces {} should be used on all arms of this statement 1906: CHECK: usleep_range is preferred over udelay 2865: WARNING: networking block comments don't use an empty /* line, use /* Comment... 3088: CHECK: Alignment should match open parenthesis total: 0 errors, 5 warnings, 16 checks, 3567 lines checked Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'ndo_xmit_flush'	David S. Miller	2014-08-24	11	-27/+77
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Basic deferred TX queue flushing infrastructure. Over time, and specifically and more recently at the Networking Workshop during Kernel SUmmit in Chicago, we have discussed the idea of having some way to optimize transmits of multiple TX packets at a time. There are several areas of overhead that could be amortized with such schemes. One has to do with locking and transactional overhead, the other has to do with device specific costs. This patch set here is more aimed at device specific costs. Typically a device queues up a packet in the TX queue and then has to do something to have the device start processing that new entry. Sometimes this is composed of doing an MMIO write to a "tail" register, and in other cases it can involve something as expensive as a hypervisor call. The basic setup defined here is that when the driver supports deferred TX queue flushing, ndo_start_xmit should no longer perform that operation. Instead a new operation, ndo_xmit_flush, should do it. I have converted IGB and virtio_net as example initial users. The IGB conversion is tested, virtio_net is not but it does compile :-) All ndo_start_xmit call sites have been abstracted behind a new helper called netdev_start_xmit(). This just adds the infrastructure, it does not actually add any instances of actually doing multiple ndo_start_xmit calls per ndo_xmit_flush invocation. Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	virtio_net: Support netdev_ops->ndo_xmit_flush()	David S. Miller	2014-08-24	1	-1/+9
\| \| \| \| \| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	igb: Support netdev_ops->ndo_xmit_flush()	David S. Miller	2014-08-24	1	-11/+24
\| \| \| \| \| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: Add ops->ndo_xmit_flush()	David S. Miller	2014-08-24	9	-15/+44
\|/ \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: White-space cleansing : gaps between function and symbol export	Ian Morris	2014-08-24	11	-23/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch removes some blank lines between the end of a function definition and the EXPORT_SYMBOL_GPL macro in order to prevent checkpatch warning that EXPORT_SYMBOL must immediately follow a function. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: White-space cleansing : Structure layouts	Ian Morris	2014-08-24	5	-21/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch addresses structure definitions, specifically it cleanses the brace placement and replaces spaces with tabs in a few places. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: White-space cleansing : Line Layouts	Ian Morris	2014-08-24	31	-203/+203
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. A number of items are addressed in this patch: * Multiple spaces converted to tabs * Spaces before tabs removed. * Spaces in pointer typing cleansed (char )foo etc. Remove space after sizeof * Ensure spacing around comparators such as if statements. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: ec_bhf: remove excessive debug messages	Darek Marcinkiewicz	2014-08-24	1	-91/+10
\| \| \| \| \| \| \| \|	This cuts down the number of debug information spit out by the driver. Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
*	random32: improvements to prandom_bytes	Daniel Borkmann	2014-08-24	2	-23/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses a couple of minor items, mostly addesssing prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t for length arguments, 2) We can use put_unaligned() when filling the array instead of open coding it [ perhaps some archs will further benefit from their own arch specific implementation when GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned int as type for getting the arch seed, 5) Make use of prandom_u32_max() for timer slack. Regarding the change to put_unaligned(), callers of prandom_bytes() which internally invoke prandom_bytes_state(), don't bother as they expect the array to be filled randomly and don't have any control of the internal state what-so-ever (that's also why we have periodic reseeding there, etc), so they really don't care. Now for the direct callers of prandom_bytes_state(), which are solely located in test cases for MTD devices, that is, drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}: These tests basically fill a test write-vector through prandom_bytes_state() with an a-priori defined seed each time and write that to a MTD device. Later on, they set up a read-vector and read back that blocks from the device. So in the verification phase, the write-vector is being re-setup [ so same seed and prandom_bytes_state() called ], and then memcmp()'ed against the read-vector to check if the data is the same. Akinobu, Lothar and I also tested this patch and it runs through the 3 relevant MTD test cases w/o any errors on the nandsim device (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53 and i.MX6): # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \ third_id_byte=0x00 fourth_id_byte=0x15 # modprobe mtd_oobtest dev=0 # modprobe mtd_pagetest dev=0 # modprobe mtd_subpagetest dev=0 We also don't have any users depending directly on a particular result of the PRNG (except the PRNG self-test itself), and that's just fine as it e.g. allowed us easily to do things like upgrading from taus88 to taus113. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Lothar Waßmann <LW@KARO-electronics.de> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'csums-next'	David S. Miller	2014-08-24	12	-103/+233
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tom Herbert says: ==================== net: Checksum offload changes - Part V I am working on overhauling RX checksum offload. Goals of this effort are: - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY - Preserve CHECKSUM_COMPLETE through encapsulation layers - Don't do skb_checksum more than once per packet - Unify GRO and non-GRO csum verification as much as possible - Unify the checksum functions (checksum_init) - Simplify code What is in this fifth patch set: - Added GRO checksum validation functions - Call the GRO validations functions from TCP and GRE gro_receive - Perform checksum verification in the UDP gro_receive path using GRO functions and add support for gro_receive in UDP6 Changes in V2: - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it to CHECKSUM_COMPLETE from GRO checksum validation. This avoids performance penalty in checksumming bytes which are before the header GRO is at. Please review carefully and test if possible, mucking with basic checksum functions is always a little precarious :-) ---- Test results with this patch set are below. I did not notice any performace regression. Tests run: TCP_STREAM: super_netperf with 200 streams TCP_RR: super_netperf with 200 streams and -r 1,1 Device bnx2x (10Gbps): No GRE RSS hash (RX interrupts occur on one core) UDP RSS port hashing enabled. * GRE with checksum with IPv4 encapsulated packets With fix: TCP_STREAM 9.91% CPU utilization 5163.78 Mbps TCP_RR 50.64% CPU utilization 219/347/502 90/95/99% latencies 834103 tps Without fix: TCP_STREAM 10.05% CPU utilization 5186.22 tps TCP_RR 49.70% CPU utilization 227/338/486 90/95/99% latencies 813450 tps * GRE without checksum with IPv4 encapsulated packets With fix: TCP_STREAM 10.18% CPU utilization 5159 Mbps TCP_RR 51.86% CPU utilization 214/325/471 90/95/99% latencies 865943 tps Without fix: TCP_STREAM 10.26% CPU utilization 5307.87 Mbps TCP_RR 50.59% CPU utilization 224/325/476 90/95/99% latencies 846429 tps *** Simulate device returns CHECKSUM_COMPLETE * VXLAN with checksum With fix: TCP_STREAM 13.03% CPU utilization 9093.9 Mbps TCP_RR 95.96% CPU utilization 161/259/474 90/95/99% latencies 1.14806e+06 tps Without fix: TCP_STREAM 13.59% CPU utilization 9093.97 Mbps TCP_RR 93.95% CPU utilization 160/259/484 90/95/99% latencies 1.10262e+06 tps * VXLAN without checksum With fix: TCP_STREAM 13.28% CPU utilization 9093.87 Mbps TCP_RR 95.04% CPU utilization 155/246/439 90/95/99% latencies 1.15e+06 tps Without fix: TCP_STREAM 13.37% CPU utilization 9178.45 Mbps TCP_RR 93.74% CPU utilization 161/257/469 90/95/99% latencies 1.1068e+06 Mbps ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	gre: When GRE csum is present count as encap layer wrt csum	Tom Herbert	2014-08-24	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In GRE demux if the GRE checksum pop rcv encapsulation so that any encapsulated checksums are treated as tunnel checksums. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	udp: additional GRO support	Tom Herbert	2014-08-24	4	-17/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement GRO for UDPv6. Add UDP checksum verification in gro_receive for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: Call skb_gro_checksum_validate	Tom Herbert	2014-08-24	2	-47/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In tcp[64]_gro_receive call skb_gro_checksum_validate to validate TCP checksum in the gro context. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	gre: call skb_gro_checksum_simple_validate	Tom Herbert	2014-08-24	1	-36/+7
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: add gro_compute_pseudo functions	Tom Herbert	2014-08-24	2	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo for GRO path. The IP header is taken from skb_gro_network_header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: skb_gro_checksum_* functions	Tom Herbert	2014-08-24	2	-3/+107
\|/ \| \| \| \| \| \| \| \| \|	Add skb_gro_checksum_validate, skb_gro_checksum_validate_zero_check, and skb_gro_checksum_simple_validate, and __skb_gro_checksum_complete. These are the cognates of the normal checksum functions but are used in the gro_receive path and operate on GRO related fields in sk_buffs. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: use reciprocal_scale() helper	Daniel Borkmann	2014-08-23	14	-22/+24
\| \| \| \| \| \| \| \|	Replace open codings of (((u64) <x> * <y>) >> 32) with reciprocal_scale(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Allow raw buffers to be passed into the flow dissector.	David S. Miller	2014-08-23	3	-22/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drivers, and perhaps other entities we have not yet considered, sometimes want to know how deep the protocol headers go before deciding how large of an SKB to allocate and how much of the packet to place into the linear SKB area. For example, consider a driver which has a device which DMAs into pools of pages and then tells the driver where the data went in the DMA descriptor(s). The driver can then build an SKB and reference most of the data via SKB fragments (which are page/offset/length triplets). However at least some of the front of the packet should be placed into the linear SKB area, which comes before the fragments, so that packet processing can get at the headers efficiently. The first thing each protocol layer is going to do is a "pskb_may_pull()" so we might as well aggregate as much of this as possible while we're building the SKB in the driver. Part of supporting this is that we don't have an SKB yet, so we want to be able to let the flow dissector operate on a raw buffer in order to compute the offset of the end of the headers. So now we have a __skb_flow_dissect() which takes an explicit data pointer and length. Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'bcm7xxx_apd_eee'	David S. Miller	2014-08-23	6	-127/+229
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Florian Fainelli says: ==================== net: phy: bcm7xxx: APD and EEE support This patch series enables Auto-power down and EEE for the BCM7xxx integrated Gigabit PHYs. I also put a fix for the fixed PHY that would allow clause 45 over clause 22 reads/writes but would return bogus data by using e.g: ethtool --show-eee ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: bcm7xxx: enable EEE at the PHY level	Florian Fainelli	2014-08-23	2	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 28nm Gigabit PHY on BCM7xxx chips comes out of reset with absolutely no EEE capabilities, such that we would actually return that we do not support EEE when accessing 3.20 (MDIO_PCS_EEE_ABLE) registers. Poke through the vendor-specific C45 register to enable EEE globally at the PHY level, and advertise supported EEE modes. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: allow phy_init_eee() to work with internal PHYs	Florian Fainelli	2014-08-23	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Internal PHYs do not have any specific phy_interface_t defined because they are within an Ethernet MAC or a larger IC, they will fail the early check in phy_init_eee(). Allow these PHYs to proceed with EEE initialization and report error/success by checking the standard C45 EEE-related registers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: export phy_{read,write}_mmd_indirect	Florian Fainelli	2014-08-23	2	-2/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some PHY drivers might need to access Clause 45 registers in Clause 22 compatibility mode to e.g: properly advertise EEE support when disabled by default. Export these two helper functions: phy_read_mmd_indirect() and phy_write_mmd_indirect() for drivers to use them. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: fixed: return an error for Clause 45 over 22 reads	Florian Fainelli	2014-08-23	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The fixed PHY driver does not properly emulate Clause 45 over Clause 22 MDIO reads, and as such, will return bogus values when we access such registers. Return an error when accessing these registers in order to prevent advertising bogus capabilities such as EEE support and such. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: bcm7xxx: enable auto power down	Florian Fainelli	2014-08-23	1	-1/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 28nm process BCM7xxx internal Gigabit PHYs all support automatic power down, turn on that feature as part of the configuration initialization callback. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: broadcom: move shadow 0x1C register accessors to brcmphy.h	Florian Fainelli	2014-08-23	2	-18/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The shadow register 0x1C is used both by the BCM54xxx PHYs and the BCM7xxx internal PHYs, move the accessors to a common location so both drivers can use them. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: phy: broadcom: extract all registers to brcmphy.h	Florian Fainelli	2014-08-23	2	-104/+103
\|/ \| \| \| \| \| \| \| \| \|	Commit 439d39a9ac8fbbba9c04581361188f33f21ced50 ("net: phy: broadcom: extract register definitions") added a bunch of registers to brcmphy.h but left some to broadcom.c, move all of them to the header file since the BCM54xx and BCM7xxx PHY drivers do share all of these registers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'tipc-next'	David S. Miller	2014-08-23	20	-1301/+956
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jon Maloy says: ==================== tipc: Merge port and socket layer code After the removal of the TIPC native interface, there is no reason to keep a distinction between a "generic" port layer and a "specific" socket layer in the code. Throughout the last months, we have posted several series that aimed at facilitating removal of the port layer, and in particular the port_lock spinlock, which in reality duplicates the role normally kept by lock_sock()/bh_lock_sock(). In this series, we finalize this work, by making a significant number of changes to the link, node, port and socket code, all with the aim of reducing dependencies between the layers. In the final commits, we then remove the port spinlock, port.c and port.h altogether. After this series, we have a socket layer that has only few dependencies to the rest of the stack, so that it should be possible to continue cleanups of its code without significantly affecting other code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: merge struct tipc_port into struct tipc_sock	Jon Paul Maloy	2014-08-23	2	-241/+206
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We complete the merging of the port and socket layer by aggregating the fields of struct tipc_port directly into struct tipc_sock, and moving the combined structure into socket.c. We also move all functions and macros that are not any longer exposed to the rest of the stack into socket.c, and rename them accordingly. Despite the size of this commit, there are no functional changes. We have only made such changes that are necessary due of the removal of struct tipc_port. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: remove files ref.h and ref.c	Jon Paul Maloy	2014-08-23	6	-334/+250
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The reference table is now 'socket aware' instead of being generic, and has in reality become a socket internal table. In order to be able to minimize the API exposed by the socket layer towards the rest of the stack, we now move the reference table definitions and functions into the file socket.c, and rename the functions accordingly. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>