summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | inet: tcp: fix inetpeer_set_addr_v4()Eric Dumazet2015-12-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Ahern added a vif field in the a4 part of inetpeer_addr struct. This broke IPv4 TCP fast open client side and more generally tcp metrics cache, because inetpeer_addr_cmp() is now comparing two u32 instead of one. inetpeer_set_addr_v4() needs to properly init vif field, otherwise the comparison result depends on uninitialized data. Fixes: 192132b9a034 ("net: Add support for VRFs to inetpeer cache") Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipv6: automatically enable stable privacy mode if stable_secret setHannes Frederic Sowa2015-12-151-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bjørn reported that while we switch all interfaces to privacy stable mode when setting the secret, we don't set this mode for new interfaces. This does not make sense, so change this behaviour. Fixes: 622c81d57b392cc ("ipv6: generation of stable privacy addresses for link-local and autoconf") Reported-by: Bjørn Mork <bjorn@mork.no> Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: fix uninitialized variable issuetadeusz.struk@intel.com2015-12-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | msg_iocb needs to be initialized on the recv/recvfrom path. Otherwise afalg will wrongly interpret it as an async call. Cc: stable@vger.kernel.org Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bluetooth: Validate socket address length in sco_sock_bind().David S. Miller2015-12-151-0/+3
| | | | | | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net_sched: make qdisc_tree_decrease_qlen() work for non mqEric Dumazet2015-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stas Nichiporovich reported a regression in his HFSC qdisc setup on a non multi queue device. It turns out I mistakenly added a TCQ_F_NOPARENT flag on all qdisc allocated in qdisc_create() for non multi queue devices, which was rather buggy. I was clearly mislead by the TCQ_F_ONETXQUEUE that is also set here for no good reason, since it only matters for the root qdisc. Fixes: 4eaf3b84f288 ("net_sched: fix qdisc_tree_decrease_qlen() races") Reported-by: Stas Nichiporovich <stasn77@gmail.com> Tested-by: Stas Nichiporovich <stasn77@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'ser_gigaset-platform-device-dealloc'David S. Miller2015-12-151-12/+11
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Paul Bolle says: ==================== ser_gigaset: fix deallocation of platform device structure Sascha Levin reported that the syzkaller fuzzer triggered a WARNING in ser_gigaset (see https://lkml.kernel.org/g/56587467.8050102@oracle.com ). It turned out that ser_gigaset has always deallocated its platform device structure incorrectly. Tilman submitted the patch that fixes that (3/4) and a related cleanup (4/4). Tilman also submitted a minor cleanup of some NULL checks (1/4) that prompted Alan to turn those checks into WARN_ONs (2/4). If no one hits these WARN_ONs in the next couple of releases these WARN_ONs should be removed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | ser_gigaset: remove unnecessary kfree() calls from release methodTilman Schmidt2015-12-151-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | device->platform_data and platform_device->resource are never used and remain NULL through their entire life. Drops the kfree() calls for them from the device release method. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | ser_gigaset: fix deallocation of platform device structureTilman Schmidt2015-12-151-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When shutting down the device, the struct ser_cardstate must not be kfree()d immediately after the call to platform_device_unregister() since the embedded struct platform_device is still in use. Move the kfree() call to the release method instead. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Fixes: 2869b23e4b95 ("drivers/isdn/gigaset: new M101 driver (v2)") Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | ser_gigaset: turn nonsense checks into WARN_ONAlan Cox2015-12-151-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These checks do nothing useful to protect the code from races. On the other hand if the old code has been masking a real bug we would like to know about it. The check for tiocmset is kept because it is valid for a tty driver to have a NULL tiocmset method. That in itself is probably a mistake given modern coding practices - but needs fixing in the tty layer. Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | ser_gigaset: fix up NULL checksTilman Schmidt2015-12-151-3/+3
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f34d7a5b7010 ("tty: The big operations rework") changed tty->driver to tty->ops but left NULL checks for tty->driver untouched. Fix. Signed-off-by: Tilman Schmidt <tilman@imap.cc> [pebolle: removed Fixes tag] Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | qlcnic: fix a timeout loopDan Carpenter2015-12-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem here is that at the end of the loop we test for if idc->vnic_wait_limit is zero, but since idc->vnic_wait_limit-- is a post-op, it actually ends up set to (u8)-1. I have fixed this by moving the decrement inside the loop. Fixes: 486a5bc77a4a ('qlcnic: Add support for 83xx suspend and resume.') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sfc: fix a timeout loopDan Carpenter2015-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We test for if "tries" is zero at the end but "tries--" is a post-op so it will end with "tries" set to -1. I have changed it to a pre-op instead. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | qlge: fix a timeout loop in ql_change_rx_buffers()Dan Carpenter2015-12-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem here is that after the loop we test for "if (!i) " but because "i--" is a post-op we exit with i set to -1. I have fixed this by changing it to a pre-op instead. I had to change the starting value from 3 to 4 so that we still iterate 3 times. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | amd-xgbe: fix a couple timeout loopsDan Carpenter2015-12-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the end of the loop we test "if (!count)" but because "count--" is a post-op then the loop will end with count set to -1. I have fixed this by changing it to --count. Fixes: c5aa9e3b8156 ('amd-xgbe: Initial AMD 10GbE platform driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | mISDN: fix a loop countDan Carpenter2015-12-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two issue here. 1) cnt starts as maxloop + 1 so all these loops iterate one more time than intended. 2) At the end of the loop we test for "if (maxloop && !cnt)" but for the first two loops, we end with cnt equal to -1. Changing this to a pre-op means we end with cnt set to 0. Fixes: cae86d4a4e56 ('mISDN: Add driver for Infineon ISDN chipset family') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_core: fix handling return value of mlx4_slave_convert_portAndrzej Hajda2015-12-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function can return negative values, so its result should be assigned to signed variable. The problem has been detected using proposed semantic patch scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2046107 Fixes: fc48866f7 ('net/mlx4: Adapt code for N-Port VF') Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | skbuff: Fix offset error in skb_reorder_vlan_headerVlad Yasevich2015-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb_reorder_vlan_header is called after the vlan header has been pulled. As a result the offset of the begining of the mac header has been incrased by 4 bytes (VLAN_HLEN). When moving the mac addresses, include this incrase in the offset calcualation so that the mac addresses are copied correctly. Fixes: a6e18ff1117 (vlan: Fix untag operations of stacked vlans with REORDER_HEADER off) CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladislav Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | pptp: verify sockaddr_len in pptp_bind() and pptp_connect()WANG Cong2015-12-151-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | Reported-by: Dmitry Vyukov <dvyukov@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ravb: Add disable 10baseKazuya Mizuguchi2015-12-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ethernet AVB does not support 10 Mbps transfer speed. Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sh_eth: fix descriptor access endiannessSergei Shtylyov2015-12-151-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver never calls cpu_to_edmac() when writing the descriptor address and edmac_to_cpu() when reading it, although it should -- fix this. Note that the frame/buffer length descriptor field accesses also need fixing but since they are both 16-bit we can't use {cpu|edmac}_to_{edmac|cpu}()... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sh_eth: fix TX buffer byte-swappingSergei Shtylyov2015-12-151-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the little-endian SH771x kernels the driver has to byte-swap the RX/TX buffers, however yet unset physcial address from the TX descriptor is used to call sh_eth_soft_swap(). Use 'skb->data' instead... Fixes: 31fcb99d9958 ("net: sh_eth: remove __flush_purge_region") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: fix IP early demux racesEric Dumazet2015-12-144-6/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Wilder reported crashes caused by dst reuse. <quote David> I am seeing a crash on a distro V4.2.3 kernel caused by a double release of a dst_entry. In ipv4_dst_destroy() the call to list_empty() finds a poisoned next pointer, indicating the dst_entry has already been removed from the list and freed. The crash occurs 18 to 24 hours into a run of a network stress exerciser. </quote> Thanks to his detailed report and analysis, we were able to understand the core issue. IP early demux can associate a dst to skb, after a lookup in TCP/UDP sockets. When socket cache is not properly set, we want to store into sk->sk_dst_cache the dst for future IP early demux lookups, by acquiring a stable refcount on the dst. Problem is this acquisition is simply using an atomic_inc(), which works well, unless the dst was queued for destruction from dst_release() noticing dst refcount went to zero, if DST_NOCACHE was set on dst. We need to make sure current refcount is not zero before incrementing it, or risk double free as David reported. This patch, being a stable candidate, adds two new helpers, and use them only from IP early demux problematic paths. It might be possible to merge in net-next skb_dst_force() and skb_dst_force_safe(), but I prefer having the smallest patch for stable kernels : Maybe some skb_dst_force() callers do not expect skb->dst can suddenly be cleared. Can probably be backported back to linux-3.6 kernels Reported-by: David J. Wilder <dwilder@us.ibm.com> Tested-by: David J. Wilder <dwilder@us.ibm.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sh_eth: uninline sh_eth_{write|read}()Sergei Shtylyov2015-12-142-25/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3365711df024 ("sh_eth: WARN on access to a register not implemented in in a particular chip") added WARN_ON() to sh_eth_{read|write}(), thus making it unacceptable for these functions to be *inline* anymore. Remove *inline* and move the functions from the header to the driver itself. Below is our code economy with ARM gcc 4.7.3: $ size drivers/net/ethernet/renesas/sh_eth.o{~,} text data bss dec hex filename 32489 1140 0 33629 835d drivers/net/ethernet/renesas/sh_eth.o~ 25413 1140 0 26553 67b9 drivers/net/ethernet/renesas/sh_eth.o Suggested-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | stmmac: dwmac-sunxi: Call exit cleanup function in probe error pathChen-Yu Tsai2015-12-141-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dwmac-sunxi has 2 callbacks that were called from stmmac_platform as part of the probe and remove sequences. Ater the conversion of dwmac-sunxi into a standalone platform driver, the .init function is called before calling into the stmmac driver core, but .exit is not called to clean up if stmmac returns an error. This patch fixes the probe error path. This properly cleans up and releases resources when the driver core fails to probe. Cc: Joachim Eastwood <manabian@gmail.com> Fixes: 9a9e9a1edee8 ("stmmac: dwmac-sunxi: turn setup callback into a probe function") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: add validation for the socket syscall protocol argumentHannes Frederic Sowa2015-12-146-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: phy: mdio-mux: Check return value of mdiobus_alloc()Tobias Klauser2015-12-141-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mdiobus_alloc() might return NULL, but its return value is not checked in mdio_mux_init(). This could potentially lead to a NULL pointer dereference. Fix it by checking the return value Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | openvswitch: fix trivial comment typoPaolo Abeni2015-12-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 33db4125ec74 ("openvswitch: Rename LABEL->LABELS") left over an old OVS_CT_ATTR_LABEL instance, fix it. Fixes: 33db4125ec74 ("openvswitch: Rename LABEL->LABELS") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2015-12-146-59/+57
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains Netfilter fixes for you net tree, specifically for nf_tables and nfnetlink_queue, they are: 1) Avoid a compilation warning in nfnetlink_queue that was introduced in the previous merge window with the simplification of the conntrack integration, from Arnd Bergmann. 2) nfnetlink_queue is leaking the pernet subsystem registration from a failure path, patch from Nikolay Borisov. 3) Pass down netns pointer to batch callback in nfnetlink, this is the largest patch and it is not a bugfix but it is a dependency to resolve a splat in the correct way. 4) Fix a splat due to incorrect socket memory accounting with nfnetlink skbuff clones. 5) Add missing conntrack dependencies to NFT_DUP_IPV4 and NFT_DUP_IPV6. 6) Traverse the nftables commit list in reverse order from the commit path, otherwise we crash when the user applies an incremental update via 'nft -f' that deletes an object that was just introduced in this batch, from Xin Long. Regarding the compilation warning fix, many people have sent us (and keep sending us) patches to address this, that's why I'm including this batch even if this is not critical. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abortXin Long2015-12-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we use 'nft -f' to submit rules, it will build multiple rules into one netlink skb to send to kernel, kernel will process them one by one. meanwhile, it add the trans into commit_list to record every commit. if one of them's return value is -EAGAIN, status |= NFNL_BATCH_REPLAY will be marked. after all the process is done. it will roll back all the commits. now kernel use list_add_tail to add trans to commit, and use list_for_each_entry_safe to roll back. which means the order of adding and rollback is the same. that will cause some cases cannot work well, even trigger call trace, like: 1. add a set into table foo [return -EAGAIN]: commit_list = 'add set trans' 2. del foo: commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans' then nf_tables_abort will be called to roll back: firstly process 'add set trans': case NFT_MSG_NEWSET: trans->ctx.table->use--; list_del_rcu(&nft_trans_set(trans)->list); it will del the set from the table foo, but it has removed when del table foo [step 2], then the kernel will panic. the right order of rollback should be: 'del tab trans' -> 'del set trans' -> 'add set trans'. which is opposite with commit_list order. so fix it by rolling back commits with reverse order in nf_tables_abort. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | netfilter: nf_dup: add missing dependencies with NF_CONNTRACKPablo Neira Ayuso2015-12-102-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_NF_CONNTRACK=m CONFIG_NF_DUP_IPV4=y results in: net/built-in.o: In function `nf_dup_ipv4': >> (.text+0xd434f): undefined reference to `nf_conntrack_untracked' Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in ↵Pablo Neira Ayuso2015-12-101-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skbuff clones If we attach the sk to the skb from nfnetlink_rcv_batch(), then netlink_skb_destructor() will underflow the socket receive memory counter and we get warning splat when releasing the socket. $ cat /proc/net/netlink sk Eth Pid Groups Rmem Wmem Dump Locks Drops Inode ffff8800ca903000 12 0 00000000 -54144 0 0 2 0 17942 ^^^^^^ Rmem above shows an underflow. And here below the warning splat: [ 1363.815976] WARNING: CPU: 2 PID: 1356 at net/netlink/af_netlink.c:958 netlink_sock_destruct+0x80/0xb9() [...] [ 1363.816152] CPU: 2 PID: 1356 Comm: kworker/u16:1 Tainted: G W 4.4.0-rc1+ #153 [ 1363.816155] Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012 [ 1363.816160] Workqueue: netns cleanup_net [ 1363.816163] 0000000000000000 ffff880119203dd0 ffffffff81240204 0000000000000000 [ 1363.816169] ffff880119203e08 ffffffff8104db4b ffffffff813d49a1 ffff8800ca771000 [ 1363.816174] ffffffff81a42b00 0000000000000000 ffff8800c0afe1e0 ffff880119203e18 [ 1363.816179] Call Trace: [ 1363.816181] <IRQ> [<ffffffff81240204>] dump_stack+0x4e/0x79 [ 1363.816193] [<ffffffff8104db4b>] warn_slowpath_common+0x9a/0xb3 [ 1363.816197] [<ffffffff813d49a1>] ? netlink_sock_destruct+0x80/0xb9 skb->sk was only needed to lookup for the netns, however we don't need this anymore since 633c9a840d0b ("netfilter: nfnetlink: avoid recurrent netns lookups in call_batch") so this patch removes this manual socket assignment to resolve this problem. Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
| | * | | netfilter: nfnetlink: avoid recurrent netns lookups in call_batchPablo Neira Ayuso2015-12-103-53/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the net pointer to the call_batch callback functions so we can skip recurrent lookups. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
| | * | | netfilter: nfnetlink_queue: Unregister pernet subsys in case of init failureNikolay Borisov2015-12-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3bfe049807c2403 ("netfilter: nfnetlink_{log,queue}: Register pernet in first place") reorganised the initialisation order of the pernet_subsys to avoid "use-before-initialised" condition. However, in doing so the cleanup logic in nfnetlink_queue got botched in that the pernet_subsys wasn't cleaned in case nfnetlink_subsys_register failed. This patch adds the necessary cleanup routine call. Fixes: 3bfe049807c2403 ("netfilter: nfnetlink_{log,queue}: Register pernet in first place") Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | netfilter: nfnetlink_queue: avoid harmless unnitialized variable warningsArnd Bergmann2015-11-231-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several ARM default configurations give us warnings on recent compilers about potentially uninitialized variables in the nfnetlink code in two functions: net/netfilter/nfnetlink_queue.c: In function 'nfqnl_build_packet_message': net/netfilter/nfnetlink_queue.c:519:19: warning: 'nfnl_ct' may be used uninitialized in this function [-Wmaybe-uninitialized] if (ct && nfnl_ct->build(skb, ct, ctinfo, NFQA_CT, NFQA_CT_INFO) < 0) Moving the rcu_dereference(nfnl_ct_hook) call outside of the conditional code avoids the warning without forcing us to preinitialize the variable. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: a4b4766c3ceb ("netfilter: nfnetlink_queue: rename related to nfqueue attaching conntrack info") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | net: Flush local routes when device changes vrf associationDavid Ahern2015-12-131-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The VRF driver cycles netdevs when an interface is enslaved or released: the down event is used to flush neighbor and route tables and the up event (if the interface was already up) effectively moves local and connected routes to the proper table. As of 4f823defdd5b the local route is left hanging around after a link down, so when a netdev is moved from one VRF to another (or released from a VRF altogether) local routes are left in the wrong table. Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is an L3mdev then call fib_disable_ip to flush all routes, local ones to. Fixes: 4f823defdd5b ("ipv4: fix to not remove local route on link down") Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net:hns: print MAC with %pMAndy Shevchenko2015-12-131-35/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | printf() has a dedicated specifier to print MAC addresses. Use it instead of pushing each byte via stack. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net:hns: annotate IO address space properlyAndy Shevchenko2015-12-131-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark address pointer with __iomem in the IO accessors. Otherwise we will get a sparse complain like following .../hns/hns_dsaf_reg.h:991:36: warning: incorrect type in argument 1 (different address spaces) .../hns/hns_dsaf_reg.h:991:36: expected unsigned char [noderef] [usertype] <asn:2>*base .../hns/hns_dsaf_reg.h:991:36: got void *base Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge branch 'mpls-fixes'David S. Miller2015-12-121-12/+31
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Robert Shearman says: ==================== mpls: fixes for nexthops without via addresses These four fixes all apply to the case of having an mpls route with an output device, but without a nexthop. Patches 2 and 3 could really have been combined in one patch, but I wanted to separate the fix for some recent breakage from the fix for a day-1 issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | mpls: make via address optional for multipath routesRobert Shearman2015-12-121-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The via address is optional for a single path route, yet is mandatory when the multipath attribute is used: # ip -f mpls route add 100 dev lo # ip -f mpls route add 101 nexthop dev lo RTNETLINK answers: Invalid argument Make them consistent by making the via address optional when the RTA_MULTIPATH attribute is being parsed so that both forms of specifying the route work. Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | mpls: fix out-of-bounds access when via address not specifiedRobert Shearman2015-12-121-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a via address isn't specified, the via table is left initialised to 0 (NEIGH_ARP_TABLE), and the via address length also left initialised to 0. This results in a via address array of length 0 being allocated (contiguous with route and nexthop array), meaning that when a packet is sent using neigh_xmit the neighbour lookup and creation will cause an out-of-bounds access when accessing the 4 bytes of the IPv4 address it assumes it has been given a pointer to. This could be fixed by allocating the 4 bytes of via address necessary and leaving it as all zeroes. However, it seems wrong to me to use an ipv4 nexthop (including possibly ARPing for 0.0.0.0) when the user didn't specify to do so. Instead, set the via address table to NEIGH_NR_TABLES to signify it hasn't been specified and use this at forwarding time to signify a neigh_xmit using an L2 address consisting of the device address. This mechanism is the same as that used for both ARP and ND for loopback interfaces and those flagged as no-arp, which are all we can really support in this case. Fixes: cf4b24f0024f ("mpls: reduce memory usage of routes") Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | mpls: don't dump RTA_VIA attribute if not specifiedRobert Shearman2015-12-121-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem seen is that when adding a route with a nexthop with no via address specified, iproute2 generates bogus output: # ip -f mpls route add 100 dev lo # ip -f mpls route list 100 via inet 0.0.8.0 dev lo The reason for this is that the kernel generates an RTA_VIA attribute with the family set to AF_INET, but the via address data having zero length. The cause of family being AF_INET is that on route insert cfg->rc_via_table is left set to 0, which just happens to be NEIGH_ARP_TABLE which is then translated into AF_INET. iproute2 doesn't validate the length prior to printing and so prints garbage. Although it could be fixed to do the validation, I would argue that AF_INET addresses should always be exactly 4 bytes so the kernel is really giving userspace bogus data. Therefore, avoid generating the RTA_VIA attribute when dumping the route if the via address wasn't specified on add/modify. This is indicated by NEIGH_ARP_TABLE and a zero via address length - if the user specified a via address the address length would have been validated such that it was 4 bytes. Although this is a change in behaviour that is visible to userspace, I believe that what was generated before was invalid and as such userspace wouldn't be expecting it. Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | mpls: validate L2 via address lengthRobert Shearman2015-12-121-0/+4
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an L2 via address for an mpls nexthop is specified, the length of the L2 address must match that expected by the output device, otherwise it could access memory beyond the end of the via address buffer in the route. This check was present prior to commit f8efb73c97e2 ("mpls: multipath route support"), but got lost in the refactoring, so add it back, applying it to all nexthops in multipath routes. Fixes: f8efb73c97e2 ("mpls: multipath route support") Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | sfc: only use RSS filters if we're using RSSBert Kenward2015-12-123-13/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without this, filter insertion on a VF would fail if only one channel was in use. This would include the unicast station filter and therefore no traffic would be received. Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | uapi: export ila.hstephen hemminger2015-12-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The file ila.h used for lightweight tunnels is being used by iproute2 but is not exported yet. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge branch 'bnxt_en-fixes'David S. Miller2015-12-113-18/+36
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Michael Chan says: ==================== bnxt_en: Bug fix and add tx timeout recovery. Fix a bitmap declaration bug and add missing tx timeout recovery. v2: Fixed white space error. Thanks Dave. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | bnxt_en: Implement missing tx timeout reset logic.Michael Chan2015-12-111-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reset logic calls bnxt_close_nic() and bnxt_open_nic() under rtnl_lock from bnxt_sp_task. BNXT_STATE_IN_SP_TASK must be cleared before calling bnxt_close_nic() to avoid deadlock. v2: Fixed white space error. Thanks Dave. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | bnxt_en: Don't cancel sp_task from bnxt_close_nic().Michael Chan2015-12-112-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When implementing driver reset from tx_timeout in the next patch, bnxt_close_nic() will be called from the sp_task workqueue. Calling cancel_work() on sp_task will hang the workqueue. Instead, set a new bit BNXT_STATE_IN_SP_TASK when bnxt_sp_task() is running. bnxt_close_nic() will wait for BNXT_STATE_IN_SP_TASK to clear before proceeding. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | bnxt_en: Change bp->state to bitmap.Michael Chan2015-12-113-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows multiple independent bits to be set for various states. Subsequent patches to implement tx timeout reset will require this. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | bnxt_en: Fix bitmap declaration to work on 32-bit arches.Michael Chan2015-12-111-6/+5
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The declaration of the bitmap vf_req_snif_bmap using fixed array of unsigned long will only work on 64-bit archs. Use DECLARE_BITMAP instead which will work on all archs. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | openvswitch: Respect conntrack zone even if invalidJoe Stringer2015-12-111-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If userspace executes ct(zone=1), and the connection tracker determines that the packet is invalid, then the ct_zone flow key field is populated with the default zone rather than the zone that was specified. Even though connection tracking failed, this field should be updated with the value that the action specified. Fix the issue. Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>