summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* tcp: make tcp_clear_md5_list staticstephen hemminger2012-10-311-1/+1
| | | | | | | Trivial. Only used in one file. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: compute skb->rxhash if nic hash may be 3-tupleWillem de Bruijn2012-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Network device drivers can communicate a Toeplitz hash in skb->rxhash, but devices differ in their hashing capabilities. All compute a 5-tuple hash for TCP over IPv4, but for other connection-oriented protocols, they may compute only a 3-tuple. This breaks RPS load balancing, e.g., for TCP over IPv6 flows. Additionally, for GRE and other tunnels, the kernel computes a 5-tuple hash over the inner packet if possible, but devices do not. This patch recomputes the rxhash in software in all cases where it cannot be certain that a 5-tuple was computed. Device drivers can avoid recomputation by setting the skb->l4_rxhash flag. Recomputing adds cycles to each packet when RPS is enabled or the packet arrives over a tunnel. A comparison of 200x TCP_STREAM between two servers running unmodified netnext with rxhash computation in hardware vs software (using ethtool -K eth0 rxhash [on|off]) shows how much time is spent in __skb_get_rxhash in this worst case: 0.03% swapper [kernel.kallsyms] [k] __skb_get_rxhash 0.03% swapper [kernel.kallsyms] [k] __skb_get_rxhash 0.05% swapper [kernel.kallsyms] [k] __skb_get_rxhash With 200x TCP_RR it increases to 0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash 0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash 0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash I considered having the patch explicitly skips recomputation when it knows that it will not improve the hash (TCP over IPv4), but that conditional complicates code without saving many cycles in practice, because it has to take place after flow dissector. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* dlink: dl2k: use the module_pci_driver macroDevendra Naga2012-10-311-15/+1
| | | | | | | | use the module_pci_driver macro to make the code simpler by eliminating module_init and module_exit calls. Signed-off-by: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* realtek: r8169: use module_pci_driver macroDevendra Naga2012-10-311-12/+1
| | | | | | | | use the module_pci_driver macro to make the code simpler by eliminating the module_init and module_exit calls Signed-off-by: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller2012-10-3116-247/+238
|\ | | | | | | | | | | | | | | included changes: - some code cleanups and minor fixes (3 of them were reported by Coverity) - 'struct hard_iface' re-shaping to improve multi-protocol support - ECTP packets silent drop - transfer the WIFI flag on clients in case of roaming
| * batman-adv: add kernel-doc for enum batadv_dbg_levelAntonio Quartulli2012-10-291-4/+11
| | | | | | | | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: pass the WIFI flag from the local to global entryAntonio Quartulli2012-10-292-9/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in case of client roaming a new global entry is added while a corresponding local one is still present. In this case the node can safely pass the WIFI flag from the local to the global entry. This change is required to let the AP-isolation correctly working in case of roaming: if a generic WIFI client C roams from node A to B, A adds a global entry for C without adding any WIFI flag. The latter will be set only later, once A has received C's advertisement from B. In this time period the AP-Isolation (if enabled) would not correctly work since C is not marked as WIFI, so allowing it to communicate with other WIFI clients. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: properly convert flag into a boolean valueAntonio Quartulli2012-10-291-1/+1
| | | | | | | | | | | | | | In order to properly convert a bitwise AND to a boolean value, the whole expression must be prepended by "!!". Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: check for more space before accessing the skbAntonio Quartulli2012-10-291-2/+6
| | | | | | | | | | | | | | | | In batadv_check_unicast_ttvn() the code accesses both the unicast header and the Ethernet header in the payload. For this reason pskb_may_pull() must be invoked to check for the required space. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: print packets re-routing on DBG_TT and ratelimit itAntonio Quartulli2012-10-291-4/+4
| | | | | | | | | | | | | | | | | | | | | | To simplify TranslationTable debugging it is better to print the packet rerouting message on the DBG_TT log level. In this way a developer interested in packets rerouting doesn't need to filter it out of the whole ROUTES log. Moreover, since this message will appear for each rerouted message, it is now "ratelimited". Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: properly store the roaming timeAntonio Quartulli2012-10-291-0/+6
| | | | | | | | | | | | | | | | | | | | in case of a new global entry added because of roaming, the roam_at field must be properly initiated with the current time. This value will be later use to purge this entry out on time out (if nobody claims it). Instead roam_at field is now set to zero in this situation leading to an immediate purging of the related entry. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: don't allow ECTP traffic on batman-advSimon Wunderlich2012-10-291-2/+10
| | | | | | | | | | | | | | | | | | We have seen this to break networks when used with bridge loop avoidance. As we can't see any benefit from sending these ancient frames via our mesh, we just drop them. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: Only increase refcounter once for alternate routerSven Eckelmann2012-10-291-15/+8
| | | | | | | | | | | | | | | | | | | | | | The test whether we can use a router for alternating bonding should only be done once because it is already known that it is still usable and will not be deleted from the list soon. This patch addresses Coverity #712285: Unchecked return value Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: Check return value of try_module_getSven Eckelmann2012-10-294-18/+12
| | | | | | | | | | | | | | | | | | | | New operations should not be started when they need an increased module reference counter and try_module_get failed. This patch addresses Coverity #712284: Unchecked return value Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: Remove extra check in batadv_bit_get_packetSven Eckelmann2012-10-291-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | batadv_bit_get_packet checks for all common situations before it decides that the new received packet indicates that the host was restarted. This extra condition check at the end of the function is not necessary because this condition is always true. This patch addresses Coverity #712296: Logically dead code Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: Set special lockdep classes to avoid lockdep warningSven Eckelmann2012-10-291-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Transmissions over batman-adv devices always start another nested transmission over devices attached to the batman-adv interface. These devices usually use the ethernet lockdep class for the tx_queue lock which is also set by default for all batman-adv devices. Lockdep will detect a nested locking attempt of two locks with the same class and warn about a possible deadlock. This is the default and expected behavior and should not alarm the locking correctness prove mechanism. Therefore, the locks for all netdevice specific tx queues get a special batman-adv lock class to avoid a false positive for each transmission. Reported-by: Linus Luessing <linus.luessing@web.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: return proper value in case of hash_add failureAntonio Quartulli2012-10-291-1/+1
| | | | | | | | | | | | | | In case of hash_add failure tt_global_add() must return 0 (which means on entry insertion). Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: consolidate duplicated primary_if checking codeMarek Lindner2012-10-296-91/+57
| | | | | | | | | | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: Remove unused define BAT_ATTR_HIF_UINTSven Eckelmann2012-10-291-49/+0
| | | | | | | | | | | | Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: split hard_iface struct for each routing protocolMarek Lindner2012-10-293-24/+39
| | | | | | | | | | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: use check_unicast_packet() in recv_roam_adv()Antonio Quartulli2012-10-291-14/+1
| | | | | | | | | | | | | | | | To avoid code duplication and to simplify further changes, check_unicast_packet() is now used in recv_roam_adv() to check for not well formed packets and so discard them. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | qla3xxx: remove unused variable in ql_process_mac_tx_intr()Wei Yongjun2012-10-311-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | The variable retval is initialized but never used otherwise, so remove the unused variable. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | qla3xxx: use module_pci_driver to simplify the codeWei Yongjun2012-10-311-12/+1
| | | | | | | | | | | | | | | | | | | | | | | | Use the module_pci_driver() macro to make the code simpler by eliminating module_init and module_exit calls. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | smsc95xx: add wol support for more frame typesSteve Glendinning2012-10-313-6/+128
| | | | | | | | | | | | | | | | | | | | This patch adds support for wol wakeup on unicast, broadcast, multicast and arp frames. The wakeup filter code isn't pretty, but it works. Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/ipv4/ipconfig: add device address to a KERN_INFO messageClaudio Fontana2012-10-311-2/+4
| | | | | | | | | | | | | | | | adds a "hwaddr" to the "IP-Config: Complete" KERN_INFO message with the dev_addr of the device selected for auto configuration. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ixgbe: add setlink, getlink support to ixgbe and ixgbevfJohn Fastabend2012-10-315-3/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the net device ops to manage the embedded hardware bridge on ixgbe devices. With this patch the bridge mode can be toggled between VEB and VEPA to support stacking macvlan devices or using the embedded switch without any SW component in 802.1Qbg/br environments. Additionally, this adds source address pruning to the ixgbevf driver to prune any frames sent back from a reflective relay on the switch. This is required because the existing hardware does not support this. Without it frames get pushed into the stack with its own src mac which is invalid per 802.1Qbg VEPA definition. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: set and query VEB/VEPA bridge mode via PF_BRIDGEJohn Fastabend2012-10-314-7/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hardware switches may support enabling and disabling the loopback switch which puts the device in a VEPA mode defined in the IEEE 802.1Qbg specification. In this mode frames are not switched in the hardware but sent directly to the switch. SR-IOV capable NICs will likely support this mode I am aware of at least two such devices. Also I am told (but don't have any of this hardware available) that there are devices that only support VEPA modes. In these cases it is important at a minimum to be able to query these attributes. This patch adds an additional IFLA_BRIDGE_MODE attribute that can be set and dumped via the PF_BRIDGE:{SET|GET}LINK operations. Also anticipating bridge attributes that may be common for both embedded bridges and software bridges this adds a flags attribute IFLA_BRIDGE_FLAGS currently used to determine if the command or event is being generated to/from an embedded bridge or software bridge. Finally, the event generation is pulled out of the bridge module and into rtnetlink proper. For example using the macvlan driver in VEPA mode on top of an embedded switch requires putting the embedded switch into a VEPA mode to get the expected results. -------- -------- | VEPA | | VEPA | <-- macvlan vepa edge relays -------- -------- | | | | ------------------ | VEPA | <-- embedded switch in NIC ------------------ | | ------------------- | external switch | <-- shiny new physical ------------------- switch with VEPA support A packet sent from the macvlan VEPA at the top could be loopbacked on the embedded switch and never seen by the external switch. So in order for this to work the embedded switch needs to be set in the VEPA state via the above described commands. By making these attributes nested in IFLA_AF_SPEC we allow future extensions to be made as needed. CC: Lennert Buytenhek <buytenh@wantstofly.org> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: create generic bridge opsJohn Fastabend2012-10-315-60/+108
|/ | | | | | | | | | | | | | | | | | | The PF_BRIDGE:RTM_{GET|SET}LINK nlmsg family and type are currently embedded in the ./net/bridge module. This prohibits them from being used by other bridging devices. One example of this being hardware that has embedded bridging components. In order to use these nlmsg types more generically this patch adds two net_device_ops hooks. One to set link bridge attributes and another to dump the current bride attributes. ndo_bridge_setlink() ndo_bridge_getlink() CC: Lennert Buytenhek <buytenh@wantstofly.org> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sierra: shut up sparse restricted type warningsBjørn Mork2012-10-281-2/+2
| | | | | | | | | | | | | | | Removes the warnings drivers/net/usb/sierra_net.c:343:45: warning: incorrect type in assignment (different base types) drivers/net/usb/sierra_net.c:343:45: expected unsigned short [unsigned] [short] [usertype] <noident> drivers/net/usb/sierra_net.c:343:45: got restricted __be16 [usertype] <noident> and drivers/net/usb/sierra_net.c:658:18: warning: cast to restricted __le16 Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: cdc_ncm: error path lock fixBjørn Mork2012-10-281-0/+2
| | | | | | | | | Fixes the sparse warning drivers/net/usb/cdc_ncm.c:836:9: warning: context imbalance in 'cdc_ncm_txpath_bh' - different lock contexts for basic block Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: cdc_ncm: big endian fixBjørn Mork2012-10-281-1/+1
| | | | | | | | | | | | Probably doesn't matter much since the value is used as a boolean anyway, but it removes the sparse warning: drivers/net/usb/cdc_ncm.c:1090:32: warning: incorrect type in assignment (different base types) drivers/net/usb/cdc_ncm.c:1090:32: expected unsigned short [unsigned] [usertype] connected drivers/net/usb/cdc_ncm.c:1090:32: got restricted __le16 [usertype] wValue Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnl/ipv4: add support of RTM_GETNETCONFNicolas Dichtel2012-10-281-2/+73
| | | | | | | This message allows to get the devconf for an interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnl/ipv4: use netconf msg to advertise forwarding statusNicolas Dichtel2012-10-282-4/+91
| | | | | Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnl/ipv6: add support of RTM_GETNETCONFNicolas Dichtel2012-10-281-2/+73
| | | | | | | This message allows to get the devconf for an interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnl/ipv6: use netconf msg to advertise forwarding statusNicolas Dichtel2012-10-282-0/+79
| | | | | Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnl: add a new type of msg to advertise protocol configurationNicolas Dichtel2012-10-282-0/+27
| | | | | | | | | | A new type is added to allow userland to monitor protocol configuration, like IPv4 or IPv6. For example, monitoring the state of the forwarding status of an interface of the system. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller2012-10-2617-362/+502
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== The following changeset contains updates for IPVS from Jesper Dangaard Brouer that did not reach the previous merge window in time. More specifically, updates to improve IPv6 support in IPVS. More relevantly, some of the existing code performed wrong handling of the extensions headers and better fragmentation handling. Jesper promised more follow-up patches to refine this after this batch hits net-next. Yet to come. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipvs: fix build errors related to config option combinationsJesper Dangaard Brouer2012-10-231-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix two build error introduced by commit 63dca2c0: "ipvs: Fix faulty IPv6 extension header handling in IPVS" First build error was fairly trivial and can occur, when CONFIG_IP_VS_IPV6 is disabled. The second build error was tricky, and can occur when deselecting both all Netfilter and IPVS, but selecting CONFIG_IPV6. This is caused by "kernel/sysctl_binary.c" including "net/ip_vs.h", which includes "linux/netfilter_ipv6/ip6_tables.h" causing include of "include/linux/netfilter/x_tables.h" which then cannot find the typedef nf_hookfn. Fix this by only including "linux/netfilter_ipv6/ip6_tables.h" in case of CONFIG_IP_VS_IPV6 as its already used to guard the usage of ipv6_find_hdr(). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * Merge branch 'master' of ↵Pablo Neira Ayuso2012-10-2217-362/+501
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next Pull updates from Jesper Dangaard Brouer for IPVS mostly targeted to improve IPv6 support (7 commits): ipvs: Trivial changes, use compressed IPv6 address in output ipvs: IPv6 extend ICMPv6 handling for future types ipvs: Use config macro IS_ENABLED() ipvs: Fix faulty IPv6 extension header handling in IPVS ipvs: Complete IPv6 fragment handling for IPVS ipvs: API change to avoid rescan of IPv6 exthdr ipvs: SIP fragment handling
| | * ipvs: SIP fragment handlingJesper Dangaard Brouer2012-09-281-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the nfct_reasm SKB if available. Based on part of a patch from: Hans Schillstrom I have left Hans'es comment in the patch (marked /HS) Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> [ horms@verge.net.au: Fix comment style ] Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: API change to avoid rescan of IPv6 exthdrJesper Dangaard Brouer2012-09-289-229/+175
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce the number of times we scan/skip the IPv6 exthdrs. This patch contains a lot of API changes. This is done, to avoid repeating the scan of finding the IPv6 headers, via ipv6_find_hdr(), which is called by ip_vs_fill_iph_skb(). Finding the IPv6 headers is done as early as possible, and passed on as a pointer "struct ip_vs_iphdr *" to the affected functions. This patch reduce/removes 19 calls to ip_vs_fill_iph_skb(). Notice, I have choosen, not to change the API of function pointer "(*schedule)" (in struct ip_vs_scheduler) as it can be used by external schedulers, via {un,}register_ip_vs_scheduler. Only 4 out of 10 schedulers use info from ip_vs_iphdr*, and when they do, they are only interested in iph->{s,d}addr. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: Complete IPv6 fragment handling for IPVSJesper Dangaard Brouer2012-09-285-36/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPVS now supports fragmented packets, with support from nf_conntrack_reasm.c Based on patch from: Hans Schillstrom. IPVS do like conntrack i.e. use the skb->nfct_reasm (i.e. when all fragments is collected, nf_ct_frag6_output() starts a "re-play" of all fragments into the interrupted PREROUTING chain at prio -399 (NF_IP6_PRI_CONNTRACK_DEFRAG+1) with nfct_reasm pointing to the assembled packet.) Notice, module nf_defrag_ipv6 must be loaded for this to work. Report unhandled fragments, and recommend user to load nf_defrag_ipv6. To handle fw-mark for fragments. Add a new IPVS hook into prerouting chain at prio -99 (NF_IP6_PRI_NAT_DST+1) to catch fragments, and copy fw-mark info from the first packet with an upper layer header. IPv6 fragment handling should be the last thing on the IPVS IPv6 missing support list. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: Fix faulty IPv6 extension header handling in IPVSJesper Dangaard Brouer2012-09-2813-153/+214
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPv6 packets can contain extension headers, thus its wrong to assume that the transport/upper-layer header, starts right after (struct ipv6hdr) the IPv6 header. IPVS uses this false assumption, and will write SNAT & DNAT modifications at a fixed pos which will corrupt the message. To fix this, proper header position must be found before modifying packets. Introducing ip_vs_fill_iph_skb(), which uses ipv6_find_hdr() to skip the exthdrs. It finds (1) the transport header offset, (2) the protocol, and (3) detects if the packet is a fragment. Note, that fragments in IPv6 is represented via an exthdr. Thus, this is detected while skipping through the exthdrs. This patch depends on commit 84018f55a: "netfilter: ip6_tables: add flags parameter to ipv6_find_hdr()" This also adds a dependency to ip6_tables. Originally based on patch from: Hans Schillstrom kABI notes: Changing struct ip_vs_iphdr is a potential minor kABI breaker, because external modules can be compiled with another version of this struct. This should not matter, as they would most-likely be using a compiled-in version of ip_vs_fill_iphdr(). When recompiled, they will notice ip_vs_fill_iphdr() no longer exists, and they have to used ip_vs_fill_iph_skb() instead. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: Use config macro IS_ENABLED()Jesper Dangaard Brouer2012-09-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup patch. Use the IS_ENABLED macro, instead of having to check both the build and the module CONFIG_ option. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: IPv6 extend ICMPv6 handling for future typesJesper Dangaard Brouer2012-09-281-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend handling of ICMPv6, to all none Informational Messages (via ICMPV6_INFOMSG_MASK). This actually only extend our handling to type ICMPV6_PARAMPROB (Parameter Problem), and future types. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: Trivial changes, use compressed IPv6 address in outputJesper Dangaard Brouer2012-09-285-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | Have not converted the proc file output to compressed IPv6 addresses. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* | | net: Update args to dummy sock_update_classid().David S. Miller2012-10-261-1/+1
| | | | | | | | | | | | | | | | | | Only the real implementation got updated. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | l2tp: session is an array not a pointerAlan Cox2012-10-261-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | isdn: remove dead codeAlan Cox2012-10-261-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | multi is assigned to 0 and then acts as a constant. Remove the dead code. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | cgroup: net_cls: Rework update socket logicDaniel Wagner2012-10-263-11/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cgroup logic part of net_cls is very similar as the one in net_prio. Let's stream line the net_cls logic with the net_prio one. The net_prio update logic was changed by following commit (note there were some changes necessary later on) commit 406a3c638ce8b17d9704052c07955490f732c2b8 Author: John Fastabend <john.r.fastabend@intel.com> Date: Fri Jul 20 10:39:25 2012 +0000 net: netprio_cgroup: rework update socket logic Instead of updating the sk_cgrp_prioidx struct field on every send this only updates the field when a task is moved via cgroup infrastructure. This allows sockets that may be used by a kernel worker thread to be managed. For example in the iscsi case today a user can put iscsid in a netprio cgroup and control traffic will be sent with the correct sk_cgrp_prioidx value set but as soon as data is sent the kernel worker thread isssues a send and sk_cgrp_prioidx is updated with the kernel worker threads value which is the default case. It seems more correct to only update the field when the user explicitly sets it via control group infrastructure. This allows the users to manage sockets that may be used with other threads. Since classid is now updated when the task is moved between the cgroups, we don't have to call sock_update_classid() from various places to ensure we always using the latest classid value. [v2: Use iterate_fd() instead of open coding] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Li Zefan <lizefan@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Joe Perches <joe@perches.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: <netdev@vger.kernel.org> Cc: <cgroups@vger.kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>