summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
* rxrpc: Use negative error codes in rxrpc_call structDavid Howells2017-04-069-27/+27
| | | | | | | | Use negative error codes in struct rxrpc_call::error because that's what the kernel normally deals with and to make the code consistent. We only turn them positive when transcribing into a cmsg for userspace recvmsg. Signed-off-by: David Howells <dhowells@redhat.com>
* Merge tag 'linux-can-next-for-4.13-20170404' of ↵David S. Miller2017-04-056-176/+203
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2017-03-03 this is a pull request of 5 patches for net-next/master. There are two patches by Yegor Yefremov which convert the ti_hecc driver into a DT only driver, as there is no in-tree user of the old platform driver interface anymore. The next patch by Mario Kicherer adds network namespace support to the can subsystem. The last two patches by Akshay Bhat add support for the holt_hi311x SPI CAN driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * can: initial support for network namespacesMario Kicherer2017-04-046-176/+203
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds initial support for network namespaces. The changes only enable support in the CAN raw, proc and af_can code. GW and BCM still have their checks that ensure that they are used only from the main namespace. The patch boils down to moving the global structures, i.e. the global filter list and their /proc stats, into a per-namespace structure and passing around the corresponding "struct net" in a lot of different places. Changes since v1: - rebased on current HEAD (2bfe01e) - fixed overlong line Signed-off-by: Mario Kicherer <dev@kicherer.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* | rtnl: Add support for netdev event to link messagesVlad Yasevich2017-04-052-10/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When netdev events happen, a rtnetlink_event() handler will send messages for every event in it's white list. These messages contain current information about a particular device, but they do not include the iformation about which event just happened. The consumer of the message has to try to infer this information. In some cases (ex: NETDEV_NOTIFY_PEERS), that is not possible. This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT that would have an encoding of the which event triggered this message. This would allow the the message consumer to easily determine if it is interested in a particular event or not. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rtnetlink: Convert rtnetlink_event to white listVlad Yasevich2017-04-051-14/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rtnetlink_event currently functions as a blacklist where we block cerntain netdev events from being sent to user space. As a result, events have been added to the system that userspace probably doesn't care about. This patch converts the implementation to the white list so that newly events would have to be specifically added to the list to be sent to userspace. This would force new event implementers to consider whether a given event is usefull to user space or if it's just a kernel event. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: tcp: Define the TCP_MAX_WSCALE instead of literal number 14Gao Feng2017-04-053-12/+11
| | | | | | | | | | | | | | | | | | | | Define one new macro TCP_MAX_WSCALE instead of literal number '14', and use U16_MAX instead of 65535 as the max value of TCP window. There is another minor change, use rounddown(space, mss) instead of (space / mss) * mss; Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink/diag: report flags for netlink socketsAndrey Vagin2017-04-053-8/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | cb_running is reported in /proc/self/net/netlink and it is reported by the ss tool, when it gets information from the proc files. sock_diag is a new interface which is used instead of proc files, so it looks reasonable that this interface has to report no less information about sockets than proc files. We use these flags to dump and restore netlink sockets. Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: sched: choke: remove some dead codeDan Carpenter2017-04-051-6/+0
|/ | | | | | | | | We accidentally left this dead code behind after commit 5952fde10c35 ("net: sched: choke: remove dead filter classify code"). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* soreuseport: use "unsigned int" in __reuseport_alloc()Alexey Dobriyan2017-04-031-2/+2
| | | | | | | | | | | | | | | | | Number of sockets is limited by 16-bit, so 64-bit allocation will never happen. 16-bit ops are the worst code density-wise on x86_64 because of additional prefix (66). Space savings: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-3 (-3) function old new delta reuseport_add_sock 539 536 -3 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flowcache: more "unsigned int"Alexey Dobriyan2017-04-031-6/+7
| | | | | | | | | | | | Make ->hash_count, ->low_watermark and ->high_watermark unsigned int and propagate unsignedness to other variables. This change doesn't change code generation because these fields aren't used in 64-bit contexts but make it anyway: these fields can't be negative numbers. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flowcache: make flow_cache_hash_size() return "unsigned int"Alexey Dobriyan2017-04-031-6/+8
| | | | | | | | | | | | | | | | | | Hash size can't negative so "unsigned int" is logically correct. Propagate "unsigned int" to loop counters. Space savings: add/remove: 0/0 grow/shrink: 2/2 up/down: 6/-18 (-12) function old new delta flow_cache_flush_tasklet 362 365 +3 __flow_cache_shrink 333 336 +3 flow_cache_cpu_up_prep 178 171 -7 flow_cache_lookup 1159 1148 -11 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flowcache: make flow_key_size() return "unsigned int"Alexey Dobriyan2017-04-031-3/+3
| | | | | | | | | | | | | | Flow keys aren't 4GB+ numbers so 64-bit arithmetic is excessive. Space savings (I'm not sure what CSWTCH is): add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-48 (-48) function old new delta flow_cache_lookup 1163 1159 -4 CSWTCH 75997 75953 -44 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctpXin Long2017-04-033-2/+81
| | | | | | | | | | | | | | | | | | Before when implementing sctp prsctp, SCTP_PR_STREAM_STATUS wasn't added, as it needs to save abandoned_(un)sent for every stream. After sctp stream reconf is added in sctp, assoc has structure sctp_stream_out to save per stream info. This patch is to add SCTP_PR_STREAM_STATUS by putting the prsctp per stream statistics into sctp_stream_out. v1->v2: fix an indent issue. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: tcp: canonical connection order for all paths with index > 0Sowmini Varadhan2017-04-021-1/+1
| | | | | | | | | | | | | The rds_connect_worker() has a bug in the check that enforces the canonical connection order described in the comments of rds_tcp_state_change(). The intention is to make sure that all the multipath connections are always initiated by the smaller IP address via rds_start_mprds. To achieve this, rds_connection_worker should check that cp_index > 0. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: tcp: allow progress of rds_conn_shutdown if the rds_connection is ↵Sowmini Varadhan2017-04-021-1/+9
| | | | | | | | | | | | | | | | marked ERROR by an intervening FIN rds_conn_shutdown() runs in workq context, and marks the rds_connection as DISCONNECTING before quiescing Tx/Rx paths. However, after all I/O has quiesced, we may still find the rds_connection state to be RDS_CONN_ERROR if an intervening FIN was processed in softirq context. This is not a fatal error: rds_conn_shutdown() should continue the shutdown, and there is no need to log noisy messages about this event. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Increase max number of labels for lwt encapDavid Ahern2017-04-013-8/+15
| | | | | | | | | | | | Alow users to push down more labels per MPLS encap. Similar to LSR case, move label array to the end of mpls_iptunnel_encap and allocate based on the number of labels for the route. For consistency with the LSR case, re-use the same maximum number of labels. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: bump maximum number of labelsDavid Ahern2017-04-012-34/+71
| | | | | | | | | | | | | Allow users to push down more labels per MPLS route. With the previous patches, no memory allocations are based on MAX_NEW_LABELS; the limit is only used to keep userspace in check. At this point MAX_NEW_LABELS is only used for mpls_route_config (copying route data from userspace) and processing nexthops looking for the max number of labels across the route spec. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Limit memory allocation for mpls_routeDavid Ahern2017-04-011-10/+21
| | | | | | | Limit memory allocation size for mpls_route to 4096. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: change mpls_route layoutDavid Ahern2017-04-012-31/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move labels to the end of mpls_nh as a 0-sized array and within mpls_route move the via for a nexthop after the mpls_nh. The new layout becomes: +----------------------+ | mpls_route | +----------------------+ | mpls_nh 0 | +----------------------+ | alignment padding | 4 bytes for odd number of labels; 0 for even +----------------------+ | via[rt_max_alen] 0 | +----------------------+ | alignment padding | via's aligned on sizeof(unsigned long) +----------------------+ | ... | +----------------------+ | mpls_nh n-1 | +----------------------+ | via[rt_max_alen] n-1 | +----------------------+ Memory allocated for nexthop + via is constant across all nexthops and their via. It is based on the maximum number of labels across all nexthops and the maximum via length. The size is saved in the mpls_route as rt_nh_size. Accessing a nexthop becomes rt->rt_nh + index * rt->rt_nh_size. The offset of the via address from a nexthop is saved as rt_via_offset so that given an mpls_nh pointer the via for that hop is simply nh + rt->rt_via_offset. With prior code, memory allocated per mpls_route with 1 nexthop: via is an ethernet address - 64 bytes via is an ipv4 address - 64 via is an ipv6 address - 72 With this patch set, memory allocated per mpls_route with 1 nexthop and 1 or 2 labels: via is an ethernet address - 56 bytes via is an ipv4 address - 56 via is an ipv6 address - 64 The 8-byte reduction is due to the previous patch; the change introduced by this patch has no impact on the size of allocations for 1 or 2 labels. Performance impact of this change was examined using network namespaces with veth pairs connecting namespaces. ns0 inserts the packet to the label-switched path using an lwt route with encap mpls. ns1 adds 1 or 2 labels depending on test, ns2 (and ns3 for 2-label test) pops the label and forwards. ns3 (or ns4) for a 2-label is the destination. Similar series of namespaces used for 2-nexthop test. Intent is to measure changes to latency (overhead in manipulating the packet) in the forwarding path. Tests used netperf with UDP_RR. IPv4: current patches 1 label, 1 nexthop 29908 30115 2 label, 1 nexthop 29071 29612 1 label, 2 nexthop 29582 29776 2 label, 2 nexthop 29086 29149 IPv6: current patches 1 label, 1 nexthop 24502 24960 2 label, 1 nexthop 24041 24407 1 label, 2 nexthop 23795 23899 2 label, 2 nexthop 23074 22959 In short, the change has no effect to a modest increase in performance. This is expected since this patch does not really have an impact on routes with 1 or 2 labels (the current limit) and 1 or 2 nexthops. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Convert number of nexthops to u8David Ahern2017-04-012-13/+20
| | | | | | | | | | | | | Number of nexthops and number of alive nexthops are tracked using an unsigned int. A route should never have more than 255 nexthops so convert both to u8. Update all references and intermediate variables to consistently use u8 as well. Shrinks the size of mpls_route from 32 bytes to 24 bytes with a 2-byte hole before the nexthops. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCEDavid Ahern2017-04-012-11/+33
| | | | | | | | | | | | | | | | The number of alive nexthops for a route (rt->rt_nhn_alive) and the flags for a next hop (nh->nh_flags) are modified by netdev event handlers. The event handlers run with rtnl_lock held so updates are always done with the lock held. The packet path accesses the fields under the rcu lock. Since those fields can change at any moment in the packet path, both fields should be accessed using READ_ONCE. Updates to both fields should use WRITE_ONCE. Update mpls_select_multipath (packet path) and mpls_ifdown and mpls_ifup (event handlers) accordingly. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: fix build error with devlink build as moduleTobias Regnery2017-04-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 96567d5dacf4 ("net: dsa: dsa2: Add basic support of devlink") I see the following link error with CONFIG_NET_DSA=y and CONFIG_NET_DEVLINK=m: net/built-in.o: In function 'dsa_register_switch': (.text+0xe226b): undefined reference to `devlink_alloc' net/built-in.o: In function 'dsa_register_switch': (.text+0xe2284): undefined reference to `devlink_register' net/built-in.o: In function 'dsa_register_switch': (.text+0xe243e): undefined reference to `devlink_port_register' net/built-in.o: In function 'dsa_register_switch': (.text+0xe24e1): undefined reference to `devlink_port_register' net/built-in.o: In function 'dsa_register_switch': (.text+0xe24fa): undefined reference to `devlink_port_type_eth_set' net/built-in.o: In function 'dsa_dst_unapply.part.8': dsa2.c:(.text.unlikely+0x345): undefined reference to 'devlink_port_unregister' dsa2.c:(.text.unlikely+0x36c): undefined reference to 'devlink_port_unregister' dsa2.c:(.text.unlikely+0x38e): undefined reference to 'devlink_port_unregister' dsa2.c:(.text.unlikely+0x3f2): undefined reference to 'devlink_unregister' dsa2.c:(.text.unlikely+0x3fb): undefined reference to 'devlink_free' Fix this by adding a dependency on MAY_USE_DEVLINK so that CONFIG_NET_DSA get switched to be build as module when CONFIG_NET_DEVLINK=m. Fixes: 96567d5dacf4 ("net: dsa: dsa2: Add basic support of devlink") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: introduce BPF_PROG_TEST_RUN commandAlexei Starovoitov2017-04-014-1/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | development and testing of networking bpf programs is quite cumbersome. Despite availability of user space bpf interpreters the kernel is the ultimate authority and execution environment. Current test frameworks for TC include creation of netns, veth, qdiscs and use of various packet generators just to test functionality of a bpf program. XDP testing is even more complicated, since qemu needs to be started with gro/gso disabled and precise queue configuration, transferring of xdp program from host into guest, attaching to virtio/eth0 and generating traffic from the host while capturing the results from the guest. Moreover analyzing performance bottlenecks in XDP program is impossible in virtio environment, since cost of running the program is tiny comparing to the overhead of virtio packet processing, so performance testing can only be done on physical nic with another server generating traffic. Furthermore ongoing changes to user space control plane of production applications cannot be run on the test servers leaving bpf programs stubbed out for testing. Last but not least, the upstream llvm changes are validated by the bpf backend testsuite which has no ability to test the code generated. To improve this situation introduce BPF_PROG_TEST_RUN command to test and performance benchmark bpf programs. Joint work with Daniel Borkmann. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: add cross-chip bridging operationsVivien Didelot2017-04-011-6/+6
| | | | | | | | Introduce crosschip_bridge_{join,leave} operations in the dsa_switch_ops structure, which can be used by switches supporting interconnection. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sock: avoid dirtying sk_stamp, if possiblePaolo Abeni2017-03-301-1/+1
| | | | | | | | | | | | | | | | | sock_recv_ts_and_drops() unconditionally set sk->sk_stamp for every packet, even if the SOCK_TIMESTAMP flag is not set in the related socket. If selinux is enabled, this cause a cache miss for every packet since sk->sk_stamp and sk->sk_security share the same cacheline. With this change sk_stamp is set only if the SOCK_TIMESTAMP flag is set, and is cleared for the first packet, so that the user perceived behavior is unchanged. This gives up to 5% speed-up under udp-flood with small packets. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: tcp: Refine the __tcp_select_windowGao Feng2017-03-301-5/+3
| | | | | | | | | | | | | | | 1. Move the "window = tp->rcv_wnd;" into the condition block without tp->rx_opt.rcv_wscale. Because it is unnecessary when enable wscale; 2. Use the macro ALIGN instead of two statements. The two statements are used to make window align to 1<<wscale. Use the ALIGN is more clearer. 3. Use the rounddown to make codes clearer. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* VSOCK: remove unnecessary ternary operator on return valueColin Ian King2017-03-301-15/+7
| | | | | | | | | | | | | | Rather than assign the positive errno values to ret and then checking if it is positive and flip the sign, just return the errno value. Detected by CoverityScan, CID#986649 ("Logically Dead Code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers: add explicit interrupt.h includesFlorian Westphal2017-03-302-0/+2
| | | | | | | | | | | These files all use functions declared in interrupt.h, but currently rely on implicit inclusion of this file (via netns/xfrm.h). That won't work anymore when the flow cache is removed so include that header where needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* tipc: allow rdm/dgram socketpairsErik Hugne2017-03-291-4/+16
| | | | | | | | | | for socketpairs using connectionless transport, we cache the respective node local TIPC portid to use in subsequent calls to send() in the socket's private data. Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tipc: add support for stream/seqpacket socketpairsErik Hugne2017-03-291-2/+12
| | | | | | | | | sockets A and B are connected back-to-back, similar to what AF_UNIX does. Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Update lfib_nlmsg_size to skip deleted nexthopsDavid Ahern2017-03-291-0/+2
| | | | | | | | | | | A recent commit skips nexthops in a route if the device has been deleted. Update lfib_nlmsg_size accordingly. Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: dsa2: Add basic support of devlinkAndrew Lunn2017-03-281-2/+45
| | | | | | | | | Register the switch and its ports with devlink. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: break include loop netdevice.h, dsa.h, devlink.hAndrew Lunn2017-03-2811-2/+13
| | | | | | | | | | | | | | | | | | There is an include loop between netdevice.h, dsa.h, devlink.h because of NETDEV_ALIGN, making it impossible to use devlink structures in dsa.h. Break this loop by taking dsa.h out of netdevice.h, add a forward declaration of dsa_switch_tree and netdev_set_default_ethtool_ops() function, which is what netdevice.h requires. No longer having dsa.h in netdevice.h means the includes in dsa.h no longer get included. This breaks a few other files which depend on these includes. Add these directly in the affected file. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Send netconf messages on device register and unregisterDavid Ahern2017-03-281-5/+11
| | | | | | | | Send netconf notifications for MPLS when the device registers and unregisters. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net:mpls: Refactor mpls_netconf_notify_devconf to take eventDavid Ahern2017-03-281-7/+5
| | | | | | | Refactor mpls_netconf_notify_devconf to take the event as an input arg. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: Add support for RTM_DELNETCONFDavid Ahern2017-03-281-5/+15
| | | | | | | | | | | Send RTM_DELNETCONF notifications when a device is deleted. The message only needs the device index, so modify inet6_netconf_fill_devconf to skip devconf references if it is NULL. Allows a userspace cache to remove entries as devices are deleted. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: Refactor inet6_netconf_notify_devconf to take eventDavid Ahern2017-03-282-15/+27
| | | | | | | Refactor inet6_netconf_notify_devconf to take the event as an input arg. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: devinet: Add support for RTM_DELNETCONFDavid Ahern2017-03-281-11/+21
| | | | | | | | | | | Send RTM_DELNETCONF notifications when a device is deleted. The message only needs the device index, so modify inet_netconf_fill_devconf to skip devconf references if it is NULL. Allows a userspace cache to remove entries as devices are deleted. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: devinet: Refactor inet_netconf_notify_devconf to take eventDavid Ahern2017-03-282-17/+27
| | | | | | | Refactor inet_netconf_notify_devconf to take the event as an input arg. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: fix copyright holderVivien Didelot2017-03-281-1/+2
| | | | | | | | | | | I do not hold the copyright of the DSA core and drivers source files, since these changes have been written as an initiative of my day job. Fix this. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: add support for NETDEV_RESEND_IGMP eventVlad Yasevich2017-03-282-2/+55
| | | | | | | | This patch adds support for NETDEV_RESEND_IGMP event similar to how it works for IPv4. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tipc: adjust the policy of holding subscription krefYing Xue2017-03-283-6/+7
| | | | | | | | | | | | | | | | | | When a new subscription object is inserted into name_seq->subscriptions list, it's under name_seq->lock protection; when a subscription is deleted from the list, it's also under the same lock protection; similarly, when accessing a subscription by going through subscriptions list, the entire process is also protected by the name_seq->lock. Therefore, if subscription refcount is increased before it's inserted into subscriptions list, and its refcount is decreased after it's deleted from the list, it will be unnecessary to hold refcount at all before accessing subscription object which is obtained by going through subscriptions list under name_seq->lock protection. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tipc: advance the time of deleting subscription from subscriber->subscrp_listYing Xue2017-03-281-7/+2
| | | | | | | | | | | | | | | | | | After a subscription object is created, it's inserted into its subscriber subscrp_list list under subscriber lock protection, similarly, before it's destroyed, it should be first removed from its subscriber->subscrp_list. Since the subscription list is accessed with subscriber lock, all the subscriptions are valid during the lock duration. Hence in tipc_subscrb_subscrp_delete(), we remove subscription get/put and the extra subscriber unlock/lock. After this change, the subscriptions refcount cleanup is very simple and does not access any lock. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* soc: qcom: smd: Transition client drivers from smd to rpmsgBjorn Andersson2017-03-282-23/+21
| | | | | | | | | | | | | | | | By moving these client drivers to use RPMSG instead of the direct SMD API we can reuse them ontop of the newly added GLINK wire-protocol support found in the 820 and 835 Qualcomm platforms. As the new (RPMSG-based) and old SMD implementations are mutually exclusive we have to change all client drivers in one commit, to make sure we have a working system before and after this transition. Acked-by: Andy Gross <andy.gross@linaro.org> Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* devlink: Support for pipeline debug (dpipe)Arkadi Sharshevsky2017-03-281-0/+836
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pipeline debug is used to export the pipeline abstractions for the main objects - tables, headers and entries. The only support for set is for changing the counter parameter on specific table. The basic structures: Header - can represent a real protocol header information or internal metadata. Generic protocol headers like IPv4 can be shared between drivers. Each driver can add local headers. Field - part of a header. Can represent protocol field or specific ASIC metadata field. Hardware special metadata fields can be mapped to different resources, for example switch ASIC ports can have internal number which from the systems point of view is mapped to netdeivce ifindex. Match - represent specific match rule. Can describe match on specific field or header. The header index should be specified as well in order to support several header instances of the same type (tunneling). Action - represents specific action rule. Actions can describe operations on specific field values for example like set, increment, etc. And header operation like add and delete. Value - represents value which can be associated with specific match or action. Table - represents a hardware block which can be described with match/ action behavior. The match/action can be done on the packets data or on the internal metadata that it gathered along the packets traversal throw the pipeline which is vendor specific and should be exported in order to provide understanding of ASICs behavior. Entry - represents single record in a specific table. The entry is identified by specific combination of values for match/action. Prior to accessing the tables/entries the drivers provide the header/ field data base which is used by driver to user-space. The data base is split between the shared headers and unique headers. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: sr: select DST_CACHE by defaultDavid Lebrun2017-03-272-18/+1
| | | | | | | | When CONFIG_IPV6_SEG6_LWTUNNEL is selected, automatically select DST_CACHE. This allows to remove multiple ifdefs. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Delete route when all nexthops have been deletedDavid Ahern2017-03-271-1/+8
| | | | | | | | | When all devices for all nexthops in a route have been deleted, the route is effectively dead, so remove it. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mpls: Don't show nexthop if device has been deletedDavid Ahern2017-03-271-3/+5
| | | | | | | | | | | | | | | | | | | | | | | If the device for a nexthop in a multipath route is deleted, the nexthop is effectively removed from the route. Currently, a route dump still returns the nexhop though without the device set: $ ip -f mpls ro ls 100 nexthopvia inet 10.11.1.2 dev br0 nexthopvia inet 10.100.3.1 dev eth3 $ ip li del br0 $ ip -f mpls ro ls 100 nexthopvia inet 10.11.1.2 dev * dead linkdown nexthopvia inet 10.100.3.1 dev eth3 Since the nexthop is effectively deleted, drop the hop from the route dump. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Introduce SO_INCOMING_NAPI_IDSridhar Samudrala2017-03-241-0/+12
| | | | | | | | | | | | | | | This socket option returns the NAPI ID associated with the queue on which the last frame is received. This information can be used by the apps to split the incoming flows among the threads based on the Rx queue on which they are received. If the NAPI ID actually represents a sender_cpu then the value is ignored and 0 is returned. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Commonize busy polling code to focus on napi_id instead of socketSridhar Samudrala2017-03-242-13/+19
| | | | | | | | | | | | Move the core functionality in sk_busy_loop() to napi_busy_loop() and make it independent of sk. This enables re-using this function in epoll busy loop implementation. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>