summaryrefslogtreecommitdiffstats
path: root/net/rxrpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'next' of ↵Linus Torvalds2016-05-191-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing of modules and firmware to be loaded from a specific device (this is from ChromeOS, where the device as a whole is verified cryptographically via dm-verity). This is disabled by default but can be configured to be enabled by default (don't do this if you don't know what you're doing). - Keys: allow authentication data to be stored in an asymmetric key. Lots of general fixes and updates. - SELinux: add restrictions for loading of kernel modules via finit_module(). Distinguish non-init user namespace capability checks. Apply execstack check on thread stacks" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits) LSM: LoadPin: provide enablement CONFIG Yama: use atomic allocations when reporting seccomp: Fix comment typo ima: add support for creating files using the mknodat syscall ima: fix ima_inode_post_setattr vfs: forbid write access when reading a file into memory fs: fix over-zealous use of "const" selinux: apply execstack check on thread stacks selinux: distinguish non-init user namespace capability checks LSM: LoadPin for kernel file loading restrictions fs: define a string representation of the kernel_read_file_id enumeration Yama: consolidate error reporting string_helpers: add kstrdup_quotable_file string_helpers: add kstrdup_quotable_cmdline string_helpers: add kstrdup_quotable selinux: check ss_initialized before revalidating an inode label selinux: delay inode label lookup as long as possible selinux: don't revalidate an inode's label when explicitly setting it selinux: Change bool variable name to index. KEYS: Add KEYCTL_DH_COMPUTE command ...
| * KEYS: Add a facility to restrict new links into a keyringDavid Howells2016-04-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a facility whereby proposed new links to be added to a keyring can be vetted, permitting them to be rejected if necessary. This can be used to block public keys from which the signature cannot be verified or for which the signature verification fails. It could also be used to provide blacklisting. This affects operations like add_key(), KEYCTL_LINK and KEYCTL_INSTANTIATE. To this end: (1) A function pointer is added to the key struct that, if set, points to the vetting function. This is called as: int (*restrict_link)(struct key *keyring, const struct key_type *key_type, unsigned long key_flags, const union key_payload *key_payload), where 'keyring' will be the keyring being added to, key_type and key_payload will describe the key being added and key_flags[*] can be AND'ed with KEY_FLAG_TRUSTED. [*] This parameter will be removed in a later patch when KEY_FLAG_TRUSTED is removed. The function should return 0 to allow the link to take place or an error (typically -ENOKEY, -ENOPKG or -EKEYREJECTED) to reject the link. The pointer should not be set directly, but rather should be set through keyring_alloc(). Note that if called during add_key(), preparse is called before this method, but a key isn't actually allocated until after this function is called. (2) KEY_ALLOC_BYPASS_RESTRICTION is added. This can be passed to key_create_or_update() or key_instantiate_and_link() to bypass the restriction check. (3) KEY_FLAG_TRUSTED_ONLY is removed. The entire contents of a keyring with this restriction emplaced can be considered 'trustworthy' by virtue of being in the keyring when that keyring is consulted. (4) key_alloc() and keyring_alloc() take an extra argument that will be used to set restrict_link in the new key. This ensures that the pointer is set before the key is published, thus preventing a window of unrestrictedness. Normally this argument will be NULL. (5) As a temporary affair, keyring_restrict_trusted_only() is added. It should be passed to keyring_alloc() as the extra argument instead of setting KEY_FLAG_TRUSTED_ONLY on a keyring. This will be replaced in a later patch with functions that look in the appropriate places for authoritative keys. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
* | net: udp: rename UDP_INC_STATS_BH()Eric Dumazet2016-04-271-2/+2
| | | | | | | | | | | | | | | | Rename UDP_INC_STATS_BH() to __UDP_INC_STATS(), and UDP6_INC_STATS_BH() to __UDP6_INC_STATS() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Create a null security type and get rid of conditional callsDavid Howells2016-04-119-61/+105
| | | | | | | | | | | | | | | | | | Create a null security type for security index 0 and get rid of all conditional calls to the security operations. We expect normally to be using security, so this should be of little negative impact. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Absorb the rxkad security moduleDavid Howells2016-04-116-134/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | Absorb the rxkad security module into the af_rxrpc module so that there's only one module file. This avoids a circular dependency whereby rxkad pins af_rxrpc and cached connections pin rxkad but can't be manually evicted (they will expire eventually and cease pinning). With this change, af_rxrpc can just be unloaded, despite having cached connections. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Don't assume transport address family and size when using itDavid Howells2016-04-112-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Don't assume transport address family and size when using the peer address to send a packet. Instead, use the start of the transport address rather than any particular element of the union and use the transport address length noted inside the sockaddr_rxrpc struct. This will be necessary when IPv6 support is introduced. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Don't pass gfp around in incoming call handling functionsDavid Howells2016-04-114-12/+9
| | | | | | | | | | | | | | | | | | Don't pass gfp around in incoming call handling functions, but rather hard code it at the points where we actually need it since the value comes from within the rxrpc driver and is always the same. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Differentiate local and remote abort codes in structsDavid Howells2016-04-118-21/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the rxrpc_connection and rxrpc_call structs, there's one field to hold the abort code, no matter whether that value was generated locally to be sent or was received from the peer via an abort packet. Split the abort code fields in two for cleanliness sake and add an error field to hold the Linux error number to the rxrpc_call struct too (sometimes this is generated in a context where we can't return it to userspace directly). Furthermore, add a skb mark to indicate a packet that caused a local abort to be generated so that recvmsg() can pick up the correct abort code. A future addition will need to be to indicate to userspace the difference between aborts via a control message. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Static arrays of strings should be const char *const[]David Howells2016-04-112-2/+2
| | | | | | | | | | | | | | Static arrays of strings should be const char *const[]. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Move some miscellaneous bits out into their own fileDavid Howells2016-04-115-84/+106
| | | | | | | | | | | | | | | | Move some miscellaneous bits out into their own file to make it easier to split the call handling. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Disable a debugging statement that has been left enabled.David Howells2016-04-111-1/+1
| | | | | | | | | | | | | | Disable a debugging statement that has been left enabled Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: do not pull udp headers on receiveWillem de Bruijn2016-04-111-2/+2
|/ | | | | | | | | | | | | | | Commit e6afc8ace6dd modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Rxrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2016-03-1919-607/+665
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ...
| * rxrpc: Replace all unsigned with unsigned intDavid Howells2016-03-138-39/+39
| | | | | | | | | | | | | | | | Replace all "unsigned" types with "unsigned int" types. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * rxrpc: Don't try to map ICMP to error as the lower layer already did thatDavid Howells2016-03-041-10/+0
| | | | | | | | | | | | | | In the ICMP message processing code, don't try to map ICMP codes to UNIX error codes as the caller (IPv4/IPv6) already did that for us (ee_errno). Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Clear the unused part of a sockaddr_rxrpc for memcmp() useDavid Howells2016-03-041-3/+5
| | | | | | | | | | | | | | Clear the unused part of a sockaddr_rxrpc structs so that memcmp() can be used to compare them. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: rxkad: Casts are needed when comparing be32 valuesDavid Howells2016-03-041-1/+1
| | | | | | | | | | | | | | Forced casts are needed to avoid sparse warning when directly comparing be32 values. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: rxkad: The version number in the response should be net byte orderDavid Howells2016-03-041-8/+9
| | | | | | | | | | | | | | | | | | The version number rxkad places in the response should be network byte order. Whilst we're at it, rearrange the code to be more readable. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Use ACCESS_ONCE() when accessing circular buffer pointersDavid Howells2016-03-041-4/+7
| | | | | | | | | | | | | | | | | | Use ACCESS_ONCE() when accessing the other-end pointer into a circular buffer as it's possible the other-end pointer might change whilst we're doing this, and if we access it twice, we might get some weird things happening. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Adjust some whitespace and commentsDavid Howells2016-03-047-29/+22
| | | | | | | | | | | | | | Remove some excess whitespace, insert some missing spaces and adjust a couple of comments. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Be more selective about the types of received packets we acceptDavid Howells2016-03-041-1/+2
| | | | | | | | | | | | | | | | Currently, received RxRPC packets outside the range 1-13 are rejected. There are, however, holes in the range that should also be rejected - plus at least one type we don't yet support - so reject these also. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Fix defined range for /proc/sys/net/rxrpc/rx_mtuDavid Howells2016-03-041-1/+1
| | | | | | | | | | | | | | | | The upper bound of the defined range for rx_mtu is being set in the same member as the lower bound (extra1) rather than the correct place (extra2). I'm not entirely sure why this compiles. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: The protocol family should be set to PF_RXRPC not PF_UNIXDavid Howells2016-03-041-1/+1
| | | | | | | | | | | | | | Fix the protocol family set in the proto_ops for rxrpc to be PF_RXRPC not PF_UNIX. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Keep the skb private record of the Rx header in host byte orderDavid Howells2016-03-0417-381/+431
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, a copy of the Rx packet header is copied into the the sk_buff private data so that we can advance the pointer into the buffer, potentially discarding the original. At the moment, this copy is held in network byte order, but this means we're doing a lot of unnecessary translations. The reasons it was done this way are that we need the values in network byte order occasionally and we can use the copy, slightly modified, as part of an iov array when sending an ack or an abort packet. However, it seems more reasonable on review that it would be better kept in host byte order and that we make up a new header when we want to send another packet. To this end, rename the original header struct to rxrpc_wire_header (with BE fields) and institute a variant called rxrpc_host_header that has host order fields. Change the struct in the sk_buff private data into an rxrpc_host_header and translate the values when filling it in. This further allows us to keep values kept in various structures in host byte order rather than network byte order and allows removal of some fields that are byteswapped duplicates. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Rename call events to begin RXRPC_CALL_EV_David Howells2016-03-0410-101/+101
| | | | | | | | | | | | | | Rename call event names to begin RXRPC_CALL_EV_ to distinguish them from the flags. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Convert call flag and event numbers into enumsDavid Howells2016-03-043-47/+65
| | | | | | | | | | | | | | | | | | | | Convert call flag and event numbers into enums and move their definitions outside of the struct. Also move the call state enum outside of the struct and add an extra element to count the number of states. Signed-off-by: David Howells <dhowells@redhat.com>
| * rxrpc: Fix a case where a call event bit is being used as a flag bitDavid Howells2016-03-041-1/+1
| | | | | | | | | | | | | | Fix a case where RXRPC_CALL_RELEASE (an event) is being used to specify a flag bit. RXRPC_CALL_RELEASED should be used instead. Signed-off-by: David Howells <dhowells@redhat.com>
* | rxrpc: Use skcipherHerbert Xu2016-01-273-72/+114
|/ | | | | | This patch replaces uses of blkcipher with skcipher. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2016-01-121-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from Davic Miller: 1) Support busy polling generically, for all NAPI drivers. From Eric Dumazet. 2) Add byte/packet counter support to nft_ct, from Floriani Westphal. 3) Add RSS/XPS support to mvneta driver, from Gregory Clement. 4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes Frederic Sowa. 5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai. 6) Add support for VLAN device bridging to mlxsw switch driver, from Ido Schimmel. 7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski. 8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko. 9) Reorganize wireless drivers into per-vendor directories just like we do for ethernet drivers. From Kalle Valo. 10) Provide a way for administrators "destroy" connected sockets via the SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti. 11) Add support to add/remove multicast routes via netlink, from Nikolay Aleksandrov. 12) Make TCP keepalive settings per-namespace, from Nikolay Borisov. 13) Add forwarding and packet duplication facilities to nf_tables, from Pablo Neira Ayuso. 14) Dead route support in MPLS, from Roopa Prabhu. 15) TSO support for thunderx chips, from Sunil Goutham. 16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon. 17) Rationalize, consolidate, and more completely document the checksum offloading facilities in the networking stack. From Tom Herbert. 18) Support aborting an ongoing scan in mac80211/cfg80211, from Vidyullatha Kanchanapally. 19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits) net: bnxt: always return values from _bnxt_get_max_rings net: bpf: reject invalid shifts phonet: properly unshare skbs in phonet_rcv() dwc_eth_qos: Fix dma address for multi-fragment skbs phy: remove an unneeded condition mdio: remove an unneed condition mdio_bus: NULL dereference on allocation error net: Fix typo in netdev_intersect_features net: freescale: mac-fec: Fix build error from phy_device API change net: freescale: ucc_geth: Fix build error from phy_device API change bonding: Prevent IPv6 link local address on enslaved devices IB/mlx5: Add flow steering support net/mlx5_core: Export flow steering API net/mlx5_core: Make ipv4/ipv6 location more clear net/mlx5_core: Enable flow steering support for the IB driver net/mlx5_core: Initialize namespaces only when supported by device net/mlx5_core: Set priority attributes net/mlx5_core: Connect flow tables net/mlx5_core: Introduce modify flow table command net/mlx5_core: Managing root flow table ...
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-12-032-2/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/renesas/ravb_main.c kernel/bpf/syscall.c net/ipv4/ipmr.c All three conflicts were cases of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: Generalise wq_has_sleeper helperHerbert Xu2015-11-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The memory barrier in the helper wq_has_sleeper is needed by just about every user of waitqueue_active. This patch generalises it by making it take a wait_queue_head_t directly. The existing helper is renamed to skwq_has_sleeper. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | convert a bunch of open-coded instances of memdup_user_nul()Al Viro2016-01-041-18/+6
| |/ |/| | | | | | | | | | | A _lot_ of ->write() instances were open-coding it; some are converted to memdup_user_nul(), a lot more remain... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATAEric Dumazet2015-12-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a cleanup to make following patch easier to review. Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA from (struct socket)->flags to a (struct socket_wq)->flags to benefit from RCU protection in sock_wake_async() To ease backports, we rename both constants. Two new helpers, sk_set_bit(int nr, struct sock *sk) and sk_clear_bit(int net, struct sock *sk) are added so that following patch can change their implementation. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rxrpc: Correctly handle ack at end of client call transmit phaseDavid Howells2015-11-241-1/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Normally, the transmit phase of a client call is implicitly ack'd by the reception of the first data packet of the response being received. However, if a security negotiation happens, the transmit phase, if it is entirely contained in a single packet, may get an ack packet in response and then may get aborted due to security negotiation failure. Because the client has shifted state to RXRPC_CALL_CLIENT_AWAIT_REPLY due to having transmitted all the data, the code that handles processing of the received ack packet doesn't note the hard ack the data packet. The following abort packet in the case of security negotiation failure then incurs an assertion failure when it tries to drain the Tx queue because the hard ack state is out of sync (hard ack means the packets have been processed and can be discarded by the sender; a soft ack means that the packets are received but could still be discarded and rerequested by the receiver). To fix this, we should record the hard ack we received for the ack packet. The assertion failure looks like: RxRPC: Assertion failed 1 <= 0 is false 0x1 <= 0x0 is false ------------[ cut here ]------------ kernel BUG at ../net/rxrpc/ar-ack.c:431! ... RIP: 0010:[<ffffffffa006857b>] [<ffffffffa006857b>] rxrpc_rotate_tx_window+0xbc/0x131 [af_rxrpc] ... Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mm, page_alloc: distinguish between being unable to sleep, unwilling to ↵Mel Gorman2015-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'next' of ↵Linus Torvalds2015-11-055-28/+28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem update from James Morris: "This is mostly maintenance updates across the subsystem, with a notable update for TPM 2.0, and addition of Jarkko Sakkinen as a maintainer of that" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits) apparmor: clarify CRYPTO dependency selinux: Use a kmem_cache for allocation struct file_security_struct selinux: ioctl_has_perm should be static selinux: use sprintf return value selinux: use kstrdup() in security_get_bools() selinux: use kmemdup in security_sid_to_context_core() selinux: remove pointless cast in selinux_inode_setsecurity() selinux: introduce security_context_str_to_sid selinux: do not check open perm on ftruncate call selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default KEYS: Merge the type-specific data with the payload data KEYS: Provide a script to extract a module signature KEYS: Provide a script to extract the sys cert list from a vmlinux file keys: Be more consistent in selection of union members used certs: add .gitignore to stop git nagging about x509_certificate_list KEYS: use kvfree() in add_key Smack: limited capability for changing process label TPM: remove unnecessary little endian conversion vTPM: support little endian guests char: Drop owner assignment from i2c_driver ...
| * KEYS: Merge the type-specific data with the payload dataDavid Howells2015-10-215-28/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge the type-specific data with the payload data into one four-word chunk as it seems pointless to keep them separate. Use user_key_payload() for accessing the payloads of overloaded user-defined keys. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cifs@vger.kernel.org cc: ecryptfs@vger.kernel.org cc: linux-ext4@vger.kernel.org cc: linux-f2fs-devel@lists.sourceforge.net cc: linux-nfs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: linux-ima-devel@lists.sourceforge.net
* | rxrpc: Replace get_seconds with ktime_get_secondsKsenija Stanojevic2015-09-203-6/+6
|/ | | | | | | | | | Replace time_t type and get_seconds function which are not y2038 safe on 32-bit systems. Function ktime_get_seconds use monotonic instead of real time and therefore will not cause overflow. Signed-off-by: Ksenija Stanojevic <ksenija.stanojevic@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Pass kern from net_proto_family.create to sk_allocEric W. Biederman2015-05-111-1/+1
| | | | | | | | | In preparation for changing how struct net is refcounted on kernel sockets pass the knowledge that we are creating a kernel socket from sock_create_kern through to sk_alloc. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add a struct net parameter to sock_create_kernEric W. Biederman2015-05-111-2/+2
| | | | | | | | This is long overdue, and is part of cleaning up how we allocate kernel sockets that don't reference count struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* new helper: msg_data_left()Al Viro2015-04-111-10/+9
| | | | | | convert open-coded instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge remote-tracking branch 'dh/afs' into for-davemAl Viro2015-04-114-29/+148
|\
| * RxRPC: Handle VERSION Rx protocol packetsDavid Howells2015-04-013-1/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle VERSION Rx protocol packets. We should respond to a VERSION packet with a string indicating the Rx version. This is a maximum of 64 characters and is padded out to 65 chars with NUL bytes. Note that other AFS clients use the version request as a NAT keepalive so we need to handle it rather than returning an abort. The standard formulation seems to be: <project> <version> built <yyyy>-<mm>-<dd> for example: " OpenAFS 1.6.2 built 2013-05-07 " (note the three extra spaces) as obtained with: rxdebug grand.mit.edu -version from the openafs package. Signed-off-by: David Howells <dhowells@redhat.com>
| * RxRPC: Use iov_iter_count() in rxrpc_send_data() instead of the len argumentDavid Howells2015-04-011-13/+11
| | | | | | | | | | | | | | Use iov_iter_count() in rxrpc_send_data() to get the remaining data length instead of using the len argument as the len argument is now redundant. Signed-off-by: David Howells <dhowells@redhat.com>
| * RxRPC: Don't call skb_add_data() if there's no data to copyDavid Howells2015-04-011-19/+19
| | | | | | | | | | | | | | Don't call skb_add_data() in rxrpc_send_data() if there's no data to copy and also skip the calculations associated with it in such a case. Signed-off-by: David Howells <dhowells@redhat.com>
| * RxRPC: Fix the conversion to iov_iterDavid Howells2015-04-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit: commit af2b040e470b470bfc881981db3c796072853eae Author: Al Viro <viro@zeniv.linux.org.uk> Date: Thu Nov 27 21:44:24 2014 -0500 Subject: rxrpc: switch rxrpc_send_data() to iov_iter primitives incorrectly changes a do-while loop into a while loop in rxrpc_send_data(). Unfortunately, at least one pass through the loop is required - even if there is no data - so that the packet the closes the send phase can be sent if MSG_MORE is not set. Signed-off-by: David Howells <dhowells@redhat.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-03-201-1/+1
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
| * rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg()Al Viro2015-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [I would really like an ACK on that one from dhowells; it appears to be quite straightforward, but...] MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit in there. It gets passed via flags; in fact, another such check in the same function is done correctly - as flags & MSG_PEEK. It had been that way (effectively disabled) for 8 years, though, so the patch needs beating up - that case had never been tested. If it is correct, it's -stable fodder. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-03-091-2/+2
|\| | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/cadence/macb.c Overlapping changes in macb driver, mostly fixes and cleanups in 'net' overlapping with the integration of at91_ether into macb in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * ip: fix error queue empty skb handlingWillem de Bruijn2015-03-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading from the error queue, msg_name and msg_control are only populated for some errors. A new exception for empty timestamp skbs added a false positive on icmp errors without payload. `traceroute -M udpconn` only displayed gateways that return payload with the icmp error: the embedded network headers are pulled before sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise. Fix this regression by refining when msg_name and msg_control branches are taken. The solutions for the two fields are independent. msg_name only makes sense for errors that configure serr->port and serr->addr_offset. Test the first instead of skb->len. This also fixes another issue. saddr could hold the wrong data, as serr->addr_offset is not initialized in some code paths, pointing to the start of the network header. It is only valid when serr->port is set (non-zero). msg_control support differs between IPv4 and IPv6. IPv4 only honors requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The skb->len test can simply be removed, because skb->dev is also tested and never true for empty skbs. IPv6 honors requests for all errors aside from local errors and timestamps on empty skbs. In both cases, make the policy more explicit by moving this logic to a new function that decides whether to process msg_control and that optionally prepares the necessary fields in skb->cb[]. After this change, the IPv4 and IPv6 paths are more similar. The last case is rxrpc. Here, simply refine to only match timestamps. Fixes: 49ca0d8bfaf3 ("net-timestamp: no-payload option") Reported-by: Jan Niehusmann <jan@gondor.com> Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes v1->v2 - fix local origin test inversion in ip6_datagram_support_cmsg - make v4 and v6 code paths more similar by introducing analogous ipv4_datagram_support_cmsg - fix compile bug in rxrpc Signed-off-by: David S. Miller <davem@davemloft.net>