summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* binder: implement binderfsChristian Brauner2018-12-192-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed at Linux Plumbers Conference 2018 in Vancouver [1] this is the implementation of binderfs. /* Abstract */ binderfs is a backwards-compatible filesystem for Android's binder ipc mechanism. Each ipc namespace will mount a new binderfs instance. Mounting binderfs multiple times at different locations in the same ipc namespace will not cause a new super block to be allocated and hence it will be the same filesystem instance. Each new binderfs mount will have its own set of binder devices only visible in the ipc namespace it has been mounted in. All devices in a new binderfs mount will follow the scheme binder%d and numbering will always start at 0. /* Backwards compatibility */ Devices requested in the Kconfig via CONFIG_ANDROID_BINDER_DEVICES for the initial ipc namespace will work as before. They will be registered via misc_register() and appear in the devtmpfs mount. Specifically, the standard devices binder, hwbinder, and vndbinder will all appear in their standard locations in /dev. Mounting or unmounting the binderfs mount in the initial ipc namespace will have no effect on these devices, i.e. they will neither show up in the binderfs mount nor will they disappear when the binderfs mount is gone. /* binder-control */ Each new binderfs instance comes with a binder-control device. No other devices will be present at first. The binder-control device can be used to dynamically allocate binder devices. All requests operate on the binderfs mount the binder-control device resides in. Assuming a new instance of binderfs has been mounted at /dev/binderfs via mount -t binderfs binderfs /dev/binderfs. Then a request to create a new binder device can be made as illustrated in [2]. Binderfs devices can simply be removed via unlink(). /* Implementation details */ - dynamic major number allocation: When binderfs is registered as a new filesystem it will dynamically allocate a new major number. The allocated major number will be returned in struct binderfs_device when a new binder device is allocated. - global minor number tracking: Minor are tracked in a global idr struct that is capped at BINDERFS_MAX_MINOR. The minor number tracker is protected by a global mutex. This is the only point of contention between binderfs mounts. - struct binderfs_info: Each binderfs super block has its own struct binderfs_info that tracks specific details about a binderfs instance: - ipc namespace - dentry of the binder-control device - root uid and root gid of the user namespace the binderfs instance was mounted in - mountable by user namespace root: binderfs can be mounted by user namespace root in a non-initial user namespace. The devices will be owned by user namespace root. - binderfs binder devices without misc infrastructure: New binder devices associated with a binderfs mount do not use the full misc_register() infrastructure. The misc_register() infrastructure can only create new devices in the host's devtmpfs mount. binderfs does however only make devices appear under its own mountpoint and thus allocates new character device nodes from the inode of the root dentry of the super block. This will have the side-effect that binderfs specific device nodes do not appear in sysfs. This behavior is similar to devpts allocated pts devices and has no effect on the functionality of the ipc mechanism itself. [1]: https://goo.gl/JL2tfX [2]: program to allocate a new binderfs binder device: #define _GNU_SOURCE #include <errno.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <sys/types.h> #include <unistd.h> #include <linux/android/binder_ctl.h> int main(int argc, char *argv[]) { int fd, ret, saved_errno; size_t len; struct binderfs_device device = { 0 }; if (argc < 2) exit(EXIT_FAILURE); len = strlen(argv[1]); if (len > BINDERFS_MAX_NAME) exit(EXIT_FAILURE); memcpy(device.name, argv[1], len); fd = open("/dev/binderfs/binder-control", O_RDONLY | O_CLOEXEC); if (fd < 0) { printf("%s - Failed to open binder-control device\n", strerror(errno)); exit(EXIT_FAILURE); } ret = ioctl(fd, BINDER_CTL_ADD, &device); saved_errno = errno; close(fd); errno = saved_errno; if (ret < 0) { printf("%s - Failed to allocate new binder device\n", strerror(errno)); exit(EXIT_FAILURE); } printf("Allocated new binder device with major %d, minor %d, and " "name %s\n", device.major, device.minor, device.name); exit(EXIT_SUCCESS); } Cc: Martijn Coenen <maco@android.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Todd Kjos <tkjos@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* binder: fix use-after-free due to ksys_close() during fdget()Todd Kjos2018-12-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | 44d8047f1d8 ("binder: use standard functions to allocate fds") exposed a pre-existing issue in the binder driver. fdget() is used in ksys_ioctl() as a performance optimization. One of the rules associated with fdget() is that ksys_close() must not be called between the fdget() and the fdput(). There is a case where this requirement is not met in the binder driver which results in the reference count dropping to 0 when the device is still in use. This can result in use-after-free or other issues. If userpace has passed a file-descriptor for the binder driver using a BINDER_TYPE_FDA object, then kys_close() is called on it when handling a binder_ioctl(BC_FREE_BUFFER) command. This violates the assumptions for using fdget(). The problem is fixed by deferring the close using task_work_add(). A new variant of __close_fd() was created that returns a struct file with a reference. The fput() is deferred instead of using ksys_close(). Fixes: 44d8047f1d87a ("binder: use standard functions to allocate fds") Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Todd Kjos <tkjos@google.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'thunderbolt-for-v4.21' of ↵Greg Kroah-Hartman2018-12-102-0/+16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into char-misc-next Mika writes: thunderbolt: Changes for v4.21 merge window * tag 'thunderbolt-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: thunderbolt: Export IOMMU based DMA protection support to userspace iommu/vt-d: Do not enable ATS for untrusted devices iommu/vt-d: Force IOMMU on for platform opt in hint PCI / ACPI: Identify untrusted PCI devices
| * iommu/vt-d: Force IOMMU on for platform opt in hintLu Baolu2018-12-051-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel VT-d spec added a new DMA_CTRL_PLATFORM_OPT_IN_FLAG flag in DMAR ACPI table [1] for BIOS to report compliance about platform initiated DMA restricted to RMRR ranges when transferring control to the OS. This means that during OS boot, before it enables IOMMU none of the connected devices can bypass DMA protection for instance by overwriting the data structures used by the IOMMU. The OS also treats this as a hint that the IOMMU should be enabled to prevent DMA attacks from possible malicious devices. A use of this flag is Kernel DMA protection for Thunderbolt [2] which in practice means that IOMMU should be enabled for PCIe devices connected to the Thunderbolt ports. With IOMMU enabled for these devices, all DMA operations are limited in the range reserved for it, thus the DMA attacks are prevented. All these devices are enumerated in the PCI/PCIe module and marked with an untrusted flag. This forces IOMMU to be enabled if DMA_CTRL_PLATFORM_OPT_IN_FLAG is set in DMAR ACPI table and there are PCIe devices marked as untrusted in the system. This can be turned off by adding "intel_iommu=off" in the kernel command line, if any problems are found. [1] https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf [2] https://docs.microsoft.com/en-us/windows/security/information-protection/kernel-dma-protection-for-thunderbolt Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de>
| * PCI / ACPI: Identify untrusted PCI devicesMika Westerberg2018-12-051-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A malicious PCI device may use DMA to attack the system. An external Thunderbolt port is a convenient point to attach such a device. The OS may use IOMMU to defend against DMA attacks. Some BIOSes mark these externally facing root ports with this ACPI _DSD [1]: Name (_DSD, Package () { ToUUID ("efcc06cc-73ac-4bc3-bff0-76143807c389"), Package () { Package () {"ExternalFacingPort", 1}, Package () {"UID", 0 } } }) If we find such a root port, mark it and all its children as untrusted. The rest of the OS may use this information to enable DMA protection against malicious devices. For instance the device may be put behind an IOMMU to keep it from accessing memory outside of what the driver has allocated for it. While at it, add a comment on top of prp_guids array explaining the possible caveat resulting when these GUIDs are treated equivalent. [1] https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
* | Merge 4.20-rc6 into char-misc-nextGreg Kroah-Hartman2018-12-1016-42/+109
|\ \ | | | | | | | | | | | | | | | This should resolve the hv driver merge conflict. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-12-096-25/+75
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "A decent batch of fixes here. I'd say about half are for problems that have existed for a while, and half are for new regressions added in the 4.20 merge window. 1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach. 2) Revert bogus emac driver change, from Benjamin Herrenschmidt. 3) Handle BPF exported data structure with pointers when building 32-bit userland, from Daniel Borkmann. 4) Memory leak fix in act_police, from Davide Caratti. 5) Check RX checksum offload in RX descriptors properly in aquantia driver, from Dmitry Bogdanov. 6) SKB unlink fix in various spots, from Edward Cree. 7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from Eric Dumazet. 8) Fix FID leak in mlxsw driver, from Ido Schimmel. 9) IOTLB locking fix in vhost, from Jean-Philippe Brucker. 10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory limits otherwise namespace exit can hang. From Jiri Wiesner. 11) Address block parsing length fixes in x25 from Martin Schiller. 12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan. 13) For tun interfaces, only iface delete works with rtnl ops, enforce this by disallowing add. From Nicolas Dichtel. 14) Use after free in liquidio, from Pan Bian. 15) Fix SKB use after passing to netif_receive_skb(), from Prashant Bhole. 16) Static key accounting and other fixes in XPS from Sabrina Dubroca. 17) Partially initialized flow key passed to ip6_route_output(), from Shmulik Ladkani. 18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas Falcon. 19) Several small TCP fixes (off-by-one on window probe abort, NULL deref in tail loss probe, SNMP mis-estimations) from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) net/sched: cls_flower: Reject duplicated rules also under skip_sw bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips. bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips. bnxt_en: Keep track of reserved IRQs. bnxt_en: Fix CNP CoS queue regression. net/mlx4_core: Correctly set PFC param if global pause is turned off. Revert "net/ibm/emac: wrong bit is used for STA control" neighbour: Avoid writing before skb->head in neigh_hh_output() ipv6: Check available headroom in ip6_xmit() even without options tcp: lack of available data can also cause TSO defer ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl mlxsw: spectrum_router: Relax GRE decap matching check mlxsw: spectrum_switchdev: Avoid leaking FID's reference count mlxsw: spectrum_nve: Remove easily triggerable warnings ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes sctp: frag_point sanity check tcp: fix NULL ref in tail loss probe tcp: Do not underestimate rwnd_limited net: use skb_list_del_init() to remove from RX sublists ...
| | * | neighbour: Avoid writing before skb->head in neigh_hh_output()Stefano Brivio2018-12-071-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While skb_push() makes the kernel panic if the skb headroom is less than the unaligned hardware header size, it will proceed normally in case we copy more than that because of alignment, and we'll silently corrupt adjacent slabs. In the case fixed by the previous patch, "ipv6: Check available headroom in ip6_xmit() even without options", we end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware header and write 16 bytes, starting 2 bytes before the allocated buffer. Always check we're not writing before skb->head and, if the headroom is not enough, warn and drop the packet. v2: - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet (Eric Dumazet) - if we avoid the panic, though, we need to explicitly check the headroom before the memcpy(), otherwise we'll have corrupted slabs on a running kernel, after we warn - use __skb_push() instead of skb_push(), as the headroom check is already implemented here explicitly (Eric Dumazet) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sctp: frag_point sanity checkJakub Audykowicz2018-12-051-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If for some reason an association's fragmentation point is zero, sctp_datamsg_from_user will try to endlessly try to divide a message into zero-sized chunks. This eventually causes kernel panic due to running out of memory. Although this situation is quite unlikely, it has occurred before as reported. I propose to add this simple last-ditch sanity check due to the severity of the potential consequences. Signed-off-by: Jakub Audykowicz <jakub.audykowicz@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2018-12-052-19/+44
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexei Starovoitov says: ==================== pull-request: bpf 2018-12-05 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) fix bpf uapi pointers for 32-bit architectures, from Daniel. 2) improve verifer ability to handle progs with a lot of branches, from Alexei. 3) strict btf checks, from Yonghong. 4) bpf_sk_lookup api cleanup, from Joe. 5) other misc fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | bpf: Improve socket lookup reuseport documentationJoe Stringer2018-11-301-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the wording around socket lookup for reuseport sockets, and ensure that both bpf.h headers are in sync. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | | * | bpf: Support sk lookup in netns with id 0Joe Stringer2018-11-301-14/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Ahern and Nicolas Dichtel report that the handling of the netns id 0 is incorrect for the BPF socket lookup helpers: rather than finding the netns with id 0, it is resolving to the current netns. This renders the netns_id 0 inaccessible. To fix this, adjust the API for the netns to treat all negative s32 values as a lookup in the current netns (including u64 values which when truncated to s32 become negative), while any values with a positive value in the signed 32-bit integer space would result in a lookup for a socket in the netns corresponding to that id. As before, if the netns with that ID does not exist, no socket will be found. Any netns outside of these ranges will fail to find a corresponding socket, as those values are reserved for future usage. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Joey Pabalinas <joeypabalinas@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | | * | bpf: fix pointer offsets in context for 32 bitDaniel Borkmann2018-11-302-5/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, pointer offsets in three BPF context structures are broken in two scenarios: i) 32 bit compiled applications running on 64 bit kernels, and ii) LLVM compiled BPF programs running on 32 bit kernels. The latter is due to BPF target machine being strictly 64 bit. So in each of the cases the offsets will mismatch in verifier when checking / rewriting context access. Fix this by providing a helper macro __bpf_md_ptr() that will enforce padding up to 64 bit and proper alignment, and for context access a macro bpf_ctx_range_ptr() which will cover full 64 bit member range on 32 bit archs. For flow_keys, we additionally need to force the size check to sizeof(__u64) as with other pointer types. Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data") Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | | sctp: kfree_rcu asocXin Long2018-12-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In sctp_hash_transport/sctp_epaddr_lookup_transport, it dereferences a transport's asoc under rcu_read_lock while asoc is freed not after a grace period, which leads to a use-after-free panic. This patch fixes it by calling kfree_rcu to make asoc be freed after a grace period. Note that only the asoc's memory is delayed to free in the patch, it won't cause sk to linger longer. Thanks Neil and Marcelo to make this clear. Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable") Fixes: cd2b70875058 ("sctp: check duplicate node before inserting a new transport") Reported-by: syzbot+0b05d8aa7cb185107483@syzkaller.appspotmail.com Reported-by: syzbot+aad231d51b1923158444@syzkaller.appspotmail.com Suggested-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | net: phy: sfp: correct location of SFP standardsBaruch Siach2018-11-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SFP standards are now available from the SNIA (Storage Networking Industry Association) website. Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge tag 'char-misc-4.20-rc6' of ↵Linus Torvalds2018-12-091-0/+7
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small driver fixes for 4.20-rc6. There is a hyperv fix that for some reaon took forever to get into a shape that could be applied to the tree properly, but resolves a much reported issue. The others are some gnss patches, one a bugfix and the two others updates to the MAINTAINERS file to properly match the gnss files in the tree. All have been in linux-next for a while with no reported issues" * tag 'char-misc-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: MAINTAINERS: exclude gnss from SIRFPRIMA2 regex matching MAINTAINERS: add gnss scm tree gnss: sirf: fix activation retry handling Drivers: hv: vmbus: Offload the handling of channels to two workqueues
| | * | | | Drivers: hv: vmbus: Offload the handling of channels to two workqueuesDexuan Cui2018-12-031-0/+7
| | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vmbus_process_offer() mustn't call channel->sc_creation_callback() directly for sub-channels, because sc_creation_callback() -> vmbus_open() may never get the host's response to the OPEN_CHANNEL message (the host may rescind a channel at any time, e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind() may not wake up the vmbus_open() as it's blocked due to a non-zero vmbus_connection.offer_in_progress, and finally we have a deadlock. The above is also true for primary channels, if the related device drivers use sync probing mode by default. And, usually the handling of primary channels and sub-channels can depend on each other, so we should offload them to different workqueues to avoid possible deadlock, e.g. in sync-probing mode, NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() -> rtnl_lock(), and causes deadlock: the former gets the rtnl_lock and waits for all the sub-channels to appear, but the latter can't get the rtnl_lock and this blocks the handling of sub-channels. The patch can fix the multiple-NIC deadlock described above for v3.x kernels (e.g. RHEL 7.x) which don't support async-probing of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing but don't enable async-probing for Hyper-V drivers (yet). The patch can also fix the hang issue in sub-channel's handling described above for all versions of kernels, including v4.19 and v4.20-rc4. So actually the patch should be applied to all the existing kernels, not only the kernels that have 8195b1396ec8. Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Cc: stable@vger.kernel.org Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Merge tag 'usb-4.20-rc6' of ↵Linus Torvalds2018-12-092-2/+3
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 4.20-rc6 The "largest" here are some xhci fixes for reported issues. Also here is a USB core fix, some quirk additions, and a usb-serial fix which required the export of one of the tty layer's functions to prevent code duplication. The tty maintainer agreed with this change. All of these have been in linux-next with no reported issues" * tag 'usb-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: Prevent U1/U2 link pm states if exit latency is too long xhci: workaround CSS timeout on AMD SNPS 3.0 xHC USB: check usb_get_extra_descriptor for proper size USB: serial: console: fix reported terminal settings usb: quirk: add no-LPM quirk on SanDisk Ultra Flair device USB: Fix invalid-free bug in port_over_current_notify() usb: appledisplay: Add 27" Apple Cinema Display
| | * \ \ \ Merge tag 'usb-serial-4.20-rc6' of ↵Greg Kroah-Hartman2018-12-061-0/+1
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fix for v4.20-rc6 Here's a fix for a reported USB-console regression in 4.18 which revealed a long-standing bug in the console implementation. The patch has been in linux-next over night with no reported issues. Signed-off-by: Johan Hovold <johan@kernel.org> * tag 'usb-serial-4.20-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: console: fix reported terminal settings
| | | * | | | USB: serial: console: fix reported terminal settingsJohan Hovold2018-12-051-0/+1
| | | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The USB-serial console implementation has never reported the actual terminal settings used. Despite storing the corresponding cflags in its struct console, these were never honoured on later tty open() where the tty termios would be left initialised to the driver defaults. Unlike the serial console implementation, the USB-serial code calls subdriver open() already at console setup. While calling set_termios() and write() before open() looks like it could work for some USB-serial drivers, others definitely do not expect this, so modelling this after serial core is going to be intrusive, if at all possible. Instead, use a (renamed) tty helper to save the termios data used at console setup so that the tty termios reflects the actual terminal settings after a subsequent tty open(). Note that the calls to tty_init_termios() (tty_driver_install()) and tty_save_termios() are serialised using the disconnect mutex. This specifically fixes a regression that was triggered by a recent change adding software flow control to the pl2303 driver: a getty trying to disable flow control while leaving the baud rate unchanged would now also set the baud rate to the driver default (prior to the flow-control change this had been a noop). Fixes: 7041d9c3f01b ("USB: serial: pl2303: add support for tx xon/xoff flow control") Cc: stable <stable@vger.kernel.org> # 4.18 Cc: Florian Zumbiehl <florz@florz.de> Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
| | * / / / USB: check usb_get_extra_descriptor for proper sizeMathias Payer2018-12-051-2/+2
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading an extra descriptor, we need to properly check the minimum and maximum size allowed, to prevent from invalid data being sent by a device. Reported-by: Hui Peng <benquike@gmail.com> Reported-by: Mathias Payer <mathias.payer@nebelwelt.net> Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hui Peng <benquike@gmail.com> Signed-off-by: Mathias Payer <mathias.payer@nebelwelt.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Merge tag 'dax-fixes-4.20-rc6' of ↵Linus Torvalds2018-12-091-6/+8
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax fixes from Dan Williams: "The last of the known regression fixes and fallout from the Xarray conversion of the filesystem-dax implementation. On the path to debugging why the dax memory-failure injection test started failing after the Xarray conversion a couple more fixes for the dax_lock_mapping_entry(), now called dax_lock_page(), surfaced. Those plus the bug that started the hunt are now addressed. These patches have appeared in a -next release with no issues reported. Note the touches to mm/memory-failure.c are just the conversion to the new function signature for dax_lock_page(). Summary: - Fix the Xarray conversion of fsdax to properly handle dax_lock_mapping_entry() in the presense of pmd entries - Fix inode destruction racing a new lock request" * tag 'dax-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Fix unlock mismatch with updated API dax: Don't access a freed inode dax: Check page->mapping isn't NULL
| | * | | | dax: Fix unlock mismatch with updated APIMatthew Wilcox2018-12-041-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Internal to dax_unlock_mapping_entry(), dax_unlock_entry() is used to store a replacement entry in the Xarray at the given xas-index with the DAX_LOCKED bit clear. When called, dax_unlock_entry() expects the unlocked value of the entry relative to the current Xarray state to be specified. In most contexts dax_unlock_entry() is operating in the same scope as the matched dax_lock_entry(). However, in the dax_unlock_mapping_entry() case the implementation needs to recall the original entry. In the case where the original entry is a 'pmd' entry it is possible that the pfn performed to do the lookup is misaligned to the value retrieved in the Xarray. Change the api to return the unlock cookie from dax_lock_page() and pass it to dax_unlock_page(). This fixes a bug where dax_unlock_page() was assuming that the page was PMD-aligned if the entry was a PMD entry with signatures like: WARNING: CPU: 38 PID: 1396 at fs/dax.c:340 dax_insert_entry+0x2b2/0x2d0 RIP: 0010:dax_insert_entry+0x2b2/0x2d0 [..] Call Trace: dax_iomap_pte_fault.isra.41+0x791/0xde0 ext4_dax_huge_fault+0x16f/0x1f0 ? up_read+0x1c/0xa0 __do_fault+0x1f/0x160 __handle_mm_fault+0x1033/0x1490 handle_mm_fault+0x18b/0x3d0 Link: https://lkml.kernel.org/r/20181130154902.GL10377@bombadil.infradead.org Fixes: 9f32d221301c ("dax: Convert dax_lock_mapping_entry to XArray") Reported-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Tested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | Merge tag 'asm-generic-4.20' of ↵Linus Torvalds2018-12-081-0/+4
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fix from Arnd Bergmann: "Multiple people reported a bug I introduced in asm-generic/unistd.h in 4.20, this is the obvious bugfix to get glibc and others to correctly build again on new architectures that no longer provide the old fstatat64() family of system calls" * tag 'asm-generic-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: unistd.h: fixup broken macro include.
| | * | | | | asm-generic: unistd.h: fixup broken macro include.Guo Ren2018-12-061-0/+4
| | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The broken macros make the glibc compile error. If there is no __NR3264_fstat*, we should also removed related definitions. Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org> Fixes: bf4b6a7d371e ("y2038: Remove stat64 family from default syscall set") [arnd: Both Marcin and Guo provided this patch to fix up my clearly broken commit, I applied the version with the better changelog.] Signed-off-by: Guo Ren <ren_guo@c-sky.com> Signed-off-by: Mao Han <han_mao@c-sky.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | | | | Revert "mm, thp: consolidate THP gfp handling into ↵David Rientjes2018-12-081-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alloc_hugepage_direct_gfpmask" This reverts commit 89c83fb539f95491be80cdd5158e6f0ce329e317. This should have been done as part of 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations"). The movement of the thp allocation policy from alloc_pages_vma() to alloc_hugepage_direct_gfpmask() was intended to only set __GFP_THISNODE for mempolicies that are not MPOL_BIND whereas the revert could set this regardless of mempolicy. While the check for MPOL_BIND between alloc_hugepage_direct_gfpmask() and alloc_pages_vma() was racy, that has since been removed since the revert. What is left is the possibility to use __GFP_THISNODE in policy_node() when it is unexpected because the special handling for hugepages in alloc_pages_vma() was removed as part of the consolidation. Secondly, prior to 89c83fb539f9, alloc_pages_vma() implemented a somewhat different policy for hugepage allocations, which were allocated through alloc_hugepage_vma(). For hugepage allocations, if the allocating process's node is in the set of allowed nodes, allocate with __GFP_THISNODE for that node (for MPOL_PREFERRED, use that node with __GFP_THISNODE instead). This was changed for shmem_alloc_hugepage() to allow fallback to other nodes in 89c83fb539f9 as it did for new_page() in mm/mempolicy.c which is functionally different behavior and removes the requirement to only allocate hugepages locally. So this commit does a full revert of 89c83fb539f9 instead of the partial revert that was done in 2f0799a0ffc0. The result is the same thp allocation policy for 4.20 that was in 4.19. Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") Fixes: 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations") Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | Merge tag 'nfs-for-4.20-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2018-12-061-1/+0
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: "This is mainly fallout from the updates to the SUNRPC code that is being triggered from less common combinations of NFS mount options. Highlights include: Stable fixes: - Fix a page leak when using RPCSEC_GSS/krb5p to encrypt data. Bugfixes: - Fix a regression that causes the RPC receive code to hang - Fix call_connect_status() so that it handles tasks that got transmitted while queued waiting for the socket lock. - Fix a memory leak in call_encode() - Fix several other connect races. - Fix receive code error handling. - Use the discard iterator rather than MSG_TRUNC for compatibility with AF_UNIX/AF_LOCAL sockets. - nfs: don't dirty kernel pages read by direct-io - pnfs/Flexfiles fix to enforce per-mirror stateid only for NFSv4 data servers" * tag 'nfs-for-4.20-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Don't force a redundant disconnection in xs_read_stream() SUNRPC: Fix up socket polling SUNRPC: Use the discard iterator rather than MSG_TRUNC SUNRPC: Treat EFAULT as a truncated message in xs_read_stream_request() SUNRPC: Fix up handling of the XDRBUF_SPARSE_PAGES flag SUNRPC: Fix RPC receive hangs SUNRPC: Fix a potential race in xprt_connect() SUNRPC: Fix a memory leak in call_encode() SUNRPC: Fix leak of krb5p encode pages SUNRPC: call_connect_status() must handle tasks that got transmitted nfs: don't dirty kernel pages read by direct-io flexfiles: enforce per-mirror stateid only for v4 DSes
| | * | | | | SUNRPC: Fix a memory leak in call_encode()Trond Myklebust2018-12-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we retransmit an RPC request, we currently end up clobbering the value of req->rq_rcv_buf.bvec that was allocated by the initial call to xprt_request_prepare(req). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * | | | | | Merge tag 'sound-4.20-rc6' of ↵Linus Torvalds2018-12-061-1/+3
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Still more incoming fixes than wished at this stage, but all look like small and reasonable fixes. In addition to the usual HD-audio and USB-audio quirks for various devices, two notable changes are included: - a fix for USB-audio UAF at probing a malformed descriptor - workarounds for PCM rwsem mutex starvation" * tag 'sound-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4860G/Z6860G ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4660G ALSA: hda/realtek - Add support for Acer Aspire C24-860 headset mic ALSA: hda/realtek: ALC286 mic and headset-mode fixups for Acer Aspire U27-880 ALSA: usb-audio: Fix UAF decrement if card has no live interfaces in card.c ALSA: hda/realtek - Fix speaker output regression on Thinkpad T570 ALSA: pcm: Fix interval evaluation with openmin/max ALSA: hda: Add support for AMD Stoney Ridge ALSA: usb-audio: Add SMSL D1 to quirks for native DSD support ALSA: pcm: Fix starvation on down_write_nonblock() ALSA: pcm: Call snd_pcm_unlink() conditionally at closing
| | * | | | | | ALSA: pcm: Fix interval evaluation with openmin/maxTakashi Iwai2018-11-291-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As addressed in alsa-lib (commit b420056604f0), we need to fix the case where the evaluation of PCM interval "(x x+1]" leading to -EINVAL. After applying rules, such an interval may be translated as "(x x+1)". Fixes: ff2d6acdf6f1 ("ALSA: pcm: Fix snd_interval_refine first/last with open min/max") Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | | | | | | mm, thp: restore node-local hugepage allocationsDavid Rientjes2018-12-051-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a full revert of ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings") and a partial revert of 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"). By not setting __GFP_THISNODE, applications can allocate remote hugepages when the local node is fragmented or low on memory when either the thp defrag setting is "always" or the vma has been madvised with MADV_HUGEPAGE. Remote access to hugepages often has much higher latency than local pages of the native page size. On Haswell, ac5b2c18911f was shown to have a 13.9% access regression after this commit for binaries that remap their text segment to be backed by transparent hugepages. The intent of ac5b2c18911f is to address an issue where a local node is low on memory or fragmented such that a hugepage cannot be allocated. In every scenario where this was described as a fix, there is abundant and unfragmented remote memory available to allocate from, even with a greater access latency. If remote memory is also low or fragmented, not setting __GFP_THISNODE was also measured on Haswell to have a 40% regression in allocation latency. Restore __GFP_THISNODE for thp allocations. Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings") Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | Merge tag 'media/v4.20-4' of ↵Linus Torvalds2018-12-031-1/+1
| |\ \ \ \ \ \ \ | | |_|_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - Revert a dt-bindings patch whose driver didn't make for 4.20 - fix a kernel oops at vicodec driver - fix a frame overflow at gspca with was causing regressions on some cameras, making them to not work - use the proper type for wait_queue head - make media request API compatible with 32-bit userspace on 64-bit kernel - fix a regression on Kernel 4.19 at dvb-pll - don't use SPDX headers yet for GFDL * tag 'media/v4.20-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: mediactl docs: Fix licensing message media: dvb-pll: don't re-validate tuner frequencies media: dvb-pll: fix tuner frequency ranges media: Revert "media: dt-bindings: Document the Rockchip VPU bindings" media: gspca: fix frame overflow error media: vicodec: fix memchr() kernel oops media: cedrus: add action item to the TODO media: media-request: Add compat ioctl media: Use wait_queue_head_t for media_request
| | * | | | | | media: Use wait_queue_head_t for media_requestJasmin Jessich2018-11-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The portable type for a wait queue is wait_queue_head_t. Signed-off-by: Jasmin Jessich <jasmin@anw.at> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
* | | | | | | | bus: fsl-mc: explicitly define the fsl_mc_command endiannessIoana Ciornei2018-12-061-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both the header and the command parameters of the fsl_mc_command are 64-bit little-endian words. Use the appropriate type to explicitly specify their endianness. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | | | mtd: add support for reading MTD devices via the nvmem APIAlban Bedel2018-12-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow drivers that use the nvmem API to read data stored on MTD devices. For this the mtd devices are registered as read-only NVMEM providers. We don't support device tree systems for now. Signed-off-by: Alban Bedel <albeu@free.fr> [Bartosz: - include linux/nvmem-provider.h - set the name of the nvmem provider - set no_of_node to true in nvmem_config - don't check the return value of nvmem_unregister() - it cannot fail - tweaked the commit message] Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | | | nvmem: add new config optionBartosz Golaszewski2018-12-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to add nvmem support for MTD. TI DaVinci is the first platform that will be using it, but only in non-DT mode. In order not to introduce any new interface to supporting of which we would have to commit - add a new config option that tells nvmem not to use the DT node of the parent device. This way we won't be creating nvmem devices corresponding with MTD partitions defined in device tree. By default MTD will set this new field to true. Once a set of bindings for MTD nvmem cells is agreed upon, we'll be able to remove this option. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | | | nvmem: Move nvmem_type_str array to its only userAndy Shevchenko2018-12-061-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we put static variable to a header file it's copied to each module that includes the header. But not all of them are actually using it. Move nvmem_type_str array to its only user to make a compiler happy: In file included from include/linux/rtc.h:18, from drivers/rtc/rtc-proc.c:15: include/linux/nvmem-provider.h:29:27: warning: 'nvmem_type_str' defined but not used [-Wunused-const-variable=] static const char * const nvmem_type_str[] = { ^~~~~~~~~~~~~~ Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Suggested-by: Joe Perches <joe@perches.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | | | nvmem: add type attributeAlexandre Belloni2018-12-061-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a type attribute so userspace is able to know how the data is stored as this can help taking the correct decision when selecting which device to use. This will also help program display the proper warnings when burning fuses for example. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | | | Merge 4.20-rc5 into char-misc-nextGreg Kroah-Hartman2018-12-0320-42/+89
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need the fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | | | | Merge tag 'armsoc-fixes' of ↵Linus Torvalds2018-12-021-0/+2
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Volume is a little higher than usual due to a set of gpio fixes for Davinci platforms that's been around a while, still seemed appropriate to not hold off until next merge window. Besides that it's the usual mix of minor fixes, mostly corrections of small stuff in device trees. Major stability-related one is the removal of a regulator from DT on Rock960, since DVFS caused undervoltage. I expect it'll be restored once they figure out the underlying issue" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits) MAINTAINERS: Remove unused Qualcomm SoC mailing list ARM: davinci: dm644x: set the GPIO base to 0 ARM: davinci: da830: set the GPIO base to 0 ARM: davinci: dm355: set the GPIO base to 0 ARM: davinci: dm646x: set the GPIO base to 0 ARM: davinci: dm365: set the GPIO base to 0 ARM: davinci: da850: set the GPIO base to 0 gpio: davinci: restore a way to manually specify the GPIO base ARM: davinci: dm644x: define gpio interrupts as separate resources ARM: davinci: dm355: define gpio interrupts as separate resources ARM: davinci: dm646x: define gpio interrupts as separate resources ARM: davinci: dm365: define gpio interrupts as separate resources ARM: davinci: da8xx: define gpio interrupts as separate resources ARM: dts: at91: sama5d2: use the divided clock for SMC ARM: dts: imx51-zii-rdu1: Remove EEPROM node ARM: dts: rockchip: Remove @0 from the veyron memory node arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou. arm64: dts: qcom: msm8998: Reserve gpio ranges on MTP arm64: dts: sdm845-mtp: Reserve reserved gpios arm64: dts: ti: k3-am654: Fix wakeup_uart reg address ...
| | * | | | | | | gpio: davinci: restore a way to manually specify the GPIO baseBartosz Golaszewski2018-11-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection") broke the network support in legacy boot mode for da850-evm since we can no longer request the MDIO clock GPIO. Other boards may be broken too, which I haven't tested. The problem is in the fact that most board files still use the legacy GPIO API where lines are requested by numbers rather than descriptors. While this should be fixed eventually, in order to unbreak the board for now - provide a way to manually specify the GPIO base in platform data. Fixes: 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
| * | | | | | | | Merge tag 'for-linus-4.20a-rc5-tag' of ↵Linus Torvalds2018-12-021-5/+0
| |\ \ \ \ \ \ \ \ | | |_|_|_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A revert of a previous commit as it is no longer necessary and has shown to cause problems in some memory hotplug cases. - Some small fixes and a minor cleanup. - A patch for adding better diagnostic data in a very rare failure case. * tag 'for-linus-4.20a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: pvcalls-front: fixes incorrect error handling Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" xen: xlate_mmu: add missing header to fix 'W=1' warning xen/x86: add diagnostic printout to xen_mc_flush() in case of error x86/xen: cleanup includes in arch/x86/xen/spinlock.c
| | * | | | | | | Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin2018-11-291-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| * | | | | | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-12-014-17/+30
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull STIBP fallout fixes from Thomas Gleixner: "The performance destruction department finally got it's act together and came up with a cure for the STIPB regression: - Provide a command line option to control the spectre v2 user space mitigations. Default is either seccomp or prctl (if seccomp is disabled in Kconfig). prctl allows mitigation opt-in, seccomp enables the migitation for sandboxed processes. - Rework the code to handle the conditional STIBP/IBPB control and remove the now unused ptrace_may_access_sched() optimization attempt - Disable STIBP automatically when SMT is disabled - Optimize the switch_to() logic to avoid MSR writes and invocations of __switch_to_xtra(). - Make the asynchronous speculation TIF updates synchronous to prevent stale mitigation state. As a general cleanup this also makes retpoline directly depend on compiler support and removes the 'minimal retpoline' option which just pretended to provide some form of security while providing none" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86/speculation: Provide IBPB always command line options x86/speculation: Add seccomp Spectre v2 user space protection mode x86/speculation: Enable prctl mode for spectre_v2_user x86/speculation: Add prctl() control for indirect branch speculation x86/speculation: Prepare arch_smt_update() for PRCTL mode x86/speculation: Prevent stale SPEC_CTRL msr content x86/speculation: Split out TIF update ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS x86/speculation: Prepare for conditional IBPB in switch_mm() x86/speculation: Avoid __switch_to_xtra() calls x86/process: Consolidate and simplify switch_to_xtra() code x86/speculation: Prepare for per task indirect branch speculation control x86/speculation: Add command line control for indirect branch speculation x86/speculation: Unify conditional spectre v2 print functions x86/speculataion: Mark command line parser data __initdata x86/speculation: Mark string arrays const correctly x86/speculation: Reorder the spec_v2 code x86/l1tf: Show actual SMT state x86/speculation: Rework SMT state change sched/smt: Expose sched_smt_present static key ...
| | * | | | | | | | x86/speculation: Add prctl() control for indirect branch speculationThomas Gleixner2018-11-282-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of indirect branch speculation via STIBP and IBPB. Invocations: Check indirect branch speculation status with - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0); Enable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0); Disable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0); Force disable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0); See Documentation/userspace-api/spec_ctrl.rst. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de
| | * | | | | | | | ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRSThomas Gleixner2018-11-281-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IBPB control code in x86 removed the usage. Remove the functionality which was introduced for this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.559149393@linutronix.de
| | * | | | | | | | x86/speculation: Rework SMT state changeThomas Gleixner2018-11-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch_smt_update() is only called when the sysfs SMT control knob is changed. This means that when SMT is enabled in the sysfs control knob the system is considered to have SMT active even if all siblings are offline. To allow finegrained control of the speculation mitigations, the actual SMT state is more interesting than the fact that siblings could be enabled. Rework the code, so arch_smt_update() is invoked from each individual CPU hotplug function, and simplify the update function while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185004.521974984@linutronix.de
| | * | | | | | | | sched/smt: Expose sched_smt_present static keyThomas Gleixner2018-11-281-0/+18
| | | |_|_|_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the scheduler's 'sched_smt_present' static key globaly available, so it can be used in the x86 speculation control code. Provide a query function and a stub for the CONFIG_SMP=n case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185004.430168326@linutronix.de
| * | | | | | | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2018-11-301-1/+2
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge misc fixes from Andrew Morton: "31 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (31 commits) ocfs2: fix potential use after free mm/khugepaged: fix the xas_create_range() error path mm/khugepaged: collapse_shmem() do not crash on Compound mm/khugepaged: collapse_shmem() without freezing new_page mm/khugepaged: minor reorderings in collapse_shmem() mm/khugepaged: collapse_shmem() remember to clear holes mm/khugepaged: fix crashes due to misaccounted holes mm/khugepaged: collapse_shmem() stop if punched or truncated mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() mm/huge_memory: splitting set mapping+index before unfreeze mm/huge_memory: rename freeze_page() to unmap_page() initramfs: clean old path before creating a hardlink kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace psi: make disabling/enabling easier for vendor kernels proc: fixup map_files test on arm debugobjects: avoid recursive calls with kmemleak userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set userfaultfd: shmem: add i_size checks userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem ...
| | * | | | | | | | psi: make disabling/enabling easier for vendor kernelsJohannes Weiner2018-11-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mel Gorman reports a hackbench regression with psi that would prohibit shipping the suse kernel with it default-enabled, but he'd still like users to be able to opt in at little to no cost to others. With the current combination of CONFIG_PSI and the psi_disabled bool set from the commandline, this is a challenge. Do the following things to make it easier: 1. Add a config option CONFIG_PSI_DEFAULT_DISABLED that allows distros to enable CONFIG_PSI in their kernel but leave the feature disabled unless a user requests it at boot-time. To avoid double negatives, rename psi_disabled= to psi=. 2. Make psi_disabled a static branch to eliminate any branch costs when the feature is disabled. In terms of numbers before and after this patch, Mel says: : The following is a comparision using CONFIG_PSI=n as a baseline against : your patch and a vanilla kernel : : 4.20.0-rc4 4.20.0-rc4 4.20.0-rc4 : kconfigdisable-v1r1 vanilla psidisable-v1r1 : Amean 1 1.3100 ( 0.00%) 1.3923 ( -6.28%) 1.3427 ( -2.49%) : Amean 3 3.8860 ( 0.00%) 4.1230 * -6.10%* 3.8860 ( -0.00%) : Amean 5 6.8847 ( 0.00%) 8.0390 * -16.77%* 6.7727 ( 1.63%) : Amean 7 9.9310 ( 0.00%) 10.8367 * -9.12%* 9.9910 ( -0.60%) : Amean 12 16.6577 ( 0.00%) 18.2363 * -9.48%* 17.1083 ( -2.71%) : Amean 18 26.5133 ( 0.00%) 27.8833 * -5.17%* 25.7663 ( 2.82%) : Amean 24 34.3003 ( 0.00%) 34.6830 ( -1.12%) 32.0450 ( 6.58%) : Amean 30 40.0063 ( 0.00%) 40.5800 ( -1.43%) 41.5087 ( -3.76%) : Amean 32 40.1407 ( 0.00%) 41.2273 ( -2.71%) 39.9417 ( 0.50%) : : It's showing that the vanilla kernel takes a hit (as the bisection : indicated it would) and that disabling PSI by default is reasonably : close in terms of performance for this particular workload on this : particular machine so; Link: http://lkml.kernel.org/r/20181127165329.GA29728@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>