summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding maskMarc Zyngier2016-06-191-1/+1
| | | | | | | | | | | [ Upstream commit dd5f1b049dc139876801db3cdd0f20d21fd428cc ] The INTID mask is wrong, and is made a signed value, which has nteresting effects in the KVM emulation. Let's sanitize it. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* arm64: GICv3: introduce symbolic names for GICv3 ICC_SGI1R_EL1 fieldsAndre Przywara2016-06-191-0/+12
| | | | | | | | | | | | | [ Upstream commit 7e5802781c3e109558ddfd8b02155ad24d872ee7 ] The gic_send_sgi() function used hardcoded bit shift values to generate the ICC_SGI1R_EL1 register value. Replace this with symbolic names to allow reusing them later. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* drm/fb-helper: Propagate errors from initial config failureThierry Reding2016-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 01934c2a691882185b3021d437df13bcba07711d ] Make drm_fb_helper_initial_config() return an int rather than a bool so that the error can be properly propagated. While at it, update drivers to propagate errors further rather than just ignore them. v2: - cirrus: No cleanup is required, the top-level cirrus_driver_load() will do it as part of cirrus_driver_unload() in its cleanup path. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> [danvet: Squash in simplification patch from kbuild.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* can: fix handling of unmodifiable configuration optionsOliver Hartkopp2016-06-031-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit bb208f144cf3f59d8f89a09a80efd04389718907 ] As described in 'can: m_can: tag current CAN FD controllers as non-ISO' (6cfda7fbebe) it is possible to define fixed configuration options by setting the according bit in 'ctrlmode' and clear it in 'ctrlmode_supported'. This leads to the incovenience that the fixed configuration bits can not be passed by netlink even when they have the correct values (e.g. non-ISO, FD). This patch fixes that issue and not only allows fixed set bit values to be set again but now requires(!) to provide these fixed values at configuration time. A valid CAN FD configuration consists of a nominal/arbitration bittiming, a data bittiming and a control mode with CAN_CTRLMODE_FD set - which is now enforced by a new can_validate() function. This fix additionally removed the inconsistency that was prohibiting the support of 'CANFD-only' controller drivers, like the RCar CAN FD. For this reason a new helper can_set_static_ctrlmode() has been introduced to provide a proper interface to handle static enabled CAN controller options. Reported-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reviewed-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com> Cc: <stable@vger.kernel.org> # >= 3.18 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* USB: leave LPM alone if possible when binding/unbinding interface driversAlan Stern2016-06-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 6fb650d43da3e7054984dc548eaa88765a94d49f ] When a USB driver is bound to an interface (either through probing or by claiming it) or is unbound from an interface, the USB core always disables Link Power Management during the transition and then re-enables it afterward. The reason is because the driver might want to prevent hub-initiated link power transitions, in which case the HCD would have to recalculate the various LPM parameters. This recalculation takes place when LPM is re-enabled and the new parameters are sent to the device and its parent hub. However, if the driver does not want to prevent hub-initiated link power transitions then none of this work is necessary. The parameters don't need to be recalculated, and LPM doesn't need to be disabled and re-enabled. It turns out that disabling and enabling LPM can be time-consuming, enough so that it interferes with user programs that want to claim and release interfaces rapidly via usbfs. Since the usbfs kernel driver doesn't set the disable_hub_initiated_lpm flag, we can speed things up and get the user programs to work by leaving LPM alone whenever the flag isn't set. And while we're improving the way disable_hub_initiated_lpm gets used, let's also fix its kerneldoc. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Matthew Giassa <matthew@giassa.net> CC: Mathias Nyman <mathias.nyman@intel.com> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* usb: core: hub: hub_port_init lock controller instead of busChris Bainbridge2016-06-032-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit feb26ac31a2a5cb88d86680d9a94916a6343e9e6 ] The XHCI controller presents two USB buses to the system - one for USB2 and one for USB3. The hub init code (hub_port_init) is reentrant but only locks one bus per thread, leading to a race condition failure when two threads attempt to simultaneously initialise a USB2 and USB3 device: [ 8.034843] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command [ 13.183701] usb 3-3: device descriptor read/all, error -110 On a test system this failure occurred on 6% of all boots. The call traces at the point of failure are: Call Trace: [<ffffffff81b9bab7>] schedule+0x37/0x90 [<ffffffff817da7cd>] usb_kill_urb+0x8d/0xd0 [<ffffffff8111e5e0>] ? wake_up_atomic_t+0x30/0x30 [<ffffffff817dafbe>] usb_start_wait_urb+0xbe/0x150 [<ffffffff817db10c>] usb_control_msg+0xbc/0xf0 [<ffffffff817d07de>] hub_port_init+0x51e/0xb70 [<ffffffff817d4697>] hub_event+0x817/0x1570 [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620 [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620 [<ffffffff810f4684>] worker_thread+0x64/0x4b0 [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390 [<ffffffff810fa7f5>] kthread+0x105/0x120 [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70 [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 Call Trace: [<ffffffff817fd36d>] xhci_setup_device+0x53d/0xa40 [<ffffffff817fd87e>] xhci_address_device+0xe/0x10 [<ffffffff817d047f>] hub_port_init+0x1bf/0xb70 [<ffffffff811247ed>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff817d4697>] hub_event+0x817/0x1570 [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620 [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620 [<ffffffff810f4684>] worker_thread+0x64/0x4b0 [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390 [<ffffffff810fa7f5>] kthread+0x105/0x120 [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70 [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 Which results from the two call chains: hub_port_init usb_get_device_descriptor usb_get_descriptor usb_control_msg usb_internal_control_msg usb_start_wait_urb usb_submit_urb / wait_for_completion_timeout / usb_kill_urb hub_port_init hub_set_address xhci_address_device xhci_setup_device Mathias Nyman explains the current behaviour violates the XHCI spec: hub_port_reset() will end up moving the corresponding xhci device slot to default state. As hub_port_reset() is called several times in hub_port_init() it sounds reasonable that we could end up with two threads having their xhci device slots in default state at the same time, which according to xhci 4.5.3 specs still is a big no no: "Note: Software shall not transition more than one Device Slot to the Default State at a time" So both threads fail at their next task after this. One fails to read the descriptor, and the other fails addressing the device. Fix this in hub_port_init by locking the USB controller (instead of an individual bus) to prevent simultaneous initialisation of both buses. Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in parallel") Link: https://lkml.org/lkml/2016/2/8/312 Link: https://lkml.org/lkml/2016/2/4/748 Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Cc: stable <stable@vger.kernel.org> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* Minimal fix-up of bad hashing behavior of hash_64()Linus Torvalds2016-05-171-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 689de1d6ca95b3b5bd8ee446863bf81a4883ea25 ] This is a fairly minimal fixup to the horribly bad behavior of hash_64() with certain input patterns. In particular, because the multiplicative value used for the 64-bit hash was intentionally bit-sparse (so that the multiply could be done with shifts and adds on architectures without hardware multipliers), some bits did not get spread out very much. In particular, certain fairly common bit ranges in the input (roughly bits 12-20: commonly with the most information in them when you hash things like byte offsets in files or memory that have block factors that mean that the low bits are often zero) would not necessarily show up much in the result. There's a bigger patch-series brewing to fix up things more completely, but this is the fairly minimal fix for the 64-bit hashing problem. It simply picks a much better constant multiplier, spreading the bits out a lot better. NOTE! For 32-bit architectures, the bad old hash_64() remains the same for now, since 64-bit multiplies are expensive. The bigger hashing cleanup will replace the 32-bit case with something better. The new constants were picked by George Spelvin who wrote that bigger cleanup series. I just picked out the constants and part of the comment from that series. Cc: stable@vger.kernel.org Cc: George Spelvin <linux@horizon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* IB/security: Restrict use of the write() interfaceJason Gunthorpe2016-05-171-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e6bd18f57aad1a2d1ef40e646d03ed0f2515c9e3 ] The drivers/infiniband stack uses write() as a replacement for bi-directional ioctl(). This is not safe. There are ways to trigger write calls that result in the return structure that is normally written to user space being shunted off to user specified kernel memory instead. For the immediate repair, detect and deny suspicious accesses to the write API. For long term, update the user space libraries and the kernel API to something that doesn't present the same security vulnerabilities (likely a structured ioctl() interface). The impacted uAPI interfaces are generally only available if hardware from drivers/infiniband is installed in the system. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [ Expanded check to all known write() entry points ] Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* [media] v4l2-dv-timings.h: fix polarity for 4k formatsHans Verkuil2016-05-171-10/+20
| | | | | | | | | | | | | | [ Upstream commit 3020ca711871fdaf0c15c8bab677a6bc302e28fe ] The VSync polarity was negative instead of positive for the 4k CEA formats. I probably copy-and-pasted these from the DMT 4k format, which does have a negative VSync polarity. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reported-by: Martin Bugge <marbugge@cisco.com> Cc: <stable@vger.kernel.org> # for v4.1 and up Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* regulator: s2mps11: Fix invalid selector mask and voltages for buck9Krzysztof Kozlowski2016-05-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 3b672623079bb3e5685b8549e514f2dfaa564406 ] The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff instead of 0x1f) thus reading entire register as buck9's voltage. This effectively caused regulator core to interpret values as higher voltages than they were and then to set real voltage much lower than intended. The buck9 provides power to other regulators, including LDO13 and LDO19 which supply the MMC2 (SD card). On Odroid XU3/XU4 the lower voltage caused SD card detection errors on Odroid XU3/XU4: mmc1: card never left busy state mmc1: error -110 whilst initialising SD card During driver probe the regulator core was checking whether initial voltage matches the constraints. With incorrect vsel_mask of 0xff and default value of 0x50, the core interpreted this as 5 V which is outside of constraints (3-3.775 V). Then the regulator core was adjusting the voltage to match the constraints. With incorrect vsel_mask this new voltage mapped to a vere low voltage in the driver. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com> Tested-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* mm: hugetlb: allow hugepages_supported to be architecture specificDominik Dingel2016-05-101-9/+8
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 2531c8cf56a640cd7d17057df8484e570716a450 ] s390 has a constant hugepage size, by setting HPAGE_SHIFT we also change e.g. the pageblock_order, which should be independent in respect to hugepage support. With this patch every architecture is free to define how to check for hugepage support. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* drm: Loongson-3 doesn't fully support wc memoryHuacai Chen2016-05-101-0/+2
| | | | | | | | | | [ Upstream commit 221004c66a58949a0f25c937a6789c0839feb530 ] Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* USB: uas: Add a new NO_REPORT_LUNS quirkHans de Goede2016-04-201-0/+2
| | | | | | | | | | | | | | | [ Upstream commit 1363074667a6b7d0507527742ccd7bbed5e3ceaa ] Add a new NO_REPORT_LUNS quirk and set it for Seagate drives with an usb-id of: 0bc2:331a, as these will fail to respond to a REPORT_LUNS command. Cc: stable@vger.kernel.org Reported-and-tested-by: David Webb <djw@noc.ac.uk> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* compiler-gcc: disable -ftracer for __noclone functionsPaolo Bonzini2016-04-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 95272c29378ee7dc15f43fa2758cb28a5913a06d ] -ftracer can duplicate asm blocks causing compilation to fail in noclone functions. For example, KVM declares a global variable in an asm like asm("2: ... \n .pushsection data \n .global vmx_return \n vmx_return: .long 2b"); and -ftracer causes a double declaration. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: stable@vger.kernel.org Cc: kvm@vger.kernel.org Reported-by: Linda Walsh <lkml@tlinx.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* compiler-gcc: integrate the various compiler-gcc[345].h filesJoe Perches2016-04-204-179/+116
| | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f320793e52aee78f0fbb8bcaf10e6614d2e67bfc ] [ Upstream commit cb984d101b30eb7478d32df56a0023e4603cba7f ] As gcc major version numbers are going to advance rather rapidly in the future, there's no real value in separate files for each compiler version. Deduplicate some of the macros #defined in each file too. Neaten comments using normal kernel commenting style. Signed-off-by: Joe Perches <joe@perches.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Sasha Levin <levinsasha928@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Alan Modra <amodra@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* mm: use 'unsigned int' for page orderKirill A. Shutemov2016-04-182-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d00181b96eb86c914cb327d1de974a1b71366e1b ] Let's try to be consistent about data type of page order. [sfr@canb.auug.org.au: fix build (type of pageblock_order)] [hughd@google.com: some configs end up with MAX_ORDER and pageblock_order having different types] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* fs/coredump: prevent fsuid=0 dumps into user-controlled directoriesJann Horn2016-04-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 378c6520e7d29280f400ef2ceaf155c86f05a71a ] This commit fixes the following security hole affecting systems where all of the following conditions are fulfilled: - The fs.suid_dumpable sysctl is set to 2. - The kernel.core_pattern sysctl's value starts with "/". (Systems where kernel.core_pattern starts with "|/" are not affected.) - Unprivileged user namespace creation is permitted. (This is true on Linux >=3.8, but some distributions disallow it by default using a distro patch.) Under these conditions, if a program executes under secure exec rules, causing it to run with the SUID_DUMP_ROOT flag, then unshares its user namespace, changes its root directory and crashes, the coredump will be written using fsuid=0 and a path derived from kernel.core_pattern - but this path is interpreted relative to the root directory of the process, allowing the attacker to control where a coredump will be written with root privileges. To fix the security issue, always interpret core_pattern for dumps that are written under SUID_DUMP_ROOT relative to the root directory of init. Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* tracing: Fix trace_printk() to print when not using bprintk()Steven Rostedt (Red Hat)2016-04-181-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 3debb0a9ddb16526de8b456491b7db60114f7b5e ] The trace_printk() code will allocate extra buffers if the compile detects that a trace_printk() is used. To do this, the format of the trace_printk() is saved to the __trace_printk_fmt section, and if that section is bigger than zero, the buffers are allocated (along with a message that this has happened). If trace_printk() uses a format that is not a constant, and thus something not guaranteed to be around when the print happens, the compiler optimizes the fmt out, as it is not used, and the __trace_printk_fmt section is not filled. This means the kernel will not allocate the special buffers needed for the trace_printk() and the trace_printk() will not write anything to the tracing buffer. Adding a "__used" to the variable in the __trace_printk_fmt section will keep it around, even though it is set to NULL. This will keep the string from being printed in the debugfs/tracing/printk_formats section as it is not needed. Reported-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()" Cc: stable@vger.kernel.org # v3.5+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* bitops: Do not default to __clear_bit() for __clear_bit_unlock()Peter Zijlstra2016-04-181-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f75d48644c56a31731d17fa693c8175328957e1d ] __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Noam Camus <noamc@ezchip.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20160309114054.GJ6356@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* Thermal: Ignore invalid trip pointsZhang Rui2016-04-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 81ad4276b505e987dd8ebbdf63605f92cd172b52 ] In some cases, platform thermal driver may report invalid trip points, thermal core should not take any action for these trip points. This fixed a regression that bogus trip point starts to screw up thermal control on some Lenovo laptops, after commit bb431ba26c5cd0a17c941ca6c3a195a3a6d5d461 Author: Zhang Rui <rui.zhang@intel.com> Date: Fri Oct 30 16:31:47 2015 +0800 Thermal: initialize thermal zone device correctly After thermal zone device registered, as we have not read any temperature before, thus tz->temperature should not be 0, which actually means 0C, and thermal trend is not available. In this case, we need specially handling for the first thermal_zone_device_update(). Both thermal core framework and step_wise governor is enhanced to handle this. And since the step_wise governor is the only one that uses trends, so it's the only thermal governor that needs to be updated. Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> CC: <stable@vger.kernel.org> #3.18+ Link: https://bugzilla.redhat.com/show_bug.cgi?id=1317190 Link: https://bugzilla.kernel.org/show_bug.cgi?id=114551 Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* PCI: Disable IO/MEM decoding for devices with non-compliant BARsBjorn Helgaas2016-04-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b84106b4e2290c081cdab521fa832596cdfea246 ] The PCI config header (first 64 bytes of each device's config space) is defined by the PCI spec so generic software can identify the device and manage its usage of I/O, memory, and IRQ resources. Some non-spec-compliant devices put registers other than BARs where the BARs should be. When the PCI core sizes these "BARs", the reads and writes it does may have unwanted side effects, and the "BAR" may appear to describe non-sensical address space. Add a flag bit to mark non-compliant devices so we don't touch their BARs. Turn off IO/MEM decoding to prevent the devices from consuming address space, since we can't read the BARs to find out what that address space would be. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Andi Kleen <ak@linux.intel.com> CC: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* PCI: Add dev->has_secondary_link to track downstream PCIe linksYijing Wang2016-04-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d0751b98dfa391f862e02dc36a233a54615e3f1d ] A PCIe Port is an interface to a Link. A Root Port is a PCI-PCI bridge in a Root Complex and has a Link on its secondary (downstream) side. For other Ports, the Link may be on either the upstream (closer to the Root Complex) or downstream side of the Port. The usual topology has a Root Port connected to an Upstream Port. We previously assumed this was the only possible topology, and that a Downstream Port's Link was always on its downstream side, like this: +---------------------+ +------+ | Downstream | | Root | | Upstream Port +--Link-- | Port +--Link--+ Port | +------+ | Downstream | | Port +--Link-- +---------------------+ But systems do exist (see URL below) where the Root Port is connected to a Downstream Port. In this case, a Downstream Port's Link may be on either the upstream or downstream side: +---------------------+ +------+ | Upstream | | Root | | Downstream Port +--Link-- | Port +--Link--+ Port | +------+ | Downstream | | Port +--Link-- +---------------------+ We can't use the Port type to determine which side the Link is on, so add a bit in struct pci_dev to keep track. A Root Port's Link is always on the Port's secondary side. A component (Endpoint or Port) on the other end of the Link obviously has the Link on its upstream side. If that component is a Port, it is part of a Switch or a Bridge. A Bridge has a PCI or PCI-X bus on its secondary side, not a Link. The internal bus of a Switch connects the Port to another Port whose Link is on the downstream side. [bhelgaas: changelog, comment, cache "type", use if/else] Link: http://lkml.kernel.org/r/54EB81B2.4050904@pobox.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=94361 Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* x86, irq: Keep balance of IOAPIC pin reference countJiang Liu2016-04-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cffe0a2b5a34c95a4dadc9ec7132690a5b0f6687 ] To keep balance of IOAPIC pin reference count, we need to protect pirq_enable_irq(), acpi_pci_irq_enable() and intel_mid_pci_irq_enable() from reentrance. There are two cases which will cause reentrance. The first case is caused by suspend/hibernation. If pcibios_disable_irq is called during suspending/hibernating, we don't release the assigned IRQ number, otherwise it may break the suspend/hibernation. So late when pcibios_enable_irq is called during resume, we shouldn't allocate IRQ number again. The second case is that function acpi_pci_irq_enable() may be called twice for PCI devices present at boot time as below: 1) pci_acpi_init() --> acpi_pci_irq_enable() if pci_routeirq is true 2) pci_enable_device() --> pcibios_enable_device() --> acpi_pci_irq_enable() We can't kill kernel parameter pci_routeirq yet because it's still needed for debugging purpose. So flag irq_managed is introduced to track whether IRQ number is assigned by OS and to protect pirq_enable_irq(), acpi_pci_irq_enable() and intel_mid_pci_irq_enable() from reentrance. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/1414387308-27148-13-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* include/linux/poison.h: fix LIST_POISON{1,2} offsetVasily Kulikov2016-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 8a5e5e02fc83aaf67053ab53b359af08c6c49aaf ] Poison pointer values should be small enough to find a room in non-mmap'able/hardly-mmap'able space. E.g. on x86 "poison pointer space" is located starting from 0x0. Given unprivileged users cannot mmap anything below mmap_min_addr, it should be safe to use poison pointers lower than mmap_min_addr. The current poison pointer values of LIST_POISON{1,2} might be too big for mmap_min_addr values equal or less than 1 MB (common case, e.g. Ubuntu uses only 0x10000). There is little point to use such a big value given the "poison pointer space" below 1 MB is not yet exhausted. Changing it to a smaller value solves the problem for small mmap_min_addr setups. The values are suggested by Solar Designer: http://www.openwall.com/lists/oss-security/2015/05/02/6 Signed-off-by: Vasily Kulikov <segoon@openwall.com> Cc: Solar Designer <solar@openwall.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* modules: fix longstanding /proc/kallsyms vs module insertion race.Rusty Russell2016-04-131-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 8244062ef1e54502ef55f54cced659913f244c3e ] For CONFIG_KALLSYMS, we keep two symbol tables and two string tables. There's one full copy, marked SHF_ALLOC and laid out at the end of the module's init section. There's also a cut-down version that only contains core symbols and strings, and lives in the module's core section. After module init (and before we free the module memory), we switch the mod->symtab, mod->num_symtab and mod->strtab to point to the core versions. We do this under the module_mutex. However, kallsyms doesn't take the module_mutex: it uses preempt_disable() and rcu tricks to walk through the modules, because it's used in the oops path. It's also used in /proc/kallsyms. There's nothing atomic about the change of these variables, so we can get the old (larger!) num_symtab and the new symtab pointer; in fact this is what I saw when trying to reproduce. By grouping these variables together, we can use a carefully-dereferenced pointer to ensure we always get one or the other (the free of the module init section is already done in an RCU callback, so that's safe). We allocate the init one at the end of the module init section, and keep the core one inside the struct module itself (it could also have been allocated at the end of the module core, but that's probably overkill). Reported-by: Weilong Chen <chenweilong@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541 Cc: stable@kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* efi: Make efivarfs entries immutable by defaultPeter Jones2016-04-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit ed8b0de5a33d2a2557dce7f9429dca8cb5bc5879 ] "rm -rf" is bricking some peoples' laptops because of variables being used to store non-reinitializable firmware driver data that's required to POST the hardware. These are 100% bugs, and they need to be fixed, but in the mean time it shouldn't be easy to *accidentally* brick machines. We have to have delete working, and picking which variables do and don't work for deletion is quite intractable, so instead make everything immutable by default (except for a whitelist), and make tools that aren't quite so broad-spectrum unset the immutable flag. Signed-off-by: Peter Jones <pjones@redhat.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* efi: Make our variable validation list include the guidPeter Jones2016-04-121-1/+2
| | | | | | | | | | | | | | | | | | [ Upstream commit 8282f5d9c17fe15a9e658c06e3f343efae1a2a2f ] All the variables in this list so far are defined to be in the global namespace in the UEFI spec, so this just further ensures we're validating the variables we think we are. Including the guid for entries will become more important in future patches when we decide whether or not to allow deletion of variables based on presence in this list. Signed-off-by: Peter Jones <pjones@redhat.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* lib/ucs2_string: Add ucs2 -> utf8 helper functionsPeter Jones2016-04-121-0/+4
| | | | | | | | | | | | | [ Upstream commit 73500267c930baadadb0d02284909731baf151f7 ] This adds ucs2_utf8size(), which tells us how big our ucs2 string is in bytes, and ucs2_as_utf8, which translates from ucs2 to utf8.. Signed-off-by: Peter Jones <pjones@redhat.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* ptrace: use fsuid, fsgid, effective creds for fs access checksJann Horn2016-04-121-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit caaee6234d05a58c5b4d05e7bf766131b810a657 ] By checking the effective credentials instead of the real UID / permitted capabilities, ensure that the calling process actually intended to use its credentials. To ensure that all ptrace checks use the correct caller credentials (e.g. in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS flag), use two new flags and require one of them to be set. The problem was that when a privileged task had temporarily dropped its privileges, e.g. by calling setreuid(0, user_uid), with the intent to perform following syscalls with the credentials of a user, it still passed ptrace access checks that the user would not be able to pass. While an attacker should not be able to convince the privileged task to perform a ptrace() syscall, this is a problem because the ptrace access check is reused for things in procfs. In particular, the following somewhat interesting procfs entries only rely on ptrace access checks: /proc/$pid/stat - uses the check for determining whether pointers should be visible, useful for bypassing ASLR /proc/$pid/maps - also useful for bypassing ASLR /proc/$pid/cwd - useful for gaining access to restricted directories that contain files with lax permissions, e.g. in this scenario: lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar drwx------ root root /root drwxr-xr-x root root /root/foobar -rw-r--r-- root root /root/foobar/secret Therefore, on a system where a root-owned mode 6755 binary changes its effective credentials as described and then dumps a user-specified file, this could be used by an attacker to reveal the memory layout of root's processes or reveal the contents of files he is not allowed to access (through /proc/$pid/cwd). [akpm@linux-foundation.org: fix warning] Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* tty: Fix GPF in flush_to_ldisc(), part 2Peter Hurley2016-04-121-1/+1
| | | | | | | | | | | | | | | | | [ Upstream commit f33798deecbd59a2955f40ac0ae2bc7dff54c069 ] commit 9ce119f318ba ("tty: Fix GPF in flush_to_ldisc()") fixed a GPF caused by a line discipline which does not define a receive_buf() method. However, the vt driver (and speakup driver also) pushes selection data directly to the line discipline receive_buf() method via tty_ldisc_receive_buf(). Fix the same problem in tty_ldisc_receive_buf(). Cc: <stable@vger.kernel.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* drm/dp/mst: move GUID storage from mgr, port to only mst branchHersen Wu2016-04-121-15/+10
| | | | | | | | | | | | | | | | | | | [ Upstream commit 5e93b8208d3c419b515fb75e2601931c027e12ab ] Previous implementation does not handle case below: boot up one MST branch to DP connector of ASIC. After boot up, hot plug 2nd MST branch to DP output of 1st MST, GUID is not created for 2nd MST branch. When downstream port of 2nd MST branch send upstream request, it fails because 2nd MST branch GUID is not available. New Implementation: only create GUID for MST branch and save it within Branch. Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Cc: stable@vger.kernel.org Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* tracing: Fix check for cpu online when event is disabledSteven Rostedt (Red Hat)2016-03-221-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit dc17147de328a74bbdee67c1bf37d2f1992de756 ] Commit f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added a check to make sure that tracepoints only get called when the cpu is online, as it uses rcu_read_lock_sched() for protection. Commit 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") added lockdep checks (including rcu checks) for events that are not enabled to catch possible RCU issues that would only be triggered if a trace event was enabled. Commit f37755490fe9b only stopped the warnings when the trace event was enabled but did not prevent warnings if the trace event was called when disabled. To fix this, the cpu online check is moved to where the condition is added to the trace event. This will place the cpu online check in all places that it may be used now and in the future. Cc: stable@vger.kernel.org # v3.18+ Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* cfg80211/wext: fix message orderingJohannes Berg2016-03-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cb150b9d23be6ee7f3a0fff29784f1c5b5ac514d ] Since cfg80211 frequently takes actions from its netdev notifier call, wireless extensions messages could still be ordered badly since the wext netdev notifier, since wext is built into the kernel, runs before the cfg80211 netdev notifier. For example, the following can happen: 5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff 5: wlan1: <BROADCAST,MULTICAST,UP> link/ether when setting the interface down causes the wext message. To also fix this, export the wireless_nlevent_flush() function and also call it from the cfg80211 notifier. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* block: bio: introduce helpers to get the 1st and last bvecMing Lei2016-03-131-0/+37
| | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 7bcd79ac50d9d83350a835bdb91c04ac9e098412 ] The bio passed to bio_will_gap() may be fast cloned from upper layer(dm, md, bcache, fs, ...), or from bio splitting in block core. Unfortunately bio_will_gap() just figures out the last bvec via 'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously wrong. This patch introduces two helpers for getting the first and last bvec of one bio for fixing the issue. Cc: stable@vger.kernel.org Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* libata: Align ata_device's id on a cachelineHarvey Hunt2016-03-111-1/+1
| | | | | | | | | | | | | | | | | | | | [ Upstream commit 4ee34ea3a12396f35b26d90a094c75db95080baa ] The id buffer in ata_device is a DMA target, but it isn't explicitly cacheline aligned. Due to this, adjacent fields can be overwritten with stale data from memory on non coherent architectures. As a result, the kernel is sometimes unable to communicate with an ATA device. Fix this by ensuring that the id buffer is cacheline aligned. This issue is similar to that fixed by Commit 84bda12af31f ("libata: align ap->sector_buf"). Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 2.6.18 Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* libata: fix HDIO_GET_32BIT ioctlArnd Bergmann2016-03-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 287e6611ab1eac76c2c5ebf6e345e04c80ca9c61 ] As reported by Soohoon Lee, the HDIO_GET_32BIT ioctl does not work correctly in compat mode with libata. I have investigated the issue further and found multiple problems that all appeared with the same commit that originally introduced HDIO_GET_32BIT handling in libata back in linux-2.6.8 and presumably also linux-2.4, as the code uses "copy_to_user(arg, &val, 1)" to copy a 'long' variable containing either 0 or 1 to user space. The problems with this are: * On big-endian machines, this will always write a zero because it stores the wrong byte into user space. * In compat mode, the upper three bytes of the variable are updated by the compat_hdio_ioctl() function, but they now contain uninitialized stack data. * The hdparm tool calling this ioctl uses a 'static long' variable to store the result. This means at least the upper bytes are initialized to zero, but calling another ioctl like HDIO_GET_MULTCOUNT would fill them with data that remains stale when the low byte is overwritten. Fortunately libata doesn't implement any of the affected ioctl commands, so this would only happen when we query both an IDE and an ATA device in the same command such as "hdparm -N -c /dev/hda /dev/sda" * The libata code for unknown reasons started using ATA_IOC_GET_IO32 and ATA_IOC_SET_IO32 as aliases for HDIO_GET_32BIT and HDIO_SET_32BIT, while the ioctl commands that were added later use the normal HDIO_* names. This is harmless but rather confusing. This addresses all four issues by changing the code to use put_user() on an 'unsigned long' variable in HDIO_GET_32BIT, like the IDE subsystem does, and by clarifying the names of the ioctl commands. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Soohoon Lee <Soohoon.Lee@f5.com> Tested-by: Soohoon Lee <Soohoon.Lee@f5.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* nfs: fix nfs_size_to_loff_tChristoph Hellwig2016-03-071-3/+1
| | | | | | | | | | | | | | | | | | [ Upstream commit 50ab8ec74a153eb30db26529088bc57dd700b24c ] See http: //www.infradead.org/rpr.html X-Evolution-Source: 1451162204.2173.11@leira.trondhjem.org Content-Transfer-Encoding: 8bit Mime-Version: 1.0 We support OFFSET_MAX just fine, so don't round down below it. Also switch to using min_t to make the helper more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Fixes: 433c92379d9c ("NFS: Clean up nfs_size_to_loff_t()") Cc: stable@vger.kernel.org # 2.6.23+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* target: Fix remote-port TMR ABORT + se_cmd fabric stopNicholas Bellinger2016-03-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0f4a943168f31d29a1701908931acaba518b131a upstream. To address the bug where fabric driver level shutdown of se_cmd occurs at the same time when TMR CMD_T_ABORTED is happening resulting in a -1 ->cmd_kref, this patch adds a CMD_T_FABRIC_STOP bit that is used to determine when TMR + driver I_T nexus shutdown is happening concurrently. It changes target_sess_cmd_list_set_waiting() to obtain se_cmd->cmd_kref + set CMD_T_FABRIC_STOP, and drop local reference in target_wait_for_sess_cmds() and invoke extra target_put_sess_cmd() during Task Aborted Status (TAS) when necessary. Also, it adds a new target_wait_free_cmd() wrapper around transport_wait_for_tasks() for the special case within transport_generic_free_cmd() to set CMD_T_FABRIC_STOP, and is now aware of CMD_T_ABORTED + CMD_T_TAS status bits to know when an extra transport_put_cmd() during TAS is required. Note transport_generic_free_cmd() is expected to block on cmd->cmd_wait_comp in order to follow what iscsi-target expects during iscsi_conn context se_cmd shutdown. Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@daterainc.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* target: Fix race for SCF_COMPARE_AND_WRITE_POST checkingNicholas Bellinger2016-03-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 057085e522f8bf94c2e691a5b76880f68060f8ba ] This patch addresses a race + use after free where the first stage of COMPARE_AND_WRITE in compare_and_write_callback() is rescheduled after the backend sends the secondary WRITE, resulting in second stage compare_and_write_post() callback completing in target_complete_ok_work() before the first can return. Because current code depends on checking se_cmd->se_cmd_flags after return from se_cmd->transport_complete_callback(), this results in first stage having SCF_COMPARE_AND_WRITE_POST set, which incorrectly falls through into second stage CAW processing code, eventually triggering a NULL pointer dereference due to use after free. To address this bug, pass in a new *post_ret parameter into se_cmd->transport_complete_callback(), and depend upon this value instead of ->se_cmd_flags to determine when to return or fall through into ->queue_status() code for CAW. Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* ALSA: pcm: More kerneldoc updatesTakashi Iwai2016-03-041-7/+153
| | | | | | | | | [ Upstream commit 30b771cf8c3120c5c946811ecc5a9b87a34003a2 ] Add proper kerneldoc comments to the exported functions. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* tracing: Fix freak link error caused by branch tracerArnd Bergmann2016-03-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b33c8ff4431a343561e2319f17c14286f2aa52e2 ] In my randconfig tests, I came across a bug that involves several components: * gcc-4.9 through at least 5.3 * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files * CONFIG_PROFILE_ALL_BRANCHES overriding every if() * The optimized implementation of do_div() that tries to replace a library call with an division by multiplication * code in drivers/media/dvb-frontends/zl10353.c doing u32 adc_clock = 450560; /* 45.056 MHz */ if (state->config.adc_clock) adc_clock = state->config.adc_clock; do_div(value, adc_clock); In this case, gcc fails to determine whether the divisor in do_div() is __builtin_constant_p(). In particular, it concludes that __builtin_constant_p(adc_clock) is false, while __builtin_constant_p(!!adc_clock) is true. That in turn throws off the logic in do_div() that also uses __builtin_constant_p(), and instead of picking either the constant- optimized division, and the code in ilog2() that uses __builtin_constant_p() to figure out whether it knows the answer at compile time. The result is a link error from failing to find multiple symbols that should never have been called based on the __builtin_constant_p(): dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN' dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod' ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined! ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined! This patch avoids the problem by changing __trace_if() to check whether the condition is known at compile-time to be nonzero, rather than checking whether it is actually a constant. I see this one link error in roughly one out of 1600 randconfig builds on ARM, and the patch fixes all known instances. Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.de Acked-by: Nicolas Pitre <nico@linaro.org> Fixes: ab3c9c686e22 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y") Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* tracepoints: Do not trace when cpu is offlineSteven Rostedt (Red Hat)2016-03-041-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f37755490fe9bf76f6ba1d8c6591745d3574a6a6 ] The tracepoint infrastructure uses RCU sched protection to enable and disable tracepoints safely. There are some instances where tracepoints are used in infrastructure code (like kfree()) that get called after a CPU is going offline, and perhaps when it is coming back online but hasn't been registered yet. This can probuce the following warning: [ INFO: suspicious RCU usage. ] 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S ------------------------------- include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/8/0. stack backtrace: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34 Call Trace: [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable) [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170 [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440 [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100 [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150 [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140 [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310 [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60 [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40 [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560 [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360 [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14 This warning is not a false positive either. RCU is not protecting code that is being executed while the CPU is offline. Instead of playing "whack-a-mole(TM)" and adding conditional statements to the tracepoints we find that are used in this instance, simply add a cpu_online() test to the tracepoint code where the tracepoint will be ignored if the CPU is offline. Use of raw_smp_processor_id() is fine, as there should never be a case where the tracepoint code goes from running on a CPU that is online and suddenly gets migrated to a CPU that is offline. Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org Reported-by: Denis Kirjanov <kda@linux-powerpc.org> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints") Cc: stable@vger.kernel.org # v2.6.28+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* pty: make sure super_block is still valid in final /dev/tty closeHerton R. Krzesinski2016-03-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 1f55c718c290616889c04946864a13ef30f64929 ] Considering current pty code and multiple devpts instances, it's possible to umount a devpts file system while a program still has /dev/tty opened pointing to a previosuly closed pty pair in that instance. In the case all ptmx and pts/N files are closed, umount can be done. If the program closes /dev/tty after umount is done, devpts_kill_index will use now an invalid super_block, which was already destroyed in the umount operation after running ->kill_sb. This is another "use after free" type of issue, but now related to the allocated super_block instance. To avoid the problem (warning at ida_remove and potential crashes) for this specific case, I added two functions in devpts which grabs additional references to the super_block, which pty code now uses so it makes sure the super block structure is still valid until pty shutdown is done. I also moved the additional inode references to the same functions, which also covered similar case with inode being freed before /dev/tty final close/shutdown. Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Cc: stable@vger.kernel.org # 2.6.29+ Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* target: Remove first argument of target_{get,put}_sess_cmd()Bart Van Assche2016-03-021-2/+2
| | | | | | | | | | | | | | | | | [ Upstream commit afc16604c06414223478df3e42301ab630b9960a ] The first argument of these two functions is always identical to se_cmd->se_sess. Hence remove the first argument. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* cputime: Prevent 32bit overflow in time[val|spec]_to_cputime()zengtao2016-03-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 0f26922fe5dc5724b1adbbd54b21bad03590b4f3 ] The datatype __kernel_time_t is u32 on 32bit platform, so its subject to overflows in the timeval/timespec to cputime conversion. Currently the following functions are affected: 1. setitimer() 2. timer_create/timer_settime() 3. sys_clock_nanosleep This can happen on MIPS32 and ARM32 with "Full dynticks CPU time accounting" enabled, which is required for CONFIG_NO_HZ_FULL. Enforce u64 conversion to prevent the overflow. Fixes: 31c1fc818715 ("ARM: Kconfig: allow full nohz CPU accounting") Signed-off-by: zengtao <prime.zeng@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: <fweisbec@gmail.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1454384314-154784-1-git-send-email-prime.zeng@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* ipv6: update skb->csum when CE mark is propagatedEric Dumazet2016-02-151-3/+16
| | | | | | | | | | | | | | | | [ Upstream commit 34ae6a1aa0540f0f781dd265366036355fdc8930 ] When a tunnel decapsulates the outer header, it has to comply with RFC 6080 and eventually propagate CE mark into inner header. It turns out IP6_ECN_set_ce() does not correctly update skb->csum for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure" messages and stack traces. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* vxlan: fix test which detect duplicate vxlan ifaceNicolas Dichtel2016-02-151-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 07b9b37c227cb8d88d478b4a9c5634fee514ede1 ] When a vxlan interface is created, the driver checks that there is not another vxlan interface with the same properties. To do this, it checks the existing vxlan udp socket. Since commit 1c51a9159dde, the creation of the vxlan socket is done only when the interface is set up, thus it breaks that test. Example: $ ip l a vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0 $ ip l a vxlan11 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0 $ ip -br l | grep vxlan vxlan10 DOWN f2:55:1c:6a:fb:00 <BROADCAST,MULTICAST> vxlan11 DOWN 7a:cb:b9:38:59:0d <BROADCAST,MULTICAST> Instead of checking sockets, let's loop over the vxlan iface list. Fixes: 1c51a9159dde ("vxlan: fix race caused by dropping rtnl_unlock") Reported-by: Thomas Faivre <thomas.faivre@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* net: filter: make JITs zero A for SKF_AD_ALU_XOR_XRabin Vincent2016-02-151-0/+19
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 55795ef5469290f89f04e12e662ded604909e462 ] The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data instructions since it XORs A with X while all the others replace A with some loaded value. All the BPF JITs fail to clear A if this is used as the first instruction in a filter. This was found using american fuzzy lop. Add a helper to determine if A needs to be cleared given the first instruction in a filter, and use this in the JITs. Except for ARM, the rest have only been compile-tested. Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum") Signed-off-by: Rabin Vincent <rabin@rab.in> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* unix: properly account for FDs passed over unix socketswilly tarreau2016-02-151-0/+1
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 712f4aad406bb1ed67f3f98d04c044191f0ff593 ] It is possible for a process to allocate and accumulate far more FDs than the process' limit by sending them over a unix socket then closing them to keep the process' fd count low. This change addresses this problem by keeping track of the number of FDs in flight per user and preventing non-privileged processes from having more FDs in flight than their configured FD limit. Reported-by: socketpair@gmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Mitigates: CVE-2013-4312 (Linux 2.0+) Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
* radix-tree: fix oops after radix_tree_iter_retryKonstantin Khlebnikov2016-02-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 732042821cfa106b3c20b9780e4c60fee9d68900 ] Helper radix_tree_iter_retry() resets next_index to the current index. In following radix_tree_next_slot current chunk size becomes zero. This isn't checked and it tries to dereference null pointer in slot. Tagged iterator is fine because retry happens only at slot 0 where tag bitmask in iter->tags is filled with single bit. Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup") Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ohad Ben-Cohen <ohad@wizery.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>