summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* perf/x86/intel: Support Extended PEBS for Goldmont PlusKan Liang2018-07-252-7/+1
| | | | | | | | | | | | | | | | | | | | | Enable the extended PEBS for Goldmont Plus. There is no specific PEBS constrains for Goldmont Plus. Removing the pebs_constraints for Goldmont Plus. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/20180309021542.11374-4-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf/x86/intel/ds: Handle PEBS overflow for fixed countersKan Liang2018-07-253-11/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The pebs_drain() need to support fixed counters. The DS Save Area now include "counter reset value" fields for each fixed counters. Extend the related variables (e.g. mask, counters, error) to support fixed counters. There is no extended PEBS in PEBS v2 and earlier PEBS format. Only need to change the code for PEBS v3 and later PEBS format. Extend the pebs_event_reset[] logic to support new "counter reset value" fields. Increase the reserve space for fixed counters. Based-on-code-from: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/20180309021542.11374-3-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf/x86/intel: Support PEBS on fixed countersKan Liang2018-07-251-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Extended PEBS feature supports PEBS on fixed-function performance counters as well as all four general purpose counters. It has to change the order of PEBS and fixed counter enabling to make sure PEBS is enabled for the fixed counters. The change of the order doesn't impact the behavior of current code on other platforms which don't support extended PEBS. Because there is no dependency among those enable/disable functions. Don't enable IRQ generation (0x8) for MSR_ARCH_PERFMON_FIXED_CTR_CTRL. The PEBS ucode will handle the interrupt generation. Based-on-code-from: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/20180309021542.11374-2-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf/x86/intel: Introduce PMU flag for Extended PEBSKan Liang2018-07-252-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | The Extended PEBS feature, introduced in the Goldmont Plus microarchitecture, supports all events as "Extended PEBS". Introduce flag PMU_FL_PEBS_ALL to indicate the platforms which support extended PEBS. To support all events, it needs to support all constraints for PEBS. To avoid duplicating all the constraints in the PEBS table, making the PEBS code search the normal constraints too. Based-on-code-from: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/20180309021542.11374-1-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar2018-07-25193-966/+1113
|\ | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)Peter Zijlstra2018-07-252-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vince reported the perf_fuzzer giving various unwinder warnings and Josh reported: > Deja vu. Most of these are related to perf PEBS, similar to the > following issue: > > b8000586c90b ("perf/x86/intel: Cure bogus unwind from PEBS entries") > > This is basically the ORC version of that. setup_pebs_sample_data() is > assembling a franken-pt_regs which ORC isn't happy about. RIP is > inconsistent with some of the other registers (like RSP and RBP). And where the previous unwinder only needed BP,SP ORC also requires IP. But we cannot spoof IP because then the sample will get displaced, entirely negating the point of PEBS. So cure the whole thing differently by doing the unwind early; this does however require a means to communicate we did the unwind early. We (ab)use an unused sample_type bit for this, which we set on events that fill out the data->callchain before the normal perf_prepare_sample(). Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf/x86/amd/ibs: Don't access non-started eventThomas Gleixner2018-07-241-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Paul Menzel reported the following bug: > Enabling the undefined behavior sanitizer and building GNU/Linux 4.18-rc5+ > (with some unrelated commits) with GCC 8.1.0 from Debian Sid/unstable, the > warning below is shown. > > > [ 2.111913] > > ================================================================================ > > [ 2.111917] UBSAN: Undefined behaviour in arch/x86/events/amd/ibs.c:582:24 > > [ 2.111919] member access within null pointer of type 'struct perf_event' > > [ 2.111926] CPU: 0 PID: 144 Comm: udevadm Not tainted 4.18.0-rc5-00316-g4864b68cedf2 #104 > > [ 2.111928] Hardware name: ASROCK E350M1/E350M1, BIOS TIMELESS 01/01/1970 > > [ 2.111930] Call Trace: > > [ 2.111943] dump_stack+0x55/0x89 > > [ 2.111949] ubsan_epilogue+0xb/0x33 > > [ 2.111953] handle_null_ptr_deref+0x7f/0x90 > > [ 2.111958] __ubsan_handle_type_mismatch_v1+0x55/0x60 > > [ 2.111964] perf_ibs_handle_irq+0x596/0x620 The code dereferences event before checking the STARTED bit. Patch below should cure the issue. The warning should not trigger, if I analyzed the thing correctly. (And Paul's testing confirms this.) Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Menzel <pmenzel+linux-x86@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1807200958390.1580@nanos.tec.linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * alpha: fix osf_wait4() breakageAl Viro2018-07-221-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | kernel_wait4() expects a userland address for status - it's only rusage that goes as a kernel one (and needs a copyout afterwards) [ Also, fix the prototype of kernel_wait4() to have that __user annotation - Linus ] Fixes: 92ebce5ac55d ("osf_wait4: switch to kernel_wait4()") Cc: stable@kernel.org # v4.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * Merge tag 'armsoc-fixes' of ↵Linus Torvalds2018-07-212-7/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: - Fix interrupt type on ethernet switch for i.MX-based RDU2 - GPC on i.MX exposed too large a register window which resulted in userspace being able to crash the machine. - Fixup of bad merge resolution moving GPIO DT nodes under pinctrl on droid4. * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch soc: imx: gpc: restrict register range for regmap access ARM: dts: omap4-droid4: fix dts w.r.t. pwm
| | * Merge tag 'imx-fixes-4.18-4' of ↵Olof Johansson2018-07-201-1/+1
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes i.MX fixes for 4.18, round 4: - A fix for i.MX6 RDU2 board on the wrong IRQ type of Marvell switch, which might result in a race condition in the interrupt handler and cause the OS to miss all future events. * tag 'imx-fixes-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch Signed-off-by: Olof Johansson <olof@lixom.net>
| | | * ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switchUwe Kleine-König2018-07-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Marvell switches report their interrupts in a level sensitive way. When using edge sensitive detection a race condition in the interrupt handler of the swich might result in the OS to miss all future events which might make the switch non-functional. The problem is that both mv88e6xxx_g2_irq_thread_fn() and mv88e6xxx_g1_irq_thread_work() sample the irq cause register (MV88E6XXX_G2_INT_SRC and MV88E6XXX_G1_STS respectively) once and then handle the observed sources. If after sampling but before all observed irq sources are handled a new irq source gets active this is not noticed by the handler which returns unsuspecting, but the interrupt line stays active which prevents the edge detector to kick in. All device trees but imx6qdl-zii-rdu2 get this right (most of them by not specifying an interrupt parent). So fix imx6qdl-zii-rdu2 accordingly. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Fixes: f64992d1a916 ("ARM: dts: imx6: RDU2: Add Switch interrupts") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
| | * | Merge tag 'omap-for-v4.18/fixes-rc5-signed' of ↵Olof Johansson2018-07-191-6/+3
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes One omap dts mismerge fix The dts patch for droid4 PWM vibrator has added gpio6 entries to the wrong node. Let's fix it with a note that there seems to be also other GPIO PWM issues to fix still to get the PWM vibrator working. So this can wait for v4.19 merge cycle if necessary. * tag 'omap-for-v4.18/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: omap4-droid4: fix dts w.r.t. pwm Signed-off-by: Olof Johansson <olof@lixom.net>
| | | * | ARM: dts: omap4-droid4: fix dts w.r.t. pwmPavel Machek2018-07-161-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pwm node should not be under gpio6 node in the device tree. This fixes detection of the pwm on Droid 4. Fixes: 6d7bdd328da4 ("ARM: dts: omap4-droid4: update touchscreen") Signed-off-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> [tony@atomide.com: added fixes tag] Signed-off-by: Tony Lindgren <tony@atomide.com>
| * | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2018-07-211-3/+0
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Ingo Molnar: "A single fix for a MCE-polling regression, which prevented the disabling of polling" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MCE: Remove min interval polling limitation
| | * | | | x86/MCE: Remove min interval polling limitationDewet Thibaut2018-07-171-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b3b7c4795c ("x86/MCE: Serialize sysfs changes") introduced a min interval limitation when setting the check interval for polled MCEs. However, the logic is that 0 disables polling for corrected MCEs, see Documentation/x86/x86_64/machinecheck. The limitation prevents disabling. Remove this limitation and allow the value 0 to disable polling again. Fixes: b3b7c4795c ("x86/MCE: Serialize sysfs changes") Signed-off-by: Dewet Thibaut <thibaut.dewet@nokia.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180716084927.24869-1-alexander.sverdlin@nokia.com
| * | | | | Merge branch 'x86-pti-urgent-for-linus' of ↵Linus Torvalds2018-07-213-9/+10
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti fixes from Ingo Molnar: "An APM fix, and a BTS hardware-tracing fix related to PTI changes" * 'x86-pti-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apm: Don't access __preempt_count with zeroed fs x86/events/intel/ds: Fix bts_interrupt_threshold alignment
| | * | | | | x86/apm: Don't access __preempt_count with zeroed fsVille Syrjälä2018-07-162-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | APM_DO_POP_SEGS does not restore fs/gs which were zeroed by APM_DO_ZERO_SEGS. Trying to access __preempt_count with zeroed fs doesn't really work. Move the ibrs call outside the APM_DO_SAVE_SEGS/APM_DO_RESTORE_SEGS invocations so that fs is actually restored before calling preempt_enable(). Fixes the following sort of oopses: [ 0.313581] general protection fault: 0000 [#1] PREEMPT SMP [ 0.313803] Modules linked in: [ 0.314040] CPU: 0 PID: 268 Comm: kapmd Not tainted 4.16.0-rc1-triton-bisect-00090-gdd84441a7971 #19 [ 0.316161] EIP: __apm_bios_call_simple+0xc8/0x170 [ 0.316161] EFLAGS: 00210016 CPU: 0 [ 0.316161] EAX: 00000102 EBX: 00000000 ECX: 00000102 EDX: 00000000 [ 0.316161] ESI: 0000530e EDI: dea95f64 EBP: dea95f18 ESP: dea95ef0 [ 0.316161] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 [ 0.316161] CR0: 80050033 CR2: 00000000 CR3: 015d3000 CR4: 000006d0 [ 0.316161] Call Trace: [ 0.316161] ? cpumask_weight.constprop.15+0x20/0x20 [ 0.316161] on_cpu0+0x44/0x70 [ 0.316161] apm+0x54e/0x720 [ 0.316161] ? __switch_to_asm+0x26/0x40 [ 0.316161] ? __schedule+0x17d/0x590 [ 0.316161] kthread+0xc0/0xf0 [ 0.316161] ? proc_apm_show+0x150/0x150 [ 0.316161] ? kthread_create_worker_on_cpu+0x20/0x20 [ 0.316161] ret_from_fork+0x2e/0x38 [ 0.316161] Code: da 8e c2 8e e2 8e ea 57 55 2e ff 1d e0 bb 5d b1 0f 92 c3 5d 5f 07 1f 89 47 0c 90 8d b4 26 00 00 00 00 90 8d b4 26 00 00 00 00 90 <64> ff 0d 84 16 5c b1 74 7f 8b 45 dc 8e e0 8b 45 d8 8e e8 8b 45 [ 0.316161] EIP: __apm_bios_call_simple+0xc8/0x170 SS:ESP: 0068:dea95ef0 [ 0.316161] ---[ end trace 656253db2deaa12c ]--- Fixes: dd84441a7971 ("x86/speculation: Use IBRS if available before calling into firmware") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180709133534.5963-1-ville.syrjala@linux.intel.com
| | * | | | | x86/events/intel/ds: Fix bts_interrupt_threshold alignmentHugh Dickins2018-07-151-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Markus reported that BTS is sporadically missing the tail of the trace in the perf_event data buffer: [decode error (1): instruction overflow] shown in GDB; and bisected it to the conversion of debug_store to PTI. A little "optimization" crept into alloc_bts_buffer(), which mistakenly placed bts_interrupt_threshold away from the 24-byte record boundary. Intel SDM Vol 3B 17.4.9 says "This address must point to an offset from the BTS buffer base that is a multiple of the BTS record size." Revert "max" from a byte count to a record count, to calculate the bts_interrupt_threshold correctly: which turns out to fix problem seen. Fixes: c1961a4631da ("x86/events/intel/ds: Map debug buffers in cpu_entry_area") Reported-and-tested-by: Markus T Metzger <markus.t.metzger@intel.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: stable@vger.kernel.org # v4.14+ Link: https://lkml.kernel.org/r/alpine.LSU.2.11.1807141248290.1614@eggly.anvils
| * | | | | | Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2018-07-212-2/+7
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core kernel fixes from Ingo Molnar: "This is mostly the copy_to_user_mcsafe() related fixes from Dan Williams, and an ORC fix for Clang" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/memcpy_mcsafe: Fix copy_to_user_mcsafe() exception handling lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe() lib/iov_iter: Document _copy_to_iter_flushcache() lib/iov_iter: Document _copy_to_iter_mcsafe() objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang
| | * | | | | | x86/asm/memcpy_mcsafe: Fix copy_to_user_mcsafe() exception handlingDan Williams2018-07-162-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All copy_to_user() implementations need to be prepared to handle faults accessing userspace. The __memcpy_mcsafe() implementation handles both mmu-faults on the user destination and machine-check-exceptions on the source buffer. However, the memcpy_mcsafe() wrapper may silently fallback to memcpy() depending on build options and cpu-capabilities. Force copy_to_user_mcsafe() to always use __memcpy_mcsafe() when available, and otherwise disable all of the copy_to_user_mcsafe() infrastructure when __memcpy_mcsafe() is not available, i.e. CONFIG_X86_MCE=n. This fixes crashes of the form: run fstests generic/323 at 2018-07-02 12:46:23 BUG: unable to handle kernel paging request at 00007f0d50001000 RIP: 0010:__memcpy+0x12/0x20 [..] Call Trace: copyout_mcsafe+0x3a/0x50 _copy_to_iter_mcsafe+0xa1/0x4a0 ? dax_alive+0x30/0x50 dax_iomap_actor+0x1f9/0x280 ? dax_iomap_rw+0x100/0x100 iomap_apply+0xba/0x130 ? dax_iomap_rw+0x100/0x100 dax_iomap_rw+0x95/0x100 ? dax_iomap_rw+0x100/0x100 xfs_file_dax_read+0x7b/0x1d0 [xfs] xfs_file_read_iter+0xa7/0xc0 [xfs] aio_read+0x11c/0x1a0 Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Fixes: 8780356ef630 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()") Link: http://lkml.kernel.org/r/153108277790.37979.1486841789275803399.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | Merge tag 'powerpc-4.18-4' of ↵Linus Torvalds2018-07-217-9/+47
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Two regression fixes, one for xmon disassembly formatting and the other to fix the E500 build. Two commits to fix a potential security issue in the VFIO code under obscure circumstances. And finally a fix to the Power9 idle code to restore SPRG3, which is user visible and used for sched_getcpu(). Thanks to: Alexey Kardashevskiy, David Gibson. Gautham R. Shenoy, James Clarke" * tag 'powerpc-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle) powerpc/Makefile: Assemble with -me500 when building for E500 KVM: PPC: Check if IOMMU page is contained in the pinned physical page vfio/spapr: Use IOMMU pageshift rather than pagesize powerpc/xmon: Fix disassembly since printf changes
| | * | | | | | | powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle)Gautham R. Shenoy2018-07-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror SPRN_USPRG3 are used as userspace VDSO write and read registers respectively. SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not restored. As a result, any read from SPRN_USPRG3 returns zero on an exit from stop4 (Power9 only) and above. Thus in this situation, on POWER9, any call from sched_getcpu() always returns zero, as on powerpc, we call __kernel_getcpu() which relies upon SPRN_USPRG3 to report the CPU and NUMA node information. Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state with the sprg_vdso value that is cached in PACA. Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle") Cc: stable@vger.kernel.org # v4.14+ Reported-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| | * | | | | | | powerpc/Makefile: Assemble with -me500 when building for E500James Clarke2018-07-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the assembly files use instructions specific to BookE or E500, which are rejected with the now-default -mcpu=powerpc, so we must pass -me500 to the assembler just as we pass -me200 for E200. Fixes: 4bf4f42a2feb ("powerpc/kbuild: Set default generic machine type for 32-bit compile") Signed-off-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| | * | | | | | | KVM: PPC: Check if IOMMU page is contained in the pinned physical pageAlexey Kardashevskiy2018-07-184-7/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A VM which has: - a DMA capable device passed through to it (eg. network card); - running a malicious kernel that ignores H_PUT_TCE failure; - capability of using IOMMU pages bigger that physical pages can create an IOMMU mapping that exposes (for example) 16MB of the host physical memory to the device when only 64K was allocated to the VM. The remaining 16MB - 64K will be some other content of host memory, possibly including pages of the VM, but also pages of host kernel memory, host programs or other VMs. The attacking VM does not control the location of the page it can map, and is only allowed to map as many pages as it has pages of RAM. We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that an IOMMU page is contained in the physical page so the PCI hardware won't get access to unassigned host memory; however this check is missing in the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and did not hit this yet as the very first time when the mapping happens we do not have tbl::it_userspace allocated yet and fall back to the userspace which in turn calls VFIO IOMMU driver, this fails and the guest does not retry, This stores the smallest preregistered page size in the preregistered region descriptor and changes the mm_iommu_xxx API to check this against the IOMMU page size. This calculates maximum page size as a minimum of the natural region alignment and compound page size. For the page shift this uses the shift returned by find_linux_pte() which indicates how the page is mapped to the current userspace - if the page is huge and this is not a zero, then it is a leaf pte and the page is mapped within the range. Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| | * | | | | | | powerpc/xmon: Fix disassembly since printf changesMichael Ellerman2018-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent change to add printf annotations to xmon inadvertently made the disassembly output ugly, eg: c00000002001e058 7ee00026 mfcr r23 c00000002001e05c fffffffffae101a0 std r23,416(r1) c00000002001e060 fffffffff8230000 std r1,0(r3) The problem being that negative 32-bit values are being displayed in full 64-bits. The printf conversion was actually correct, we are passing unsigned long so it should use "lx". But powerpc instructions are only 4 bytes and the code only reads 4 bytes, so inst should really just be unsigned int, and that also fixes the printing to look the way we want: c00000002001e058 7ee00026 mfcr r23 c00000002001e05c fae101a0 std r23,416(r1) c00000002001e060 f8230000 std r1,0(r3) Fixes: e70d8f55268b ("powerpc/xmon: Add __printf annotation to xmon_printf()") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | | | | | mm: make vm_area_alloc() initialize core fieldsLinus Torvalds2018-07-212-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like vm_area_dup(), it initializes the anon_vma_chain head, and the basic mm pointer. The rest of the fields end up being different for different users, although the plan is to also initialize the 'vm_ops' field to a dummy entry. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | mm: use helper functions for allocating and freeing vm_area structsLinus Torvalds2018-07-212-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vm_area_struct is one of the most fundamental memory management objects, but the management of it is entirely open-coded evertwhere, ranging from allocation and freeing (using kmem_cache_[z]alloc and kmem_cache_free) to initializing all the fields. We want to unify this in order to end up having some unified initialization of the vmas, and the first step to this is to at least have basic allocation functions. Right now those functions are literally just wrappers around the kmem_cache_*() calls. This is a purely mechanical conversion: # new vma: kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc() # copy old vma kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old) # free vma kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma) to the point where the old vma passed in to the vm_area_dup() function isn't even used yet (because I've left all the old manual initialization alone). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | Merge tag 'arc-4.18-rc6' of ↵Linus Torvalds2018-07-2024-47/+112
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "ARC is back after radio silence in 4.17: - Fix CONFIG_SWAP [Alexey] - Robustify cmpxchg emulation for systems w/o atomics [Alexey / PeterZ] - Allow mprotext(PROT_EXEC) for stack mappings [Vineet] - HSDK platform enable PCIe, APG GPIO [Gustavo] - miscll other fixes, config updates etc" * tag 'arc-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARCv2: [plat-hsdk]: Save accl reg pair by default ARC: mm: allow mprotect to make stack mappings executable ARC: Fix CONFIG_SWAP ARC: [arcompact] entry.S: minor code movement ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigs ARC: configs: remove no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCES ARC: Improve cmpxchg syscall implementation ARC: [plat-hsdk]: Configure APB GPIO controller on ARC HSDK platform ARC: [plat-hsdk] Add PCIe support ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMP ARC: Explicitly add -mmedium-calls to CFLAGS
| | * | | | | | | | ARCv2: [plat-hsdk]: Save accl reg pair by defaultVineet Gupta2018-07-192-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This manifsted as strace segfaulting on HSDK because gcc was targetting the accumulator registers as GPRs, which kernek was not saving/restoring by default. Cc: stable@vger.kernel.org #4.14+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: mm: allow mprotect to make stack mappings executableVineet Gupta2018-07-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mprotect(EXEC) was failing for stack mappings as default vm flags was missing MAYEXEC. This was triggered by glibc test suite nptl/tst-execstack testcase What is surprising is that despite running LTP for years on, we didn't catch this issue as it lacks a directed test case. gcc dejagnu tests with nested functions also requiring exec stack work fine though because they rely on the GNU_STACK segment spit out by compiler and handled in kernel elf loader. This glibc case is different as the stack is non exec to begin with and a dlopen of shared lib with GNU_STACK segment triggers the exec stack proceedings using a mprotect(PROT_EXEC) which was broken. CC: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: Fix CONFIG_SWAPAlexey Brodkin2018-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | swap was broken on ARC due to silly copy-paste issue. We encode offset from swapcache page in __swp_entry() as (off << 13) but were not decoding back in __swp_offset() as (off >> 13) - it was still (off << 13). This finally fixes swap usage on ARC. | # mkswap /dev/sda2 | | # swapon -a -e /dev/sda2 | Adding 500728k swap on /dev/sda2. Priority:-2 extents:1 across:500728k | | # free | total used free shared buffers cached | Mem: 765104 13456 751648 4736 8 4736 | -/+ buffers/cache: 8712 756392 | Swap: 500728 0 500728 Cc: stable@vger.kernel.org Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: [arcompact] entry.S: minor code movementVineet Gupta2018-07-092-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a non functional code changw, which moves r25 restore from macro into the caller of macro Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigsAlexey Brodkin2018-07-0912-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to have pre-set CONFIG_INITRAMFS_SOURCE with local path to intramfs in ARC defconfigs. This was quite convenient for in-house development but not that convenient for newcomers who obviusly don't have folders like "arc_initramfs" next to the Linux source tree. Which leads to quite surprising failure of defconfig building: ------------------------------->8----------------------------- ../scripts/gen_initramfs_list.sh: Cannot open '../../arc_initramfs_hs/' ../usr/Makefile:57: recipe for target 'usr/initramfs_data.cpio.gz' failed make[2]: *** [usr/initramfs_data.cpio.gz] Error 1 ------------------------------->8----------------------------- So now when more and more people start to deal with our defconfigs let's make their life easier with removal of CONFIG_INITRAMFS_SOURCE. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Kevin Hilman <khilman@baylibre.com> Cc: stable@vger.kernel.org Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: configs: remove no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCESAnders Roxell2018-07-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit eedf265aa003 ("devpts: Make each mount of devpts an independent filesystem.") CONFIG_DEVPTS_MULTIPLE_INSTANCES isn't needed in the defconfig anymore. Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: Improve cmpxchg syscall implementationPeter Zijlstra2018-07-091-11/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is used in configs lacking hardware atomics to emulate atomic r-m-w for user space, implemented by disabling preemption in kernel. However there are issues in current implementation: 1. Process not terminated if invalid user pointer passed: i.e. __get_user() failed. 2. The reason for this patch was __put_user() failure not being handled either, specifically for the COW break scenario. The zero page is initially wired up and read from __get_user() succeeds. A subsequent write by __put_user() induces a Protection Violation, but COW can't finish as Linux page fault handler is disabled due to preempt disable. And what's worse is we silently return the stale value to user space. Fix this specific case by re-enabling preemption and explicitly fixing up the fault and retrying the whole sequence over. Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: linux-arch@vger.kernel.org Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [vgupta: rewrote the changelog]
| | * | | | | | | | ARC: [plat-hsdk]: Configure APB GPIO controller on ARC HSDK platformGustavo Pimentel2018-07-091-0/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of HSDK we have intermediate INTC in for of DW APB GPIO controller which is used as a de-bounce logic for interrupt wires that come from outside the board. We cannot use existing "irq-dw-apb-ictl" driver here because all input lines are routed to corresponding output lines but not muxed into one line (this is configured in RTL and we cannot change this in software). But even if we add such a feature to "irq-dw-apb-ictl" driver that won't benefit us as higher-level INTC (in case of HSDK it is IDU) anyways has per-input control so adding fully-controller intermediate INTC will only bring some overhead on interrupt processing but no other benefits. Thus we just do one-time configuration of DW APB GPIO controller and forget about it. Based on implementation available on arch/arc/plat-axs10x/axs10x.c file. Acked-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: [plat-hsdk] Add PCIe supportGustavo Pimentel2018-06-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add PCI support to the ARC HSDK platform allowing to use the generic PCI setup functions. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Acked-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| | * | | | | | | | ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMPAlexey Brodkin2018-06-202-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | machine_desc->init_per_cpu() hook is supposed to be per cpu initialization and would seem to apply equally to UP and/or SMP. Infact the comment in header file seems to suggest it works for UP too, which was not the case and this patch. This enables !CONFIG_SMP build for platforms such as hsdk. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [vgupta: trimmeed changelog]
| | * | | | | | | | ARC: Explicitly add -mmedium-calls to CFLAGSAlexey Brodkin2018-06-131-14/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC built for arc*-*-linux has "-mmedium-calls" implicitly enabled by default thus we don't see any problems during Linux kernel compilation. ----------------------------->8------------------------ arc-linux-gcc -mcpu=arc700 -Q --help=target | grep calls -mlong-calls [disabled] -mmedium-calls [enabled] ----------------------------->8------------------------ But if we try to use so-called Elf32 toolchain with GCC configured for arc*-*-elf* then we'd see the following failure: ----------------------------->8------------------------ init/do_mounts.o: In function 'init_rootfs': do_mounts.c:(.init.text+0x108): relocation truncated to fit: R_ARC_S21W_PCREL against symbol 'unregister_filesystem' defined in .text section in fs/filesystems.o arc-elf32-ld: final link failed: Symbol needs debug section which does not exist make: *** [vmlinux] Error 1 ----------------------------->8------------------------ That happens because neither "-mmedium-calls" nor "-mlong-calls" are enabled in Elf32 GCC: ----------------------------->8------------------------ arc-elf32-gcc -mcpu=arc700 -Q --help=target | grep calls -mlong-calls [disabled] -mmedium-calls [disabled] ----------------------------->8------------------------ Now to make it possible to use Elf32 toolchain for building Linux kernel we're explicitly add "-mmedium-calls" to CFLAGS. And since we add "-mmedium-calls" to the global CFLAGS there's no point in having per-file copies thus removing them. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| * | | | | | | | | Merge tag 'nds32-for-linus-4.18' of ↵Linus Torvalds2018-07-206-70/+58
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux Pull nds32 updates from Greentime Hu: "Bug fixes and build ixes for nds32" * tag 'nds32-for-linus-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux: nds32: fix build error "relocation truncated to fit: R_NDS32_25_PCREL_RELA" when make allyesconfig nds32: To simplify the implementation of update_mmu_cache() nds32: Fix the dts pointer is not passed correctly issue. nds32: To implement these icache invalidation APIs since nds32 cores don't snoop data cache. This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst and it says: nds32: Fix build error caused by configuration flag rename nds32: define __NDS32_E[BL]__ for sparse
| | * | | | | | | | | nds32: fix build error "relocation truncated to fit: R_NDS32_25_PCREL_RELA" whenGreentime Hu2018-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | make allyesconfig It will cause a linking error because the jump assembly code were using j, however it may be not enough to jump to the destination. We have to change it with pseudo instruction b. In that way, assembler will generate a set of safe assembly codes to make sure the destination is able to jump. Toolchain: https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/x86_64-gcc-8.1.0-nolibc-nds32le-linux.tar.gz Command: PATH=/NOBACKUP/atcsqa06/greentime/os/toolchain-kernel.org/gcc-8.1.0-nolibc/nds32le-linux/bin:$PATH ARCH=nds32 CROSS_COMPILE=nds32le-linux- make allyesconfig PATH=/NOBACKUP/atcsqa06/greentime/os/toolchain-kernel.org/gcc-8.1.0-nolibc/nds32le-linux/bin:$PATH ARCH=nds32 CROSS_COMPILE=nds32le-linux- make -j8 MODPOST vmlinux.o WARNING: EXPORT symbol "copy_page" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "clear_page" [vmlinux] version generation failed, symbol will not be versioned. nds32le-linux-ld: kernel/futex.o:(.fixup+0x4): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text' nds32le-linux-ld: kernel/futex.o:(.fixup+0xaa): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text' nds32le-linux-ld: kernel/futex.o:(.fixup+0xb0): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text' nds32le-linux-ld: kernel/futex.o:(.fixup+0xb6): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text' nds32le-linux-ld: kernel/futex.o:(.fixup+0xbc): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text' nds32le-linux-ld: kernel/futex.o:(.fixup+0xc4): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text' Makefile:1010: recipe for target 'vmlinux' failed make: *** [vmlinux] Error 1 Signed-off-by: Greentime Hu <greentime@andestech.com>
| | * | | | | | | | | nds32: To simplify the implementation of update_mmu_cache()Greentime Hu2018-07-031-39/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The checking code is done in kmap_atomic() so that we don't need to check it in update_mmu_cache() again. There is no need to implement it for cache aliasing or cache non-aliasing versions. We can just implement one version for both. Signed-off-by: Greentime Hu <greentime@andestech.com>
| | * | | | | | | | | nds32: Fix the dts pointer is not passed correctly issue.Greentime Hu2018-07-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We found that the original implementation will only use the built-in dtb pointer instead of the pointer pass from bootloader. This bug is fixed by this patch. Signed-off-by: Greentime Hu <greentime@andestech.com>
| | * | | | | | | | | nds32: To implement these icache invalidation APIs since nds32 cores don't snoopGreentime Hu2018-07-032-23/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | data cache. This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst and it says: "Any necessary cache flushing or other coherency operations that need to occur should happen here. If the processor's instruction cache does not snoop cpu stores, it is very likely that you will need to flush the instruction cache for copy_to_user_page()." "If the icache does not snoop stores then this routine(flush_icache_range) will need to flush it." Signed-off-by: Guo Ren <ren_guo@c-sky.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
| | * | | | | | | | | nds32: Fix build error caused by configuration flag renameGuenter Roeck2018-06-191-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix build error on nds32 due to the merge of commit e3d5980568f ("lib: Rename compiler intrinsic selects to GENERIC_LIB_*") during the 4.18 merge window which renames Kconfig symbols. This had raced with commit aeaa7af744fa ("nds32: lib: To use generic lib instead of libgcc to prevent the symbol undefined issue.") merged late in the 4.17 cycle, which added selects to nds32 using the original Kconfig symbol names. When they came together in merge commit 763f96944c95 ("Merge tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux") this resulted in the following build errors: nds32le-linux-ld: kernel/time/timekeeping.o: in function `timekeeping_init': timekeeping.c:(.init.text+0x140): undefined reference to `__ashldi3' nds32le-linux-ld: timekeeping.c:(.init.text+0x144): undefined reference to `__ashldi3' nds32le-linux-ld: timekeeping.c:(.init.text+0x17e): undefined reference to `__lshrdi3' nds32le-linux-ld: timekeeping.c:(.init.text+0x182): undefined reference to `__lshrdi3' nds32le-linux-ld: drivers/clocksource/mmio.o: in function `clocksource_mmio_init': mmio.c:(.init.text+0x54): undefined reference to `__lshrdi3' nds32le-linux-ld: mmio.c:(.init.text+0x58): undefined reference to `__lshrdi3' Rename all 6 selects in nds32 and adjust the ordering accordingly to be alphabetical. Fixes: 763f96944c95 ("Merge tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux") Signed-off-by: Guenter Roeck <linux@roeck-us.net> [jhogan@kernel.org: Rename all 6 symbols, sort, update commit message] Signed-off-by: James Hogan <jhogan@kernel.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Matt Redfearn <matt.redfearn@mips.com> Cc: Palmer Dabbelt <palmer@sifive.com> Acked-by: Greentime Hu <greentime@andestech.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
| | * | | | | | | | | nds32: define __NDS32_E[BL]__ for sparseLuc Van Oostenryck2018-06-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nds32 depends on the macros '__NDS32_E[BL]__' to correctly select or define endian-specific macros, structures or pieces of code. These macros are predefined by the compiler but sparse knows nothing about them and thus may pre-process files differently from what GCC would. Fix this by adding '-D__NDS32_E[BL]__' to CHECKFLAGS. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Acked-by: Greentime Hu <greentime@andestech.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
| * | | | | | | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2018-07-184-27/+52
| |\ \ \ \ \ \ \ \ \ \ | | |_|_|_|_|_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: "Miscellaneous bugfixes, plus a small patchlet related to Spectre v2" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvmclock: fix TSC calibration for nested guests KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel. x86/kvmclock: set pvti_cpu0_va after enabling kvmclock x86/kvm/Kconfig: Ensure CRYPTO_DEV_CCP_DD state at minimum matches KVM_AMD kvm: nVMX: Restore exit qual for VM-entry failure due to MSR loading x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
| | * | | | | | | | | kvmclock: fix TSC calibration for nested guestsPeng Hao2018-07-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inside a nested guest, access to hardware can be slow enough that tsc_read_refs always return ULLONG_MAX, causing tsc_refine_calibration_work to be called periodically and the nested guest to spend a lot of time reading the ACPI timer. However, if the TSC frequency is available from the pvclock page, we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the recalibration. 'refine' operation. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> [Commit message rewritten. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | | | | | | | KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabledLiran Alon2018-07-181-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When eVMCS is enabled, all VMCS allocated to be used by KVM are marked with revision_id of KVM_EVMCS_VERSION instead of revision_id reported by MSR_IA32_VMX_BASIC. However, even though not explictly documented by TLFS, VMXArea passed as VMXON argument should still be marked with revision_id reported by physical CPU. This issue was found by the following setup: * L0 = KVM which expose eVMCS to it's L1 guest. * L1 = KVM which consume eVMCS reported by L0. This setup caused the following to occur: 1) L1 execute hardware_enable(). 2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON. 3) L0 intercept L1 VMXON and execute handle_vmon() which notes vmxarea->revision_id != VMCS12_REVISION and therefore fails with nested_vmx_failInvalid() which sets RFLAGS.CF. 4) L1 kvm_cpu_vmxon() don't check RFLAGS.CF for failure and therefore hardware_enable() continues as usual. 5) L1 hardware_enable() then calls ept_sync_global() which executes INVEPT. 6) L0 intercept INVEPT and execute handle_invept() which notes !vmx->nested.vmxon and thus raise a #UD to L1. 7) Raised #UD caused L1 to panic. Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Cc: stable@vger.kernel.org Fixes: 773e8a0425c923bc02668a2d6534a5ef5a43cc69 Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | | | | | | | x86/kvmclock: set pvti_cpu0_va after enabling kvmclockRadim Krčmář2018-07-151-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pvti_cpu0_va is the address of shared kvmclock data structure. pvti_cpu0_va is currently kept unset (1) on 32 bit systems, (2) when kvmclock vsyscall is disabled, and (3) if kvmclock is not stable. This poses a problem, because kvm_ptp needs pvti_cpu0_va, but (1) can work on 32 bit, (2) has little relation to the vsyscall, and (3) does not need stable kvmclock (although kvmclock won't be used for system clock if it's not stable, so kvm_ptp is pointless in that case). Expose pvti_cpu0_va whenever kvmclock is enabled to allow all users to work with it. This fixes a regression found on Gentoo: https://bugs.gentoo.org/658544. Fixes: 9f08890ab906 ("x86/pvclock: add setter for pvclock_pvti_cpu0_va") Cc: stable@vger.kernel.org Reported-by: Andreas Steinmetz <ast@domdv.de> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>