| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull vfs fixes from Al Viro:
"Assorted fixes all over the place"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aio: fix io_destroy(2) vs. lookup_ioctx() race
ext2: fix a block leak
nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed
cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed
unfuck sysfs_mount()
kernfs: deal with kernfs_fill_super() failures
cramfs: Fix IS_ENABLED typo
befs_lookup(): use d_splice_alias()
affs_lookup: switch to d_splice_alias()
affs_lookup(): close a race with affs_remove_link()
fix breakage caused by d_find_alias() semantics change
fs: don't scan the inode cache before SB_BORN is set
do d_instantiate/unlock_new_inode combinations safely
iov_iter: fix memory leak in pipe_get_pages_alloc()
iov_iter: fix return type of __pipe_get_pages()
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
kill_ioctx() used to have an explicit RCU delay between removing the
reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
At some point that delay had been removed, on the theory that
percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was
the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
by lookup_ioctx(). As the result, we could get ctx freed right under
lookup_ioctx(). Tejun has fixed that in a6d7cff472e ("fs/aio: Add explicit
RCU grace period when freeing kioctx"); however, that fix is not enough.
Suppose io_destroy() from one thread races with e.g. io_setup() from another;
CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the
refcount, getting it to 0 and triggering a call of free_ioctx_users(),
which proceeds to drop the secondary refcount and once that reaches zero
calls free_ioctx_reqs(). That does
INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
queue_rcu_work(system_wq, &ctx->free_rwork);
and schedules freeing the whole thing after RCU delay.
In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the
refcount from 0 to 1 and returned the reference to io_setup().
Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
freed until after percpu_ref_get(). Sure, we'd increment the counter before
ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to
stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it
has grabbed the reference, ctx is *NOT* going away until it gets around to
dropping that reference.
The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss.
It's not costlier than what we currently do in normal case, it's safe to
call since freeing *is* delayed and it closes the race window - either
lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
the object in question at all.
Cc: stable@kernel.org
Fixes: a6d7cff472e "fs/aio: Add explicit RCU grace period when freeing kioctx"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
open file, unlink it, then use ioctl(2) to make it immutable or
append only. Now close it and watch the blocks *not* freed...
Immutable/append-only checks belong in ->setattr().
Note: the bug is old and backport to anything prior to 737f2e93b972
("ext2: convert to use the new truncate convention") will need
these checks lifted into ext2_setattr().
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
That can (and does, on some filesystems) happen - ->mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.
Some vfs_mkdir() callers forget about that possibility...
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
That can (and does, on some filesystems) happen - ->mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.
Some vfs_mkdir() callers forget about that possibility...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
new_sb is left uninitialized in case of early failures in kernfs_mount_ns(),
and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb
is not a solution - IS_ERR(root) is true in some cases when new_sb is true.
Make sure new_sb is initialized (and matches the reality) in all cases and
fix the condition for dropping kobj reference - we want it done precisely
in those situations where the reference has not been transferred into a new
super_block instance.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| | |
make sure that info->node is initialized early, so that kernfs_kill_sb()
can list_del() it safely.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There's an extra C here...
Fixes: 99c18ce580c6 ("cramfs: direct memory access support")
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
RTFS(Documentation/filesystems/nfs/Exporting) if you try to make
something exportable.
Fixes: ac632f5b6301 "befs: add NFS export support"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Making something exportable takes more than providing ->s_export_ops.
In particular, ->lookup() *MUST* use d_splice_alias() instead of
d_add().
Reading Documentation/filesystems/nfs/Exporting would've been a good idea;
as it is, exporting AFFS is badly (and exploitably) broken.
Partially-Fixes: ed4433d72394 "fs/affs: make affs exportable"
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
we unlock the directory hash too early - if we are looking at secondary
link and primary (in another directory) gets removed just as we unlock,
we could have the old primary moved in place of the secondary, leaving
us to look into freed entry (and leaving our dentry with ->d_fsdata
pointing to a freed entry).
Cc: stable@vger.kernel.org # 2.4.4+
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
"VFS: don't keep disconnected dentries on d_anon" had a non-trivial
side-effect - d_unhashed() now returns true for those dentries,
making d_find_alias() skip them altogether. For most of its callers
that's fine - we really want a connected alias there. However,
there is a codepath where we relied upon picking such aliases
if nothing else could be found - selinux delayed initialization
of contexts for inodes on already mounted filesystems used to
rely upon that.
Cc: stable@kernel.org # f1ee616214cb "VFS: don't keep disconnected dentries on d_anon"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We recently had an oops reported on a 4.14 kernel in
xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage
and so the m_perag_tree lookup walked into lala land. It produces
an oops down this path during the failed mount:
radix_tree_gang_lookup_tag+0xc4/0x130
xfs_perag_get_tag+0x37/0xf0
xfs_reclaim_inodes_count+0x32/0x40
xfs_fs_nr_cached_objects+0x11/0x20
super_cache_count+0x35/0xc0
shrink_slab.part.66+0xb1/0x370
shrink_node+0x7e/0x1a0
try_to_free_pages+0x199/0x470
__alloc_pages_slowpath+0x3a1/0xd20
__alloc_pages_nodemask+0x1c3/0x200
cache_grow_begin+0x20b/0x2e0
fallback_alloc+0x160/0x200
kmem_cache_alloc+0x111/0x4e0
The problem is that the superblock shrinker is running before the
filesystem structures it depends on have been fully set up. i.e.
the shrinker is registered in sget(), before ->fill_super() has been
called, and the shrinker can call into the filesystem before
fill_super() does it's setup work. Essentially we are exposed to
both use-after-free and use-before-initialisation bugs here.
To fix this, add a check for the SB_BORN flag in super_cache_count.
In general, this flag is not set until ->fs_mount() completes
successfully, so we know that it is set after the filesystem
setup has completed. This matches the trylock_super() behaviour
which will not let super_cache_scan() run if SB_BORN is not set, and
hence will not allow the superblock shrinker from entering the
filesystem while it is being set up or after it has failed setup
and is being torn down.
Cc: stable@kernel.org
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex. Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.
Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode(). All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.
Cc: stable@kernel.org # 2.6.29 and later
Tested-by: Mike Marshall <hubcap@omnibond.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Make n signed to avoid leaking the pages array if __pipe_get_pages()
fails to allocate any pages.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
It returns -EFAULT and happens to be a helper for pipe_get_pages()
whose return type is ssize_t.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Merge speculative store buffer bypass fixes from Thomas Gleixner:
- rework of the SPEC_CTRL MSR management to accomodate the new fancy
SSBD (Speculative Store Bypass Disable) bit handling.
- the CPU bug and sysfs infrastructure for the exciting new Speculative
Store Bypass 'feature'.
- support for disabling SSB via LS_CFG MSR on AMD CPUs including
Hyperthread synchronization on ZEN.
- PRCTL support for dynamic runtime control of SSB
- SECCOMP integration to automatically disable SSB for sandboxed
processes with a filter flag for opt-out.
- KVM integration to allow guests fiddling with SSBD including the new
software MSR VIRT_SPEC_CTRL to handle the LS_CFG based oddities on
AMD.
- BPF protection against SSB
.. this is just the core and x86 side, other architecture support will
come separately.
* 'speck-v20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
bpf: Prevent memory disambiguation attack
x86/bugs: Rename SSBD_NO to SSB_NO
KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
x86/bugs: Rework spec_ctrl base and mask logic
x86/bugs: Remove x86_spec_ctrl_set()
x86/bugs: Expose x86_spec_ctrl_base directly
x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
x86/speculation: Rework speculative_store_bypass_update()
x86/speculation: Add virtualized speculative store bypass disable support
x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
x86/speculation: Handle HT correctly on AMD
x86/cpufeatures: Add FEATURE_ZEN
x86/cpufeatures: Disentangle SSBD enumeration
x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
KVM: SVM: Move spec control call after restore of GS
x86/cpu: Make alternative_msr_write work for 32-bit code
x86/bugs: Fix the parameters alignment and missing void
x86/bugs: Make cpu_show_common() static
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Detect code patterns where malicious 'speculative store bypass' can be used
and sanitize such patterns.
39: (bf) r3 = r10
40: (07) r3 += -216
41: (79) r8 = *(u64 *)(r7 +0) // slow read
42: (7a) *(u64 *)(r10 -72) = 0 // verifier inserts this instruction
43: (7b) *(u64 *)(r8 +0) = r3 // this store becomes slow due to r8
44: (79) r1 = *(u64 *)(r6 +0) // cpu speculatively executes this load
45: (71) r2 = *(u8 *)(r1 +0) // speculatively arbitrary 'load byte'
// is now sanitized
Above code after x86 JIT becomes:
e5: mov %rbp,%rdx
e8: add $0xffffffffffffff28,%rdx
ef: mov 0x0(%r13),%r14
f3: movq $0x0,-0x48(%rbp)
fb: mov %rdx,0x0(%r14)
ff: mov 0x0(%rbx),%rdi
103: movzbq 0x0(%rdi),%rsi
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The "336996 Speculative Execution Side Channel Mitigations" from
May defines this as SSB_NO, hence lets sync-up.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Expose the new virtualized architectural mechanism, VIRT_SSBD, for using
speculative store bypass disable (SSBD) under SVM. This will allow guests
to use SSBD on hardware that uses non-architectural mechanisms for enabling
SSBD.
[ tglx: Folded the migration fixup from Paolo Bonzini ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add the necessary logic for supporting the emulated VIRT_SPEC_CTRL MSR to
x86_virt_spec_ctrl(). If either X86_FEATURE_LS_CFG_SSBD or
X86_FEATURE_VIRT_SPEC_CTRL is set then use the new guest_virt_spec_ctrl
argument to check whether the state must be modified on the host. The
update reuses speculative_store_bypass_update() so the ZEN-specific sibling
coordination can be reused.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
x86_spec_ctrL_mask is intended to mask out bits from a MSR_SPEC_CTRL value
which are not to be modified. However the implementation is not really used
and the bitmask was inverted to make a check easier, which was removed in
"x86/bugs: Remove x86_spec_ctrl_set()"
Aside of that it is missing the STIBP bit if it is supported by the
platform, so if the mask would be used in x86_virt_spec_ctrl() then it
would prevent a guest from setting STIBP.
Add the STIBP bit if supported and use the mask in x86_virt_spec_ctrl() to
sanitize the value which is supplied by the guest.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
x86_spec_ctrl_set() is only used in bugs.c and the extra mask checks there
provide no real value as both call sites can just write x86_spec_ctrl_base
to MSR_SPEC_CTRL. x86_spec_ctrl_base is valid and does not need any extra
masking or checking.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
x86_spec_ctrl_base is the system wide default value for the SPEC_CTRL MSR.
x86_spec_ctrl_get_default() returns x86_spec_ctrl_base and was intended to
prevent modification to that variable. Though the variable is read only
after init and globaly visible already.
Remove the function and export the variable instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Function bodies are very similar and are going to grow more almost
identical code. Add a bool arg to determine whether SPEC_CTRL is being set
for the guest or restored to the host.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The upcoming support for the virtual SPEC_CTRL MSR on AMD needs to reuse
speculative_store_bypass_update() to avoid code duplication. Add an
argument for supplying a thread info (TIF) value and create a wrapper
speculative_store_bypass_update_current() which is used at the existing
call site.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD). To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.
Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
AMD is proposing a VIRT_SPEC_CTRL MSR to handle the Speculative Store
Bypass Disable via MSR_AMD64_LS_CFG so that guests do not have to care
about the bit position of the SSBD bit and thus facilitate migration.
Also, the sibling coordination on Family 17H CPUs can only be done on
the host.
Extend x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() with an
extra argument for the VIRT_SPEC_CTRL MSR.
Hand in 0 from VMX and in SVM add a new virt_spec_ctrl member to the CPU
data structure which is going to be used in later patches for the actual
implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The AMD64_LS_CFG MSR is a per core MSR on Family 17H CPUs. That means when
hyperthreading is enabled the SSBD bit toggle needs to take both cores into
account. Otherwise the following situation can happen:
CPU0 CPU1
disable SSB
disable SSB
enable SSB <- Enables it for the Core, i.e. for CPU0 as well
So after the SSB enable on CPU1 the task on CPU0 runs with SSB enabled
again.
On Intel the SSBD control is per core as well, but the synchronization
logic is implemented behind the per thread SPEC_CTRL MSR. It works like
this:
CORE_SPEC_CTRL = THREAD0_SPEC_CTRL | THREAD1_SPEC_CTRL
i.e. if one of the threads enables a mitigation then this affects both and
the mitigation is only disabled in the core when both threads disabled it.
Add the necessary synchronization logic for AMD family 17H. Unfortunately
that requires a spinlock to serialize the access to the MSR, but the locks
are only shared between siblings.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a ZEN feature bit so family-dependent static_cpu_has() optimizations
can be built for ZEN.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The SSBD enumeration is similarly to the other bits magically shared
between Intel and AMD though the mechanisms are different.
Make X86_FEATURE_SSBD synthetic and set it depending on the vendor specific
features or family dependent setup.
Change the Intel bit to X86_FEATURE_SPEC_CTRL_SSBD to denote that SSBD is
controlled via MSR_SPEC_CTRL and fix up the usage sites.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The availability of the SPEC_CTRL MSR is enumerated by a CPUID bit on
Intel and implied by IBRS or STIBP support on AMD. That's just confusing
and in case an AMD CPU has IBRS not supported because the underlying
problem has been fixed but has another bit valid in the SPEC_CTRL MSR,
the thing falls apart.
Add a synthetic feature bit X86_FEATURE_MSR_SPEC_CTRL to denote the
availability on both Intel and AMD.
While at it replace the boot_cpu_has() checks with static_cpu_has() where
possible. This prevents late microcode loading from exposing SPEC_CTRL, but
late loading is already very limited as it does not reevaluate the
mitigation options and other bits and pieces. Having static_cpu_has() is
the simplest and least fragile solution.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Intel and AMD have different CPUID bits hence for those use synthetic bits
which get set on the respective vendor's in init_speculation_control(). So
that debacles like what the commit message of
c65732e4f721 ("x86/cpu: Restore CPUID_8000_0008_EBX reload")
talks about don't happen anymore.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20180504161815.GG9257@pd.tnic
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
svm_vcpu_run() invokes x86_spec_ctrl_restore_host() after VMEXIT, but
before the host GS is restored. x86_spec_ctrl_restore_host() uses 'current'
to determine the host SSBD state of the thread. 'current' is GS based, but
host GS is not yet restored and the access causes a triple fault.
Move the call after the host GS restore.
Fixes: 885f82bfbc6f x86/process: Allow runtime control of Speculative Store Bypass
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Cast val and (val >> 32) to (u32), so that they fit in a
general-purpose register in both 32-bit and 64-bit code.
[ tglx: Made it u32 instead of uintptr_t ]
Fixes: c65732e4f721 ("x86/cpu: Restore CPUID_8000_0008_EBX reload")
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes: 7bb4d366c ("x86/bugs: Make cpu_show_common() static")
Fixes: 24f7fc83b ("x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation")
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
cpu_show_common() is not used outside of arch/x86/kernel/cpu/bugs.c, so
make it static.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
__ssb_select_mitigation() returns one of the members of enum ssb_mitigation,
not ssb_mitigation_cmd; fix the prototype to reflect that.
Fixes: 24f7fc83b9204 ("x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix some typos, improve formulations, end sentences with a fullstop.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The style for the 'status' file is CamelCase or this. _.
Fixes: fae1fa0fc ("proc: Provide details on speculation flaw mitigations")
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Intel collateral will reference the SSB mitigation bit in IA32_SPEC_CTL[2]
as SSBD (Speculative Store Bypass Disable).
Hence changing it.
It is unclear yet what the MSR_IA32_ARCH_CAPABILITIES (0x10a) Bit(4) name
is going to be. Following the rename it would be SSBD_NO but that rolls out
to Speculative Store Bypass Disable No.
Also fixed the missing space in X86_FEATURE_AMD_SSBD.
[ tglx: Fixup x86_amd_rds_enable() and rds_tif_to_amd_ls_cfg() as well ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Unless explicitly opted out of, anything running under seccomp will have
SSB mitigations enabled. Choosing the "prctl" mode will disable this.
[ tglx: Adjusted it to the new arch_seccomp_spec_mitigate() mechanism ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The migitation control is simpler to implement in architecture code as it
avoids the extra function call to check the mode. Aside of that having an
explicit seccomp enabled mode in the architecture mitigations would require
even more workarounds.
Move it into architecture code and provide a weak function in the seccomp
code. Remove the 'which' argument as this allows the architecture to decide
which mitigations are relevant for seccomp.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If a seccomp user is not interested in Speculative Store Bypass mitigation
by default, it can set the new SECCOMP_FILTER_FLAG_SPEC_ALLOW flag when
adding filters.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use PR_SPEC_FORCE_DISABLE in seccomp() because seccomp does not allow to
widen restrictions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
For certain use cases it is desired to enforce mitigations so they cannot
be undone afterwards. That's important for loader stubs which want to
prevent a child from disabling the mitigation again. Will also be used for
seccomp(). The extra state preserving of the prctl state for SSB is a
preparatory step for EBPF dymanic speculation control.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There's no reason for these to be changed after boot.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When speculation flaw mitigations are opt-in (via prctl), using seccomp
will automatically opt-in to these protections, since using seccomp
indicates at least some level of sandboxing is desired.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As done with seccomp and no_new_privs, also show speculation flaw
mitigation state in /proc/$pid/status.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Adjust arch_prctl_get/set_spec_ctrl() to operate on tasks other than
current.
This is needed both for /proc/$pid/status queries and for seccomp (since
thread-syncing can trigger seccomp in non-current threads).
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|