summaryrefslogtreecommitdiffstats
path: root/security
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'linus' of ↵Linus Torvalds2020-10-131-3/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Allow DRBG testing through user-space af_alg - Add tcrypt speed testing support for keyed hashes - Add type-safe init/exit hooks for ahash Algorithms: - Mark arc4 as obsolete and pending for future removal - Mark anubis, khazad, sead and tea as obsolete - Improve boot-time xor benchmark - Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity Drivers: - Fixes and enhancement for XTS in caam - Add support for XIP8001B hwrng in xiphera-trng - Add RNG and hash support in sun8i-ce/sun8i-ss - Allow imx-rngc to be used by kernel entropy pool - Use crypto engine in omap-sham - Add support for Ingenic X1830 with ingenic" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits) X.509: Fix modular build of public_key_sm2 crypto: xor - Remove unused variable count in do_xor_speed X.509: fix error return value on the failed path crypto: bcm - Verify GCM/CCM key length in setkey crypto: qat - drop input parameter from adf_enable_aer() crypto: qat - fix function parameters descriptions crypto: atmel-tdes - use semicolons rather than commas to separate statements crypto: drivers - use semicolons rather than commas to separate statements hwrng: mxc-rnga - use semicolons rather than commas to separate statements hwrng: iproc-rng200 - use semicolons rather than commas to separate statements hwrng: stm32 - use semicolons rather than commas to separate statements crypto: xor - use ktime for template benchmarking crypto: xor - defer load time benchmark to a later time crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num' crypto: hisilicon/zip - fix the return value when device is busy crypto: hisilicon/zip - fix zero length input in GZIP decompress crypto: hisilicon/zip - fix the uncleared debug registers lib/mpi: Fix unused variable warnings crypto: x86/poly1305 - Remove assignments with no effect hwrng: npcm - modify readl to readb ...
| * integrity: Asymmetric digsig supports SM2-with-SM3 algorithmTianjia Zhang2020-09-251-3/+11
| | | | | | | | | | | | | | | | | | | | | | Asymmetric digsig supports SM2-with-SM3 algorithm combination, so that IMA can also verify SM2's signature data. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Tested-by: Xufeng Zhang <yunbo.xufeng@linux.alibaba.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Vitaly Chikunov <vt@altlinux.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | Merge branch 'work.iov_iter' of ↵Linus Torvalds2020-10-123-41/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat iovec cleanups from Al Viro: "Christoph's series around import_iovec() and compat variant thereof" * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: security/keys: remove compat_keyctl_instantiate_key_iov mm: remove compat_process_vm_{readv,writev} fs: remove compat_sys_vmsplice fs: remove the compat readv/writev syscalls fs: remove various compat readv/writev helpers iov_iter: transparently handle compat iovecs in import_iovec iov_iter: refactor rw_copy_check_uvector and import_iovec iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c compat.h: fix a spelling error in <linux/compat.h>
| * | security/keys: remove compat_keyctl_instantiate_key_iovChristoph Hellwig2020-10-033-40/+3
| | | | | | | | | | | | | | | | | | | | | | | | Now that import_iovec handles compat iovecs, the native version of keyctl_instantiate_key_iov can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | iov_iter: transparently handle compat iovecs in import_iovecChristoph Hellwig2020-10-031-3/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | Use in compat_syscall to import either native or the compat iovecs, and remove the now superflous compat_import_iovec. This removes the need for special compat logic in most callers, and the remaining ones can still be simplified by using __import_iovec with a bool compat parameter. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'efi-core-2020-10-12' of ↵Linus Torvalds2020-10-121-19/+66
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI changes from Ingo Molnar: - Preliminary RISC-V enablement - the bulk of it will arrive via the RISCV tree. - Relax decompressed image placement rules for 32-bit ARM - Add support for passing MOK certificate table contents via a config table rather than a EFI variable. - Add support for 18 bit DIMM row IDs in the CPER records. - Work around broken Dell firmware that passes the entire Boot#### variable contents as the command line - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we can identify it in the memory map listings. - Don't abort the boot on arm64 if the EFI RNG protocol is available but returns with an error - Replace slashes with exclamation marks in efivarfs file names - Split efi-pstore from the deprecated efivars sysfs code, so we can disable the latter on !x86. - Misc fixes, cleanups and updates. * tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) efi: mokvar: add missing include of asm/early_ioremap.h efi: efivars: limit availability to X86 builds efi: remove some false dependencies on CONFIG_EFI_VARS efi: gsmi: fix false dependency on CONFIG_EFI_VARS efi: efivars: un-export efivars_sysfs_init() efi: pstore: move workqueue handling out of efivars efi: pstore: disentangle from deprecated efivars module efi: mokvar-table: fix some issues in new code efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure efivarfs: Replace invalid slashes with exclamation marks in dentries. efi: Delete deprecated parameter comments efi/libstub: Fix missing-prototypes in string.c efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it cper,edac,efi: Memory Error Record: bank group/address and chip id edac,ghes,cper: Add Row Extension to Memory Error Record efi/x86: Add a quirk to support command line arguments on Dell EFI firmware efi/libstub: Add efi_warn and *_once logging helpers integrity: Load certs from the EFI MOK config table integrity: Move import of MokListRT certs to a separate routine efi: Support for MOK variable config table ...
| * | integrity: Load certs from the EFI MOK config tableLenny Szubowicz2020-09-161-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because of system-specific EFI firmware limitations, EFI volatile variables may not be capable of holding the required contents of the Machine Owner Key (MOK) certificate store when the certificate list grows above some size. Therefore, an EFI boot loader may pass the MOK certs via a EFI configuration table created specifically for this purpose to avoid this firmware limitation. An EFI configuration table is a much more primitive mechanism compared to EFI variables and is well suited for one-way passage of static information from a pre-OS environment to the kernel. This patch adds the support to load certs from the MokListRT entry in the MOK variable configuration table, if it's present. The pre-existing support to load certs from the MokListRT EFI variable remains and is used if the EFI MOK configuration table isn't present or can't be successfully used. Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com> Link: https://lore.kernel.org/r/20200905013107.10457-4-lszubowi@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
| * | integrity: Move import of MokListRT certs to a separate routineLenny Szubowicz2020-09-161-19/+44
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the loading of certs from the UEFI MokListRT into a separate routine to facilitate additional MokList functionality. There is no visible functional change as a result of this patch. Although the UEFI dbx certs are now loaded before the MokList certs, they are loaded onto different key rings. So the order of the keys on their respective key rings is the same. Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Link: https://lore.kernel.org/r/20200905013107.10457-3-lszubowi@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
* | Merge tag 'fixes-v5.9a' of ↵Linus Torvalds2020-09-151-1/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer fix from James Morris: "A device_cgroup RCU warning fix from Amol Grover" * tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: device_cgroup: Fix RCU list debugging warning
| * | device_cgroup: Fix RCU list debugging warningAmol Grover2020-08-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | exceptions may be traversed using list_for_each_entry_rcu() outside of an RCU read side critical section BUT under the protection of decgroup_mutex. Hence add the corresponding lockdep expression to fix the following false-positive warning: [ 2.304417] ============================= [ 2.304418] WARNING: suspicious RCU usage [ 2.304420] 5.5.4-stable #17 Tainted: G E [ 2.304422] ----------------------------- [ 2.304424] security/device_cgroup.c:355 RCU-list traversed in non-reader section!! Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: James Morris <jmorris@namei.org>
* | | treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2020-08-2312-38/+30
| |/ |/| | | | | | | | | | | | | | | | | Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2020-08-121-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge more updates from Andrew Morton: - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction, mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util, memory-hotplug, cleanups, uaccess, migration, gup, pagemap), - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops, checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump, exec, kdump, rapidio, panic, kcov, kgdb, ipc). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits) mm/gup: remove task_struct pointer for all gup code mm: clean up the last pieces of page fault accountings mm/xtensa: use general page fault accounting mm/x86: use general page fault accounting mm/sparc64: use general page fault accounting mm/sparc32: use general page fault accounting mm/sh: use general page fault accounting mm/s390: use general page fault accounting mm/riscv: use general page fault accounting mm/powerpc: use general page fault accounting mm/parisc: use general page fault accounting mm/openrisc: use general page fault accounting mm/nios2: use general page fault accounting mm/nds32: use general page fault accounting mm/mips: use general page fault accounting mm/microblaze: use general page fault accounting mm/m68k: use general page fault accounting mm/ia64: use general page fault accounting mm/hexagon: use general page fault accounting mm/csky: use general page fault accounting ...
| * | mm/gup: remove task_struct pointer for all gup codePeter Xu2020-08-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the cleanup of page fault accounting, gup does not need to pass task_struct around any more. Remove that parameter in the whole gup stack. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'for-v5.9' of ↵Linus Torvalds2020-08-1110-10/+10
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "A couple of minor documentation updates only for this release" * tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: LSM: drop duplicated words in header file comments Replace HTTP links with HTTPS ones: security
| * | Replace HTTP links with HTTPS ones: securityAlexander A. Klimov2020-08-0610-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>
* | | mm, treewide: rename kzfree() to kfree_sensitive()Waiman Long2020-08-0710-62/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As said by Linus: A symmetric naming is only helpful if it implies symmetries in use. Otherwise it's actively misleading. In "kzalloc()", the z is meaningful and an important part of what the caller wants. In "kzfree()", the z is actively detrimental, because maybe in the future we really _might_ want to use that "memfill(0xdeadbeef)" or something. The "zero" part of the interface isn't even _relevant_. The main reason that kzfree() exists is to clear sensitive information that should not be leaked to other future users of the same memory objects. Rename kzfree() to kfree_sensitive() to follow the example of the recently added kvfree_sensitive() and make the intention of the API more explicit. In addition, memzero_explicit() is used to clear the memory to make sure that it won't get optimized away by the compiler. The renaming is done by using the command sequence: git grep -w --name-only kzfree |\ xargs sed -i 's/kzfree/kfree_sensitive/' followed by some editing of the kfree_sensitive() kerneldoc and adding a kzfree backward compatibility macro in slab.h. [akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h] [akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more] Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Joe Perches <joe@perches.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Rientjes <rientjes@google.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: "Jason A . Donenfeld" <Jason@zx2c4.com> Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'integrity-v5.9' of ↵Linus Torvalds2020-08-0612-140/+283
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "The nicest change is the IMA policy rule checking. The other changes include allowing the kexec boot cmdline line measure policy rules to be defined in terms of the inode associated with the kexec kernel image, making the IMA_APPRAISE_BOOTPARAM, which governs the IMA appraise mode (log, fix, enforce), a runtime decision based on the secure boot mode of the system, and including errno in the audit log" * tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: remove redundant initialization of variable ret ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime ima: AppArmor satisfies the audit rule requirements ima: Rename internal filter rule functions ima: Support additional conditionals in the KEXEC_CMDLINE hook function ima: Use the common function to detect LSM conditionals in a rule ima: Move comprehensive rule validation checks out of the token parser ima: Use correct type for the args_p member of ima_rule_entry.lsm elements ima: Shallow copy the args_p member of ima_rule_entry.lsm elements ima: Fail rule parsing when appraise_flag=blacklist is unsupportable ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond ima: Fail rule parsing when buffer hook functions have an invalid action ima: Free the entire rule if it fails to parse ima: Free the entire rule when deleting a list of rules ima: Have the LSM free its audit rule IMA: Add audit log for failure conditions integrity: Add errno field in audit message
| * | | integrity: remove redundant initialization of variable retColin Ian King2020-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Fixes: eb5798f2e28f ("integrity: convert digsig to akcipher api") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Addresses-Coverity: ("Unused value") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtimeBruno Meneguele2020-07-202-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IMA_APPRAISE_BOOTPARAM config allows enabling different "ima_appraise=" modes - log, fix, enforce - at run time, but not when IMA architecture specific policies are enabled.  This prevents properly labeling the filesystem on systems where secure boot is supported, but not enabled on the platform.  Only when secure boot is actually enabled should these IMA appraise modes be disabled. This patch removes the compile time dependency and makes it a runtime decision, based on the secure boot state of that platform. Test results as follows: -> x86-64 with secure boot enabled [ 0.015637] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix [ 0.015668] ima: Secure boot enabled: ignoring ima_appraise=fix boot parameter option -> powerpc with secure boot disabled [ 0.000000] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix [ 0.000000] Secure boot mode disabled -> Running the system without secure boot and with both options set: CONFIG_IMA_APPRAISE_BOOTPARAM=y CONFIG_IMA_ARCH_POLICY=y Audit prompts "missing-hash" but still allow execution and, consequently, filesystem labeling: type=INTEGRITY_DATA msg=audit(07/09/2020 12:30:27.778:1691) : pid=4976 uid=root auid=root ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op=appraise_data cause=missing-hash comm=bash name=/usr/bin/evmctl dev="dm-0" ino=493150 res=no Cc: stable@vger.kernel.org Fixes: d958083a8f64 ("x86/ima: define arch_get_ima_policy() for x86") Signed-off-by: Bruno Meneguele <bmeneg@redhat.com> Cc: stable@vger.kernel.org # 5.0 Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: AppArmor satisfies the audit rule requirementsTyler Hicks2020-07-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AppArmor meets all the requirements for IMA in terms of audit rules since commit e79c26d04043 ("apparmor: Add support for audit rule filtering"). Update IMA's Kconfig section for CONFIG_IMA_LSM_RULES to reflect this. Fixes: e79c26d04043 ("apparmor: Add support for audit rule filtering") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Rename internal filter rule functionsTyler Hicks2020-07-202-25/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename IMA's internal filter rule functions from security_filter_rule_*() to ima_filter_rule_*(). This avoids polluting the security_* namespace, which is typically reserved for general security subsystem infrastructure. Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Suggested-by: Casey Schaufler <casey@schaufler-ca.com> [zohar@linux.ibm.com: reword using the term "filter", not "audit"] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Support additional conditionals in the KEXEC_CMDLINE hook functionTyler Hicks2020-07-207-22/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Take the properties of the kexec kernel's inode and the current task ownership into consideration when matching a KEXEC_CMDLINE operation to the rules in the IMA policy. This allows for some uniformity when writing IMA policy rules for KEXEC_KERNEL_CHECK, KEXEC_INITRAMFS_CHECK, and KEXEC_CMDLINE operations. Prior to this patch, it was not possible to write a set of rules like this: dont_measure func=KEXEC_KERNEL_CHECK obj_type=foo_t dont_measure func=KEXEC_INITRAMFS_CHECK obj_type=foo_t dont_measure func=KEXEC_CMDLINE obj_type=foo_t measure func=KEXEC_KERNEL_CHECK measure func=KEXEC_INITRAMFS_CHECK measure func=KEXEC_CMDLINE The inode information associated with the kernel being loaded by a kexec_kernel_load(2) syscall can now be included in the decision to measure or not Additonally, the uid, euid, and subj_* conditionals can also now be used in KEXEC_CMDLINE rules. There was no technical reason as to why those conditionals weren't being considered previously other than ima_match_rules() didn't have a valid inode to use so it immediately bailed out for KEXEC_CMDLINE operations rather than going through the full list of conditional comparisons. Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: kexec@lists.infradead.org Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Use the common function to detect LSM conditionals in a ruleTyler Hicks2020-07-201-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make broader use of ima_rule_contains_lsm_cond() to check if a given rule contains an LSM conditional. This is a code cleanup and has no user-facing change. Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Move comprehensive rule validation checks out of the token parserTyler Hicks2020-07-203-46/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use ima_validate_rule(), at the end of the token parsing stage, to verify combinations of actions, hooks, and flags. This is useful to increase readability by consolidating such checks into a single function and also because rule conditionals can be specified in arbitrary order making it difficult to do comprehensive rule validation until the entire rule has been parsed. This allows for the check that ties together the "keyrings" conditional with the KEY_CHECK function hook to be moved into the final rule validation. The modsig check no longer needs to compiled conditionally because the token parser will ensure that modsig support is enabled before accepting "imasig|modsig" appraise type values. The final rule validation will ensure that appraise_type and appraise_flag options are only present in appraise rules. Finally, this allows for the check that ties together the "pcr" conditional with the measure action to be moved into the final rule validation. Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Use correct type for the args_p member of ima_rule_entry.lsm elementsTyler Hicks2020-07-201-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make args_p be of the char pointer type rather than have it be a void pointer that gets casted to char pointer when it is used. It is a simple NUL-terminated string as returned by match_strdup(). Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Shallow copy the args_p member of ima_rule_entry.lsm elementsTyler Hicks2020-07-201-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The args_p member is a simple string that is allocated by ima_rule_init(). Shallow copy it like other non-LSM references in ima_rule_entry structs. There are no longer any necessary error path cleanups to do in ima_lsm_copy_rule(). Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Fail rule parsing when appraise_flag=blacklist is unsupportableTyler Hicks2020-07-201-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Verifying that a file hash is not blacklisted is currently only supported for files with appended signatures (modsig). In the future, this might change. For now, the "appraise_flag" option is only appropriate for appraise actions and its "blacklist" value is only appropriate when CONFIG_IMA_APPRAISE_MODSIG is enabled and "appraise_flag=blacklist" is only appropriate when "appraise_type=imasig|modsig" is also present. Make this clear at policy load so that IMA policy authors don't assume that other uses of "appraise_flag=blacklist" are supported. Fixes: 273df864cf74 ("ima: Check against blacklisted hashes for files with modsig") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reivewed-by: Nayna Jain <nayna@linux.ibm.com> Tested-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid condTyler Hicks2020-07-161-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The KEY_CHECK function only supports the uid, pcr, and keyrings conditionals. Make this clear at policy load so that IMA policy authors don't assume that other conditionals are supported. Fixes: 5808611cccb2 ("IMA: Add KEY_CHECK func to measure keys") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an ↵Tyler Hicks2020-07-161-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | invalid cond The KEXEC_CMDLINE hook function only supports the pcr conditional. Make this clear at policy load so that IMA policy authors don't assume that other conditionals are supported. Since KEXEC_CMDLINE's inception, ima_match_rules() has always returned true on any loaded KEXEC_CMDLINE rule without any consideration for other conditionals present in the rule. Make it clear that pcr is the only supported KEXEC_CMDLINE conditional by returning an error during policy load. An example of why this is a problem can be explained with the following rule: dont_measure func=KEXEC_CMDLINE obj_type=foo_t An IMA policy author would have assumed that rule is valid because the parser accepted it but the result was that measurements for all KEXEC_CMDLINE operations would be disabled. Fixes: b0935123a183 ("IMA: Define a new hook to measure the kexec boot command line arguments") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Fail rule parsing when buffer hook functions have an invalid actionTyler Hicks2020-07-161-2/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Buffer based hook functions, such as KEXEC_CMDLINE and KEY_CHECK, can only measure. The process_buffer_measurement() function quietly ignores all actions except measure so make this behavior clear at the time of policy load. The parsing of the keyrings conditional had a check to ensure that it was only specified with measure actions but the check should be on the hook function and not the keyrings conditional since "appraise func=KEY_CHECK" is not a valid rule. Fixes: b0935123a183 ("IMA: Define a new hook to measure the kexec boot command line arguments") Fixes: 5808611cccb2 ("IMA: Add KEY_CHECK func to measure keys") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Free the entire rule if it fails to parseTyler Hicks2020-07-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use ima_free_rule() to fix memory leaks of allocated ima_rule_entry members, such as .fsname and .keyrings, when an error is encountered during rule parsing. Set the args_p pointer to NULL after freeing it in the error path of ima_lsm_rule_init() so that it isn't freed twice. This fixes a memory leak seen when loading an rule that contains an additional piece of allocated memory, such as an fsname, followed by an invalid conditional: # echo "measure fsname=tmpfs bad=cond" > /sys/kernel/security/ima/policy -bash: echo: write error: Invalid argument # echo scan > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff98e7e4ece6c0 (size 8): comm "bash", pid 672, jiffies 4294791843 (age 21.855s) hex dump (first 8 bytes): 74 6d 70 66 73 00 6b a5 tmpfs.k. backtrace: [<00000000abab7413>] kstrdup+0x2e/0x60 [<00000000f11ede32>] ima_parse_add_rule+0x7d4/0x1020 [<00000000f883dd7a>] ima_write_policy+0xab/0x1d0 [<00000000b17cf753>] vfs_write+0xde/0x1d0 [<00000000b8ddfdea>] ksys_write+0x68/0xe0 [<00000000b8e21e87>] do_syscall_64+0x56/0xa0 [<0000000089ea7b98>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: f1b08bbcbdaf ("ima: define a new policy condition based on the filesystem name") Fixes: 2b60c0ecedf8 ("IMA: Read keyrings= option from the IMA policy") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Free the entire rule when deleting a list of rulesTyler Hicks2020-07-161-5/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a function, ima_free_rule(), to free all memory associated with an ima_rule_entry. Use the new function to fix memory leaks of allocated ima_rule_entry members, such as .fsname and .keyrings, when deleting a list of rules. Make the existing ima_lsm_free_rule() function specific to the LSM audit rule array of an ima_rule_entry and require that callers make an additional call to kfree to free the ima_rule_entry itself. This fixes a memory leak seen when loading by a valid rule that contains an additional piece of allocated memory, such as an fsname, followed by an invalid rule that triggers a policy load failure: # echo -e "dont_measure fsname=securityfs\nbad syntax" > \ /sys/kernel/security/ima/policy -bash: echo: write error: Invalid argument # echo scan > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff9bab67ca12c0 (size 16): comm "bash", pid 684, jiffies 4295212803 (age 252.344s) hex dump (first 16 bytes): 73 65 63 75 72 69 74 79 66 73 00 6b 6b 6b 6b a5 securityfs.kkkk. backtrace: [<00000000adc80b1b>] kstrdup+0x2e/0x60 [<00000000d504cb0d>] ima_parse_add_rule+0x7d4/0x1020 [<00000000444825ac>] ima_write_policy+0xab/0x1d0 [<000000002b7f0d6c>] vfs_write+0xde/0x1d0 [<0000000096feedcf>] ksys_write+0x68/0xe0 [<0000000052b544a2>] do_syscall_64+0x56/0xa0 [<000000007ead1ba7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: f1b08bbcbdaf ("ima: define a new policy condition based on the filesystem name") Fixes: 2b60c0ecedf8 ("IMA: Read keyrings= option from the IMA policy") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | ima: Have the LSM free its audit ruleTyler Hicks2020-07-162-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ask the LSM to free its audit rule rather than directly calling kfree(). Both AppArmor and SELinux do additional work in their audit_rule_free() hooks. Fix memory leaks by allowing the LSMs to perform necessary work. Fixes: b16942455193 ("ima: use the lsm policy update notifier") Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Janne Karhunen <janne.karhunen@gmail.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | IMA: Add audit log for failure conditionsLakshmi Ramasubramanian2020-07-164-22/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | process_buffer_measurement() and ima_alloc_key_entry() functions need to log an audit message for auditing integrity measurement failures. Add audit message in these two functions. Remove "pr_devel" log message in process_buffer_measurement(). Sample audit messages: [ 6.303048] audit: type=1804 audit(1592506281.627:2): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel op=measuring_key cause=ENOMEM comm="swapper/0" name=".builtin_trusted_keys" res=0 errno=-12 [ 8.019432] audit: type=1804 audit(1592506283.344:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 op=measuring_kexec_cmdline cause=hashing_error comm="systemd" name="kexec-cmdline" res=0 errno=-22 Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
| * | | integrity: Add errno field in audit messageLakshmi Ramasubramanian2020-07-162-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Error code is not included in the audit messages logged by the integrity subsystem. Define a new function integrity_audit_message() that takes error code in the "errno" parameter. Add "errno" field in the audit messages logged by the integrity subsystem and set the value passed in the "errno" parameter. [ 6.303048] audit: type=1804 audit(1592506281.627:2): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel op=measuring_key cause=ENOMEM comm="swapper/0" name=".builtin_trusted_keys" res=0 errno=-12 [ 7.987647] audit: type=1802 audit(1592506283.312:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 op=policy_update cause=completed comm="systemd" res=1 errno=0 [ 8.019432] audit: type=1804 audit(1592506283.344:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 op=measuring_kexec_cmdline cause=hashing_error comm="systemd" name="kexec-cmdline" res=0 errno=-22 Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Steve Grubb <sgrubb@redhat.com> Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
* | | | Merge tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-nextLinus Torvalds2020-08-061-3/+16
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull smack updates from Casey Schaufler: "Minor fixes to Smack for the v5.9 release. All were found by automated checkers and have straightforward resolution" * tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next: Smack: prevent underflow in smk_set_cipso() Smack: fix another vsscanf out of bounds Smack: fix use-after-free in smk_write_relabel_self()
| * | | | Smack: prevent underflow in smk_set_cipso()Dan Carpenter2020-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have an upper bound on "maplevel" but forgot to check for negative values. Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
| * | | | Smack: fix another vsscanf out of boundsDan Carpenter2020-07-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is similar to commit 84e99e58e8d1 ("Smack: slab-out-of-bounds in vsscanf") where we added a bounds check on "rule". Reported-by: syzbot+a22c6092d003d6fe1122@syzkaller.appspotmail.com Fixes: f7112e6c9abf ("Smack: allow for significantly longer Smack labels v4") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
| * | | | Smack: fix use-after-free in smk_write_relabel_self()Eric Biggers2020-07-141-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smk_write_relabel_self() frees memory from the task's credentials with no locking, which can easily cause a use-after-free because multiple tasks can share the same credentials structure. Fix this by using prepare_creds() and commit_creds() to correctly modify the task's credentials. Reproducer for "BUG: KASAN: use-after-free in smk_write_relabel_self": #include <fcntl.h> #include <pthread.h> #include <unistd.h> static void *thrproc(void *arg) { int fd = open("/sys/fs/smackfs/relabel-self", O_WRONLY); for (;;) write(fd, "foo", 3); } int main() { pthread_t t; pthread_create(&t, NULL, thrproc, NULL); thrproc(NULL); } Reported-by: syzbot+e6416dabb497a650da40@syzkaller.appspotmail.com Fixes: 38416e53936e ("Smack: limited capability for changing process label") Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
* | | | | Merge tag 'cap-checkpoint-restore-v5.9' of ↵Linus Torvalds2020-08-041-2/+3
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull checkpoint-restore updates from Christian Brauner: "This enables unprivileged checkpoint/restore of processes. Given that this work has been going on for quite some time the first sentence in this summary is hopefully more exciting than the actual final code changes required. Unprivileged checkpoint/restore has seen a frequent increase in interest over the last two years and has thus been one of the main topics for the combined containers & checkpoint/restore microconference since at least 2018 (cf. [1]). Here are just the three most frequent use-cases that were brought forward: - The JVM developers are integrating checkpoint/restore into a Java VM to significantly decrease the startup time. - In high-performance computing environment a resource manager will typically be distributing jobs where users are always running as non-root. Long-running and "large" processes with significant startup times are supposed to be checkpointed and restored with CRIU. - Container migration as a non-root user. In all of these scenarios it is either desirable or required to run without CAP_SYS_ADMIN. The userspace implementation of checkpoint/restore CRIU already has the pull request for supporting unprivileged checkpoint/restore up (cf. [2]). To enable unprivileged checkpoint/restore a new dedicated capability CAP_CHECKPOINT_RESTORE is introduced. This solution has last been discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1] "Update on Task Migration at Google Using CRIU") with Adrian and Nicolas providing the implementation now over the last months. In essence, this allows the CRIU binary to be installed with the CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling unprivileged users to restore processes. To make this possible the following permissions are altered: - Selecting a specific PID via clone3() set_tid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Accessing /proc/pid/map_files relaxed from init userns CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE. - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. Of these four changes the /proc/self/exe change deserves a few words because the reasoning behind even restricting /proc/self/exe changes in the first place is just full of historical quirks and tracking this down was a questionable version of fun that I'd like to spare others. In short, it is trivial to change /proc/self/exe as an unprivileged user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace() or by simply intercepting the elf loader in userspace during exec. Nicolas was nice enough to even provide a POC for the latter (cf. [3]) to illustrate this fact. The original patchset which introduced PR_SET_MM_MAP had no permissions around changing the exe link. They too argued that it is trivial to spoof the exe link already which is true. The argument brought up against this was that the Tomoyo LSM uses the exe link in tomoyo_manager() to detect whether the calling process is a policy manager. This caused changing the exe links to be guarded by userns CAP_SYS_ADMIN. All in all this rather seems like a "better guard it with something rather than nothing" argument which imho doesn't qualify as a great security policy. Again, because spoofing the exe link is possible for the calling process so even if this were security relevant it was broken back then and would be broken today. So technically, dropping all permissions around changing the exe link would probably be possible and would send a clearer message to any userspace that relies on /proc/self/exe for security reasons that they should stop doing this but for now we're only relaxing the exe link permissions from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. There's a final uapi change in here. Changing the exe link used to accidently return EINVAL when the caller lacked the necessary permissions instead of the more correct EPERM. This pr contains a commit fixing this. I assume that userspace won't notice or care and if they do I will revert this commit. But since we are changing the permissions anyway it seems like a good opportunity to try this fix. With these changes merged unprivileged checkpoint/restore will be possible and has already been tested by various users" [1] LPC 2018 1. "Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095 2. "Securely Migrating Untrusted Workloads with CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400 LPC 2019 1. "CRIU and the PID dance" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s 2. "Update on Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s [2] https://github.com/checkpoint-restore/criu/pull/1155 [3] https://github.com/nviennot/run_as_exe * tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add clone3() CAP_CHECKPOINT_RESTORE test prctl: exe link permission error changed from -EINVAL to -EPERM prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid pid: use checkpoint_restore_ns_capable() for set_tid capabilities: Introduce CAP_CHECKPOINT_RESTORE
| * | | | | capabilities: Introduce CAP_CHECKPOINT_RESTOREAdrian Reber2020-07-191-2/+3
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating checkpoint/restore for non-root users. Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has been asked numerous times if it is possible to checkpoint/restore a process as non-root. The answer usually was: 'almost'. The main blocker to restore a process as non-root was to control the PID of the restored process. This feature available via the clone3 system call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by CAP_SYS_ADMIN. In the past two years, requests for non-root checkpoint/restore have increased due to the following use cases: * Checkpoint/Restore in an HPC environment in combination with a resource manager distributing jobs where users are always running as non-root. There is a desire to provide a way to checkpoint and restore long running jobs. * Container migration as non-root * We have been in contact with JVM developers who are integrating CRIU into a Java VM to decrease the startup time. These checkpoint/restore applications are not meant to be running with CAP_SYS_ADMIN. We have seen the following workarounds: * Use a setuid wrapper around CRIU: See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c * Use a setuid helper that writes to ns_last_pid. Unfortunately, this helper delegation technique is impossible to use with clone3, and is thus prone to races. See https://github.com/twosigma/set_ns_last_pid * Cycle through PIDs with fork() until the desired PID is reached: This has been demonstrated to work with cycling rates of 100,000 PIDs/s See https://github.com/twosigma/set_ns_last_pid * Patch out the CAP_SYS_ADMIN check from the kernel * Run the desired application in a new user and PID namespace to provide a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited use in typical container environments (e.g., Kubernetes) as /proc is typically protected with read-only layers (e.g., /proc/sys) for hardening purposes. Read-only layers prevent additional /proc mounts (due to proc's SB_I_USERNS_VISIBLE property), making the use of new PID namespaces limited as certain applications need access to /proc matching their PID namespace. The introduced capability allows to: * Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable for the corresponding PID namespace via ns_last_pid/clone3. * Open files in /proc/pid/map_files when the current user is CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for recovering files that are unreachable via the file system such as deleted files, or memfd files. See corresponding selftest for an example with clone3(). Signed-off-by: Adrian Reber <areber@redhat.com> Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
* | | | | Merge branch 'exec-linus' of ↵Linus Torvalds2020-08-043-5/+5
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull execve updates from Eric Biederman: "During the development of v5.7 I ran into bugs and quality of implementation issues related to exec that could not be easily fixed because of the way exec is implemented. So I have been diggin into exec and cleaning up what I can. This cycle I have been looking at different ideas and different implementations to see what is possible to improve exec, and cleaning the way exec interfaces with in kernel users. Only cleaning up the interfaces of exec with rest of the kernel has managed to stabalize and make it through review in time for v5.9-rc1 resulting in 2 sets of changes this cycle. - Implement kernel_execve - Make the user mode driver code a better citizen With kernel_execve the code size got a little larger as the copying of parameters from userspace and copying of parameters from userspace is now separate. The good news is kernel threads no longer need to play games with set_fs to use exec. Which when combined with the rest of Christophs set_fs changes should security bugs with set_fs much more difficult" * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits) exec: Implement kernel_execve exec: Factor bprm_stack_limits out of prepare_arg_pages exec: Factor bprm_execve out of do_execve_common exec: Move bprm_mm_init into alloc_bprm exec: Move initialization of bprm->filename into alloc_bprm exec: Factor out alloc_bprm exec: Remove unnecessary spaces from binfmts.h umd: Stop using split_argv umd: Remove exit_umh bpfilter: Take advantage of the facilities of struct pid exit: Factor thread_group_exited out of pidfd_poll umd: Track user space drivers with struct pid bpfilter: Move bpfilter_umh back into init data exec: Remove do_execve_file umh: Stop calling do_execve_file umd: Transform fork_usermode_blob into fork_usermode_driver umd: Rename umd_info.cmdline umd_info.driver_name umd: For clarity rename umh_info umd_info umh: Separate the user mode driver and the user mode helper support umh: Remove call_usermodehelper_setup_file. ...
| * | | | | exec: Implement kernel_execveEric W. Biederman2020-07-213-5/+5
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To allow the kernel not to play games with set_fs to call exec implement kernel_execve. The function kernel_execve takes pointers into kernel memory and copies the values pointed to onto the new userspace stack. The calls with arguments from kernel space of do_execve are replaced with calls to kernel_execve. The calls do_execve and do_execveat are made static as there are now no callers outside of exec. The comments that mention do_execve are updated to refer to kernel_execve or execve depending on the circumstances. In addition to correcting the comments, this makes it easy to grep for do_execve and verify it is not used. Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | | Merge tag 'audit-pr-20200803' of ↵Linus Torvalds2020-08-045-55/+49
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Aside from some smaller bug fixes, here are the highlights: - add a new backlog wait metric to the audit status message, this is intended to help admins determine how long processes have been waiting for the audit backlog queue to clear - generate audit records for nftables configuration changes - generate CWD audit records for for the relevant LSM audit records" * tag 'audit-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: report audit wait metric in audit status reply audit: purge audit_log_string from the intra-kernel audit API audit: issue CWD record to accompany LSM_AUDIT_DATA_* records audit: use the proper gfp flags in the audit_log_nfcfg() calls audit: remove unused !CONFIG_AUDITSYSCALL __audit_inode* stubs audit: add gfp parameter to audit_log_nfcfg audit: log nftables configuration change events audit: Use struct_size() helper in alloc_chunk
| * | | | | audit: purge audit_log_string from the intra-kernel audit APIRichard Guy Briggs2020-07-215-55/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | audit_log_string() was inteded to be an internal audit function and since there are only two internal uses, remove them. Purge all external uses of it by restructuring code to use an existing audit_log_format() or using audit_log_format(). Please see the upstream issue https://github.com/linux-audit/audit-kernel/issues/84 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | | | | audit: issue CWD record to accompany LSM_AUDIT_DATA_* recordsRichard Guy Briggs2020-07-081-0/+5
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LSM_AUDIT_DATA_* records for PATH, FILE, IOCTL_OP, DENTRY and INODE are incomplete without the task context of the AUDIT Current Working Directory record. Add it. This record addition can't use audit_dummy_context to determine whether or not to store the record information since the LSM_AUDIT_DATA_* records are initiated by various LSMs independent of any audit rules. context->in_syscall is used to determine if it was called in user context like audit_getname. Please see the upstream issue https://github.com/linux-audit/audit-kernel/issues/96 Adapted from Vladis Dronov's v2 patch. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
* | | | | Merge tag 'selinux-pr-20200803' of ↵Linus Torvalds2020-08-0414-161/+240
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Beyond the usual smattering of bug fixes, we've got three small improvements worth highlighting: - improved SELinux policy symbol table performance due to a reworking of the insert and search functions - allow reading of SELinux labels before the policy is loaded, allowing for some more "exotic" initramfs approaches - improved checking an error reporting about process class/permissions during SELinux policy load" * tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: complete the inlining of hashtab functions selinux: prepare for inlining of hashtab functions selinux: specialize symtab insert and search functions selinux: Fix spelling mistakes in the comments selinux: fixed a checkpatch warning with the sizeof macro selinux: log error messages on required process class / permissions scripts/selinux/mdp: fix initial SID handling selinux: allow reading labels before policy is loaded
| * | | | | selinux: complete the inlining of hashtab functionsOndrej Mosnacek2020-07-092-59/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move (most of) the definitions of hashtab_search() and hashtab_insert() to the header file. In combination with the previous patch, this avoids calling the callbacks indirectly by function pointers and allows for better optimization, leading to a drastic performance improvement of these operations. With this patch, I measured a speed up in the following areas (measured on x86_64 F32 VM with 4 CPUs): 1. Policy load (`load_policy`) - takes ~150 ms instead of ~230 ms. 2. `chcon -R unconfined_u:object_r:user_tmp_t:s0:c381,c519 /tmp/linux-src` where /tmp/linux-src is an extracted linux-5.7 source tarball - takes ~522 ms instead of ~576 ms. This is because of many symtab_search() calls in string_to_context_struct() when there are many categories specified in the context. 3. `stress-ng --msg 1 --msg-ops 10000000` - takes 12.41 s instead of 13.95 s (consumes 18.6 s of kernel CPU time instead of 21.6 s). This is thanks to security_transition_sid() being ~43% faster after this patch. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | | | | selinux: prepare for inlining of hashtab functionsOndrej Mosnacek2020-07-097-63/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor searching and inserting into hashtabs to pave the way for converting hashtab_search() and hashtab_insert() to inline functions in the next patch. This will avoid indirect calls and allow the compiler to better optimize individual callers, leading to a significant performance improvement. In order to avoid the indirect calls, the key hashing and comparison callbacks need to be extracted from the hashtab struct and passed directly to hashtab_search()/_insert() by the callers so that the callback address is always known at compile time. The kernel's rhashtable library (<linux/rhashtable*.h>) does the same thing. This of course makes the hashtab functions slightly easier to misuse by passing a wrong callback set, but unfortunately there is no better way to implement a hash table that is both generic and efficient in C. This patch tries to somewhat mitigate this by only calling the hashtab functions in the same file where the corresponding callbacks are defined (wrapping them into more specialized functions as needed). Note that this patch doesn't bring any benefit without also moving the definitions of hashtab_search() and -_insert() to the header file, which is done in a follow-up patch for easier review of the hashtab.c changes in this patch. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | | | | selinux: specialize symtab insert and search functionsOndrej Mosnacek2020-07-087-56/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This encapsulates symtab a little better and will help with further refactoring later. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>