linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge tag 'integrity-v6.2' of ↵	Linus Torvalds	2022-12-13	7	-25/+57
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Aside from the one cleanup, the other changes are bug fixes: Cleanup: - Include missing iMac Pro 2017 in list of Macs with T2 security chip Bug fixes: - Improper instantiation of "encrypted" keys with user provided data - Not handling delay in updating LSM label based IMA policy rules (-ESTALE) - IMA and integrity memory leaks on error paths - CONFIG_IMA_DEFAULT_HASH_SM3 hash algorithm renamed" * tag 'integrity-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: Fix hash dependency to correct algorithm ima: Fix misuse of dereference of pointer in template_desc_init_fields() integrity: Fix memory leakage in keyring allocation error path ima: Fix memory leak in __ima_inode_hash() ima: Handle -ESTALE returned by ima_filter_rule_match() ima: Simplify ima_lsm_copy_rule ima: Fix a potential NULL pointer access in ima_restore_measurement_list efi: Add iMac Pro 2017 to uefi skip cert quirk KEYS: encrypted: fix key instantiation with user-provided data
\| *	ima: Fix hash dependency to correct algorithm	Tianjia Zhang	2022-11-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit d2825fa9365d ("crypto: sm3,sm4 - move into crypto directory") moves the SM3 and SM4 stand-alone library and the algorithm implementation for the Crypto API into the same directory, and the corresponding relationship of Kconfig is modified, CONFIG_CRYPTO_SM3/4 corresponds to the stand-alone library of SM3/4, and CONFIG_CRYPTO_SM3/4_GENERIC corresponds to the algorithm implementation for the Crypto API. Therefore, it is necessary for this module to depend on the correct algorithm. Fixes: d2825fa9365d ("crypto: sm3,sm4 - move into crypto directory") Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	ima: Fix misuse of dereference of pointer in template_desc_init_fields()	Xiu Jianfeng	2022-11-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The input parameter @fields is type of struct ima_template_field *, so when allocates array memory for @fields, the size of element should be sizeof(field) instead of sizeof(*field). Actually the original code would not cause any runtime error, but it's better to make it logically right. Fixes: adf53a778a0a ("ima: new templates management mechanism") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	integrity: Fix memory leakage in keyring allocation error path	GUO Zihua	2022-11-16	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Key restriction is allocated in integrity_init_keyring(). However, if keyring allocation failed, it is not freed, causing memory leaks. Fixes: 2b6aa412ff23 ("KEYS: Use structure to capture key restriction function and data") Signed-off-by: GUO Zihua <guozihua@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	ima: Fix memory leak in __ima_inode_hash()	Roberto Sassu	2022-11-03	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit f3cc6b25dcc5 ("ima: always measure and audit files in policy") lets measurement or audit happen even if the file digest cannot be calculated. As a result, iint->ima_hash could have been allocated despite ima_collect_measurement() returning an error. Since ima_hash belongs to a temporary inode metadata structure, declared at the beginning of __ima_inode_hash(), just add a kfree() call if ima_collect_measurement() returns an error different from -ENOMEM (in that case, ima_hash should not have been allocated). Cc: stable@vger.kernel.org Fixes: 280fe8367b0d ("ima: Always return a file measurement in ima_file_hash()") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	ima: Handle -ESTALE returned by ima_filter_rule_match()	GUO Zihua	2022-11-02	1	-9/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IMA relies on the blocking LSM policy notifier callback to update the LSM based IMA policy rules. When SELinux update its policies, IMA would be notified and starts updating all its lsm rules one-by-one. During this time, -ESTALE would be returned by ima_filter_rule_match() if it is called with a LSM rule that has not yet been updated. In ima_match_rules(), -ESTALE is not handled, and the LSM rule is considered a match, causing extra files to be measured by IMA. Fix it by re-initializing a temporary rule if -ESTALE is returned by ima_filter_rule_match(). The origin rule in the rule list would be updated by the LSM policy notifier callback. Fixes: b16942455193 ("ima: use the lsm policy update notifier") Signed-off-by: GUO Zihua <guozihua@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	ima: Simplify ima_lsm_copy_rule	GUO Zihua	2022-11-02	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently ima_lsm_copy_rule() set the arg_p field of the source rule to NULL, so that the source rule could be freed afterward. It does not make sense for this behavior to be inside a "copy" function. So move it outside and let the caller handle this field. ima_lsm_copy_rule() now produce a shallow copy of the original entry including args_p field. Meaning only the lsm.rule and the rule itself should be freed for the original rule. Thus, instead of calling ima_lsm_free_rule() which frees lsm.rule as well as args_p field, free the lsm.rule directly. Signed-off-by: GUO Zihua <guozihua@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	ima: Fix a potential NULL pointer access in ima_restore_measurement_list	Huaxin Lu	2022-11-02	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In restore_template_fmt, when kstrdup fails, a non-NULL value will still be returned, which causes a NULL pointer access in template_desc_init_fields. Fixes: c7d09367702e ("ima: support restoring multiple template formats") Cc: stable@kernel.org Co-developed-by: Jiaming Li <lijiaming30@huawei.com> Signed-off-by: Jiaming Li <lijiaming30@huawei.com> Signed-off-by: Huaxin Lu <luhuaxin1@huawei.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	efi: Add iMac Pro 2017 to uefi skip cert quirk	Aditya Garg	2022-11-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The iMac Pro 2017 is also a T2 Mac. Thus add it to the list of uefi skip cert. Cc: stable@vger.kernel.org Fixes: 155ca952c7ca ("efi: Do not import certificates from UEFI Secure Boot for T2 Macs") Link: https://lore.kernel.org/linux-integrity/9D46D92F-1381-4F10-989C-1A12CD2FFDD8@live.com/ Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
\| *	KEYS: encrypted: fix key instantiation with user-provided data	Nikolaus Voss	2022-10-19	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit cd3bc044af48 ("KEYS: encrypted: Instantiate key with user-provided decrypted data") added key instantiation with user provided decrypted data. The user data is hex-ascii-encoded but was just memcpy'ed to the binary buffer. Fix this to use hex2bin instead. Old keys created from user provided decrypted data saved with "keyctl pipe" are still valid, however if the key is recreated from decrypted data the old key must be converted to the correct format. This can be done with a small shell script, e.g.: BROKENKEY=abcdefABCDEF1234567890aaaaaaaaaa NEWKEY=$(echo -ne $BROKENKEY \| xxd -p -c32) keyctl add user masterkey "$(cat masterkey.bin)" @u keyctl add encrypted testkey "new user:masterkey 32 $NEWKEY" @u However, NEWKEY is still broken: If for BROKENKEY 32 bytes were specified, a brute force attacker knowing the key properties would only need to try at most 2^(16*8) keys, as if the key was only 16 bytes long. The security issue is a result of the combination of limiting the input range to hex-ascii and using memcpy() instead of hex2bin(). It could have been fixed either by allowing binary input or using hex2bin() (and doubling the ascii input key length). This patch implements the latter. The corresponding test for the Linux Test Project ltp has also been fixed (see link below). Fixes: cd3bc044af48 ("KEYS: encrypted: Instantiate key with user-provided decrypted data") Cc: stable@kernel.org Link: https://lore.kernel.org/ltp/20221006081709.92303897@mail.steuer-voss.de/ Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Nikolaus Voss <nikolaus.voss@haag-streit.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
* \|	Merge tag 'lsm-pr-20221212' of ↵	Linus Torvalds	2022-12-13	14	-80/+131
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Improve the error handling in the device cgroup such that memory allocation failures when updating the access policy do not potentially alter the policy. - Some minor fixes to reiserfs to ensure that it properly releases LSM-related xattr values. - Update the security_socket_getpeersec_stream() LSM hook to take sockptr_t values. Previously the net/BPF folks updated the getsockopt code in the network stack to leverage the sockptr_t type to make it easier to pass both kernel and __user pointers, but unfortunately when they did so they didn't convert the LSM hook. While there was/is no immediate risk by not converting the LSM hook, it seems like this is a mistake waiting to happen so this patch proactively does the LSM hook conversion. - Convert vfs_getxattr_alloc() to return an int instead of a ssize_t and cleanup the callers. Internally the function was never going to return anything larger than an int and the callers were doing some very odd things casting the return value; this patch fixes all that and helps bring a bit of sanity to vfs_getxattr_alloc() and its callers. - More verbose, and helpful, LSM debug output when the system is booted with "lsm.debug" on the command line. There are examples in the commit description, but the quick summary is that this patch provides better information about which LSMs are enabled and the ordering in which they are processed. - General comment and kernel-doc fixes and cleanups. * tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: Fix description of fs_context_parse_param lsm: Add/fix return values in lsm_hooks.h and fix formatting lsm: Clarify documentation of vm_enough_memory hook reiserfs: Add missing calls to reiserfs_security_free() lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths device_cgroup: Roll back to original exceptions after copy failure LSM: Better reporting of actual LSMs at boot lsm: make security_socket_getpeersec_stream() sockptr_t safe audit: Fix some kernel-doc warnings lsm: remove obsoleted comments for security hooks fs: edit a comment made in bad taste
\| * \|	lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths	Paul Moore	2022-11-18	8	-31/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The vfs_getxattr_alloc() function currently returns a ssize_t value despite the fact that it only uses int values internally for return values. Fix this by converting vfs_getxattr_alloc() to return an int type and adjust the callers as necessary. As part of these caller modifications, some of the callers are fixed to properly free the xattr value buffer on both success and failure to ensure that memory is not leaked in the failure case. Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
\| * \|	device_cgroup: Roll back to original exceptions after copy failure	Wang Weiyang	2022-11-16	1	-4/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When add the 'a : rwm' entry to devcgroup A's whitelist, at first A's exceptions will be cleaned and A's behavior is changed to DEVCG_DEFAULT_ALLOW. Then parent's exceptions will be copyed to A's whitelist. If copy failure occurs, just return leaving A to grant permissions to all devices. And A may grant more permissions than parent. Backup A's whitelist and recover original exceptions after copy failure. Cc: stable@vger.kernel.org Fixes: 4cef7299b478 ("device_cgroup: add proper checking when changing default behavior") Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com> Reviewed-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
\| * \|	LSM: Better reporting of actual LSMs at boot	Kees Cook	2022-11-16	1	-9/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enhance the details reported by "lsm.debug" in several ways: - report contents of "security=" - report contents of "CONFIG_LSM" - report contents of "lsm=" - report any early LSM details - whitespace-align the output of similar phases for easier visual parsing - change "disabled" to more accurate "skipped" - explain what "skipped" and "ignored" mean in a parenthetical Upgrade the "security= is ignored" warning from pr_info to pr_warn, and include full arguments list to make the cause even more clear. Replace static "Security Framework initializing" pr_info with specific list of the resulting order of enabled LSMs. For example, if the kernel is built with: CONFIG_SECURITY_SELINUX=y CONFIG_SECURITY_APPARMOR=y CONFIG_SECURITY_LOADPIN=y CONFIG_SECURITY_YAMA=y CONFIG_SECURITY_SAFESETID=y CONFIG_SECURITY_LOCKDOWN_LSM=y CONFIG_SECURITY_LANDLOCK=y CONFIG_INTEGRITY=y CONFIG_BPF_LSM=y CONFIG_DEFAULT_SECURITY_APPARMOR=y CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,selinux, smack,tomoyo,apparmor,bpf" Booting without options will show: LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin, safesetid,integrity,selinux,bpf landlock: Up and running. Yama: becoming mindful. LoadPin: ready to pin (currently not enforcing) SELinux: Initializing. LSM support for eBPF active Boot with "lsm.debug" will show: LSM: legacy security= unspecified LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity, selinux,smack,tomoyo,apparmor,bpf LSM: boot arg lsm= unspecified LSM: early started: lockdown (enabled) LSM: first ordered: capability (enabled) LSM: builtin ordered: landlock (enabled) LSM: builtin ignored: lockdown (not built into kernel) LSM: builtin ordered: yama (enabled) LSM: builtin ordered: loadpin (enabled) LSM: builtin ordered: safesetid (enabled) LSM: builtin ordered: integrity (enabled) LSM: builtin ordered: selinux (enabled) LSM: builtin ignored: smack (not built into kernel) LSM: builtin ignored: tomoyo (not built into kernel) LSM: builtin ordered: apparmor (enabled) LSM: builtin ordered: bpf (enabled) LSM: exclusive chosen: selinux LSM: exclusive disabled: apparmor LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin, safesetid,integrity,selinux,bpf LSM: cred blob size = 32 LSM: file blob size = 16 LSM: inode blob size = 72 LSM: ipc blob size = 8 LSM: msg_msg blob size = 4 LSM: superblock blob size = 80 LSM: task blob size = 8 LSM: initializing capability LSM: initializing landlock landlock: Up and running. LSM: initializing yama Yama: becoming mindful. LSM: initializing loadpin LoadPin: ready to pin (currently not enforcing) LSM: initializing safesetid LSM: initializing integrity LSM: initializing selinux SELinux: Initializing. LSM: initializing bpf LSM support for eBPF active And some examples of how the lsm.debug ordering report changes... With "lsm.debug security=selinux": LSM: legacy security=selinux LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity, selinux,smack,tomoyo,apparmor,bpf LSM: boot arg lsm= unspecified LSM: early started: lockdown (enabled) LSM: first ordered: capability (enabled) LSM: security=selinux disabled: apparmor (only one legacy major LSM) LSM: builtin ordered: landlock (enabled) LSM: builtin ignored: lockdown (not built into kernel) LSM: builtin ordered: yama (enabled) LSM: builtin ordered: loadpin (enabled) LSM: builtin ordered: safesetid (enabled) LSM: builtin ordered: integrity (enabled) LSM: builtin ordered: selinux (enabled) LSM: builtin ignored: smack (not built into kernel) LSM: builtin ignored: tomoyo (not built into kernel) LSM: builtin ordered: apparmor (disabled) LSM: builtin ordered: bpf (enabled) LSM: exclusive chosen: selinux LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin, safesetid,integrity,selinux,bpf With "lsm.debug lsm=integrity,selinux,loadpin,crabability,bpf, loadpin,loadpin": LSM: legacy security= unspecified LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity, selinux,smack,tomoyo,apparmor,bpf LSM: boot arg lsm=integrity,selinux,loadpin,capability,bpf,loadpin, loadpin LSM: early started: lockdown (enabled) LSM: first ordered: capability (enabled) LSM: cmdline ordered: integrity (enabled) LSM: cmdline ordered: selinux (enabled) LSM: cmdline ordered: loadpin (enabled) LSM: cmdline ignored: crabability (not built into kernel) LSM: cmdline ordered: bpf (enabled) LSM: cmdline skipped: apparmor (not in requested order) LSM: cmdline skipped: yama (not in requested order) LSM: cmdline skipped: safesetid (not in requested order) LSM: cmdline skipped: landlock (not in requested order) LSM: exclusive chosen: selinux LSM: initializing lsm=lockdown,capability,integrity,selinux,loadpin,bpf Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: linux-security-module@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Mickaël Salaün <mic@digikod.net> [PM: line wrapped commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>
\| * \|	lsm: make security_socket_getpeersec_stream() sockptr_t safe	Paul Moore	2022-11-04	4	-35/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4ff09db1b79b ("bpf: net: Change sk_getsockopt() to take the sockptr_t argument") made it possible to call sk_getsockopt() with both user and kernel address space buffers through the use of the sockptr_t type. Unfortunately at the time of conversion the security_socket_getpeersec_stream() LSM hook was written to only accept userspace buffers, and in a desire to avoid having to change the LSM hook the commit author simply passed the sockptr_t's userspace buffer pointer. Since the only sk_getsockopt() callers at the time of conversion which used kernel sockptr_t buffers did not allow SO_PEERSEC, and hence the security_socket_getpeersec_stream() hook, this was acceptable but also very fragile as future changes presented the possibility of silently passing kernel space pointers to the LSM hook. There are several ways to protect against this, including careful code review of future commits, but since relying on code review to catch bugs is a recipe for disaster and the upstream eBPF maintainer is "strongly against defensive programming", this patch updates the LSM hook, and all of the implementations to support sockptr_t and safely handle both user and kernel space buffers. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
\| * \|	audit: Fix some kernel-doc warnings	Bo Liu	2022-10-28	1	-0/+1
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code provokes some kernel-doc warnings: security/lsm_audit.c:198: warning: Function parameter or member 'ab' not described in 'dump_common_audit_data' Signed-off-by: Bo Liu <liubo03@inspur.com> [PM: description line wrap] Signed-off-by: Paul Moore <paul@paul-moore.com>
* \|	Merge tag 'selinux-pr-20221212' of ↵	Linus Torvalds	2022-12-13	5	-47/+52
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Two SELinux patches: one increases the sleep time on deprecated functionality, and one removes the indirect calls in the sidtab context conversion code" * tag 'selinux-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: remove the sidtab context conversion indirect calls selinux: increase the deprecation sleep for checkreqprot and runtime disable
\| * \|	selinux: remove the sidtab context conversion indirect calls	Paul Moore	2022-11-09	4	-44/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sidtab conversion code has support for multiple context conversion routines through the use of function pointers and indirect calls. However, the reality is that all current users rely on the same conversion routine: convert_context(). This patch does away with this extra complexity and replaces the indirect calls with direct function calls; allowing us to remove a layer of obfuscation and create cleaner, more maintainable code. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
\| * \|	selinux: increase the deprecation sleep for checkreqprot and runtime disable	Paul Moore	2022-10-17	1	-2/+2
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Further the checkreqprot and runtime disable deprecation efforts by increasing the sleep time from 5 to 15 seconds to help make this more noticeable for any users who are still using these knobs. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
* \|	Merge tag 'landlock-6.2-rc1' of ↵	Linus Torvalds	2022-12-13	8	-57/+213
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This adds file truncation support to Landlock, contributed by Günther Noack. As described by Günther [1], the goal of these patches is to work towards a more complete coverage of file system operations that are restrictable with Landlock. The known set of currently unsupported file system operations in Landlock is described at [2]. Out of the operations listed there, truncate is the only one that modifies file contents, so these patches should make it possible to prevent the direct modification of file contents with Landlock. The new LANDLOCK_ACCESS_FS_TRUNCATE access right covers both the truncate(2) and ftruncate(2) families of syscalls, as well as open(2) with the O_TRUNC flag. This includes usages of creat() in the case where existing regular files are overwritten. Additionally, this introduces a new Landlock security blob associated with opened files, to track the available Landlock access rights at the time of opening the file. This is in line with Unix's general approach of checking the read and write permissions during open(), and associating this previously checked authorization with the opened file. An ongoing patch documents this use case [3]. In order to treat truncate(2) and ftruncate(2) calls differently in an LSM hook, we split apart the existing security_path_truncate hook into security_path_truncate (for truncation by path) and security_file_truncate (for truncation of previously opened files)" Link: https://lore.kernel.org/r/20221018182216.301684-1-gnoack3000@gmail.com [1] Link: https://www.kernel.org/doc/html/v6.1/userspace-api/landlock.html#filesystem-flags [2] Link: https://lore.kernel.org/r/20221209193813.972012-1-mic@digikod.net [3] * tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: samples/landlock: Document best-effort approach for LANDLOCK_ACCESS_FS_REFER landlock: Document Landlock's file truncation support samples/landlock: Extend sample tool to support LANDLOCK_ACCESS_FS_TRUNCATE selftests/landlock: Test ftruncate on FDs created by memfd_create(2) selftests/landlock: Test FD passing from restricted to unrestricted processes selftests/landlock: Locally define __maybe_unused selftests/landlock: Test open() and ftruncate() in multiple scenarios selftests/landlock: Test file truncation support landlock: Support file truncation landlock: Document init_layer_masks() helper landlock: Refactor check_access_path_dual() into is_access_to_paths_allowed() security: Create file_truncate hook from path_truncate hook
\| * \|	landlock: Support file truncation	Günther Noack	2022-10-19	6	-12/+132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce the LANDLOCK_ACCESS_FS_TRUNCATE flag for file truncation. This flag hooks into the path_truncate, file_truncate and file_alloc_security LSM hooks and covers file truncation using truncate(2), ftruncate(2), open(2) with O_TRUNC, as well as creat(). This change also increments the Landlock ABI version, updates corresponding selftests, and updates code documentation to document the flag. In security/security.c, allocate security blobs at pointer-aligned offsets. This fixes the problem where one LSM's security blob can shift another LSM's security blob to an unaligned address (reported by Nathan Chancellor). The following operations are restricted: open(2): requires the LANDLOCK_ACCESS_FS_TRUNCATE right if a file gets implicitly truncated as part of the open() (e.g. using O_TRUNC). Notable special cases: * open(..., O_RDONLY\|O_TRUNC) can truncate files as well in Linux * open() with O_TRUNC does not need the TRUNCATE right when it creates a new file. truncate(2) (on a path): requires the LANDLOCK_ACCESS_FS_TRUNCATE right. ftruncate(2) (on a file): requires that the file had the TRUNCATE right when it was previously opened. File descriptors acquired by other means than open(2) (e.g. memfd_create(2)) continue to support truncation with ftruncate(2). Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM) Link: https://lore.kernel.org/r/20221018182216.301684-5-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>
\| * \|	landlock: Document init_layer_masks() helper	Günther Noack	2022-10-19	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add kernel-doc to the init_layer_masks() function. Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20221018182216.301684-4-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>
\| * \|	landlock: Refactor check_access_path_dual() into is_access_to_paths_allowed()	Günther Noack	2022-10-19	1	-45/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename check_access_path_dual() to is_access_to_paths_allowed(). Make it return true iff the access is allowed. Calculate the EXDEV/EACCES error code in the one place where it's needed. Suggested-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20221018182216.301684-3-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>
\| * \|	security: Create file_truncate hook from path_truncate hook	Günther Noack	2022-10-19	3	-0/+24
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Like path_truncate, the file_truncate hook also restricts file truncation, but is called in the cases where truncation is attempted on an already-opened file. This is required in a subsequent commit to handle ftruncate() operations differently to truncate() operations. Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20221018182216.301684-2-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>
* \|	Merge tag 'fs.vfsuid.conversion.v6.2' of ↵	Linus Torvalds	2022-12-12	5	-54/+68
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfsuid updates from Christian Brauner: "Last cycle we introduced the vfs{g,u}id_t types and associated helpers to gain type safety when dealing with idmapped mounts. That initial work already converted a lot of places over but there were still some left, This converts all remaining places that still make use of non-type safe idmapping helpers to rely on the new type safe vfs{g,u}id based helpers. Afterwards it removes all the old non-type safe helpers" * tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: fs: remove unused idmapping helpers ovl: port to vfs{g,u}id_t and associated helpers fuse: port to vfs{g,u}id_t and associated helpers ima: use type safe idmapping helpers apparmor: use type safe idmapping helpers caps: use type safe idmapping helpers fs: use type safe idmapping helpers mnt_idmapping: add missing helpers
\| * \|	ima: use type safe idmapping helpers	Christian Brauner	2022-10-26	1	-16/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already ported most parts and filesystems over for v6.0 to the new vfs{g,u}id_t type and associated helpers for v6.0. Convert the remaining places so we can remove all the old helpers. This is a non-functional change. Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	apparmor: use type safe idmapping helpers	Christian Brauner	2022-10-26	3	-13/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already ported most parts and filesystems over for v6.0 to the new vfs{g,u}id_t type and associated helpers for v6.0. Convert the remaining places so we can remove all the old helpers. This is a non-functional change. Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	caps: use type safe idmapping helpers	Christian Brauner	2022-10-26	1	-25/+26
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already ported most parts and filesystems over for v6.0 to the new vfs{g,u}id_t type and associated helpers for v6.0. Convert the remaining places so we can remove all the old helpers. This is a non-functional change. Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* \|	Merge tag 'fs.acl.rework.v6.2' of ↵	Linus Torvalds	2022-12-12	5	-65/+225
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull VFS acl updates from Christian Brauner: "This contains the work that builds a dedicated vfs posix acl api. The origins of this work trace back to v5.19 but it took quite a while to understand the various filesystem specific implementations in sufficient detail and also come up with an acceptable solution. As we discussed and seen multiple times the current state of how posix acls are handled isn't nice and comes with a lot of problems: The current way of handling posix acls via the generic xattr api is error prone, hard to maintain, and type unsafe for the vfs until we call into the filesystem's dedicated get and set inode operations. It is already the case that posix acls are special-cased to death all the way through the vfs. There are an uncounted number of hacks that operate on the uapi posix acl struct instead of the dedicated vfs struct posix_acl. And the vfs must be involved in order to interpret and fixup posix acls before storing them to the backing store, caching them, reporting them to userspace, or for permission checking. Currently a range of hacks and duct tape exist to make this work. As with most things this is really no ones fault it's just something that happened over time. But the code is hard to understand and difficult to maintain and one is constantly at risk of introducing bugs and regressions when having to touch it. Instead of continuing to hack posix acls through the xattr handlers this series builds a dedicated posix acl api solely around the get and set inode operations. Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() helpers must be used in order to interact with posix acls. They operate directly on the vfs internal struct posix_acl instead of abusing the uapi posix acl struct as we currently do. In the end this removes all of the hackiness, makes the codepaths easier to maintain, and gets us type safety. This series passes the LTP and xfstests suites without any regressions. For xfstests the following combinations were tested: - xfs - ext4 - btrfs - overlayfs - overlayfs on top of idmapped mounts - orangefs - (limited) cifs There's more simplifications for posix acls that we can make in the future if the basic api has made it. A few implementation details: - The series makes sure to retain exactly the same security and integrity module permission checks. Especially for the integrity modules this api is a win because right now they convert the uapi posix acl struct passed to them via a void pointer into the vfs struct posix_acl format to perform permission checking on the mode. There's a new dedicated security hook for setting posix acls which passes the vfs struct posix_acl not a void pointer. Basing checking on the posix acl stored in the uapi format is really unreliable. The vfs currently hacks around directly in the uapi struct storing values that frankly the security and integrity modules can't correctly interpret as evidenced by bugs we reported and fixed in this area. It's not necessarily even their fault it's just that the format we provide to them is sub optimal. - Some filesystems like 9p and cifs need access to the dentry in order to get and set posix acls which is why they either only partially or not even at all implement get and set inode operations. For example, cifs allows setxattr() and getxattr() operations but doesn't allow permission checking based on posix acls because it can't implement a get acl inode operation. Thus, this patch series updates the set acl inode operation to take a dentry instead of an inode argument. However, for the get acl inode operation we can't do this as the old get acl method is called in e.g., generic_permission() and inode_permission(). These helpers in turn are called in various filesystem's permission inode operation. So passing a dentry argument to the old get acl inode operation would amount to passing a dentry to the permission inode operation which we shouldn't and probably can't do. So instead of extending the existing inode operation Christoph suggested to add a new one. He also requested to ensure that the get and set acl inode operation taking a dentry are consistently named. So for this version the old get acl operation is renamed to ->get_inode_acl() and a new ->get_acl() inode operation taking a dentry is added. With this we can give both 9p and cifs get and set acl inode operations and in turn remove their complex custom posix xattr handlers. In the future I hope to get rid of the inode method duplication but it isn't like we have never had this situation. Readdir is just one example. And frankly, the overall gain in type safety and the more pleasant api wise are simply too big of a benefit to not accept this duplication for a while. - We've done a full audit of every codepaths using variant of the current generic xattr api to get and set posix acls and surprisingly it isn't that many places. There's of course always a chance that we might have missed some and if so I'm sure we'll find them soon enough. The crucial codepaths to be converted are obviously stacking filesystems such as ecryptfs and overlayfs. For a list of all callers currently using generic xattr api helpers see [2] including comments whether they support posix acls or not. - The old vfs generic posix acl infrastructure doesn't obey the create and replace semantics promised on the setxattr(2) manpage. This patch series doesn't address this. It really is something we should revisit later though. The patches are roughly organized as follows: (1) Change existing set acl inode operation to take a dentry argument (Intended to be a non-functional change) (2) Rename existing get acl method (Intended to be a non-functional change) (3) Implement get and set acl inode operations for filesystems that couldn't implement one before because of the missing dentry. That's mostly 9p and cifs (Intended to be a non-functional change) (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() including security and integrity hooks (Intended to be a non-functional change) (5) Implement get and set acl inode operations for stacking filesystems (Intended to be a non-functional change) (6) Switch posix acl handling in stacking filesystems to new posix acl api now that all filesystems it can stack upon support it. (7) Switch vfs to new posix acl api (semantical change) (8) Remove all now unused helpers (9) Additional regression fixes reported after we merged this into linux-next Thanks to Seth for a lot of good discussion around this and encouragement and input from Christoph" * tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits) posix_acl: Fix the type of sentinel in get_acl orangefs: fix mode handling ovl: call posix_acl_release() after error checking evm: remove dead code in evm_inode_set_acl() cifs: check whether acl is valid early acl: make vfs_posix_acl_to_xattr() static acl: remove a slew of now unused helpers 9p: use stub posix acl handlers cifs: use stub posix acl handlers ovl: use stub posix acl handlers ecryptfs: use stub posix acl handlers evm: remove evm_xattr_acl_change() xattr: use posix acl api ovl: use posix acl api ovl: implement set acl method ovl: implement get acl method ecryptfs: implement set acl method ecryptfs: implement get acl method ksmbd: use vfs_remove_acl() acl: add vfs_remove_acl() ...
\| * \|	evm: remove dead code in evm_inode_set_acl()	Christian Brauner	2022-10-28	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When evm_status is INTEGRITY_PASS then this function returns early and so later codepaths that check for evm_status != INTEGRITY_PASS can be removed as they are dead code. Fixes: e61b135f7bfe ("integrity: implement get and set acl hook") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	evm: remove evm_xattr_acl_change()	Christian Brauner	2022-10-20	1	-64/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The security and integrity infrastructure has dedicated hooks now so evm_xattr_acl_change() is dead code. Before this commit the callchain was: evm_protect_xattr() -> evm_xattr_change() -> evm_xattr_acl_change() where evm_protect_xattr() was hit from evm_inode_setxattr() and evm_inode_removexattr(). But now we have evm_inode_set_acl() and evm_inode_remove_acl() and have switched over the vfs to rely on the posix acl api so the code isn't hit anymore. Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	integrity: implement get and set acl hook	Christian Brauner	2022-10-20	3	-3/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. I spent considerate time in the security module and integrity infrastructure and audited all codepaths. EVM is the only part that really has restrictions based on the actual posix acl values passed through it (e.g., i_mode). Before this dedicated hook EVM used to translate from the uapi posix acl format sent to it in the form of a void pointer into the vfs format. This is not a good thing. Instead of hacking around in the uapi struct give EVM the posix acls in the appropriate vfs format and perform sane permissions checks that mirror what it used to to in the generic xattr hook. IMA doesn't have any restrictions on posix acls. When posix acls are changed it just wants to update its appraisal status to trigger an EVM revalidation. The removal of posix acls is equivalent to passing NULL to the posix set acl hooks. This is the same as before through the generic xattr api. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Acked-by: Paul Moore <paul@paul-moore.com> (LSM) Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	smack: implement get, set and remove acl hook	Christian Brauner	2022-10-20	1	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. I spent considerate time in the security module infrastructure and audited all codepaths. Smack has no restrictions based on the posix acl values passed through it. The capability hook doesn't need to be called either because it only has restrictions on security.* xattrs. So these all becomes very simple hooks for smack. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	selinux: implement get, set and remove acl hook	Christian Brauner	2022-10-20	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. I spent considerate time in the security module infrastructure and audited all codepaths. SELinux has no restrictions based on the posix acl values passed through it. The capability hook doesn't need to be called either because it only has restrictions on security.* xattrs. So these are all fairly simply hooks for SELinux. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
\| * \|	security: add get, remove and set acl hook	Christian Brauner	2022-10-20	1	-0/+25
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. In the next patches we implement the hooks for the few security modules that do actually have restrictions on posix acls. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* \|	Merge tag 'pull-iov_iter' of ↵	Linus Torvalds	2022-12-12	1	-2/+2
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: "iov_iter work; most of that is about getting rid of direction misannotations and (hopefully) preventing more of the same for the future" * tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: use less confusing names for iov_iter direction initializers iov_iter: saner checks for attempt to copy to/from iterator [xen] fix "direction" argument of iov_iter_kvec() [vhost] fix 'direction' argument of iov_iter_{init,bvec}() [target] fix iov_iter_bvec() "direction" argument [s390] memcpy_real(): WRITE is "data source", not destination... [s390] zcore: WRITE is "data source", not destination... [infiniband] READ is "data destination", not source... [fsi] WRITE is "data source", not destination... [s390] copy_oldmem_kernel() - WRITE is "data source", not destination csum_and_copy_to_iter(): handle ITER_DISCARD get rid of unlikely() on page_copy_sane() calls
\| * \|	use less confusing names for iov_iter direction initializers	Al Viro	2022-11-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \| \|	Merge tag 'linux-kselftest-kunit-next-6.2-rc1' of ↵	Linus Torvalds	2022-12-12	5	-168/+196
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit updates from Shuah Khan: "Several enhancements, fixes, clean-ups, documentation updates, improvements to logging and KTAP compliance of KUnit test output: - log numbers in decimal and hex - parse KTAP compliant test output - allow conditionally exposing static symbols to tests when KUNIT is enabled - make static symbols visible during kunit testing - clean-ups to remove unused structure definition" * tag 'linux-kselftest-kunit-next-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits) Documentation: dev-tools: Clarify requirements for result description apparmor: test: make static symbols visible during kunit testing kunit: add macro to allow conditionally exposing static symbols to tests kunit: tool: make parser preserve whitespace when printing test log Documentation: kunit: Fix "How Do I Use This" / "Next Steps" sections kunit: tool: don't include KTAP headers and the like in the test log kunit: improve KTAP compliance of KUnit test output kunit: tool: parse KTAP compliant test output mm: slub: test: Use the kunit_get_current_test() function kunit: Use the static key when retrieving the current test kunit: Provide a static key to check if KUnit is actively running tests kunit: tool: make --json do nothing if --raw_ouput is set kunit: tool: tweak error message when no KTAP found kunit: remove KUNIT_INIT_MEM_ASSERTION macro Documentation: kunit: Remove redundant 'tips.rst' page Documentation: KUnit: reword description of assertions Documentation: KUnit: make usage.rst a superset of tips.rst, remove duplication kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT macros kunit: tool: remove redundant file.close() call in unit test kunit: tool: unit tests all check parser errors, standardize formatting a bit ...
\| * \| \|	apparmor: test: make static symbols visible during kunit testing	Rae Moar	2022-12-12	5	-168/+196
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use macros, VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT, to allow static symbols to be conditionally set to be visible during apparmor_policy_unpack_test, which removes the need to include the testing file in the implementation file. Change the namespace of the symbols that are now conditionally visible (by adding the prefix aa_) to avoid confusion with symbols of the same name. Allow the test to be built as a module and namespace the module name from policy_unpack_test to apparmor_policy_unpack_test to improve clarity of the module name. Provide an example of how static symbols can be dealt with in testing. Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* \| \| \|	KEYS: trusted: tee: Make registered shm dependency explicit	Sumit Garg	2022-12-08	1	-1/+2
\| \|/ / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TEE trusted keys support depends on registered shared memory support since the key buffers are needed to be registered with OP-TEE. So make that dependency explicit to not register trusted keys support if underlying implementation doesn't support registered shared memory. Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Tested-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
* \| \|	Merge tag 'lsm-pr-20221031' of ↵	Linus Torvalds	2022-10-31	1	-2/+4
\|\ \ \ \| \|/ / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM fix from Paul Moore: "A single patch to the capabilities code to fix a potential memory leak in the xattr allocation error handling" * tag 'lsm-pr-20221031' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: capabilities: fix potential memleak on error path from vfs_getxattr_alloc()
\| * \|	capabilities: fix potential memleak on error path from vfs_getxattr_alloc()	Gaosheng Cui	2022-10-28	1	-2/+4
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In cap_inode_getsecurity(), we will use vfs_getxattr_alloc() to complete the memory allocation of tmpbuf, if we have completed the memory allocation of tmpbuf, but failed to call handler->get(...), there will be a memleak in below logic: \|-- ret = (int)vfs_getxattr_alloc(mnt_userns, ...) \| /* ^^^ alloc for tmpbuf / \|-- value = krealloc(xattr_value, error + 1, flags) \| /* ^^^ alloc memory / \|-- error = handler->get(handler, ...) \| / error! / \|-- xattr_value = value \| /* xattr_value is &tmpbuf (memory leak!) */ So we will try to free(tmpbuf) after vfs_getxattr_alloc() fails to fix it. Cc: stable@vger.kernel.org Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Acked-by: Serge Hallyn <serge@hallyn.com> [PM: subject line and backtrace tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
* /	selinux: enable use of both GFP_KERNEL and GFP_ATOMIC in convert_context()	GONG, Ruiqi	2022-10-19	3	-5/+6
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following warning was triggered on a hardware environment: SELinux: Converting 162 SID table entries... BUG: sleeping function called from invalid context at __might_sleep+0x60/0x74 0x0 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 5943, name: tar CPU: 7 PID: 5943 Comm: tar Tainted: P O 5.10.0 #1 Call trace: dump_backtrace+0x0/0x1c8 show_stack+0x18/0x28 dump_stack+0xe8/0x15c ___might_sleep+0x168/0x17c __might_sleep+0x60/0x74 __kmalloc_track_caller+0xa0/0x7dc kstrdup+0x54/0xac convert_context+0x48/0x2e4 sidtab_context_to_sid+0x1c4/0x36c security_context_to_sid_core+0x168/0x238 security_context_to_sid_default+0x14/0x24 inode_doinit_use_xattr+0x164/0x1e4 inode_doinit_with_dentry+0x1c0/0x488 selinux_d_instantiate+0x20/0x34 security_d_instantiate+0x70/0xbc d_splice_alias+0x4c/0x3c0 ext4_lookup+0x1d8/0x200 [ext4] __lookup_slow+0x12c/0x1e4 walk_component+0x100/0x200 path_lookupat+0x88/0x118 filename_lookup+0x98/0x130 user_path_at_empty+0x48/0x60 vfs_statx+0x84/0x140 vfs_fstatat+0x20/0x30 __se_sys_newfstatat+0x30/0x74 __arm64_sys_newfstatat+0x1c/0x2c el0_svc_common.constprop.0+0x100/0x184 do_el0_svc+0x1c/0x2c el0_svc+0x20/0x34 el0_sync_handler+0x80/0x17c el0_sync+0x13c/0x140 SELinux: Context system_u:object_r:pssp_rsyslog_log_t:s0:c0 is not valid (left unmapped). It was found that within a critical section of spin_lock_irqsave in sidtab_context_to_sid(), convert_context() (hooked by sidtab_convert_params.func) might cause the process to sleep via allocating memory with GFP_KERNEL, which is problematic. As Ondrej pointed out [1], convert_context()/sidtab_convert_params.func has another caller sidtab_convert_tree(), which is okay with GFP_KERNEL. Therefore, fix this problem by adding a gfp_t argument for convert_context()/sidtab_convert_params.func and pass GFP_KERNEL/_ATOMIC properly in individual callers. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20221018120111.1474581-1-gongruiqi1@huawei.com/ [1] Reported-by: Tan Ninghao <tanninghao1@huawei.com> Fixes: ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve performance") Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> [PM: wrap long BUG() output lines, tweak subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>
*	Merge tag 'mm-stable-2022-10-08' of ↵	Linus Torvalds	2022-10-10	1	-0/+4
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
\| *	security: kmsan: fix interoperability with auto-initialization	Alexander Potapenko	2022-10-03	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Heap and stack initialization is great, but not when we are trying uses of uninitialized memory. When the kernel is built with KMSAN, having kernel memory initialization enabled may introduce false negatives. We disable CONFIG_INIT_STACK_ALL_PATTERN and CONFIG_INIT_STACK_ALL_ZERO under CONFIG_KMSAN, making it impossible to auto-initialize stack variables in KMSAN builds. We also disable CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON to prevent accidental use of heap auto-initialization. We however still let the users enable heap auto-initialization at boot-time (by setting init_on_alloc=1 or init_on_free=1), in which case a warning is printed. Link: https://lkml.kernel.org/r/20220915150417.722975-31-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* \|	Merge tag 'tpmdd-next-v6.1-rc1' of ↵	Linus Torvalds	2022-10-10	1	-1/+1
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "Just a few bug fixes this time" * tag 'tpmdd-next-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: selftest: tpm2: Add Client.__del__() to close /dev/tpm* handle security/keys: Remove inconsistent __user annotation char: move from strlcpy with unused retval to strscpy
\| * \|	security/keys: Remove inconsistent __user annotation	Vincenzo Frascino	2022-10-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The declaration of keyring_read does not match the definition (security/keys/keyring.c). In this case the definition is correct because it matches what defined in "struct key_type::read" (linux/key-type.h). Fix the declaration removing the inconsistent __user annotation. Cc: David Howells <dhowells@redhat.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
* \| \|	Merge tag 'powerpc-6.1-1' of ↵	Linus Torvalds	2022-10-09	1	-0/+2
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Remove our now never-true definitions for pgd_huge() and p4d_leaf(). - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit. - Add support for syscall wrappers. - Add support for KFENCE on 64-bit. - Update 64-bit HV KVM to use the new guest state entry/exit accounting API. - Support execute-only memory when using the Radix MMU (P9 or later). - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests. - Updates to our linker script to move more data into read-only sections. - Allow the VDSO to be randomised on 32-bit. - Many other small features and fixes. Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas, Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng Yongjun. * tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits) KVM: PPC: Book3S HV: Fix stack frame regs marker powerpc: Don't add __powerpc_ prefix to syscall entry points powerpc/64s/interrupt: Fix stack frame regs marker powerpc/64: Fix msr_check_and_set/clear MSR[EE] race powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN powerpc/pseries: Add firmware details to the hardware description powerpc/powernv: Add opal details to the hardware description powerpc: Add device-tree model to the hardware description powerpc/64: Add logical PVR to the hardware description powerpc: Add PVR & CPU name to hardware description powerpc: Add hardware description string powerpc/configs: Enable PPC_UV in powernv_defconfig powerpc/configs: Update config files for removed/renamed symbols powerpc/mm: Fix UBSAN warning reported on hugetlb powerpc/mm: Always update max/min_low_pfn in mem_topology_setup() powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range powerpc: Drops STABS_DEBUG from linker scripts powerpc/64s: Remove lost/old comment powerpc/64s: Remove old STAB comment powerpc: remove orphan systbl_chk.sh ...
\| * \| \|	powerpc/rtas: block error injection when locked down	Nathan Lynch	2022-09-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The error injection facility on pseries VMs allows corruption of arbitrary guest memory, potentially enabling a sufficiently privileged user to disable lockdown or perform other modifications of the running kernel via the rtas syscall. Block the PAPR error injection facility from being opened or called when locked down. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926131643.146502-3-nathanl@linux.ibm.com
\| * \| \|	powerpc/pseries: block untrusted device tree changes when locked down	Nathan Lynch	2022-09-28	1	-0/+1
\| \| \|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The /proc/powerpc/ofdt interface allows the root user to freely alter the in-kernel device tree, enabling arbitrary physical address writes via drivers that could bind to malicious device nodes, thus making it possible to disable lockdown. Historically this interface has been used on the pseries platform to facilitate the runtime addition and removal of processor, memory, and device resources (aka Dynamic Logical Partitioning or DLPAR). Years ago, the processor and memory use cases were migrated to designs that happen to be lockdown-friendly: device tree updates are communicated directly to the kernel from firmware without passing through untrusted user space. I/O device DLPAR via the "drmgr" command in powerpc-utils remains the sole legitimate user of /proc/powerpc/ofdt, but it is already broken in lockdown since it uses /dev/mem to allocate argument buffers for the rtas syscall. So only illegitimate uses of the interface should see a behavior change when running on a locked down kernel. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926131643.146502-2-nathanl@linux.ibm.com