summaryrefslogtreecommitdiffstats
path: root/security
Commit message (Collapse)AuthorAgeFilesLines
...
| * | selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socketMickaël Salaün2024-01-041-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IPv6 network stack first checks the sockaddr length (-EINVAL error) before checking the family (-EAFNOSUPPORT error). This was discovered thanks to commit a549d055a22e ("selftests/landlock: Add network tests"). Cc: Eric Paris <eparis@parisplace.org> Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Closes: https://lore.kernel.org/r/0584f91c-537c-4188-9e4f-04f192565667@collabora.com Fixes: 0f8db8cc73df ("selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()") Signed-off-by: Mickaël Salaün <mic@digikod.net> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/initial_sid_to_string.hPaul Moore2023-12-221-29/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/xfrm.hPaul Moore2023-12-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/security.hPaul Moore2023-12-221-80/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues with security/selinux/include/policycap_names.hPaul Moore2023-12-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/policycap.hPaul Moore2023-12-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/objsec.hPaul Moore2023-12-221-64/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues with security/selinux/include/netlabel.hPaul Moore2023-12-221-33/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/netif.hPaul Moore2023-12-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/ima.hPaul Moore2023-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/conditional.hPaul Moore2023-12-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/classmap.hPaul Moore2023-12-221-210/+132
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/avc_ss.hPaul Moore2023-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: align avc_has_perm_noaudit() prototype with definitionPaul Moore2023-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A trivial correction to convert an 'unsigned' parameter into an 'unsigned int' parameter so the prototype matches the function definition. I really thought that someone submitted a patch for this a few years ago but sadly I can't find it now. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/avc.hPaul Moore2023-12-221-26/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: fix style issues in security/selinux/include/audit.hPaul Moore2023-12-221-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of on ongoing effort to perform more automated testing and provide more tools for individual developers to validate their patches before submitting, we are trying to make our code "clang-format clean". My hope is that once we have fixed all of our style "quirks", developers will be able to run clang-format on their patches to help avoid silly formatting problems and ensure their changes fit in well with the rest of the SELinux kernel code. Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: remove the wrong comment about multithreaded process handlingMunehisa Kamata2023-12-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit d9250dea3f89 ("SELinux: add boundary support and thread context assignment"), SELinux has been supporting assigning per-thread security context under a constraint and the comment was updated accordingly. However, seems like commit d84f4f992cbd ("CRED: Inaugurate COW credentials") accidentally brought the old comment back that doesn't match what the code does. Considering the ease of understanding the code, this patch just removes the wrong comment. Fixes: d84f4f992cbd ("CRED: Inaugurate COW credentials") Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: introduce an initial SID for early boot processesOndrej Mosnacek2023-11-217-2/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, SELinux doesn't allow distinguishing between kernel threads and userspace processes that are started before the policy is first loaded - both get the label corresponding to the kernel SID. The only way a process that persists from early boot can get a meaningful label is by doing a voluntary dyntransition or re-executing itself. Reusing the kernel label for userspace processes is problematic for several reasons: 1. The kernel is considered to be a privileged domain and generally needs to have a wide range of permissions allowed to work correctly, which prevents the policy writer from effectively hardening against early boot processes that might remain running unintentionally after the policy is loaded (they represent a potential extra attack surface that should be mitigated). 2. Despite the kernel being treated as a privileged domain, the policy writer may want to impose certain special limitations on kernel threads that may conflict with the requirements of intentional early boot processes. For example, it is a good hardening practice to limit what executables the kernel can execute as usermode helpers and to confine the resulting usermode helper processes. However, a (legitimate) process surviving from early boot may need to execute a different set of executables. 3. As currently implemented, overlayfs remembers the security context of the process that created an overlayfs mount and uses it to bound subsequent operations on files using this context. If an overlayfs mount is created before the SELinux policy is loaded, these "mounter" checks are made against the kernel context, which may clash with restrictions on the kernel domain (see 2.). To resolve this, introduce a new initial SID (reusing the slot of the former "init" initial SID) that will be assigned to any userspace process started before the policy is first loaded. This is easy to do, as we can simply label any process that goes through the bprm_creds_for_exec LSM hook with the new init-SID instead of propagating the kernel SID from the parent. To provide backwards compatibility for existing policies that are unaware of this new semantic of the "init" initial SID, introduce a new policy capability "userspace_initial_context" and set the "init" SID to the same context as the "kernel" SID unless this capability is set by the policy. Another small backwards compatibility measure is needed in security_sid_to_context_core() for before the initial SELinux policy load - see the code comment for explanation. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> [PM: edited comments based on feedback/discussion] Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: refactor avtab_node comparisonsJacob Satterfield2023-11-201-60/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In four separate functions within avtab, the same comparison logic is used. The only difference is how the result is handled or whether there is a unique specifier value to be checked for or used. Extracting this functionality into the avtab_node_cmp() function unifies the comparison logic between searching and insertion and gets rid of duplicative code so that the implementation is easier to maintain. Signed-off-by: Jacob Satterfield <jsatterfield.linux@gmail.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: update filenametr_hash() to use full_name_hash()Paul Moore2023-11-161-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | Using full_name_hash() instead of partial_name_hash() should result in cleaner and better performing code. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | selinux: saner handling of policy reloadsAl Viro2023-11-161-78/+66
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On policy reload selinuxfs replaces two subdirectories (/booleans and /class) with new variants. Unfortunately, that's done with serious abuses of directory locking. 1) lock_rename() should be done to parents, not to objects being exchanged 2) there's a bunch of reasons why it should not be done for directories that do not have a common ancestor; most of those do not apply to selinuxfs, but even in the best case the proof is subtle and brittle. 3) failure halfway through the creation of /class will leak names and values arrays. 4) use of d_genocide() is also rather brittle; it's probably not much of a bug per se, but e.g. an overmount of /sys/fs/selinuxfs/classes/shm/index with any regular file will end up with leaked mount on policy reload. Sure, don't do it, but... Let's stop messing with disconnected directories; just create a temporary (/.swapover) with no permissions for anyone (on the level of ->permission() returing -EPERM, no matter who's calling it) and build the new /booleans and /class in there; then lock_rename on root and that temporary directory and d_exchange() old and new both for class and booleans. Then unlock and use simple_recursive_removal() to take the temporary out; it's much more robust. And instead of bothering with separate pathways for freeing new (on failure halfway through) and old (on success) names/values, do all freeing in one place. With temporaries swapped with the old ones when we are past all possible failures. The only user-visible difference is that /.swapover shows up (but isn't possible to open, look up into, etc.) for the duration of policy reload. Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [PM: applied some fixes from Al post merge] Signed-off-by: Paul Moore <paul@paul-moore.com>
* | Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of ↵Linus Torvalds2024-01-091-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Quite a lot of kexec work this time around. Many singleton patches in many places. The notable patch series are: - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio conversions for file paths'. - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2: Folio conversions for directory paths'. - IA64 remnant removal in Heiko Carstens's 'Remove unused code after IA-64 removal'. - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere in 'Treewide: enable -Wmissing-prototypes'. This had some followup fixes: - Nathan Chancellor has cleaned up the hexagon build in the series 'hexagon: Fix up instances of -Wmissing-prototypes'. - Nathan also addressed some s390 warnings in 's390: A couple of fixes for -Wmissing-prototypes'. - Arnd Bergmann addresses the same warnings for MIPS in his series 'mips: address -Wmissing-prototypes warnings'. - Baoquan He has made kexec_file operate in a top-down-fitting manner similar to kexec_load in the series 'kexec_file: Load kernel at top of system RAM if required' - Baoquan He has also added the self-explanatory 'kexec_file: print out debugging message if required'. - Some checkstack maintenance work from Tiezhu Yang in the series 'Modify some code about checkstack'. - Douglas Anderson has disentangled the watchdog code's logging when multiple reports are occurring simultaneously. The series is 'watchdog: Better handling of concurrent lockups'. - Yuntao Wang has contributed some maintenance work on the crash code in 'crash: Some cleanups and fixes'" * tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits) crash_core: fix and simplify the logic of crash_exclude_mem_range() x86/crash: use SZ_1M macro instead of hardcoded value x86/crash: remove the unused image parameter from prepare_elf_headers() kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE scripts/decode_stacktrace.sh: strip unexpected CR from lines watchdog: if panicking and we dumped everything, don't re-enable dumping watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/hardlockup: adopt softlockup logic avoiding double-dumps kexec_core: fix the assignment to kimage->control_page x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init() lib/trace_readwrite.c:: replace asm-generic/io with linux/io nilfs2: cpfile: fix some kernel-doc warnings stacktrace: fix kernel-doc typo scripts/checkstack.pl: fix no space expression between sp and offset x86/kexec: fix incorrect argument passed to kexec_dprintk() x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs nilfs2: add missing set_freezable() for freezable kthread kernel: relay: remove relay_file_splice_read dead code, doesn't work docs: submit-checklist: remove all of "make namespacecheck" ...
| * | kexec_file: print out debugging message if requiredBaoquan He2023-12-201-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Then when specifying '-d' for kexec_file_load interface, loaded locations of kernel/initrd/cmdline etc can be printed out to help debug. Here replace pr_debug() with the newly added kexec_dprintk() in kexec_file loading related codes. And also print out type/start/head of kimage and flags to help debug. Link: https://lkml.kernel.org/r/20231213055747.61826-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Conor Dooley <conor@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'mm-stable-2024-01-08-15-31' of ↵Linus Torvalds2024-01-091-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...
| * | mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov2024-01-081-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'vfs-6.8.iov_iter' of ↵Linus Torvalds2024-01-081-3/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs iov_iter cleanups from Christian Brauner: "This contains a minor cleanup. The patches drop an unused argument from import_single_range() allowing to replace import_single_range() with import_ubuf() and dropping import_single_range() completely" * tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iov_iter: replace import_single_range() with import_ubuf() iov_iter: remove unused 'iov' argument from import_single_range()
| * | iov_iter: replace import_single_range() with import_ubuf()Jens Axboe2023-12-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the removal of the 'iov' argument to import_single_range(), the two functions are now fully identical. Convert the import_single_range() callers to import_ubuf(), and remove the former fully. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20231204174827.1258875-3-axboe@kernel.dk Signed-off-by: Christian Brauner <brauner@kernel.org>
| * | iov_iter: remove unused 'iov' argument from import_single_range()Jens Axboe2023-12-051-2/+1
| |/ | | | | | | | | | | | | | | It is entirely unused, just get rid of it. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20231204174827.1258875-2-axboe@kernel.dk Signed-off-by: Christian Brauner <brauner@kernel.org>
* | Merge tag 'vfs-6.8.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds2024-01-081-8/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull vfs rw updates from Christian Brauner: "This contains updates from Amir for read-write backing file helpers for stacking filesystems such as overlayfs: - Fanotify is currently in the process of introducing pre content events. Roughly, a new permission event will be added indicating that it is safe to write to the file being accessed. These events are used by hierarchical storage managers to e.g., fill the content of files on first access. During that work we noticed that our current permission checking is inconsistent in rw_verify_area() and remap_verify_area(). Especially in the splice code permission checking is done multiple times. For example, one time for the whole range and then again for partial ranges inside the iterator. In addition, we mostly do permission checking before we call file_start_write() except for a few places where we call it after. For pre-content events we need such permission checking to be done before file_start_write(). So this is a nice reason to clean this all up. After this series, all permission checking is done before file_start_write(). As part of this cleanup we also massaged the splice code a bit. We got rid of a few helpers because we are alredy drowning in special read-write helpers. We also cleaned up the return types for splice helpers. - Introduce generic read-write helpers for backing files. This lifts some overlayfs code to common code so it can be used by the FUSE passthrough work coming in over the next cycles. Make Amir and Miklos the maintainers for this new subsystem of the vfs" * tag 'vfs-6.8.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits) fs: fix __sb_write_started() kerneldoc formatting fs: factor out backing_file_mmap() helper fs: factor out backing_file_splice_{read,write}() helpers fs: factor out backing_file_{read,write}_iter() helpers fs: prepare for stackable filesystems backing file helpers fsnotify: optionally pass access range in file permission hooks fsnotify: assert that file_start_write() is not held in permission hooks fsnotify: split fsnotify_perm() into two hooks fs: use splice_copy_file_range() inline helper splice: return type ssize_t from all helpers fs: use do_splice_direct() for nfsd/ksmbd server-side-copy fs: move file_start_write() into direct_splice_actor() fs: fork splice_file_range() from do_splice_direct() fs: create {sb,file}_write_not_started() helpers fs: create file_write_started() helper fs: create __sb_write_started() helper fs: move kiocb_start_write() into vfs_iocb_iter_write() fs: move permission hook out of do_iter_read() fs: move permission hook out of do_iter_write() fs: move file_start_write() into vfs_iter_write() ...
| * | fsnotify: optionally pass access range in file permission hooksAmir Goldstein2023-12-121-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for pre-content permission events with file access range, move fsnotify_file_perm() hook out of security_file_permission() and into the callers. Callers that have the access range information call the new hook fsnotify_file_area_perm() with the access range. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231212094440.250945-6-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| * | fsnotify: split fsnotify_perm() into two hooksAmir Goldstein2023-12-121-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We would like to make changes to the fsnotify access permission hook - add file range arguments and add the pre modify event. In preparation for these changes, split the fsnotify_perm() hook into fsnotify_open_perm() and fsnotify_file_perm(). This is needed for fanotify "pre content" events. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231212094440.250945-4-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
* | apparmor: Fix move_mount mediation by detecting if source is detachedJohn Johansen2024-01-032-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prevent move_mount from applying the attach_disconnected flag to move_mount(). This prevents detached mounts from appearing as / when applying mount mediation, which is not only incorrect but could result in bad policy being generated. Basic mount rules like allow mount, allow mount options=(move) -> /target/, will allow detached mounts, allowing older policy to continue to function. New policy gains the ability to specify `detached` as a source option allow mount detached -> /target/, In addition make sure support of move_mount is advertised as a feature to userspace so that applications that generate policy can respond to the addition. Note: this fixes mediation of move_mount when a detached mount is used, it does not fix the broader regression of apparmor mediation of mounts under the new mount api. Link: https://lore.kernel.org/all/68c166b8-5b4d-4612-8042-1dee3334385b@leemhuis.info/T/#mb35fdde37f999f08f0b02d58dc1bf4e6b65b8da2 Fixes: 157a3537d6bc ("apparmor: Fix regression in mount mediation") Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
* | keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiryDavid Howells2023-12-214-22/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a key has an expiration time, then when that time passes, the key is left around for a certain amount of time before being collected (5 mins by default) so that EKEYEXPIRED can be returned instead of ENOKEY. This is a problem for DNS keys because we want to redo the DNS lookup immediately at that point. Fix this by allowing key types to be marked such that keys of that type don't have this extra period, but are reclaimed as soon as they expire and turn this on for dns_resolver-type keys. To make this easier to handle, key->expiry is changed to be permanent if TIME64_MAX rather than 0. Furthermore, give such new-style negative DNS results a 1s default expiry if no other expiry time is set rather than allowing it to stick around indefinitely. This shouldn't be zero as ls will follow a failing stat call immediately with a second with AT_SYMLINK_NOFOLLOW added. Fixes: 1a4240f4764a ("DNS: Separate out CIFS DNS Resolver code") Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: Wang Lei <wang840925@gmail.com> cc: Jeff Layton <jlayton@redhat.com> cc: Steve French <smfrench@gmail.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jarkko Sakkinen <jarkko@kernel.org> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: keyrings@vger.kernel.org cc: netdev@vger.kernel.org
* | cred: get rid of CONFIG_DEBUG_CREDENTIALSJens Axboe2023-12-151-6/+0
|/ | | | | | | | | This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'apparmor-pr-2023-11-03' of ↵Linus Torvalds2023-11-0332-848/+1336
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor updates from John Johansen: "This adds initial support for mediating io_uring and userns creation. Adds a new restriction that tightens the use of change_profile, and a couple of optimizations to reduce performance bottle necks that have been found when retrieving the current task's secid and allocating work buffers. The majority of the patch set continues cleaning up and simplifying the code (fixing comments, removing now dead functions, and macros etc). Finally there are 4 bug fixes, with the regression fix having had a couple months of testing. Features: - optimize retrieving current task secid - add base io_uring mediation - add base userns mediation - improve buffer allocation - allow restricting unprivilege change_profile Cleanups: - Fix kernel doc comments - remove unused declarations - remove unused functions - remove unneeded #ifdef - remove unused macros - mark fns static - cleanup fn with unused return values - cleanup audit data - pass cred through to audit data - refcount the pdb instead of using duplicates - make SK_CTX macro an inline fn - some comment cleanups Bug fixes: - fix regression in mount mediation - fix invalid refenece - use passed in gfp flags - advertise avaiability of extended perms and disconnected.path" * tag 'apparmor-pr-2023-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (39 commits) apparmor: Fix some kernel-doc comments apparmor: Fix one kernel-doc comment apparmor: Fix some kernel-doc comments apparmor: mark new functions static apparmor: Fix regression in mount mediation apparmor: cache buffers on percpu list if there is lock contention apparmor: add io_uring mediation apparmor: add user namespace creation mediation apparmor: allow restricting unprivileged change_profile apparmor: advertise disconnected.path is available apparmor: refcount the pdb apparmor: provide separate audit messages for file and policy checks apparmor: pass cred through to audit info. apparmor: rename audit_data->label to audit_data->subj_label apparmor: combine common_audit_data and apparmor_audit_data apparmor: rename SK_CTX() to aa_sock and make it an inline fn apparmor: Optimize retrieving current task secid apparmor: remove unused functions in policy_ns.c/.h apparmor: remove unneeded #ifdef in decompress_zstd() apparmor: fix invalid reference on profile->disconnected ...
| * apparmor: Fix some kernel-doc commentsYang Li2023-10-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Fix some kernel-doc comments to silence the warnings: security/apparmor/policy.c:117: warning: Function parameter or member 'kref' not described in 'aa_pdb_free_kref' security/apparmor/policy.c:117: warning: Excess function parameter 'kr' description in 'aa_pdb_free_kref' security/apparmor/policy.c:882: warning: Function parameter or member 'subj_cred' not described in 'aa_may_manage_policy' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7037 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: Fix one kernel-doc commentYang Li2023-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | Fix one kernel-doc comment to silence the warnings: security/apparmor/domain.c:46: warning: Function parameter or member 'to_cred' not described in 'may_change_ptraced_domain' security/apparmor/domain.c:46: warning: Excess function parameter 'cred' description in 'may_change_ptraced_domain' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7036 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: Fix some kernel-doc commentsYang Li2023-10-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Fix some kernel-doc comments to silence the warnings: security/apparmor/capability.c:66: warning: Function parameter or member 'ad' not described in 'audit_caps' security/apparmor/capability.c:66: warning: Excess function parameter 'as' description in 'audit_caps' security/apparmor/capability.c:154: warning: Function parameter or member 'subj_cred' not described in 'aa_capable' security/apparmor/capability.c:154: warning: Excess function parameter 'subj_cread' description in 'aa_capable' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7035 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: mark new functions staticArnd Bergmann2023-10-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Two new functions were introduced as global functions when they are only called from inside the file that defines them and should have been static: security/apparmor/lsm.c:658:5: error: no previous prototype for 'apparmor_uring_override_creds' [-Werror=missing-prototypes] security/apparmor/lsm.c:682:5: error: no previous prototype for 'apparmor_uring_sqpoll' [-Werror=missing-prototypes] Fixes: c4371d90633b7 ("apparmor: add io_uring mediation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: Fix regression in mount mediationJohn Johansen2023-10-183-22/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around") introduced a new move_mount(2) system call and a corresponding new LSM security_move_mount hook but did not implement this hook for any existing LSM. This creates a regression for AppArmor mediation of mount. This patch provides a base mapping of the move_mount syscall to the existing mount mediation. In the future we may introduce additional mediations around the new mount calls. Fixes: 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around") CC: stable@vger.kernel.org Reported-by: Andreas Steinmetz <anstein99@googlemail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: cache buffers on percpu list if there is lock contentionJohn Johansen2023-10-181-5/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit df323337e507 ("apparmor: Use a memory pool instead per-CPU caches") changed buffer allocation to use a memory pool, however on a heavily loaded machine there can be lock contention on the global buffers lock. Add a percpu list to cache buffers on when lock contention is encountered. When allocating buffers attempt to use cached buffers first, before taking the global buffers lock. When freeing buffers try to put them back to the global list but if contention is encountered, put the buffer on the percpu list. The length of time a buffer is held on the percpu list is dynamically adjusted based on lock contention. The amount of hold time is increased and decreased linearly. v5: - simplify base patch by removing: improvements can be added later - MAX_LOCAL and must lock - contention scaling. v4: - fix percpu ->count buffer count which had been spliced across a debug patch. - introduce define for MAX_LOCAL_COUNT - rework count check and locking around it. - update commit message to reference commit that introduced the memory. v3: - limit number of buffers that can be pushed onto the percpu list. This avoids a problem on some kernels where one percpu list can inherit buffers from another cpu after a reschedule, causing more kernel memory to used than is necessary. Under normal conditions this should eventually return to normal but under pathelogical conditions the extra memory consumption may have been unbouanded v2: - dynamically adjust buffer hold time on percpu list based on lock contention. v1: - cache buffers on percpu list on lock contention Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: add io_uring mediationGeorgia Garcia2023-10-186-2/+131
| | | | | | | | | | | | | | | | For now, the io_uring mediation is limited to sqpoll and override_creds. Signed-off-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: add user namespace creation mediationJohn Johansen2023-10-187-2/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | Unprivileged user namespace creation is often used as a first step in privilege escalation attacks. Instead of disabling it at the sysrq level, which blocks its legitimate use as for setting up a sandbox, allow control on a per domain basis. This allows an admin to quickly lock down a system while also still allowing legitimate use. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: allow restricting unprivileged change_profileJohn Johansen2023-10-185-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unprivileged unconfined can use change_profile to alter the confinement set by the mac admin. Allow restricting unprivileged unconfined by still allowing change_profile but stacking the change against unconfined. This allows unconfined to still apply system policy but allows the task to enter the new confinement. If unprivileged unconfined is required a sysctl is provided to switch to the previous behavior. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: advertise disconnected.path is availableJohn Johansen2023-10-181-0/+1
| | | | | | | | | | | | | | | | | | While disconnected.path has been available for a while it was never properly advertised as a feature. Fix this so that userspace doesn't need special casing to handle it. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: refcount the pdbJohn Johansen2023-10-1815-210/+260
| | | | | | | | | | | | | | | | | | | | | | | | | | With the move to permission tables the dfa is no longer a stand alone entity when used, needing a minimum of a permission table. However it still could be shared among different pdbs each using a different permission table. Instead of duping the permission table when sharing a pdb, add a refcount to the pdb so it can be easily shared. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: provide separate audit messages for file and policy checksJohn Johansen2023-10-181-5/+11
| | | | | | | | | | | | | | | | Improve policy load failure messages by identifying which dfa the verification check failed in. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: pass cred through to audit info.John Johansen2023-10-1820-211/+388
| | | | | | | | | | | | | | | | | | The cred is needed to properly audit some messages, and will be needed in the future for uid conditional mediation. So pass it through to where the apparmor_audit_data struct gets defined. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: rename audit_data->label to audit_data->subj_labelJohn Johansen2023-10-1810-18/+17
| | | | | | | | | | | | | | | | | | rename audit_data's label field to subj_label to better reflect its use. Also at the same time drop unneeded assignments to ->subj_label as the later call to aa_check_perms will do the assignment if needed. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
| * apparmor: combine common_audit_data and apparmor_audit_dataJohn Johansen2023-10-1815-245/+257
| | | | | | | | | | | | | | | | | | Everywhere where common_audit_data is used apparmor audit_data is also used. We can simplify the code and drop the use of the aad macro everywhere by combining the two structures. Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>