Diffstat (limited to 'Documentation/sysctl')
-rw-r--r--  Documentation/sysctl/fs.txt      17
-rw-r--r--  Documentation/sysctl/kernel.txt  60
-rw-r--r--  Documentation/sysctl/vm.txt      45
3 files changed, 100 insertions, 22 deletions
diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index 1458448436cc..62682500878a 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -96,13 +96,16 @@ handles that the Linux kernel will allocate. When you get lots
 of error messages about running out of file handles, you might
 want to increase this limit.
 
-The three values in file-nr denote the number of allocated
-file handles, the number of unused file handles and the maximum
-number of file handles. When the allocated file handles come
-close to the maximum, but the number of unused file handles is
-significantly greater than 0, you've encountered a peak in your
-usage of file handles and you don't need to increase the maximum.
-
+Historically, the three values in file-nr denoted the number of
+allocated file handles, the number of allocated but unused file
+handles, and the maximum number of file handles. Linux 2.6 always
+reports 0 as the number of free file handles -- this is not an
+error, it just means that the number of allocated file handles
+exactly matches the number of used file handles.
+
+Attempts to allocate more file descriptors than file-max are
+reported with printk, look for "VFS: file-max limit <number>
+reached".
 ==============================================================
 
 nr_open:
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 2dbff53369d0..a028b92001ed 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -22,6 +22,7 @@ show up in /proc/sys/kernel:
 - callhome [ S390 only ]
 - auto_msgmni
 - core_pattern
+- core_pipe_limit
 - core_uses_pid
 - ctrl-alt-del
 - dentry-state
@@ -135,6 +136,27 @@ core_pattern is used to specify a core dumpfile pattern name.
 
 ==============================================================
 
+core_pipe_limit:
+
+This sysctl is only applicable when core_pattern is configured to pipe core
+files to user space helper a (when the first character of core_pattern is a '|',
+see above). When collecting cores via a pipe to an application, it is
+occasionally usefull for the collecting application to gather data about the
+crashing process from its /proc/pid directory. In order to do this safely, the
+kernel must wait for the collecting process to exit, so as not to remove the
+crashing processes proc files prematurely. This in turn creates the possibility
+that a misbehaving userspace collecting process can block the reaping of a
+crashed process simply by never exiting. This sysctl defends against that. It
+defines how many concurrent crashing processes may be piped to user space
+applications in parallel. If this value is exceeded, then those crashing
+processes above that value are noted via the kernel log and their cores are
+skipped. 0 is a special value, indicating that unlimited processes may be
+captured in parallel, but that no waiting will take place (i.e. the collecting
+process is not guaranteed access to /proc/<crahing pid>/). This value defaults
+to 0.
+
+==============================================================
+
 core_uses_pid:
 
 The default coredump filename is "core". By setting
@@ -313,31 +335,43 @@ send before ratelimiting kicks in.
 
 ==============================================================
 
+printk_delay:
+
+Delay each printk message in printk_delay milliseconds
+
+Value from 0 - 10000 is allowed.
+
+==============================================================
+
 randomize-va-space:
 
 This option can be used to select the type of process address
 space randomization that is used in the system, for architectures
 that support this feature.
 
-0 - Turn the process address space randomization off by default.
+0 - Turn the process address space randomization off. This is the
+    default for architectures that do not support this feature anyways,
+    and kernels that are booted with the "norandmaps" parameter.
 
 1 - Make the addresses of mmap base, stack and VDSO page randomized.
     This, among other things, implies that shared libraries will be
-    loaded to random addresses. Also for PIE-linked binaries, the location
-    of code start is randomized.
+    loaded to random addresses. Also for PIE-linked binaries, the
+    location of code start is randomized. This is the default if the
+    CONFIG_COMPAT_BRK option is enabled.
 
-    With heap randomization, the situation is a little bit more
-    complicated.
-    There a few legacy applications out there (such as some ancient
+2 - Additionally enable heap randomization. This is the default if
+    CONFIG_COMPAT_BRK is disabled.
+
+    There are a few legacy applications out there (such as some ancient
     versions of libc.so.5 from 1996) that assume that brk area starts
-    just after the end of the code+bss. These applications break when
-    start of the brk area is randomized. There are however no known
+    just after the end of the code+bss. These applications break when
+    start of the brk area is randomized. There are however no known
     non-legacy applications that would be broken this way, so for most
-    systems it is safe to choose full randomization. However there is
-    a CONFIG_COMPAT_BRK option for systems with ancient and/or broken
-    binaries, that makes heap non-randomized, but keeps all other
-    parts of process address space randomized if randomize_va_space
-    sysctl is turned on.
+    systems it is safe to choose full randomization.
+
+    Systems with ancient and/or broken binaries should be configured
+    with CONFIG_COMPAT_BRK enabled, which excludes the heap from process
+    address space randomization.
 
 ==============================================================
 
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index c4de6359d440..a6e360d2055c 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -32,6 +32,8 @@ Currently, these files are in /proc/sys/vm:
 - legacy_va_layout
 - lowmem_reserve_ratio
 - max_map_count
+- memory_failure_early_kill
+- memory_failure_recovery
 - min_free_kbytes
 - min_slab_ratio
 - min_unmapped_ratio
@@ -53,7 +55,6 @@ Currently, these files are in /proc/sys/vm:
 - vfs_cache_pressure
 - zone_reclaim_mode
 
-
 ==============================================================
 
 block_dump
@@ -275,6 +276,44 @@ e.g., up to one or two maps per allocation.
 
 The default value is 65536.
 
+=============================================================
+
+memory_failure_early_kill:
+
+Control how to kill processes when uncorrected memory error (typically
+a 2bit error in a memory module) is detected in the background by hardware
+that cannot be handled by the kernel. In some cases (like the page
+still having a valid copy on disk) the kernel will handle the failure
+transparently without affecting any applications. But if there is
+no other uptodate copy of the data it will kill to prevent any data
+corruptions from propagating.
+
+1: Kill all processes that have the corrupted and not reloadable page mapped
+as soon as the corruption is detected. Note this is not supported
+for a few types of pages, like kernel internally allocated data or
+the swap cache, but works for the majority of user pages.
+
+0: Only unmap the corrupted page from all processes and only kill a process
+who tries to access it.
+
+The kill is done using a catchable SIGBUS with BUS_MCEERR_AO, so processes can
+handle this if they want to.
+
+This is only active on architectures/platforms with advanced machine
+check handling and depends on the hardware capabilities.
+
+Applications can override this setting individually with the PR_MCE_KILL prctl
+
+==============================================================
+
+memory_failure_recovery
+
+Enable memory failure recovery (when supported by the platform)
+
+1: Attempt recovery.
+
+0: Always panic on a memory failure.
+
 ==============================================================
 
 min_free_kbytes:
@@ -585,7 +624,9 @@ caching of directory and inode objects.
 At the default value of vfs_cache_pressure=100 the kernel will attempt to
 reclaim dentries and inodes at a "fair" rate with respect to pagecache and
 swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer
-to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100
+to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
+never reclaim dentries and inodes due to memory pressure and this can easily
+lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
 causes the kernel to prefer to reclaim dentries and inodes.
 
 ==============================================================
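
Not part of the patch: a minimal Python sketch of how the file-nr text added in the fs.txt hunk could be read from user space. It assumes procfs is mounted at /proc and that file-nr holds three whitespace-separated integers, as the hunk describes. The helper names parse_file_nr, handles_in_use and read_sysctl are invented for this illustration and do not come from the kernel tree.

```python
from collections import namedtuple
from pathlib import Path

# Field names follow the wording of the fs.txt hunk; they are illustrative only.
FileNr = namedtuple("FileNr", ["allocated", "unused", "maximum"])


def parse_file_nr(text):
    """Split the three whitespace-separated counters found in file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return FileNr(allocated, unused, maximum)


def handles_in_use(file_nr):
    """Allocated handles minus the allocated-but-unused ones."""
    return file_nr.allocated - file_nr.unused


def read_sysctl(name, root="/proc/sys"):
    """Return the stripped contents of a sysctl file, or None if unreadable."""
    try:
        return (Path(root) / name).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_sysctl("fs/file-nr")
    if raw is not None:
        counters = parse_file_nr(raw)
        print(counters, "in use:", handles_in_use(counters))
    # Entries touched by this patch; any of them may be absent on a given kernel.
    for name in ("kernel/core_pipe_limit", "kernel/printk_delay",
                 "vm/vfs_cache_pressure"):
        print(name, "=", read_sysctl(name))
```

On a kernel that always reports 0 in the second field, handles_in_use simply returns the allocated count, which matches the note in the hunk that this is not an error.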