summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* mm/kmemleak: increase the max mem pool to 1MQian Cai2019-09-241-1/+1
| | | | | | | | | | | | | | | There are some machines with slow disk and fast CPUs. When they are under memory pressure, it could take a long time to swap before the OOM kicks in to free up some memory. As the results, it needs a large mem pool for kmemleak or suffering from higher chance of a kmemleak metadata allocation failure. 524288 proves to be the good number for all architectures here. Increase the upper bound to 1M to leave some room for the future. Link: http://lkml.kernel.org/r/1565807572-26041-1-git-send-email-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/kmemleak.c: record the current memory pool sizeQian Cai2019-09-241-1/+2
| | | | | | | | | | | | | The only way to obtain the current memory pool size for a running kernel is to check the kernel config file which is inconvenient. Record it in the kernel messages. [akpm@linux-foundation.org: s/memory pool size/memory pool/available/, per Catalin] Link: http://lkml.kernel.org/r/1565809631-28933-1-git-send-email-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: kmemleak: use the memory pool for early allocationsCatalin Marinas2019-09-243-245/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently kmemleak uses a static early_log buffer to trace all memory allocation/freeing before the slab allocator is initialised. Such early log is replayed during kmemleak_init() to properly initialise the kmemleak metadata for objects allocated up that point. With a memory pool that does not rely on the slab allocator, it is possible to skip this early log entirely. In order to remove the early logging, consider kmemleak_enabled == 1 by default while the kmem_cache availability is checked directly on the object_cache and scan_area_cache variables. The RCU callback is only invoked after object_cache has been initialised as we wouldn't have any concurrent list traversal before this. In order to reduce the number of callbacks before kmemleak is fully initialised, move the kmemleak_init() call to mm_init(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: remove WARN_ON(), per Catalin] Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: kmemleak: simple memory allocation pool for kmemleak objectsCatalin Marinas2019-09-241-2/+52
| | | | | | | | | | | | | | | | | | | | Add a memory pool for struct kmemleak_object in case the normal kmem_cache_alloc() fails under the gfp constraints passed by the caller. The mem_pool[] array size is currently fixed at 16000. We are not using the existing mempool kernel API since this requires the slab allocator to be available (for pool->elements allocation). A subsequent kmemleak patch will replace the static early log buffer with the pool allocation introduced here and this functionality is required to be available before the slab was initialised. Link: http://lkml.kernel.org/r/20190812160642.52134-3-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: kmemleak: make the tool tolerant to struct scan_area allocation failuresCatalin Marinas2019-09-241-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: kmemleak: Use a memory pool for kmemleak object allocations", v3. Following the discussions on v2 of this patch(set) [1], this series takes slightly different approach: - it implements its own simple memory pool that does not rely on the slab allocator - drops the early log buffer logic entirely since it can now allocate metadata from the memory pool directly before kmemleak is fully initialised - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE - moves the kmemleak_init() call earlier (mm_init()) - to avoid a separate memory pool for struct scan_area, it makes the tool robust when such allocations fail as scan areas are rather an optimisation [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com This patch (of 3): Object scan areas are an optimisation aimed to decrease the false positives and slightly improve the scanning time of large objects known to only have a few specific pointers. If a struct scan_area fails to allocate, kmemleak can still function normally by scanning the full object. Introduce an OBJECT_FULL_SCAN flag and mark objects as such when scan_area allocation fails. Link: http://lkml.kernel.org/r/20190812160642.52134-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16KNicolas Boichat2019-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current default value (400) is too low on many systems (e.g. some ARM64 platform takes up 1000+ entries). syzbot uses 16000 as default value, and has proved to be enough on beefy configurations, so let's pick that value. This consumes more RAM on boot (each entry is 160 bytes, so in total ~2.5MB of RAM), but the memory would later be freed (early_log is __initdata). Link: http://lkml.kernel.org/r/20190730154027.101525-1-drinkcat@chromium.org Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Kees Cook <keescook@chromium.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/slub.c: fix -Wunused-function compiler warningsQian Cai2019-09-241-0/+2
| | | | | | | | | | | | | | | tid_to_cpu() and tid_to_event() are only used in note_cmpxchg_failure() when SLUB_DEBUG_CMPXCHG=y, so when SLUB_DEBUG_CMPXCHG=n by default, Clang will complain that those unused functions. Link: http://lkml.kernel.org/r/1568752232-5094-1-git-send-email-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm, slab: move memcg_cache_params structure to mm/slab.hWaiman Long2019-09-242-62/+63
| | | | | | | | | | | | | | | | | | | | | | | | | The memcg_cache_params structure is only embedded into the kmem_cache of slab and slub allocators as defined in slab_def.h and slub_def.h and used internally by mm code. There is no needed to expose it in a public header. So move it from include/linux/slab.h to mm/slab.h. It is just a refactoring patch with no code change. In fact both the slub_def.h and slab_def.h should be moved into the mm directory as well, but that will probably cause many merge conflicts. Link: http://lkml.kernel.org/r/20190718180827.18758-1-longman@redhat.com Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm, slab: extend slab/shrink to shrink all memcg cachesWaiman Long2019-09-244-5/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink file to shrink the slab by flushing out all the per-cpu slabs and free slabs in partial lists. This can be useful to squeeze out a bit more memory under extreme condition as well as making the active object counts in /proc/slabinfo more accurate. This usually applies only to the root caches, as the SLUB_MEMCG_SYSFS_ON option is usually not enabled and "slub_memcg_sysfs=1" not set. Even if memcg sysfs is turned on, it is too cumbersome and impractical to manage all those per-memcg sysfs files in a real production system. So there is no practical way to shrink memcg caches. Fix this by enabling a proper write to the shrink sysfs file of the root cache to scan all the available memcg caches and shrink them as well. For a non-root memcg cache (when SLUB_MEMCG_SYSFS_ON or slub_memcg_sysfs is on), only that cache will be shrunk when written. On a 2-socket 64-core 256-thread arm64 system with 64k page after a parallel kernel build, the the amount of memory occupied by slabs before shrinking slabs were: # grep task_struct /proc/slabinfo task_struct 53137 53192 4288 61 4 : tunables 0 0 0 : slabdata 872 872 0 # grep "^S[lRU]" /proc/meminfo Slab: 3936832 kB SReclaimable: 399104 kB SUnreclaim: 3537728 kB After shrinking slabs (by echoing "1" to all shrink files): # grep "^S[lRU]" /proc/meminfo Slab: 1356288 kB SReclaimable: 263296 kB SUnreclaim: 1092992 kB # grep task_struct /proc/slabinfo task_struct 2764 6832 4288 61 4 : tunables 0 0 0 : slabdata 112 112 0 Link: http://lkml.kernel.org/r/20190723151445.7385-1-longman@redhat.com Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: fix spelling mistake "ambigous" -> "ambiguous"Colin Ian King2019-09-241-1/+1
| | | | | | | | | | There is a spelling mistake in a mlog_bug_on_msg message. Fix it. Link: http://lkml.kernel.org/r/831bdff4-064e-038b-f45d-c4d265cbff1e@linux.alibaba.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: checkpoint appending truncate log transaction before flushingChangwei Ge2019-09-241-0/+15
| | | | | | | | | | | | | | | | | | Appending truncate log(TA) and and flushing truncate log(TF) are two separated transactions. They can be both committed but not checkpointed. If crash occurs then, both transaction will be replayed with several already released to global bitmap clusters. Then truncate log will be replayed resulting in cluster double free. To reproduce this issue, just crash the host while punching hole to files. Signed-off-by: Changwei Ge <gechangwei@live.cn> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: wait for recovering done after direct unlock requestChangwei Ge2019-09-241-4/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a scenario causing ocfs2 umount hang when multiple hosts are rebooting at the same time. NODE1 NODE2 NODE3 send unlock requset to NODE2 dies become recovery master recover NODE2 find NODE2 dead mark resource RECOVERING directly remove lock from grant list calculate usage but RECOVERING marked **miss the window of purging clear RECOVERING To reproduce this issue, crash a host and then umount ocfs2 from another node. To solve this, just let unlock progress wait for recovery done. Link: http://lkml.kernel.org/r/1550124866-20367-1-git-send-email-gechangwei@live.cn Signed-off-by: Changwei Ge <gechangwei@live.cn> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: delete unnecessary checks before brelse()Markus Elfring2019-09-242-7/+3
| | | | | | | | | | | | | | | | | | | brelse() tests whether its argument is NULL and then returns immediately. Thus the tests around the shown calls are not needed. This issue was detected by using the Coccinelle software. Link: http://lkml.kernel.org/r/55cde320-394b-f985-56ce-1a2abea782aa@web.de Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/ocfs2/dir.c: remove set but not used variableszhengbin2019-09-241-2/+1
| | | | | | | | | | | | | | | | | | | | | Fixes gcc '-Wunused-but-set-variable' warning: fs/ocfs2/dir.c: In function ocfs2_dx_dir_transfer_leaf: fs/ocfs2/dir.c:3653:42: warning: variable new_list set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/1566522588-63786-4-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Changwei Ge <chge@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/ocfs2/file.c: remove set but not used variableszhengbin2019-09-241-3/+0
| | | | | | | | | | | | | | | | | | | | | Fixes gcc '-Wunused-but-set-variable' warning: fs/ocfs2/file.c: In function ocfs2_prepare_inode_for_write: fs/ocfs2/file.c:2143:9: warning: variable end set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/1566522588-63786-3-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Changwei Ge <chge@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/ocfs2/namei.c: remove set but not used variableszhengbin2019-09-241-2/+0
| | | | | | | | | | | | | | | | | | | | | Fixes gcc '-Wunused-but-set-variable' warning: fs/ocfs2/namei.c: In function ocfs2_create_inode_in_orphan: fs/ocfs2/namei.c:2503:23: warning: variable di set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/1566522588-63786-2-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Changwei Ge <chge@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: remove unused ocfs2_orphan_scan_exit() declarationGuozhonghua2019-09-241-2/+1
| | | | | | | | | | | | | | | | | | ocfs2_orphan_scan_exit() is declared but not implemented. Also perform a minor cleanup in ocfs2_link_credits() Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4014FC208AC@H3CMLB12-EX.srv.huawei-3com.com Signed-off-by: guozhonghua <guozhonghua@h3c.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: remove unused ocfs2_calc_tree_trunc_credits()Guozhonghua2019-09-241-28/+0
| | | | | | | | | | | | | | | | | ocfs2_calc_tree_trunc_credits() is not called anywhere. Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4014FC2050F@H3CMLB12-EX.srv.huawei-3com.com Signed-off-by: guozhonghua <guozhonghua@h3c.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: further debugfs cleanupsGreg Kroah-Hartman2019-09-249-180/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to check return value of debugfs_create functions, but the last sweep through ocfs missed a number of places where this was happening. There is also no need to save the individual dentries for the debugfs files, as everything is can just be removed at once when the directory is removed. By getting rid of the file dentries for the debugfs entries, a bit of local memory can be saved as well. [colin.king@canonical.com: ensure ret is set to zero before returning] Link: http://lkml.kernel.org/r/20190807121929.28918-1-colin.king@canonical.com Link: http://lkml.kernel.org/r/20190731132119.GA12603@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jia Guo <guojia12@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* jbd2: remove jbd2_journal_inode_add_[write|wait]Joseph Qi2019-09-243-16/+0
| | | | | | | | | | | | | | | | | | | Since ext4/ocfs2 are using jbd2_inode dirty range scoping APIs now, jbd2_journal_inode_add_[write|wait] are not used any more, remove them. Link: http://lkml.kernel.org/r/1562977611-8412-2-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Ross Zwisler <zwisler@google.com> Acked-by: Changwei Ge <chge@linux.alibaba.com> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: use jbd2_inode dirty range scopingJoseph Qi2019-09-244-11/+28
| | | | | | | | | | | | | | | | | | | | | | | | 6ba0e7dc64a5 ("jbd2: introduce jbd2_inode dirty range scoping") allow us scoping each of the inode dirty ranges associated with a given transaction, and ext4 already does this way. Now let's also use the newly introduced jbd2_inode dirty range scoping to prevent us from waiting forever when trying to complete a journal transaction in ocfs2. Link: http://lkml.kernel.org/r/1562977611-8412-1-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Ross Zwisler <zwisler@google.com> Reviewed-by: Changwei Ge <chge@linux.alibaba.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kbuild: clean compressed initramfs imageGreg Thelen2019-09-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig") "make clean" leaves behind compressed initramfs images. Example: $ make defconfig $ sed -i 's|CONFIG_INITRAMFS_SOURCE=""|CONFIG_INITRAMFS_SOURCE="/tmp/ir.cpio"|' .config $ make olddefconfig $ make -s $ make -s clean $ git clean -ndxf | grep initramfs Would remove usr/initramfs_data.cpio.gz clean rules do not have CONFIG_* context so they do not know which compression format was used. Thus they don't know which files to delete. Tell clean to delete all possible compression formats. Once patched usr/initramfs_data.cpio.gz and friends are deleted by "make clean". Link: http://lkml.kernel.org/r/20190722063251.55541-1-gthelen@google.com Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig") Signed-off-by: Greg Thelen <gthelen@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* z3fold: fix retry mechanism in page reclaimVitaly Wool2019-09-241-15/+34
| | | | | | | | | | | | | | | | | | | | | | | z3fold_page_reclaim()'s retry mechanism is broken: on a second iteration it will have zhdr from the first one so that zhdr is no longer in line with struct page. That leads to crashes when the system is stressed. Fix that by moving zhdr assignment up. While at it, protect against using already freed handles by using own local slots structure in z3fold_page_reclaim(). Link: http://lkml.kernel.org/r/20190908162919.830388dc7404d1e2c80f4095@gmail.com Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Reported-by: Markus Linnala <markus.linnala@gmail.com> Reported-by: Chris Murphy <bugzilla@colorremedies.com> Reported-by: Agustin Dall'Alba <agustin@dallalba.com.ar> Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> Cc: Shakeel Butt <shakeelb@google.com> Cc: Henry Burns <henrywolfeburns@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: add dummy can_do_mlock() helperArnd Bergmann2019-09-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | On kernels without CONFIG_MMU, we get a link error for the siw driver: drivers/infiniband/sw/siw/siw_mem.o: In function `siw_umem_get': siw_mem.c:(.text+0x4c8): undefined reference to `can_do_mlock' This is probably not the only driver that needs the function and could otherwise build correctly without CONFIG_MMU, so add a dummy variant that always returns false. Link: http://lkml.kernel.org/r/20190909204201.931830-1-arnd@arndb.de Fixes: 2251334dcac9 ("rdma/siw: application buffer management") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Bernard Metzler <bmt@zurich.ibm.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Revert "mm/z3fold.c: fix race between migration and destruction"Vitaly Wool2019-09-241-90/+0
| | | | | | | | | | | | | | | | | | | | With the original commit applied, z3fold_zpool_destroy() may get blocked on wait_event() for indefinite time. Revert this commit for the time being to get rid of this problem since the issue the original commit addresses is less severe. Link: http://lkml.kernel.org/r/20190910123142.7a9c8d2de4d0acbc0977c602@gmail.com Fixes: d776aaa9895eb6eb77 ("mm/z3fold.c: fix race between migration and destruction") Reported-by: Agustín Dall'Alba <agustin@dallalba.com.ar> Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Jonathan Adams <jwadams@google.com> Cc: Henry Burns <henrywolfeburns@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fat: work around race with userspace's read via blockdev while mountingOGAWA Hirofumi2019-09-242-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If userspace reads the buffer via blockdev while mounting, sb_getblk()+modify can race with buffer read via blockdev. For example, FS userspace bh = sb_getblk() modify bh->b_data read ll_rw_block(bh) fill bh->b_data by on-disk data /* lost modified data by FS */ set_buffer_uptodate(bh) set_buffer_uptodate(bh) Userspace should not use the blockdev while mounting though, the udev seems to be already doing this. Although I think the udev should try to avoid this, workaround the race by small overhead. Link: http://lkml.kernel.org/r/87pnk7l3sw.fsf_-_@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'for-v5.4' of ↵Linus Torvalds2019-09-2215-120/+237
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply and reset updates from Sebastian Reichel: "Core: - Ensure HWMON devices are registered with valid names - Fix device wakeup code Drivers: - bq25890_charger: Add BQ25895 support - axp288_fuel_gauge: Add Minix Neo Z83-4 to blacklist - sc27xx: improve battery calibration - misc small fixes all over drivers" * tag 'for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (24 commits) power: supply: cpcap-charger: Enable vbus boost voltage power: supply: sc27xx: Add POWER_SUPPLY_PROP_CALIBRATE attribute power: supply: sc27xx: Optimize the battery capacity calibration power: supply: sc27xx: Make sure the alarm capacity is larger than 0 power: supply: sc27xx: Fix the the accuracy issue of coulomb calculation power: supply: sc27xx: Fix conditon to enable the FGU interrupt power: supply: sc27xx: Add POWER_SUPPLY_PROP_ENERGY_FULL_DESIGN attribute power: supply: max77650: add MODULE_ALIAS() power: supply: isp1704: remove redundant assignment to variable ret power: supply: bq25890_charger: Add the BQ25895 part power: supply: sc27xx: Replace devm_add_action() followed by failure action with devm_add_action_or_reset() power: supply: sc27xx: Introduce local variable 'struct device *dev' power: reset: reboot-mode: Fix author email format power: supply: ab8500: remove set but not used variables 'vbup33_vrtcn' and 'bup_vch_range' power: supply: max17042_battery: Fix a typo in function names power: reset: gpio-restart: Fix typo when gpio reset is not found power: supply: Init device wakeup after device_add() power: supply: ab8500_charger: Mark expected switch fall-through power: supply: sbs-battery: only return health when battery present MAINTAINERS: N900: Remove isp1704_charger.h record ...
| * power: supply: cpcap-charger: Enable vbus boost voltageTony Lindgren2019-09-022-6/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are currently not enabling VBUS boost for cpcap when in host mode. This means the VBUS is fed at the battery voltage level, which can cause flakeyness enumerating devices. Looks like the boost control for VBUS is CPCAP_BIT_VBUS_SWITCH that we must enable in the charger for nice 4.92 V VBUS output. And looks like we must not use the STBY pin enabling but must instead use manual VBUS control in phy-cpcap-usb. We want to do this in cpcap_charger_vbus_work() and also set a flag for feeding_vbus to avoid races between USB detection and charger detection, and disable charging if feeding_vbus is set. Cc: Jacopo Mondi <jacopo@jmondi.org> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Marcel Partap <mpartap@gmx.net> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Michael Scott <hashcode0f@gmail.com> Cc: NeKit <nekit1000@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Add POWER_SUPPLY_PROP_CALIBRATE attributeYuanjiang Yu2019-09-021-8/+19
| | | | | | | | | | | | | | | | | | Add the 'POWER_SUPPLY_PROP_CALIBRATE' attribute to allow chareger manager to calibrate the battery capacity. Signed-off-by: Yuanjiang Yu <yuanjiang.yu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Optimize the battery capacity calibrationYuanjiang Yu2019-09-021-35/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch factors out the capacity calibration into one single function to calibrate the battery capacity, and adding more abnormal cases to calibrate the capacity when the OCV value is not matchable with current capacity. Moreover we also allow to calibrate the capacity when charger magager tries to get current capacity to make sure we give a correct capacity for userspace. Signed-off-by: Yuanjiang Yu <yuanjiang.yu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Make sure the alarm capacity is larger than 0Yuanjiang Yu2019-09-021-0/+2
| | | | | | | | | | | | | | | | | | We must make sure the alarm capacity is larger than 0, to help to calibrate the low battery capacity. Signed-off-by: Yuanjiang Yu <yuanjiang.yu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Fix the the accuracy issue of coulomb calculationYuanjiang Yu2019-09-021-7/+4
| | | | | | | | | | | | | | | | | | | | | | The Spreadtrum fuel gauge will multiply by 2 for counting the coulomb counter to improve the accuracy, which means the value saved in fuel gauge is: coulomb counter * 2 * 1000ma_adc. Thus fix the conversion formular to improve the accuracy of calculating the battery capacity. Signed-off-by: Yuanjiang Yu <yuanjiang.yu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Fix conditon to enable the FGU interruptYuanjiang Yu2019-09-021-1/+2
| | | | | | | | | | | | | | | | | | We should allow to enable FGU interrupt to adjust the battery capacity, when charging status is POWER_SUPPLY_STATUS_DISCHARGING. Signed-off-by: Yuanjiang Yu <yuanjiang.yu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Add POWER_SUPPLY_PROP_ENERGY_FULL_DESIGN attributeYuanjiang Yu2019-09-021-0/+5
| | | | | | | | | | | | | | | | | | Add POWER_SUPPLY_PROP_ENERGY_FULL_DESIGN attribute to provide the battery's design capacity for charger manager to calculate the charging counter. Signed-off-by: Yuanjiang Yu <yuanjiang.yu@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: max77650: add MODULE_ALIAS()Bartosz Golaszewski2019-09-021-0/+1
| | | | | | | | | | | | | | | | Define a MODULE_ALIAS() in the charger sub-driver for max77650 so that the appropriate module gets loaded together with the core mfd driver. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: isp1704: remove redundant assignment to variable retColin Ian King2019-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | The variable ret is being assigned with a value that is never read and it is being updated later with a new value. The assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: bq25890_charger: Add the BQ25895 partAngus Ainslie (Purism)2019-09-021-4/+8
| | | | | | | | | | | | | | | | The BQ25895 is almost identical to the BQ25890. Signed-off-by: Angus Ainslie (Purism) <angus@akkea.ca> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Replace devm_add_action() followed by failure action ↵Fuqian Huang2019-09-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | with devm_add_action_or_reset() devm_add_action_or_reset() is introduced as a helper function which internally calls devm_add_action(). If devm_add_action() fails then it will execute the action mentioned and return the error code. This reduce source code size (avoid writing the action twice) and reduce the likelyhood of bugs. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sc27xx: Introduce local variable 'struct device *dev'Fuqian Huang2019-09-021-23/+24
| | | | | | | | | | | | | | | | | | Introduce local variable 'struct device *dev' and use it instead of dereferencing it repeatly. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: reset: reboot-mode: Fix author email formatMatwey V. Kornilov2019-09-021-1/+1
| | | | | | | | | | | | | | Closing angle bracket was missing. Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: ab8500: remove set but not used variables 'vbup33_vrtcn' and ↵YueHaibing2019-09-021-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'bup_vch_range' Fixes gcc '-Wunused-but-set-variable' warnings: drivers/power/supply/ab8500_charger.c: In function ab8500_charger_init_hw_registers: drivers/power/supply/ab8500_charger.c:3013:24: warning: variable vbup33_vrtcn set but not used [-Wunused-but-set-variable] drivers/power/supply/ab8500_charger.c:3013:5: warning: variable bup_vch_range set but not used [-Wunused-but-set-variable] Fixes: 4c4268dc97c4 ("power: supply: ab8500: Drop AB8540/9540 support") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: max17042_battery: Fix a typo in function namesChristophe JAILLET2019-09-021-4/+4
| | | | | | | | | | | | | | | | | | It is likely that 'max10742_[un]lock_model()' functions should be 'max17042_[un]lock_model()' (0 and 7 switched in 10742) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: reset: gpio-restart: Fix typo when gpio reset is not foundMichal Simek2019-09-021-1/+1
| | | | | | | | | | | | | | | | Trivial patch which just corrects error message. Fixes: 371bb20d6927 ("power: Add simple gpio-restart driver") Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: Init device wakeup after device_add()Stephen Boyd2019-09-021-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We may want to use the device pointer in device_init_wakeup() with functions that expect the device to already be added with device_add(). For example, if we were to link the device initializing wakeup to something in sysfs such as a class for wakeups we'll run into an error. It looks like this code was written with the assumption that the device would be added before initializing wakeup due to the order of operations in power_supply_unregister(). Let's change the order of operations so we don't run into problems here. Fixes: 948dcf966228 ("power_supply: Prevent suspend until power supply events are processed") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tri Vo <trong@android.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Ravi Chandra Sadineni <ravisadineni@chromium.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: ab8500_charger: Mark expected switch fall-throughGustavo A. R. Silva2019-09-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | Mark switch cases where we are expecting to fall through. This patch fixes the following warning (Building: allmodconfig arm): drivers/power/supply/ab8500_charger.c:738:6: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sbs-battery: only return health when battery presentMichael Nosthoff2019-09-021-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when the battery is set to sbs-mode and no gpio detection is enabled "health" is always returning a value even when the battery is not present. All other fields return "not present". This leads to a scenario where the driver is constantly switching between "present" and "not present" state. This generates a lot of constant traffic on the i2c. This commit changes the response of "health" to an error when the battery is not responding leading to a consistent "not present" state. Fixes: 76b16f4cdfb8 ("power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats") Cc: <stable@vger.kernel.org> Signed-off-by: Michael Nosthoff <committed@heine.so> Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * MAINTAINERS: N900: Remove isp1704_charger.h recordDenis Efremov2019-09-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | Update MAINTAINERS to reflect that isp1704_charger.h file was removed. Cc: Pavel Machek <pavel@ucw.cz> Cc: linux-pm@vger.kernel.org Fixes: f5d782d46aa5 ("power: supply: isp1704: switch to gpiod API") Signed-off-by: Denis Efremov <efremov@linux.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: sbs-battery: use correct flags fieldMichael Nosthoff2019-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | the type flag is stored in the chip->flags field not in the client->flags field. This currently leads to never using the ti specific health function as client->flags doesn't use that bit. So it's always falling back to the general one. Fixes: 76b16f4cdfb8 ("power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats") Cc: <stable@vger.kernel.org> Signed-off-by: Michael Nosthoff <committed@heine.so> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: axp288_fuel_gauge: Add Minix Neo Z83-4 to the blacklistHans de Goede2019-09-011-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | The Minix Neo Z83-4 is another mini PC using the AXP288 PMIC where the EFI code does not disable the charger part of the PMIC causing us to report battery readings (of always 100%) to userspace even though there is no battery in this wall-outlet powered device. Add it to the blacklist to avoid the bogus battery status reporting. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
| * power: supply: axp288_fuel_gauge: Sort the DMI blacklist alphabeticallyHans de Goede2019-09-011-6/+7
| | | | | | | | | | | | | | | | The blacklist is getting big enough that it is good to have some sort of fixed order for it, sort it alphabetically. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>