| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have holes, the holes will automatically get detected and removed
once we remove the next bigger/smaller section. The extra checks can go.
Link: http://lkml.kernel.org/r/20191006085646.5768-9-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With shrink_pgdat_span() out of the way, we now always have a valid zone.
Link: http://lkml.kernel.org/r/20191006085646.5768-8-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's poison the pages similar to when adding new memory in
sparse_add_section(). Also call remove_pfn_range_from_zone() from
memunmap_pages(), so we can poison the memmap from there as well.
Link: http://lkml.kernel.org/r/20191006085646.5768-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6.
This series fixes the access of uninitialized memmaps when shrinking
zones/nodes and when removing memory. Also, it contains all fixes for
crashes that can be triggered when removing certain namespace using
memunmap_pages() - ZONE_DEVICE, reported by Aneesh.
We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be
more involved (we don't have SECTION_IS_ONLINE as an indicator), and
shrinking is only of limited use (set_zone_contiguous() cannot detect the
ZONE_DEVICE as contiguous).
We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of
code to a minimum. Shrinking is especially necessary to keep
zone->contiguous set where possible, especially, on memory unplug of DIMMs
at zone boundaries.
--------------------------------------------------------------------------
Zones are now properly shrunk when offlining memory blocks or when
onlining failed. This allows to properly shrink zones on memory unplug
even if the separate memory blocks of a DIMM were onlined to different
zones or re-onlined to a different zone after offlining.
Example:
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 0
present 0
managed 0
:/# echo "online_movable" > /sys/devices/system/memory/memory41/state
:/# echo "online_movable" > /sys/devices/system/memory/memory43/state
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 98304
present 65536
managed 65536
:/# echo 0 > /sys/devices/system/memory/memory43/online
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 32768
present 32768
managed 32768
:/# echo 0 > /sys/devices/system/memory/memory41/online
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 0
present 0
managed 0
This patch (of 6):
The third argument is actually number of pages. Change the variable name
from size to nr_pages to indicate this better.
No functional change in this patch.
Link: http://lkml.kernel.org/r/20191006085646.5768-3-david@redhat.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's move it to the header and use the shorter variant from
mm/page_alloc.c (the original one will also check
"__highest_present_section_nr + 1", which is not necessary). While at
it, make the section_nr in next_pfn() const.
In next_pfn(), we now return section_nr_to_pfn(-1) instead of -1 once we
exceed __highest_present_section_nr, which doesn't make a difference in
the caller as it is big enough (>= all sane end_pfn).
Link: http://lkml.kernel.org/r/20200113144035.10848-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Jin, Zhi" <zhi.jin@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's update the pfn manually whenever we continue the loop. This makes
the code easier to read but also less error prone (and we can directly fix
one issue).
When overlap_memmap_init() returns true, pfn is updated to
"memblock_region_memory_end_pfn(r)". So it already points at the *next*
pfn to process. Incrementing the pfn another time is wrong, we might
leave one uninitialized. I spotted this by inspecting the code, so I have
no idea if this is relevant in practise (with kernelcore=mirror).
Link: http://lkml.kernel.org/r/20200113144035.10848-2-david@redhat.com
Fixes: a9a9e77fbf27 ("mm: move mirrored memory specific code outside of memmap_init_zone")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "Jin, Zhi" <zhi.jin@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's make sure that all memory holes are actually marked PageReserved(),
that page_to_pfn() produces reliable results, and that these pages are not
detected as "mmap" pages due to the mapcount.
E.g., booting a x86-64 QEMU guest with 4160 MB:
[ 0.010585] Early memory node ranges
[ 0.010586] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.010588] node 0: [mem 0x0000000000100000-0x00000000bffdefff]
[ 0.010589] node 0: [mem 0x0000000100000000-0x0000000143ffffff]
max_pfn is 0x144000.
Before this change:
[root@localhost ~]# ./page-types -r -a 0x144000,
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000800 16384 64 ___________M_______________________________ mmap
total 16384 64
After this change:
[root@localhost ~]# ./page-types -r -a 0x144000,
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000100000000 16384 64 ___________________________r_______________ reserved
total 16384 64
IOW, especially the unavailable physical memory ("memory hole") in the
last section would not get properly marked PageReserved() and is indicated
to be "mmap" memory.
Drop the trace of that function from include/linux/mm.h - nobody else
needs it, and rename it accordingly.
Note: The fake zone/node might not be covered by the zone/node span. This
is not an urgent issue (for now, we had the same node/zone due to the
zeroing). We'll need a clean way to mark memory holes (e.g., using a page
type PageHole() if possible or a fake ZONE_INVALID) and eventually stop
marking these memory holes PageReserved().
Link: http://lkml.kernel.org/r/20191211163201.17179-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If max_pfn does not fall onto a section boundary, it is possible to
inspect PFNs up to max_pfn, and PFNs above max_pfn, however, max_pfn
itself can't be inspected. We can have a valid (and online) memmap at and
above max_pfn if max_pfn is not aligned to a section boundary. The whole
early section has a memmap and is marked online. Being able to inspect
the state of these PFNs is valuable for debugging, especially because
max_pfn can change on memory hotplug and expose these memmaps.
Also, querying page flags via "./page-types -r -a 0x144001,"
(tools/vm/page-types.c) inside a x86-64 guest with 4160MB under QEMU
results in an (almost) endless loop in user space, because the end is not
detected properly when starting after max_pfn.
Instead, let's allow to inspect all pages in the highest section and
return 0 directly if we try to access pages above that section.
While at it, check the count before adjusting it, to avoid masking user
errors.
Link: http://lkml.kernel.org/r/20191211163201.17179-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch series "mm: fix max_pfn not falling on section boundary", v2.
Playing with different memory sizes for a x86-64 guest, I discovered that
some memmaps (highest section if max_mem does not fall on the section
boundary) are marked as being valid and online, but contain garbage. We
have to properly initialize these memmaps.
Looking at /proc/kpageflags and friends, I found some more issues,
partially related to this.
This patch (of 3):
If max_pfn is not aligned to a section boundary, we can easily run into
BUGs. This can e.g., be triggered on x86-64 under QEMU by specifying a
memory size that is not a multiple of 128MB (e.g., 4097MB, but also
4160MB). I was told that on real HW, we can easily have this scenario
(esp., one of the main reasons sub-section hotadd of devmem was added).
The issue is, that we have a valid memmap (pfn_valid()) for the whole
section, and the whole section will be marked "online".
pfn_to_online_page() will succeed, but the memmap contains garbage.
E.g., doing a "./page-types -r -a 0x144001" when QEMU was started with "-m
4160M" - (see tools/vm/page-types.c):
[ 200.476376] BUG: unable to handle page fault for address: fffffffffffffffe
[ 200.477500] #PF: supervisor read access in kernel mode
[ 200.478334] #PF: error_code(0x0000) - not-present page
[ 200.479076] PGD 59614067 P4D 59614067 PUD 59616067 PMD 0
[ 200.479557] Oops: 0000 [#4] SMP NOPTI
[ 200.479875] CPU: 0 PID: 603 Comm: page-types Tainted: G D W 5.5.0-rc1-next-20191209 #93
[ 200.480646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4
[ 200.481648] RIP: 0010:stable_page_flags+0x4d/0x410
[ 200.482061] Code: f3 ff 41 89 c0 48 b8 00 00 00 00 01 00 00 00 45 84 c0 0f 85 cd 02 00 00 48 8b 53 08 48 8b 2b 48f
[ 200.483644] RSP: 0018:ffffb139401cbe60 EFLAGS: 00010202
[ 200.484091] RAX: fffffffffffffffe RBX: fffffbeec5100040 RCX: 0000000000000000
[ 200.484697] RDX: 0000000000000001 RSI: ffffffff9535c7cd RDI: 0000000000000246
[ 200.485313] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000
[ 200.485917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000144001
[ 200.486523] R13: 00007ffd6ba55f48 R14: 00007ffd6ba55f40 R15: ffffb139401cbf08
[ 200.487130] FS: 00007f68df717580(0000) GS:ffff9ec77fa00000(0000) knlGS:0000000000000000
[ 200.487804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 200.488295] CR2: fffffffffffffffe CR3: 0000000135d48000 CR4: 00000000000006f0
[ 200.488897] Call Trace:
[ 200.489115] kpageflags_read+0xe9/0x140
[ 200.489447] proc_reg_read+0x3c/0x60
[ 200.489755] vfs_read+0xc2/0x170
[ 200.490037] ksys_pread64+0x65/0xa0
[ 200.490352] do_syscall_64+0x5c/0xa0
[ 200.490665] entry_SYSCALL_64_after_hwframe+0x49/0xbe
But it can be triggered much easier via "cat /proc/kpageflags > /dev/null"
after cold/hot plugging a DIMM to such a system:
[root@localhost ~]# cat /proc/kpageflags > /dev/null
[ 111.517275] BUG: unable to handle page fault for address: fffffffffffffffe
[ 111.517907] #PF: supervisor read access in kernel mode
[ 111.518333] #PF: error_code(0x0000) - not-present page
[ 111.518771] PGD a240e067 P4D a240e067 PUD a2410067 PMD 0
This patch fixes that by at least zero-ing out that memmap (so e.g.,
page_to_pfn() will not crash). Commit 907ec5fca3dc ("mm: zero remaining
unavailable struct pages") tried to fix a similar issue, but forgot to
consider this special case.
After this patch, there are still problems to solve. E.g., not all of
these pages falling into a memory hole will actually get initialized later
and set PageReserved - they are only zeroed out - but at least the
immediate crashes are gone. A follow-up patch will take care of this.
Link: http://lkml.kernel.org/r/20191211163201.17179-2-david@redhat.com
Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: <stable@vger.kernel.org> [4.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Writing a cloned file triggers a kernel oops and the user-space command
process is also killed by the system. The bug can be reproduced stably
via:
1) create a file under ocfs2 file system directory.
journalctl -b > aa.txt
2) create a cloned file for this file.
reflink aa.txt bb.txt
3) write the cloned file with dd command.
dd if=/dev/zero of=bb.txt bs=512 count=1 conv=notrunc
The dd command is killed by the kernel, then you can see the oops message
via dmesg command.
[ 463.875404] BUG: kernel NULL pointer dereference, address: 0000000000000028
[ 463.875413] #PF: supervisor read access in kernel mode
[ 463.875416] #PF: error_code(0x0000) - not-present page
[ 463.875418] PGD 0 P4D 0
[ 463.875425] Oops: 0000 [#1] SMP PTI
[ 463.875431] CPU: 1 PID: 2291 Comm: dd Tainted: G OE 5.3.16-2-default
[ 463.875433] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 463.875500] RIP: 0010:ocfs2_refcount_cow+0xa4/0x5d0 [ocfs2]
[ 463.875505] Code: 06 89 6c 24 38 89 eb f6 44 24 3c 02 74 be 49 8b 47 28
[ 463.875508] RSP: 0018:ffffa2cb409dfce8 EFLAGS: 00010202
[ 463.875512] RAX: ffff8b1ebdca8000 RBX: 0000000000000001 RCX: ffff8b1eb73a9df0
[ 463.875515] RDX: 0000000000056a01 RSI: 0000000000000000 RDI: 0000000000000000
[ 463.875517] RBP: 0000000000000001 R08: ffff8b1eb73a9de0 R09: 0000000000000000
[ 463.875520] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 463.875522] R13: ffff8b1eb922f048 R14: 0000000000000000 R15: ffff8b1eb922f048
[ 463.875526] FS: 00007f8f44d15540(0000) GS:ffff8b1ebeb00000(0000) knlGS:0000000000000000
[ 463.875529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 463.875532] CR2: 0000000000000028 CR3: 000000003c17a000 CR4: 00000000000006e0
[ 463.875546] Call Trace:
[ 463.875596] ? ocfs2_inode_lock_full_nested+0x18b/0x960 [ocfs2]
[ 463.875648] ocfs2_file_write_iter+0xaf8/0xc70 [ocfs2]
[ 463.875672] new_sync_write+0x12d/0x1d0
[ 463.875688] vfs_write+0xad/0x1a0
[ 463.875697] ksys_write+0xa1/0xe0
[ 463.875710] do_syscall_64+0x60/0x1f0
[ 463.875743] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 463.875758] RIP: 0033:0x7f8f4482ed44
[ 463.875762] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00
[ 463.875765] RSP: 002b:00007fff300a79d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 463.875769] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f8f4482ed44
[ 463.875771] RDX: 0000000000000200 RSI: 000055f771b5c000 RDI: 0000000000000001
[ 463.875774] RBP: 0000000000000200 R08: 00007f8f44af9c78 R09: 0000000000000003
[ 463.875776] R10: 000000000000089f R11: 0000000000000246 R12: 000055f771b5c000
[ 463.875779] R13: 0000000000000200 R14: 0000000000000000 R15: 000055f771b5c000
This regression problem was introduced by commit e74540b28556 ("ocfs2:
protect extent tree in ocfs2_prepare_inode_for_write()").
Link: http://lkml.kernel.org/r/20200121050153.13290-1-ghe@suse.com
Fixes: e74540b28556 ("ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()").
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit ddd09bcc899f ("initramfs: make compression options not
depend on INITRAMFS_SOURCE"), Kconfig asks the compression mode for
the built-in initramfs regardless of INITRAMFS_SOURCE.
It is technically simpler, but pointless from a UI perspective,
Linus says [1].
When INITRAMFS_SOURCE is empty, usr/Makefile creates a tiny default
cpio, which is so small that nobody cares about the compression.
This commit hides the Kconfig choice in that case. The default cpio
is embedded without compression, which was the original behavior.
[1]: https://lkml.org/lkml/2020/2/1/160
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull more btrfs updates from David Sterba:
"Fixes that arrived after the merge window freeze, mostly stable
material.
- fix race in tree-mod-log element tracking
- fix bio flushing inside extent writepages
- fix assertion when in-memory tracking of discarded extents finds an
empty tree (eg. after adding a new device)
- update logic of temporary read-only block groups to take into
account overcommit
- fix some fixup worker corner cases:
- page could not go through proper COW cycle and the dirty status
is lost due to page migration
- deadlock if delayed allocation is performed under page lock
- fix send emitting invalid clones within the same file
- fix statfs reporting 0 free space when global block reserve size is
larger than remaining free space but there is still space for new
chunks"
* tag 'for-5.6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: do not zero f_bavail if we have available space
Btrfs: send, fix emission of invalid clone operations within the same file
btrfs: do not do delalloc reservation under page lock
btrfs: drop the -EBUSY case in __extent_writepage_io
Btrfs: keep pages dirty when using btrfs_writepage_fixup_worker
btrfs: take overcommit into account in inc_block_group_ro
btrfs: fix force usage in inc_block_group_ro
btrfs: Correctly handle empty trees in find_first_clear_extent_bit
btrfs: flush write bio if we loop in extent_write_cache_pages
Btrfs: fix race between adding and putting tree mod seq elements and nodes
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There was some logic added a while ago to clear out f_bavail in statfs()
if we did not have enough free metadata space to satisfy our global
reserve. This was incorrect at the time, however didn't really pose a
problem for normal file systems because we would often allocate chunks
if we got this low on free metadata space, and thus wouldn't really hit
this case unless we were actually full.
Fast forward to today and now we are much better about not allocating
metadata chunks all of the time. Couple this with d792b0f19711 ("btrfs:
always reserve our entire size for the global reserve") which now means
we'll easily have a larger global reserve than our free space, we are
now more likely to trip over this while still having plenty of space.
Fix this by skipping this logic if the global rsv's space_info is not
full. space_info->full is 0 unless we've attempted to allocate a chunk
for that space_info and that has failed. If this happens then the space
for the global reserve is definitely sacred and we need to report
b_avail == 0, but before then we can just use our calculated b_avail.
Reported-by: Martin Steigerwald <martin@lichtvoll.de>
Fixes: ca8a51b3a979 ("btrfs: statfs: report zero available if metadata are exhausted")
CC: stable@vger.kernel.org # 4.5+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Tested-By: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When doing an incremental send and a file has extents shared with itself
at different file offsets, it's possible for send to emit clone operations
that will fail at the destination because the source range goes beyond the
file's current size. This happens when the file size has increased in the
send snapshot, there is a hole between the shared extents and both shared
extents are at file offsets which are greater the file's size in the
parent snapshot.
Example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt/sdb
$ xfs_io -f -c "pwrite -S 0xf1 0 64K" /mnt/sdb/foobar
$ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base
$ btrfs send -f /tmp/1.snap /mnt/sdb/base
# Create a 320K extent at file offset 512K.
$ xfs_io -c "pwrite -S 0xab 512K 64K" /mnt/sdb/foobar
$ xfs_io -c "pwrite -S 0xcd 576K 64K" /mnt/sdb/foobar
$ xfs_io -c "pwrite -S 0xef 640K 64K" /mnt/sdb/foobar
$ xfs_io -c "pwrite -S 0x64 704K 64K" /mnt/sdb/foobar
$ xfs_io -c "pwrite -S 0x73 768K 64K" /mnt/sdb/foobar
# Clone part of that 320K extent into a lower file offset (192K).
# This file offset is greater than the file's size in the parent
# snapshot (64K). Also the clone range is a bit behind the offset of
# the 320K extent so that we leave a hole between the shared extents.
$ xfs_io -c "reflink /mnt/sdb/foobar 448K 192K 192K" /mnt/sdb/foobar
$ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr
$ btrfs send -p /mnt/sdb/base -f /tmp/2.snap /mnt/sdb/incr
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt/sdc
$ btrfs receive -f /tmp/1.snap /mnt/sdc
$ btrfs receive -f /tmp/2.snap /mnt/sdc
ERROR: failed to clone extents to foobar: Invalid argument
The problem is that after processing the extent at file offset 256K, which
refers to the first 128K of the 320K extent created by the buffered write
operations, we have 'cur_inode_next_write_offset' set to 384K, which
corresponds to the end offset of the partially shared extent (256K + 128K)
and to the current file size in the receiver. Then when we process the
extent at offset 512K, we do extent backreference iteration to figure out
if we can clone the extent from some other inode or from the same inode,
and we consider the extent at offset 256K of the same inode as a valid
source for a clone operation, which is not correct because at that point
the current file size in the receiver is 384K, which corresponds to the
end of last processed extent (at file offset 256K), so using a clone
source range from 256K to 256K + 320K is invalid because that goes past
the current size of the file (384K) - this makes the receiver get an
-EINVAL error when attempting the clone operation.
So fix this by excluding clone sources that have a range that goes beyond
the current file size in the receiver when iterating extent backreferences.
A test case for fstests follows soon.
Fixes: 11f2069c113e02 ("Btrfs: send, allow clone operations within the same file")
CC: stable@vger.kernel.org # 5.5+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We ran into a deadlock in production with the fixup worker. The stack
traces were as follows:
Thread responsible for the writeout, waiting on the page lock
[<0>] io_schedule+0x12/0x40
[<0>] __lock_page+0x109/0x1e0
[<0>] extent_write_cache_pages+0x206/0x360
[<0>] extent_writepages+0x40/0x60
[<0>] do_writepages+0x31/0xb0
[<0>] __writeback_single_inode+0x3d/0x350
[<0>] writeback_sb_inodes+0x19d/0x3c0
[<0>] __writeback_inodes_wb+0x5d/0xb0
[<0>] wb_writeback+0x231/0x2c0
[<0>] wb_workfn+0x308/0x3c0
[<0>] process_one_work+0x1e0/0x390
[<0>] worker_thread+0x2b/0x3c0
[<0>] kthread+0x113/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff
Thread of the fixup worker who is holding the page lock
[<0>] start_delalloc_inodes+0x241/0x2d0
[<0>] btrfs_start_delalloc_roots+0x179/0x230
[<0>] btrfs_alloc_data_chunk_ondemand+0x11b/0x2e0
[<0>] btrfs_check_data_free_space+0x53/0xa0
[<0>] btrfs_delalloc_reserve_space+0x20/0x70
[<0>] btrfs_writepage_fixup_worker+0x1fc/0x2a0
[<0>] normal_work_helper+0x11c/0x360
[<0>] process_one_work+0x1e0/0x390
[<0>] worker_thread+0x2b/0x3c0
[<0>] kthread+0x113/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff
Thankfully the stars have to align just right to hit this. First you
have to end up in the fixup worker, which is tricky by itself (my
reproducer does DIO reads into a MMAP'ed region, so not a common
operation). Then you have to have less than a page size of free data
space and 0 unallocated space so you go down the "commit the transaction
to free up pinned space" path. This was accomplished by a random
balance that was running on the host. Then you get this deadlock.
I'm still in the process of trying to force the deadlock to happen on
demand, but I've hit other issues. I can still trigger the fixup worker
path itself so this patch has been tested in that regard, so the normal
case is fine.
Fixes: 87826df0ec36 ("btrfs: delalloc for page dirtied out-of-band in fixup worker")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that we only return 0 or -EAGAIN from btrfs_writepage_cow_fixup, we
do not need this -EBUSY case.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For COW, btrfs expects pages dirty pages to have been through a few setup
steps. This includes reserving space for the new block allocations and marking
the range in the state tree for delayed allocation.
A few places outside btrfs will dirty pages directly, especially when unmapping
mmap'd pages. In order for these to properly go through COW, we run them
through a fixup worker to wait for stable pages, and do the delalloc prep.
87826df0ec36 added a window where the dirty pages were cleaned, but pending
more action from the fixup worker. We clear_page_dirty_for_io() before
we call into writepage, so the page is no longer dirty. The commit
changed it so now we leave the page clean between unlocking it here and
the fixup worker starting at some point in the future.
During this window, page migration can jump in and relocate the page. Once our
fixup work actually starts, it finds page->mapping is NULL and we end up
freeing the page without ever writing it.
This leads to crc errors and other exciting problems, since it screws up the
whole statemachine for waiting for ordered extents. The fix here is to keep
the page dirty while we're waiting for the fixup worker to get to work.
This is accomplished by returning -EAGAIN from btrfs_writepage_cow_fixup
if we queued the page up for fixup, which will cause the writepage
function to redirty the page.
Because we now expect the page to be dirty once it gets to the fixup
worker we must adjust the error cases to call clear_page_dirty_for_io()
on the page. That is the bulk of the patch, but it is not the fix, the
fix is the -EAGAIN from btrfs_writepage_cow_fixup. We cannot separate
these two changes out because the error conditions change with the new
expectations.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
inc_block_group_ro does a calculation to see if we have enough room left
over if we mark this block group as read only in order to see if it's ok
to mark the block group as read only.
The problem is this calculation _only_ works for data, where our used is
always less than our total. For metadata we will overcommit, so this
will almost always fail for metadata.
Fix this by exporting btrfs_can_overcommit, and then see if we have
enough space to remove the remaining free space in the block group we
are trying to mark read only. If we do then we can mark this block
group as read only.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For some reason we've translated the do_chunk_alloc that goes into
btrfs_inc_block_group_ro to force in inc_block_group_ro, but these are
two different things.
force for inc_block_group_ro is used when we are forcing the block group
read only no matter what, for example when the underlying chunk is
marked read only. We need to not do the space check here as this block
group needs to be read only.
btrfs_inc_block_group_ro() has a do_chunk_alloc flag that indicates that
we need to pre-allocate a chunk before marking the block group read
only. This has nothing to do with forcing, and in fact we _always_ want
to do the space check in this case, so unconditionally pass false for
force in this case.
Then fixup inc_block_group_ro to honor force as it's expected and
documented to do.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Raviu reported that running his regular fs_trim segfaulted with the
following backtrace:
[ 237.525947] assertion failed: prev, in ../fs/btrfs/extent_io.c:1595
[ 237.525984] ------------[ cut here ]------------
[ 237.525985] kernel BUG at ../fs/btrfs/ctree.h:3117!
[ 237.525992] invalid opcode: 0000 [#1] SMP PTI
[ 237.525998] CPU: 4 PID: 4423 Comm: fstrim Tainted: G U OE 5.4.14-8-vanilla #1
[ 237.526001] Hardware name: ASUSTeK COMPUTER INC.
[ 237.526044] RIP: 0010:assfail.constprop.58+0x18/0x1a [btrfs]
[ 237.526079] Call Trace:
[ 237.526120] find_first_clear_extent_bit+0x13d/0x150 [btrfs]
[ 237.526148] btrfs_trim_fs+0x211/0x3f0 [btrfs]
[ 237.526184] btrfs_ioctl_fitrim+0x103/0x170 [btrfs]
[ 237.526219] btrfs_ioctl+0x129a/0x2ed0 [btrfs]
[ 237.526227] ? filemap_map_pages+0x190/0x3d0
[ 237.526232] ? do_filp_open+0xaf/0x110
[ 237.526238] ? _copy_to_user+0x22/0x30
[ 237.526242] ? cp_new_stat+0x150/0x180
[ 237.526247] ? do_vfs_ioctl+0xa4/0x640
[ 237.526278] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
[ 237.526283] do_vfs_ioctl+0xa4/0x640
[ 237.526288] ? __do_sys_newfstat+0x3c/0x60
[ 237.526292] ksys_ioctl+0x70/0x80
[ 237.526297] __x64_sys_ioctl+0x16/0x20
[ 237.526303] do_syscall_64+0x5a/0x1c0
[ 237.526310] entry_SYSCALL_64_after_hwframe+0x49/0xbe
That was due to btrfs_fs_device::aloc_tree being empty. Initially I
thought this wasn't possible and as a percaution have put the assert in
find_first_clear_extent_bit. Turns out this is indeed possible and could
happen when a file system with SINGLE data/metadata profile has a 2nd
device added. Until balance is run or a new chunk is allocated on this
device it will be completely empty.
In this case find_first_clear_extent_bit should return the full range
[0, -1ULL] and let the caller handle this i.e for trim the end will be
capped at the size of actual device.
Link: https://lore.kernel.org/linux-btrfs/izW2WNyvy1dEDweBICizKnd2KDwDiDyY2EYQr4YCwk7pkuIpthx-JRn65MPBde00ND6V0_Lh8mW0kZwzDiLDv25pUYWxkskWNJnVP0kgdMA=@protonmail.com/
Fixes: 45bfcfc168f8 ("btrfs: Implement find_first_clear_extent_bit")
CC: stable@vger.kernel.org # 5.2+
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There exists a deadlock with range_cyclic that has existed forever. If
we loop around with a bio already built we could deadlock with a writer
who has the page locked that we're attempting to write but is waiting on
a page in our bio to be written out. The task traces are as follows
PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5"
#0 [ffffc900297bb658] __schedule at ffffffff81a4c33f
#1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3
#2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42
#3 [ffffc900297bb708] __lock_page at ffffffff811f145b
#4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502
#5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684
#6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff
#7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0
#8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2
#9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd
PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND:
"aio-dio-invalid"
#0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f
#1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3
#2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42
#3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6
#4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7
#5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359
#6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933
#7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8
#8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d
#9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032
I used drgn to find the respective pages we were stuck on
page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901
page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874
As you can see the kworker is waiting for bit 0 (PG_locked) on index
7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index
8148. aio-dio-invalid has 7680, and the kworker epd looks like the
following
crash> struct extent_page_data ffffc900297bbbb0
struct extent_page_data {
bio = 0xffff889f747ed830,
tree = 0xffff889eed6ba448,
extent_locked = 0,
sync_io = 0
}
Probably worth mentioning as well that it waits for writeback of the
page to complete while holding a lock on it (at prepare_pages()).
Using drgn I walked the bio pages looking for page
0xffffea00fbfc7500 which is the one we're waiting for writeback on
bio = Object(prog, 'struct bio', address=0xffff889f747ed830)
for i in range(0, bio.bi_vcnt.value_()):
bv = bio.bi_io_vec[i]
if bv.bv_page.value_() == 0xffffea00fbfc7500:
print("FOUND IT")
which validated what I suspected.
The fix for this is simple, flush the epd before we loop back around to
the beginning of the file during writeout.
Fixes: b293f02e1423 ("Btrfs: Add writepages support")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There is a race between adding and removing elements to the tree mod log
list and rbtree that can lead to use-after-free problems.
Consider the following example that explains how/why the problems happens:
1) Task A has mod log element with sequence number 200. It currently is
the only element in the mod log list;
2) Task A calls btrfs_put_tree_mod_seq() because it no longer needs to
access the tree mod log. When it enters the function, it initializes
'min_seq' to (u64)-1. Then it acquires the lock 'tree_mod_seq_lock'
before checking if there are other elements in the mod seq list.
Since the list it empty, 'min_seq' remains set to (u64)-1. Then it
unlocks the lock 'tree_mod_seq_lock';
3) Before task A acquires the lock 'tree_mod_log_lock', task B adds
itself to the mod seq list through btrfs_get_tree_mod_seq() and gets a
sequence number of 201;
4) Some other task, name it task C, modifies a btree and because there
elements in the mod seq list, it adds a tree mod elem to the tree
mod log rbtree. That node added to the mod log rbtree is assigned
a sequence number of 202;
5) Task B, which is doing fiemap and resolving indirect back references,
calls btrfs get_old_root(), with 'time_seq' == 201, which in turn
calls tree_mod_log_search() - the search returns the mod log node
from the rbtree with sequence number 202, created by task C;
6) Task A now acquires the lock 'tree_mod_log_lock', starts iterating
the mod log rbtree and finds the node with sequence number 202. Since
202 is less than the previously computed 'min_seq', (u64)-1, it
removes the node and frees it;
7) Task B still has a pointer to the node with sequence number 202, and
it dereferences the pointer itself and through the call to
__tree_mod_log_rewind(), resulting in a use-after-free problem.
This issue can be triggered sporadically with the test case generic/561
from fstests, and it happens more frequently with a higher number of
duperemove processes. When it happens to me, it either freezes the VM or
it produces a trace like the following before crashing:
[ 1245.321140] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
[ 1245.321200] CPU: 1 PID: 26997 Comm: pool Not tainted 5.5.0-rc6-btrfs-next-52 #1
[ 1245.321235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
[ 1245.321287] RIP: 0010:rb_next+0x16/0x50
[ 1245.321307] Code: ....
[ 1245.321372] RSP: 0018:ffffa151c4d039b0 EFLAGS: 00010202
[ 1245.321388] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8ae221363c80 RCX: 6b6b6b6b6b6b6b6b
[ 1245.321409] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ae221363c80
[ 1245.321439] RBP: ffff8ae20fcc4688 R08: 0000000000000002 R09: 0000000000000000
[ 1245.321475] R10: ffff8ae20b120910 R11: 00000000243f8bb1 R12: 0000000000000038
[ 1245.321506] R13: ffff8ae221363c80 R14: 000000000000075f R15: ffff8ae223f762b8
[ 1245.321539] FS: 00007fdee1ec7700(0000) GS:ffff8ae236c80000(0000) knlGS:0000000000000000
[ 1245.321591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1245.321614] CR2: 00007fded4030c48 CR3: 000000021da16003 CR4: 00000000003606e0
[ 1245.321642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1245.321668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1245.321706] Call Trace:
[ 1245.321798] __tree_mod_log_rewind+0xbf/0x280 [btrfs]
[ 1245.321841] btrfs_search_old_slot+0x105/0xd00 [btrfs]
[ 1245.321877] resolve_indirect_refs+0x1eb/0xc60 [btrfs]
[ 1245.321912] find_parent_nodes+0x3dc/0x11b0 [btrfs]
[ 1245.321947] btrfs_check_shared+0x115/0x1c0 [btrfs]
[ 1245.321980] ? extent_fiemap+0x59d/0x6d0 [btrfs]
[ 1245.322029] extent_fiemap+0x59d/0x6d0 [btrfs]
[ 1245.322066] do_vfs_ioctl+0x45a/0x750
[ 1245.322081] ksys_ioctl+0x70/0x80
[ 1245.322092] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1245.322113] __x64_sys_ioctl+0x16/0x20
[ 1245.322126] do_syscall_64+0x5c/0x280
[ 1245.322139] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1245.322155] RIP: 0033:0x7fdee3942dd7
[ 1245.322177] Code: ....
[ 1245.322258] RSP: 002b:00007fdee1ec6c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1245.322294] RAX: ffffffffffffffda RBX: 00007fded40210d8 RCX: 00007fdee3942dd7
[ 1245.322314] RDX: 00007fded40210d8 RSI: 00000000c020660b RDI: 0000000000000004
[ 1245.322337] RBP: 0000562aa89e7510 R08: 0000000000000000 R09: 00007fdee1ec6d44
[ 1245.322369] R10: 0000000000000073 R11: 0000000000000246 R12: 00007fdee1ec6d48
[ 1245.322390] R13: 00007fdee1ec6d40 R14: 00007fded40210d0 R15: 00007fdee1ec6d50
[ 1245.322423] Modules linked in: ....
[ 1245.323443] ---[ end trace 01de1e9ec5dff3cd ]---
Fix this by ensuring that btrfs_put_tree_mod_seq() computes the minimum
sequence number and iterates the rbtree while holding the lock
'tree_mod_log_lock' in write mode. Also get rid of the 'tree_mod_seq_lock'
lock, since it is now redundant.
Fixes: bd989ba359f2ac ("Btrfs: add tree modification log functions")
Fixes: 097b8a7c9e48e2 ("Btrfs: join tree mod log code with the code holding back delayed refs")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
"Everything for kgdb this time around is either simplifications or
clean ups.
In particular Douglas Anderson's modifications to the backtrace
machine in the *last* dev cycle have enabled Doug to tidy up some MIPS
specific backtrace code and stop sharing certain data structures
across the kernel. Note that The MIPS folks were on Cc: for the MIPS
patch and reacted positively (but without an explicit Acked-by).
Doug also got rid of the implicit switching between tasks and register
sets during some but not of kdb's backtrace actions (because the
implicit switching was either confusing for users, pointless or both).
Finally there is a coverity fix and patch to replace open coded
console traversal with the proper helper function"
* tag 'kgdb-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kdb: Use for_each_console() helper
kdb: remove redundant assignment to pointer bp
kdb: Get rid of confusing diag msg from "rd" if current task has no regs
kdb: Gid rid of implicit setting of the current task / regs
kdb: kdb_current_task shouldn't be exported
kdb: kdb_current_regs should be private
MIPS: kdb: Remove old workaround for backtracing on other CPUs
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Replace open coded single-linked list iteration loop with for_each_console()
helper in use.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The point bp is assigned a value that is never read, it is being
re-assigned later to bp = &kdb_breakpoints[lowbp] in a for-loop.
Remove the redundant assignment.
Addresses-Coverity ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20191128130753.181246-1-colin.king@canonical.com
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If you switch to a sleeping task with the "pid" command and then type
"rd", kdb tells you this:
No current kdb registers. You may need to select another task
diag: -17: Invalid register name
The first message makes sense, but not the second. Fix it by just
returning 0 after commands accessing the current registers finish if
we've already printed the "No current kdb registers" error.
While fixing kdb_rd(), change the function to use "if" rather than
"ifdef". It cleans the function up a bit and any modern compiler will
have no trouble handling still producing good code.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20191109111624.5.I121f4c6f0c19266200bf6ef003de78841e5bfc3d@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Some (but not all?) of the kdb backtrace paths would cause the
kdb_current_task and kdb_current_regs to remain changed. As discussed
in a review of a previous patch [1], this doesn't seem intuitive, so
let's fix that.
...but, it turns out that there's actually no longer any reason to set
the current task / current regs while backtracing anymore anyway. As
of commit 2277b492582d ("kdb: Fix stack crawling on 'running' CPUs
that aren't the master") if we're backtracing on a task running on a
CPU we ask that CPU to do the backtrace itself. Linux can do that
without anything fancy. If we're doing backtrace on a sleeping task
we can also do that fine without updating globals. So this patch
mostly just turns into deleting a bunch of code.
[1] https://lore.kernel.org/r/20191010150735.dhrj3pbjgmjrdpwr@holly.lan
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20191109111624.4.Ibc3d982bbeb9e46872d43973ba808cd4c79537c7@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The kdb_current_task variable has been declared in
"kernel/debug/kdb/kdb_private.h" since 2010 when kdb was added to the
mainline kernel. This is not a public header. There should be no
reason that kdb_current_task should be exported and there are no
in-kernel users that need it. Remove the export.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20191109111623.3.I14b22b5eb15ca8f3812ab33e96621231304dc1f7@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As of the patch ("MIPS: kdb: Remove old workaround for backtracing on
other CPUs") there is no reason for kdb_current_regs to be in the
public "kdb.h". Let's move it next to kdb_current_task.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20191109111623.2.Iadbfb484e90b557cc4b5ac9890bfca732cd99d77@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As of commit 2277b492582d ("kdb: Fix stack crawling on 'running' CPUs
that aren't the master") we no longer need any special case for doing
stack dumps on CPUs that are not the kdb master. Let's remove.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20191109111623.1.I30a0cac4d9880040c8d41495bd9a567fe3e24989@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fix from Greg KH:
"Here is a single patch, that fixes up a commit that came in the
previous char/misc merge.
It fixes a bug in the hpet driver that everyone keeps tripping over in
their automated testing. Good thing is, people are catching it. Bad
thing it wasn't caught by anyone testing before this. Oh well...
This has been in linux-next for a few days with no reported issues"
* tag 'char-misc-5.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
char: hpet: Fix out-of-bounds read bug
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, there is an out-of-bounds read on array hpetp->hp_dev
in the following for loop:
870 for (i = 0; i < hdp->hd_nirqs; i++)
871 hpetp->hp_dev[i].hd_hdwirq = hdp->hd_irq[i];
This is due to the recent change from one-element array to
flexible-array member in struct hpets:
104 struct hpets {
...
113 struct hpet_dev hp_dev[];
114 };
This change affected the total size of the dynamic memory
allocation, decreasing it by one time the size of struct hpet_dev.
Fix this by adjusting the allocation size when calling
struct_size().
Fixes: 987f028b8637c ("char: hpet: Use flexible-array member")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/r/20200129022613.GA24281@embeddedor.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
Pull backlight updates from Lee Jones:
"Fix-ups:
- Remove superfluous code in ams369fg06
- Convert over to GPIO descriptor (gpiod) in bd6107
Bug Fixes:
- Fix unsigned comparison to less than zero in qcom-wled"
* tag 'backlight-next-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
backlight: qcom-wled: Fix unsigned comparison to zero
backlight: bd6107: Convert to use GPIO descriptor
backlight: ams369fg06: Drop GPIO include
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fixes coccicheck warning:
./drivers/video/backlight/qcom-wled.c:1104:5-15:
WARNING: Unsigned expression compared with zero: string_len > 0
The unsigned variable string_len is assigned a return value from the call
to of_property_count_elems_of_size(), which may return negative error code.
Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Kiran Gunda <kgunda@codeaurora.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The Rohm BD6107 driver can pass a fixed GPIO line using the old
GPIO API using platform data. As there are no in-tree users of this
platform data since 2013, we can convert this to use a GPIO descriptor
and require any out-of-tree consumers to pass the GPIO using
a machine descriptor table instead.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The driver includes <linux/gpio.h> yet fails to use symbols
from any the header so drop the include.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"New Drivers:
- Add support for ROHM BD71828 PMICs and GPIOs
- Add support for Qualcomm Aqstic Audio Codecs WCD9340 and WCD9341
New Device Support:
- Add support for BD71828 to BD70528 RTC driver
- Add support for Intel's Jasper Lake to LPSS PCI
New Functionality:
- Add support for Power Key to ROHM BD71828
- Add support for Clocks to ROHM BD71828
- Add support for GPIOs to Dialog DA9062
- Add support for USB PD Notify to ChromiumOS EC
- Allow callers to specify args when requesting regmap lookup; syscon
Fix-ups:
- Improve error handling and sanity checking; atmel-hlcdc, dln2
- Device Tree support/documentation; bd71828, da9062, xylon,logicvc,
ab8500, max14577, atmel-usart
- Match devices using platform IDs; bd7xxxx
- Refactor BD718x7 regulator component; bd718x7-regulator
- Use standard interfaces/helpers; syscon, sm501
- Trivial (whitespace, spelling, etc); ab8500-core, Kconfig
- Remove unused code; db8500-prcmu, tqmx86
- Wait until boot has finished before accessing registers;
madera-core
- Provide missing register value defaults; cs47l15-tables
- Allow more time for hardware to reset; madera-core
Bug Fixes:
- Fix erroneous register values; rohm-bd70528
- Fix register volatility; axp20x, rn5t618
- Fix Kconfig dependencies; MFD_MAX77650
- Fix incorrect compatible string; da9062-core
- Fix syscon_regmap_lookup_by_phandle_args() stub; syscon"
* tag 'mfd-next-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (41 commits)
mfd: syscon: Fix syscon_regmap_lookup_by_phandle_args() dummy
mfd: wcd934x: Add support to wcd9340/wcd9341 codec
mfd: syscon: Add arguments support for syscon reference
mfd: rn5t618: Mark ADC control register volatile
dt-bindings: atmel-usart: Add microchip,sam9x60-{usart, dbgu}
dt-bindings: atmel-usart: Remove wildcard
mfd: cros_ec: Add cros-usbpd-notify subdevice
mfd: da9062: Fix watchdog compatible string
mfd: madera: Allow more time for hardware reset
mfd: cs47l15: Add missing register default
mfd: madera: Wait for boot done before accessing any other registers
mfd: Kconfig: Rename Samsung to lowercase
mfd: tqmx86: remove set but not used variable 'i2c_ien'
mfd: dbx500-prcmu: Drop DSI pll clock functions
mfd: dbx500-prcmu: Drop set_display_clocks()
mfd: max77650: Select REGMAP_IRQ in Kconfig
mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile
mfd: ab8500: Fix ab8500-clk typo
mfd: intel-lpss: Add Intel Jasper Lake PCI IDs
dt-bindings: mfd: max14577: Add reference to max14040_battery.txt descriptions
...
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If CONFIG_MFD_SYSCON=n:
include/linux/mfd/syscon.h:54:23: warning: ‘syscon_regmap_lookup_by_phandle_args’ defined but not used [-Wunused-function]
Fix this by adding the missing inline keyword.
Fixes: 6a24f567af4accef ("mfd: syscon: Add arguments support for syscon reference")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Qualcomm WCD9340/WCD9341 Codec is a standalone Hi-Fi audio codec IC.
This codec has integrated SoundWire controller, pin controller and
interrupt controller.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
There are a lot of similar global registers being used across multiple SoCs
from Unisoc. But most of these registers are assigned with different offset
for different SoCs. It is hard to handle all of them in an all-in-one
kernel image.
Add a helper function to get regmap with arguments where we could put some
extra information such as the offset value.
Signed-off-by: Orson Zhai <orson.zhai@unisoc.com>
Tested-by: Baolin Wang <baolin.wang@unisoc.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
There is a bit which gets cleared after conversion.
Fixes: 9bb9e29c78f8 ("mfd: Add Ricoh RN5T618 PMIC core driver")
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Add microchip,sam9x60-usart and add microchip,sam9x60-dbgu to DT
bindings documentation.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Remove chip whildcard and introduce the list of compatibles instead.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Add the cros-usbpd-notify driver as a subdevice on platforms that
support the EC_FEATURE_USB_PD EC feature flag and don't have the
ACPI PD notification device defined.
This driver allows other cros-ec devices to receive PD event
notifications from the Chrome OS Embedded Controller (EC) via a
notification chain.
Signed-off-by: Prashant Malani <pmalani@chromium.org>
Reviewed-by: Benson Leung <bleung@chromium.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The watchdog driver compatible is "dlg,da9062-watchdog" and not
"dlg,da9062-wdt". Therefore the mfd-core can't populate the of_node and
fwnode. As result the watchdog driver can't parse the devicetree.
Fixes: 9b40b030c4ad ("mfd: da9062: Supply core driver")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Both manual and power on resets have a brief period where the chip will
not be accessible immediately afterwards. Extend the time allowed for
this from a minimum of 1mS to 2mS based on newer evaluation of the
hardware and ensure this reset happens in all reset conditions. Whilst
making the change also remove the redundant NULL checks in the reset
functions as the GPIO functions already check for this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Accessory detect mode 1 is missing a default, add one to the table.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
It is advised to wait for the boot done bit to be set before reading
any other register, update the driver to respect this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Fix up inconsistent usage of upper and lowercase letters in "Samsung"
name.
"SAMSUNG" is not an abbreviation but a regular trademarked name.
Therefore it should be written with lowercase letters starting with
capital letter.
Although advertisement materials usually use uppercase "SAMSUNG", the
lowercase version is used in all legal aspects (e.g. on Wikipedia and in
privacy/legal statements on
https://www.samsung.com/semiconductor/privacy-global/).
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/mfd/tqmx86.c: In function ‘tqmx86_probe’:
drivers/mfd/tqmx86.c:161:29: warning: variable ‘i2c_ien’
set but not used I[-Wunused-but-set-variable]
It is never used, and so can be removed.
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|