summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | | | mm/kasan: don't vfree() nonexistent vm_areaAndrey Ryabinin2018-05-251-2/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN uses different routines to map shadow for hot added memory and memory obtained in boot process. Attempt to offline memory onlined by normal boot process leads to this: Trying to vfree() nonexistent vm area (000000005d3b34b9) WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190 Call Trace: kasan_mem_notifier+0xad/0xb9 notifier_call_chain+0x166/0x260 __blocking_notifier_call_chain+0xdb/0x140 __offline_pages+0x96a/0xb10 memory_subsys_offline+0x76/0xc0 device_offline+0xb8/0x120 store_mem_state+0xfa/0x120 kernfs_fop_write+0x1d5/0x320 __vfs_write+0xd4/0x530 vfs_write+0x105/0x340 SyS_write+0xb0/0x140 Obviously we can't call vfree() to free memory that wasn't allocated via vmalloc(). Use find_vm_area() to see if we can call vfree(). Unfortunately it's a bit tricky to properly unmap and free shadow allocated during boot, so we'll have to keep it. If memory will come online again that shadow will be reused. Matthew asked: how can you call vfree() on something that isn't a vmalloc address? vfree() is able to free any address returned by __vmalloc_node_range(). And __vmalloc_node_range() gives you any address you ask. It doesn't have to be an address in [VMALLOC_START, VMALLOC_END] range. That's also how the module_alloc()/module_memfree() works on architectures that have designated area for modules. [aryabinin@virtuozzo.com: improve comments] Link: http://lkml.kernel.org/r/dabee6ab-3a7a-51cd-3b86-5468718e0390@virtuozzo.com [akpm@linux-foundation.org: fix typos, reflow comment] Link: http://lkml.kernel.org/r/20180201163349.8700-1-aryabinin@virtuozzo.com Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Paul Menzel <pmenzel+linux-kasan-dev@molgen.mpg.de> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | MAINTAINERS: change hugetlbfs maintainer and update filesMike Kravetz2018-05-251-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current hugetlbfs maintainer has not been active for more than a few years. I have been been active in this area for more than two years and plan to remain active in the foreseeable future. Also, update the hugetlbfs entry to include linux-mm mail list and additional hugetlbfs related files. hugetlb.c and hugetlb.h are not 100% hugetlbfs, but a majority of their content is hugetlbfs related. Link: http://lkml.kernel.org/r/20180518225236.19079-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Nadia Yvette Chambers <nyc@holomorphy.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | ipc/shm: fix shmat() nil address after round-down when remappingDavidlohr Bueso2018-05-251-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shmat()'s SHM_REMAP option forbids passing a nil address for; this is in fact the very first thing we check for. Andrea reported that for SHM_RND|SHM_REMAP cases we can end up bypassing the initial addr check, but we need to check again if the address was rounded down to nil. As of this patch, such cases will return -EINVAL. Link: http://lkml.kernel.org/r/20180503204934.kk63josdu6u53fbd@linux-n805 Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | Revert "ipc/shm: Fix shmat mmap nil-page protection"Davidlohr Bueso2018-05-251-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "ipc/shm: shmat() fixes around nil-page". These patches fix two issues reported[1] a while back by Joe and Andrea around how shmat(2) behaves with nil-page. The first reverts a commit that it was incorrectly thought that mapping nil-page (address=0) was a no no with MAP_FIXED. This is not the case, with the exception of SHM_REMAP; which is address in the second patch. I chose two patches because it is easier to backport and it explicitly reverts bogus behaviour. Both patches ought to be in -stable and ltp testcases need updated (the added testcase around the cve can be modified to just test for SHM_RND|SHM_REMAP). [1] lkml.kernel.org/r/20180430172152.nfa564pvgpk3ut7p@linux-n805 This patch (of 2): Commit 95e91b831f87 ("ipc/shm: Fix shmat mmap nil-page protection") worked on the idea that we should not be mapping as root addr=0 and MAP_FIXED. However, it was reported that this scenario is in fact valid, thus making the patch both bogus and breaks userspace as well. For example X11's libint10.so relies on shmat(1, SHM_RND) for lowmem initialization[1]. [1] https://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/os-support/linux/int10/linux.c#n347 Link: http://lkml.kernel.org/r/20180503203243.15045-2-dave@stgolabs.net Fixes: 95e91b831f87 ("ipc/shm: Fix shmat mmap nil-page protection") Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | idr: fix invalid ptr dereference on item deleteMatthew Wilcox2018-05-252-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the radix tree underlying the IDR happens to be full and we attempt to remove an id which is larger than any id in the IDR, we will call __radix_tree_delete() with an uninitialised 'slot' pointer, at which point anything could happen. This was easiest to hit with a single entry at id 0 and attempting to remove a non-0 id, but it could have happened with 64 entries and attempting to remove an id >= 64. Roman said: The syzcaller test boils down to opening /dev/kvm, creating an eventfd, and calling a couple of KVM ioctls. None of this requires superuser. And the result is dereferencing an uninitialized pointer which is likely a crash. The specific path caught by syzbot is via KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are other user-triggerable paths, so cc:stable is probably justified. Matthew added: We have around 250 calls to idr_remove() in the kernel today. Many of them pass an ID which is embedded in the object they're removing, so they're safe. Picking a few likely candidates: drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl. drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar drivers/atm/nicstar.c could be taken down by a handcrafted packet Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Reported-by: <syzbot+35666cba7f0a337e2e79@syzkaller.appspotmail.com> Debugged-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | ocfs2: revert "ocfs2/o2hb: check len for bio_add_page() to avoid getting ↵Changwei Ge2018-05-251-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | incorrect bio" This reverts commit ba16ddfbeb9d ("ocfs2/o2hb: check len for bio_add_page() to avoid getting incorrect bio"). In my testing, this patch introduces a problem that mkfs can't have slots more than 16 with 4k block size. And the original logic is safe actually with the situation it mentions so revert this commit. Attach test log: (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 0, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 1, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 2, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 3, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 4, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 5, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 6, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 7, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 8, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 9, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 10, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 11, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 12, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 13, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 14, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 15, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 16, vec_len = 4096, vec_start = 0 (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:471 ERROR: Adding page[16] to bio failed, page ffffea0002d7ed40, len 0, vec_len 4096, vec_start 0,bi_sector 8192 (mkfs.ocfs2,27479,2):o2hb_read_slots:500 ERROR: status = -5 (mkfs.ocfs2,27479,2):o2hb_populate_slot_data:1911 ERROR: status = -5 (mkfs.ocfs2,27479,2):o2hb_region_dev_write:2012 ERROR: status = -5 Link: http://lkml.kernel.org/r/SIXPR06MB0461721F398A5A92FC68C39ED5920@SIXPR06MB0461.apcprd06.prod.outlook.com Signed-off-by: Changwei Ge <ge.changwei@h3c.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | mm: fix nr_rotate_swap leak in swapon() error caseOmar Sandoval2018-05-251-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If swapon() fails after incrementing nr_rotate_swap, we don't decrement it and thus effectively leak it. Make sure we decrement it if we incremented it. Link: http://lkml.kernel.org/r/b6fe6b879f17fa68eee6cbd876f459f6e5e33495.1526491581.git.osandov@fb.com Fixes: 81a0298bdfab ("mm, swap: don't use VMA based swap readahead if HDD is used as swap") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-05-2549-193/+372
|\ \ \ \ \ \ \ \ | |/ / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "Let's begin the holiday weekend with some networking fixes: 1) Whoops need to restrict cfg80211 wiphy names even more to 64 bytes. From Eric Biggers. 2) Fix flags being ignored when using kernel_connect() with SCTP, from Xin Long. 3) Use after free in DCCP, from Alexey Kodanev. 4) Need to check rhltable_init() return value in ipmr code, from Eric Dumazet. 5) XDP handling fixes in virtio_net from Jason Wang. 6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu. 7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack Morgenstein. 8) Prevent out-of-bounds speculation using indexes in BPF, from Daniel Borkmann. 9) Fix regression added by AF_PACKET link layer cure, from Willem de Bruijn. 10) Correct ENIC dma mask, from Govindarajulu Varadarajan. 11) Missing config options for PMTU tests, from Stefano Brivio" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits) ibmvnic: Fix partial success login retries selftests/net: Add missing config options for PMTU tests mlx4_core: allocate ICM memory in page size chunks enic: set DMA mask to 47 bit ppp: remove the PPPIOCDETACH ioctl ipv4: remove warning in ip_recv_error net : sched: cls_api: deal with egdev path only if needed vhost: synchronize IOTLB message with dev cleanup packet: fix reserve calculation net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands net/mlx5e: When RXFCS is set, add FCS data into checksum calculation bpf: properly enforce index mask to prevent out-of-bounds speculation net/mlx4: Fix irq-unsafe spinlock usage net: phy: broadcom: Fix bcm_write_exp() net: phy: broadcom: Fix auxiliary control register reads net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message ibmvnic: Only do H_EOI for mobility events tuntap: correctly set SOCKWQ_ASYNC_NOSPACE virtio-net: fix leaking page for gso packet during mergeable XDP ...
| * | | | | | | ibmvnic: Fix partial success login retriesThomas Falcon2018-05-251-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In its current state, the driver will handle backing device login in a loop for a certain number of retries while the device returns a partial success, indicating that the driver may need to try again using a smaller number of resources. The variable it checks to continue retrying may change over the course of operations, resulting in reallocation of resources but exits without sending the login attempt. Guard against this by introducing a boolean variable that will retain the state indicating that the driver needs to reattempt login with backing device firmware. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2018-05-254-24/+70
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf 2018-05-24 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a bug in the original fix to prevent out of bounds speculation when multiple tail call maps from different branches or calls end up at the same tail call helper invocation, from Daniel. 2) Two selftest fixes, one in reuseport_bpf_numa where test is skipped in case of missing numa support and another one to update kernel config to properly support xdp_meta.sh test, from Anders. ... Would be great if you have a chance to merge net into net-next after that. The verifier fix would be needed later as a dependency in bpf-next for upcomig work there. When you do the merge there's a trivial conflict on BPF side with 849fa50662fb ("bpf/verifier: refine retval R0 state for bpf_get_stack helper"): Resolution is to keep both functions, the do_refine_retval_range() and record_func_map(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | bpf: properly enforce index mask to prevent out-of-bounds speculationDaniel Borkmann2018-05-242-23/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While reviewing the verifier code, I recently noticed that the following two program variants in relation to tail calls can be loaded. Variant 1: # bpftool p d x i 15 0: (15) if r1 == 0x0 goto pc+3 1: (18) r2 = map[id:5] 3: (05) goto pc+2 4: (18) r2 = map[id:6] 6: (b7) r3 = 7 7: (35) if r3 >= 0xa0 goto pc+2 8: (54) (u32) r3 &= (u32) 255 9: (85) call bpf_tail_call#12 10: (b7) r0 = 1 11: (95) exit # bpftool m s i 5 5: prog_array flags 0x0 key 4B value 4B max_entries 4 memlock 4096B # bpftool m s i 6 6: prog_array flags 0x0 key 4B value 4B max_entries 160 memlock 4096B Variant 2: # bpftool p d x i 20 0: (15) if r1 == 0x0 goto pc+3 1: (18) r2 = map[id:8] 3: (05) goto pc+2 4: (18) r2 = map[id:7] 6: (b7) r3 = 7 7: (35) if r3 >= 0x4 goto pc+2 8: (54) (u32) r3 &= (u32) 3 9: (85) call bpf_tail_call#12 10: (b7) r0 = 1 11: (95) exit # bpftool m s i 8 8: prog_array flags 0x0 key 4B value 4B max_entries 160 memlock 4096B # bpftool m s i 7 7: prog_array flags 0x0 key 4B value 4B max_entries 4 memlock 4096B In both cases the index masking inserted by the verifier in order to control out of bounds speculation from a CPU via b2157399cc98 ("bpf: prevent out-of-bounds speculation") seems to be incorrect in what it is enforcing. In the 1st variant, the mask is applied from the map with the significantly larger number of entries where we would allow to a certain degree out of bounds speculation for the smaller map, and in the 2nd variant where the mask is applied from the map with the smaller number of entries, we get buggy behavior since we truncate the index of the larger map. The original intent from commit b2157399cc98 is to reject such occasions where two or more different tail call maps are used in the same tail call helper invocation. However, the check on the BPF_MAP_PTR_POISON is never hit since we never poisoned the saved pointer in the first place! We do this explicitly for map lookups but in case of tail calls we basically used the tail call map in insn_aux_data that was processed in the most recent path which the verifier walked. Thus any prior path that stored a pointer in insn_aux_data at the helper location was always overridden. Fix it by moving the map pointer poison logic into a small helper that covers both BPF helpers with the same logic. After that in fixup_bpf_calls() the poison check is then hit for tail calls and the program rejected. Latter only happens in unprivileged case since this is the *only* occasion where a rewrite needs to happen, and where such rewrite is specific to the map (max_entries, index_mask). In the privileged case the rewrite is generic for the insn->imm / insn->code update so multiple maps from different paths can be handled just fine since all the remaining logic happens in the instruction processing itself. This is similar to the case of map lookups: in case there is a collision of maps in fixup_bpf_calls() we must skip the inlined rewrite since this will turn the generic instruction sequence into a non- generic one. Thus the patch_call_imm will simply update the insn->imm location where the bpf_map_lookup_elem() will later take care of the dispatch. Given we need this 'poison' state as a check, the information of whether a map is an unpriv_array gets lost, so enforcing it prior to that needs an additional state. In general this check is needed since there are some complex and tail call intensive BPF programs out there where LLVM tends to generate such code occasionally. We therefore convert the map_ptr rather into map_state to store all this w/o extra memory overhead, and the bit whether one of the maps involved in the collision was from an unpriv_array thus needs to be retained as well there. Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | | | | | | selftests: net: reuseport_bpf_numa: don't fail if no numa supportAnders Roxell2018-05-231-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reuseport_bpf_numa test case fails there's no numa support. The test shouldn't fail if there's no support it should be skipped. Fixes: 3c2c3c16aaf6 ("reuseport, bpf: add test case for bpf_get_numa_node_id") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * | | | | | | selftests: bpf: config: enable NET_SCH_INGRESS for xdp_meta.shAnders Roxell2018-05-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running bpf's selftest test_xdp_meta.sh it fails: ./test_xdp_meta.sh Error: Specified qdisc not found. selftests: test_xdp_meta [FAILED] Need to enable CONFIG_NET_SCH_INGRESS and CONFIG_NET_CLS_ACT to get the test to pass. Fixes: 22c8852624fc ("bpf: improve selftests and add tests for meta pointer") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | | | | | | selftests/net: Add missing config options for PMTU testsStefano Brivio2018-05-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PMTU tests in pmtu.sh need support for VTI, VTI6 and dummy interfaces: add them to config file. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Merge tag 'batadv-net-for-davem-20180524' of git://git.open-mesh.org/linux-mergeDavid S. Miller2018-05-252-18/+68
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - prevent hardif_put call with NULL parameter, by Colin Ian King - Avoid race in Translation Table allocator, by Sven Eckelmann - Fix Translation Table sync flags for intermediate Responses, by Linus Luessing - prevent sending inconsistent Translation Table TVLVs, by Marek Lindner ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | batman-adv: prevent TT request storms by not sending inconsistent TT TLVLsMarek Lindner2018-05-121-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A translation table TVLV changset sent with an OGM consists of a number of headers (one per VLAN) plus the changeset itself (addition and/or deletion of entries). The per-VLAN headers are used by OGM recipients for consistency checks. Said consistency check might determine that a full translation table request is needed to restore consistency. If the TT sender adds per-VLAN headers of empty VLANs into the OGM, recipients are led to believe to have reached an inconsistent state and thus request a full table update. The full table does not contain empty VLANs (due to missing entries) the cycle restarts when the next OGM is issued. Consequently, when the translation table TVLV headers are composed, empty VLANs are to be excluded. Fixes: 21a57f6e7a3b ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
| | * | | | | | | | batman-adv: Fix TT sync flags for intermediate TT responsesLinus Lüssing2018-05-121-10/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous TT sync fix so far only fixed TT responses issued by the target node directly. So far, TT responses issued by intermediate nodes still lead to the wrong flags being added, leading to CRC mismatches. This behaviour was observed at Freifunk Hannover in a 800 nodes setup where a considerable amount of nodes were still infected with 'WI' TT flags even with (most) nodes having the previous TT sync fix applied. I was able to reproduce the issue with intermediate TT responses in a four node test setup and this patch fixes this issue by ensuring to use the per originator instead of the summarized, OR'd ones. Fixes: e9c00136a475 ("batman-adv: fix tt_global_entries flags update") Reported-by: Leonardo Mörlein <me@irrelefant.net> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
| | * | | | | | | | batman-adv: Avoid race in TT TVLV allocator helperSven Eckelmann2018-05-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The functions batadv_tt_prepare_tvlv_local_data and batadv_tt_prepare_tvlv_global_data are responsible for preparing a buffer which can be used to store the TVLV container for TT and add the VLAN information to it. This will be done in three phases: 1. count the number of VLANs and their entries 2. allocate the buffer using the counters from the previous step and limits from the caller (parameter tt_len) 3. insert the VLAN information to the buffer The step 1 and 3 operate on a list which contains the VLANs. The access to these lists must be protected with an appropriate lock or otherwise they might operate on on different entries. This could for example happen when another context is adding VLAN entries to this list. This could lead to a buffer overflow in these functions when enough entries were added between step 1 and 3 to the VLAN lists that the buffer room for the entries (*tt_change) is smaller then the now required extra buffer for new VLAN entries. Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
| | * | | | | | | | batman-adv: don't pass a NULL hard_iface to batadv_hardif_putColin Ian King2018-04-221-1/+1
| | | |_|_|_|_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case where hard_iface is NULL, the error path may pass a null pointer to batadv_hardif_put causing a null pointer dereference error. Avoid this by only calling the function if hard_iface not null. Detected by CoverityScan, CID#1466456 ("Explicit null dereferenced") Fixes: 53dd9a68ba68 ("batman-adv: add multicast flags netlink support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
| * | | | | | | | mlx4_core: allocate ICM memory in page size chunksQing Huang2018-05-251-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a system is under memory presure (high usage with fragments), the original 256KB ICM chunk allocations will likely trigger kernel memory management to enter slow path doing memory compact/migration ops in order to complete high order memory allocations. When that happens, user processes calling uverb APIs may get stuck for more than 120s easily even though there are a lot of free pages in smaller chunks available in the system. Syslog: ... Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task oracle_205573_e:205573 blocked for more than 120 seconds. ... With 4KB ICM chunk size on x86_64 arch, the above issue is fixed. However in order to support smaller ICM chunk size, we need to fix another issue in large size kcalloc allocations. E.g. Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt entry). So we need a 16MB allocation for a table->icm pointer array to hold 2M pointers which can easily cause kcalloc to fail. The solution is to use kvzalloc to replace kcalloc which will fall back to vmalloc automatically if kmalloc fails. Signed-off-by: Qing Huang <qing.huang@oracle.com> Acked-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | enic: set DMA mask to 47 bitGovindarajulu Varadarajan2018-05-241-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 624dbf55a359b ("driver/net: enic: Try DMA 64 first, then failover to DMA") DMA mask was changed from 40 bits to 64 bits. Hardware actually supports only 47 bits. Fixes: 624dbf55a359b ("driver/net: enic: Try DMA 64 first, then failover to DMA") Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ppp: remove the PPPIOCDETACH ioctlEric Biggers2018-05-243-29/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file before f_count has reached 0, which is fundamentally a bad idea. It does check 'f_count < 2', which excludes concurrent operations on the file since they would only be possible with a shared fd table, in which case each fdget() would take a file reference. However, it fails to account for the fact that even with 'f_count == 1' the file can still be linked into epoll instances. As reported by syzbot, this can trivially be used to cause a use-after-free. Yet, the only known user of PPPIOCDETACH is pppd versions older than ppp-2.4.2, which was released almost 15 years ago (November 2003). Also, PPPIOCDETACH apparently stopped working reliably at around the same time, when the f_count check was added to the kernel, e.g. see https://lkml.org/lkml/2002/12/31/83. Also, the current 'f_count < 2' check makes PPPIOCDETACH only work in single-threaded applications; it always fails if called from a multithreaded application. All pppd versions released in the last 15 years just close() the file descriptor instead. Therefore, instead of hacking around this bug by exporting epoll internals to modules, and probably missing other related bugs, just remove the PPPIOCDETACH ioctl and see if anyone actually notices. Leave a stub in place that prints a one-time warning and returns EINVAL. Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ipv4: remove warning in ip_recv_errorWillem de Bruijn2018-05-241-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A precondition check in ip_recv_error triggered on an otherwise benign race. Remove the warning. The warning triggers when passing an ipv6 socket to this ipv4 error handling function. RaceFuzzer was able to trigger it due to a race in setsockopt IPV6_ADDRFORM. --- CPU0 do_ipv6_setsockopt sk->sk_socket->ops = &inet_dgram_ops; --- CPU1 sk->sk_prot->recvmsg udp_recvmsg ip_recv_error WARN_ON_ONCE(sk->sk_family == AF_INET6); --- CPU0 do_ipv6_setsockopt sk->sk_family = PF_INET; This socket option converts a v6 socket that is connected to a v4 peer to an v4 socket. It updates the socket on the fly, changing fields in sk as well as other structs. This is inherently non-atomic. It races with the lockless udp_recvmsg path. No other code makes an assumption that these fields are updated atomically. It is benign here, too, as ip_recv_error cares only about the protocol of the skbs enqueued on the error queue, for which sk_family is not a precise predictor (thanks to another isue with IPV6_ADDRFORM). Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr Fixes: 7ce875e5ecb8 ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error") Reported-by: DaeRyong Jeong <threeearcat@gmail.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net : sched: cls_api: deal with egdev path only if neededOr Gerlitz2018-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When dealing with ingress rule on a netdev, if we did fine through the conventional path, there's no need to continue into the egdev route, and we can stop right there. Not doing so may cause a 2nd rule to be added by the cls api layer with the ingress being the egdev. For example, under sriov switchdev scheme, a user rule of VFR A --> VFR B will end up with two HW rules (1) VF A --> VF B and (2) uplink --> VF B Fixes: 208c0f4b5237 ('net: sched: use tc_setup_cb_call to call per-block callbacks') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | vhost: synchronize IOTLB message with dev cleanupJason Wang2018-05-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DaeRyong Jeong reports a race between vhost_dev_cleanup() and vhost_process_iotlb_msg(): Thread interleaving: CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup) (In the case of both VHOST_IOTLB_UPDATE and VHOST_IOTLB_INVALIDATE) ===== ===== vhost_umem_clean(dev->iotlb); if (!dev->iotlb) { ret = -EFAULT; break; } dev->iotlb = NULL; The reason is we don't synchronize between them, fixing by protecting vhost_process_iotlb_msg() with dev mutex. Reported-by: DaeRyong Jeong <threeearcat@gmail.com> Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Merge tag 'mlx5-fixes-2018-05-24' of ↵David S. Miller2018-05-242-7/+47
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-05-24 This series includes two mlx5 fixes. 1) add FCS data to checksum complete when required, from Eran Ben Elisha. 2) Fix A race in IPSec sandbox QP commands, from Yossi Kuperman. Please pull and let me know if there's any problem. for -stable v4.15 ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | net/mlx5: IPSec, Fix a race between concurrent sandbox QP commandsYossi Kuperman2018-05-241-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sandbox QP Commands are retired in the order they are sent. Outstanding commands are stored in a linked-list in the order they appear. Once a response is received and the callback gets called, we pull the first element off the pending list, assuming they correspond. Sending a message and adding it to the pending list is not done atomically, hence there is an opportunity for a race between concurrent requests. Bind both send and add under a critical section. Fixes: bebb23e6cb02 ("net/mlx5: Accel, Add IPSec acceleration interface") Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Adi Nissim <adin@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | | | | | | | net/mlx5e: When RXFCS is set, add FCS data into checksum calculationEran Ben Elisha2018-05-241-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When RXFCS feature is enabled, the HW do not strip the FCS data, however it is not present in the checksum calculated by the HW. Fix that by manually calculating the FCS checksum and adding it to the SKB checksum field. Add helper function to find the FCS data for all SKB forms (linear, one fragment or more). Fixes: 102722fc6832 ("net/mlx5e: Add support for RXFCS feature flag") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | | | | | | packet: fix reserve calculationWillem de Bruijn2018-05-241-1/+1
| |/ / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b84bbaf7a6c8 ("packet: in packet_snd start writing at link layer allocation") ensures that packet_snd always starts writing the link layer header in reserved headroom allocated for this purpose. This is needed because packets may be shorter than hard_header_len, in which case the space up to hard_header_len may be zeroed. But that necessary padding is not accounted for in skb->len. The fix, however, is buggy. It calls skb_push, which grows skb->len when moving skb->data back. But in this case packet length should not change. Instead, call skb_reserve, which moves both skb->data and skb->tail back, without changing length. Fixes: b84bbaf7a6c8 ("packet: in packet_snd start writing at link layer allocation") Reported-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net/mlx4: Fix irq-unsafe spinlock usageJack Morgenstein2018-05-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | spin_lock/unlock was used instead of spin_un/lock_irq in a procedure used in process space, on a spinlock which can be grabbed in an interrupt. This caused the stack trace below to be displayed (on kernel 4.17.0-rc1 compiled with Lock Debugging enabled): [ 154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected [ 154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G I [ 154.675856] ----------------------------------------------------- [ 154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core] [ 154.700927] and this task is already holding: [ 154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib] [ 154.718028] which would create a new lock dependency: [ 154.723705] (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.} [ 154.731922] but this new dependency connects a SOFTIRQ-irq-safe lock: [ 154.740798] (&(&cq->lock)->rlock){..-.} [ 154.740800] ... which became SOFTIRQ-irq-safe at: [ 154.752163] _raw_spin_lock_irqsave+0x3e/0x50 [ 154.757163] mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib] [ 154.762554] ipoib_tx_poll+0x4a/0xf0 [ib_ipoib] ... to a SOFTIRQ-irq-unsafe lock: [ 154.815603] (&(&qp_table->lock)->rlock){+.+.} [ 154.815604] ... which became SOFTIRQ-irq-unsafe at: [ 154.827718] ... [ 154.827720] _raw_spin_lock+0x35/0x50 [ 154.833912] mlx4_qp_lookup+0x1e/0x50 [mlx4_core] [ 154.839302] mlx4_flow_attach+0x3f/0x3d0 [mlx4_core] Since mlx4_qp_lookup() is called only in process space, we can simply replace the spin_un/lock calls with spin_un/lock_irq calls. Fixes: 6dc06c08bef1 ("net/mlx4: Fix the check in attaching steering rules") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: phy: broadcom: Fix bcm_write_exp()Florian Fainelli2018-05-233-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On newer PHYs, we need to select the expansion register to write with setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior to being migrated to generic code under bcm-phy-lib.c which unfortunately used the older implementation from the BCM54xx days. Fix this by creating an inline stub: bcm_write_exp_sel() which adds the correct value (MII_BCM54XX_EXP_SEL_ER) and update both the Cygnus PHY and BCM7xxx PHY drivers which require setting these bits. broadcom.c is unchanged because some PHYs even use a different selector method, so let them specify it directly (e.g: SerDes secondary selector). Fixes: a1cba5613edf ("net: phy: Add Broadcom phy library for common interfaces") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: phy: broadcom: Fix auxiliary control register readsFlorian Fainelli2018-05-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are currently doing auxiliary control register reads with the shadow register value 0b111 (0x7) which incidentally is also the selector value that should be present in bits [2:0]. Fix this by using the appropriate selector mask which is defined (MII_BCM54XX_AUXCTL_SHDWSEL_MASK). This does not have a functional impact yet because we always access the MII_BCM54XX_AUXCTL_SHDWSEL_MISC (0x7) register in the current code. This might change at some point though. Fixes: 5b4e29005123 ("net: phy: broadcom: add bcm54xx_auxctl_read") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: ipv4: add missing RTA_TABLE to rtm_ipv4_policyRoopa Prabhu2018-05-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase messageColin Ian King2018-05-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trivial fix to spelling mistake in mlx4_dbg debug message and also change the phrasing of the message so that is is more readable Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ibmvnic: Only do H_EOI for mobility eventsNathan Fontenot2018-05-231-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When enabling the sub-CRQ IRQ a previous update sent a H_EOI prior to the enablement to clear any pending interrupts that may be present across a partition migration. This fixed a firmware bug where a migration could erroneously indicate that a H_EOI was pending. The H_EOI should only be sent when enabling during a mobility event though. Doing so at other time could wrong and can produce extra driver output when IRQs are enabled when doing TX completion. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Merge tag 'wireless-drivers-for-davem-2018-05-22' of ↵David S. Miller2018-05-233-7/+7
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.17 Hopefully the last fixes for 4.17. ssb is again causing problems so we had to revert a commit and fix it better. Also a small fix to bcma and some MAINTAINERS file updates. ssb * fix regression with all module PCI cards, for example using b43 and b44 drivers * try again fixing a MIPS linker error bcma * fix truncated info log messages ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | MAINTAINERS: change Kalle as wcn36xx maintainerKalle Valo2018-05-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eugene hasn't worked on wcn36xx for some time now. Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| | * | | | | | | | MAINTAINERS: change Kalle as ath.ko maintainerKalle Valo2018-05-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Luis hasn't worked on ath.ko for some time now. Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| | * | | | | | | | MAINTAINERS: update Kalle's email addressKalle Valo2018-05-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I switched to use my codeaurora.org address. Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| | * | | | | | | | ssb: make SSB_PCICORE_HOSTMODE depend on SSB = yRafał Miłecki2018-05-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SSB_PCICORE_HOSTMODE protects MIPS specific code that calls not exported symbols pcibios_enable_device and register_pci_controller. This code is supposed to be compiled only with ssb builtin. This fixes: ERROR: "pcibios_enable_device" [drivers/ssb/ssb.ko] undefined! ERROR: "register_pci_controller" [drivers/ssb/ssb.ko] undefined! make[1]: *** [scripts/Makefile.modpost:92: __modpost] Error 1 Reported-by: Matt Redfearn <matt.redfearn@mips.com> Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| | * | | | | | | | Revert "ssb: Prevent build of PCI host features in module"Rafał Miłecki2018-05-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 882164a4a928bcaa53280940436ca476e6b1db8e. Above commit added "SSB = y" dependency to the wrong symbol SSB_DRIVER_PCICORE_POSSIBLE and prevented SSB_DRIVER_PCICORE from being selected when needed. PCI core driver for core running in clienthost mode is important for bus initialization. It's perfectly valid scenario to have ssb built as module and use it with buses on PCI cards. This fixes regression that affected all *module* users with PCI cards. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1572349 Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| | * | | | | | | | bcma: fix buffer size caused crash in bcma_core_mips_print_irq()Rafał Miłecki2018-05-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Used buffer wasn't big enough to hold whole strings. Example output of this function is: [ 0.180892] bcma: bus0: core 0x0800, irq: 2(S)* 3 4 5 6 D I [ 0.180948] bcma: bus0: core 0x0812, irq: 2(S) 3* 4 5 6 D I [ 0.180998] bcma: bus0: core 0x082d, irq: 2(S) 3 4* 5 6 D I [ 0.181046] bcma: bus0: core 0x082c, irq: 2(S) 3 4 5 6 D I* which means we need to store up to 24 chars. Fixes: 758f7e06063a8 ("bcma: Use bcma_debug and not pr_cont in MIPS driver") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * | | | | | | | | tuntap: correctly set SOCKWQ_ASYNC_NOSPACEJason Wang2018-05-231-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When link is down, writes to the device might fail with -EIO. Userspace needs an indication when the status is resolved. As a fix, tun_net_open() attempts to wake up writers - but that is only effective if SOCKWQ_ASYNC_NOSPACE has been set in the past. This is not the case of vhost_net which only poll for EPOLLOUT after it meets errors during sendmsg(). This patch fixes this by making sure SOCKWQ_ASYNC_NOSPACE is set when socket is not writable or device is down to guarantee EPOLLOUT will be raised in either tun_chr_poll() or tun_sock_write_space() after device is up. Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Eric Dumazet <edumazet@google.com> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | Merge branch 'virtio_net-mergeable-XDP'David S. Miller2018-05-231-11/+10
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason Wang says: ==================== Fix several issues of virtio-net mergeable XDP Please review the patches that tries to fix several issues of virtio-net mergeable XDP. Changes from V1: - check against 1 before decreasing instead of resetting to 1 - typoe fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | virtio-net: fix leaking page for gso packet during mergeable XDPJason Wang2018-05-231-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to drop refcnt to xdp_page if we see a gso packet. Otherwise it will be leaked. Fixing this by moving the check of gso packet above the linearizing logic. While at it, remove useless comment as well. Cc: John Fastabend <john.fastabend@gmail.com> Fixes: 72979a6c3590 ("virtio_net: xdp, add slowpath case for non contiguous buffers") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | virtio-net: correctly check num_buf during err pathJason Wang2018-05-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we successfully linearize the packet, num_buf will be set to zero which may confuse error handling path which assumes num_buf is at least 1 and this can lead the code tries to pop the descriptor of next buffer. Fixing this by checking num_buf against 1 before decreasing. Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | virtio-net: correctly transmit XDP buff after linearizingJason Wang2018-05-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should not go for the error path after successfully transmitting a XDP buffer after linearizing. Since the error path may try to pop and drop next packet and increase the drop counters. Fixing this by simply drop the refcnt of original page and go for xmit path. Fixes: 72979a6c3590 ("virtio_net: xdp, add slowpath case for non contiguous buffers") Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | virtio-net: correctly redirect linearized packetJason Wang2018-05-231-1/+1
| |/ / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After a linearized packet was redirected by XDP, we should not go for the err path which will try to pop buffers for the next packet and increase the drop counter. Fixing this by just drop the page refcnt for the original page. Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT") Reported-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | Merge tag 'mac80211-for-davem-2018-05-23' of ↵David S. Miller2018-05-235-8/+12
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A handful of fixes: * hwsim radio dump wasn't working for the first radio * mesh was updating statistics incorrectly * a netlink message allocation was possibly too short * wiphy name limit was still too long * in certain cases regdb query could find a NULL pointer ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | mac80211_hwsim: Fix radio dump for radio idx 0Andrew Zaborowski2018-05-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 6335698e24ec11e1324b916177da6721df724dd8 the radio with idx of 0 will not get dumped in HWSIM_CMD_GET_RADIO because of the last_idx checks. Offset cb->args[0] by 1 similarly to what is done in nl80211.c. Fixes: 6335698e24ec ("mac80211_hwsim: add generation count for netlink dump operation") Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>