summaryrefslogtreecommitdiffstats
path: root/virt/kvm/coalesced_mmio.c
diff options
context:
space:
mode:
authorIlias Stamatis <ilstam@amazon.com>2024-07-18 20:35:38 +0100
committerSean Christopherson <seanjc@google.com>2024-08-23 09:50:36 -0700
commit92f6d4130497f2bedfaf86a4e46e890cf8983307 (patch)
tree4c8731343339f7054f0db0ec3518d1e1cc210591 /virt/kvm/coalesced_mmio.c
parent47ac09b91befbb6a235ab620c32af719f8208399 (diff)
downloadlinux-92f6d4130497f2bedfaf86a4e46e890cf8983307.tar.gz
linux-92f6d4130497f2bedfaf86a4e46e890cf8983307.tar.bz2
linux-92f6d4130497f2bedfaf86a4e46e890cf8983307.zip
KVM: Fix coalesced_mmio_has_room() to avoid premature userspace exit
The following calculation used in coalesced_mmio_has_room() to check whether the ring buffer is full is wrong and results in premature exits if the start of the valid entries is in the first half of the ring buffer. avail = (ring->first - last - 1) % KVM_COALESCED_MMIO_MAX; if (avail == 0) /* full */ Because negative values are handled using two's complement, and KVM computes the result as an unsigned value, the above will get a false positive if "first < last" and the ring is half-full. The above might have worked as expected in python for example: >>> (-86) % 170 84 However it doesn't work the same way in C. printf("avail: %d\n", (-86) % 170); printf("avail: %u\n", (-86) % 170); printf("avail: %u\n", (-86u) % 170u); Using gcc-11 these print: avail: -86 avail: 4294967210 avail: 0 For illustration purposes, given a 4-bit integer and a ring size of 0xA (unsigned), 0xA == 0x1010 == -6, and thus (-6u % 0xA) == 0. Fix the calculation and allow all but one entries in the buffer to be used as originally intended. Note, KVM's behavior is self-healing to some extent, as KVM will allow the entire buffer to be used if ring->first is beyond the halfway point. In other words, in the unlikely scenario that a use case benefits from being able to coalesce more than 86 entries at once, KVM will still provide such behavior, sometimes. Note #2, the % operator in C is not the modulo operator but the remainder operator. Modulo and remainder operators differ with respect to negative values. But, the relevant values in KVM are all unsigned, so it's a moot point in this case anyway. Note #3, this is almost a pure revert of the buggy commit, plus a READ_ONCE() to provide additional safety. Thue buggy commit justified the change with "it paves the way for making this function lockless", but it's not at all clear what was intended, nor is there any evidence that the buggy code was somehow safer. (a) the fields in question were already accessed locklessly, from the perspective that they could be modified by userspace at any time, and (b) the lock guarding the ring itself was changed, but never dropped, i.e. whatever lockless scheme (SRCU?) was planned never landed. Fixes: 105f8d40a737 ("KVM: Calculate available entries in coalesced mmio ring") Signed-off-by: Ilias Stamatis <ilstam@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20240718193543.624039-2-ilstam@amazon.com [sean: rework changelog to clarify behavior, call out weirdness of buggy commit] Signed-off-by: Sean Christopherson <seanjc@google.com>
Diffstat (limited to 'virt/kvm/coalesced_mmio.c')
-rw-r--r--virt/kvm/coalesced_mmio.c4
1 files changed, 1 insertions, 3 deletions
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 1b90acb6e3fe..184c5c40c9c1 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -43,7 +43,6 @@ static int coalesced_mmio_in_range(struct kvm_coalesced_mmio_dev *dev,
static int coalesced_mmio_has_room(struct kvm_coalesced_mmio_dev *dev, u32 last)
{
struct kvm_coalesced_mmio_ring *ring;
- unsigned avail;
/* Are we able to batch it ? */
@@ -52,8 +51,7 @@ static int coalesced_mmio_has_room(struct kvm_coalesced_mmio_dev *dev, u32 last)
* there is always one unused entry in the buffer
*/
ring = dev->kvm->coalesced_mmio_ring;
- avail = (ring->first - last - 1) % KVM_COALESCED_MMIO_MAX;
- if (avail == 0) {
+ if ((last + 1) % KVM_COALESCED_MMIO_MAX == READ_ONCE(ring->first)) {
/* full */
return 0;
}