block/blk-mq: Don't complete locally if capacities are different

The logic in blk_mq_complete_need_ipi() assumes SMP systems where all CPUs have equal compute capacities and only the LLC can make a difference to perceived performance. This assumption falls apart on HMP systems where the LLC is shared but the CPUs have different capacities. Staying local can then have a big performance impact if the IO request was issued from a CPU with higher capacity but the interrupt is serviced on a lower capacity CPU.

Use the new cpus_equal_capacity() function to check if we need to send an IPI.

Without the patch I see the BLOCK softirq always running on little cores (where the hardirq is serviced). With it I can see it running on all cores.

This was noticed after the topology change [1], where on big.LITTLE systems the LLC is now correctly reported as shared between all cores, whereas in the past it was misrepresented for historical reasons. The change exposed a missing dependency on capacities for such systems, where there can be a big performance difference between the CPUs.

This of course introduced a noticeable change in behavior depending on how the topology is presented, leading to regressions in some workloads, as the performance of the BLOCK softirq on little cores can be noticeably worse on some platforms.

Worth noting that we could have checked for capacities being greater than or equal instead of checking for equality. That would always favour higher performance. We opted for equality to match the performance of the requester without making an assumption that can lead to power trade-offs, which these systems tend to be sensitive about. If the requester would like to run faster, it is better to rely on the scheduler, via some facility, to place the IO requester on a faster core; then, if the interrupt triggers on a CPU with a different capacity, we make sure to match the performance the requester is supposed to run at.

[1] https://lpc.events/event/16/contributions/1342/attachments/962/1883/LPC-2022-Android-MC-Phantom-Domains.pdf

Signed-off-by: Qais Yousef <qyousef@layalina.io>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240223155749.2958009-3-qyousef@layalina.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>
author: Qais Yousef <qyousef@layalina.io> 2024-02-23 15:57:49 +0000
committer: Jens Axboe <axboe@kernel.dk> 2024-02-24 12:48:01 -0700
commit: af550e4c968294398fc76b075f12d51c76caf753 (patch)
tree: b10a42ca5907321da6dfeccffa719722abe3c2d0 /block
parent: b361c9027b4e4159e7bcca4eb64fd26507c19994 (diff)
1 files changed, 3 insertions, 2 deletions
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 45f994c10044..7111bd4180e7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1166,10 +1166,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
 	if (force_irqthreads())
 		return false;
 
-	/* same CPU or cache domain?  Complete locally */
+	/* same CPU or cache domain and capacity?  Complete locally */
 	if (cpu == rq->mq_ctx->cpu ||
 	    (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
-	     cpus_share_cache(cpu, rq->mq_ctx->cpu)))
+	     cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
+	     cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
 		return false;
 
 	/* don't try to IPI to an offline CPU */
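
To illustrate the effect of the added condition, here is a minimal userspace sketch of the "complete locally or send an IPI" decision. Everything in it is an assumption made for illustration: the 8-CPU layout, the capacity values, and the model_* helpers are hypothetical and are not kernel code. The kernel's cpus_share_cache() and cpus_equal_capacity() may be implemented differently, and the force_irqthreads() and offline-CPU checks visible in the hunk above are left out.

```c
#include <stdbool.h>

/* Hypothetical model only; none of these tables or helpers exist in the kernel. */
#define MODEL_NR_CPUS 8

/* Assumed big.LITTLE layout: CPUs 0-3 are little, CPUs 4-7 are big. */
static const unsigned int model_capacity[MODEL_NR_CPUS] = {
	512, 512, 512, 512, 1024, 1024, 1024, 1024
};

/* Assumed topology: every CPU sits behind the same last-level cache. */
static const int model_llc_id[MODEL_NR_CPUS] = { 0, 0, 0, 0, 0, 0, 0, 0 };

/* Stand-in for cpus_share_cache(): true if both CPUs share an LLC. */
static bool model_share_cache(int a, int b)
{
	return model_llc_id[a] == model_llc_id[b];
}

/* Stand-in for cpus_equal_capacity(): true if capacities match exactly. */
static bool model_equal_capacity(int a, int b)
{
	return model_capacity[a] == model_capacity[b];
}

/*
 * irq_cpu:        CPU servicing the hardirq
 * req_cpu:        CPU that submitted the request (rq->mq_ctx->cpu in the patch)
 * same_force:     models QUEUE_FLAG_SAME_FORCE being set on the queue
 * check_capacity: false models the old condition, true the patched one
 *
 * Returns true when the completion should be sent to req_cpu via IPI.
 */
static bool model_need_ipi(int irq_cpu, int req_cpu, bool same_force,
			   bool check_capacity)
{
	/* Same CPU: always complete locally. */
	if (irq_cpu == req_cpu)
		return false;

	/* Same cache domain (and, when checked, same capacity): stay local. */
	if (!same_force && model_share_cache(irq_cpu, req_cpu) &&
	    (!check_capacity || model_equal_capacity(irq_cpu, req_cpu)))
		return false;

	return true;
}
```

In this model, an interrupt on little CPU 0 for a request submitted from big CPU 5 completes locally under the old condition (model_need_ipi(0, 5, false, false) is false) and is sent back to the requester under the patched one (model_need_ipi(0, 5, false, true) is true). Two CPUs of equal capacity, such as 0 and 1, still complete locally either way. Replacing the equality test with a greater-than-or-equal comparison would also keep completions on a big CPU when the requester ran on a little one, which is the power trade-off the commit message chooses to avoid.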