path: root/net/core/flow.c
author Eric Dumazet <eric.dumazet@gmail.com> 2010-05-05 01:07:37 -0700
committer David S. Miller <davem@davemloft.net> 2010-05-05 01:07:37 -0700
commit ec7d2f2cf3a1b76202986519ec4f8ec75b2de232
tree 177c324eb0cf7e687d1bbd10a6add3a7d5979002 /net/core/flow.c
parent 8753d29fd5daf890004a38c80835e1eb3acda394
net: __alloc_skb() speedup

With the following patch I can reach the maximum rate of my pktgen+udpsink
simulator:
- 'old' machine: dual quad core E5450 @3.00GHz
- 64 UDP rx flows (only differ by destination port)
- RPS enabled, NIC interrupts serviced on cpu0
- rps dispatched on 7 other cores (~130.000 IPI per second)
- SLAB allocator (faster than SLUB in this workload)
- tg3 NIC
- 1.080.000 pps without a single drop at NIC level.

The idea is to add two prefetchw() calls in __alloc_skb(), one to prefetch
the first sk_buff cache line, the second to prefetch the shinfo part.

Also use one memset() to initialize all skb_shared_info fields, instead of
setting them one by one, to reduce the number of instructions by using
long word moves.

All skb_shared_info fields before 'dataref' are cleared in __alloc_skb().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
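
The two ideas in the message (prefetch-for-write on memory about to be initialized, and a single memset() over every field placed before 'dataref') can be illustrated with a small userspace sketch. This is not the kernel code: struct shinfo_sketch, struct buf_sketch and the *_sketch functions are hypothetical simplifications, and GCC/Clang's __builtin_prefetch(ptr, 1) is assumed as a stand-in for the kernel's prefetchw().

```c
#include <stddef.h>   /* offsetof, max_align_t */
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memset */

/* Hypothetical, simplified stand-in for skb_shared_info. Every field that
 * must start at zero is placed before 'dataref'. */
struct shinfo_sketch {
    unsigned short nr_frags;
    unsigned short gso_size;
    unsigned short gso_segs;
    unsigned short gso_type;
    void *frag_list;
    int dataref;               /* set explicitly, not by memset() */
    unsigned char frags[64];   /* deliberately left untouched */
};

/* Hypothetical, simplified stand-in for sk_buff. */
struct buf_sketch {
    unsigned char *head;
    unsigned char *data;
    unsigned int len;
    struct shinfo_sketch *shinfo;
};

/* One memset() clears everything before 'dataref' instead of one store
 * per field; the compiler can lower it to a few wide moves. */
void shinfo_init_sketch(struct shinfo_sketch *sh)
{
    memset(sh, 0, offsetof(struct shinfo_sketch, dataref));
    sh->dataref = 1;
}

struct buf_sketch *buf_alloc_sketch(size_t size)
{
    struct buf_sketch *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    /* Assumption: stand-in for prefetchw(); hints that the first cache
     * line of the header is about to be written. */
    __builtin_prefetch(b, 1);

    /* Round the payload size up so the trailing struct is aligned. */
    size_t align = _Alignof(max_align_t);
    size = (size + align - 1) & ~(align - 1);

    unsigned char *data = malloc(size + sizeof(struct shinfo_sketch));
    if (!data) {
        free(b);
        return NULL;
    }
    struct shinfo_sketch *sh = (struct shinfo_sketch *)(data + size);
    /* Second prefetch: the shared-info area at the end of the buffer. */
    __builtin_prefetch(sh, 1);

    memset(b, 0, sizeof(*b));
    b->head = data;
    b->data = data;
    b->shinfo = sh;
    shinfo_init_sketch(sh);
    return b;
}

void buf_free_sketch(struct buf_sketch *b)
{
    if (b) {
        free(b->head);
        free(b);
    }
}
```

A caller would use buf_alloc_sketch(len) and buf_free_sketch() in pairs. The layout is what makes this work: grouping the zero-initialized fields ahead of 'dataref' lets one offsetof()-bounded memset() cover them all while leaving the larger trailing array alone.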
Diffstat (limited to 'net/core/flow.c')
0 files changed, 0 insertions, 0 deletions