| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit d4d6ec6dac07f263f06d847d6f732d6855522845 upstream.
In case of TCA_HHF_NON_HH_WEIGHT or TCA_HHF_QUANTUM is zero,
it would make no progress inside the loop in hhf_dequeue() thus
kernel would get stuck.
Fix this by checking this corner case in hhf_change().
Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
Reported-by: syzbot+bc6297c11f19ee807dc2@syzkaller.appspotmail.com
Reported-by: syzbot+041483004a7f45f1f20a@syzkaller.appspotmail.com
Reported-by: syzbot+55be5f513bed37fc4367@syzkaller.appspotmail.com
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Terry Lam <vtlam@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 32db864d33c21fd70a217ba53cb7224889354ffb upstream.
If sch_hhf fails in its ->init() function (either due to wrong
user-space arguments as below or memory alloc failure of hh_flows) it
will do a null pointer deref of q->hh_flows in its ->destroy() function.
To reproduce the crash:
$ tc qdisc add dev eth0 root hhf quantum 2000000 non_hh_weight 10000000
Crash log:
[ 690.654882] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 690.655565] IP: hhf_destroy+0x48/0xbc
[ 690.655944] PGD 37345067
[ 690.655948] P4D 37345067
[ 690.656252] PUD 58402067
[ 690.656554] PMD 0
[ 690.656857]
[ 690.657362] Oops: 0000 [#1] SMP
[ 690.657696] Modules linked in:
[ 690.658032] CPU: 3 PID: 920 Comm: tc Not tainted 4.13.0-rc6+ #57
[ 690.658525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 690.659255] task: ffff880058578000 task.stack: ffff88005acbc000
[ 690.659747] RIP: 0010:hhf_destroy+0x48/0xbc
[ 690.660146] RSP: 0018:ffff88005acbf9e0 EFLAGS: 00010246
[ 690.660601] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
[ 690.661155] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff821f63f0
[ 690.661710] RBP: ffff88005acbfa08 R08: ffffffff81b10a90 R09: 0000000000000000
[ 690.662267] R10: 00000000f42b7019 R11: ffff880058578000 R12: 00000000ffffffea
[ 690.662820] R13: ffff8800372f6400 R14: 0000000000000000 R15: 0000000000000000
[ 690.663769] FS: 00007f8ae5e8b740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
[ 690.667069] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 690.667965] CR2: 0000000000000000 CR3: 0000000058523000 CR4: 00000000000406e0
[ 690.668918] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 690.669945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 690.671003] Call Trace:
[ 690.671743] qdisc_create+0x377/0x3fd
[ 690.672534] tc_modify_qdisc+0x4d2/0x4fd
[ 690.673324] rtnetlink_rcv_msg+0x188/0x197
[ 690.674204] ? rcu_read_unlock+0x3e/0x5f
[ 690.675091] ? rtnl_newlink+0x729/0x729
[ 690.675877] netlink_rcv_skb+0x6c/0xce
[ 690.676648] rtnetlink_rcv+0x23/0x2a
[ 690.677405] netlink_unicast+0x103/0x181
[ 690.678179] netlink_sendmsg+0x326/0x337
[ 690.678958] sock_sendmsg_nosec+0x14/0x3f
[ 690.679743] sock_sendmsg+0x29/0x2e
[ 690.680506] ___sys_sendmsg+0x209/0x28b
[ 690.681283] ? __handle_mm_fault+0xc7d/0xdb1
[ 690.681915] ? check_chain_key+0xb0/0xfd
[ 690.682449] __sys_sendmsg+0x45/0x63
[ 690.682954] ? __sys_sendmsg+0x45/0x63
[ 690.683471] SyS_sendmsg+0x19/0x1b
[ 690.683974] entry_SYSCALL_64_fastpath+0x23/0xc2
[ 690.684516] RIP: 0033:0x7f8ae529d690
[ 690.685016] RSP: 002b:00007fff26d2d6b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 690.685931] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f8ae529d690
[ 690.686573] RDX: 0000000000000000 RSI: 00007fff26d2d700 RDI: 0000000000000003
[ 690.687047] RBP: ffff88005acbff98 R08: 0000000000000001 R09: 0000000000000000
[ 690.687519] R10: 00007fff26d2d480 R11: 0000000000000246 R12: 0000000000000002
[ 690.687996] R13: 0000000001258070 R14: 0000000000000001 R15: 0000000000000000
[ 690.688475] ? trace_hardirqs_off_caller+0xa7/0xcf
[ 690.688887] Code: 00 00 e8 2a 02 ae ff 49 8b bc 1d 60 02 00 00 48 83
c3 08 e8 19 02 ae ff 48 83 fb 20 75 dc 45 31 f6 4d 89 f7 4d 03 bd 20 02
00 00 <49> 8b 07 49 39 c7 75 24 49 83 c6 10 49 81 fe 00 40 00 00 75 e1
[ 690.690200] RIP: hhf_destroy+0x48/0xbc RSP: ffff88005acbf9e0
[ 690.690636] CR2: 0000000000000000
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 87b60cfacf9f17cf71933c6e33b66e68160af71d upstream.
Dmitry reported uses after free in qdisc code [1]
The problem here is that ops->init() can return an error.
qdisc_create_dflt() then call ops->destroy(),
while qdisc_create() does _not_ call it.
Four qdisc chose to call their own ops->destroy(), assuming their caller
would not.
This patch makes sure qdisc_create() calls ops->destroy()
and fixes the four qdisc to avoid double free.
[1]
BUG: KASAN: use-after-free in mq_destroy+0x242/0x290 net/sched/sch_mq.c:33 at addr ffff8801d415d440
Read of size 8 by task syz-executor2/5030
CPU: 0 PID: 5030 Comm: syz-executor2 Not tainted 4.3.5-smp-DEV #119
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
0000000000000046 ffff8801b435b870 ffffffff81bbbed4 ffff8801db000400
ffff8801d415d440 ffff8801d415dc40 ffff8801c4988510 ffff8801b435b898
ffffffff816682b1 ffff8801b435b928 ffff8801d415d440 ffff8801c49880c0
Call Trace:
[<ffffffff81bbbed4>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffffff81bbbed4>] dump_stack+0x6c/0x98 lib/dump_stack.c:51
[<ffffffff816682b1>] kasan_object_err+0x21/0x70 mm/kasan/report.c:158
[<ffffffff81668524>] print_address_description mm/kasan/report.c:196 [inline]
[<ffffffff81668524>] kasan_report_error+0x1b4/0x4b0 mm/kasan/report.c:285
[<ffffffff81668953>] kasan_report mm/kasan/report.c:305 [inline]
[<ffffffff81668953>] __asan_report_load8_noabort+0x43/0x50 mm/kasan/report.c:326
[<ffffffff82527b02>] mq_destroy+0x242/0x290 net/sched/sch_mq.c:33
[<ffffffff82524bdd>] qdisc_destroy+0x12d/0x290 net/sched/sch_generic.c:953
[<ffffffff82524e30>] qdisc_create_dflt+0xf0/0x120 net/sched/sch_generic.c:848
[<ffffffff8252550d>] attach_default_qdiscs net/sched/sch_generic.c:1029 [inline]
[<ffffffff8252550d>] dev_activate+0x6ad/0x880 net/sched/sch_generic.c:1064
[<ffffffff824b1db1>] __dev_open+0x221/0x320 net/core/dev.c:1403
[<ffffffff824b24ce>] __dev_change_flags+0x15e/0x3e0 net/core/dev.c:6858
[<ffffffff824b27de>] dev_change_flags+0x8e/0x140 net/core/dev.c:6926
[<ffffffff824f5bf6>] dev_ifsioc+0x446/0x890 net/core/dev_ioctl.c:260
[<ffffffff824f61fa>] dev_ioctl+0x1ba/0xb80 net/core/dev_ioctl.c:546
[<ffffffff82430509>] sock_do_ioctl+0x99/0xb0 net/socket.c:879
[<ffffffff82430d30>] sock_ioctl+0x2a0/0x390 net/socket.c:958
[<ffffffff816f3b68>] vfs_ioctl fs/ioctl.c:44 [inline]
[<ffffffff816f3b68>] do_vfs_ioctl+0x8a8/0xe50 fs/ioctl.c:611
[<ffffffff816f41a4>] SYSC_ioctl fs/ioctl.c:626 [inline]
[<ffffffff816f41a4>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:617
[<ffffffff8123e357>] entry_SYSCALL_64_fastpath+0x12/0x17
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 2ccccf5fb43ff62b2b96cc58d95fc0b3596516e4 upstream.
When the bottom qdisc decides to, for example, drop some packet,
it calls qdisc_tree_decrease_qlen() to update the queue length
for all its ancestors, we need to update the backlog too to
keep the stats on root qdisc accurate.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
|
|
|
|
|
|
|
| |
It is available since v3.15-rc5.
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
| |
When I use the following command, eth0 cannot send any packets.
#tc qdisc add dev eth0 root handle 1: hhf limit 1
Because qlen need be smaller than limit, all packets were dropped.
Fix this by qlen *<=* limit.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hhf_change() takes the sch_tree_lock and releases it but misses the
error cases. Fix the missed case here.
To reproduce try a command like this,
# tc qdisc change dev p3p2 root hhf quantum 40960 non_hh_weight 300000
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
nla_nest_end() already has return skb->len, so replace
return skb->len with return nla_nest_end instead().
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the net_random and net_srandom macros and replaces
them with direct calls to the prandom ones. As new commits only seem to
use prandom_u32 there is no use to keep them around.
This change makes it easier to grep for users of prandom_u32.
Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
| |
This is to be compatible with the use of "get_time" (i.e. default
time unit in us) in iproute2 patch for HHF as requested by Stephen.
Signed-off-by: Terry Lam <vtlam@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
This module shouldn't be randomly exporting symbols
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch implements the first size-based qdisc that attempts to
differentiate between small flows and heavy-hitters. The goal is to
catch the heavy-hitters and move them to a separate queue with less
priority so that bulk traffic does not affect the latency of critical
traffic. Currently "less priority" means less weight (2:1 in
particular) in a Weighted Deficit Round Robin (WDRR) scheduler.
In essence, this patch addresses the "delay-bloat" problem due to
bloated buffers. In some systems, large queues may be necessary for
obtaining CPU efficiency, or due to the presence of unresponsive
traffic like UDP, or just a large number of connections with each
having a small amount of outstanding traffic. In these circumstances,
HHF aims to reduce the HoL blocking for latency sensitive traffic,
while not impacting the queues built up by bulk traffic. HHF can also
be used in conjunction with other AQM mechanisms such as CoDel.
To capture heavy-hitters, we implement the "multi-stage filter" design
in the following paper:
C. Estan and G. Varghese, "New Directions in Traffic Measurement and
Accounting", in ACM SIGCOMM, 2002.
Some configurable qdisc settings through 'tc':
- hhf_reset_timeout: period to reset counter values in the multi-stage
filter (default 40ms)
- hhf_admit_bytes: threshold to classify heavy-hitters
(default 128KB)
- hhf_evict_timeout: threshold to evict idle heavy-hitters
(default 1s)
- hhf_non_hh_weight: Weighted Deficit Round Robin (WDRR) weight for
non-heavy-hitters (default 2)
- hh_flows_limit: max number of heavy-hitter flow entries
(default 2048)
Note that the ratio between hhf_admit_bytes and hhf_reset_timeout
reflects the bandwidth of heavy-hitters that we attempt to capture
(25Mbps with the above default settings).
The false negative rate (heavy-hitter flows getting away unclassified)
is zero by the design of the multi-stage filter algorithm.
With 100 heavy-hitter flows, using four hashes and 4000 counters yields
a false positive rate (non-heavy-hitters mistakenly classified as
heavy-hitters) of less than 1e-4.
Signed-off-by: Terry Lam <vtlam@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|