linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	bpf: get rid of pure_initcall dependency to enable jits	Daniel Borkmann	2018-01-19	13	-42/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in future where JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow for setting unspecified garbage values on them. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
*	bpf: add couple of test cases for div/mod by zero	Daniel Borkmann	2018-01-19	1	-0/+87
\| \| \| \| \| \| \| \| \| \|	Add couple of missing test cases for eBPF div/mod by zero to the new test_verifier prog runtime feature. Also one for an empty prog and only exit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
*	bpf: add couple of test cases for signed extended imms	Daniel Borkmann	2018-01-19	1	-0/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a couple of test cases for interpreter and JIT that are related to an issue we faced some time ago in Cilium [1], which is fixed in LLVM with commit e53750e1e086 ("bpf: fix bug on silently truncating 64-bit immediate"). Test cases were run-time checking kernel to behave as intended which should also provide some guidance for current or new JITs in case they should trip over this. Added for cBPF and eBPF. [1] https://github.com/cilium/cilium/pull/2162 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
*	bpf: add csum_diff helper to xdp as well	Daniel Borkmann	2018-01-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Useful for porting cls_bpf programs w/o increasing program complexity limits much at the same time, so add the helper to XDP as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
*	bpf, verifier: detect misconfigured mem, size argument pair	Daniel Borkmann	2018-01-19	1	-24/+55
\| \| \| \| \| \| \| \| \| \| \| \| \|	I've seen two patch proposals now for helper additions that used ARG_PTR_TO_MEM or similar in reg_X but no corresponding ARG_CONST_SIZE in reg_X+1. Verifier won't complain in such case, but it will omit verifying the memory passed to the helper thus ending up badly. Detect such buggy helper function signature and bail out during verification rather than finding them through review. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
*	samples/bpf: xdp_monitor include cpumap tracepoints in monitoring	Jesper Dangaard Brouer	2018-01-20	2	-67/+443
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The xdp_redirect_cpu sample have some "builtin" monitoring of the tracepoints for xdp_cpumap_, but it is practical to have an external tool that can monitor these transpoint as an easy way to troubleshoot an application using XDP + cpumap. Specifically I need such external tool when working on Suricata and XDP cpumap redirect. Extend the xdp_monitor tool sample with monitoring of these xdp_cpumap_ tracepoints. Model the output format like xdp_redirect_cpu. Given I needed to handle per CPU decoding for cpumap, this patch also add per CPU info on the existing monitor events. This resembles part of the builtin monitoring output from sample xdp_rxq_info. Thus, also covering part of that sample in an external monitoring tool. Performance wise, the cpumap tracepoints uses bulking, which cause them to have very little overhead. Thus, they are enabled by default. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	Merge branch 'bpf-lpm-get-next-key'	Daniel Borkmann	2018-01-19	2	-2/+215
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yonghong Song says: ==================== This patch set implements MAP_GET_NEXT_KEY command for LPM_TRIE map. This command is really useful for key enumeration, and for key deletion if what keys in the trie are unknown. Patch #1 implements the functionality in the kernel and patch #2 adds a test case in tools/testing/selftests/bpf. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	tools/bpf: add a testcase for MAP_GET_NEXT_KEY command of LPM_TRIE map	Yonghong Song	2018-01-19	1	-0/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A test case is added in tools/testing/selftests/bpf/test_lpm_map.c for MAP_GET_NEXT_KEY command. A four node trie, which is described in kernel/bpf/lpm_trie.c, is built and the MAP_GET_NEXT_KEY results are checked. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map	Yonghong Song	2018-01-19	1	-2/+93
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current LPM_TRIE map type does not implement MAP_GET_NEXT_KEY command. This command is handy when users want to enumerate keys. Otherwise, a different map which supports key enumeration may be required to store the keys. If the map data is sparse and all map data are to be deleted without closing file descriptor, using MAP_GET_NEXT_KEY to find all keys is much faster than enumerating all key space. This patch implements MAP_GET_NEXT_KEY command for LPM_TRIE map. If user provided key pointer is NULL or the key does not have an exact match in the trie, the first key will be returned. Otherwise, the next key will be returned. In this implemenation, key enumeration follows a postorder traversal of internal trie. More specific keys will be returned first than less specific ones, given a sequence of MAP_GET_NEXT_KEY syscalls. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	selftests: bpf: update .gitignore with missing generated files	Shuah Khan	2018-01-19	1	-0/+7
\| \| \| \| \| \| \|	Update .gitignore with missing generated files. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	bpftool: recognize BPF_MAP_TYPE_CPUMAP maps	Roman Gushchin	2018-01-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add BPF_MAP_TYPE_CPUMAP map type to the list of map type recognized by bpftool and define corresponding text representation. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	Merge branch 'bpf-array-map-offload-and-tests'	Daniel Borkmann	2018-01-18	14	-48/+583
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jakub Kicinski says: ==================== This set brings in the rest of map offload code held up by urgent fixes and improvements to the BPF arrays. The first 3 patches take care of array map offload, similarly to hash maps the attribute validation is split out to a separate map op, and used for both offloaded and non-offloaded case (allocation only happens if map is on the host). Offload support comes down to allowing this map type through the offload check in the core. NFP driver also rejects the delete operation in case of array maps. Subsequent patches add reporting of target device in a very similar way target device of programs is reported (ifindex+netns dev/ino). Netdevsim is extended with a trivial map implementation allowing us to test the offload in test_offload.py. Last patch adds a small busy wait to NFP map IO, this improves the response times which is especially useful for map dumps. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	nfp: bpf: add short busy wait for FW replies	Jakub Kicinski	2018-01-18	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scheduling out and in for every FW message can slow us down unnecessarily. Our experiments show that even under heavy load the FW responds to 99.9% messages within 200 us. Add a short busy wait before entering the wait queue. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	selftest/bpf: extend the offload test with map checks	Jakub Kicinski	2018-01-18	3	-25/+218
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check map device information is reported correctly, and perform basic map operations. Check device destruction gets rid of the maps and map allocation failure path by telling netdevsim to reject map offload via DebugFS. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	netdevsim: bpf: support fake map offload	Jakub Kicinski	2018-01-18	2	-0/+249
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add to netdevsim ability to pretend it's offloading BPF maps. We only allow allocation of tiny 2 entry maps, to keep things simple. Mutex lock may seem heavy for the operations we perform, but we want to make sure callbacks can sleep. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	tools: bpftool: report device information for offloaded maps	Jakub Kicinski	2018-01-18	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Print the information about device on which map is created. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: offload: report device information about offloaded maps	Jakub Kicinski	2018-01-18	5	-0/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tell user space about device on which the map was created. Unfortunate reality of user ABI makes sharing this code with program offload difficult but the information is the same. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: offload: allow array map offload	Jakub Kicinski	2018-01-18	2	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The special handling of different map types is left to the driver. Allow offload of array maps by simply adding it to accepted types. For nfp we have to make sure array elements are not deleted. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: arraymap: use bpf_map_init_from_attr()	Jakub Kicinski	2018-01-18	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Arraymap was not converted to use bpf_map_init_from_attr() to avoid merge conflicts with emergency fixes. Do it now. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: arraymap: move checks out of alloc function	Jakub Kicinski	2018-01-18	1	-14/+28
\|/ \| \| \| \| \| \| \| \| \|	Use the new callback to perform allocation checks for array maps. The fd maps don't need a special allocation callback, they only need a special check callback. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	Merge branch 'bpf-improve-test-verifier-coverage'	Daniel Borkmann	2018-01-18	3	-1/+52
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Alexei Starovoitov says: ==================== BPF verifier has 700+ tests used to check correctness of the verifier. Beyond checking the verifier log tell kernel to run accepted programs as well via bpf_prog_test_run() command. That improves quality of the tests and increases bpf test coverage. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	selftests/bpf: make test_verifier run most programs	Alexei Starovoitov	2018-01-18	1	-1/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to improve test coverage make test_verifier run all successfully loaded programs on 64-byte zero initialized data. For clsbpf and xdp it means empty 64-byte packet. For lwt and socket_filters it's 64-byte packet where skb->data points after L2. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: allow socket_filter programs to use bpf_prog_test_run	Alexei Starovoitov	2018-01-18	2	-0/+3
\|/ \| \| \| \| \| \| \| \| \| \|	in order to improve test coverage allow socket_filter program type to be run via bpf_prog_test_run command. Since such programs can be loaded by non-root tighten permissions for bpf_prog_test_run to be root only to avoid surprises. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	bpf: Sync kernel ABI header with tooling header	Jesper Dangaard Brouer	2018-01-18	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \|	Update tools/include/uapi/linux/bpf.h to bring it in sync with include/uapi/linux/bpf.h. The listed commits forgot to update it. Fixes: 02dd3291b2f0 ("bpf: finally expose xdp_rxq_info to XDP bpf-programs") Fixes: f19397a5c656 ("bpf: Add access to snd_cwnd and others in sock_ops") Fixes: 06ef0ccb5a36 ("bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	tools/bpf_jit_disasm: silence a static checker warning	Dan Carpenter	2018-01-18	1	-3/+4
\| \| \| \| \| \| \| \| \|	There is a static checker warning that "proglen" has an upper bound but no lower bound. The allocation will just fail harmlessly so it's not a big deal. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	bpf: add comments to BPF ld/ldx sizes	Jesper Dangaard Brouer	2018-01-18	2	-4/+5
\| \| \| \| \| \| \| \| \|	Doc BPF ld/ldx size defines as comments in code, as it makes in faster to lookup in a programming/review setting, than looking up the sizes in Documentation/networking/filter.txt. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	bpf: change fake_ip for bpf_trace_printk helper	Yonghong Song	2018-01-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, for bpf_trace_printk helper, fake ip address 0x1 is used with comments saying that fake ip will not be printed. This is indeed true for 4.12 and earlier version, but for 4.13 and later version, the ip address will be printed if it cannot be resolved with kallsym. Running samples/bpf/tracex5 program and you will have the following in the debugfs trace_pipe output: ... <...>-1819 [003] .... 443.497877: 0x00000001: mmap <...>-1819 [003] .... 443.498289: 0x00000001: syscall=102 (one of get/set uid/pid/gid) ... The kernel commit changed this behavior is: commit feaf1283d11794b9d518fcfd54b6bf8bee1f0b4b Author: Steven Rostedt (VMware) <rostedt@goodmis.org> Date: Thu Jun 22 17:04:55 2017 -0400 tracing: Show address when function names are not found ... This patch changed the comment and also altered the fake ip address to 0x0 as users may think 0x1 has some special meaning while it doesn't. The new output: ... <...>-1799 [002] .... 25.953576: 0: mmap <...>-1799 [002] .... 25.953865: 0: read(fd=0, buf=00000000053936b5, size=512) ... Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	samples/bpf: xdp2skb_meta comment explain why pkt-data pointers are invalidated	Jesper Dangaard Brouer	2018-01-18	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \|	Improve the 'unknown reason' comment, with an actual explaination of why the ctx pkt-data pointers need to be loaded after the helper function bpf_xdp_adjust_meta(). Based on the explaination Daniel gave. Fixes: 36e04a2d78d9 ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	Merge branch 'bpf-dump-and-disasm-nfp-jit'	Daniel Borkmann	2018-01-18	8	-17/+154
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jakub Kicinski says: ==================== Jiong says: Currently bpftool could disassemble host jited image, for example x86_64, using libbfd. However it couldn't disassemble offload jited image. There are two reasons: 1. bpf_obj_get_info_by_fd/struct bpf_prog_info couldn't get the address of jited image and image's length. 2. Even after issue 1 resolved, bpftool couldn't figure out what is the offload arch from bpf_prog_info, therefore can't drive libbfd disassembler correctly. This patch set resolve issue 1 by introducing two new fields "jited_len" and "jited_image" in bpf_dev_offload. These two fields serve as the generic interface to communicate the jited image address and length for all offload targets to higher level caller. For example, bpf_obj_get_info_by_fd could use them to fill the userspace visible fields jited_prog_len and jited_prog_insns. This patch set resolve issue 2 by getting bfd backend name through "ifindex", i.e network interface index. v1: - Deduct bfd arch name through ifindex, i.e network interface index. First, map ifindex to devname through ifindex_to_name_ns, then get pci id through /sys/class/dev/DEVNAME/device/vendor. (Daniel, Alexei) ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	tools: bpftool: improve architecture detection by using ifindex	Jiong Wang	2018-01-18	4	-3/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current architecture detection method in bpftool is designed for host case. For offload case, we can't use the architecture of "bpftool" itself. Instead, we could call the existing "ifindex_to_name_ns" to get DEVNAME, then read pci id from /sys/class/dev/DEVNAME/device/vendor, finally we map vendor id to bfd arch name which will finally be used to select bfd backend for the disassembler. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	nfp: bpf: set new jit info fields	Jiong Wang	2018-01-18	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch set those new jit info fields introduced in this patch set. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
\| *	bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info	Jiong Wang	2018-01-18	3	-13/+43
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For host JIT, there are "jited_len"/"bpf_func" fields in struct bpf_prog used by all host JIT targets to get jited image and it's length. While for offload, targets are likely to have different offload mechanisms that these info are kept in device private data fields. Therefore, BPF_OBJ_GET_INFO_BY_FD syscall needs an unified way to get JIT length and contents info for offload targets. One way is to introduce new callback to parse device private data then fill those fields in bpf_prog_info. This might be a little heavy, the other way is to add generic fields which will be initialized by all offload targets. This patch follow the second approach to introduce two new fields in struct bpf_dev_offload and teach bpf_prog_get_info_by_fd about them to fill correct jited_prog_len and jited_prog_insns in bpf_prog_info. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
*	Merge tag 'linux-can-next-for-4.16-20180116' of ↵	David S. Miller	2018-01-17	6	-67/+202
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2018-01-16 this is a pull request for net-next/master consisting of 9 patches. This is a series of patches, some of them initially by Franklin S Cooper Jr, which was picked up by Faiz Abbas. Faiz Abbas added some patches while working on this series, I contributed one as well. The first two patches add support to CAN device infrastructure to limit the bitrate of a CAN adapter if the used CAN-transceiver has a certain maximum bitrate. The remaining patches improve the m_can driver. They add support for bitrate limiting to the driver, clean up the driver and add support for runtime PM. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	can: m_can: Add call to of_can_transceiver	Franklin S Cooper Jr	2018-01-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add call to new generic functions that provides support via a binding to limit the arbitration rate and/or data rate imposed by the physical transceiver connected to the MCAN peripheral. Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: m_can: Add PM Support	Faiz Abbas	2018-01-16	1	-38/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for CONFIG_PM which is the new way to handle managing clocks. Move the clock management to pm_runtime_resume() and pm_runtime_suspend() callbacks for the driver. CONFIG_PM is required by OMAP based devices to handle clock management. Therefore, this allows future Texas Instruments SoCs that have the MCAN IP to work with this driver. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: m_can: get rid of function free_m_can_dev()	Marc Kleine-Budde	2018-01-16	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the previous patch removed alloc_m_can_dev(), let's get rid of the corresponding free_m_can_dev() and call free_candev() directly. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: m_can: Move allocation of net device to probe	Faiz Abbas	2018-01-16	1	-17/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the version no longer required to allocate the net device, it can be moved to probe and the alloc_m_can_dev() function can be simplified. Therefore, move the allocation of net device to probe and change alloc_m_can_dev() to setup_m_can_dev(). Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: m_can: Remove check for version when allocating m_can net device	Faiz Abbas	2018-01-16	1	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the m_can version is used to set the tx_fifo_count to 1 when allocating the net device. However, this is redundant as a value of 1 for the tx_fifo_count needs to be provided in the bosch,mram-cfg property of the device tree node anyway. Therefore, remove check for version when allocating the net device. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: m_can: Support higher speed CAN-FD bitrates	Franklin S Cooper Jr	2018-01-16	1	-3/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During test transmitting using CAN-FD at high bitrates (> 2 Mbps) would fail. Scoping the signals I noticed that only a single bit was being transmitted and with a bit more investigation realized the actual MCAN IP would go back to initialization mode automatically. It appears this issue is due to the MCAN needing to use the Transmitter Delay Compensation Mode with the correct value for the transmitter delay compensation offset (tdco). What impacts the tdco value isn't 100% clear but to calculate it you use an equation defined in the MCAN User's Guide. The user guide mentions that this register needs to be set based on clock values, secondary sample point and the data bitrate. One of the key variables that can't automatically be determined is the secondary sample point (ssp). This ssp is similar to the sp but is specific to this transmitter delay compensation mode. The guidelines for configuring ssp is rather vague but via some CAN test it appears for DRA76x that putting the value same as data sampling point works. The CAN-CIA's "Bit Time Requirements for CAN FD" paper presented at the International CAN Conference 2013 indicates that this TDC mode is only needed for data bit rates above 2.5 Mbps. Therefore, only enable this mode when the data bit rate is above 2.5 Mbps. Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	dt-bindings: can: m_can: Document new can transceiver binding	Franklin S Cooper Jr	2018-01-16	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add information regarding can-transceiver binding. This is especially important for MCAN since the IP allows CAN FD mode to run significantly faster than what most transceivers are capable of. Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: dev: Add support for limiting configured bitrate	Franklin S Cooper Jr	2018-01-16	3	-1/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various CAN or CAN-FD IP may be able to run at a faster rate than what the transceiver the CAN node is connected to. This can lead to unexpected errors. However, CAN transceivers typically have fixed limitations and provide no means to discover these limitations at runtime. Therefore, add support for a can-transceiver node that can be reused by other CAN peripheral drivers to determine for both CAN and CAN-FD what the max bitrate that can be used. If the user tries to configure CAN to pass these maximum bitrates it will throw an error. Also add support for reading bitrate_max via the netlink interface. Reviewed-by: Suman Anna <s-anna@ti.com> Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com> [nsekhar@ti.com: fix build error with !CONFIG_OF] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	dt-bindings: can: can-transceiver: Document new binding	Franklin S Cooper Jr	2018-01-16	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add documentation to describe usage of the new can-transceiver binding. This new binding is applicable for any CAN device therefore it exists as its own document. Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* \|	vxlan: Fix trailing semicolon	Luis de Bethencourt	2018-01-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The trailing semicolon is an empty statement that does no operation. It is completely stripped out by the compiler. Removing it since it doesn't do anything. Fixes: 5f35227ea34b ("net: Generalize ndo_gso_check to ndo_features_check") Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	cxgb4: restructure VF mgmt code	Ganesh Goudar	2018-01-17	2	-204/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	restructure the code which adds support for configuring PCIe VF via mgmt netdevice. which was added by commit 7829451c695e ("cxgb4: Add control net_device for configuring PCIe VF") Original work by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: Remove spinlock from get_net_ns_by_id()	Kirill Tkhai	2018-01-17	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	idr_find() is safe under rcu_read_lock() and maybe_get_net() guarantees that net is alive. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: Fix possible race in peernet2id_alloc()	Kirill Tkhai	2018-01-17	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	peernet2id_alloc() is racy without rtnl_lock() as refcount_read(&peer->count) under net->nsid_lock does not guarantee, peer is alive: rcu_read_lock() peernet2id_alloc() .. spin_lock_bh(&net->nsid_lock) .. refcount_read(&peer->count) (!= 0) .. .. put_net() .. cleanup_net() .. for_each_net(tmp) .. spin_lock_bh(&tmp->nsid_lock) .. __peernet2id(tmp, net) == -1 .. .. .. .. __peernet2id_alloc(alloc == true) .. .. .. rcu_read_unlock() .. .. synchronize_rcu() .. kmem_cache_free(net) After the above situation, net::netns_id contains id pointing to freed memory, and any other dereferencing by the id will operate with this freed memory. Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc() is generic interface, and better we fix it before someone really starts use it in wrong context. v2: Don't place refcount_read(&net->count) under net->nsid_lock as suggested by Eric W. Biederman <ebiederm@xmission.com> v3: Rebase on top of net-next Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'tun-allow-to-attach-eBPF-filter'	David S. Miller	2018-01-17	2	-20/+51
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jason Wang says: ==================== tun: allow to attach eBPF filter This series tries to implement eBPF socket filter for tun. This could be used for implementing efficient virtio-net receive filter for vhost-net. Changes from V2: - fix typo - remove unnecessary double check ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tun: allow to attach ebpf socket filter	Jason Wang	2018-01-17	2	-4/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows userspace to attach eBPF filter to tun. This will allow to implement VM dataplane filtering in a more efficient way compared to cBPF filter by allowing either qemu or libvirt to attach eBPF filter to tun. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tuntap: rename struct tun_steering_prog to struct tun_prog	Jason Wang	2018-01-17	1	-16/+16
\|/ / \| \| \| \| \| \| \| \| \| \| \| \|	To be reused by other eBPF program other than queue selection. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'	David S. Miller	2018-01-17	17	-322/+1111
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jiri Pirko says: ==================== net: sched: allow qdiscs to share filter block instances Currently the filters added to qdiscs are independent. So for example if you have 2 netdevices and you create ingress qdisc on both and you want to add identical filter rules both, you need to add them twice. This patchset makes this easier and mainly saves resources allowing to share all filters within a qdisc - I call it a "filter block". Also this helps to save resources when we do offload to hw for example to expensive TCAM. So back to the example. First, we create 2 qdiscs. Both will share block number 22. "22" is just an identification: $ tc qdisc add dev ens7 ingress_block 22 ingress ^^^^^^^^^^^^^^^^ $ tc qdisc add dev ens8 ingress_block 22 ingress ^^^^^^^^^^^^^^^^ If we don't specify "block" command line option, no shared block would be created: $ tc qdisc add dev ens9 ingress Now if we list the qdiscs, we will see the block index in the output: $ tc qdisc qdisc ingress ffff: dev ens7 parent ffff:fff1 ingress_block 22 qdisc ingress ffff: dev ens8 parent ffff:fff1 ingress_block 22 qdisc ingress ffff: dev ens9 parent ffff:fff1 To make is more visual, the situation looks like this: ens7 ingress qdisc ens7 ingress qdisc \| \| \| \| +----------> block 22 <----------+ Unlimited number of qdiscs may share the same block. Note that this patchset introduces block sharing support also for clsact qdisc: $ tc qdisc add dev ens10 ingress_block 23 egress_block 24 clsact $ tc qdisc show dev ens10 qdisc clsact ffff: dev ens10 parent ffff:fff1 ingress_block 23 egress_block 24 We can add filter using the block index: $ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop Note we cannot use the qdisc for filter manipulations of shared blocks: $ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop Error: This filter block is shared. Please use the block index to manipulate the filters. We will see the same output if we list filters for ingress qdisc of ens7 and ens8, also for the block 22: $ tc filter show block 22 filter block 22 protocol ip pref 25 flower chain 0 filter block 22 protocol ip pref 25 flower chain 0 handle 0x1 ... $ tc filter show dev ens7 ingress filter block 22 protocol ip pref 25 flower chain 0 filter block 22 protocol ip pref 25 flower chain 0 handle 0x1 ... $ tc filter show dev ens8 ingress filter block 22 protocol ip pref 25 flower chain 0 filter block 22 protocol ip pref 25 flower chain 0 handle 0x1 ... --- v10->v11: - patch 2: - fixed error path when register_pernet_subsys fails pointed out by Cong - patch 9: - rebased on top of the current net-next v9->v10: - patch 7: - fixed ifindex magic in the patch description - userspace patches: - added manpages and patch descriptions v8->v9: - patch "net: sched: add rt netlink message type for block get" was removed, userspace check filter existence using qdisc dump v7->v8: - patch 7: - added comment to ifindex block magic - patch 9: - new patch - patch 10: - base this on the patch that introduces qdisc-generic block index attributes parsing/dumping - patch 13: - rebased on top of current net-next v6->v7: - patch 1: - unsquashed shared block patch that was previously squashed by mistake - fixed error path in block create - freeing chain 0 - patch 2: - new patch - splitted from the previous one as it got accidentaly squashed in the rebasing process in the past - converted to idr extended - removed auto-generating of block indexes. Callers have to explicily tell that the block is shared by passing non-zero block index - fixed error path in block get ext - freeing chain 0 - patch 7: - changed extack message for block index handle as suggested by DaveA - added extack message when block index does not exist - the block ifindex magic is in define and change to 0xffffffff as suggested by Jamal - patch 8: - new patch implementing RTM_GETBLOCK in order to query if the block with some index exists - patch 9: - adjust to the core changes and check block index attributes for being 0 v5->v6: - added patch 6 that introduces block handle v4->v5: - patch 5: - add tracking of binding of devs that are unable to offload and check that before block cbs call. v3->v4: - patch 1: - rebased on top of the current net-next - added some extack strings - patch 3: - rebased on top of the current net-next - patch 5: - propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind caller - patch 7: - rebased on top of the current net-next v2->v3: - removed original patch 1, removing tp->q cls_bpf dependency. Fixed by Jakub in the meantime. - patch 1: - rebased on top of the current net-next - patch 5: - new patch - patch 8: - removed "p_" prefix from block index function args - patch 10: - add tc offload feature handling ==================== Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>