summaryrefslogtreecommitdiffstats
path: root/net/xfrm
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'master' of ↵Jakub Kicinski2020-11-042-7/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== 1) Fix packet receiving of standard IP tunnels when the xfrm_interface module is installed. From Xin Long. 2) Fix a race condition between spi allocating and hash list resizing. From zhuoliang zhang. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: xfrm: fix a race condition during allocing spizhuoliang zhang2020-10-231-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | we found that the following race condition exists in xfrm_alloc_userspi flow: user thread state_hash_work thread ---- ---- xfrm_alloc_userspi() __find_acq_core() /*alloc new xfrm_state:x*/ xfrm_state_alloc() /*schedule state_hash_work thread*/ xfrm_hash_grow_check() xfrm_hash_resize() xfrm_alloc_spi /*hold lock*/ x->id.spi = htonl(spi) spin_lock_bh(&net->xfrm.xfrm_state_lock) /*waiting lock release*/ xfrm_hash_transfer() spin_lock_bh(&net->xfrm.xfrm_state_lock) /*add x into hlist:net->xfrm.state_byspi*/ hlist_add_head_rcu(&x->byspi) spin_unlock_bh(&net->xfrm.xfrm_state_lock) /*add x into hlist:net->xfrm.state_byspi 2 times*/ hlist_add_head_rcu(&x->byspi) 1. a new state x is alloced in xfrm_state_alloc() and added into the bydst hlist in __find_acq_core() on the LHS; 2. on the RHS, state_hash_work thread travels the old bydst and tranfers every xfrm_state (include x) into the new bydst hlist and new byspi hlist; 3. user thread on the LHS gets the lock and adds x into the new byspi hlist again. So the same xfrm_state (x) is added into the same list_hash (net->xfrm.state_byspi) 2 times that makes the list_hash become an inifite loop. To fix the race, x->id.spi = htonl(spi) in the xfrm_alloc_spi() is moved to the back of spin_lock_bh, sothat state_hash_work thread no longer add x which id.spi is zero into the hash_list. Fixes: f034b5d4efdf ("[XFRM]: Dynamic xfrm_state hash table sizing.") Signed-off-by: zhuoliang zhang <zhuoliang.zhang@mediatek.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: interface: fix the priorities for ipip and ipv6 tunnelsXin Long2020-10-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As Nicolas noticed in his case, when xfrm_interface module is installed the standard IP tunnels will break in receiving packets. This is caused by the IP tunnel handlers with a higher priority in xfrm interface processing incoming packets by xfrm_input(), which would drop the packets and return 0 instead when anything wrong happens. Rather than changing xfrm_input(), this patch is to adjust the priority for the IP tunnel handlers in xfrm interface, so that the packets would go to xfrmi's later than the others', as the others' would not drop the packets when the handlers couldn't process them. Note that IPCOMP also defines its own IPIP tunnel handler and it calls xfrm_input() as well, so we must make its priority lower than xfrmi's, which means having xfrmi loaded would still break IPCOMP. We may seek another way to fix it in xfrm_input() in the future. Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Fixes: da9bbf0598c9 ("xfrm: interface: support IPIP and IPIP6 tunnels processing with .cb_handler") FIxes: d7b360c2869f ("xfrm: interface: support IP6IP6 and IP6IP tunnels processing with .cb_handler") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: use new function dev_fetch_sw_netstatsHeiner Kallweit2020-10-131-21/+1
| | | | | | | | | | | | | | | | Simplify the code by using new function dev_fetch_sw_netstats(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/a6b816f4-bbf2-9db0-d59a-7e4e9cc808fe@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | xfrm: use dev_sw_netstats_rx_add()Fabian Frederick2020-10-061-8/+1
| | | | | | | | | | | | | | | | use new helper for netstats settings Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2020-10-053-7/+43
|\| | | | | | | | | | | | | | | | | | | | | Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2020-09-283-7/+43
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-09-28 1) Fix a build warning in ip_vti if CONFIG_IPV6 is not set. From YueHaibing. 2) Restore IPCB on espintcp before handing the packet to xfrm as the information there is still needed. From Sabrina Dubroca. 3) Fix pmtu updating for xfrm interfaces. From Sabrina Dubroca. 4) Some xfrm state information was not cloned with xfrm_do_migrate. Fixes to clone the full xfrm state, from Antony Antony. 5) Use the correct address family in xfrm_state_find. The struct flowi must always be interpreted along with the original address family. This got lost over the years. Fix from Herbert Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * xfrm: Use correct address family in xfrm_state_findHerbert Xu2020-09-251-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct flowi must never be interpreted by itself as its size depends on the address family. Therefore it must always be grouped with its original family value. In this particular instance, the original family value is lost in the function xfrm_state_find. Therefore we get a bogus read when it's coupled with the wrong family which would occur with inter- family xfrm states. This patch fixes it by keeping the original family value. Note that the same bug could potentially occur in LSM through the xfrm_state_pol_flow_match hook. I checked the current code there and it seems to be safe for now as only secid is used which is part of struct flowi_common. But that API should be changed so that so that we don't get new bugs in the future. We could do that by replacing fl with just secid or adding a family field. Reported-by: syzbot+577fbac3145a6eb2e7a5@syzkaller.appspotmail.com Fixes: 48b8d78315bf ("[XFRM]: State selection update to use inner...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * xfrm: clone whole liftime_cur structure in xfrm_do_migrateAntony Antony2020-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we clone state only add_time was cloned. It missed values like bytes, packets. Now clone the all members of the structure. v1->v3: - use memcpy to copy the entire structure Fixes: 80c9abaabf42 ("[XFRM]: Extension for dynamic update of endpoint address(es)") Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * xfrm: clone XFRMA_SEC_CTX in xfrm_do_migrateAntony Antony2020-09-071-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XFRMA_SEC_CTX was not cloned from the old to the new. Migrate this attribute during XFRMA_MSG_MIGRATE v1->v2: - return -ENOMEM on error v2->v3: - fix return type to int Fixes: 80c9abaabf42 ("[XFRM]: Extension for dynamic update of endpoint address(es)") Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * xfrm: clone XFRMA_SET_MARK in xfrm_do_migrateAntony Antony2020-09-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | XFRMA_SET_MARK and XFRMA_SET_MARK_MASK was not cloned from the old to the new. Migrate these two attributes during XFRMA_MSG_MIGRATE Fixes: 9b42c1f179a6 ("xfrm: Extend the output_mark to support input direction and masking.") Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * xfrmi: drop ignore_df check before updating pmtuSabrina Dubroca2020-08-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfrm interfaces currently test for !skb->ignore_df when deciding whether to update the pmtu on the skb's dst. Because of this, no pmtu exception is created when we do something like: ping -s 1438 <dest> By dropping this check, the pmtu exception will be created and the next ping attempt will work. Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * espintcp: restore IP CB before handing the packet to xfrmSabrina Dubroca2020-08-171-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xiumei reported a bug with espintcp over IPv6 in transport mode, because xfrm6_transport_finish expects to find IP6CB data (struct inet6_skb_cb). Currently, espintcp zeroes the CB, but the relevant part is actually preserved by previous layers (first set up by tcp, then strparser only zeroes a small part of tcp_skb_tb), so we can just relocate it to the start of skb->cb. Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm/compat: Translate 32-bit user_policy from sockptrDmitry Safonov2020-09-242-3/+40
| | | | | | | | | | | | | | | | | | | | | | | | Provide compat_xfrm_userpolicy_info translation for xfrm setsocketopt(). Reallocate buffer and put the missing padding for 64-bit message. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm/compat: Add 32=>64-bit messages translatorDmitry Safonov2020-09-243-19/+315
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide the user-to-kernel translator under XFRM_USER_COMPAT, that creates for 32-bit xfrm-user message a 64-bit translation. The translation is afterwards reused by xfrm_user code just as if userspace had sent 64-bit message. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm/compat: Attach xfrm dumps to 64=>32 bit translatorDmitry Safonov2020-09-241-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently nlmsg_unicast() is used by functions that dump structures that can be different in size for compat tasks, see dump_one_state() and dump_one_policy(). The following nlmsg_unicast() users exist today in xfrm: Function | Message can be different | in size on compat -------------------------------------------|------------------------------ xfrm_get_spdinfo() | N xfrm_get_sadinfo() | N xfrm_get_sa() | Y xfrm_alloc_userspi() | Y xfrm_get_policy() | Y xfrm_get_ae() | N Besides, dump_one_state() and dump_one_policy() can be used by filtered netlink dump for XFRM_MSG_GETSA, XFRM_MSG_GETPOLICY. Just as for xfrm multicast, allocate frag_list for compat skb journey down to recvmsg() which will give user the desired skb according to syscall bitness. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm/compat: Add 64=>32-bit messages translatorDmitry Safonov2020-09-242-1/+310
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide the kernel-to-user translator under XFRM_USER_COMPAT, that creates for 64-bit xfrm-user message a 32-bit translation and puts it in skb's frag_list. net/compat.c layer provides MSG_CMSG_COMPAT to decide if the message should be taken from skb or frag_list. (used by wext-core which has also an ABI difference) Kernel sends 64-bit xfrm messages to the userspace for: - multicast (monitor events) - netlink dumps Wire up the translator to xfrm_nlmsg_multicast(). Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm: Provide API to register translator moduleDmitry Safonov2020-09-244-0/+100
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a skeleton for xfrm_compat module and provide API to register it in xfrm_state.ko. struct xfrm_translator will have function pointers to translate messages received from 32-bit userspace or to be sent to it from 64-bit kernel. module_get()/module_put() are used instead of rcu_read_lock() as the module will vmalloc() memory for translation. The new API is registered with xfrm_state module, not with xfrm_user as the former needs translator for user_policy set by setsockopt() and xfrm_user already uses functions from xfrm_state. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2020-08-231-1/+1
| | | | | | | | | | | | | | | | | | | | Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* | Merge tag 'locking-urgent-2020-08-10' of ↵Linus Torvalds2020-08-101-5/+5
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Thomas Gleixner: "A set of locking fixes and updates: - Untangle the header spaghetti which causes build failures in various situations caused by the lockdep additions to seqcount to validate that the write side critical sections are non-preemptible. - The seqcount associated lock debug addons which were blocked by the above fallout. seqcount writers contrary to seqlock writers must be externally serialized, which usually happens via locking - except for strict per CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot validate that the lock is held. This new debug mechanism adds the concept of associated locks. sequence count has now lock type variants and corresponding initializers which take a pointer to the associated lock used for writer serialization. If lockdep is enabled the pointer is stored and write_seqcount_begin() has a lockdep assertion to validate that the lock is held. Aside of the type and the initializer no other code changes are required at the seqcount usage sites. The rest of the seqcount API is unchanged and determines the type at compile time with the help of _Generic which is possible now that the minimal GCC version has been moved up. Adding this lockdep coverage unearthed a handful of seqcount bugs which have been addressed already independent of this. While generally useful this comes with a Trojan Horse twist: On RT kernels the write side critical section can become preemtible if the writers are serialized by an associated lock, which leads to the well known reader preempts writer livelock. RT prevents this by storing the associated lock pointer independent of lockdep in the seqcount and changing the reader side to block on the lock when a reader detects that a writer is in the write side critical section. - Conversion of seqcount usage sites to associated types and initializers" * tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) locking/seqlock, headers: Untangle the spaghetti monster locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header x86/headers: Remove APIC headers from <asm/smp.h> seqcount: More consistent seqprop names seqcount: Compress SEQCNT_LOCKNAME_ZERO() seqlock: Fold seqcount_LOCKNAME_init() definition seqlock: Fold seqcount_LOCKNAME_t definition seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g hrtimer: Use sequence counter with associated raw spinlock kvm/eventfd: Use sequence counter with associated spinlock userfaultfd: Use sequence counter with associated spinlock NFSv4: Use sequence counter with associated spinlock iocost: Use sequence counter with associated spinlock raid5: Use sequence counter with associated spinlock vfs: Use sequence counter with associated spinlock timekeeping: Use sequence counter with associated raw spinlock xfrm: policy: Use sequence counters with associated lock netfilter: nft_set_rbtree: Use sequence counter with associated rwlock netfilter: conntrack: Use sequence counter with associated spinlock sched: tasks: Use sequence counter with associated spinlock ...
| * xfrm: policy: Use sequence counters with associated lockAhmed S. Darwish2020-07-291-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A sequence counter write side critical section must be protected by some form of locking to serialize writers. If the serialization primitive is not disabling preemption implicitly, preemption has to be explicitly disabled before entering the sequence counter write side critical section. A plain seqcount_t does not contain the information of which lock must be held when entering a write side critical section. Use the new seqcount_spinlock_t and seqcount_mutex_t data types instead, which allow to associate a lock with the sequence counter. This enables lockdep to verify that the lock used for writer serialization is held when the write side critical section is entered. If lockdep is disabled this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-17-a.darwish@linutronix.de
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2020-08-023-47/+76
|\ \ | | | | | | | | | | | | | | | | | | Resolved kernel/bpf/btf.c using instructions from merge commit 69138b34a7248d2396ab85c8652e20c0c39beaba Signed-off-by: David S. Miller <davem@davemloft.net>
| * \ Merge branch 'master' of ↵David S. Miller2020-07-313-47/+76
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-07-31 1) Fix policy matching with mark and mask on userspace interfaces. From Xin Long. 2) Several fixes for the new ESP in TCP encapsulation. From Sabrina Dubroca. 3) Fix crash when the hold queue is used. The assumption that xdst->path and dst->child are not a NULL pointer only if dst->xfrm is not a NULL pointer is true with the exception of using the hold queue. Fix this by checking for hold queue usage before dereferencing xdst->path or dst->child. 4) Validate pfkey_dump parameter before sending them. From Mark Salyzyn. 5) Fix the location of the transport header with ESP in UDPv6 encapsulation. From Sabrina Dubroca. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * espintcp: count packets dropped in espintcp_rcvSabrina Dubroca2020-07-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, espintcp_rcv drops packets silently, which makes debugging issues difficult. Count packets as either XfrmInHdrError (when the packet was too short or contained invalid data) or XfrmInError (for other issues). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * espintcp: handle short messages instead of breaking the encap socketSabrina Dubroca2020-07-301-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, short messages (less than 4 bytes after the length header) will break the stream of messages. This is unnecessary, since we can still parse messages even if they're too short to contain any usable data. This is also bogus, as keepalive messages (a single 0xff byte), though not needed with TCP encapsulation, should be allowed. This patch changes the stream parser so that short messages are accepted and dropped in the kernel. Messages that contain a valid SPI or non-ESP header are processed as before. Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") Reported-by: Andrew Cagney <cagney@libreswan.org> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * xfrm: policy: fix IPv6-only espintcp compilationSabrina Dubroca2020-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | In case we're compiling espintcp support only for IPv6, we should still initialize the common code. Fixes: 26333c37fc28 ("xfrm: add IPv6 support for espintcp") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * espintcp: recv() should return 0 when the peer socket is closedSabrina Dubroca2020-07-171-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | man 2 recv says: RETURN VALUE When a stream socket peer has performed an orderly shutdown, the return value will be 0 (the traditional "end-of-file" return). Currently, this works for blocking reads, but non-blocking reads will return -EAGAIN. This patch overwrites that return value when the peer won't send us any more data. Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") Reported-by: Andrew Cagney <cagney@libreswan.org> Tested-by: Andrew Cagney <cagney@libreswan.org> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * espintcp: support non-blocking sendsSabrina Dubroca2020-07-171-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, non-blocking sends from userspace result in EOPNOTSUPP. To support this, we need to tell espintcp_sendskb_locked() and espintcp_sendskmsg_locked() that non-blocking operation was requested from espintcp_sendmsg(). Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") Reported-by: Andrew Cagney <cagney@libreswan.org> Tested-by: Andrew Cagney <cagney@libreswan.org> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * xfrm: policy: match with both mark and mask on user interfacesXin Long2020-06-242-30/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list"), it would take 'priority' to make a policy unique, and allow duplicated policies with different 'priority' to be added, which is not expected by userland, as Tobias reported in strongswan. To fix this duplicated policies issue, and also fix the issue in commit ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list"), when doing add/del/get/update on user interfaces, this patch is to change to look up a policy with both mark and mask by doing: mark.v == pol->mark.v && mark.m == pol->mark.m and leave the check: (mark & pol->mark.m) == pol->mark.v for tx/rx path only. As the userland expects an exact mark and mask match to manage policies. v1->v2: - make xfrm_policy_mark_match inline and fix the changelog as Tobias suggested. Fixes: 295fae568885 ("xfrm: Allow user space manipulation of SPD mark") Fixes: ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list") Reported-by: Tobias Brunner <tobias@strongswan.org> Tested-by: Tobias Brunner <tobias@strongswan.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | Merge branch 'master' of ↵David S. Miller2020-07-304-34/+149
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-07-30 Please note that I did the first time now --no-ff merges of my testing branch into the master branch to include the [PATCH 0/n] message of a patchset. Please let me know if this is desirable, or if I should do it any different. 1) Introduce a oseq-may-wrap flag to disable anti-replay protection for manually distributed ICVs as suggested in RFC 4303. From Petr Vaněk. 2) Patchset to fully support IPCOMP for vti4, vti6 and xfrm interfaces. From Xin Long. 3) Switch from a linear list to a hash list for xfrm interface lookups. From Eyal Birger. 4) Fixes to not register one xfrm(6)_tunnel object twice. From Xin Long. 5) Fix two compile errors that were introduced with the IPCOMP support for vti and xfrm interfaces. Also from Xin Long. 6) Make the policy hold queue work with VTI. This was forgotten when VTI was implemented. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | xfrm: Make the policy hold queue work with VTI.Steffen Klassert2020-07-211-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We forgot to support the xfrm policy hold queue when VTI was implemented. This patch adds everything we need so that we can use the policy hold queue together with VTI interfaces. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: interface: use IS_REACHABLE to avoid some compile errorsXin Long2020-07-171-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernel test robot reported some compile errors: ia64-linux-ld: net/xfrm/xfrm_interface.o: in function `xfrmi4_fini': net/xfrm/xfrm_interface.c:900: undefined reference to `xfrm4_tunnel_deregister' ia64-linux-ld: net/xfrm/xfrm_interface.c:901: undefined reference to `xfrm4_tunnel_deregister' ia64-linux-ld: net/xfrm/xfrm_interface.o: in function `xfrmi4_init': net/xfrm/xfrm_interface.c:873: undefined reference to `xfrm4_tunnel_register' ia64-linux-ld: net/xfrm/xfrm_interface.c:876: undefined reference to `xfrm4_tunnel_register' ia64-linux-ld: net/xfrm/xfrm_interface.c:885: undefined reference to `xfrm4_tunnel_deregister' This happened when set CONFIG_XFRM_INTERFACE=y and CONFIG_INET_TUNNEL=m. We don't really want xfrm_interface to depend inet_tunnel completely, but only to disable the tunnel code when inet_tunnel is not seen. So instead of adding "select INET_TUNNEL" for XFRM_INTERFACE, this patch is only to change to IS_REACHABLE to avoid these compile error. Reported-by: kernel test robot <lkp@intel.com> Fixes: da9bbf0598c9 ("xfrm: interface: support IPIP and IPIP6 tunnels processing with .cb_handler") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: interface: not xfrmi_ipv6/ipip_handler twiceXin Long2020-07-141-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we did in the last 2 patches for vti(6), this patch is to define a new xfrm_tunnel object 'xfrmi_ipip6_handler' to register for AF_INET6, and a new xfrm6_tunnel object 'xfrmi_ip6ip_handler' to register for AF_INET. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm interface: store xfrmi contexts in a hash by if_idEyal Birger2020-07-131-9/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfrmi_lookup() is called on every packet. Using a single list for looking up if_id becomes a bottleneck when having many xfrm interfaces. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm interface: avoid xi lookup in xfrmi_decode_session()Eyal Birger2020-07-131-10/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xfrmi context exists in the netdevice priv context. Avoid looking for it in a separate list. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: interface: support IPIP and IPIP6 tunnels processing with .cb_handlerXin Long2020-07-091-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to ip_vti, IPIP and IPIP6 tunnels processing can easily be done with .cb_handler for xfrm interface. v1->v2: - no change. v2-v3: - enable it only when CONFIG_INET_XFRM_TUNNEL is defined, to fix the build error, reported by kbuild test robot. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: interface: support IP6IP6 and IP6IP tunnels processing with .cb_handlerXin Long2020-07-091-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to ip6_vti, IP6IP6 and IP6IP tunnels processing can easily be done with .cb_handler for xfrm interface. v1->v2: - no change. v2-v3: - enable it only when CONFIG_INET6_XFRM_TUNNEL is defined, to fix the build error, reported by kbuild test robot. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: add is_ipip to struct xfrm_input_afinfoXin Long2020-07-091-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is to add a new member is_ipip to struct xfrm_input_afinfo, to allow another group family of callback functions to be registered with is_ipip set. This will be used for doing a callback for struct xfrm(6)_tunnel of ipip/ipv6 tunnels in xfrm_input() by calling xfrm_rcv_cb(), which is needed by ipip/ipv6 tunnels' support in ip(6)_vti and xfrm interface in the next patches. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: introduce oseq-may-wrap flagPetr Vaněk2020-06-241-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC 4303 in section 3.3.3 suggests to disable anti-replay for manually distributed ICVs in which case the sender does not need to monitor or reset the counter. However, the sender still increments the counter and when it reaches the maximum value, the counter rolls over back to zero. This patch introduces new extra_flag XFRM_SA_XFLAG_OSEQ_MAY_WRAP which allows sequence number to cycle in outbound packets if set. This flag is used only in legacy and bmp code, because esn should not be negotiated if anti-replay is disabled (see note in 3.3.3 section). Signed-off-by: Petr Vaněk <pv@excello.cz> Acked-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | | net/xfrm: switch xfrm_user_policy to sockptr_tChristoph Hellwig2020-07-241-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2020-07-111-0/+2
|\ \ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | All conflicts seemed rather trivial, with some guidance from Saeed Mameed on the tc_ct.c one. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: xfrmi: implement header_ops->parse_protocol for AF_PACKETJason A. Donenfeld2020-06-301-0/+2
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xfrm interface uses skb->protocol to determine packet type, and bails out if it's not set. For AF_PACKET injection, we need to support its call chain of: packet_sendmsg -> packet_snd -> packet_parse_headers -> dev_parse_header_protocol -> parse_protocol Without a valid parse_protocol, this returns zero, and xfrmi rejects the skb. So, this wires up the ip_tunnel handler for layer 3 packets for that case. Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2020-06-253-5/+27
|\| | | | | | | | | | | | | | | | | | | | | | | Minor overlapping changes in xfrm_device.c, between the double ESP trailing bug fix setting the XFRM_INIT flag and the changes in net-next preparing for bonding encryption support. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2020-06-193-5/+27
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-06-19 1) Fix double ESP trailer insertion in IPsec crypto offload if netif_xmit_frozen_or_stopped is true. From Huy Nguyen. 2) Merge fixup for "remove output_finish indirection from xfrm_state_afinfo". From Stephen Rothwell. 3) Select CRYPTO_SEQIV for ESP as this is needed for GCM and several other encryption algorithms. Also modernize the crypto algorithm selections for ESP and AH, remove those that are maked as "MUST NOT" and add those that are marked as "MUST" be implemented in RFC 8221. From Eric Biggers. Please note the merge conflict between commit: a7f7f6248d97 ("treewide: replace '---help---' in Kconfig files with 'help'") from Linus' tree and commits: 7d4e39195925 ("esp, ah: consolidate the crypto algorithm selections") be01369859b8 ("esp, ah: modernize the crypto algorithm selections") from the ipsec tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | esp, ah: modernize the crypto algorithm selectionsEric Biggers2020-06-151-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The crypto algorithms selected by the ESP and AH kconfig options are out-of-date with the guidance of RFC 8221, which lists the legacy algorithms MD5 and DES as "MUST NOT" be implemented, and some more modern algorithms like AES-GCM and HMAC-SHA256 as "MUST" be implemented. But the options select the legacy algorithms, not the modern ones. Therefore, modify these options to select the MUST algorithms -- and *only* the MUST algorithms. Also improve the help text. Note that other algorithms may still be explicitly enabled in the kconfig, and the choice of which to actually use is still controlled by userspace. This change only modifies the list of algorithms for which kernel support is guaranteed to be present. Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Suggested-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Corentin Labbe <clabbe@baylibre.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * | esp: select CRYPTO_SEQIVEric Biggers2020-06-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f23efcbcc523 ("crypto: ctr - no longer needs CRYPTO_SEQIV") made CRYPTO_CTR stop selecting CRYPTO_SEQIV. This breaks IPsec for most users since GCM and several other encryption algorithms require "seqiv" -- and RFC 8221 lists AES-GCM as "MUST" be implemented. Just make XFRM_ESP select CRYPTO_SEQIV. Fixes: f23efcbcc523 ("crypto: ctr - no longer needs CRYPTO_SEQIV") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Corentin Labbe <clabbe@baylibre.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * | esp, ah: consolidate the crypto algorithm selectionsEric Biggers2020-06-151-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of duplicating the algorithm selections between INET_AH and INET6_AH and between INET_ESP and INET6_ESP, create new tristates XFRM_AH and XFRM_ESP that do the algorithm selections, and make these be selected by the corresponding INET* options. Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Corentin Labbe <clabbe@baylibre.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * | xfrm: merge fixup for "remove output_finish indirection from xfrm_state_afinfo"Stephen Rothwell2020-06-051-4/+0
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| | * | xfrm: Fix double ESP trailer insertion in IPsec crypto offload.Huy Nguyen2020-06-041-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During IPsec performance testing, we see bad ICMP checksum. The error packet has duplicated ESP trailer due to double validate_xmit_xfrm calls. The first call is from ip_output, but the packet cannot be sent because netif_xmit_frozen_or_stopped is true and the packet gets dev_requeue_skb. The second call is from NET_TX softirq. However after the first call, the packet already has the ESP trailer. Fix by marking the skb with XFRM_XMIT bit after the packet is handled by validate_xmit_xfrm to avoid duplicate ESP trailer insertion. Fixes: f6e27114a60a ("net: Add a xfrm validate function to validate_xmit_skb") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | | bonding/xfrm: use real_dev instead of slave_devJarod Wilson2020-06-231-2/+3
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than requiring every hw crypto capable NIC driver to do a check for slave_dev being set, set real_dev in the xfrm layer and xso init time, and then override it in the bonding driver as needed. Then NIC drivers can always use real_dev, and at the same time, we eliminate the use of a variable name that probably shouldn't have been used in the first place, particularly given recent current events. CC: Boris Pismenny <borisp@mellanox.com> CC: Saeed Mahameed <saeedm@mellanox.com> CC: Leon Romanovsky <leon@kernel.org> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org Suggested-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>