summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
* net: fix network drivers ndo_start_xmit() return values (part 7)Patrick McHardy2009-06-131-1/+2
| | | | | | | | | | | | Fix up ATM drivers that return an errno value to qdisc_restart(), causing qdisc_restart() to print a warning an requeue/retransmit the skb. - lec: condition can only be remedied by userspace, until that retransmissions Compile tested only. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Simplify interface for ATM LANEMichał Mirosław2009-06-115-58/+44
| | | | | | | | | | | | | | | This patch changes FDB entry check for ATM LANE bridge integration. There's no point in holding a FDB entry around SKB building. br_fdb_get()/br_fdb_put() pair are changed into single br_fdb_test_addr() hook that checks if the addr has FDB entry pointing to other port to the one the request arrived on. FDB entry refcounting is removed as it's not used anywhere else. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PATCH] net core: Some interface flags not returned by SIOCGIFFLAGSJohn Dykstra2009-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | Commit b00055aacdb172c05067612278ba27265fcd05ce " [NET] core: add RFC2863 operstate" defined new interface flag values. Its documentation specified that these flags could be accessed from user space via SIOCGIFFLAGS. However, this does not work because the new flags do not fit in that ioctl's argument width. Change the documentation to match the code's behavior. Also change the source to explicitly show the truncation. This _should_ have no effect on executable code, and did not with gcc 4.2.4 generating x86 code. A new ioctl could be defined to return all interface flags to user space. However, since this has been broken for three years with no one complaining, there doesn't seem much need. They are still accessible via netlink. Reported-by: "Fredrik Arnerup" <fredrik.arnerup@edgeware.tv> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-06-1125-581/+1189
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
| * netfilter: ip_tables: fix build errorPatrick McHardy2009-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix build error introduced by commit bb70dfa5 (netfilter: xtables: consolidate comefrom debug cast access): net/ipv4/netfilter/ip_tables.c: In function 'ipt_do_table': net/ipv4/netfilter/ip_tables.c:421: error: 'comefrom' undeclared (first use in this function) net/ipv4/netfilter/ip_tables.c:421: error: (Each undeclared identifier is reported only once net/ipv4/netfilter/ip_tables.c:421: error: for each function it appears in.) Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_ct_tcp: fix up build after mergePatrick McHardy2009-06-111-1/+1
| | | | | | | | | | | | Replace the last occurence of tcp_lock by the per-conntrack lock. Signed-off-by: Patrick McHardy <kaber@trash.net>
| * Merge branch 'master' of ↵Patrick McHardy2009-06-11171-2846/+5357
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
| * | netfilter: nf_conntrack: use per-conntrack locks for protocol dataPatrick McHardy2009-06-106-53/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce per-conntrack locks and use them instead of the global protocol locks to avoid contention. Especially tcp_lock shows up very high in profiles on larger machines. This will also allow to simplify the upcoming reliable event delivery patches. Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xt_socket: added new revision of the 'socket' match supporting flagsLaszlo Attila Toth2009-06-091-11/+52
| | | | | | | | | | | | | | | | | | | | | | | | If the XT_SOCKET_TRANSPARENT flag is set, enabled 'transparent' socket option is required for the socket to be matched. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: passive OS fingerprint xtables matchEvgeniy Polyakov2009-06-083-0/+442
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Passive OS fingerprinting netfilter module allows to passively detect remote OS and perform various netfilter actions based on that knowledge. This module compares some data (WS, MSS, options and it's order, ttl, df and others) from packets with SYN bit set with dynamically loaded OS fingerprints. Fingerprint matching rules can be downloaded from OpenBSD source tree or found in archive and loaded via netfilter netlink subsystem into the kernel via special util found in archive. Archive contains library file (also attached), which was shipped with iptables extensions some time ago (at least when ipt_osf existed in patch-o-matic). Following changes were made in this release: * added NLM_F_CREATE/NLM_F_EXCL checks * dropped _rcu list traversing helpers in the protected add/remove calls * dropped unneded structures, debug prints, obscure comment and check Fingerprints can be downloaded from http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os or can be found in archive Example usage: -d switch removes fingerprints Please consider for inclusion. Thank you. Passive OS fingerprint homepage (archives, examples): http://www.ioremap.net/projects/osf Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_ct_icmp: keep the ICMP ct entries longerJan Kasprzak2009-06-082-24/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current conntrack code kills the ICMP conntrack entry as soon as the first reply is received. This is incorrect, as we then see only the first ICMP echo reply out of several possible duplicates as ESTABLISHED, while the rest will be INVALID. Also this unnecessarily increases the conntrackd traffic on H-A firewalls. Make all the ICMP conntrack entries (including the replied ones) last for the default of nf_conntrack_icmp{,v6}_timeout seconds. Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: ipt_MASQUERADE: remove redundant rwlockFlorian Westphal2009-06-051-11/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lock "protects" an assignment and a comparision of an integer. When the caller of device_cmp() evaluates the result, nat->masq_index may already have been changed (regardless if the lock is there or not). So, the lock either has to be held during nf_ct_iterate_cleanup(), or can be removed. This does the latter. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xt_NFQUEUE: queue balancing supportFlorian Westphal2009-06-051-0/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support for specifying a range of queues instead of a single queue id. Flows will be distributed across the given range. This is useful for multicore systems: Instead of having a single application read packets from a queue, start multiple instances on queues x, x+1, .. x+n. Each instance can process flows independently. Packets for the same connection are put into the same queue. Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com> Signed-off-by: Florian Westphal <fwestphal@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xt_NFQUEUE: use NFPROTO_UNSPECFlorian Westphal2009-06-051-15/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | We can use wildcard matching here, just like ab4f21e6fb1c09b13c4c3cb8357babe8223471bd ("xtables: use NFPROTO_UNSPEC in more extensions"). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: x_tables: added hook number into match extension parameter structure.Evgeniy Polyakov2009-06-043-3/+3
| | | | | | | | | | | | | | | Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: conntrack: replace notify chain by function pointerPablo Neira Ayuso2009-06-033-40/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the notify chain infrastructure and replace it by a simple function pointer. This issue has been mentioned in the mailing list several times: the use of the notify chain adds too much overhead for something that is only used by ctnetlink. This patch also changes nfnetlink_send(). It seems that gfp_any() returns GFP_KERNEL for user-context request, like those via ctnetlink, inside the RCU read-side section which is not valid. Using GFP_KERNEL is also evil since netlink may schedule(), this leads to "scheduling while atomic" bug reports. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: conntrack: simplify event caching systemPablo Neira Ayuso2009-06-026-19/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch simplifies the conntrack event caching system by removing several events: * IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted since the have no clients. * IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter days. * IPCT_REFRESH which is not of any use since we always include the timeout in the messages. After this patch, the existing events are: * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify addition and deletion of entries. * IPCT_STATUS, that notes that the status bits have changes, eg. IPS_SEEN_REPLY and IPS_ASSURED. * IPCT_PROTOINFO, that reports that internal protocol information has changed, eg. the TCP, DCCP and SCTP protocol state. * IPCT_HELPER, that a helper has been assigned or unassigned to this entry. * IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this covers the case when a mark is set to zero. * IPCT_NATSEQADJ, to report that there's updates in the NAT sequence adjustment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: conntrack: don't report events on module removalPablo Neira Ayuso2009-06-022-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | During the module removal there are no possible event listeners since ctnetlink must be removed before to allow removing nf_conntrack. This patch removes the event reporting for the module removal case which is not of any use in the existing code. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: ctnetlink: cleanup message-size calculationPablo Neira Ayuso2009-06-021-62/+40
| | | | | | | | | | | | | | | | | | | | | This patch cleans up the message calculation to make it similar to rtnetlink, moreover, it removes unneeded verbose information. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: ctnetlink: use nlmsg_* helper function to build messagesPablo Neira Ayuso2009-06-021-42/+42
| | | | | | | | | | | | | | | | | | | | | Replaces the old macros to build Netlink messages with the new nlmsg_*() helper functions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: ctnetlink: rename tuple() by nf_ct_tuple() macro definitionPablo Neira Ayuso2009-06-021-7/+6
| | | | | | | | | | | | | | | | | | | | | This patch move the internal tuple() macro definition to the header file as nf_ct_tuple(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: ctnetlink: remove nowait parameter from *fill_info()Pablo Neira Ayuso2009-06-021-14/+10
| | | | | | | | | | | | | | | | | | | | | This patch is a cleanup, it removes the `nowait' parameter from all *fill_info() function since it is always set to one. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: nfnetlink: cleanup for nfnetlink_rcv_msg() functionPablo Neira Ayuso2009-06-021-14/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch cleans up the message handling path in two aspects: * it uses NLMSG_LENGTH() instead of NLMSG_SPACE() like rtnetlink does in this case to check if there is enough room for the Netlink/nfnetlink headers. No need to check for the padding room. * it removes a redundant header size checking that has been already do at the beginning of the function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: nf_ct_tcp: TCP simultaneous open supportJozsef Kadlecsik2009-06-021-37/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch below adds supporting TCP simultaneous open to conntrack. The unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the second SYN sent from the reply direction in the new case. The state table is updated and the function tcp_in_window is modified to handle simultaneous open. The functionality can fairly easily be tested by socat. A sample tcpdump recording 23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7> 23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0 23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1> 23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7> 23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423> and the corresponding netlink events: [NEW] tcp 6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED] The RST packet was dropped in the raw table, thus it did not reach conntrack. nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2 state as the old unused LISTEN. With TCP simultaneous open support we satisfy REQ-2 in RFC 5382 ;-) . Additional minor correction in this patch is that in order to catch uninitialized reply directions, "td_maxwin == 0" is used instead of "td_end == 0" because the former can't be true except in uninitialized state while td_end may accidentally be equal to zero in the mid of a connection. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | Merge branch 'master' of git://dev.medozas.de/linuxPatrick McHardy2009-06-028-235/+278
| |\ \
| | * | netfilter: xtables: print hook name instead of maskJan Engelhardt2009-05-081-4/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Users cannot make anything of these numbers. Let's just tell them directly. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: consolidate comefrom debug cast accessJan Engelhardt2009-05-082-9/+17
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: remove another level of indentJan Engelhardt2009-05-083-66/+63
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: remove some gotoJan Engelhardt2009-05-082-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | Combining two ifs, and goto is easily gone. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: reduce indent level by oneJan Engelhardt2009-05-083-189/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cosmetic only. Transformation applied: -if (foo) { long block; } else { short block; } +if (!foo) { short block; continue; } long block; Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: consolidate open-coded logicJan Engelhardt2009-05-084-18/+40
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: fix const inconsistencyJan Engelhardt2009-05-082-14/+14
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: remove redundant castsJan Engelhardt2009-05-082-2/+2
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: use NFPROTO_ in standard targetsJan Engelhardt2009-05-082-6/+6
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: queue: use NFPROTO_ for queue callsitesJan Engelhardt2009-05-083-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | af is an nfproto. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * | netfilter: xtables: use NFPROTO_ for xt_proto_init callsitesJan Engelhardt2009-05-082-4/+4
| | | | | | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | | netfilter: conntrack: add support for DCCP handshake sequence to ctnetlinkPablo Neira Ayuso2009-05-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ that exposes the u64 handshake sequence number to user-space. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | | Merge branch 'linux-2.6.31.y' of ↵David S. Miller2009-06-111-1/+2
|\ \ \ \ | |_|_|/ |/| | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax
| * | | wimax: fix warning caused by not checking retval of rfkill_set_hw_state()Inaky Perez-Gonzalez2009-06-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Caused by an API update. The return value can be safely ignored, as there is notthing we can do with it. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
* | | | Merge branch 'master' of ↵David S. Miller2009-06-113-57/+113
|\ \ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6
| * | | | Bluetooth: Add native RFKILL soft-switch support for all devicesMarcel Holtmann2009-06-081-1/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the re-write of the RFKILL subsystem it is now possible to easily integrate RFKILL soft-switch support into the Bluetooth subsystem. All Bluetooth devices will now get automatically RFKILL support. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Remove pointless endian conversion helpersMarcel Holtmann2009-06-082-15/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Bluetooth source uses some endian conversion helpers, that in the end translate to kernel standard routines. So remove this obfuscation since it is fully pointless. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Add basic constants for L2CAP ERTM support and use themMarcel Holtmann2009-06-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the basic constants required to add support for L2CAP Enhanced Retransmission feature. Based on a patch from Nathan Holstein <nathan@lampreynetworks.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Fix errors and warnings in L2CAP reported by checkpatch.plGustavo F. Padovan2009-06-081-27/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the errors without changing the l2cap.o binary: text data bss dec hex filename 18059 568 0 18627 48c3 l2cap.o.after 18059 568 0 18627 48c3 l2cap.o.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Remove unnecessary variable initializationMarcel Holtmann2009-06-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial value of err is not used until it is set to -ENOMEM. So just remove the initialization completely. Based on a patch from Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Use macro for L2CAP hint mask on receiving config requestGustavo F. Padovan2009-06-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using the L2CAP_CONF_HINT macro is easier to understand than using a hardcoded 0x80 value. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Use macros for L2CAP channel identifiersGustavo F. Padovan2009-06-081-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use macros instead of hardcoded numbers to make the L2CAP source code more readable. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* | | | | neigh: fix state transition INCOMPLETE->FAILED via Netlink requestTimo Teras2009-06-111-18/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code errors out the INCOMPLETE neigh entry skb queue only from the timer if maximum probes have been attempted and there has been no reply. This also causes the transtion to FAILED state. However, the neigh entry can be also updated via Netlink to inform that the address is unavailable. Currently, neigh_update() just stops the timers and leaves the pending skb's unreleased. This results that the clean up code in the timer callback is never called, preventing also proper garbage collection. This fixes neigh_update() to process the pending skb queue immediately if INCOMPLETE -> FAILED state transtion occurs due to a Netlink request. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: No more expensive sock_hold()/sock_put() on each txEric Dumazet2009-06-113-6/+25
| |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the problem with sock memory accounting is it uses a pair of sock_hold()/sock_put() for each transmitted packet. This slows down bidirectional flows because the receive path also needs to take a refcount on socket and might use a different cpu than transmit path or transmit completion path. So these two atomic operations also trigger cache line bounces. We can see this in tx or tx/rx workloads (media gateways for example), where sock_wfree() can be in top five functions in profiles. We use this sock_hold()/sock_put() so that sock freeing is delayed until all tx packets are completed. As we also update sk_wmem_alloc, we could offset sk_wmem_alloc by one unit at init time, until sk_free() is called. Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc) to decrement initial offset and atomicaly check if any packets are in flight. skb_set_owner_w() doesnt call sock_hold() anymore sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc reached 0 to perform the final freeing. Drawback is that a skb->truesize error could lead to unfreeable sockets, or even worse, prematurely calling __sk_free() on a live socket. Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt contention point. 5 % speedup on a UDP transmit workload (depends on number of flows), lowering TX completion cpu usage. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ieee802154: Use '%Zu' printf format for size_t.David S. Miller2009-06-112-2/+2
| | | | | | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>