summaryrefslogtreecommitdiffstats
path: root/net/rxrpc
Commit message (Collapse)AuthorAgeFilesLines
* rxrpc: Implement slow-startDavid Howells2016-09-248-13/+294
| | | | | | | | | | | | | | | | | | | | | | | | | | Implement RxRPC slow-start, which is similar to RFC 5681 for TCP. A tracepoint is added to log the state of the congestion management algorithm and the decisions it makes. Notes: (1) Since we send fixed-size DATA packets (apart from the final packet in each phase), counters and calculations are in terms of packets rather than bytes. (2) The ACK packet carries the equivalent of TCP SACK. (3) The FLIGHT_SIZE calculation in RFC 5681 doesn't seem particularly suited to SACK of a small number of packets. It seems that, almost inevitably, by the time three 'duplicate' ACKs have been seen, we have narrowed the loss down to one or two missing packets, and the FLIGHT_SIZE calculation ends up as 2. (4) In rxrpc_resend(), if there was no data that apparently needed retransmission, we transmit a PING ACK to ask the peer to tell us what its Rx window state is. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Schedule an ACK if the reply to a client call appears overdueDavid Howells2016-09-244-0/+13
| | | | | | | | | | | | | | | | | | If we've sent all the request data in a client call but haven't seen any sign of the reply data yet, schedule an ACK to be sent to the server to find out if the reply data got lost. If the server hasn't yet hard-ACK'd the request data, we send a PING ACK to demand a response to find out whether we need to retransmit. If the server says it has received all of the data, we send an IDLE ACK to tell the server that we haven't received anything in the receive phase as yet. To make this work, a non-immediate PING ACK must carry a delay. I've chosen the same as the IDLE ACK for the moment. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Generate a summary of the ACK state for later useDavid Howells2016-09-242-11/+48
| | | | | | | Generate a summary of the Tx buffer packet state when an ACK is received for use in a later patch that does congestion management. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Delay the resend timer to allow for nsec->jiffies conv errorDavid Howells2016-09-241-2/+8
| | | | | | | | | | | | | | | | | | | | | | When determining the resend timer value, we have a value in nsec but the timer is in jiffies which may be a million or more times more coarse. nsecs_to_jiffies() rounds down - which means that the resend timeout expressed as jiffies is very likely earlier than the one expressed as nanoseconds from which it was derived. The problem is that rxrpc_resend() gets triggered by the timer, but can't then find anything to resend yet. It sets the timer again - but gets kicked off immediately again and again until the nanosecond-based expiry time is reached and we actually retransmit. Fix this by adding 1 to the jiffies-based resend_at value to counteract the rounding and make sure that the timer happens after the nanosecond-based expiry is passed. Alternatives would be to adjust the timestamp on the packets to align with the jiffie scale or to switch back to using jiffie-timestamps. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Reinitialise the call ACK and timer state for client reply phaseDavid Howells2016-09-243-0/+11
| | | | | | | | | | | Clear the ACK reason, ACK timer and resend timer when entering the client reply phase when the first DATA packet is received. New ACKs will be proposed once the data is queued. The resend timer is no longer relevant and we need to cancel ACKs scheduled to probe for a lost reply. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Include the last reply DATA serial number in the final ACKDavid Howells2016-09-241-3/+3
| | | | | | | In a client call, include the serial number of the last DATA packet of the reply in the final ACK. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Send an immediate ACK if we fill in a holeDavid Howells2016-09-241-1/+9
| | | | | | | | Send an immediate ACK if we fill in a hole in the buffer left by an out-of-sequence packet. This may allow the congestion management in the peer to avoid a retransmission if packets got reordered on the wire. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Send an ACK after every few DATA packets we receiveDavid Howells2016-09-244-9/+33
| | | | | | | | | | | | | Send an ACK if we haven't sent one for the last two packets we've received. This keeps the other end apprised of where we've got to - which is important if they're doing slow-start. We do this in recvmsg so that we can dispatch a packet directly without the need to wake up the background thread. This should possibly be made configurable in future. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint to log which packets will be retransmittedDavid Howells2016-09-231-0/+2
| | | | | | | | | | Add a tracepoint to log in rxrpc_resend() which packets will be retransmitted. Note that if a positive ACK comes in whilst we have dropped the lock to retransmit another packet, the actual retransmission may not happen, though some of the effects will (such as altering the congestion management). Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add tracepoint for ACK proposalDavid Howells2016-09-236-28/+73
| | | | | | | | | | | | | Add a tracepoint to log proposed ACKs, including whether the proposal is used to update a pending ACK or is discarded in favour of an easlier, higher priority ACK. Whilst we're at it, get rid of the rxrpc_acks() function and access the name array directly. We do, however, need to validate the ACK reason number given to trace_rxrpc_rx_ack() to make sure we don't overrun the array. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint to log injected Rx packet lossDavid Howells2016-09-231-6/+5
| | | | | | | Add a tracepoint to log received packets that get discarded due to Rx packet loss. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add data Tx tracepoint and adjust Tx ACK tracepointDavid Howells2016-09-232-4/+6
| | | | | | | | | | Add a tracepoint to log transmission of DATA packets (including loss injection). Adjust the ACK transmission tracepoint to include the packet serial number and to line this up with the DATA transmission display. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint for the call timerDavid Howells2016-09-235-7/+29
| | | | | | Add a tracepoint to log call timer initiation, setting and expiry. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Don't call the tx_ack tracepoint if don't generate an ACKDavid Howells2016-09-231-15/+11
| | | | | | | | | | | | | rxrpc_send_call_packet() is invoking the tx_ack tracepoint before it checks whether there's an ACK to transmit (another thread may jump in and transmit it). Fix this by only invoking the tracepoint if we get a valid ACK to transmit. Further, only allocate a serial number if we're going to actually transmit something. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Pass the last Tx packet marker in the annotation bufferDavid Howells2016-09-234-45/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the last packet of data to be transmitted on a call is queued, tx_top is set and then the RXRPC_CALL_TX_LAST flag is set. Unfortunately, this leaves a race in the ACK processing side of things because the flag affects the interpretation of tx_top and also allows us to start receiving reply data before we've finished transmitting. To fix this, make the following changes: (1) rxrpc_queue_packet() now sets a marker in the annotation buffer instead of setting the RXRPC_CALL_TX_LAST flag. (2) rxrpc_rotate_tx_window() detects the marker and sets the flag in the same context as the routines that use it. (3) rxrpc_end_tx_phase() is simplified to just shift the call state. The Tx window must have been rotated before calling to discard the last packet. (4) rxrpc_receiving_reply() is added to handle the arrival of the first DATA packet of a reply to a client call (which is an implicit ACK of the Tx phase). (5) The last part of rxrpc_input_ack() is reordered to perform Tx rotation, then soft-ACK application and then to end the phase if we've rotated the last packet. In the event of a terminal ACK, the soft-ACK application will be skipped as nAcks should be 0. (6) rxrpc_input_ackall() now has to rotate as well as ending the phase. In addition: (7) Alter the transmit tracepoint to log the rotation of the last packet. (8) Remove the no-longer relevant queue_reqack tracepoint note. The ACK-REQUESTED packet header flag is now set as needed when we actually transmit the packet and may vary by retransmission. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix call timerDavid Howells2016-09-232-13/+16
| | | | | | | | | | | | | | | | | | | Fix the call timer in the following ways: (1) If call->resend_at or call->ack_at are before or equal to the current time, then ignore that timeout. (2) If call->expire_at is before or equal to the current time, then don't set the timer at all (possibly we should queue the call). (3) Don't skip modifying the timer if timer_pending() is true. This indicates that the timer is working, not that it has expired and is running/waiting to run its expiry handler. Also call rxrpc_set_timer() to start the call timer going rather than calling add_timer(). Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix accidental cancellation of scheduled resend by ACK parserDavid Howells2016-09-231-0/+2
| | | | | | | | | | | | | | | | | When rxrpc_input_soft_acks() is parsing the soft-ACKs from an ACK packet, it updates the Tx packet annotations in the annotation buffer. If a soft-ACK is an ACK, then we overwrite unack'd, nak'd or to-be-retransmitted states and that is fine; but if the soft-ACK is an NACK, we overwrite the to-be-retransmitted with a nak - which isn't. Instead, we need to let any scheduled retransmission stand if the packet was NAK'd. Note that we don't reissue a resend if the annotation is in the to-be-retransmitted state because someone else must've scheduled the resend already. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Need to start the resend timer on initial transmissionDavid Howells2016-09-233-1/+11
| | | | | | | | When a DATA packet has its initial transmission, we may need to start or adjust the resend timer. Without this we end up relying on being sent a NACK to initiate the resend. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Use before_eq() and friends to compare serial numbersDavid Howells2016-09-231-1/+1
| | | | | | | | before_eq() and friends should be used to compare serial numbers (when not checking for (non)equality) rather than casting to int, subtracting and checking the result. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Should be using ktime_add_ms() not ktime_add_ns()David Howells2016-09-231-1/+1
| | | | | | | ktime_add_ms() should be used to add the resend time (in ms) rather than ktime_add_ns(). Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Make sure sendmsg() is woken on call completionDavid Howells2016-09-231-0/+1
| | | | | | | Make sure that sendmsg() gets woken up if the call it is waiting for completes abnormally. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Don't send an ACK at the end of service call response transmissionDavid Howells2016-09-231-2/+0
| | | | | | | | Don't send an IDLE ACK at the end of the transmission of the response to a service call. The service end resends DATA packets until the client sends an ACK that hard-acks all the send data. At that point, the call is complete. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Preset timestamp on Tx sk_buffsDavid Howells2016-09-231-0/+5
| | | | | | | | | | | | | Set the timestamp on sk_buffs holding packets to be transmitted before queueing them because the moment the packet is on the queue it can be seen by the retransmission algorithm - which may see a completely random timestamp. If the retransmission algorithm sees such a timestamp, it may retransmit the packet and, in future, tell the congestion management algorithm that the retransmit timer expired. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Reduce the number of PING ACKs sentDavid Howells2016-09-222-3/+6
| | | | | | | | | | We don't want to send a PING ACK for every new incoming call as that just adds to the network traffic. Instead, we send a PING ACK to the first three that we receive and then once per second thereafter. This could probably be made adjustable in future. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Reduce the number of ACK-Requests sentDavid Howells2016-09-224-4/+13
| | | | | | | | | | | | | | | Reduce the number of ACK-Requests we set on DATA packets that we're sending to reduce network traffic. We set the flag on odd-numbered DATA packets to start off the RTT cache until we have at least three entries in it and then probe once per second thereafter to keep it topped up. This could be made tunable in future. Note that from this point, the RXRPC_REQUEST_ACK flag is set on DATA packets as we transmit them and not stored statically in the sk_buff. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Obtain RTT data by requesting ACKs on DATA packetsDavid Howells2016-09-227-20/+57
| | | | | | | | | | | | | | | In addition to sending a PING ACK to gain RTT data, we can set the RXRPC_REQUEST_ACK flag on a DATA packet and get a REQUESTED-ACK ACK. The ACK packet contains the serial number of the packet it is in response to, so we can look through the Tx buffer for a matching DATA packet. This requires that the data packets be stamped with the time of transmission as a ktime rather than having the resend_at time in jiffies. This further requires the resend code to do the resend determination in ktimes and convert to jiffies to set the timer. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Expedite ping response transmissionDavid Howells2016-09-221-0/+4
| | | | | | | | | | | | | Expedite the transmission of a response to a PING ACK by sending it from sendmsg if one is pending. We're most likely to see a PING ACK during the client call Tx phase as the other side may use it to determine a number of parameters, such as the client's receive window size, the RTT and whether the client is doing slow start (similar to RFC5681). If we don't expedite it, it's left to the background processing thread to transmit. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Send pings to get RTT dataDavid Howells2016-09-224-8/+80
| | | | | | | | | | | | | | | | | | | | | | | | Send a PING ACK packet to the peer when we get a new incoming call from a peer we don't have a record for. The PING RESPONSE ACK packet will tell us the following about the peer: (1) its receive window size (2) its MTU sizes (3) its support for jumbo DATA packets (4) if it supports slow start (similar to RFC 5681) (5) an estimate of the RTT This is necessary because the peer won't normally send us an ACK until it gets to the Rx phase and we send it a packet, but we would like to know some of this information before we start sending packets. A pair of tracepoints are added so that RTT determination can be observed. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add per-peer RTT trackerDavid Howells2016-09-223-4/+70
| | | | | | | | | | | | | | Add a function to track the average RTT for a peer. Sources of RTT data will be added in subsequent patches. The RTT data will be useful in the future for determining resend timeouts and for handling the slow-start part of the Rx protocol. Also add a pair of tracepoints, one to log transmissions to elicit a response for RTT purposes and one to log responses that contribute RTT data. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add re-sent Tx annotationDavid Howells2016-09-223-12/+32
| | | | | | | | | Add a Tx-phase annotation for packet buffers to indicate that a buffer has already been retransmitted. This will be used by future congestion management. Re-retransmissions of a packet don't affect the congestion window managment in the same way as initial retransmissions. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Don't store the rxrpc header in the Tx queue sk_buffsDavid Howells2016-09-226-88/+71
| | | | | | | | | | | Don't store the rxrpc protocol header in sk_buffs on the transmit queue, but rather generate it on the fly and pass it to kernel_sendmsg() as a separate iov. This reduces the amount of storage required. Note that the security header is still stored in the sk_buff as it may get encrypted along with the data (and doesn't change with each transmission). Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add config to inject packet lossDavid Howells2016-09-173-0/+24
| | | | | | | | | | Add a configuration option to inject packet loss by discarding approximately every 8th packet received and approximately every 8th DATA packet transmitted. Note that no locking is used, but it shouldn't really matter. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Improve skb tracingDavid Howells2016-09-1713-55/+127
| | | | | | | | | | | | | | | | | | | | | | | Improve sk_buff tracing within AF_RXRPC by the following means: (1) Use an enum to note the event type rather than plain integers and use an array of event names rather than a big multi ?: list. (2) Distinguish Rx from Tx packets and account them separately. This requires the call phase to be tracked so that we know what we might find in rxtx_buffer[]. (3) Add a parameter to rxrpc_{new,see,get,free}_skb() to indicate the event type. (4) A pair of 'rotate' events are added to indicate packets that are about to be rotated out of the Rx and Tx windows. (5) A pair of 'lost' events are added, along with rxrpc_lose_skb() for packet loss injection recording. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Remove printks from rxrpc_recvmsg_data() to fix uninit varDavid Howells2016-09-171-8/+0
| | | | | | | Remove _enter/_debug/_leave calls from rxrpc_recvmsg_data() of which one uses an uninitialised variable. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint to follow what recvmsg doesDavid Howells2016-09-173-8/+57
| | | | | Add a tracepoint to follow what recvmsg does within AF_RXRPC. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint to follow packets in the Rx bufferDavid Howells2016-09-175-1/+40
| | | | | | | Add a tracepoint to follow the life of packets that get added to a call's receive buffer. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint to log ACK transmissionDavid Howells2016-09-172-1/+9
| | | | | | Add a tracepoint to log information about ACK transmission. Signed-off-by: David Howels <dhowells@redhat.com>
* rxrpc: Add a tracepoint to log received ACK packetsDavid Howells2016-09-171-0/+2
| | | | | | Add a tracepoint to log information from received ACK packets. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add a tracepoint to follow the life of a packet in the Tx bufferDavid Howells2016-09-174-1/+31
| | | | | | | Add a tracepoint to follow the insertion of a packet into the transmit buffer, its transmission and its rotation out of the buffer. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add connection tracepoint and client conn state tracepointDavid Howells2016-09-178-59/+214
| | | | | | | Add a pair of tracepoints, one to track rxrpc_connection struct ref counting and the other to track the client connection cache state. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Add some additional call tracingDavid Howells2016-09-172-4/+17
| | | | | | | | | | Add additional call tracepoint points for noting call-connected, call-released and connection-failed events. Also fix one tracepoint that was using an integer instead of the corresponding enum value as the point type. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Print the packet type name in the Rx packet traceDavid Howells2016-09-171-3/+3
| | | | | | | Print a symbolic packet type name for each valid received packet in the trace output, not just a number. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix the basic transmit DATA packet content size at 1412 bytesDavid Howells2016-09-171-1/+1
| | | | | | | | | | | | Fix the basic transmit DATA packet content size at 1412 bytes so that they can be arbitrarily assembled into jumbo packets. In the future, I'm thinking of moving to keeping a jumbo packet header at the beginning of each packet in the Tx queue and creating the packet header on the spot when kernel_sendmsg() is invoked. That way, jumbo packets can be assembled on the spur of the moment for (re-)transmission. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Be consistent about switch value in rxrpc_send_call_packet()David Howells2016-09-171-1/+1
| | | | | | | | rxrpc_send_call_packet() should use type in both its switch-statements rather than using pkt->whdr.type. This might give the compiler an easier job of uninitialised variable checking. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Don't transmit an ACK if there's no reason setDavid Howells2016-09-171-0/+5
| | | | | | | | Don't transmit an ACK if call->ackr_reason in unset. There's the possibility of a race between recvmsg() sending an ACK and the background processing thread trying to send the same one. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix retransmission algorithmDavid Howells2016-09-171-8/+4
| | | | | | | | | | | | | | Make the retransmission algorithm use for-loops instead of do-loops and move the counter increments into the for-statement increment slots. Though the do-loops are slighly more efficient since there will be at least one pass through the each loop, the counter increments are harder to get right as the continue-statements skip them. Without this, if there are any positive acks within the loop, the do-loop will cycle forever because the counter increment is never done. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix the parsing of soft-ACKsDavid Howells2016-09-171-1/+1
| | | | | | | | | | | The soft-ACK parser doesn't increment the pointer into the soft-ACK list, resulting in the first ACK/NACK value being applied to all the relevant packets in the Tx queue. This has the potential to miss retransmissions and cause excessive retransmissions. Fix this by incrementing the pointer. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix unexposed client conn releaseDavid Howells2016-09-171-1/+0
| | | | | | | | | | | | | | | | | | | | If the last call on a client connection is release after the connection has had a bunch of calls allocated but before any DATA packets are sent (so that it's not yet marked RXRPC_CONN_EXPOSED), an assertion will happen in rxrpc_disconnect_client_call(). af_rxrpc: Assertion failed - 1(0x1) >= 2(0x2) is false ------------[ cut here ]------------ kernel BUG at ../net/rxrpc/conn_client.c:753! This is because it's expecting the conn to have been exposed and to have 2 or more refs - but this isn't necessarily the case. Simply remove the assertion. This allows the conn to be moved into the inactive state and deleted if it isn't resurrected before the final put is called. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()David Howells2016-09-171-24/+12
| | | | | | | | | | | | Call rxrpc_release_call() on getting an error in rxrpc_new_client_call() rather than trying to do the cleanup ourselves. This isn't a problem, provided we set RXRPC_CALL_HAS_USERID only if we actually add the call to the calls tree as cleanup code fragments that would otherwise cause problems are conditional. Without this, we miss some of the cleanup. Signed-off-by: David Howells <dhowells@redhat.com>
* rxrpc: Fix the putting of client connectionsDavid Howells2016-09-171-15/+13
| | | | | | | | | | | | | | | | | | | In rxrpc_put_one_client_conn(), if a connection has RXRPC_CONN_COUNTED set on it, then it's accounted for in rxrpc_nr_client_conns and may be on various lists - and this is cleaned up correctly. However, if the connection doesn't have RXRPC_CONN_COUNTED set on it, then the put routine returns rather than just skipping the extra bit of cleanup. Fix this by making the extra bit of clean up conditional instead and always killing off the connection. This manifests itself as connections with a zero usage count hanging around in /proc/net/rxrpc_conns because the connection allocated, but discarded, due to a race with another process that set up a parallel connection, which was then shared instead. Signed-off-by: David Howells <dhowells@redhat.com>