summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* tipc: Fix problem with missing link in "tipc-config -l" outputAllan Stephens2011-03-133-6/+8
| | | | | | | | | | | | | | | | | | Removes a race condition that could cause TIPC's internal counter of the number of links it has to neighboring nodes to have the incorrect value if two independent threads of control simultaneously create new link endpoints connecting to two different nodes using two different bearers. Such under counting would result in TIPC failing to list the final link(s) in its response to a configuration request to list all of the node's links. The counter is now updated atomically to ensure that simultaneous increments do not interfere with each other. Thanks go to Peter Butler <pbutler@pt.com> for his assistance in diagnosing and fixing this problem. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* tipc: Add support for SO_RCVTIMEO socket optionAllan Stephens2011-03-131-15/+17
| | | | | | | | | | | Adds support for the SO_RCVTIMEO socket option to TIPC's socket receive routines. Thanks go out to Raj Hegde <rajenhegde@yahoo.ca> for his contribution to the development and testing this enhancement. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* tipc: Cosmetic changes to node subscription codeAllan Stephens2011-03-134-13/+27
| | | | | | | | | | | | | | Relocates the code that notifies users of node subscriptions so that it is adjacent to the rest of the routines that implement TIPC's node subscription capability. Renames the name table routine that is invoked by a node subscription to better reflect its purpose and to be consistent with other, similar name table routines. These changes are cosmetic in nature, and do not alter the behavior of TIPC. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* tipc: Prevent null pointer error when removing a node subscriptionAllan Stephens2011-03-132-8/+10
| | | | | | | | | | | | | | | | | Prevents a null pointer dereference from occurring if a node subscription is triggered at the same time that the subscribing port or publication is terminating the subscription. The problem arises if the triggering routine asynchronously activates and deregisters the node subscription while deregistration is already underway -- the deregistration routine may find that the pointer it has just verified to be non-NULL is now NULL. To avoid this race condition the triggering routine now simply marks the node subscription as defunct (to prevent it from re-activating) instead of deregistering it. The subscription is now both deregistered and destroyed only when the subscribing port or publication code terminates the node subscription. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* tipc: Add network address mask helper routinesAllan Stephens2011-03-133-7/+16
| | | | | | | | | | | | Introduces a pair of helper routines that convert the network address for a TIPC node into the network address for its cluster or zone. This is a cosmetic change designed to avoid future errors caused by the incorrect use of address bitmasks, and does not alter the existing operation of TIPC. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* tipc: Correct broadcast link peer info when displaying linksAllan Stephens2011-03-131-1/+1
| | | | | | | | | | | Fixes a typo in the calculation of the network address of a node's own cluster when generating a response to the configuration command that lists all of the node's links. The correct mask value for a <Z.C.N> network address uses 1's for the 8-bit zone and 12-bit cluster parts and 0's for the 12-bit node part. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* tipc: Allow receiving into iovec containing multiple entriesAllan Stephens2011-03-131-23/+15
| | | | | | | | | | | | Enhances TIPC's socket receive routines to support iovec structures containing more than a single entry. This change leverages existing sk_buff routines to do most of the work; the only significant change to TIPC itself is that an sk_buff now records how much data has been already consumed as an numeric offset, rather than as a pointer to the first unread data byte. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* decnet: Convert to use flowidn where applicable.David S. Miller2011-03-1210-185/+196
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put fl6_* macros to struct flowi6 and use them again.David S. Miller2011-03-1213-61/+58
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Convert to use flowi6 where applicable.David S. Miller2011-03-1235-630/+632
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put fl4_* macros to struct flowi4 and use them again.David S. Miller2011-03-1212-50/+46
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill fib_semantic_match declaration from fib_lookup.hDavid S. Miller2011-03-121-3/+0
| | | | | | This function no longer exists. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Use flowi4 and flowi6 in xfrm layer.David S. Miller2011-03-126-76/+89
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add flowi6_* member helper macros.David S. Miller2011-03-121-0/+8
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: Use flowi4 and flowi6 in xt_TCPMSSDavid S. Miller2011-03-121-5/+10
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: Use flowi4 and flowi6 in nf_conntrack_h323_mainDavid S. Miller2011-03-121-12/+20
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use flowi4 in UDPDavid S. Miller2011-03-121-6/+8
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: Use flowi4 in nf_nat_standalone.cDavid S. Miller2011-03-121-4/+5
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use flowi4 in ipmr code.David S. Miller2011-03-121-16/+17
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use flowi4 in FIB layer.David S. Miller2011-03-123-30/+31
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use flowi4 in public route lookup interfaces.David S. Miller2011-03-1214-224/+231
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use struct flowi4 internally in routing lookups.David S. Miller2011-03-121-115/+115
| | | | | | We will change the externally visible APIs next. Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Pass ipv4 flow objects into fib_lookup() paths.David S. Miller2011-03-127-20/+28
| | | | | | | | To start doing these conversions, we need to add some temporary flow4_* macros which will eventually go away when all the protocol code paths are changed to work on AF specific flowi objects. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add flowiX_to_flowi() shorthands.David S. Miller2011-03-121-0/+15
| | | | | | | This is just a shorthand which will help in passing around AF specific flow structures as generic ones. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Break struct flowi out into AF specific instances.David S. Miller2011-03-126-69/+69
| | | | | | | | | | | Now we have struct flowi4, flowi6, and flowidn for each address family. And struct flowi is just a union of them all. It might have been troublesome to convert flow_cache_uli_match() but as it turns out this function is completely unused and therefore can be simply removed. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make flowi ports AF dependent.David S. Miller2011-03-1228-129/+139
| | | | | | | | | | | | Create two sets of port member accessors, one set prefixed by fl4_* and the other prefixed by fl6_* This will let us to create AF optimal flow instances. It will work because every context in which we access the ports, we have to be fully aware of which AF the flowi is anyways. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Create union flowi_uliDavid S. Miller2011-03-121-23/+25
| | | | | | This will be used when we have seperate flowi types. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Create struct flowi_commonDavid S. Miller2011-03-121-9/+21
| | | | | | | Pull out the AF independent members of struct flowi into a new struct flowi_common Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller2011-03-1256-351/+365
| | | | | | | | | | I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok().David S. Miller2011-03-121-19/+3
| | | | | | | There is only one caller of xfrm_bundle_ok(), and that always passes these parameters as NULL. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Remove unnecessary padding in struct flowiDavid S. Miller2011-03-121-10/+10
| | | | | | | Move tos, scope, proto, and flags to the beginning of the structure. Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Create and use route lookup helpers.David S. Miller2011-03-1223-326/+206
| | | | | | | The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2011-03-1229-469/+1316
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6
| * ixgbe: DCB, PFC not cleared until reset occursJohn Fastabend2011-03-122-51/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PFC configuration is not cleared until the device is reset. This has not been a problem because setting DCB attributes forced a hardware reset. Now that we no longer require this reset to occur PFC remains configured even after being disabled until the device is reset. This removes a goto in the PFC hardware set routines for 82598 and 82599 devices that was short circuiting the clear. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: add support for VF Transmit rate limit using iproute2Lior Levy2011-03-125-2/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implemented ixgbe_ndo_set_vf_bw function which is being used by iproute2 tool. In addition, updated ixgbe_ndo_get_vf_config function to show the actual rate limit to the user. The rate limitation can be configured only when the link is up and the link speed is 10Gb. The rate limit value can be 0 or ranged between 11 and actual link speed measured in Mbps. A value of '0' disables the rate limit for this specific VF. iproute2 usage will be 'ip link set ethX vf Y rate Z'. After the command is made, the rate will be changed instantly. To view the current rate limit, use 'ip link show ethX'. The rates will be zeroed only upon driver reload or a link speed change. This feature is being supported by 82599 and X540 devices. Signed-off-by: Lior Levy <lior.levy@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB, set minimum bandwidth per traffic classJohn Fastabend2011-03-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | DCB provides a guaranteed bandwidth in the case with 0% bandwidth then no bandwidth is guaranteed. However the traffic class should still be able to transmit traffic. For this to work the traffic class must be given the minimum credits required to send a frame. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: correct typo in define nameEmil Tantilov2011-03-122-3/+3
| | | | | | | | | | | | | | | | | | VF Free Running Timer register name missing an F. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Evan Swanson <evan.swanson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: update PHY code to support 100Mbps as well as 1G/10GEmil Tantilov2011-03-124-99/+245
| | | | | | | | | | | | | | | | | | This change updates the PHY setup code to support 100Mbps capable PHYs as well as 10G and 1Gbps. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: remove timer reset to 0 on timeoutEmil Tantilov2011-03-121-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | The VF mailbox polling for acks and messages would reset the timer to zero on a timeout. Under heavy load a timeout may actually occur without being the result of an error and when this occurs it is not practical to perform a full VF driver reset on every message timeout. Instead, just return an error (which is already done) and the VF driver will have an opportunity to retry the operation. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB during ifup use correct CEE or IEEE modeJohn Fastabend2011-03-121-10/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | DCB settings are cleared in the hardware across link events during ifup ixgbe reprograms the hardware for DCB if it is enabled. Now that we have two modes CEE or IEEE we need to use the correct set of configuration data. This patch checks the dcbx_cap bits and then enables the device in the correct mode. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: IEEE 802.1Qaz, implement priority assignment tableJohn Fastabend2011-03-125-19/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support to use the priority assignment table in the ieee_ets structure to map priorities to traffic classes. Previously ixgbe only supported a 1:1 mapping. Now we can enable and disable hardware DCB support when multiple traffic classes are actually being used. This allows the default case all priorities mapped to traffic class 0 to work in normal hardware mode and utilize the full packet buffer. This patch does not address putting the hardware in 4TC mode so packet buffer space may be underutilized in this case. A follow up patch can address this optimization. But at least we have the hooks to do this now. Also CEE will behave as it always has and map priorities 1:1 with traffic classes. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB, missed translation from 8021Qaz TSA to CEE link strictJohn Fastabend2011-03-122-25/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch below allowed IEEE 802.1Qaz and CEE DCB hardware configurations to use common hardware set routines, commit 88eb696cc6a7af8f9272266965b1a4dd7d6a931b Author: John Fastabend <john.r.fastabend@intel.com> Date: Thu Feb 10 03:02:11 2011 -0800 ixgbe: DCB, abstract out dcb_config from DCB hardware configuration However the case when CEE link strict and group strict are set was missed and are currently being mapped incorrectly in some configurations. This patch resolves this. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB: enable RSS to be used with DCBJohn Fastabend2011-03-122-9/+25
| | | | | | | | | | | | | | | | | | | | | | | | RSS had previously been disabled when DCB was enabled because DCB was single queued per traffic class. Now that DCB implements multiple Tx/Rx rings per traffic class enable RSS. Here RSS hashes across the queues in the traffic class. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain.@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: enable ndo_tc_setupJohn Fastabend2011-03-121-1/+23
| | | | | | | | | | | | | | | | | | This patch adds the ndo_tc_setup to ixgbe. By default we set the device to use strict priority. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain.@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB, use multiple Tx rings per traffic classJohn Fastabend2011-03-123-168/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables multiple {Tx|Rx} rings per traffic class while in DCB mode. In order to get this working as expected the tc_to_tx net device mapping is configured as well as the prio_tc_map. skb priorities are mapped across a range of queue pairs to get a distribution per traffic class. The maximum number of queue pairs used while in DCB mode is capped at 64. The hardware max is actually 128 queues but 64 is sufficient for now and allocating more seemed a bit excessive. It is easy enough to increase the cap later if need be. To get the 802.1Q priority tags inserted correctly ixgbe was previously using the skb queue_mapping field to directly set the 802.1Q priority. This no longer works because we have removed the 1:1 mapping between queues and traffic class. Each ring is aligned with an 802.1Qaz traffic class so here we add an extra field to the ring struct to identify the 802.1Q traffic class. This uses an extra byte of the ixgbe_ring struct fortunately there was a 2byte hole, struct ixgbe_ring { void * desc; /* 0 8 */ struct device * dev; /* 8 8 */ struct net_device * netdev; /* 16 8 */ union { struct ixgbe_tx_buffer * tx_buffer_info; /* 8 */ struct ixgbe_rx_buffer * rx_buffer_info; /* 8 */ }; /* 24 8 */ long unsigned int state; /* 32 8 */ u8 atr_sample_rate; /* 40 1 */ u8 atr_count; /* 41 1 */ u16 count; /* 42 2 */ u16 rx_buf_len; /* 44 2 */ u16 next_to_use; /* 46 2 */ u16 next_to_clean; /* 48 2 */ u8 queue_index; /* 50 1 */ u8 reg_idx; /* 51 1 */ u16 work_limit; /* 52 2 */ /* XXX 2 bytes hole, try to pack */ u8 * tail; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ Now we can set the VLAN priority directly and it will be correct. User space can indicate the 802.1Qaz priority using the SO_PRIORITY setsocket() option and QOS layer will steer the skb to the correct rings. Additionally using the multiq qdisc with a queue_mapping action works as well. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB remove ixgbe_fcoe_getapp routineJohn Fastabend2011-03-122-30/+13
| | | | | | | | | | | | | | | | | | | | | | Remove ixgbe_fcoe_getapp() and use the generic kernel routine instead. Also add application priority to the kernel maintained list on setapp so applications and stacks can query the value. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB, implement ieee_setapp dcbnl opsJohn Fastabend2011-03-121-0/+30
| | | | | | | | | | | | | | | | | | | | | | Implement ieee_setapp dcbnl ops in ixgbe. This is required to setup FCoE which requires dedicated resources. If the app data is not for FCoE then no action is taken in ixgbe except to add it to the dcb_app_list. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: DCB, implement capabilities flagsJohn Fastabend2011-03-123-36/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements dcbnl get and set capabilities ops. The devices supported by ixgbe can be configured to run in IEEE or CEE modes but not both. With the DCBX set capabilities bit we add an explicit signal that must be used to toggle between these modes. This patch adds logic to fail the CEE command set_hw_all() which programs the device with a CEE configuration if the CEE caps bit is not set. Similarly, IEEE set commands will fail if the IEEE caps bit is not set. We allow most CEE config set commands to occur because they do not touch the hardware until set_hw_all() is called. The one exception to the above is the {set|get}app routines. These must always be protected by caps bits to ensure side effects do not corrupt the current configured mode. By requiring the caps bit to be set correctly we can maintain a consistent configuration in the hardware for CEE or IEEE modes and prevent partial hardware configurations that may occur if user space does not send a complete IEEE or CEE configurations. It is expected that user space will signal a DCBX mode before programming device. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * igb: Bump version to 3.0.6Carolyn Wyborny2011-03-121-1/+6
| | | | | | | | | | | | | | | | This patch updates igb version to 3.0.6. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * igb: Add DMA Coalescing feature to driverCarolyn Wyborny2011-03-125-1/+106
| | | | | | | | | | | | | | | | | | | | This patch add DMA Coalescing which is a power-saving feature that coalesces DMA writes in order to stay in a low-power state as much as possible. Feature is disabled by default. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>