linux.git - Linux kernel mainline tree

	Commit message	Author	Age	Files	Lines
*	xfs: Extend project quotas to support 32bit project ids	Arkadiusz Miśkiewicz	2010-10-18	16	-44/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for 32bit project quota identifiers. The on-disk format is backward compatible with 16bit projid numbers. The projid on disk is now kept in two 16bit values, di_projid_lo (which holds the same position as the old 16bit projid value) and the new di_projid_hi (which takes existing padding), and is converted from/to a 32bit value on the fly. xfs_admin (for an existing fs) or mkfs.xfs (for a new fs) needs to be used to enable PROJID32BIT support. Signed-off-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
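The lo/hi split described in this commit message can be illustrated with a minimal sketch. It is not the XFS code: `struct projid_disk` and both helper functions are hypothetical names, and the byte-order conversion that a real on-disk format needs is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two 16-bit on-disk fields named above. */
struct projid_disk {
    uint16_t di_projid_lo;  /* sits where the old 16-bit projid lived */
    uint16_t di_projid_hi;  /* reuses space that used to be padding */
};

/* The two halves together occupy exactly one 32-bit slot. */
static_assert(sizeof(struct projid_disk) == sizeof(uint32_t),
              "two 16-bit halves fill one 32-bit slot");

/* Split a 32-bit project id into its low and high 16-bit halves. */
static inline struct projid_disk projid_to_disk(uint32_t projid)
{
    struct projid_disk d;

    d.di_projid_lo = (uint16_t)(projid & 0xffffu);
    d.di_projid_hi = (uint16_t)(projid >> 16);
    return d;
}

/* Reassemble the 32-bit project id from the two halves. */
static inline uint32_t projid_from_disk(struct projid_disk d)
{
    return ((uint32_t)d.di_projid_hi << 16) | d.di_projid_lo;
}
```

An id that fits in 16 bits leaves the high half at zero, which is why a filesystem written with the old format would still read back the same value under this scheme.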
*	xfs: remove xfs_buf wrappers	Christoph Hellwig	2010-10-18	16	-48/+33
\| \| \| \| \| \| \| \|	Stop having two different names for many buffer functions and use the more descriptive xfs_buf_* names directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove xfs_cred.h	Christoph Hellwig	2010-10-18	12	-58/+18
\| \| \| \| \| \| \| \| \|	We have not actually been passing around credentials inside XFS for a while now, so remove xfs_cred.h with its cred_t typedef and all instances of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove xfs_globals.h	Christoph Hellwig	2010-10-18	2	-24/+0
\| \| \| \| \| \| \| \|	This header only provides one extern that isn't actually declared anywhere, and shadowed by a macro. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove xfs_version.h	Christoph Hellwig	2010-10-18	3	-30/+1
\| \| \| \| \| \| \| \|	It used to have a place when it contained an automatically generated CVS version, but these days it's entirely superfluous. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove xfs_refcache.h	Christoph Hellwig	2010-10-18	1	-52/+0
\| \| \| \| \| \| \|	This header has been completely unused for a couple of years. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: fix the xfs_trans_committed	Christoph Hellwig	2010-10-18	1	-2/+3
\| \| \| \| \| \| \|	Use the correct prototype for xfs_trans_committed instead of casting it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove unused t_callback field in struct xfs_trans	Christoph Hellwig	2010-10-18	2	-6/+0
\| \| \| \| \|	Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: fix bogus m_maxagi check in xfs_iget	Christoph Hellwig	2010-10-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	These days inode64 should only control which AGs we allocate new inodes from, while we still try to support reading all existing inodes. To make this actually work the check on top of xfs_iget needs to be relaxed to allow inodes in all allocation groups instead of just those that we allow allocating inodes from. Note that we can't simply remove the check - it prevents us from accessing invalid data when fed invalid inode numbers from NFS or bulkstat. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: do not use xfs_mod_incore_sb_batch for per-cpu counters	Christoph Hellwig	2010-10-18	2	-107/+85
\| \| \| \| \| \| \| \| \| \|	Update the per-cpu counters manually in xfs_trans_unreserve_and_mod_sb and remove support for per-cpu counters from xfs_mod_incore_sb_batch to simplify it. An added benefit is that we don't have to take m_sb_lock for transactions that only modify per-cpu counters. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: do not use xfs_mod_incore_sb for per-cpu counters	Christoph Hellwig	2010-10-18	5	-40/+35
\| \| \| \| \| \| \| \| \|	Export xfs_icsb_modify_counters and always use it for modifying the per-cpu counters. Remove support for per-cpu counters from xfs_mod_incore_sb to simplify it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove XFS_MOUNT_NO_PERCPU_SB	Christoph Hellwig	2010-10-18	3	-29/+19
\| \| \| \| \| \| \| \| \|	Fail the mount if we can't allocate memory for the per-CPU counters. This is consistent with how we handle everything else in the mount path and makes the superblock counter modification a lot simpler. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: pack xfs_buf structure more tightly	Dave Chinner	2010-10-18	1	-11/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pahole reports the struct xfs_buf has quite a few holes in it, so packing the structure better will reduce the size of it by 16 bytes. Also, move all the fields used in cache lookups into the first cacheline. Before on x86_64: /* size: 320, cachelines: 5 */ /* sum members: 298, holes: 6, sum holes: 22 */ After on x86_64: /* size: 304, cachelines: 5 */ /* padding: 6 */ /* last cacheline: 48 bytes */ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
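The effect of reordering members to close holes can be seen in a small sketch. The two structs below are hypothetical and far smaller than struct xfs_buf; they only show how alignment padding appears when small fields sit between 8-byte ones, and how grouping the small fields removes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each 1-byte field is followed by an 8-byte field,
 * so the compiler inserts alignment padding after each small field. */
struct buf_loose {
    uint8_t  flags;
    uint64_t blkno;
    uint8_t  state;
    uint64_t length;
};

/* Same members, large ones first, so the small ones share one slot. */
struct buf_tight {
    uint64_t blkno;
    uint64_t length;
    uint8_t  flags;
    uint8_t  state;
};

/* Reordering alone must never make the structure larger. */
static_assert(sizeof(struct buf_tight) <= sizeof(struct buf_loose),
              "reordering must not grow the struct");

/* Bytes recovered purely by changing member order. */
static inline size_t bytes_saved(void)
{
    return sizeof(struct buf_loose) - sizeof(struct buf_tight);
}
```

The exact sizes depend on the ABI; on a typical x86_64 build the loose layout takes 32 bytes and the tight one 24. A tool such as pahole reports the same information for real kernel structures.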
*	xfs: convert buffer cache hash to rbtree	Dave Chinner	2010-10-18	4	-76/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The buffer cache hash is showing typical hash scalability problems. In large scale testing the number of cached items grows far larger than the hash can efficiently handle. Hence we need to move to a self-scaling cache indexing mechanism. I have selected rbtrees for indexing because they can have O(log n) search scalability, and insert and remove cost is not excessive, even on large trees. Hence we should be able to cache large numbers of buffers without incurring the excessive cache miss search penalties that the hash is imposing on us. To ensure we still have parallel access to the cache, we need multiple trees. Rather than hashing the buffers by disk address to select a tree, it seems more sensible to separate trees by typical access patterns. Most operations use buffers from within a single AG at a time, so rather than searching lots of different lists, separate the buffer indexes out into per-AG rbtrees. This means that searches during metadata operation have a much higher chance of hitting cache resident nodes, and that updates of the tree are less likely to disturb trees being accessed on other CPUs doing independent operations. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: serialise inode reclaim within an AG	Dave Chinner	2010-10-18	3	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Memory reclaim via shrinkers has a terrible habit of having N+M concurrent shrinker executions (N = num CPUs, M = num kswapds) all trying to shrink the same cache. When the cache they are all working on is protected by a single spinlock, massive contention and slowdowns occur. Wrap the per-ag inode caches with a reclaim mutex to serialise reclaim access to the AG. This will block concurrent reclaim in each AG but still allow reclaim to scan multiple AGs concurrently. Allow shrinkers to move on to the next AG if it can't get the lock, and if we can't get any AG, then start blocking on locks. To prevent reclaimers from continually scanning the same inodes in each AG, add a cursor that tracks where the last reclaim got up to and start from that point on the next reclaim. This should avoid only ever scanning a small number of inodes at the start of each AG and not making progress. If we have a non-shrinker based reclaim pass, ignore the cursor and reset it to zero once we are done. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com>
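The trylock-then-block walk and the per-AG cursor described in this message can be sketched in userspace. This is only an illustration under stated assumptions: every name below is hypothetical, a C11 `atomic_flag` spin flag stands in for the kernel's per-AG reclaim mutex, and "scanning an inode" is reduced to advancing a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define AG_INODES 100u   /* hypothetical number of inodes per AG */

struct ag {
    atomic_flag reclaim_busy;  /* stand-in for the per-AG reclaim mutex */
    unsigned    cursor;        /* where the previous reclaim pass stopped */
};

static inline void ag_init(struct ag *ags, unsigned nr_ags)
{
    for (unsigned i = 0; i < nr_ags; i++) {
        atomic_flag_clear(&ags[i].reclaim_busy);
        ags[i].cursor = 0;
    }
}

static inline bool ag_trylock(struct ag *ag)
{
    return !atomic_flag_test_and_set(&ag->reclaim_busy);
}

static inline void ag_lock(struct ag *ag)
{
    while (atomic_flag_test_and_set(&ag->reclaim_busy))
        ;  /* spin; a kernel mutex would sleep instead */
}

static inline void ag_unlock(struct ag *ag)
{
    atomic_flag_clear(&ag->reclaim_busy);
}

/* Scan up to nr inodes of one AG, resuming at its cursor. Caller holds the lock. */
static inline unsigned reclaim_ag(struct ag *ag, unsigned nr)
{
    unsigned scanned = 0;

    assert(ag->cursor < AG_INODES);
    while (scanned < nr && ag->cursor < AG_INODES) {
        ag->cursor++;          /* stand-in for examining one inode */
        scanned++;
    }
    if (ag->cursor >= AG_INODES)
        ag->cursor = 0;        /* reached the end: the next pass starts over */
    return scanned;
}

/* First pass skips busy AGs; a second, blocking pass runs only if an AG
 * was skipped and there is still work left to do. */
static inline unsigned reclaim_pass(struct ag *ags, unsigned nr_ags, unsigned nr)
{
    unsigned done = 0;

    for (int blocking = 0; blocking <= 1; blocking++) {
        bool skipped = false;

        for (unsigned i = 0; i < nr_ags && done < nr; i++) {
            if (blocking) {
                ag_lock(&ags[i]);
            } else if (!ag_trylock(&ags[i])) {
                skipped = true;    /* someone else is reclaiming this AG */
                continue;
            }
            done += reclaim_ag(&ags[i], nr - done);
            ag_unlock(&ags[i]);
        }
        if (done >= nr || !skipped)
            break;
    }
    return done;
}
```

Two consecutive calls asking for 30 inodes leave the first AG's cursor at 60 rather than rescanning the first 30, which is the progress guarantee the cursor is meant to provide.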
*	xfs: batch inode reclaim lookup	Dave Chinner	2010-10-18	1	-33/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Batch and optimise the per-ag inode lookup for reclaim to minimise scanning overhead. This involves gang lookups on the radix trees to get multiple inodes during each tree walk, and tighter validation of what inodes can be reclaimed without blocking before we take any locks. This is based on ideas suggested in a proof-of-concept patch posted by Nick Piggin. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: implement batched inode lookups for AG walking	Dave Chinner	2010-10-18	2	-23/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	With the reclaim code separated from the generic walking code, it is simple to implement batched lookups for the generic walk code. Separate out the inode validation from the execute operations and modify the tree lookups to get a batch of inodes at a time. Reclaim operations will be optimised separately. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: split out inode walk inode grabbing	Dave Chinner	2010-10-18	2	-54/+34
\| \| \| \| \| \| \| \| \| \| \| \|	When doing read side inode cache walks, the code to validate and grab an inode is common to all callers. Split it out of the execute callbacks in preparation for batching lookups. Similarly, split out the inode reference dropping from the execute callbacks into the main lookup loop to be symmetric with the grab. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: split inode AG walking into separate code for reclaim	Dave Chinner	2010-10-18	6	-115/+122
\| \| \| \| \| \| \| \| \| \|	The reclaim walk requires different locking and has a slightly different walk algorithm, so separate it out so that it can be optimised separately. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: remove buftarg hash for external devices	Dave Chinner	2010-10-18	1	-1/+5
\| \| \| \| \| \| \| \| \|	For RT and external log devices, we never use hashed buffers on them now. Remove the buftarg hash tables that are set up for them. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: use unhashed buffers for size checks	Dave Chinner	2010-10-18	3	-45/+34
\| \| \| \| \| \| \| \| \| \| \| \|	When we are checking that we can access the last block of each device, we do not need to use cached buffers as they will be tossed away immediately. Use uncached buffers for size checks so that all IO prior to full in-memory structure initialisation does not use the buffer cache. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: kill XBF_FS_MANAGED buffers	Dave Chinner	2010-10-18	3	-59/+20
\| \| \| \| \| \| \| \| \| \| \| \| \|	Filesystem level managed buffers are buffers that have their lifecycle controlled by the filesystem layer, not the buffer cache. We currently cache these buffers, which makes cleanup and cache walking somewhat troublesome. Convert the fs managed buffers to uncached buffers obtained via xfs_buf_get_uncached(), and remove the XBF_FS_MANAGED special cases from the buffer cache. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: store xfs_mount in the buftarg instead of in the xfs_buf	Dave Chinner	2010-10-18	5	-21/+20
\| \| \| \| \| \| \| \| \| \| \|	Each buffer contains both a buftarg pointer and a mount pointer. If we add a mount pointer into the buftarg, we can avoid needing the b_mount field in every buffer and grab it from the buftarg when needed instead. This shrinks the xfs_buf by 8 bytes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: introduced uncached buffer read primitve	Dave Chinner	2010-10-18	2	-0/+37
\| \| \| \| \| \| \| \| \| \|	To avoid the need to use cached buffers for single-shot or buffers cached at the filesystem level, introduce a new buffer read primitive that bypasses the cache and reads directly from disk. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: rename xfs_buf_get_nodaddr to be more appropriate	Dave Chinner	2010-10-18	6	-11/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xfs_buf_get_nodaddr() is really used to allocate a buffer that is uncached. While it is not directly assigned a disk address, the fact that they are not cached is a more important distinction. With the upcoming uncached buffer read primitive, we should be consistent with this distinction. While there, make page allocation in xfs_buf_get_nodaddr() safe against memory reclaim re-entrancy into the filesystem by allowing a flags parameter to be passed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: don't use vfs writeback for pure metadata modifications	Dave Chinner	2010-10-18	12	-86/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under heavy multi-way parallel create workloads, the VFS struggles to write back all the inodes that have been changed in age order. The bdi flusher thread becomes CPU bound, spending 85% of its time in the VFS code, mostly traversing the superblock dirty inode list to separate dirty inodes old enough to flush. We already keep an index of all metadata changes in age order - in the AIL - and continued log pressure will do age ordered writeback without any extra overhead at all. If there is no pressure on the log, the xfssyncd will periodically write back metadata in ascending disk address offset order so will be very efficient. Hence we can stop marking VFS inodes dirty during transaction commit or when changing timestamps during transactions. This will keep the inodes in the superblock dirty list to those containing data or unlogged metadata changes. However, the timestamp changes are slightly more complex than this - there are a couple of places that do unlogged updates of the timestamps, and the VFS needs to be informed of these. Hence add a new function xfs_trans_ichgtime() for transactional changes, and leave xfs_ichgtime() for the non-transactional changes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: lockless per-ag lookups	Dave Chinner	2010-10-18	3	-11/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we start taking a reference to the per-ag for every cached buffer in the system, kernel lockstat profiling on an 8-way create workload shows the mp->m_perag_lock has higher acquisition rates than the inode lock and has significantly more contention. That is, it becomes the highest contended lock in the system. The perag lookup is trivial to convert to lock-less RCU lookups because perag structures never go away. Hence the only thing we need to protect against is tree structure changes during a grow. This can be done simply by replacing the locking in xfs_perag_get() with RCU read locking. This removes the mp->m_perag_lock completely from this path. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: remove debug assert for per-ag reference counting	Dave Chinner	2010-10-18	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	When we start taking references per cached buffer to the perag it is cached on, it will blow the current debug maximum reference count assert out of the water. The assert has never caught a bug, and we have tracing to track changes if there ever is a problem, so just remove it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
*	xfs: reduce the number of CIL lock round trips during commit	Dave Chinner	2010-10-18	1	-105/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When committing a transaction, we do a CIL state lock round trip on every single log vector we insert into the CIL. This is resulting in the lock being as hot as the inode and dcache locks on 8-way create workloads. Rework the insertion loops to bring the number of lock round trips to one per transaction for log vectors, and one more for the busy extents. Also change the allocation of the log vector buffer not to zero it as we copy over the entire allocated buffer anyway. This patch also includes a structural cleanup to the CIL item insertion provided by Christoph Hellwig. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
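The difference between one lock round trip per log vector and one per transaction can be shown with a toy model. Nothing here is the XFS implementation: `struct cil`, the lock helpers and both insert functions are hypothetical, and the "lock" merely counts how often it is taken so the two strategies can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a commit list guarded by a single lock. */
struct cil {
    unsigned lock_round_trips;  /* number of lock/unlock pairs so far */
    size_t   nr_items;          /* items inserted into the list */
};

static inline void cil_lock(struct cil *c)
{
    assert(c != NULL);
    c->lock_round_trips++;      /* a real lock would be acquired here */
}

static inline void cil_unlock(struct cil *c)
{
    (void)c;                    /* a real lock would be released here */
}

/* Old shape: take and drop the lock for every single log vector. */
static inline void insert_per_item(struct cil *c, size_t nr_vecs)
{
    for (size_t i = 0; i < nr_vecs; i++) {
        cil_lock(c);
        c->nr_items++;
        cil_unlock(c);
    }
}

/* New shape: do all preparation without the lock, then insert every
 * vector of the transaction under one lock hold. */
static inline void insert_batched(struct cil *c, size_t nr_vecs)
{
    /* formatting of the vectors would happen here, outside the lock */
    cil_lock(c);
    for (size_t i = 0; i < nr_vecs; i++)
        c->nr_items++;
    cil_unlock(c);
}
```

For a transaction with eight log vectors the first function takes the lock eight times and the second once, while both leave the same number of items in the list.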
*	xfs: eliminate some newly-reported gcc warnings	Poyo VL	2010-10-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Ionut Gabriel Popescu <poyo_vl@yahoo.com> submitted a simple change to eliminate some "may be used uninitialized" warnings when building XFS. The reported condition seems to be something that GCC did not previously recognize or report. The warnings were produced by: gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux) Signed-off-by: Ionut Gabriel Popescu <poyo_vl@yahoo.com> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove the ->kill_root btree operation	Christoph Hellwig	2010-10-18	4	-88/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The implementations of ->kill_root only differ by either simply zeroing out the now unused buffer in the btree cursor in the inode allocation btree or using xfs_btree_setbuf in the allocation btree. Initially both of them used xfs_btree_setbuf, but the use in the ialloc btree was removed early on because it interacted badly with xfs_trans_binval. In addition to zeroing out the buffer in the cursor xfs_btree_setbuf updates the bc_ra array in the btree cursor, and calls xfs_trans_brelse on the buffer previously occupying the slot. The bc_ra update should be done for the alloc btree updated too, although the lack of it does not cause serious problems. The xfs_trans_brelse call on the other hand is effectively a no-op in the end - it keeps decrementing the bli_recur refcount until it hits zero, and then just skips out because the buffer will always be dirty at this point. So removing it for the allocation btree is just fine. So unify the code and move it to xfs_btree.c. While we're at it also replace the call to xfs_btree_setbuf with a NULL bp argument in xfs_btree_del_cursor with a direct call to xfs_trans_brelse given that the cursor is being freed just after this and the state updates are superfluous. After this xfs_btree_setbuf is only used with a non-NULL bp argument and can thus be simplified. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: stop using xfs_qm_dqtobp in xfs_qm_dqflush	Christoph Hellwig	2010-10-18	1	-88/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In xfs_qm_dqflush we know that q_blkno must be initialized already from a previous xfs_qm_dqread. So instead of calling xfs_qm_dqtobp we can simply read the quota buffer directly. This also saves us from a duplicate xfs_qm_dqcheck call and allows xfs_qm_dqtobp to be simplified now that it is always called for a newly initialized inode. In addition to that, properly unwind all locks in xfs_qm_dqflush when xfs_qm_dqcheck fails. This mirrors a similar cleanup in the inode lookup done earlier. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: simplify xfs_qm_dqusage_adjust	Christoph Hellwig	2010-10-18	1	-142/+61
\| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need to have the users and group/project quota locked at the same time. Get rid of xfs_qm_dqget_noattach and just do a xfs_qm_dqget inside xfs_qm_quotacheck_dqadjust for the quota we are operating on right now. The new version of xfs_qm_quotacheck_dqadjust holds the inode lock over its operations, which is not a problem as it simply increments counters and there is no concern about log contention during mount time. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: Introduce XFS_IOC_ZERO_RANGE	Dave Chinner	2010-10-18	6	-9/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	XFS_IOC_ZERO_RANGE is the equivalent of an atomic XFS_IOC_UNRESVSP/ XFS_IOC_RESVSP call pair. It enables ranges of written data to be turned into zeroes without requiring IO or having to free and reallocate the extents in the range given as would occur if we had to punch and then preallocate them separately. This enables applications to zero parts of files very quickly without changing the layout of the files in any way. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: use range primitives for xfs page cache operations	Dave Chinner	2010-10-18	1	-16/+15
\| \| \| \| \| \| \| \| \| \| \|	While XFS passes ranges to operate on from the core code, the functions being called ignore either the entire range or the end of the range. This is historical because when the functions were written Linux didn't have the necessary range operations. Update the functions to use the correct operations. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	Linux 2.6.36-rc8 (v2.6.36-rc8)	Linus Torvalds	2010-10-14	1	-2/+2
\|
*	Un-inline the core-dump helper functions	Linus Torvalds	2010-10-14	2	-32/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tony Luck reports that the addition of the access_ok() check in commit 0eead9ab41da ("Don't dump task struct in a.out core-dumps") broke the ia64 compile due to missing the necessary header file includes. Rather than add yet another include (<asm/unistd.h>) to make everything happy, just uninline the silly core dump helper functions and move the bodies to fs/exec.c where they make a lot more sense. dump_seek() in particular was too big to be an inline function anyway, and none of them are in any way performance-critical. And we really don't need to mess up our include file headers more than they already are. Reported-and-tested-by: Tony Luck <tony.luck@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	2010-10-14	13	-77/+104
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: ehea: Fix a checksum issue on the receive path; net: allow FEC driver to use fixed PHY support; tg3: restore rx_dropped accounting; b44: fix carrier detection on bind; net: clear heap allocations for privileged ethtool actions; NET: wimax, fix use after free; ATM: iphase, remove sleep-inside-atomic; ATM: mpc, fix use after free; ATM: solos-pci, remove use after free; net/fec: carrier off initially to avoid root mount failure; r8169: use device model DMA API; r8169: allocate with GFP_KERNEL flag when able to sleep
\| *	ehea: Fix a checksum issue on the receive path	Breno Leitao	2010-10-13	2	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we set all skbs with CHECKSUM_UNNECESSARY, even those whose protocol we don't know. This patch just adds the CHECKSUM_COMPLETE tag for non TCP/UDP packets. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: allow FEC driver to use fixed PHY support	Greg Ungerer	2010-10-13	1	-14/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At least one board using the FEC driver does not have a conventional PHY attached to it; it is directly connected to a somewhat simple ethernet switch (the board is the SnapGear/LITE, and the attached 4-port ethernet switch is a RealTek RTL8305). This switch does not present the usual register interface of a PHY; it presents nothing. So a PHY scan will find nothing - it finds IDs of 0 for each PHY on the attached MII bus. After the FEC driver was changed to use phylib for supporting PHYs it no longer works on this particular board/switch setup. Add code support to use a fixed phy if no PHY is found on the MII bus. This is based on the way the cpmac.c driver solved this same problem. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tg3: restore rx_dropped accounting	Eric Dumazet	2010-10-11	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 511d22247be7 (tg3: 64 bit stats on all arches), overlooked the rx_dropped accounting. We use a full "struct rtnl_link_stats64" to hold rx_dropped value, but forgot to report it in tg3_get_stats64(). Use an "unsigned long" instead to shrink "struct tg3" by 176 bytes, and report this value to stats readers. Increment rx_dropped counter for oversized frames. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Michael Chan <mchan@broadcom.com> CC: Matt Carlson <mcarlson@broadcom.com> Acked-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	b44: fix carrier detection on bind	Paul Fertser	2010-10-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For carrier detection to work properly when binding the driver with a cable unplugged, netif_carrier_off() should be called after register_netdev(), not before. Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: clear heap allocations for privileged ethtool actions	Kees Cook	2010-10-11	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several other ethtool functions leave heap uncleared (potentially) by drivers. Some interfaces appear safe (eeprom, etc), in that the sizes are well controlled. In some situations (e.g. unchecked error conditions), the heap will remain unchanged in areas before copying back to userspace. Note that these are less of an issue since these all require CAP_NET_ADMIN. Cc: stable@kernel.org Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	NET: wimax, fix use after free	Jiri Slaby	2010-10-11	1	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stanse found that i2400m_rx frees skb, but still uses skb->len even though it has skb_len defined. So use skb_len properly in the code. And also define it unsigned int rather than size_t to solve compilation warnings. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Cc: linux-wimax@intel.com Acked-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ATM: iphase, remove sleep-inside-atomic	Jiri Slaby	2010-10-11	2	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stanse found that ia_init_one locks a spinlock and inside of that it calls ia_start, which calls request_irq and tx_init (which does kmalloc(GFP_KERNEL)). Both of them can thus sleep and result in a deadlock. I don't see a reason to have a per-device spinlock there which is used only there and inited right before the lock location. So remove it completely. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ATM: mpc, fix use after free	Jiri Slaby	2010-10-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stanse found that mpc_push frees skb and then it dereferences it. It is a typo, new_skb should be dereferenced there. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ATM: solos-pci, remove use after free	Jiri Slaby	2010-10-11	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stanse found we do in console_show: kfree_skb(skb); return skb->len; which is not good. Fix that by remembering the len and use it in the function instead. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
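The use-after-free pattern in this message and its fix (remember the length before freeing) can be shown in a small userspace sketch. `struct pkt` and its helpers are hypothetical stand-ins for the kernel's sk_buff and kfree_skb(); only the ordering of the read and the free is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket buffer. */
struct pkt {
    size_t         len;
    unsigned char *data;
};

/* Allocate a packet with a zeroed payload of the given length. */
static inline struct pkt *pkt_alloc(size_t len)
{
    struct pkt *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->data = calloc(len ? len : 1, 1);
    if (!p->data) {
        free(p);
        return NULL;
    }
    p->len = len;
    return p;
}

static inline void pkt_free(struct pkt *p)
{
    free(p->data);
    free(p);
}

/* Consume a packet and report how many bytes it held.
 * The buggy shape would be: pkt_free(p); return p->len; */
static inline size_t pkt_consume(struct pkt *p)
{
    size_t len;

    assert(p != NULL);
    len = p->len;       /* read the length while p is still valid */
    pkt_free(p);
    return len;         /* p is not touched after the free */
}
```

Calling `pkt_consume()` on a 64-byte packet returns 64 without ever dereferencing freed memory.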
\| *	net/fec: carrier off initially to avoid root mount failure	Oskar Schirmer	2010-10-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With hardware slow in negotiation, the system froze while trying to mount root on NFS at boot time. The link state had not been initialised, so the network stack tried to start transmission right away. This caused instant retries, as the driver only reported itself busy while the link was down, rendering the system unusable. Notify carrier off initially to prevent transmission until phylib reports link up. Signed-off-by: Oskar Schirmer <oskar@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	r8169: use device model DMA API	Stanislaw Gruszka	2010-10-09	1	-24/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use DMA API as PCI equivalents will be deprecated. This change also allows allocating with GFP_KERNEL where possible. Tested-by: Neal Becker <ndbecker2@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	r8169: allocate with GFP_KERNEL flag when able to sleep	Stanislaw Gruszka	2010-10-09	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a Fedora bug report where the driver fails to initialize after suspend/resume because of memory allocation errors: https://bugzilla.redhat.com/show_bug.cgi?id=629158 To fix this, use GFP_KERNEL allocation where possible. Tested-by: Neal Becker <ndbecker2@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>