summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* vfs: Move waiting for inode writeback from end_writeback() to evict_inode()Jan Kara2012-05-061-1/+2
| | | | | | | | | | | | | | | | | | | Currently, I_SYNC can never be set when evict_inode() (and thus end_writeback()) is called because flusher thread holds inode reference while inode is under writeback. As a result inode_sync_wait() in those places currently does nothing. However that is going to change and unveils problems with calling inode_sync_wait() from end_writeback(). Several filesystems call end_writeback() after they have deleted the inode (btrfs, gfs2, ...) and other filesystems (ext3, ext4, reiserfs, ...) can deadlock when waiting for I_SYNC because they call end_writeback() from within a transaction. To avoid these issues, we move inode_sync_wait() into evict_inode() before calling ->evict_inode(). That way we preserve the current property that ->evict_inode() and writeback never run in parallel and all filesystems are safe. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: Refactor writeback_single_inode()Jan Kara2012-05-061-56/+86
| | | | | | | | | | | | | | | | | | | The code in writeback_single_inode() is relatively complex. The list requeing logic makes sense only for flusher thread but not really for sync_inode() or write_inode_now() callers. Also when we want to get rid of inode references held by flusher thread, we will need a special I_SYNC handling there. So separate part of writeback_single_inode() which does the real writeback work into __writeback_single_inode() and make writeback_single_inode() do only stuff necessary for callers writing only one inode, moving the special list handling into writeback_sb_inodes(). As a sideeffect this fixes a possible race where we could skip some inode during sync(2) because other writer refiled it from b_io to b_dirty list. Also I_SYNC handling is moved into the callers of __writeback_single_inode() to make locking easier. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: Remove wb->list_lock from writeback_single_inode()Jan Kara2012-05-061-10/+7
| | | | | | | | | | | | | | | | writeback_single_inode() doesn't need wb->list_lock for anything on entry now. So remove the requirement. This makes locking of writeback_single_inode() temporarily awkward (entering with i_lock, returning with i_lock and wb->list_lock) but it will be sanitized in the next patch. Also inode_wait_for_writeback() doesn't need wb->list_lock for anything. It was just taking it to make usage convenient for callers but with writeback_single_inode() changing it's not very convenient anymore. So remove the lock from that function. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: Separate inode requeueing after writebackJan Kara2012-05-061-47/+55
| | | | | | | | | Move inode requeueing after inode has been written out into a separate function. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: Move I_DIRTY_PAGES handlingJan Kara2012-05-061-2/+3
| | | | | | | | | | | | | | Instead of clearing I_DIRTY_PAGES and resetting it when we didn't succeed in writing them all, just clear the bit only when we succeeded writing all the pages. We also move the clearing of the bit close to other i_state handling to separate it from writeback list handling. This is desirable because list handling will differ for flusher thread and other writeback_single_inode() callers in future. No filesystem plays any tricks with I_DIRTY_PAGES (like checking it in ->writepages or ->write_inode implementation) so this movement is safe. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()Jan Kara2012-05-062-21/+45
| | | | | | | | | | | | | | When writeback_single_inode() is called on inode which has I_SYNC already set while doing WB_SYNC_NONE, inode is moved to b_more_io list. However this makes sense only if the caller is flusher thread. For other callers of writeback_single_inode() it doesn't really make sense and may be even wrong - flusher thread may be doing WB_SYNC_ALL writeback in parallel. So we move requeueing from writeback_single_inode() to writeback_sb_inodes(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: Move clearing of I_SYNC into inode_sync_complete()Jan Kara2012-05-061-6/+2
| | | | | | | | | | | | Move clearing of I_SYNC into inode_sync_complete(). It is more logical to have clearing of I_SYNC bit and waking of waiters in one place. Also later we will have two places needing to clear I_SYNC and wake up waiters so this allows them to use the common helper. Moving of I_SYNC clearing to a later stage of writeback_single_inode() is safe since we hold i_lock all the time. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* writeback: initialize global_dirty_limitFengguang Wu2012-05-061-0/+1
| | | | | | | | | | | This prevents global_dirty_limit from remaining 0 (the initial value) for long time, since it's only updated in update_dirty_limit() when above the dirty freerun area. It will avoid unexpected consequences when some random code use it as a convenient approximation of the global dirty threshold. Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* fs: remove 8 bytes of padding from struct writeback_control on 64 bit buildsRichard Kennedy2012-04-251-1/+2
| | | | | | | | | | | Reorder structure writeback_control to remove 8 bytes of padding on 64 bit builds, this shrinks its size from 48 to 40 bytes. This structure is always on the stack and uses C99 named initialisation, so should be safe and have a small impact on stack usage. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* mm: page-writeback.c: local functions should not be exposed globallyH Hartley Sweeten2012-04-141-1/+1
| | | | | | | | | | | | The function global_dirtyable_memory is only referenced in this file and should be marked static to prevent it from being exposed globally. This quiets the sparse warning: warning: symbol 'global_dirtyable_memory' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* Merge branch 'systemh-fixes' of ↵Linus Torvalds2012-04-136-183/+223
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull system.h fixups for less common arch's from Paul Gortmaker: "Here is what is hopefully the last of the system.h related fixups. The fixes for Alpha and ia64 are code relocations consistent with what was done for the more mainstream architectures. Note that the diffstat lines removed vs lines added are not the same since I've fixed some of the whitespace issues in the relocated code blocks. However they are functionally the same. Compile tested locally, plus these two have been in linux-next for a while. There is also a trivial one line system.h related fix for the Tilera arch from Chris Metcalf to fix an implict include.." * 'systemh-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: irq_work: fix compile failure on tile from missing include ia64: populate the cmpxchg header with appropriate code alpha: fix build failures from system.h dismemberment
| * irq_work: fix compile failure on tile from missing includeChris Metcalf2012-04-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Building with IRQ_WORK configured results in kernel/irq_work.c: In function ‘irq_work_run’: kernel/irq_work.c:110: error: implicit declaration of function ‘irqs_disabled’ The appropriate header just needs to be included. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * ia64: populate the cmpxchg header with appropriate codePaul Gortmaker2012-04-132-114/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 93f378883cecb9dcb2cf5b51d9d24175906659da "Fix ia64 build errors (fallout from system.h disintegration)" introduced arch/ia64/include/asm/cmpxchg.h as a temporary build fix and stated: "... leave the migration of xchg() and cmpxchg() to this new header file for a future patch." Migrate the appropriate chunks from asm/intrinsics.h and fix the whitespace issues in the migrated chunk. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: David Howells <dhowells@redhat.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * alpha: fix build failures from system.h dismembermentPaul Gortmaker2012-04-133-69/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ec2212088c42ff7d1362629ec26dda4f3e8bdad3 "Disintegrate asm/system.h for Alpha" combined with commit b4816afa3986704d1404fc48e931da5135820472 "Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h" introduced the concept of asm/cmpxchg.h but the alpha arch never got one. Fork the cmpxchg content out of the asm/atomic.h file to create one. Some minor whitespace fixups were done on the block of code that created the new file. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: David Howells <dhowells@redhat.com> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* | Merge tag 'fbdev-fixes-for-3.4-1' of git://github.com/schandinat/linux-2.6Linus Torvalds2012-04-134-193/+201
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull fbdev fixes from Florian Tobias Schandinat: - a compile fix for au1*fb - a fix to make kyrofb usable on x86_64 - a fix for uvesafb to prevent an oops due to NX-protection "The fix for kyrofb is a bit large but it's just replacing "unsigned long" by "u32" for 64 bit compatibility." * tag 'fbdev-fixes-for-3.4-1' of git://github.com/schandinat/linux-2.6: video:uvesafb: Fix oops that uvesafb try to execute NX-protected page fbdev: fix au1*fb builds kyrofb: fix on x86_64
| * | video:uvesafb: Fix oops that uvesafb try to execute NX-protected pageWang YanQing2012-04-091-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fix the oops below that catched in my machine [ 81.560602] uvesafb: NVIDIA Corporation, GT216 Board - 0696a290, Chip Rev , OEM: NVIDIA, VBE v3.0 [ 81.609384] uvesafb: protected mode interface info at c000:d350 [ 81.609388] uvesafb: pmi: set display start = c00cd3b3, set palette = c00cd40e [ 81.609390] uvesafb: pmi: ports = 3b4 3b5 3ba 3c0 3c1 3c4 3c5 3c6 3c7 3c8 3c9 3cc 3ce 3cf 3d0 3d1 3d2 3d3 3d4 3d5 3da [ 81.614558] uvesafb: VBIOS/hardware doesn't support DDC transfers [ 81.614562] uvesafb: no monitor limits have been set, default refresh rate will be used [ 81.614994] uvesafb: scrolling: ypan using protected mode interface, yres_virtual=4915 [ 81.744147] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ 81.744153] BUG: unable to handle kernel paging request at c00cd3b3 [ 81.744159] IP: [<c00cd3b3>] 0xc00cd3b2 [ 81.744167] *pdpt = 00000000016d6001 *pde = 0000000001c7b067 *pte = 80000000000cd163 [ 81.744171] Oops: 0011 [#1] SMP [ 81.744174] Modules linked in: uvesafb(+) cfbcopyarea cfbimgblt cfbfillrect [ 81.744178] [ 81.744181] Pid: 3497, comm: modprobe Not tainted 3.3.0-rc4NX+ #71 Acer Aspire 4741 /Aspire 4741 [ 81.744185] EIP: 0060:[<c00cd3b3>] EFLAGS: 00010246 CPU: 0 [ 81.744187] EIP is at 0xc00cd3b3 [ 81.744189] EAX: 00004f07 EBX: 00000000 ECX: 00000000 EDX: 00000000 [ 81.744191] ESI: f763f000 EDI: f763f6e8 EBP: f57f3a0c ESP: f57f3a00 [ 81.744192] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 81.744195] Process modprobe (pid: 3497, ti=f57f2000 task=f748c600 task.ti=f57f2000) [ 81.744196] Stack: [ 81.744197] f82512c5 f759341c 00000000 f57f3a30 c124a9bc 00000001 00000001 000001e0 [ 81.744202] f8251280 f763f000 f7593400 00000000 f57f3a40 c12598dd f5c0c000 00000000 [ 81.744206] f57f3b10 c1255efe c125a21a 00000006 f763f09c 00000000 c1c6cb60 f7593400 [ 81.744210] Call Trace: [ 81.744215] [<f82512c5>] ? uvesafb_pan_display+0x45/0x60 [uvesafb] [ 81.744222] [<c124a9bc>] fb_pan_display+0x10c/0x160 [ 81.744226] [<f8251280>] ? uvesafb_vbe_find_mode+0x180/0x180 [uvesafb] [ 81.744230] [<c12598dd>] bit_update_start+0x1d/0x50 [ 81.744232] [<c1255efe>] fbcon_switch+0x39e/0x550 [ 81.744235] [<c125a21a>] ? bit_cursor+0x4ea/0x560 [ 81.744240] [<c129b6cb>] redraw_screen+0x12b/0x220 [ 81.744245] [<c128843b>] ? tty_do_resize+0x3b/0xc0 [ 81.744247] [<c129ef42>] vc_do_resize+0x3d2/0x3e0 [ 81.744250] [<c129efb4>] vc_resize+0x14/0x20 [ 81.744253] [<c12586bd>] fbcon_init+0x29d/0x500 [ 81.744255] [<c12984c4>] ? set_inverse_trans_unicode+0xe4/0x110 [ 81.744258] [<c129b378>] visual_init+0xb8/0x150 [ 81.744261] [<c129c16c>] bind_con_driver+0x16c/0x360 [ 81.744264] [<c129b47e>] ? register_con_driver+0x6e/0x190 [ 81.744267] [<c129c3a1>] take_over_console+0x41/0x50 [ 81.744269] [<c1257b7a>] fbcon_takeover+0x6a/0xd0 [ 81.744272] [<c12594b8>] fbcon_event_notify+0x758/0x790 [ 81.744277] [<c10929e2>] notifier_call_chain+0x42/0xb0 [ 81.744280] [<c1092d30>] __blocking_notifier_call_chain+0x60/0x90 [ 81.744283] [<c1092d7a>] blocking_notifier_call_chain+0x1a/0x20 [ 81.744285] [<c124a5a1>] fb_notifier_call_chain+0x11/0x20 [ 81.744288] [<c124b759>] register_framebuffer+0x1d9/0x2b0 [ 81.744293] [<c1061c73>] ? ioremap_wc+0x33/0x40 [ 81.744298] [<f82537c6>] uvesafb_probe+0xaba/0xc40 [uvesafb] [ 81.744302] [<c12bb81f>] platform_drv_probe+0xf/0x20 [ 81.744306] [<c12ba558>] driver_probe_device+0x68/0x170 [ 81.744309] [<c12ba731>] __device_attach+0x41/0x50 [ 81.744313] [<c12b9088>] bus_for_each_drv+0x48/0x70 [ 81.744316] [<c12ba7f3>] device_attach+0x83/0xa0 [ 81.744319] [<c12ba6f0>] ? __driver_attach+0x90/0x90 [ 81.744321] [<c12b991f>] bus_probe_device+0x6f/0x90 [ 81.744324] [<c12b8a45>] device_add+0x5e5/0x680 [ 81.744329] [<c122a1a3>] ? kvasprintf+0x43/0x60 [ 81.744332] [<c121e6e4>] ? kobject_set_name_vargs+0x64/0x70 [ 81.744335] [<c121e6e4>] ? kobject_set_name_vargs+0x64/0x70 [ 81.744339] [<c12bbe9f>] platform_device_add+0xff/0x1b0 [ 81.744343] [<f8252906>] uvesafb_init+0x50/0x9b [uvesafb] [ 81.744346] [<c100111f>] do_one_initcall+0x2f/0x170 [ 81.744350] [<f82528b6>] ? uvesafb_is_valid_mode+0x66/0x66 [uvesafb] [ 81.744355] [<c10c6994>] sys_init_module+0xf4/0x1410 [ 81.744359] [<c1157fc0>] ? vfsmount_lock_local_unlock_cpu+0x30/0x30 [ 81.744363] [<c144cb10>] sysenter_do_call+0x12/0x36 [ 81.744365] Code: f5 00 00 00 32 f6 66 8b da 66 d1 e3 66 ba d4 03 8a e3 b0 1c 66 ef b0 1e 66 ef 8a e7 b0 1d 66 ef b0 1f 66 ef e8 fa 00 00 00 61 c3 <60> e8 c8 00 00 00 66 8b f3 66 8b da 66 ba d4 03 b0 0c 8a e5 66 [ 81.744388] EIP: [<c00cd3b3>] 0xc00cd3b3 SS:ESP 0068:f57f3a00 [ 81.744391] CR2: 00000000c00cd3b3 [ 81.744393] ---[ end trace 18b2c87c925b54d6 ]--- Signed-off-by: Wang YanQing <udknight@gmail.com> Cc: Michal Januszewski <spock@gentoo.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: stable@vger.kernel.org
| * | fbdev: fix au1*fb buildsManuel Lauss2012-04-082-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1c16697bf9d5b206cb0d2b905a54de5e077296be ("drivers/video/au*fb.c: use devm_ functions) introduced 2 build failures in the au1100fb and au1200fb drivers, fix them. Signed-off-by: Manuel Lauss <manuel.lauss@googlemail.com> Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
| * | kyrofb: fix on x86_64Ondrej Zary2012-04-081-188/+188
| |/ | | | | | | | | | | | | | | | | | | kyrofb is completely broken on x86_64 because the registers are defined as unsigned long. Change them to u32 to make the driver work. Tested with Hercules 3D Prophet 4000XT. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
* | Merge branch 'for-linus-min' of ↵Linus Torvalds2012-04-137-21/+40
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull the minimal btrfs branch from Chris Mason: "We have a use-after-free in there, along with errors when mount -o discard is enabled, and a BUG_ON(we should compile with UP more often)." * 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: use commit root when loading free space cache Btrfs: fix use-after-free in __btrfs_end_transaction Btrfs: check return value of bio_alloc() properly Btrfs: remove lock assert from get_restripe_target() Btrfs: fix eof while discarding extents Btrfs: fix uninit variable in repair_eb_io_failure Revert "Btrfs: increase the global block reserve estimates"
| * | Btrfs: use commit root when loading free space cacheJosef Bacik2012-04-122-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A user reported that booting his box up with btrfs root on 3.4 was way slower than on 3.3 because I removed the ideal caching code. It turns out that we don't load the free space cache if we're in a commit for deadlock reasons, but since we're reading the cache and it hasn't changed yet we are safe reading the inode and free space item from the commit root, so do that and remove all of the deadlock checks so we don't unnecessarily skip loading the free space cache. The user reported this fixed the slowness. Thanks, Tested-by: Calvin Walton <calvin.walton@kepstin.ca> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: fix use-after-free in __btrfs_end_transactionDave Jones2012-04-121-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 49b25e0540904be0bf558b84475c69d72e4de66e introduced a use-after-free bug that caused spurious -EIO's to be returned. Do the check before we free the transaction. Cc: David Sterba <dsterba@suse.cz> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: check return value of bio_alloc() properlyTsutomu Itoh2012-04-123-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | bio_alloc() has the possibility of returning NULL. So, it is necessary to check the return value. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: remove lock assert from get_restripe_target()Ilya Dryomov2012-04-121-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a regression introduced by fc67c450. spin_is_locked() always returns 0 on UP kernels, which caused assert in get_restripe_target() to be fired on every call from btrfs_reduce_alloc_profile() on UP systems. Remove it completely for now, it's not clear if it's going to be needed in future. Reported-by: Bobby Powers <bobbypowers@gmail.com> Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org> Tested-by: Mitch Harder <mitch.harder@sabayonlinux.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: fix eof while discarding extentsLiu Bo2012-04-121-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | We miscalculate the length of extents we're discarding, and it leads to an eof of device. Reported-by: Daniel Blueman <daniel@quora.org> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: fix uninit variable in repair_eb_io_failureChris Mason2012-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | We'd have to be passing bogus extent buffers for this uninit variable to actually be used, but set it to zero just in case. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Revert "Btrfs: increase the global block reserve estimates"Chris Mason2012-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf. We've had a number of complaints of early enospc that bisect down to this patch. We'll hae to fix the reservations differently. CC: stable@kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | | Merge branch 'for-3.4/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2012-04-1310-332/+796
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver bits from Jens Axboe: - A series of fixes for mtip32xx. Most from Asai at Micron, but also one from Greg, getting rid of the dependency on PCIE_HOTPLUG. - A few bug fixes for xen-blkfront, and blkback. - A virtio-blk fix for Vivek, making resize actually work. - Two fixes from Stephen, making larger transfers possible on cciss. This is needed for tape drive support. * 'for-3.4/drivers' of git://git.kernel.dk/linux-block: block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy mtip32xx: dump tagmap on failure mtip32xx: fix handling of commands in various scenarios mtip32xx: Shorten macro names mtip32xx: misc changes mtip32xx: Add new sysfs entry 'status' mtip32xx: make setting comp_time as common mtip32xx: Add new bitwise flag 'dd_flag' mtip32xx: fix error handling in mtip_init() virtio-blk: Call revalidate_disk() upon online disk resize xen/blkback: Make optional features be really optional. xen/blkback: Squash the discard support for 'file' and 'phy' type. mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port() cciss: Fix scsi tape io with more than 255 scatter gather elements cciss: Initialize scsi host max_sectors for tape drive support xen-blkfront: make blkif_io_lock spinlock per-device xen/blkfront: don't put bdev right after getting it xen-blkfront: use bitmap_set() and bitmap_clear() xen/blkback: Enable blkback on HVM guests xen/blkback: use grant-table.c hypercall wrappers
| * | | block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancyGreg Kroah-Hartman2012-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This removes the HOTPLUG_PCI_PCIE dependency on the driver and makes it depend on PCI. Cc: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: dump tagmap on failureAsai Thambi S P2012-04-091-29/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Dump tagmap on failure, instead of individual tags. Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: fix handling of commands in various scenariosAsai Thambi S P2012-04-092-51/+132
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * If a ncq command time out and a non-ncq command is active, skip restart port * Queue(pause) ncq commands during operations spanning more than one non-ncq commands - secure erase, download microcode * When a non-ncq command is active, allow incoming non-ncq commands to wait instead of failing back * Changed timeout for download microcode and smart commands * If the device in write protect mode, fail all writes (do not send to device) * Set maximum retries to 2 Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: Shorten macro namesAsai Thambi S P2012-04-092-77/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Shortened macros used to represent mtip_port->flags and dd->dd_flag Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: misc changesAsai Thambi S P2012-04-091-22/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Handle the interrupt completion of polled internal commands * Do not check remove pending flag for standby command * On rebuild failure, - set corresponding bit dd_flag - do not send standby command * Free ida index in remove path Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: Add new sysfs entry 'status'Asai Thambi S P2012-04-093-29/+331
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add support for detecting the following device status - write protect - over temp (thermal shutdown) * Add new sysfs entry 'status', possible values - online, write_protect, thermal_shutdown * Add new file 'sysfs-block-rssd' to document ABI (Reported-by: Greg Kroah-Hartman) Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: make setting comp_time as commonAsai Thambi S P2012-04-091-11/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Moved setting completion time into mtip_issue_ncq_command() Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: Add new bitwise flag 'dd_flag'Asai Thambi S P2012-04-092-55/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Merged the following flags into one variable 'dd_flag': * drv_cleanup_done * resumeflag * Added the following flags into 'dd_flag' * remove pending * init done * Removed 'ftlrebuildflag' (similar flag is already part of mti_port->flags) Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | mtip32xx: fix error handling in mtip_init()Ryosuke Saito2012-04-051-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that block device is properly unregistered, if pci_register_driver() fails. Signed-off-by: Ryosuke Saito <raitosyo@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | virtio-blk: Call revalidate_disk() upon online disk resizeVivek Goyal2012-03-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a virtio disk is open in guest and a disk resize operation is done, (virsh blockresize), new size is not visible to tools like "fdisk -l". This seems to be happening as we update only part->nr_sects and not bdev->bd_inode size. Call revalidate_disk() which should take care of it. I tested growing disk size of already open disk and it works for me. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | Merge branch 'stable/for-jens-3.4-bugfixes' of ↵Jens Axboe2012-03-263-74/+40
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.4/drivers Konrad writes: I've two small fixes for the xen-blkback - and I think one more will show up eventually (a partial revert), but not sure when. So in the spirit of keeping the patches flowing, please git pull the following branch.
| | * | | xen/blkback: Make optional features be really optional.Konrad Rzeszutek Wilk2012-03-241-21/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They were using the xenbus_dev_fatal() function which would change the state of the connection immediately. Which is not what we want when we advertise optional features. So make 'feature-discard','feature-barrier','feature-flush-cache' optional. Suggested-by: Jan Beulich <JBeulich@suse.com> [v1: Made the discard function void and static] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * | | xen/blkback: Squash the discard support for 'file' and 'phy' type.Konrad Rzeszutek Wilk2012-03-243-61/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only reason for the distinction was for the special case of 'file' (which is assumed to be loopback device), was to reach inside the loopback device, find the underlaying file, and call fallocate on it. Fortunately "xen-blkback: convert hole punching to discard request on loop devices" removes that use-case and we now based the discard support based on blk_queue_discard(q) and extract all appropriate parameters from the 'struct request_queue'. CC: Li Dongyang <lidongyang@novell.com> Acked-by: Jan Beulich <JBeulich@suse.com> [v1: Dropping pointless initializer and keeping blank line] [v2: Remove the kfree as it is not used anymore] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | | | mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize ↵Asai Thambi S P2012-03-231-11/+8
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and start port in mtip_restart_port() This patch includes two changes: * fix incorrect value set for drv_cleanup_done * re-initialize and start port in mtip_restart_port() Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | cciss: Fix scsi tape io with more than 255 scatter gather elementsStephen M. Cameron2012-03-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The total number of scatter gather elements in the CISS command used by the scsi tape code was being cast to a u8, which can hold at most 255 scatter gather elements. It should have been cast to a u16. Without this patch the command gets rejected by the controller since the total scatter gather count did not add up to the right value resulting in an i/o error. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | cciss: Initialize scsi host max_sectors for tape drive supportStephen M. Cameron2012-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default is too small (1024 blocks), use h->cciss_max_sectors (8192 blocks) Without this change, if you try to set the block size of a tape drive above 512*1024, via "mt -f /dev/st0 setblk nnn" where nnn is greater than 524288, it won't work right. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | xen-blkfront: make blkif_io_lock spinlock per-deviceSteven Noonan2012-03-201-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the global blkif_io_lock to the per-device structure. The spinlock seems to exists for two reasons: to disable IRQs when in the interrupt handlers for blkfront, and to protect the blkfront VBDs when a detachment is requested. Having a global blkif_io_lock doesn't make sense given the use case, and it drastically hinders performance due to contention. All VBDs with pending IOs have to take the lock in order to get work done, which serializes everything pretty badly. Signed-off-by: Steven Noonan <snoonan@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | | xen/blkfront: don't put bdev right after getting itAndrew Jones2012-03-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should hang onto bdev until we're done with it. Signed-off-by: Andrew Jones <drjones@redhat.com> [v1: Fixed up git commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | | xen-blkfront: use bitmap_set() and bitmap_clear()Akinobu Mita2012-03-201-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use bitmap_set and bitmap_clear rather than modifying individual bits in a memory region. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | | xen/blkback: Enable blkback on HVM guestsDaniel De Graaf2012-03-201-1/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | | xen/blkback: use grant-table.c hypercall wrappersDaniel De Graaf2012-03-201-25/+4
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | | Merge branch 'for-3.4/core' of git://git.kernel.dk/linux-blockLinus Torvalds2012-04-135-16/+27
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block core bits from Jens Axboe: "It's a nice and quiet round this time, since most of the tricky stuff has been pushed to 3.5 to give it more time to mature. After a few hectic block IO core changes for 3.3 and 3.2, I'm quite happy with a slow round. Really minor stuff in here, the only real functional change is making the auto-unplug threshold a per-queue entity. The threshold is set so that it's low enough that we don't hold off IO for too long, but still big enough to get a nice benefit from the batched insert (and hence queue lock cost reduction). For raid configurations, this currently breaks down." * 'for-3.4/core' of git://git.kernel.dk/linux-block: block: make auto block plug flush threshold per-disk based Documentation: Add sysfs ABI change for cfq's target latency. block: Make cfq_target_latency tunable through sysfs. block: use lockdep_assert_held for queue locking block: blk_alloc_queue_node(): use caller's GFP flags instead of GFP_KERNEL
| * | | | block: make auto block plug flush threshold per-disk basedShaohua Li2012-04-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do auto block plug flush to reduce latency, the threshold is 16 requests. This works well if the task is accessing one or two drives. The problem is if the task is accessing a raid 0 device and the raid disk number is big, say 8 or 16, 16/8 = 2 or 16/16=1, we will have heavy lock contention. This patch makes the threshold per-disk based. The latency should be still ok accessing one or two drives. The setup with application accessing a lot of drives in the meantime uaually is big machine, avoiding lock contention is more important, because any contention will actually increase latency. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>