summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'driver-core-3.11-rc2' of ↵Linus Torvalds2013-07-181-22/+48
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches from Greg KH: "Here are some driver core patches for 3.11-rc2. They aren't really bugfixes, but a bunch of new helper macros for drivers to properly create attribute groups, which drivers and subsystems need to fix up a ton of race issues with incorrectly creating sysfs files (binary and normal) after userspace has been told that the device is present. Also here is the ability to create binary files as attribute groups, to solve that race condition, which was impossible to do before this, so that's my fault the drivers were broken. The majority of the .c changes is indenting and moving code around a bit. It affects no existing code, but allows the large backlog of 70+ patches that I already have created to start flowing into the different subtrees, instead of having to live in my driver-core tree, causing merge nightmares in linux-next for the next few months. These were finalized too late for the -rc1 merge window, which is why they were didn't make that pull request, testing and review from others didn't happen until a few weeks ago, and then there's the whole distraction of the past few days, which prevented these from getting to you sooner, sorry about that. Oh, and there's a bugfix for the documentation build warning in here as well. All of these have been in linux-next this week, with no reported problems" * tag 'driver-core-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver-core: fix new kernel-doc warning in base/platform.c sysfs: use file mode defines from stat.h sysfs: add more helper macro's for (bin_)attribute(_groups) driver core: add default groups to struct class driver core: Introduce device_create_groups sysfs: prevent warning when only using binary attributes sysfs: add support for binary attributes in groups driver core: device.h: add RW and RO attribute macros sysfs.h: add BIN_ATTR macro sysfs.h: add ATTRIBUTE_GROUPS() macro sysfs.h: add __ATTR_RW() macro
| * sysfs: prevent warning when only using binary attributesOliver Schinagl2013-07-161-2/+2
| | | | | | | | | | | | | | | | When only using bin_attrs instead of attrs the kernel prints a warning and refuses to create the sysfs entry. This fixes that. Signed-off-by: Oliver Schinagl <oliver@schinagl.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * sysfs: add support for binary attributes in groupsGreg Kroah-Hartman2013-07-161-20/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | groups should be able to support binary attributes, just like it supports "normal" attributes. This lets us only handle one type of structure, groups, throughout the driver core and subsystems, making binary attributes a "full fledged" part of the driver model, and not something just "tacked on". Reported-by: Oliver Schinagl <oliver@schinagl.nl> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2013-07-174-8/+12
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd bugfixes from Bruce Fields: "Just three minor bugfixes" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: svcrdma: underflow issue in decode_write_list() nfsd4: fix minorversion support interface lockd: protect nlm_blocked access in nlmsvc_retry_blocked
| * | nfsd4: fix minorversion support interfaceJ. Bruce Fields2013-07-123-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | You can turn on or off support for minorversions using e.g. echo "-4.2" >/proc/fs/nfsd/versions However, the current implementation is a little wonky. For example, the above will turn off 4.2 support, but it will also turn *on* 4.1 support. This didn't matter as long as we only had 2 minorversions, which was true till very recently. And do a little cleanup here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | lockd: protect nlm_blocked access in nlmsvc_retry_blockedDavid Jeffery2013-07-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring the pointer of the first entry is unprotected by any lock. This allows a rare race condition when there is only one entry on the list. A function such as nlmsvc_grant_callback() can be called, which will temporarily remove the entry from the list. Between the list_empty() and list_entry(),the list may become empty, causing an invalid pointer to be used as an nlm_block, leading to a possible crash. This patch adds the nlm_block_lock around these calls to prevent concurrent use of the nlm_blocked list. This was a regression introduced by f904be9cc77f361d37d71468b13ff3d1a1823dea "lockd: Mostly remove BKL from the server". Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2013-07-146-47/+51
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "Various regression and bug fixes for ext4" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: don't allow ext4_free_blocks() to fail due to ENOMEM ext4: fix spelling errors and a comment in extent_status tree ext4: rate limit printk in buffer_io_error() ext4: don't show usrquota/grpquota twice in /proc/mounts ext4: fix warning in ext4_evict_inode() ext4: fix ext4_get_group_number() ext4: silence warning in ext4_writepages()
| * | ext4: don't allow ext4_free_blocks() to fail due to ENOMEMTheodore Ts'o2013-07-131-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The filesystem should not be marked inconsistent if ext4_free_blocks() is not able to allocate memory. Unfortunately some callers (most notably ext4_truncate) don't have a way to reflect an error back up to the VFS. And even if we did, most userspace applications won't deal with most system calls returning ENOMEM anyway. Reported-by: Nagachandra P <nagachandra@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | ext4: fix spelling errors and a comment in extent_status treeTheodore Ts'o2013-07-132-16/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace "assertation" with "assertion" in lots and lots of debugging messages. Correct the comment stating when ext4_es_insert_extent() is used. It was no doubt tree at one point, but it is no longer true... Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Zheng Liu <gnehzuil.liu@gmail.com>
| * | ext4: rate limit printk in buffer_io_error()Anatol Pomozov2013-07-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are a lot of outstanding buffered IOs when a device is taken offline (due to hardware errors etc), ext4_end_bio prints out a message for each failed logical block. While this is desirable, we see thousands of such lines being printed out before the serial console gets overwhelmed, causing ext4_end_bio() wait for the printk to complete. This in itself isn't a disaster, except for the detail that this function is being called with the queue lock held. This causes any other function in the block layer to spin on its spin_lock_irqsave while the serial console is draining. If NMI watchdog is enabled on this machine then it eventually comes along and shoots the machine in the head. The end result is that losing any one disk causes the machine to go down. This patch rate limits the printk to bandaid around the problem. Tested: xfstests Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8 Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: don't show usrquota/grpquota twice in /proc/mountsTheodore Ts'o2013-07-111-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now print mount options in a generic fashion in ext4_show_options(), so we shouldn't be explicitly printing the {usr,grp}quota options in ext4_show_quota_options(). Without this patch, /proc/mounts can look like this: /dev/vdb /vdb ext4 rw,relatime,quota,usrquota,data=ordered,usrquota 0 0 ^^^^^^^^ ^^^^^^^^ Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | ext4: fix warning in ext4_evict_inode()Jan Kara2013-07-101-13/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following race can lead to ext4_evict_inode() seeing i_ioend_count > 0 and thus triggering a sanity check warning: CPU1 CPU2 ext4_end_bio() ext4_evict_inode() ext4_finish_bio() end_page_writeback(); truncate_inode_pages() evict page WARN_ON(i_ioend_count > 0); ext4_put_io_end_defer() ext4_release_io_end() dec i_ioend_count This is possible use-after-free bug since we decrement i_ioend_count in possibly released inode. Since i_ioend_count is used only for sanity checks one possible solution would be to just remove it but for now I'd like to keep those sanity checks to help debugging the new ext4 writeback code. This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before ext4_finish_bio() in the shortcut case when unwritten extent conversion isn't needed. In that case we don't need the io_end so we are safe to drop it early. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: fix ext4_get_group_number()Theodore Ts'o2013-07-052-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function ext4_get_group_number() was introduced as an optimization in commit bd86298e60b8. Unfortunately, this commit incorrectly calculate the group number for file systems with a 1k block size (when s_first_data_block is 1 instead of zero). This could cause the following kernel BUG: [ 568.877799] ------------[ cut here ]------------ [ 568.877833] kernel BUG at fs/ext4/mballoc.c:3728! [ 568.877840] Oops: Exception in kernel mode, sig: 5 [#1] [ 568.877845] SMP NR_CPUS=32 NUMA pSeries [ 568.877852] Modules linked in: binfmt_misc [ 568.877861] CPU: 1 PID: 3516 Comm: fs_mark Not tainted 3.10.0-03216-g7c6809f-dirty #1 [ 568.877867] task: c0000001fb0b8000 ti: c0000001fa954000 task.ti: c0000001fa954000 [ 568.877873] NIP: c0000000002f42a4 LR: c0000000002f4274 CTR: c000000000317ef8 [ 568.877879] REGS: c0000001fa956ed0 TRAP: 0700 Not tainted (3.10.0-03216-g7c6809f-dirty) [ 568.877884] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 24000428 XER: 00000000 [ 568.877902] SOFTE: 1 [ 568.877905] CFAR: c0000000002b5464 [ 568.877908] GPR00: 0000000000000001 c0000001fa957150 c000000000c6a408 c0000001fb588000 GPR04: 0000000000003fff c0000001fa9571c0 c0000001fa9571c4 000138098c50625f GPR08: 1301200000000000 0000000000000002 0000000000000001 0000000000000000 GPR12: 0000000024000422 c00000000f33a300 0000000000008000 c0000001fa9577f0 GPR16: c0000001fb7d0100 c000000000c29190 c0000000007f46e8 c000000000a14672 GPR20: 0000000000000001 0000000000000008 ffffffffffffffff 0000000000000000 GPR24: 0000000000000100 c0000001fa957278 c0000001fdb2bc78 c0000001fa957288 GPR28: 0000000000100100 c0000001fa957288 c0000001fb588000 c0000001fdb2bd10 [ 568.877993] NIP [c0000000002f42a4] .ext4_mb_release_group_pa+0xec/0x1c0 [ 568.877999] LR [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 [ 568.878004] Call Trace: [ 568.878008] [c0000001fa957150] [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 (unreliable) [ 568.878017] [c0000001fa957200] [c0000000002fb070] .ext4_mb_discard_lg_preallocations+0x394/0x444 [ 568.878025] [c0000001fa957340] [c0000000002fb45c] .ext4_mb_release_context+0x33c/0x734 [ 568.878032] [c0000001fa957440] [c0000000002fbcf8] .ext4_mb_new_blocks+0x4a4/0x5f4 [ 568.878039] [c0000001fa957510] [c0000000002ef56c] .ext4_ext_map_blocks+0xc28/0x1178 [ 568.878047] [c0000001fa957640] [c0000000002c1a94] .ext4_map_blocks+0x2c8/0x490 [ 568.878054] [c0000001fa957730] [c0000000002c536c] .ext4_writepages+0x738/0xc60 [ 568.878062] [c0000001fa957950] [c000000000168a78] .do_writepages+0x5c/0x80 [ 568.878069] [c0000001fa9579d0] [c00000000015d1c4] .__filemap_fdatawrite_range+0x88/0xb0 [ 568.878078] [c0000001fa957aa0] [c00000000015d23c] .filemap_write_and_wait_range+0x50/0xfc [ 568.878085] [c0000001fa957b30] [c0000000002b8edc] .ext4_sync_file+0x220/0x3c4 [ 568.878092] [c0000001fa957be0] [c0000000001f849c] .vfs_fsync_range+0x64/0x80 [ 568.878098] [c0000001fa957c70] [c0000000001f84f0] .vfs_fsync+0x38/0x4c [ 568.878105] [c0000001fa957d00] [c0000000001f87f4] .do_fsync+0x54/0x90 [ 568.878111] [c0000001fa957db0] [c0000000001f8894] .SyS_fsync+0x28/0x3c [ 568.878120] [c0000001fa957e30] [c000000000009c88] syscall_exit+0x0/0x7c [ 568.878125] Instruction dump: [ 568.878130] 60000000 813d0034 81610070 38000000 7f8b4800 419e001c 813f007c 7d2bfe70 [ 568.878144] 7d604a78 7c005850 54000ffe 7c0007b4 <0b000000> e8a10076 e87f0090 7fa4eb78 [ 568.878160] ---[ end trace 594d911d9654770b ]--- In addition fix the STD_GROUP optimization so that it works for bigalloc file systems as well. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Li Zhong <lizhongfs@gmail.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org # 3.10
| * | ext4: silence warning in ext4_writepages()Jan Kara2013-07-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The loop in mpage_map_and_submit_extent() is guaranteed to always run at least once since the caller of mpage_map_and_submit_extent() makes sure map->m_len > 0. So make that explicit using do-while instead of pure while which also silences the compiler warning about uninitialized 'err' variable. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2013-07-146-43/+24
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs stuff from Al Viro: "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including making simple_lookup() usable for filesystems with non-NULL s_d_op, which allows us to get rid of quite a bit of ugliness" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sunrpc: now we can just set ->s_d_op cgroup: we can use simple_lookup() now efivarfs: we can use simple_lookup() now make simple_lookup() usable for filesystems that set ->s_d_op configfs: don't open-code d_alloc_name() __rpc_lookup_create_exclusive: pass string instead of qstr rpc_create_*_dir: don't bother with qstr llist: llist_add() can use llist_add_batch() llist: fix/simplify llist_add() and llist_add_batch() fput: turn "list_head delayed_fput_list" into llist_head fs/file_table.c:fput(): add comment Safer ABI for O_TMPFILE
| * | | efivarfs: we can use simple_lookup() nowAl Viro2013-07-141-13/+1
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | make simple_lookup() usable for filesystems that set ->s_d_opAl Viro2013-07-141-1/+2
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | configfs: don't open-code d_alloc_name()Al Viro2013-07-141-11/+2
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | fput: turn "list_head delayed_fput_list" into llist_headOleg Nesterov2013-07-131-15/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fput() and delayed_fput() can use llist and avoid the locking. This is unlikely path, it is not that this change can improve the performance, but this way the code looks simpler. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Vagin <avagin@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Howells <dhowells@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | fs/file_table.c:fput(): add commentAndrew Morton2013-07-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A missed update to "fput: task_work_add() can fail if the caller has passed exit_task_work()". Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Vagin <avagin@openvz.org> Cc: David Howells <dhowells@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | Safer ABI for O_TMPFILEAl Viro2013-07-132-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE; that will fail on old kernels in a lot more cases than what I came up with. And make sure O_CREAT doesn't get there... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-07-131-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "Just a bunch of small fixes and tidy ups: 1) Finish the "busy_poll" renames, from Eliezer Tamir. 2) Fix RCU stalls in IFB driver, from Ding Tianhong. 3) Linearize buffers properly in tun/macvtap zerocopy code. 4) Don't crash on rmmod in vxlan, from Pravin B Shelar. 5) Spinlock used before init in alx driver, from Maarten Lankhorst. 6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry Kravkov. 7) Dummy and ifb driver load failure paths can oops, fixes from Tan Xiaojun and Ding Tianhong. 8) Correct MTU calculations in IP tunnels, from Alexander Duyck. 9) Account all TCP retransmits in SNMP stats properly, from Yuchung Cheng. 10) atl1e and via-rhine do not handle DMA mapping failures properly, from Neil Horman. 11) Various equal-cost multipath route fixes in ipv6 from Hannes Frederic Sowa" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) ipv6: only static routes qualify for equal cost multipathing via-rhine: fix dma mapping errors atl1e: fix dma mapping warnings tcp: account all retransmit failures usb/net/r815x: fix cast to restricted __le32 usb/net/r8152: fix integer overflow in expression net: access page->private by using page_private net: strict_strtoul is obsolete, use kstrtoul instead drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe net/usb: add relative mii functions for r815x net/tipc: use %*phC to dump small buffers in hex form qlcnic: Adding Maintainers. gre: Fix MTU sizing check for gretap tunnels pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts pkt_sched: sch_qfq: improve efficiency of make_eligible gso: Update tunnel segmentation to support Tx checksum offload inet: fix spacing in assignment ifb: fix oops when loading the ifb failed ...
| * | | | net: rename include/net/ll_poll.h to include/net/busy_poll.hEliezer Tamir2013-07-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the file and correct all the places where it is included. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfsLinus Torvalds2013-07-1321-316/+435
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more xfs updates from Ben Myers: "Here are a fix for xfs_fsr, a cleanup in bulkstat, a cleanup in xfs_open_by_handle, updated mount options documentation, a cleanup in xfs_bmapi_write, a fix for the size of dquot log reservations, a fix for sgid inheritance when acls are in use, a fix for cleaning up quotainfo structures, and some more of the work which allows group and project quotas to be used together. We had a few more in this last quota category that we might have liked to get in, but it looks there are still a few items that need to be addressed. - fix for xfs_fsr returning -EINVAL - cleanup in xfs_bulkstat - cleanup in xfs_open_by_handle - update mount options documentation - clean up local format handling in xfs_bmapi_write - fix dquot log reservations which were too small - fix sgid inheritance for subdirectories when default acls are in use - add project quota fields to various structures - fix teardown of quotainfo structures when quotas are turned off" * tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfs: xfs: Fix the logic check for all quotas being turned off xfs: Add pquota fields where gquota is used. xfs: fix sgid inheritance for subdirectories inheriting default acls [V3] xfs: dquot log reservations are too small xfs: remove local fork format handling from xfs_bmapi_write() xfs: update mount options documentation xfs: use get_unused_fd_flags(0) instead of get_unused_fd() xfs: clean up unused codes at xfs_bulkstat() xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJ
| * | | | | xfs: Fix the logic check for all quotas being turned offChandra Seetharaman2013-07-112-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the review of seperate pquota inode patches, David noticed that the test to detect all quotas being turned off was incorrect, and hence the block was not freeing all the quota information. The check made sense in Irix, but in Linux, quota is turned off one at a time, which makes the test invalid for Linux. This problem existed since XFS was ported to Linux. David suggested to fix the problem by detecting when all quotas are turned off by checking m_qflags. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: Add pquota fields where gquota is used.Chandra Seetharaman2013-07-1114-124/+291
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add project quota changes to all the places where group quota field is used: * add separate project quota members into various structures * split project quota and group quotas so that instead of overriding the group quota members incore, the new project quota members are used instead * get rid of usage of the OQUOTA flag incore, in favor of separate group and project quota flags. * add a project dquot argument to various functions. Not using the pquotino field from superblock yet. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: fix sgid inheritance for subdirectories inheriting default acls [V3]Carlos Maiolino2013-07-101-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XFS removes sgid bits of subdirectories under a directory containing a default acl. When a default acl is set, it implies xfs to call xfs_setattr_nonsize() in its code path. Such function is shared among mkdir and chmod system calls, and does some checks unneeded by mkdir (calling inode_change_ok()). Such checks remove sgid bit from the inode after it has been granted. With this patch, we extend the meaning of XFS_ATTR_NOACL flag to avoid these checks when acls are being inherited (thanks hch). Also, xfs_setattr_mode, doesn't need to re-check for group id and capabilities permissions, this only implies in another try to remove sgid bit from the directories. Such check is already done either on inode_change_ok() or xfs_setattr_nonsize(). Changelog: V2: Extends the meaning of XFS_ATTR_NOACL instead of wrap the tests into another function V3: Remove S_ISDIR check in xfs_setattr_nonsize() from the patch Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: dquot log reservations are too smallDave Chinner2013-07-092-9/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During review of the separate project quota inode patches, it became obvious that the dquot log reservation calculation underestimated the number dquots that can be modified in a transaction. This has it's roots way back in the Irix quota implementation. That is, when quotas were first implemented in XFS, it only supported user and project quotas as Irix did not have group quotas. Hence the worst case operation involving dquot modification was calculated to involve 2 user dquots and 1 project dquot or 1 user dequot and 2 project dquots. i.e. 3 dquots. This was determined back in 1996, and has remained unchanged ever since. However, back in 2001, the Linux XFS port dropped all support for project quota and implmented group quotas over the top. This was effectively done with a search-and-replace of project with group, and as such the log reservation was not changed. However, with the advent of group quotas, chmod and rename now could modify more than 3 dquots in a single transaction - both could modify 4 dquots. Hence this log reservation has been wrong for a long time. In 2005, project quota support was reintroduced into Linux, but it was implemented to be mutually exclusive to group quotas and so this didn't add any new changes to the dquot log reservation. Hence when project quotas were in use (rather than group quotas) the log reservation was again valid, just like in the Irix days. Now, with the addition of the separate project quota inode, group and project quotas are no longer mutually exclusive, and hence operations can now modify three dquots per inode where previously it was only two. The worst case here is the rename transaction, which can allocate/free space on two different directory inodes, and if they have different uid/gid/prid configurations and are world writeable, then rename can actually modify 6 different dquots now. Further, the dquot log reservation doesn't take into account the space used by the dquot log format structure that precedes the dquot that is logged, and hence further underestimates the worst case log space required by dquots during a transaction. This has been missing since the first commit in 1996. Hence the worst case log reservation needs to be increased from 3 to 6, and it needs to take into account a log format header for each of those dquots. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: remove local fork format handling from xfs_bmapi_write()Dave Chinner2013-07-094-124/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The conversion from local format to extent format requires interpretation of the data in the fork being converted, so it cannot be done in a generic way. It is up to the caller to convert the fork format to extent format before calling into xfs_bmapi_write() so format conversion can be done correctly. The code in xfs_bmapi_write() to convert the format is used implicitly by the attribute and directory code, but they specifically zero the fork size so that the conversion does not do any allocation or manipulation. Move this conversion into the shortform to leaf functions for the dir/attr code so the conversions are explicitly controlled by all callers. Now we can remove the conversion code in xfs_bmapi_write. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: use get_unused_fd_flags(0) instead of get_unused_fd()Yann Droneaud2013-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be "unsafe": O_CLOEXEC must be used by default to not leak file descriptor across exec(). Instead of macro get_unused_fd(), functions anon_inode_getfd() or get_unused_fd_flags() should be used with flags given by userspace. If not possible, flags should be set to O_CLOEXEC to provide userspace with a default safe behavor. In a further patch, get_unused_fd() will be removed so that new code start using anon_inode_getfd() or get_unused_fd_flags() with correct flags. This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. The hard coded flag value (0) should be reviewed on a per-subsystem basis, and, if possible, set to O_CLOEXEC. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: clean up unused codes at xfs_bulkstat()Jie Liu2013-07-091-27/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some unused codes at xfs_bulkstat(): - Variable bp is defined to point to the on-disk inode cluster buffer, but it proved to be of no practical help. - We process the chunks of good inodes which were fetched by iterating btree records from an AG. When processing inodes from each chunk, the code recomputing agbno if run into the first inode of a cluster, however, the agbno is not being used thereafter. This patch tries to clean up those things. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | | | | xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJEric Sandeen2013-07-092-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XFS_BROOT_SIZE_ADJ is an undocumented macro which accounts for the difference in size between the on-disk and in-core btree root. It's much clearer to just use the newly-added XFS_BMAP_BMDR_SPACE macro which gives us the on-disk size directly. In one case, we must test that the if_broot exists before applying the macro, however. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* | | | | | Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2013-07-1315-281/+515
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French: "Fixes for 4 cifs bugs, including a reconnect problem, a problem parsing responses to SMB2 open request, and setting nlink incorrectly to some servers which don't report it properly on the wire. Also improves data integrity on reconnect with series from Pavel which adds durable handle support for SMB2." * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Fix a deadlock when a file is reopened CIFS: Reopen the file if reconnect durable handle failed [CIFS] Fix minor endian error in durable handle patch series CIFS: Reconnect durable handles for SMB2 CIFS: Make SMB2_open use cifs_open_parms struct CIFS: Introduce cifs_open_parms struct CIFS: Request durable open for SMB2 opens CIFS: Simplify SMB2 create context handling CIFS: Simplify SMB2_open code path CIFS: Respect create_options in smb2_open_file CIFS: Fix lease context buffer parsing [CIFS] use sensible file nlink values if unprovided Limit allocation of crypto mechanisms to dialect which requires
| * | | | | | CIFS: Fix a deadlock when a file is reopenedPavel Shilovsky2013-07-111-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we request reading or writing on a file that needs to be reopened, it causes the deadlock: we are already holding rw semaphore for reading and then we try to acquire it for writing in cifs_relock_file. Fix this by acquiring the semaphore for reading in cifs_relock_file due to we don't make any changes in locks and don't need a write access. CC: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | | | CIFS: Reopen the file if reconnect durable handle failedPavel Shilovsky2013-07-111-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow-on patch for 8/8 patch from the durable handles series. It fixes the problem when durable file handle timeout expired on the server and reopen returns -ENOENT for such files. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | | | [CIFS] Fix minor endian error in durable handle patch seriesSteve French2013-07-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix endian warning: CHECK fs/cifs/smb2pdu.c fs/cifs/smb2pdu.c:1068:40: warning: incorrect type in assignment (different base types) fs/cifs/smb2pdu.c:1068:40: expected restricted __le32 [usertype] Next fs/cifs/smb2pdu.c:1068:40: got unsigned long Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | | | CIFS: Reconnect durable handles for SMB2Pavel Shilovsky2013-07-108-12/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On reconnects, we need to reopen file and then obtain all byte-range locks held by the client. SMB2 protocol provides feature to make this process atomic by reconnecting to the same file handle with all it's byte-range locks. This patch adds this capability for SMB2 shares. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Make SMB2_open use cifs_open_parms structPavel Shilovsky2013-07-105-50/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to prepare it for further durable handle reconnect processing. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Introduce cifs_open_parms structPavel Shilovsky2013-07-106-41/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and pass it to the open() call. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Request durable open for SMB2 opensPavel Shilovsky2013-07-103-2/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | by passing durable context together with a handle caching lease or batch oplock. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Simplify SMB2 create context handlingPavel Shilovsky2013-07-101-12/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to make it easier to add other create context further. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Simplify SMB2_open code pathPavel Shilovsky2013-07-102-36/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | by passing a filename to a separate iovec regardless of its length. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Respect create_options in smb2_open_filePavel Shilovsky2013-07-105-22/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and eliminated unused file_attribute parms of SMB2_open. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | CIFS: Fix lease context buffer parsingPavel Shilovsky2013-07-101-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to prevent missing RqLs context if it's not the first one. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
| * | | | | | [CIFS] use sensible file nlink values if unprovidedSteve French2013-07-041-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Certain servers may not set the NumberOfLinks field in query file/path info responses. In such a case, cifs_inode_needs_reval() assumes that all regular files are hardlinks and triggers revalidation, leading to excessive and unnecessary network traffic. This change hardcodes cf_nlink (and subsequently i_nlink) when not returned by the server, similar to what already occurs in cifs_mkdir(). Cc: <stable@vger.kernel.org> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | | | Limit allocation of crypto mechanisms to dialect which requiresSteve French2013-07-044-118/+174
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updated patch to try to prevent allocation of cifs, smb2 or smb3 crypto secmech structures unless needed. Currently cifs allocates all crypto mechanisms when the first session is established (4 functions and 4 contexts), rather than only allocating these when needed (smb3 needs two, the rest of the dialects only need one). Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>
* | | | | | | Merge branch 'for-3.11/core' of git://git.kernel.dk/linux-blockLinus Torvalds2013-07-111-1/+8
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block IO updates from Jens Axboe: "Here are the core IO block bits for 3.11. It contains: - A tweak to the reserved tag logic from Jan, for weirdo devices with just 3 free tags. But for those it improves things substantially for random writes. - Periodic writeback fix from Jan. Marked for stable as well. - Fix for a race condition in IO scheduler switching from Jianpeng. - The hierarchical blk-cgroup support from Tejun. This is the grunt of the series. - blk-throttle fix from Vivek. Just a note that I'm in the middle of a relocation, whole family is flying out tomorrow. Hence I will be awal the remainder of this week, but back at work again on Monday the 15th. CC'ing Tejun, since any potential "surprises" will most likely be from the blk-cgroup work. But it's been brewing for a while and sitting in my tree and linux-next for a long time, so should be solid." * 'for-3.11/core' of git://git.kernel.dk/linux-block: (36 commits) elevator: Fix a race in elevator switching block: Reserve only one queue tag for sync IO if only 3 tags are available writeback: Fix periodic writeback after fs mount blk-throttle: implement proper hierarchy support blk-throttle: implement throtl_grp->has_rules[] blk-throttle: Account for child group's start time in parent while bio climbs up blk-throttle: add throtl_qnode for dispatch fairness blk-throttle: make throtl_pending_timer_fn() ready for hierarchy blk-throttle: make tg_dispatch_one_bio() ready for hierarchy blk-throttle: make blk_throtl_bio() ready for hierarchy blk-throttle: make blk_throtl_drain() ready for hierarchy blk-throttle: dispatch from throtl_pending_timer_fn() blk-throttle: implement dispatch looping blk-throttle: separate out throtl_service_queue->pending_timer from throtl_data->dispatch_work blk-throttle: set REQ_THROTTLED from throtl_charge_bio() and gate stats update with it blk-throttle: implement sq_to_tg(), sq_to_td() and throtl_log() blk-throttle: add throtl_service_queue->parent_sq blk-throttle: generalize update_disptime optimization in blk_throtl_bio() blk-throttle: dispatch to throtl_data->service_queue.bio_lists[] blk-throttle: move bio_lists[] and friends to throtl_service_queue ...
| * | | | | | | writeback: Fix periodic writeback after fs mountJan Kara2013-06-281-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code in blkdev.c moves a device inode to default_backing_dev_info when the last reference to the device is put and moves the device inode back to its bdi when the first reference is acquired. This includes moving to wb.b_dirty list if the device inode is dirty. The code however doesn't setup timer to wake corresponding flusher thread and while wb.b_dirty list is non-empty __mark_inode_dirty() will not set it up either. Thus periodic writeback is effectively disabled until a sync(2) call which can lead to unexpected data loss in case of crash or power failure. Fix the problem by setting up a timer for periodic writeback in case we add the first dirty inode to wb.b_dirty list in bdev_inode_switch_bdi(). Reported-by: Bert De Jonghe <Bert.DeJonghe@amplidata.com> CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | | | | Merge tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2013-07-113-11/+28
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull second set of NFS client updates from Trond Myklebust: "This mainly contains some small readdir optimisations that had dependencies on Al Viro's readdir rewrite. There is also a fix for a nasty deadlock which surfaced earlier in this merge window. Highlights include: - Fix an_rpc pipefs regression that causes a deadlock on mount - Readdir optimisations by Scott Mayhew and Jeff Layton - clean up the rpc_pipefs dentry operation setup" * tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Fix a deadlock in rpc_client_register() rpc_pipe: rpc_dir_inode_operations can be static NFS: Allow nfs_updatepage to extend a write under additional circumstances NFS: Make nfs_readdir revalidate less often NFS: Make nfs_attribute_cache_expired() non-static rpc_pipe: set dentry operations at d_alloc time nfs: set verifier on existing dentries in nfs_prime_dcache
| * | | | | | | | NFS: Allow nfs_updatepage to extend a write under additional circumstancesScott Mayhew2013-07-091-8/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently nfs_updatepage allows a write to be extended to cover a full page only if we don't have a byte range lock lock on the file... but if we have a write delegation on the file or if we have the whole file locked for writing then we should be allowed to extend the write as well. Signed-off-by: Scott Mayhew <smayhew@redhat.com> [Trond: fix up call to nfs_have_delegation()] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>