path: root/fs
Commit message (Author, Age, Files, Lines)
* FS-Cache: Fix object state machine to have separate work and wait states (David Howells, 2013-06-19, 9 files, -542/+574)

    Fix object state machine to have separate work and wait states as that makes it easier to envision.

    There are now three kinds of state:

     (1) Work state. This is an execution state. No event processing is performed by a work state. The function attached to a work state returns a pointer indicating the next state to which the OSM should transition. Returning NO_TRANSIT repeats the current state, but goes back to the scheduler first.

     (2) Wait state. This is an event processing state. No execution is performed by a wait state. Wait states are just tables of "if event X occurs, clear it and transition to state Y". The dispatcher returns to the scheduler if none of the events in which the wait state has an interest are currently pending.

     (3) Out-of-band state. This is a special work state. Transitions to normal states can be overridden when an unexpected event occurs (eg. I/O error). Instead the dispatcher disables and clears the OOB event and transits to the specified work state. This then acts as an ordinary work state, though object->state points to the overridden destination. Returning NO_TRANSIT resumes the overridden transition.

    In addition, the states have names in their definitions, so there's no need for tables of state names. Further, the EV_REQUEUE event is no longer necessary as that is automatic for work states.

    Since the states are now separate structs rather than values in an enum, it's not possible to use comparisons other than (non-)equality between them, so use some object->flags to indicate what phase an object is in.

    The EV_RELEASE, EV_RETIRE and EV_WITHDRAW events have been squished into one (EV_KILL). An object flag now carries the information about retirement.

    Similarly, the RELEASING, RECYCLING and WITHDRAWING states have been merged into a KILL_OBJECT state and additional states have been added for handling waiting dependent objects (JUMPSTART_DEPS and KILL_DEPENDENTS).

    A state has also been added for synchronising with parent object initialisation (WAIT_FOR_PARENT) and another for initiating look up (PARENT_READY).

    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-By: Milosz Tanski <milosz@adfin.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
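The split between work states and wait states described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions, not the code in fs/fscache/object.c: `struct state`, `struct transition`, `struct object`, `dispatch()`, the state objects, and the way `NO_TRANSIT` and `EV_KILL` are defined here are hypothetical stand-ins, and out-of-band states are left out.

```c
#include <stddef.h>

struct object;
struct state;

/* One wait-state table entry: "if event X occurs, clear it and go to Y". */
struct transition {
    unsigned long events;          /* event mask; 0 terminates the table */
    const struct state *next;      /* destination state */
};

struct state {
    const char *name;              /* each state carries its own name */
    /* Work states set ->work; wait states leave it NULL. */
    const struct state *(*work)(struct object *obj);
    struct transition transitions[4];   /* at most 3 entries plus terminator */
};

struct object {
    const struct state *state;
    unsigned long events;          /* pending events */
    int work_done;                 /* counts executed work functions */
};

#define NO_TRANSIT ((const struct state *)NULL)
enum { EV_KILL = 1 << 0 };

static const struct state wait_for_cmd, kill_object, object_dead;

static const struct state *do_lookup(struct object *obj)
{
    obj->work_done++;
    return &wait_for_cmd;
}

static const struct state *do_kill(struct object *obj)
{
    obj->work_done++;
    return &object_dead;
}

static const struct state look_up = { .name = "LOOK_UP", .work = do_lookup };
static const struct state wait_for_cmd = {
    .name = "WAIT_FOR_CMD",
    .transitions = { { EV_KILL, &kill_object } },
};
static const struct state kill_object = { .name = "KILL_OBJECT", .work = do_kill };
static const struct state object_dead = { .name = "OBJECT_DEAD" };

/* Run the object until it has to go back to the (imaginary) scheduler. */
static void dispatch(struct object *obj)
{
    for (;;) {
        const struct state *s = obj->state;
        const struct transition *t;

        if (s->work) {
            const struct state *next = s->work(obj);
            if (next == NO_TRANSIT)
                return;            /* repeat this state after rescheduling */
            obj->state = next;
            continue;
        }

        /* Wait state: look for a pending event the table cares about. */
        for (t = s->transitions; t->events; t++) {
            if (obj->events & t->events) {
                obj->events &= ~t->events;   /* clear the event */
                obj->state = t->next;
                break;
            }
        }
        if (!t->events)
            return;                /* nothing pending: back to the scheduler */
    }
}
```

Because the states are distinct structs, code outside the dispatcher can only test them for (non-)equality, which is why the commit moves phase information into object flags.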
* FS-Cache: Wrap checks on object state (David Howells, 2013-06-19, 5 files, -12/+12)

    Wrap checks on object state (mostly outside of fs/fscache/object.c) with inline functions so that the mechanism can be replaced.

    Some of the state checks within object.c are left as-is as they will be replaced.

    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-By: Milosz Tanski <milosz@adfin.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
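As a rough illustration of wrapping state checks in inline functions, the sketch below hides the representation of the state behind two helpers. The enum, the struct and the helper names are hypothetical and are not the ones used by FS-Cache.

```c
#include <stdbool.h>

/* Hypothetical state representation; callers should not depend on it. */
enum sketch_state { SK_INIT, SK_AVAILABLE, SK_DYING, SK_DEAD };

struct sketch_object {
    enum sketch_state state;
};

/* Callers use these helpers instead of comparing ->state directly, so the
 * underlying mechanism can later be swapped (for example for flag bits)
 * without touching the call sites. */
static inline bool sketch_object_is_available(const struct sketch_object *obj)
{
    return obj->state == SK_AVAILABLE;
}

static inline bool sketch_object_is_dying(const struct sketch_object *obj)
{
    return obj->state >= SK_DYING;
}
```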
* FS-Cache: Uninline fscache_object_init() (David Howells, 2013-06-19, 1 file, -2/+38)

    Uninline fscache_object_init() so as not to expose some of the FS-Cache internals to the cache backend.

    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-By: Milosz Tanski <milosz@adfin.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
* FS-Cache: Don't sleep in page release if __GFP_FS is not set (David Howells, 2013-06-19, 1 file, -1/+1)

    Don't sleep in __fscache_maybe_release_page() if __GFP_FS is not set. This goes some way towards mitigating fscache deadlocking against ext4 by way of the allocator, eg:

    INFO: task flush-8:0:24427 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    flush-8:0 D ffff88003e2b9fd8 0 24427 2 0x00000000
     ffff88003e2b9138 0000000000000046 ffff880012e3a040 ffff88003e2b9fd8
     0000000000011c80 ffff88003e2b9fd8 ffffffff81a10400 ffff880012e3a040
     0000000000000002 ffff880012e3a040 ffff88003e2b9098 ffffffff8106dcf5
    Call Trace:
     [<ffffffff8106dcf5>] ? __lock_is_held+0x31/0x53
     [<ffffffff81219b61>] ? radix_tree_lookup_element+0xf4/0x12a
     [<ffffffff81454bed>] schedule+0x60/0x62
     [<ffffffffa01d349c>] __fscache_wait_on_page_write+0x8b/0xa5 [fscache]
     [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d
     [<ffffffffa01d393a>] __fscache_maybe_release_page+0x30c/0x324 [fscache]
     [<ffffffffa01d369a>] ? __fscache_maybe_release_page+0x6c/0x324 [fscache]
     [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
     [<ffffffffa01fd7b2>] nfs_fscache_release_page+0x68/0x94 [nfs]
     [<ffffffffa01ef73e>] nfs_release_page+0x7e/0x86 [nfs]
     [<ffffffff810aa553>] try_to_release_page+0x32/0x3b
     [<ffffffff810b6c70>] shrink_page_list+0x535/0x71a
     [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
     [<ffffffff810b7352>] shrink_inactive_list+0x20a/0x2dd
     [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea
     [<ffffffff810b7a65>] shrink_lruvec+0x34c/0x3eb
     [<ffffffff810b7bd3>] do_try_to_free_pages+0xcf/0x355
     [<ffffffff810b7fc8>] try_to_free_pages+0x9a/0xa1
     [<ffffffff810b08d2>] __alloc_pages_nodemask+0x494/0x6f7
     [<ffffffff810d9a07>] kmem_getpages+0x58/0x155
     [<ffffffff810dc002>] fallback_alloc+0x120/0x1f3
     [<ffffffff8106db23>] ? trace_hardirqs_off+0xd/0xf
     [<ffffffff810dbed3>] ____cache_alloc_node+0x177/0x186
     [<ffffffff81162a6c>] ? ext4_init_io_end+0x1c/0x37
     [<ffffffff810dc403>] kmem_cache_alloc+0xf1/0x176
     [<ffffffff810b17ac>] ? test_set_page_writeback+0x101/0x113
     [<ffffffff81162a6c>] ext4_init_io_end+0x1c/0x37
     [<ffffffff81162ce4>] ext4_bio_write_page+0x20f/0x3af
     [<ffffffff8115cc02>] mpage_da_submit_io+0x26e/0x2f6
     [<ffffffff811088e5>] ? __find_get_block_slow+0x38/0x133
     [<ffffffff81161348>] mpage_da_map_and_submit+0x3a7/0x3bd
     [<ffffffff81161a60>] ext4_da_writepages+0x30d/0x426
     [<ffffffff810b3359>] do_writepages+0x1c/0x2a
     [<ffffffff81102f4d>] __writeback_single_inode+0x3e/0xe5
     [<ffffffff81103995>] writeback_sb_inodes+0x1bd/0x2f4
     [<ffffffff81103b3b>] __writeback_inodes_wb+0x6f/0xb4
     [<ffffffff81103c81>] wb_writeback+0x101/0x195
     [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
     [<ffffffff811043aa>] ? wb_do_writeback+0xaa/0x173
     [<ffffffff8110434a>] wb_do_writeback+0x4a/0x173
     [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81038554>] ? del_timer+0x4b/0x5b
     [<ffffffff811044e0>] bdi_writeback_thread+0x6d/0x147
     [<ffffffff81104473>] ? wb_do_writeback+0x173/0x173
     [<ffffffff81048fbc>] kthread+0xd0/0xd8
     [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
     [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
     [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0
     [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
    2 locks held by flush-8:0/24427:
     #0: (&type->s_umount_key#41){.+.+..}, at: [<ffffffff810e3b73>] grab_super_passive+0x4c/0x76
     #1: (jbd2_handle){+.+...}, at: [<ffffffff81190d81>] start_this_handle+0x475/0x4ea

    The problem here is that another thread, which is attempting to write the to-be-stored NFS page to the on-ext4 cache file, is waiting for the journal lock, eg:

    INFO: task kworker/u:2:24437 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kworker/u:2 D ffff880039589768 0 24437 2 0x00000000
     ffff8800395896d8 0000000000000046 ffff8800283bf040 ffff880039589fd8
     0000000000011c80 ffff880039589fd8 ffff880039f0b040 ffff8800283bf040
     0000000000000006 ffff8800283bf6b8 ffff880039589658 ffffffff81071a13
    Call Trace:
     [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea
     [<ffffffff81455e73>] ? _raw_spin_unlock_irqrestore+0x3a/0x50
     [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
     [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81454bed>] schedule+0x60/0x62
     [<ffffffff81190c23>] start_this_handle+0x317/0x4ea
     [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d
     [<ffffffff81190fcc>] jbd2__journal_start+0xb3/0x12e
     [<ffffffff81176606>] __ext4_journal_start_sb+0xb2/0xc6
     [<ffffffff8115f137>] ext4_da_write_begin+0x109/0x233
     [<ffffffff810a964d>] generic_file_buffered_write+0x11a/0x264
     [<ffffffff811032cf>] ? __mark_inode_dirty+0x2d/0x1ee
     [<ffffffff810ab1ab>] __generic_file_aio_write+0x2a5/0x2d5
     [<ffffffff810ab24a>] generic_file_aio_write+0x6f/0xd0
     [<ffffffff81159a2c>] ext4_file_write+0x38c/0x3c4
     [<ffffffff810e0915>] do_sync_write+0x91/0xd1
     [<ffffffffa00a17f0>] cachefiles_write_page+0x26f/0x310 [cachefiles]
     [<ffffffffa01d470b>] fscache_write_op+0x21e/0x37a [fscache]
     [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
     [<ffffffffa01d2479>] fscache_op_work_func+0x78/0xd7 [fscache]
     [<ffffffff8104455a>] process_one_work+0x232/0x3a8
     [<ffffffff810444ff>] ? process_one_work+0x1d7/0x3a8
     [<ffffffff81044ee0>] worker_thread+0x214/0x303
     [<ffffffff81044ccc>] ? manage_workers+0x245/0x245
     [<ffffffff81048fbc>] kthread+0xd0/0xd8
     [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
     [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
     [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0
     [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
    4 locks held by kworker/u:2/24437:
     #0: (fscache_operation){.+.+.+}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8
     #1: ((&op->work)){+.+.+.}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8
     #2: (sb_writers#14){.+.+.+}, at: [<ffffffff810ab22c>] generic_file_aio_write+0x51/0xd0
     #3: (&sb->s_type->i_mutex_key#19){+.+.+.}, at: [<ffffffff810ab236>] generic_file_aio_write+0x5b/0x

    fscache already tries to cancel pending stores, but it can't cancel a write for which I/O is already in progress.

    An alternative would be to accept writing garbage to the cache under extreme circumstances and to kill the afflicted cache object if we have to do this. However, we really need to know how strapped the allocator is before deciding to do that.

    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-By: Milosz Tanski <milosz@adfin.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
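The idea behind the fix, refusing to wait on an in-flight cache write unless the caller's allocation flags permit filesystem re-entry, can be sketched as below. This is a userspace approximation: the `SKETCH_GFP_*` bits, `struct cached_page` and both functions are hypothetical and do not reproduce the kernel's gfp_t values or the real __fscache_maybe_release_page().

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's gfp_t values. */
#define SKETCH_GFP_WAIT 0x1u   /* caller is allowed to sleep */
#define SKETCH_GFP_FS   0x2u   /* caller may re-enter filesystem code */

struct cached_page {
    bool write_in_progress;    /* a store to the cache is underway */
    int  sleeps;               /* how often we blocked on that store */
};

/* Stand-in for sleeping until the cache write has completed. */
static void wait_on_page_write(struct cached_page *pg)
{
    pg->sleeps++;
    pg->write_in_progress = false;
}

/* Returns true if the page may be released, false if the caller must keep it. */
static bool maybe_release_page(struct cached_page *pg, unsigned int gfp)
{
    if (pg->write_in_progress) {
        /* Sleeping here without the FS bit risks the deadlock shown in the
         * traces: the writer we would wait for may itself be blocked on a
         * filesystem lock held further up our own call chain. */
        if (!(gfp & SKETCH_GFP_WAIT) || !(gfp & SKETCH_GFP_FS))
            return false;
        wait_on_page_write(pg);
    }
    return true;
}
```

Returning false only tells reclaim to leave the page alone, so the cost of the extra check is a page that stays cached a little longer.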
* CacheFiles: name i_mutex lock class explicitly (J. Bruce Fields, 2013-06-19, 1 file, -1/+1)

    Just some cleanup.

    (And note the caller of this function may, for example, call vfs_unlink on a child, so the "1" (I_MUTEX_PARENT) really was what was intended here.)

    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-By: Milosz Tanski <milosz@adfin.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
* fs/fscache: remove spin_lock() from the condition in while() (Sebastian Andrzej Siewior, 2013-06-19, 1 file, -6/+10)

    The spin_lock() within the condition in while() will cause a compile error if it is not a function. This is not a problem on mainline but it does not look pretty and there is no reason to do it that way. This patch writes it a little differently and avoids the double condition.

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-By: Milosz Tanski <milosz@adfin.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
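A loop shape that keeps the lock call out of the loop condition might look like the following sketch. The queue, the fake lock and `drain()` are hypothetical stand-ins written for userspace; they are not the fscache code touched by this patch.

```c
/* Hypothetical stand-in for a spinlock-protected store queue. */
struct sketch_queue {
    int locked;
    int items;
};

static void q_lock(struct sketch_queue *q)   { q->locked = 1; }
static void q_unlock(struct sketch_queue *q) { q->locked = 0; }

/*
 * Shape being avoided (the lock call sits inside a comma expression, which
 * breaks if the lock primitive expands to a statement rather than a call):
 *
 *     while (q_lock(q), n = take_batch(q), n > 0) { ... q_unlock(q); }
 *     q_unlock(q);
 */
static int drain(struct sketch_queue *q, int batch)
{
    int total = 0;

    for (;;) {
        int n;

        q_lock(q);
        n = q->items < batch ? q->items : batch;
        if (n == 0)
            break;             /* leave the loop with the lock still held */
        q->items -= n;
        q_unlock(q);
        total += n;            /* process the batch outside the lock */
    }
    q_unlock(q);
    return total;
}
```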
* Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 (Linus Torvalds, 2013-05-14, 8 files, -142/+111)

    Pull ext4 update from Ted Ts'o:
     "Fixed regressions (two stability regressions and a performance regression) introduced during the 3.10-rc1 merge window.

      Also included is a bug fix relating to allocating blocks after resizing an ext3 file system when using the ext4 file system driver"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
      jbd,jbd2: fix oops in jbd2_journal_put_journal_head()
      ext4: revert "ext4: use io_end for multiple bios"
      ext4: limit group search loop for non-extent files
      ext4: fix fio regression
| * ext4: revert "ext4: use io_end for multiple bios" (Theodore Ts'o, 2013-05-11, 3 files, -129/+85)

    This reverts commit 4eec708d263f0ee10861d69251708a225b64cac7.

    Multiple users have reported crashes which are apparently caused by this commit. Thanks to Dmitry Monakhov for bisecting it.

    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Dmitry Monakhov <dmonakhov@openvz.org>
    Cc: Jan Kara <jack@suse.cz>
| * ext4: limit group search loop for non-extent files (Lachlan McIlroy, 2013-05-05, 1 file, -1/+5)

    In the case where we are allocating for a non-extent file, we must limit the groups we allocate from to those below 2^32 blocks, and ext4_mb_regular_allocator() attempts to do this initially by putting a cap on ngroups for the subsequent search loop.

    However, the initial target group comes in from the allocation context (ac), and it may already be beyond the artificially limited ngroups. In this case, the limit

        if (group == ngroups)
            group = 0;

    at the top of the loop is never true, and the loop will run away.

    Catch this case inside the loop and reset the search to start at group 0.

    [sandeen@redhat.com: add commit msg & comments]

    Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com>
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: stable@vger.kernel.org
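The difference between the two wrap-around checks can be seen in a small userspace model of the search loop. `count_out_of_range()` and its `use_ge` switch are hypothetical and exist only for this illustration; the real loop lives in ext4_mb_regular_allocator().

```c
/*
 * Walk ngroups candidate groups starting at 'start' and count how many of
 * the visited group numbers fall outside [0, ngroups).
 * use_ge == 0 models the old "group == ngroups" check,
 * use_ge != 0 models a check that also catches a start beyond the cap.
 */
static unsigned int count_out_of_range(unsigned int start,
                                       unsigned int ngroups, int use_ge)
{
    unsigned int group = start, i, bad = 0;

    for (i = 0; i < ngroups; group++, i++) {
        if (use_ge ? group >= ngroups : group == ngroups)
            group = 0;         /* wrap the search back to group 0 */
        if (group >= ngroups)
            bad++;             /* this group is beyond the cap */
    }
    return bad;
}
```

With a start group already past the cap, the equality test never fires and every visited group is out of range, which is the runaway behaviour the commit message describes.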
| * ext4: fix fio regression (Yan, Zheng, 2013-05-03, 4 files, -12/+21)

    We (Linux Kernel Performance project) found a regression introduced by commit:

      f7fec032aa ext4: track all extent status in extent status tree

    The commit causes about 20% performance decrease in fio random write test. Profiler shows that rb_next() uses a lot of CPU time. The call stack is:

      rb_next
      ext4_es_find_delayed_extent
      ext4_map_blocks
      _ext4_get_block
      ext4_get_block_write
      __blockdev_direct_IO
      ext4_direct_IO
      generic_file_direct_write
      __generic_file_aio_write
      ext4_file_write
      aio_rw_vect_retry
      aio_run_iocb
      do_io_submit
      sys_io_submit
      system_call_fastpath
      io_submit
      td_io_getevents
      io_u_queued_complete
      thread_main
      main
      __libc_start_main

    The cause is that ext4_es_find_delayed_extent() doesn't have an upper bound; it keeps searching until a delayed extent is found. When there are a lot of non-delayed entries in the extent state tree, ext4_es_find_delayed_extent() may use a lot of CPU time.

    Reported-by: LKP project <lkp@linux.intel.com>
    Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
    Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
    Cc: "Theodore Ts'o" <tytso@mit.edu>
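Giving the search an upper bound can be sketched over a sorted array instead of the kernel's rb-tree. `struct sketch_extent` and `find_delayed_in_range()` are hypothetical names for this illustration; the patch itself changes the ext4 extent status tree lookup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent record; entries are sorted by start block. */
struct sketch_extent {
    unsigned int start;
    unsigned int len;
    bool delayed;
};

/* Return the first delayed extent overlapping [lblk, end], or NULL. */
static const struct sketch_extent *
find_delayed_in_range(const struct sketch_extent *es, size_t n,
                      unsigned int lblk, unsigned int end)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (es[i].start + es[i].len <= lblk)
            continue;          /* lies entirely before the range */
        if (es[i].start > end)
            break;             /* past the upper bound: stop searching */
        if (es[i].delayed)
            return &es[i];
    }
    return NULL;
}
```

The early `break` is the point of the sketch: a caller that only cares about a small block range no longer pays for walking every non-delayed entry that follows it.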
* | Merge git://git.infradead.org/users/eparis/audit (Linus Torvalds, 2013-05-11, 1 file, -1/+1)

    Pull audit changes from Eric Paris:
     "Al used to send pull requests every couple of years but he told me to just start pushing them to you directly.

      Our touching outside of core audit code is pretty straight forward. A couple of interface changes which hit net/. A simple argument bug calling audit functions in namei.c and the removal of some assembly branch prediction code on ppc"

    * git://git.infradead.org/users/eparis/audit: (31 commits)
      audit: fix message spacing printing auid
      Revert "audit: move kaudit thread start from auditd registration to kaudit init"
      audit: vfs: fix audit_inode call in O_CREAT case of do_last
      audit: Make testing for a valid loginuid explicit.
      audit: fix event coverage of AUDIT_ANOM_LINK
      audit: use spin_lock in audit_receive_msg to process tty logging
      audit: do not needlessly take a lock in tty_audit_exit
      audit: do not needlessly take a spinlock in copy_signal
      audit: add an option to control logging of passwords with pam_tty_audit
      audit: use spin_lock_irqsave/restore in audit tty code
      helper for some session id stuff
      audit: use a consistent audit helper to log lsm information
      audit: push loginuid and sessionid processing down
      audit: stop pushing loginid, uid, sessionid as arguments
      audit: remove the old depricated kernel interface
      audit: make validity checking generic
      audit: allow checking the type of audit message in the user filter
      audit: fix build break when AUDIT_DEBUG == 2
      audit: remove duplicate export of audit_enabled
      Audit: do not print error when LSMs disabled
      ...
| * | audit: vfs: fix audit_inode call in O_CREAT case of do_last (Jeff Layton, 2013-05-07, 1 file, -1/+1)

    Jiri reported a regression in auditing of open(..., O_CREAT) syscalls. In older kernels, creating a file with open(..., O_CREAT) created audit_name records that looked like this:

    type=PATH msg=audit(1360255720.628:64): item=1 name="/abc/foo" inode=138810 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
    type=PATH msg=audit(1360255720.628:64): item=0 name="/abc/" inode=138635 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0

    ...in recent kernels though, they look like this:

    type=PATH msg=audit(1360255402.886:12574): item=2 name=(null) inode=264599 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
    type=PATH msg=audit(1360255402.886:12574): item=1 name=(null) inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
    type=PATH msg=audit(1360255402.886:12574): item=0 name="/abc/foo" inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0

    Richard bisected to determine that the problems started with commit bfcec708, but the log messages have changed with some later audit-related patches.

    The problem is that this audit_inode call is passing in the parent of the dentry being opened, but audit_inode is being called with the parent flag false. This causes later audit_inode and audit_inode_child calls to match the wrong entry in the audit_names list.

    This patch simply sets the flag to properly indicate that this inode represents the parent. With this, the audit_names entries are back to looking like they did before.

    Cc: <stable@vger.kernel.org> # v3.7+
    Reported-by: Jiri Jaburek <jjaburek@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@redhat.com>
    Test By: Richard Guy Briggs <rbriggs@redhat.com>
    Signed-off-by: Eric Paris <eparis@redhat.com>
* | | Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux (Linus Torvalds, 2013-05-10, 2 files, -9/+18)

    Pull nfsd fixes from Bruce Fields:
     "Small fixes for two bugs and two warnings"

    * 'for-3.10' of git://linux-nfs.org/~bfields/linux:
      nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error
      SUNRPC: fix decoding of optional gss-proxy xdr fields
      SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
      nfsd4: don't allow owner override on 4.1 CLAIM_FH opens
| * | | nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error (Jeff Layton, 2013-05-09, 1 file, -7/+5)

    Toralf reported the following oops to the linux-nfs mailing list:

    -----------------[snip]------------------
    NFSD: unable to generate recoverydir name (-2).
    NFSD: disabling legacy clientid tracking. Reboot recovery will not function correctly!
    BUG: unable to handle kernel NULL pointer dereference at 000003c8
    IP: [<f90a3d91>] nfsd4_client_tracking_exit+0x11/0x50 [nfsd]
    *pdpt = 000000002ba33001 *pde = 0000000000000000
    Oops: 0000 [#1] SMP
    Modules linked in: loop nfsd auth_rpcgss ipt_MASQUERADE xt_owner xt_multiport ipt_REJECT xt_tcpudp xt_recent xt_conntrack nf_conntrack_ftp xt_limit xt_LOG iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables af_packet pppoe pppox ppp_generic slhc bridge stp llc tun arc4 iwldvm mac80211 coretemp kvm_intel uvcvideo sdhci_pci sdhci mmc_core videobuf2_vmalloc videobuf2_memops usblp videobuf2_core i915 iwlwifi psmouse videodev cfg80211 kvm fbcon bitblit cfbfillrect acpi_cpufreq mperf evdev softcursor font cfbimgblt i2c_algo_bit cfbcopyarea intel_agp intel_gtt drm_kms_helper snd_hda_codec_conexant drm agpgart fb fbdev tpm_tis thinkpad_acpi tpm nvram e1000e rfkill thermal ptp wmi pps_core tpm_bios 8250_pci processor 8250 ac snd_hda_intel snd_hda_codec snd_pcm battery video i2c_i801 snd_page_alloc snd_timer button serial_core i2c_core snd soundcore thermal_sys hwmon aesni_intel ablk_helper cryptd lrw aes_i586 xts gf128mul cbc fuse nfs lockd sunrpc dm_crypt dm_mod hid_monterey hid_microsoft hid_logitech hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech hid_generic usbhid hid sr_mod cdrom sg [last unloaded: microcode]
    Pid: 6374, comm: nfsd Not tainted 3.9.1 #6 LENOVO 4180F65/4180F65
    EIP: 0060:[<f90a3d91>] EFLAGS: 00010202 CPU: 0
    EIP is at nfsd4_client_tracking_exit+0x11/0x50 [nfsd]
    EAX: 00000000 EBX: fffffffe ECX: 00000007 EDX: 00000007
    ESI: eb9dcb00 EDI: eb2991c0 EBP: eb2bde38 ESP: eb2bde34
     DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    CR0: 80050033 CR2: 000003c8 CR3: 2ba80000 CR4: 000407f0
    DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
    DR6: ffff0ff0 DR7: 00000400
    Process nfsd (pid: 6374, ti=eb2bc000 task=eb2711c0 task.ti=eb2bc000)
    Stack:
     fffffffe eb2bde4c f90a3e0c f90a7754 fffffffe eb0a9c00 eb2bdea0 f90a41ed
     eb2991c0 1b270000 eb2991c0 eb2bde7c f9099ce9 eb2bde98 0129a020 eb29a020
     eb2bdecc eb2991c0 eb2bdea8 f9099da5 00000000 eb9dcb00 00000001 67822f08
    Call Trace:
     [<f90a3e0c>] legacy_recdir_name_error+0x3c/0x40 [nfsd]
     [<f90a41ed>] nfsd4_create_clid_dir+0x15d/0x1c0 [nfsd]
     [<f9099ce9>] ? nfsd4_lookup_stateid+0x99/0xd0 [nfsd]
     [<f9099da5>] ? nfs4_preprocess_seqid_op+0x85/0x100 [nfsd]
     [<f90a4287>] nfsd4_client_record_create+0x37/0x50 [nfsd]
     [<f909d6ce>] nfsd4_open_confirm+0xfe/0x130 [nfsd]
     [<f90980b1>] ? nfsd4_encode_operation+0x61/0x90 [nfsd]
     [<f909d5d0>] ? nfsd4_free_stateid+0xc0/0xc0 [nfsd]
     [<f908fd0b>] nfsd4_proc_compound+0x41b/0x530 [nfsd]
     [<f9081b7b>] nfsd_dispatch+0x8b/0x1a0 [nfsd]
     [<f857b85d>] svc_process+0x3dd/0x640 [sunrpc]
     [<f908165d>] nfsd+0xad/0x110 [nfsd]
     [<f90815b0>] ? nfsd_destroy+0x70/0x70 [nfsd]
     [<c1054824>] kthread+0x94/0xa0
     [<c1486937>] ret_from_kernel_thread+0x1b/0x28
     [<c1054790>] ? flush_kthread_work+0xd0/0xd0
    Code: 86 b0 00 00 00 90 c5 0a f9 c7 04 24 70 76 0a f9 e8 74 a9 3d c8 eb ba 8d 76 00 55 89 e5 53 66 66 66 66 90 8b 15 68 c7 0a f9 85 d2 <8b> 88 c8 03 00 00 74 2c 3b 11 77 28 8b 5c 91 08 85 db 74 22 8b
    EIP: [<f90a3d91>] nfsd4_client_tracking_exit+0x11/0x50 [nfsd] SS:ESP 0068:eb2bde34
    CR2: 00000000000003c8
    ---[ end trace 09e54015d145c9c6 ]---

    The problem appears to be a regression that was introduced in commit 9a9c6478 "nfsd: make NFSv4 recovery client tracking options per net". Prior to that commit, it was safe to pass a NULL net pointer to nfsd4_client_tracking_exit in the legacy recdir case, and legacy_recdir_name_error did so. After that commit, the net pointer must be valid.

    This patch just fixes legacy_recdir_name_error to pass in a valid net pointer to that function.

    Cc: <stable@vger.kernel.org> # v3.8+
    Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
    Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
    Signed-off-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd4: don't allow owner override on 4.1 CLAIM_FH opens (J. Bruce Fields, 2013-05-03, 1 file, -2/+13)

    The Linux client is using CLAIM_FH to implement regular opens, not just recovery cases, so it depends on the server to check permissions correctly.

    Therefore the owner override, which may make sense in the delegation recovery case, isn't right in the CLAIM_FH case.

    Symptoms: on a client with 49f9a0fafd844c32f2abada047c0b9a5ba0d6255 "NFSv4.1: Enable open-by-filehandle", Bryan noticed this:

        touch test.txt
        chmod 000 test.txt
        echo test > test.txt

    succeeding.

    Cc: stable@kernel.org
    Reported-by: Bryan Schumaker <bjschuma@netapp.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal (Linus Torvalds, 2013-05-10, 1 file, -0/+17)

    Pull stray syscall bits from Al Viro:
     "Several syscall-related commits that were missing from the original"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
      switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE
      unicore32: just use mmap_pgoff()...
      unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE
      x86, vm86: fix VM86 syscalls: use SYSCALL_DEFINEx(...)
| * | | | unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE (Al Viro, 2013-05-09, 1 file, -0/+17)

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | Merge tag 'ecryptfs-3.10-rc1-ablkcipher' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs (Linus Torvalds, 2013-05-10, 2 files, -41/+103)

    Pull eCryptfs update from Tyler Hicks:
     "Improve performance when AES-NI (and most likely other crypto accelerators) is available by moving to the ablkcipher crypto API. The improvement is more apparent on faster storage devices.

      There's no noticeable change when hardware crypto is not available"

    * tag 'ecryptfs-3.10-rc1-ablkcipher' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
      eCryptfs: Use the ablkcipher crypto API
| * | | | | eCryptfs: Use the ablkcipher crypto API (Tyler Hicks, 2013-05-09, 2 files, -41/+103)

    Make the switch from the blkcipher kernel crypto interface to the ablkcipher interface.

    encrypt_scatterlist() and decrypt_scatterlist() now use the ablkcipher interface but, from the eCryptfs standpoint, still treat the crypto operation as a synchronous operation. They submit the async request and then wait until the operation is finished before they return. Most of the changes are contained inside those two functions.

    Despite waiting for the completion of the crypto operation, the ablkcipher interface provides performance increases in most cases when used on AES-NI capable hardware.

    Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
    Acked-by: Colin King <colin.king@canonical.com>
    Reviewed-by: Zeev Zilberman <zeev@annapurnaLabs.com>
    Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
    Cc: Tim Chen <tim.c.chen@intel.com>
    Cc: Ying Huang <ying.huang@intel.com>
    Cc: Thieu Le <thieule@google.com>
    Cc: Li Wang <dragonylffly@163.com>
    Cc: Jarkko Sakkinen <jarkko.sakkinen@iki.fi>
* | | | | | Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu (Linus Torvalds, 2013-05-10, 1 file, -1/+4)

    Pull m68knommu updates from Greg Ungerer:
     "The bulk of the changes are generalizing the ColdFire v3 core support and adding in 537x CPU support. Also a couple of other bug fixes, one to fix a reintroduction of a past bug in the romfs filesystem nommu support."

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
      m68knommu: enable Timer on coldfire 532x
      m68knommu: fix ColdFire 5373/5329 QSPI base address
      m68knommu: add support for configuring a Freescale M5373EVB board
      m68knommu: add support for the ColdFire 537x family of CPUs
      m68knommu: make ColdFire M532x platform support more v3 generic
      m68knommu: create and use a common M53xx ColdFire class of CPUs
      m68k: remove unused asm/dbg.h
      m68k: Set ColdFire ACR1 cache mode depending on kernel configuration
      romfs: fix nommu map length to keep inside filesystem
      m68k: clean up unused "config ROMVECSIZE"
| * | | | | | romfs: fix nommu map length to keep inside filesystem (Greg Ungerer, 2013-04-29, 1 file, -1/+4)

    Checks introduced in commit 4991e7251 ("romfs: do not use mtd->get_unmapped_area directly") re-introduce problems fixed in the earlier commit 2b4b2482e ("romfs: fix romfs_get_unmapped_area() argument check").

    If a flat binary app is located at the end of a romfs, its page aligned length may be outside of the romfs filesystem. The flat binary loader, via nommu do_mmap_pgoff(), page aligns the length it is mmaping. So simple offset+size checks will fail - returning EINVAL.

    We can truncate the length to keep it inside the romfs filesystem, and that also keeps the call to mtd_get_unmapped_area() happy.

    Are there any side effects to truncating the size here though?

    Signed-off-by: Greg Ungerer <gerg@uclinux.org>
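Truncating a page-aligned mapping length so that it stays inside the filesystem can be sketched as below. `clamp_map_len()` and its parameters are hypothetical; the real check sits in romfs_get_unmapped_area() and works on the MTD device size.

```c
#include <errno.h>

/*
 * Keep [offset, offset + *len) inside a filesystem of fs_size bytes.
 * Returns -EINVAL when the mapping would start beyond the end; otherwise
 * trims *len if necessary and returns 0.
 */
static int clamp_map_len(unsigned long fs_size, unsigned long offset,
                         unsigned long *len)
{
    if (offset >= fs_size)
        return -EINVAL;        /* nothing left to map at this offset */
    if (*len > fs_size - offset)
        *len = fs_size - offset;   /* the page-aligned length overshoots */
    return 0;
}
```

For a 10000-byte image and a 4096-byte request at offset 8192, a plain offset+size check would reject the mapping, while this version shortens it to the 1808 bytes that actually exist.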
* | | | | | | Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux (Linus Torvalds, 2013-05-09, 1 file, -0/+2)

    Pull trivial pstore update from Tony Luck:
     "Couple of pstore cleanups"

    It turns out that the kmemdup() conversion ends up being undone by the fact that the memory block also needed the ecc information (see commit bd08ec33b5c2: "pstore/ram: Restore ecc information block"), so all that remains after merging is the error return code change.

    * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
      pstore/ram: fix error return code in ramoops_probe()
      fs: pstore: Replaced calls to kmalloc and memcpy with kmemdup
| * | | | | | | pstore/ram: fix error return code in ramoops_probe() (Wei Yongjun, 2013-05-08, 1 file, -0/+2)

    Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function.

    Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
    Acked-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
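The class of bug fixed here, an error path that jumps to cleanup while the return variable still holds 0, can be sketched with a hypothetical probe function. `sketch_probe()` and its `simulate_alloc_failure` switch are made up for this illustration and are not ramoops_probe().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical probe routine using the usual goto-based cleanup style. */
static int sketch_probe(bool simulate_alloc_failure)
{
    int err = 0;               /* earlier setup steps succeeded */
    char *buf;

    buf = simulate_alloc_failure ? NULL : malloc(64);
    if (!buf) {
        err = -ENOMEM;         /* without this, the failure returns 0 */
        goto fail;
    }

    free(buf);
    return 0;

fail:
    return err;
}
```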
| * | | | | | | fs: pstore: Replaced calls to kmalloc and memcpy with kmemdup (Alexandru Gheorghiu, 2013-03-11, 1 file, -2/+1)

    Replaced calls to kmalloc and memcpy with a single call to kmemdup. This patch was found using coccicheck.

    Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com>
    Acked-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
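The allocate-then-copy pattern that kmemdup() folds into one call has a simple userspace analogue. `sketch_memdup()` below is a hypothetical helper built on malloc() and memcpy(); it only mirrors the idea and is not the kernel function.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Two-step pattern being replaced:
 *
 *     copy = malloc(len);
 *     if (copy)
 *         memcpy(copy, src, len);
 *
 * A single helper keeps the size expression in one place and makes the
 * NULL check harder to forget.
 */
static void *sketch_memdup(const void *src, size_t len)
{
    void *copy = malloc(len);

    if (copy)
        memcpy(copy, src, len);
    return copy;               /* NULL on allocation failure */
}
```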
* | | | | | | | Merge branch 'for-linus' of ↵  Linus Torvalds  2013-05-09  1  -8/+1
|\ \ \ \ \ \ \ \
| | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
| | | | | | | | | Pull more vfs fixes from Al Viro:
| | | | | | | | |  "Regression fix from Geert + yet another open-coded kernel_read()"
| | | | | | | | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
| | | | | | | | |   ecryptfs: don't open-code kernel_read()
| | | | | | | | |   xtensa simdisk: Fix proc_create_data() conversion fallout
| * | | | | | | | ecryptfs: don't open-code kernel_read()  Al Viro  2013-05-09  1  -8/+1
| | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | | | | | Merge branch 'for-linus' of ↵  Linus Torvalds  2013-05-09  48  -1902/+3215
|\ \ \ \ \ \ \ \ \
| | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
| | | | | | | | | | Pull btrfs update from Chris Mason:
| | | | | | | | | |  "These are mostly fixes. The biggest exceptions are Josef's skinny extents and Jan Schmidt's code to rebuild our quota indexes if they get out of sync (or you enable quotas on an existing filesystem).
| | | | | | | | | |   The skinny extents are off by default because they are a new variation on the extent allocation tree format. btrfstune -x enables them, and the new format makes the extent allocation tree about 30% smaller.
| | | | | | | | | |   I rebased this a few days ago to rework Dave Sterba's crc checks on the super block, but almost all of these go back to rc6, since I thought 3.9 was due any minute.
| | | | | | | | | |   The biggest missing fix is the tracepoint bug that was hit late in 3.9. I ran into problems with that in overnight testing and I'm still tracking it down. I'll definitely have that fixed for rc2."
| | | | | | | | | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (101 commits)
| | | | | | | | | |   Btrfs: allow superblock mismatch from older mkfs
| | | | | | | | | |   btrfs: enhance superblock checks
| | | | | | | | | |   btrfs: fix misleading variable name for flags
| | | | | | | | | |   btrfs: use unsigned long type for extent state bits
| | | | | | | | | |   Btrfs: improve the loop of scrub_stripe
| | | | | | | | | |   btrfs: read entire device info under lock
| | | | | | | | | |   btrfs: remove unused gfp mask parameter from release_extent_buffer callchain
| | | | | | | | | |   btrfs: handle errors returned from get_tree_block_key
| | | | | | | | | |   btrfs: make static code static & remove dead code
| | | | | | | | | |   Btrfs: deal with errors in write_dev_supers
| | | | | | | | | |   Btrfs: remove almost all of the BUG()'s from tree-log.c
| | | | | | | | | |   Btrfs: deal with free space cache errors while replaying log
| | | | | | | | | |   Btrfs: automatic rescan after "quota enable" command
| | | | | | | | | |   Btrfs: rescan for qgroups
| | | | | | | | | |   Btrfs: split btrfs_qgroup_account_ref into four functions
| | | | | | | | | |   Btrfs: allocate new chunks if the space is not enough for global rsv
| | | | | | | | | |   Btrfs: separate sequence numbers for delayed ref tracking and tree mod log
| | | | | | | | | |   btrfs: move leak debug code to functions
| | | | | | | | | |   Btrfs: return free space in cow error path
| | | | | | | | | |   Btrfs: set UUID in root_item for created trees
| | | | | | | | | |   ...
| * | | | | | | | | Btrfs: allow superblock mismatch from older mkfs  Chris Mason  2013-05-07  1  -0/+5
| | | | | | | | | | We've added new checks to make sure the super block crc is correct during mount. A fresh filesystem from an older mkfs won't have the crc set. This adds a warning when it finds a newly created filesystem but doesn't fail the mount.
| | | | | | | | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | btrfs: enhance superblock checks  David Sterba  2013-05-07  2  -17/+71
| | | | | | | | | | The superblock checksum is not verified upon mount. <awkward silence>
| | | | | | | | | | Add that check and also reorder existing checks to a more logical order.
| | | | | | | | | | Current mkfs.btrfs does not calculate the correct checksum of super_block and thus a freshly created filesystem will fail to mount when this patch is applied.
| | | | | | | | | | First transaction commit calculates correct superblock checksum and saves it to disk.
| | | | | | | | | | Reproducer:
| | | | | | | | | |   $ mkfs.btrfs /dev/sda
| | | | | | | | | |   $ mount /dev/sda /mnt
| | | | | | | | | |   $ btrfs scrub start /mnt
| | | | | | | | | |   $ sleep 5
| | | | | | | | | |   $ btrfs scrub status /mnt
| | | | | | | | | |   ... super:2 ...
| | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | | | | | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | btrfs: fix misleading variable name for flags  David Sterba  2013-05-06  2  -19/+20
| | | | | | | | | | The variable was named 'data' in btrfs_reserve_extent and that's the only function that actually uses it to let btrfs_get_alloc_profile know what profile we want. Then it's passed down as u64 flags.
| | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | btrfs: use unsigned long type for extent state bits  David Sterba  2013-05-06  3  -37/+40
| | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: improve the loop of scrub_stripe  Liu Bo  2013-05-06  1  -26/+57
| | | | | | | | | | 1) Right now scrub_stripe() is looping in some unnecessary cases:
| | | | | | | | | |    * when the found extent item's objectid is outside the dev extent's range but we haven't finished scanning all the range within the dev extent
| | | | | | | | | |    * when all the items have been processed but we haven't finished scanning all the range within the dev extent
| | | | | | | | | |    In both cases, we can just finish the loop to save costs.
| | | | | | | | | | 2) Besides, when the found extent item's length is larger than the stripe len (64k), we don't have to release the path and search again, as it would land on the same key used in the last loop; we can instead advance the logical cursor in place until all space of the extent is scanned.
| | | | | | | | | | 3) And we use 0 as the key's offset to search the btree, then go to the previous item to find a smaller item, and again have to move to the next one to get the right item. Setting offset=-1 and previous_item() is the correct way.
| | | | | | | | | | 4) As we won't find any checksum at offset unless this 'offset' is in a data extent, we can just find the checksum when we're really going to scrub an extent.
| | | | | | | | | | Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | btrfs: read entire device info under lock  David Sterba  2013-05-06  1  -1/+1
| | | | | | | | | | There's a theoretical possibility of reading stale (or even more theoretically, freed) data from DEV_INFO ioctl when the device would disappear between an early mutex unlock and data being copied from the device structure.
| | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | btrfs: remove unused gfp mask parameter from release_extent_buffer callchain  David Sterba  2013-05-06  3  -16/+7
| | | | | | | | | | It's unused since 0b32f4bbb423f02ac.
| | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | btrfs: handle errors returned from get_tree_block_key  David Sterba  2013-05-06  1  -4/+8
| | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | btrfs: make static code static & remove dead code  Eric Sandeen  2013-05-06  34  -392/+135
| | | | | | | | | | Big patch, but all it does is add statics to functions which are in fact static, then remove the associated dead-code fallout.
| | | | | | | | | | removed functions:
| | | | | | | | | |   btrfs_iref_to_path()
| | | | | | | | | |   __btrfs_lookup_delayed_deletion_item()
| | | | | | | | | |   __btrfs_search_delayed_insertion_item()
| | | | | | | | | |   __btrfs_search_delayed_deletion_item()
| | | | | | | | | |   find_eb_for_page()
| | | | | | | | | |   btrfs_find_block_group()
| | | | | | | | | |   range_straddles_pages()
| | | | | | | | | |   extent_range_uptodate()
| | | | | | | | | |   btrfs_file_extent_length()
| | | | | | | | | |   btrfs_scrub_cancel_devid()
| | | | | | | | | |   btrfs_start_transaction_lflush()
| | | | | | | | | | btrfs_print_tree() is left because it is used for debugging. btrfs_start_transaction_lflush() and btrfs_reada_detach() are left for symmetry.
| | | | | | | | | | ulist.c functions are left, another patch will take care of those.
| | | | | | | | | | Signed-off-by: Eric Sandeen <sandeen@redhat.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: deal with errors in write_dev_supers  Josef Bacik  2013-05-06  1  -1/+11
| | | | | | | | | | If you try to mount -o loop a restored file system it will panic if the file ends up being smaller than the original disk. This is because we go to try and get a block for a super that may be past the EOF which makes __getblk return NULL for a buffer head when we aren't expecting it to. Fix this by dealing with this case and just jacking up the errors count. With this patch we no longer panic when mounting a restored file system loopback. Thanks,
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: remove almost all of the BUG()'s from tree-log.c  Josef Bacik  2013-05-06  1  -53/+98
| | | | | | | | | | There were a whole bunch and I was doing it for other things. I haven't tested these error paths but at the very least this is better than panicking. I've only left 2 BUG_ON()'s since they are logic errors and I want to replace them with an ASSERT framework that we can compile out for production users. Thanks,
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: deal with free space cache errors while replaying log  Josef Bacik  2013-05-06  3  -32/+59
| | | | | | | | | | So everybody who got hit by my fsync bug will still continue to hit this BUG_ON() in the free space cache, which is pretty heavy handed. So I took a file system that had this bug and fixed up all the BUG_ON()'s and leaks that popped up when I tried to mount a broken file system like this. With this patch we just fail to mount instead of panicking. Thanks,
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: automatic rescan after "quota enable" command  Jan Schmidt  2013-05-06  1  -0/+11
| | | | | | | | | | When qgroup tracking is enabled, we do an automatic cycle of the new rescan mechanism.
| | | | | | | | | | Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: rescan for qgroups  Jan Schmidt  2013-05-06  4  -34/+389
| | | | | | | | | | If qgroup tracking is out of sync, a rescan operation can be started. It iterates the complete extent tree and recalculates all qgroup tracking data. This is an expensive operation and should not be used unless required.
| | | | | | | | | | A filesystem under rescan can still be umounted. The rescan continues on the next mount. Status information is provided with a separate ioctl while a rescan operation is in progress.
| | | | | | | | | | Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: split btrfs_qgroup_account_ref into four functions  Jan Schmidt  2013-05-06  1  -105/+148
| | | | | | | | | | The function is separated into a preparation part and the three accounting steps mentioned in the qgroups documentation. The goal is to make steps two and three usable by the rescan functionality. A side effect is that the function is restructured into readable subunits.
| | | | | | | | | | Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: allocate new chunks if the space is not enough for global rsv  Miao Xie  2013-05-06  1  -8/+11
| | | | | | | | | | When running xfstests test 208, the fs returned ENOSPC even though there was lots of free space on the disk. Bisecting showed that it was introduced by commit 96f1bb5777. That commit makes the space check for the global reservation in can_overcommit() inconsistent with should_alloc_chunk(). can_overcommit() requires the free space to be 2 times the size of the global reservation, or we can't overcommit. In that case we need to reclaim some reserved space, and if we still don't have enough free space, we need to allocate a new chunk. Unfortunately, should_alloc_chunk() only requires the free space to be 1 times the size of the global reservation, so we would not try to allocate a new chunk when the free space falls between these two requirements, and would just return ENOSPC.
| | | | | | | | | | Fix it.
| | | | | | | | | | Cc: Jim Schutt <jaschut@sandia.gov>
| | | | | | | | | | Cc: Josef Bacik <jbacik@fusionio.com>
| | | | | | | | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
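The gap between the two thresholds can be seen in a toy model. Everything below is a deliberate simplification with hypothetical names: the real can_overcommit() and should_alloc_chunk() weigh more inputs, and using the same 2x factor on both sides is only an assumed way of making the checks agree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: overcommit is allowed only with at least 2x the reservation free. */
static bool can_overcommit_model(uint64_t free_bytes, uint64_t global_rsv)
{
    return free_bytes >= 2 * global_rsv;
}

/* Toy model of the inconsistent check: a chunk is allocated only below 1x. */
static bool should_alloc_chunk_1x(uint64_t free_bytes, uint64_t global_rsv)
{
    return free_bytes < global_rsv;
}

/* Toy model of a consistent check (assumed): same 2x threshold as above. */
static bool should_alloc_chunk_2x(uint64_t free_bytes, uint64_t global_rsv)
{
    return free_bytes < 2 * global_rsv;
}

/* ENOSPC results when we can neither overcommit nor allocate a new chunk. */
static bool returns_enospc(bool overcommit, bool alloc_chunk)
{
    return !overcommit && !alloc_chunk;
}
```

With a reservation of 100 units and 150 units free, the 1x check leaves a window where neither path is taken; with matching thresholds the window disappears.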
| * | | | | | | | | Btrfs: separate sequence numbers for delayed ref tracking and tree mod log  Jan Schmidt  2013-05-06  7  -19/+63
| | | | | | | | | | Sequence numbers for delayed refs have been introduced in the first version of the qgroup patch set. To solve the problem of find_all_roots on a busy file system, the tree mod log was introduced. The sequence numbers for that were simply shared between those two users.
| | | | | | | | | | However, at one point in qgroup's quota accounting, there's a statement accessing the previous sequence number, that's still just doing (seq - 1) just as it would have to in the very first version.
| | | | | | | | | | To satisfy that requirement, this patch makes the sequence number counter 64 bit and splits it into a major part (used for qgroup sequence number counting) and a minor part (incremented for each tree modification in the log). This enables us to go exactly one major step backwards, as required for qgroups, while still incrementing the sequence counter for tree mod log insertions to keep track of their order. Keeping them in a single variable means there's no need to change all the code dealing with comparisons of two sequence numbers.
| | | | | | | | | | The sequence number is reset to 0 on commit (not new in this patch), which ensures we won't overflow the two 32 bit counters.
| | | | | | | | | | Without this fix, the qgroup tracking can occasionally go wrong and WARN_ONs from the tree mod log code may happen.
| | | | | | | | | | Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
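A minimal sketch of such a split counter, assuming the major part sits in the upper 32 bits and the minor part in the lower 32 bits of one 64-bit value. The helper names are hypothetical and the sketch ignores locking and atomics.

```c
#include <stdint.h>

#define SEQ_MAJOR_SHIFT 32
#define SEQ_MINOR_MASK  0xffffffffULL

/* Major part: the upper 32 bits (qgroup counting in the description above). */
static uint64_t seq_major(uint64_t seq) { return seq >> SEQ_MAJOR_SHIFT; }

/* Minor part: the lower 32 bits (one step per logged tree modification). */
static uint64_t seq_minor(uint64_t seq) { return seq & SEQ_MINOR_MASK; }

/* Minor step: a plain increment, so ordering within a major step is kept. */
static uint64_t seq_inc_minor(uint64_t seq) { return seq + 1; }

/* Major step: clear the minor part, then bump the major part by one. */
static uint64_t seq_inc_major(uint64_t seq)
{
    return (seq & ~SEQ_MINOR_MASK) + (1ULL << SEQ_MAJOR_SHIFT);
}

/* Exactly one major step backwards, whatever the minor part is. */
static uint64_t seq_prev_major(uint64_t seq)
{
    return (seq_major(seq) - 1) << SEQ_MAJOR_SHIFT;
}
```

Because both parts live in one integer, an ordinary `<` comparison still orders two sequence numbers correctly, which is why code that compares them needs no change.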
| * | | | | | | | | btrfs: move leak debug code to functions  Eric Sandeen  2013-05-06  3  -56/+72
| | | | | | | | | | Clean up the leak debugging in extent_io.c by moving the debug code into functions. This also removes the list_heads used for debugging from the extent_buffer and extent_state structures when debug is not enabled.
| | | | | | | | | | Since we need a global debug config to do that last part, implement CONFIG_BTRFS_DEBUG to accommodate.
| | | | | | | | | | Thanks to Dave Sterba for the Kconfig bit.
| | | | | | | | | | Signed-off-by: Eric Sandeen <sandeen@redhat.com>
| | | | | | | | | | Reviewed-by: David Sterba <dsterba@suse.cz>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: return free space in cow error path  Liu Bo  2013-05-06  1  -3/+9
| | | | | | | | | | Replace some BUG_ONs with proper handling and take allocated space back to free space cache for later use. We don't have to worry about extent maps since they'd be freed in releasepage path.
| | | | | | | | | | Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: set UUID in root_item for created trees  Stefan Behrens  2013-05-06  1  -0/+4
| | | | | | | | | | It is a rare exception that a new tree is created, like the qgroups tree. So far these new trees have an all-zero UUID in their root items. All trees that mkfs.btrfs has created get a UUID during the first mount when btrfs_read_root_item() rewrites the root_item to the v2 structure style.
| | | | | | | | | | These UUIDs are never used so far, but since it is better to have them uniform for all trees, this commit adds some lines that generate and write a UUID for newly created trees.
| | | | | | | | | | Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: delete unused parameter to btrfs_read_root_item()  Stefan Behrens  2013-05-06  3  -8/+6
| | | | | | | | | | Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: fix error handling in btrfs_ioctl_send()  Tsutomu Itoh  2013-05-06  1  -2/+2
| | | | | | | | | | fget() returns NULL on error, so we should check the result for NULL.
| | | | | | | | | | Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| * | | | | | | | | Btrfs: remove unused variable in __process_changed_new_xattr()  Tsutomu Itoh  2013-05-06  1  -2/+0
| | | | | | | | | | Variable 'p' is no longer used, so remove it.
| | | | | | | | | | Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
| | | | | | | | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>