summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds2011-08-1226-204/+155
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: replace xfs_buf_geterror() with bp->b_error xfs: Check the return value of xfs_buf_read() for NULL "xfs: fix error handling for synchronous writes" revisited xfs: set cursor in xfs_ail_splice() even when AIL was empty xfs: Remove the macro XFS_BUFTARG_NAME xfs: Remove the macro XFS_BUF_TARGET xfs: Remove the macro XFS_BUF_SET_TARGET Replace the macro XFS_BUF_ISPINNED with helper xfs_buf_ispinned xfs: Remove the macro XFS_BUF_SET_PTR xfs: Remove the macro XFS_BUF_PTR xfs: Remove macro XFS_BUF_SET_START xfs: Remove macro XFS_BUF_HOLD xfs: Remove macro XFS_BUF_BUSY and family xfs: Remove the macro XFS_BUF_ERROR and family xfs: Remove the macro XFS_BUF_BFLAGS
| * xfs: replace xfs_buf_geterror() with bp->b_errorChandra Seetharaman2011-08-122-3/+3
| | | | | | | | | | | | | | | | Since we just checked bp for NULL, it is ok to replace xfs_buf_geterror() with bp->b_error in these places. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Alex Elder <aelder@sgi.com>
| * xfs: Check the return value of xfs_buf_read() for NULLChandra Seetharaman2011-08-122-0/+8
| | | | | | | | | | | | | | | | | | Check the return value of xfs_buf_read() for NULL and return ENOMEM if it is NULL. This is necessary in a few spots to avoid subsequent code blindly dereferencing the null buffer pointer. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Alex Elder <aelder@sgi.com>
| * "xfs: fix error handling for synchronous writes" revisitedAjeet Yadav2011-08-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfs: fix for hang during synchronous buffer write error If removed storage while synchronous buffer write underway, "xfslogd" hangs. Detailed log http://oss.sgi.com/archives/xfs/2011-07/msg00740.html Related work bfc60177f8ab509bc225becbb58f7e53a0e33e81 "xfs: fix error handling for synchronous writes" Given that xfs_bwrite actually does the shutdown already after waiting for the b_iodone completion and given that we actually found that calling xfs_force_shutdown from inside xfs_buf_iodone_callbacks was a major contributor the problem it better to drop this call. Signed-off-by: Ajeet Yadav <ajeet.yadav.77@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * xfs: set cursor in xfs_ail_splice() even when AIL was emptyAlex Elder2011-08-091-38/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In xfs_ail_splice(), if a cursor is provided it is updated to point to the last item on the list being spliced into the AIL. But if the AIL was found to be empty, the cursor (if provided) is just initialized instead. There is no reason the empty AIL case needs to be treated any differently. And treating it the same way allows this code to be rearranged a bit, with a somewhat tidier result. Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
| * Merge branch 'master' of ↵Alex Elder2011-08-08449-12348/+16852
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
| * | xfs: Remove the macro XFS_BUFTARG_NAMEChandra Seetharaman2011-07-254-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition and usages of the macro XFS_BUFTARG_NAME. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove the macro XFS_BUF_TARGETChandra Seetharaman2011-07-255-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition and usages of the macro XFS_BUF_TARGET Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove the macro XFS_BUF_SET_TARGETChandra Seetharaman2011-07-252-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the macro XFS_BUF_SET_TARGET. hch: As all the buffer allocator already set ->b_target it should be safe to simply remove these calls. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | Replace the macro XFS_BUF_ISPINNED with helper xfs_buf_ispinnedChandra Seetharaman2011-07-256-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the macro XFS_BUF_ISPINNED with an inline helper function xfs_buf_ispinned, and change all its usages. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove the macro XFS_BUF_SET_PTRChandra Seetharaman2011-07-253-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition and usages of the macro XFS_BUF_SET_PTR. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove the macro XFS_BUF_PTRChandra Seetharaman2011-07-2516-47/+45
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition and usages of the macro XFS_BUF_PTR. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove macro XFS_BUF_SET_STARTChandra Seetharaman2011-07-252-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition and usage of the macro XFS_BUF_SET_START. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove macro XFS_BUF_HOLDChandra Seetharaman2011-07-254-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition and usage of the macro XFS_BUF_HOLD Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove macro XFS_BUF_BUSY and familyChandra Seetharaman2011-07-257-26/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the definitions and uses of the macros XFS_BUF_BUSY, XFS_BUF_UNBUSY, and XFS_BUF_ISBUSY. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove the macro XFS_BUF_ERROR and familyChandra Seetharaman2011-07-2516-55/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the definitions and usage of the macros XFS_BUF_ERROR, XFS_BUF_GETERROR and XFS_BUF_ISERROR. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | xfs: Remove the macro XFS_BUF_BFLAGSChandra Seetharaman2011-07-253-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | Remove the definition of the macro XFS_BUF_BFLAGS and its usage. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2011-08-121-0/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) e1000e: increase driver version number e1000e: alternate MAC address update e1000e: do not disable receiver on 82574/82583 e1000e: alternate MAC address does not work on device id 0x1060 PCnet: Fix section mismatch bnx2x: disable dcb on 578xx since not supported yet bnx2x: properly clean indirect addresses bnx2x: prevent race between undi_unload and load flows bnx2x: fix select_queue when FCoE is disabled bnx2x: init FCOE FP only once ipv4: some rt_iif -> rt_route_iif conversions net/bridge/netfilter/ebtables.c: use available error handling code net/netlabel/netlabel_kapi.c: add missing cleanup code net/irda: sh_sir: tidyup compile warning net/irda: sh_sir: add missing header net/irda: sh_irda: add missing header slcan: ldisc generated skbs are received in softirq context scm: Capture the full credentials of the scm sender tcp: initialize variable ecn_ok in syncookies path drivers/net/wireless/wl1251: add missing kfree ...
| * | | compat_ioctl: add compat handler for PPPIOCGL2TPSTATSFlorian Westphal2011-08-071-0/+1
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes following error seen on x86_64 kernel: ioctl32(openl2tpd:7480): Unknown cmd fd(14) cmd(80487436){t:'t';sz:72} arg(ffa7e6c0) on socket:[105094] The argument (struct pppol2tp_ioc_stats) uses "aligned_u64" and thus doesn't need fixups. Cc: James Chapman <jchapman@katalix.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | pnfs: Automatically select blocks & objects layoutsBoaz Harrosh2011-08-111-14/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like files-layout, blocks & objects layouts are part of the NFS 4.1 protocol and should be automatically selected if NFS_4_1 is selected. The small problem is that these depend on other Kernel support being present, while files only depends on NFS itself. This patch removes from the user choice the presence of objects and blocks layout. But makes sure these are selected only if the depended subsystems are present in the Kernel. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | ext4: Properly count journal credits for long symlinksEric Sandeen2011-08-111-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit df5e6223407e ("ext4: fix deadlock in ext4_symlink() in ENOSPC conditions") recalculated the number of credits needed for a long symlink, in the process of splitting it into two transactions. However, the first credit calculation under-counted because if selinux is enabled, credits are needed to create the selinux xattr as well. Overrunning the reservation will result in an OOPS in jbd2_journal_dirty_metadata() due to this assert: J_ASSERT_JH(jh, handle->h_buffer_credits > 0); Fix this by increasing the reservation size. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | ext3: Properly count journal credits for long symlinksEric Sandeen2011-08-111-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit ae54870a1dc9 ("ext3: Fix lock inversion in ext3_symlink()") recalculated the number of credits needed for a long symlink, in the process of splitting it into two transactions. However, the first credit calculation under-counted because if selinux is enabled, credits are needed to create the selinux xattr as well. Overrunning the reservation will result in an OOPS in journal_dirty_metadata() due to this assert: J_ASSERT_JH(jh, handle->h_buffer_credits > 0); Fix this by increasing the reservation size. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | move RLIMIT_NPROC check from set_user() to do_execve_common()Vasiliy Kulikov2011-08-111-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC check in set_user() to check for NPROC exceeding via setuid() and similar functions. Before the check there was a possibility to greatly exceed the allowed number of processes by an unprivileged user if the program relied on rlimit only. But the check created new security threat: many poorly written programs simply don't check setuid() return code and believe it cannot fail if executed with root privileges. So, the check is removed in this patch because of too often privilege escalations related to buggy programs. The NPROC can still be enforced in the common code flow of daemons spawning user processes. Most of daemons do fork()+setuid()+execve(). The check introduced in execve() (1) enforces the same limit as in setuid() and (2) doesn't create similar security issues. Neil Brown suggested to track what specific process has exceeded the limit by setting PF_NPROC_EXCEEDED process flag. With the change only this process would fail on execve(), and other processes' execve() behaviour is not changed. Solar Designer suggested to re-check whether NPROC limit is still exceeded at the moment of execve(). If the process was sleeping for days between set*uid() and execve(), and the NPROC counter step down under the limit, the defered execve() failure because NPROC limit was exceeded days ago would be unexpected. If the limit is not exceeded anymore, we clear the flag on successful calls to execve() and fork(). The flag is also cleared on successful calls to set_user() as the limit was exceeded for the previous user, not the current one. Similar check was introduced in -ow patches (without the process flag). v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user(). Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | NFS41: make PNFS_BLOCK selectablePeng Tao2011-08-111-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PNFS_BLOCK needs BLK_DEV_DM/MD, which is not a dependency for other pnfs layout drivers. Seperate it out so others can still build when BLK_DEV_DM/MD is not enabled. Also change select to depends on to avoid build failures. Reported-and-tested-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Peng Tao <peng_tao@emc.com> Acked-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2011-08-104-12/+33
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6: Ecryptfs: Add mount option to check uid of device being mounted = expect uid eCryptfs: Fix payload_len unitialized variable warning eCryptfs: fix compile error eCryptfs: Return error when lower file pointer is NULL
| * | | Ecryptfs: Add mount option to check uid of device being mounted = expect uidJohn Johansen2011-08-091-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Close a TOCTOU race for mounts done via ecryptfs-mount-private. The mount source (device) can be raced when the ownership test is done in userspace. Provide Ecryptfs a means to force the uid check at mount time. Signed-off-by: John Johansen <john.johansen@canonical.com> Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | eCryptfs: Fix payload_len unitialized variable warningTyler Hicks2011-08-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fs/ecryptfs/keystore.c: In function ‘ecryptfs_generate_key_packet_set’: fs/ecryptfs/keystore.c:1991:28: warning: ‘payload_len’ may be used uninitialized in this function [-Wuninitialized] fs/ecryptfs/keystore.c:1976:9: note: ‘payload_len’ was declared here Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | eCryptfs: fix compile errorRoberto Sassu2011-08-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the compile error reported at the address: https://bugzilla.kernel.org/show_bug.cgi?id=40292 The problem arises when compiling eCryptfs as built-in and the 'encrypted' key type as a module. The patch prevents this combination from being set in the kernel configuration, by fixing the eCryptfs dependencies. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Reported-by: David Hill <hilld@binarystorm.net> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | eCryptfs: Return error when lower file pointer is NULLTyler Hicks2011-08-091-8/+10
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an eCryptfs inode's lower file has been closed, and the pointer has been set to NULL, return an error when trying to do a lower read or write rather than calling BUG(). https://bugzilla.kernel.org/show_bug.cgi?id=37292 Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Cc: <stable@kernel.org>
* | | autofs4: fix debug printk warning uncovered by cleanupLinus Torvalds2011-08-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous comit made the autofs4 debug printouts check types against the printout format, and uncovered this bug: fs/autofs4/waitq.c:106:2: warning: format ‘%08lx’ expects type ‘long unsigned int’, but argument 4 has type ‘autofs_wqt_t’ which is due to the insane type for wait_queue_token. That thing should be some fixed well-defined size (preferably just 'unsigned int' or 'u32') but for unexplained reasons it is randomly either 'unsigned long' or 'unsigned int' depending on the architecture. For now, cast it to 'unsigned long' for printing, the way we do elsewhere. Somebody else can try to explain the typedef mess. (There's a reason we don't support excessive use of typedefs in the kernel: it's usually just a good way of confusing yourself). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | autofs4: clean up uaotfs use of debug/info/warning printoutsLinus Torvalds2011-08-081-18/+8
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | Use 'pr_debug()' for DPRINTK, which will do the proper type checking on the arguments (without generating code) even when DEBUG isn't #defined. Also, use the standard __VA_ARGS__ for the macros, and stop the pointless abuse of 'do { xyz } while (0)' when the macro is already a perfectly well-formed single statement. Reported-by: David Howells <dhowells@redhat.com> Suggested-by: Joe Perches <joe@perches.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | vfs: rename 'do_follow_link' to 'should_follow_link'Linus Torvalds2011-08-071-2/+2
| | | | | | | | | | | | | | | | Al points out that the do_follow_link() helper function really is misnamed - it's about whether we should try to follow a symlink or not, not about actually doing the following. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Fix POSIX ACL permission checkAri Savolainen2011-08-071-1/+1
| | | | | | | | | | | | | | | | | | After commit 3567866bf261: "RCUify freeing acls, let check_acl() go ahead in RCU mode if acl is cached" posix_acl_permission is being called with an unsupported flag and the permission check fails. This patch fixes the issue. Signed-off-by: Ari Savolainen <ari.m.savolainen@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds2011-08-067-499/+487
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.open-osd.org/linux-open-osd: ore: Make ore its own module exofs: Rename raid engine from exofs/ios.c => ore exofs: ios: Move to a per inode components & device-table exofs: Move exofs specific osd operations out of ios.c exofs: Add offset/length to exofs_get_io_state exofs: Fix truncate for the raid-groups case exofs: Small cleanup of exofs_fill_super exofs: BUG: Avoid sbi realloc exofs: Remove pnfs-osd private definitions nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
| * | ore: Make ore its own moduleBoaz Harrosh2011-08-063-1/+23
| | | | | | | | | | | | | | | | | | | | | Export everything from ore need exporting. Change Kbuild and Kconfig to build ore.ko as an independent module. Import ore from exofs Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: Rename raid engine from exofs/ios.c => oreBoaz Harrosh2011-08-065-255/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ORE stands for "Objects Raid Engine" This patch is a mechanical rename of everything that was in ios.c and its API declaration to an ore.c and an osd_ore.h header. The ore engine will later be used by the pnfs objects layout driver. * File ios.c => ore.c * Declaration of types and API are moved from exofs.h to a new osd_ore.h * All used types are prefixed by ore_ from their exofs_ name. * Shift includes from exofs.h to osd_ore.h so osd_ore.h is independent, include it from exofs.h. Other than a pure rename there are no other changes. Next patch will move the ore into it's own module and will export the API to be used by exofs and later the layout driver Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: ios: Move to a per inode components & device-tableBoaz Harrosh2011-08-064-183/+218
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Exofs raid engine was saving on memory space by having a single layout-info, single pid, and a single device-table, global to the filesystem. Then passing a credential and object_id info at the io_state level, private for each inode. It would also devise this contraption of rotating the device table view for each inode->ino to spread out the device usage. This is not compatible with the pnfs-objects standard, demanding that each inode can have it's own layout-info, device-table, and each object component it's own pid, oid and creds. So: Bring exofs raid engine to be usable for generic pnfs-objects use by: * Define an exofs_comp structure that holds obj_id and credential info. * Break up exofs_layout struct to an exofs_components structure that holds a possible array of exofs_comp and the array of devices + the size of the arrays. * Add a "comps" parameter to get_io_state() that specifies the ids creds and device array to use for each IO. This enables to keep the layout global, but the device-table view, creds and IDs at the inode level. It only adds two 64bit to each inode, since some of these members already existed in another form. * ios raid engine now access layout-info and comps-info through the passed pointers. Everything is pre-prepared by caller for generic access of these structures and arrays. At the exofs Level: * Super block holds an exofs_components struct that holds the device array, previously in layout. The devices there are in device-table order. The device-array is twice bigger and repeats the device-table twice so now each inode's device array can point to a random device and have a round-robin view of the table, making it compatible to previous exofs versions. * Each inode has an exofs_components struct that is initialized at load time, with it's own view of the device table IDs and creds. When doing IO this gets passed to the io_state together with the layout. While preforming this change. Bugs where found where credentials with the wrong IDs where used to access the different SB objects (super.c). As well as some dead code. It was never noticed because the target we use does not check the credentials. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: Move exofs specific osd operations out of ios.cBoaz Harrosh2011-08-064-73/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ios.c will be moving to an external library, for use by the objects-layout-driver. Remove from it some exofs specific functions. Also g_attr_logical_length is used both by inode.c and ios.c move definition to the later, to keep it independent Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: Add offset/length to exofs_get_io_stateBoaz Harrosh2011-08-063-16/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In future raid code we will need to know the IO offset/length and if it's a read or write to determine some of the array sizes we'll need. So add a new exofs_get_rw_state() API for use when writeing/reading. All other simple cases are left using the old way. The major change to this is that now we need to call exofs_get_io_state later at inode.c::read_exec and inode.c::write_exec when we actually know these things. So this patch is kept separate so I can test things apart from other changes. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: Fix truncate for the raid-groups caseBoaz Harrosh2011-08-041-20/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the general raid-group case the truncate was wrong in that it did not also fix the object length of the neighboring groups. There are two bad cases in the old code: 1. Space that should be freed was not. 2. If a file That was big is truncated small, then made bigger again, the holes would not contain zeros but could expose old data. (If the growing of the file expands to more than a full groups cycle + group size (> S + T)) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: Small cleanup of exofs_fill_superBoaz Harrosh2011-08-041-4/+2
| | | | | | | | | | | | | | | | | | | | | Small cleanup that unifies duplicated code used in both the error and success cases Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: BUG: Avoid sbi reallocBoaz Harrosh2011-08-042-24/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the beginning we realloced the sbi structure when a bigger then one device table was specified. (I know that was really stupid). Then much later when "register bdi" was added (By Jens) it was registering the pointer to sbi->bdi before the realloc. We never saw this problem because up till now the realloc did not do anything since the device table was small enough to fit in the original allocation. But once we starting testing with large device tables (Bigger then 28) we noticed the crash of writeback operating on a deallocated pointer. * Avoid the all mess by allocating the device-table as a second array and get rid of the variable-sized structure and the rest of this mess. * Take the chance to clean near by structures and comments. * Add a needed dprint on startup to indicate the loaded layout. * Also move the bdi registration to the very end because it will only fail in a low memory, which will probably fail before hand. There are many more likely causes to not load before that. This way the error handling is made simpler. (Just doing this would be enough to fix the BUG) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | exofs: Remove pnfs-osd private definitionsBoaz Harrosh2011-08-042-50/+1
| | | | | | | | | | | | | | | | | | | | | Now that pnfs-osd has hit mainline we can remove exofs's private header. (And the FIXME comment) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | | vfs: optimize inode cache access patternsLinus Torvalds2011-08-063-12/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The inode structure layout is largely random, and some of the vfs paths really do care. The path lookup in particular is already quite D$ intensive, and profiles show that accessing the 'inode->i_op->xyz' fields is quite costly. We already optimized the dcache to not unnecessarily load the d_op structure for members that are often NULL using the DCACHE_OP_xyz bits in dentry->d_flags, and this does something very similar for the inode ops that are used during pathname lookup. It also re-orders the fields so that the fields accessed by 'stat' are together at the beginning of the inode structure, and roughly in the order accessed. The effect of this seems to be in the 1-2% range for an empty kernel "make -j" run (which is fairly kernel-intensive, mostly in filename lookup), so it's visible. The numbers are fairly noisy, though, and likely depend a lot on exact microarchitecture. So there's more tuning to be done. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | vfs: renumber DCACHE_xyz flags, remove some stale onesLinus Torvalds2011-08-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Gcc tends to generate better code with small integers, including the DCACHE_xyz flag tests - so move the common ones to be first in the list. Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and DCACHE_AUTOFS_PENDING values, their users no longer exists in the source tree. And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the common case to be a nice straight-line fall-through. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds2011-08-066-18/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: cope with negative dentries in cifs_get_root cifs: convert prefixpath delimiters in cifs_build_path_to_root CIFS: Fix missing a decrement of inFlight value cifs: demote DFS referral lookup errors to cFYI Revert "cifs: advertise the right receive buffer size to the server"
| * | | cifs: cope with negative dentries in cifs_get_rootJeff Layton2011-08-051-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The loop around lookup_one_len doesn't handle the case where it might return a negative dentry, which can cause an oops on the next pass through the loop. Check for that and break out of the loop with an error of -ENOENT if there is one. Fixes the panic reported here: https://bugzilla.redhat.com/show_bug.cgi?id=727927 Reported-by: TR Bentley <home@trarbentley.net> Reported-by: Iain Arnell <iarnell@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | cifs: convert prefixpath delimiters in cifs_build_path_to_rootJeff Layton2011-08-051-12/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Regression from 2.6.39... The delimiters in the prefixpath are not being converted based on whether posix paths are in effect. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=727834 Reported-and-Tested-by: Iain Arnell <iarnell@gmail.com> Reported-by: Patrick Oltmann <patrick.oltmann@gmx.net> Cc: Pavel Shilovsky <piastryyy@gmail.com> Cc: stable@kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | CIFS: Fix missing a decrement of inFlight valuePavel Shilovsky2011-08-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if we failed on getting mid entry in cifs_call_async. Cc: stable@kernel.org Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | cifs: demote DFS referral lookup errors to cFYIJeff Layton2011-08-032-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cifs: demote DFS referral lookup errors to cFYI Now that we call into this routine on every mount, anyone who doesn't have the upcall configured will get multiple printks about failed lookups. Reported-and-Tested-by: Martijn Uffing <mp3project@sarijopen.student.utwente.nl> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>