Commit message | Author | Age | Files | Lines
* ceph: make lease code DN specific | Sage Weil | 2010-05-29 | 2 | -12/+13
    The lease code includes a mask in the CEPH_LOCK_* namespace, but that
    namespace is changing, and only one mask (formerly _DN == 1) is used, so
    hard-code that value for now. If we ever extend this code to handle
    leases over different data types, we can extend it accordingly.

    Signed-off-by: Sage Weil <sage@newdream.net>
* fs/ceph: Use ERR_CAST | Julia Lawall | 2010-05-29 | 6 | -6/+6
    Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes the
    purpose of the operation clearer; the latter looks like a no-op.

    In the case of fs/ceph/inode.c, ERR_CAST is not needed, because the type
    of the returned value is the same as the type of the enclosing function.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    type T;
    T x;
    identifier f;
    @@

    T f (...) { <+...
    - ERR_PTR(PTR_ERR(x))
    + x
     ...+> }

    @@
    expression x;
    @@

    - ERR_PTR(PTR_ERR(x))
    + ERR_CAST(x)
    // </smpl>

    Signed-off-by: Julia Lawall <julia@diku.dk>
    Signed-off-by: Sage Weil <sage@newdream.net>
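To make the difference concrete, here is a minimal userspace sketch of the error-pointer idiom. The helpers below are simplified stand-ins modeled on the kernel's linux/err.h, not the kernel definitions themselves, and lookup_inode(), lookup_dentry(), and both struct types are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
/* Re-type an error pointer without touching its value. */
static inline void *ERR_CAST(const void *ptr) { return (void *)ptr; }

struct inode_like { int ino; };
struct dentry_like { int id; };

/* Hypothetical lookup that fails with -ENOENT on request. */
static struct inode_like *lookup_inode(int fail)
{
	static struct inode_like ok = { 42 };
	return fail ? ERR_PTR(-ENOENT) : &ok;
}

/* The pointer types differ, so the error has to be re-typed. */
static struct dentry_like *lookup_dentry(int fail)
{
	static struct dentry_like d = { 7 };
	struct inode_like *in = lookup_inode(fail);

	if (IS_ERR(in))
		return ERR_CAST(in);	/* previously ERR_PTR(PTR_ERR(in)) */
	return &d;
}

void example_usage(void)
{
	assert(PTR_ERR(lookup_dentry(1)) == -ENOENT);
	assert(!IS_ERR(lookup_dentry(0)));
}
```

When the caller and callee return the same pointer type, the error pointer can simply be returned as-is, which is the fs/ceph/inode.c case mentioned above.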
* ceph: renew auth tickets before they expire | Sage Weil | 2010-05-29 | 4 | -1/+27
    We were only requesting renewal after our tickets expire; do so before
    that. Most of the low-level logic for this was already there; just use
    it.

    Signed-off-by: Sage Weil <sage@newdream.net>
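The idea of renewing ahead of expiry can be sketched as follows. This is an illustration only: the struct, the function names, and the choice of renewing once a quarter of the validity period remains are all assumptions, not details taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical ticket bookkeeping; field names are illustrative. */
struct ticket {
	time_t expires;		/* ticket is invalid from this time on */
	time_t renew_after;	/* start asking for a new ticket from here */
};

/* Record a freshly issued ticket. The 1/4 margin is an assumption. */
static void ticket_issue(struct ticket *t, time_t now, time_t validity)
{
	t->expires = now + validity;
	t->renew_after = t->expires - validity / 4;
}

/* True once we are inside the renewal window (before expiry). */
static bool ticket_needs_renewal(const struct ticket *t, time_t now)
{
	return now >= t->renew_after;
}

void example_usage(void)
{
	struct ticket t;

	ticket_issue(&t, 1000, 400);
	assert(!ticket_needs_renewal(&t, 1299));
	assert(ticket_needs_renewal(&t, 1300));	/* still valid until 1400 */
}
```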
* ceph: do not resend mon requests on auth ticket renewal | Sage Weil | 2010-05-29 | 1 | -1/+4
    We only want to send pending mon requests when we successfully
    authenticate. If we are already authenticated, like when we renew our
    ticket, there is no need to resend pending requests.

    Signed-off-by: Sage Weil <sage@newdream.net>
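The rule described here amounts to resending only on the transition from unauthenticated to authenticated. A minimal sketch, with a hypothetical mon_client struct and handler name that are not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical monitor-client state for illustration. */
struct mon_client {
	bool authenticated;
	int pending;	/* requests waiting on the monitor */
	int sent;	/* total request transmissions so far */
};

/* Resend pending requests only when authentication newly succeeds. */
static void handle_auth_reply(struct mon_client *mc, bool auth_ok)
{
	bool was_authenticated = mc->authenticated;

	mc->authenticated = auth_ok;
	if (!was_authenticated && auth_ok)
		mc->sent += mc->pending;	/* initial auth: (re)send */
	/* ticket renewal: already authenticated, nothing to resend */
}

void example_usage(void)
{
	struct mon_client mc = { false, 3, 0 };

	handle_auth_reply(&mc, true);	/* initial authentication */
	assert(mc.sent == 3);
	handle_auth_reply(&mc, true);	/* ticket renewal */
	assert(mc.sent == 3);
}
```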
* ceph: removed duplicated #includes | Andrea Gelmini | 2010-05-29 | 2 | -2/+0
    fs/ceph/auth.c: linux/slab.h is included more than once.
    fs/ceph/super.h: linux/slab.h is included more than once.

    Acked-by: Christoph Lameter <cl@linux-foundation.org>
    Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: avoid possible null dereference | Sage Weil | 2010-05-29 | 1 | -2/+2
    ac->ops may be null; use protocol id in error message instead.

    Reported-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: make mds requests killable, not interruptible | Sage Weil | 2010-05-29 | 1 | -2/+2
    The underlying problem is that many mds requests can't be restarted. For
    example, a restarted create() would return -EEXIST if the original
    request succeeds. However, we do not want a hung MDS to hang the client
    too. So, use the _killable wait_for_completion variants to abort on
    SIGKILL but nothing else.

    Signed-off-by: Sage Weil <sage@newdream.net>
* sched: add wait_for_completion_killable_timeout | Sage Weil | 2010-05-29 | 2 | -0/+19
    Add missing _killable_timeout variant for wait_for_completion that will
    return when a timeout expires or the task is killed.

    CC: Ingo Molnar <mingo@elte.hu>
    CC: Andreas Herrmann <andreas.herrmann3@amd.com>
    CC: Thomas Gleixner <tglx@linutronix.de>
    CC: Mike Galbraith <efault@gmx.de>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: reuse mon subscribe message instead of allocated anew | Sage Weil | 2010-05-21 | 2 | -10/+14
    Use the same message, allocated during startup. No need to allocate a
    new one each time around (and potentially hit ENOMEM).

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: avoid resending queued message to monitor | Sage Weil | 2010-05-21 | 1 | -0/+2
    The auth_reply handler will (re)send any pending requests. For the
    initial mon authenticate phase, that's correct, but when an auth ticket
    renewal races with an in-flight request, we may resend a request message
    that is already in flight. Avoid this by revoking the message before
    sending it.

    We should also avoid resending requests at all during ticket renewal;
    that will come soon.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: Storage class should be before const qualifier | Tobias Klauser | 2010-05-21 | 3 | -6/+6
    The C99 specification states in section 6.11.5:

        The placement of a storage-class specifier other than at the
        beginning of the declaration specifiers in a declaration is an
        obsolescent feature.

    Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
    Signed-off-by: Sage Weil <sage@newdream.net>
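As an illustration of the ordering this commit prefers, the sketch below declares a lookup table with the storage-class specifier first. The table contents and the op_name() helper are made up for the example and are not taken from the ceph sources.

```c
#include <assert.h>
#include <string.h>

/*
 * Obsolescent ordering (C99 6.11.5), still accepted by compilers:
 *     const static char * const op_names[] = { ... };
 * Preferred ordering: storage class first, then qualifiers and type.
 */
static const char * const op_names[] = { "read", "write", "stat" };

/* Map an index to its name, with a fallback for out-of-range values. */
static const char *op_name(unsigned int op)
{
	if (op >= sizeof(op_names) / sizeof(op_names[0]))
		return "unknown";
	return op_names[op];
}

void example_usage(void)
{
	assert(strcmp(op_name(1), "write") == 0);
	assert(strcmp(op_name(9), "unknown") == 0);
}
```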
* ceph: all allocation functions should get gfp_mask | Yehuda Sadeh | 2010-05-17 | 8 | -30/+32
    This is essential, as for the rados block device we'll need to run in
    different contexts that need flags other than GFP_NOFS.

    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: specify max_bytes on readdir replies | Sage Weil | 2010-05-17 | 4 | -1/+14
    Specify max bytes in request to bound size of reply. Add associated
    mount option with default value of 512 KB.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: cleanup pool op strings | Sage Weil | 2010-05-17 | 1 | -19/+12
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: Use kzalloc | Julia Lawall | 2010-05-17 | 1 | -2/+1
    Use kzalloc rather than the combination of kmalloc and memset.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    expression x,size,flags;
    statement S;
    @@

    -x = kmalloc(size,flags);
    +x = kzalloc(size,flags);
     if (x == NULL) S
    -memset(x, 0, size);
    // </smpl>

    Signed-off-by: Julia Lawall <julia@diku.dk>
    Signed-off-by: Sage Weil <sage@newdream.net>
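The same transformation has a userspace analogue, shown here as a sketch: calloc() stands in for kzalloc(), and malloc() plus memset() for the kmalloc() and memset() pair. The session struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct session {
	int id;
	char name[16];
};

/* Before: allocate, check, then zero by hand. */
static struct session *session_alloc_two_step(void)
{
	struct session *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	memset(s, 0, sizeof(*s));
	return s;
}

/* After: one call that returns zeroed memory (or NULL). */
static struct session *session_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct session));
}

void example_usage(void)
{
	struct session *a = session_alloc_two_step();
	struct session *b = session_alloc_zeroed();

	/* Both variants hand back identical, all-zero objects. */
	if (a && b)
		assert(memcmp(a, b, sizeof(*a)) == 0);
	free(a);
	free(b);
}
```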
* ceph: use common helper for aborted dir request invalidation | Sage Weil | 2010-05-17 | 3 | -31/+27
    We invalidate I_COMPLETE and dentry leases in two places: on aborted mds
    request and on request replay. Use common helper to avoid duplicate
    code.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: cope with out of order (unsafe after safe) mds reply | Sage Weil | 2010-05-17 | 1 | -0/+6
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: save peer feature bits in connection structure | Sage Weil | 2010-05-17 | 2 | -0/+2
    These are used for adjusting behavior, such as conditionally encoding a
    newer message format.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: resync headers with userland | Sage Weil | 2010-05-17 | 6 | -22/+91
    Notable changes include pool op defines and types, FLOCK feature bit,
    and new CMPXATTR osd ops.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: use ceph. prefix for virtual xattrs | Sage Weil | 2010-05-17 | 1 | -10/+11
    Drop the 'user.' prefix and use just 'ceph.' for fs virtual xattrs.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: throw out dirty caps metadata, data on session teardown | Sage Weil | 2010-05-17 | 1 | -3/+41
    The remove_session_caps() helper is called when an MDS closes out our
    session (either normally, or as a result of a failed reconnect), and
    when we tear down state for umount.

    If we remove the last cap, and there are no cap migrations in progress,
    then there is little hope of us flushing out that data to the mds
    (without heroic efforts to reconnect and flush). So, to avoid leaving
    inodes pinned (due to dirty state) and crashing after umount, throw out
    dirty caps state and unpin the inodes. Print a warning to the console so
    we know something was lost.

    NOTE: Although we drop wrbuffer refs, we don't actually mark pages
    clean; maybe a truncate should be queued?

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: attempt mds reconnect if mds closes our session | Sage Weil | 2010-05-17 | 1 | -2/+3
    Currently, if our session is closed (due to a timeout, or explicit
    close, or whatever), we just sit there doing nothing unless/until the
    MDS restarts, at which point we try to reconnect.

    Change client to attempt an immediate reconnect if our session is
    closed.

    Note that currently the MDS doesn't support this, and our attempt will
    fail. We'll get a session CLOSE, our caps and dirty cap state will be
    dropped, and the client will be free to attempt to reconnect. That's
    clearly not as nice as a successful reconnect, but it at least allows us
    to try to carry on, and in the future the MDS will support a reconnect
    and we will fare better.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up send_mds_reconnect interface | Sage Weil | 2010-05-17 | 1 | -31/+16
    Pass a ceph_mds_session, since the caller has it.

    Remove the dead code for sending empty reconnects. It used to be used
    when the MDS contacted _us_ to solicit a reconnect, and we could reply
    saying "go away, I have no session." Now we only send reconnects based
    on the mds map, and only when we do in fact have an open session.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: wait for mds OPEN reply to indicate reconnect success | Sage Weil | 2010-05-17 | 1 | -15/+13
    We used to infer reconnect success by watching the MDS state,
    essentially assuming that hearing nothing meant things were ok. That
    wasn't particularly reliable. Instead, the MDS replies with an explicit
    OPEN message to indicate success.

    Strictly speaking, this is a protocol change, but it is a backwards
    compatible one that does not break new clients + old servers or old
    clients + new servers. At least not yet.

    Drop unused @all argument from kick_requests while we're at it.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: only send cap releases when mds is OPEN|HUNG | Sage Weil | 2010-05-17 | 1 | -1/+3
    On OPENING we shouldn't have any caps (or releases). On CLOSING, we
    should wait until we succeed (and throw it all out), or don't (and are
    OPEN again). On RECONNECTING we can wait until we are OPEN.

    Signed-off-by: Sage Weil <sage@newdream.net>
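The state rule above can be sketched as a small predicate. The enum and function names below are hypothetical and only mirror the state names used in the commit message; they are not the kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical session states, named after the commit message. */
enum session_state {
	SESSION_OPENING,
	SESSION_OPEN,
	SESSION_HUNG,
	SESSION_CLOSING,
	SESSION_RECONNECTING,
};

/* Cap releases go out only while the session is OPEN or HUNG. */
static bool may_send_cap_releases(enum session_state state)
{
	switch (state) {
	case SESSION_OPEN:
	case SESSION_HUNG:
		return true;
	default:
		/* OPENING: no caps yet; CLOSING, RECONNECTING: wait. */
		return false;
	}
}

void example_usage(void)
{
	assert(may_send_cap_releases(SESSION_OPEN));
	assert(!may_send_cap_releases(SESSION_RECONNECTING));
}
```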
* ceph: dicard cap releases on mds restart | Sage Weil | 2010-05-17 | 1 | -0/+41
    If the MDS restarts, the expire caps state is no longer shared, and can
    be thrown out. Caps state will be rebuilt on the MDS during the
    reconnect process that follows. Zero out any release messages and adjust
    the release counter accordingly.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: make mon client statfs handling more generic | Yehuda Sadeh | 2010-05-17 | 3 | -52/+58
    This is being done so that we can reuse the statfs infrastructure with
    other requests that return values.

    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: drop src address(es) from message header [new protocol feature] | Sage Weil | 2010-05-17 | 3 | -11/+36
    The CEPH_FEATURE_NOSRCADDR protocol feature avoids putting the full
    source address in each message header (twice). This patch switches the
    client to the new scheme, and _requires_ this feature on the server. The
    server will support both the old and new schemes. That means an old
    client will work with a new server, but a new client will not work with
    an old server.

    Signed-off-by: Sage Weil <sage@newdream.net>
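Requiring a feature on the peer usually comes down to a bitmask check during the handshake. The following is a sketch of that check, not the client's actual code: the macro names, the bit position chosen for the feature, and the return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative feature bit; the real bit position is an assumption. */
#define FEATURE_NOSRCADDR	(1ULL << 1)
/* Features this (new) client insists on. */
#define REQUIRED_FEATURES	FEATURE_NOSRCADDR

/* Bits we require that the peer did not advertise. */
static uint64_t missing_features(uint64_t peer_features)
{
	return REQUIRED_FEATURES & ~peer_features;
}

/* 0 if the peer is acceptable, -1 if a required feature is absent. */
static int accept_peer(uint64_t peer_features)
{
	return missing_features(peer_features) ? -1 : 0;
}

void example_usage(void)
{
	assert(accept_peer(FEATURE_NOSRCADDR | 1ULL) == 0);	/* new server */
	assert(accept_peer(1ULL) == -1);			/* old server */
}
```

A server that supports both schemes would not list the feature as required, which is why an old client can still talk to a new server.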
* ceph: cleanup: remove unused assignement | Dan Carpenter | 2010-05-17 | 1 | -2/+1
    We don't ever use "dirty" so we can remove it.

    Signed-off-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up cap release loop vs spinlock | Sage Weil | 2010-05-17 | 1 | -4/+3
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: name bdi ceph-%d instead of major:minor | Sage Weil | 2010-05-17 | 1 | -1/+4
    The bdi_setup_and_register() helper doesn't help us since we bdi_init()
    in create_client() and bdi_register() only when sget() succeeds.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: skip mds sync on forced unmount | Sage Weil | 2010-05-17 | 1 | -0/+3
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: adjust masked struct_v variable names | Sage Weil | 2010-05-17 | 1 | -9/+9
    Reported-by: Bill Pemberton <wfp5p@virginia.edu>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up mount options, ->show_options() | Sage Weil | 2010-05-17 | 2 | -40/+69
    Ensure all options are included in /proc/mounts. Some cleanup.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: set dn offset when spliced | Sage Weil | 2010-05-17 | 2 | -40/+44
    We want to assign an offset when the dentry goes from null to linked,
    which is always done by splice_dentry(). Notably, we should NOT assign
    an offset when a dentry is first created and is still null.

    BUG if we try to splice a non-null dentry (we shouldn't).

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: don't clobber i_max_offset on already complete dir | Sage Weil | 2010-05-17 | 1 | -1/+2
    This can screw up offsets assigned to new dentries and break dcache
    readdir results.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: skip set_dentry_offset work if directory not I_COMPLETE | Sage Weil | 2010-05-17 | 1 | -0/+4
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: set next_offset on readdir finish | Sage Weil | 2010-05-17 | 1 | -1/+1
    Set next_offset to 2 (always 2!), not 0, on readdir finish.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: listxattr should compare version by >= | Henry C Chang | 2010-05-17 | 1 | -1/+1
    If the version hasn't changed, don't rebuild the index.

    Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
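The effect of comparing with >= rather than > can be sketched as below. The struct, its fields, and the rebuild counter are hypothetical and exist only to make the behavior observable; the real code paths are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-inode xattr bookkeeping. */
struct xattr_state {
	uint64_t blob_version;	/* version of the raw xattr blob */
	uint64_t index_version;	/* version the index was built from */
	int rebuilds;		/* how many times the index was rebuilt */
};

/* Rebuild the index only if the blob is newer than the index. */
static void build_index_if_stale(struct xattr_state *x)
{
	/* ">=" also skips the equal case, i.e. an unchanged version. */
	if (x->index_version >= x->blob_version)
		return;
	x->rebuilds++;
	x->index_version = x->blob_version;
}

void example_usage(void)
{
	struct xattr_state x = { 5, 4, 0 };

	build_index_if_stale(&x);	/* stale: rebuild once */
	build_index_if_stale(&x);	/* same version: no rebuild */
	assert(x.rebuilds == 1);
}
```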
* ceph: fix xattr dangling pointer / double free | Sage Weil | 2010-05-17 | 1 | -0/+1
    If we use the xattr_blob, clear the pointer so we don't release the
    memory at the bottom of the function.

    Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
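The pattern behind this fix is handing ownership of a buffer to a longer-lived object and then clearing the local pointer so the common cleanup path does not free it again. A userspace sketch, with hypothetical names (inode_like, fill_inode, dup_str) standing in for the real code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct inode_like {
	char *xattr_blob;	/* owned by the inode once assigned */
};

/* Small helper: heap copy of a string (NULL on allocation failure). */
static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

static int fill_inode(struct inode_like *in, const char *data, int use_blob)
{
	char *blob = dup_str(data);

	if (!blob)
		return -1;
	if (use_blob) {
		free(in->xattr_blob);
		in->xattr_blob = blob;
		blob = NULL;	/* ownership moved; skip the free below */
	}
	free(blob);		/* free(NULL) is a no-op */
	return 0;
}

void example_usage(void)
{
	struct inode_like in = { NULL };

	assert(fill_inode(&in, "a=1", 1) == 0);
	assert(in.xattr_blob && strcmp(in.xattr_blob, "a=1") == 0);
	free(in.xattr_blob);
}
```

Without the assignment to NULL, the final free() would release memory the inode still points at, leaving a dangling pointer and a later double free.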
* ceph: close messenger race | Sage Weil | 2010-05-17 | 1 | -7/+7
    Simplify messenger locking, and close race between ceph_con_close()
    setting the CLOSED bit and con_work() checking the bit, then taking the
    mutex.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: name msgpools; useful error messages | Sage Weil | 2010-05-17 | 3 | -7/+16
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: fix memory leak due to possible dentry init race | Sage Weil | 2010-05-17 | 1 | -1/+4
    Free dentry_info in error path.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: include auth method in error messages | Sage Weil | 2010-05-17 | 4 | -4/+9
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: osdtimeout=0 for now timeout | Sage Weil | 2010-05-17 | 1 | -1/+1
    Allow the osd reset timeout to be disabled.

    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: d_obtain_alias() returns ERR_PTR() | Dan Carpenter | 2010-05-17 | 1 | -6/+6
    d_obtain_alias() doesn't return NULL, it returns an ERR_PTR().

    Signed-off-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: wake up mount thread when getting osdmap | Yehuda Sadeh | 2010-05-17 | 1 | -0/+1
    Now that the mount thread waits for the osdmap, it needs to be awakened.

    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* ceph: remove unused #includes | Huang Weiyi | 2010-05-17 | 1 | -3/+0
    Remove unused #include's in fs/ceph/super.c

    Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: wait for both monmap and osdmap when opening session | Sage Weil | 2010-05-17 | 1 | -5/+6
    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* ceph: clean up connection reset | Sage Weil | 2010-05-17 | 2 | -1/+2
    Reset out_keepalive_pending and peer_global_seq, and drop unused var.

    Signed-off-by: Sage Weil <sage@newdream.net>