Commit message | Author | Age | Files | Lines
* ceph: cope with out of order (unsafe after safe) mds reply | Sage Weil | 2010-05-17 | 1 | -0/+6
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: save peer feature bits in connection structure | Sage Weil | 2010-05-17 | 2 | -0/+2
    These are used for adjusting behavior, such as conditionally encoding a newer message format.
    Signed-off-by: Sage Weil <sage@newdream.net>
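A minimal sketch of how saved peer feature bits might gate the message encoding. The struct, the `FEATURE_NEW_FORMAT` bit, and the function name are hypothetical illustrations, not the kernel client's actual definitions.

```c
#include <stdint.h>

/* Hypothetical feature bit; not a real Ceph constant. */
#define FEATURE_NEW_FORMAT (1u << 0)

/* Simplified stand-in for a connection that remembers what the peer supports. */
struct conn {
    uint32_t peer_features; /* saved when the connection is negotiated */
};

/* Returns the message format version to encode for this peer. */
static int pick_encoding_version(const struct conn *con)
{
    if (con->peer_features & FEATURE_NEW_FORMAT)
        return 2; /* peer understands the newer format */
    return 1;     /* fall back to the older format */
}
```

Keeping the bits on the connection means each encoder can decide per peer instead of relying on a global setting.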
* ceph: resync headers with userland | Sage Weil | 2010-05-17 | 6 | -22/+91
    Notable changes include pool op defines and types, FLOCK feature bit, and new CMPXATTR osd ops.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: use ceph. prefix for virtual xattrs | Sage Weil | 2010-05-17 | 1 | -10/+11
    Drop the 'user.' prefix and use just 'ceph.' for fs virtual xattrs.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: throw out dirty caps metadata, data on session teardown | Sage Weil | 2010-05-17 | 1 | -3/+41
    The remove_session_caps() helper is called when an MDS closes out our session (either normally, or as a result of a failed reconnect), and when we tear down state for umount. If we remove the last cap, and there are no cap migrations in progress, then there is little hope of us flushing out that data to the mds (without heroic efforts to reconnect and flush). So, to avoid leaving inodes pinned (due to dirty state) and crashing after umount, throw out dirty caps state and unpin the inodes. Print a warning to the console so we know something was lost.
    NOTE: Although we drop wrbuffer refs, we don't actually mark pages clean; maybe a truncate should be queued?
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: attempt mds reconnect if mds closes our session | Sage Weil | 2010-05-17 | 1 | -2/+3
    Currently, if our session is closed (due to a timeout, or explicit close, or whatever), we just sit there doing nothing unless/until the MDS restarts, at which point we try to reconnect. Change client to attempt an immediate reconnect if our session is closed.
    Note that currently the MDS doesn't support this, and our attempt will fail. We'll get a session CLOSE, our caps and dirty cap state will be dropped, and the client will be free to attempt to reconnect. That's clearly not as nice as a successful reconnect, but it at least allows us to try to carry on, and in the future the MDS will support a reconnect and we will fare better.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up send_mds_reconnect interface | Sage Weil | 2010-05-17 | 1 | -31/+16
    Pass a ceph_mds_session, since the caller has it. Remove the dead code for sending empty reconnects. It used to be used when the MDS contacted _us_ to solicit a reconnect, and we could reply saying "go away, I have no session." Now we only send reconnects based on the mds map, and only when we do in fact have an open session.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: wait for mds OPEN reply to indicate reconnect success | Sage Weil | 2010-05-17 | 1 | -15/+13
    We used to infer reconnect success by watching the MDS state, essentially assuming that hearing nothing meant things were ok. That wasn't particularly reliable. Instead, the MDS replies with an explicit OPEN message to indicate success.
    Strictly speaking, this is a protocol change, but it is a backwards compatible one that does not break new clients + old servers or old clients + new servers. At least not yet.
    Drop unused @all argument from kick_requests while we're at it.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: only send cap releases when mds is OPEN|HUNG | Sage Weil | 2010-05-17 | 1 | -1/+3
    On OPENING we shouldn't have any caps (or releases). On CLOSING, we should wait until we succeed (and throw it all out), or don't (and are OPEN again). On RECONNECTING we can wait until we are OPEN.
    Signed-off-by: Sage Weil <sage@newdream.net>
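The state rule above can be sketched as a small predicate. The enum and function names here are hypothetical and only mirror the state names used in the commit message; the kernel's real constants and helper may differ.

```c
#include <stdbool.h>

/* Hypothetical session states, named after those in the commit message. */
enum session_state {
    SESSION_OPENING,
    SESSION_OPEN,
    SESSION_HUNG,
    SESSION_CLOSING,
    SESSION_RECONNECTING,
};

/* Cap releases are sent only while the session is OPEN or HUNG. */
static bool may_send_cap_releases(enum session_state state)
{
    switch (state) {
    case SESSION_OPEN:
    case SESSION_HUNG:
        return true;
    default:
        /* OPENING: nothing to release yet. CLOSING, RECONNECTING: wait. */
        return false;
    }
}
```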
* ceph: discard cap releases on mds restart | Sage Weil | 2010-05-17 | 1 | -0/+41
    If the MDS restarts, the expire caps state is no longer shared, and can be thrown out. Caps state will be rebuilt on the MDS during the reconnect process that follows. Zero out any release messages and adjust the release counter accordingly.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: make mon client statfs handling more generic | Yehuda Sadeh | 2010-05-17 | 3 | -52/+58
    This is being done so that we can reuse the statfs infrastructure with other requests that return values.
    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: drop src address(es) from message header [new protocol feature] | Sage Weil | 2010-05-17 | 3 | -11/+36
    The CEPH_FEATURE_NOSRCADDR protocol feature avoids putting the full source address in each message header (twice). This patch switches the client to the new scheme, and _requires_ this feature on the server. The server will support both the old and new schemes. That means an old client will work with a new server, but a new client will not work with an old server.
    Signed-off-by: Sage Weil <sage@newdream.net>
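The compatibility matrix described above follows from a "supported versus required" feature check. The sketch below is an assumption-laden illustration: the `peer` struct, the `compatible()` helper, and the bit position chosen for `FEATURE_NOSRCADDR` are hypothetical and are not taken from the Ceph sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position only; an assumption for this sketch. */
#define FEATURE_NOSRCADDR (1u << 1)

struct peer {
    uint32_t supported; /* features this side can speak */
    uint32_t required;  /* features the other side must support */
};

/* A connection works only if each side supports what the other requires. */
static bool compatible(const struct peer *a, const struct peer *b)
{
    return (a->required & ~b->supported) == 0 &&
           (b->required & ~a->supported) == 0;
}
```

With a new client that both supports and requires the feature, and a new server that supports it without requiring it, the check passes for old client + new server and fails for new client + old server, matching the commit message.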
* ceph: cleanup: remove unused assignment | Dan Carpenter | 2010-05-17 | 1 | -2/+1
    We don't ever use "dirty" so we can remove it.
    Signed-off-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up cap release loop vs spinlock | Sage Weil | 2010-05-17 | 1 | -4/+3
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: name bdi ceph-%d instead of major:minor | Sage Weil | 2010-05-17 | 1 | -1/+4
    The bdi_setup_and_register() helper doesn't help us since we bdi_init() in create_client() and bdi_register() only when sget() succeeds.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: skip mds sync on forced unmount | Sage Weil | 2010-05-17 | 1 | -0/+3
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: adjust masked struct_v variable names | Sage Weil | 2010-05-17 | 1 | -9/+9
    Reported-by: Bill Pemberton <wfp5p@virginia.edu>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up mount options, ->show_options() | Sage Weil | 2010-05-17 | 2 | -40/+69
    Ensure all options are included in /proc/mounts. Some cleanup.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: set dn offset when spliced | Sage Weil | 2010-05-17 | 2 | -40/+44
    We want to assign an offset when the dentry goes from null to linked, which is always done by splice_dentry(). Notably, we should NOT assign an offset when a dentry is first created and is still null.
    BUG if we try to splice a non-null dentry (we shouldn't).
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: don't clobber i_max_offset on already complete dir | Sage Weil | 2010-05-17 | 1 | -1/+2
    This can screw up offsets assigned to new dentries and break dcache readdir results.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: skip set_dentry_offset work if directory not I_COMPLETE | Sage Weil | 2010-05-17 | 1 | -0/+4
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: set next_offset on readdir finish | Sage Weil | 2010-05-17 | 1 | -1/+1
    Set next_offset to 2 (always 2!), not 0, on readdir finish.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: listxattr should compare version by >= | Henry C Chang | 2010-05-17 | 1 | -1/+1
    If the version hasn't changed, don't rebuild the index.
    Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
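A minimal sketch of the comparison this fix describes, assuming the index records the version of the xattr data it was built from. The struct and field names are hypothetical, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

struct xattr_state {
    uint64_t blob_version;  /* version of the raw xattr data */
    uint64_t index_version; /* version the in-memory index was built from */
};

/* Rebuild only when the index is strictly older than the data. */
static bool index_needs_rebuild(const struct xattr_state *s)
{
    /* Using '>' here would force a rebuild even when the versions match. */
    if (s->index_version >= s->blob_version)
        return false;
    return true;
}
```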
* ceph: fix xattr dangling pointer / double free | Sage Weil | 2010-05-17 | 1 | -0/+1
    If we use the xattr_blob, clear the pointer so we don't release the memory at the bottom of the function.
    Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
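The pattern behind this fix is an ownership handoff: once a buffer is attached to a longer-lived structure, the local pointer is cleared so the shared cleanup path cannot free it again. The following is a userspace sketch with hypothetical names (`inode_info`, `update_xattrs`); it does not reproduce the kernel function.

```c
#include <stdlib.h>
#include <string.h>

struct inode_info {
    char *xattr_blob; /* owned by the inode once attached */
};

/* Returns 0 on success, -1 on allocation failure. */
static int update_xattrs(struct inode_info *ci, const char *data, int attach)
{
    char *blob = malloc(strlen(data) + 1);

    if (!blob)
        return -1;
    strcpy(blob, data);

    if (attach) {
        free(ci->xattr_blob);  /* drop any previous blob */
        ci->xattr_blob = blob; /* ownership moves to the inode */
        blob = NULL;           /* so the cleanup below does not free it */
    }

    free(blob); /* free(NULL) is a no-op */
    return 0;
}
```

Without the `blob = NULL` line, the final `free()` would release memory that the inode still points to, leaving a dangling pointer and a later double free.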
* ceph: close messenger race | Sage Weil | 2010-05-17 | 1 | -7/+7
    Simplify messenger locking, and close race between ceph_con_close() setting the CLOSED bit and con_work() checking the bit, then taking the mutex.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: name msgpools; useful error messages | Sage Weil | 2010-05-17 | 3 | -7/+16
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: fix memory leak due to possible dentry init race | Sage Weil | 2010-05-17 | 1 | -1/+4
    Free dentry_info in error path.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: include auth method in error messages | Sage Weil | 2010-05-17 | 4 | -4/+9
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: osdtimeout=0 for no timeout | Sage Weil | 2010-05-17 | 1 | -1/+1
    Allow the osd reset timeout to be disabled.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: d_obtain_alias() returns ERR_PTR() | Dan Carpenter | 2010-05-17 | 1 | -6/+6
    d_obtain_alias() doesn't return NULL, it returns an ERR_PTR().
    Signed-off-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
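To show why a NULL check misses this kind of failure, here is a userspace imitation of the kernel's error-pointer convention. The lowercase helpers `err_ptr()`, `ptr_err()`, and `is_err()` are local stand-ins written for this sketch (the kernel provides `ERR_PTR()`, `PTR_ERR()`, and `IS_ERR()`), and `lookup()` is a hypothetical function, not d_obtain_alias().

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the top of the pointer range. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup that reports failure as an error pointer, never NULL. */
static void *lookup(int fail)
{
    static int object = 42;

    return fail ? err_ptr(-ENOMEM) : (void *)&object;
}

/* Returns 0 on success or a negative errno value. */
static long checked_lookup(int fail)
{
    void *p = lookup(fail);

    if (is_err(p)) /* a "!p" test here would never fire */
        return ptr_err(p);
    return 0;
}
```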
* ceph: wake up mount thread when getting osdmap | Yehuda Sadeh | 2010-05-17 | 1 | -0/+1
    Now that the mount thread waits for the osdmap, it needs to be woken up.
    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* ceph: remove unused #includes | Huang Weiyi | 2010-05-17 | 1 | -3/+0
    Remove unused #includes in fs/ceph/super.c
    Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: wait for both monmap and osdmap when opening session | Sage Weil | 2010-05-17 | 1 | -5/+6
    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* ceph: clean up connection reset | Sage Weil | 2010-05-17 | 2 | -1/+2
    Reset out_keepalive_pending and peer_global_seq, and drop unused var.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: simplify ceph_msg_new | Sage Weil | 2010-05-17 | 7 | -36/+29
    We only need to pass in front_len. Callers can attach any other payload pieces (middle, data) as they see fit.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: make ceph_msg_new return NULL on failure; clean up, fix callers | Sage Weil | 2010-05-17 | 7 | -80/+48
    Returning ERR_PTR(-ENOMEM) is useless extra work. Return NULL on failure instead, and fix up the callers (about half of which were wrong anyway).
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: rewrite msgpool using mempool_t | Sage Weil | 2010-05-17 | 2 | -151/+29
    Since we don't need to maintain large pools of messages, we can just use the standard mempool_t. We maintain a msgpool 'wrapper' because we need the mempool_t* in the alloc function, and mempool gives us only pool_data.
    Signed-off-by: Sage Weil <sage@newdream.net>
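The wrapper idea can be illustrated in userspace: when an allocation callback receives only an opaque `pool_data` pointer, pointing that at a wrapper struct lets the callback reach both the pool and any extra parameters. Everything below (`generic_pool`, `msgpool`, `msg`, and the helper names) is a hypothetical stand-in for this sketch and does not reproduce the kernel's mempool or msgpool code.

```c
#include <stdlib.h>

struct msgpool; /* forward declaration for the back-pointer in struct msg */

/* Generic pool: the callback sees nothing but the opaque pool_data pointer. */
typedef void *(*alloc_fn)(void *pool_data);

struct generic_pool {
    alloc_fn alloc;
    void *pool_data;
};

struct msg {
    size_t front_len;
    struct msgpool *pool; /* lets the message find its way back to the pool */
};

/* Wrapper carrying what the callback needs, including the pool itself. */
struct msgpool {
    struct generic_pool pool;
    size_t front_len;
};

static void *msgpool_alloc(void *pool_data)
{
    struct msgpool *mp = pool_data; /* pool_data is the wrapper */
    struct msg *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    m->front_len = mp->front_len;
    m->pool = mp;
    return m;
}

static void msgpool_init(struct msgpool *mp, size_t front_len)
{
    mp->pool.alloc = msgpool_alloc;
    mp->pool.pool_data = mp; /* hand the wrapper to the callback */
    mp->front_len = front_len;
}

static struct msg *msgpool_get(struct msgpool *mp)
{
    return mp->pool.alloc(mp->pool.pool_data);
}
```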
* ceph: use ceph_sb_to_client instead of ceph_client | Cheng Renquan | 2010-05-17 | 11 | -33/+30
    ceph_sb_to_client and ceph_client are identical, so we need to drop one. The function name ceph_client is easily confused with "struct ceph_client", while ceph_sb_to_client's name is clearer, so switch all callers to ceph_sb_to_client.
    -static inline struct ceph_client *ceph_client(struct super_block *sb)
    -{
    -	return sb->s_fs_info;
    -}
    Signed-off-by: Cheng Renquan <crquan@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: handle kzalloc() failure | Cheng Renquan | 2010-05-17 | 1 | -0/+4
    Signed-off-by: Cheng Renquan <crquan@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: drop unnecessary msgpool for mon_client subscribe_ack | Sage Weil | 2010-05-17 | 2 | -13/+13
    Preallocate a single message to reuse instead.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: drop unnecessary msgpool for mon_client auth_reply | Sage Weil | 2010-05-17 | 2 | -10/+14
    Preallocate a single reply message that we can reuse instead.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: clean up statfs | Sage Weil | 2010-05-17 | 2 | -57/+97
    Avoid unnecessary msgpool. Preallocate reply. Fix use-after-free race.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: fix theoretically possible double-put on connection | Sage Weil | 2010-05-17 | 1 | -0/+1
    This would only trigger if we bailed out before resetting r_con_filling_msg because the server reply was corrupt (oversized).
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: cleanup: remove dead code | Dan Carpenter | 2010-05-17 | 1 | -6/+0
    "xattr" is never NULL here. We took care of that in the previous if statement block.
    Signed-off-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: reduce build_path debug output | Sage Weil | 2010-05-17 | 1 | -6/+4
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: use __page_cache_alloc and add_to_page_cache_lru | Yehuda Sadeh | 2010-05-17 | 5 | -11/+6
    Following Nick Piggin's patches in btrfs, pagecache pages should be allocated with __page_cache_alloc, so they obey pagecache memory policies.
    Also, use add_to_page_cache_lru instead of a private pagevec where applicable.
    Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: update for removal of kref_set | Stephen Rothwell | 2010-05-17 | 1 | -1/+1
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: simplify page setup for incoming data | Sage Weil | 2010-05-17 | 1 | -44/+12
    Drop largely useless helper __prepare_pages(), and simplify sanity checks.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: invalidate affected dentry leases on aborted requests | Sage Weil | 2010-05-17 | 4 | -5/+29
    If we abort a request, we return to caller, but the request may still complete. And if we hold the dir FILE_EXCL bit, we may not release a lease when sending a request. A simple un-tar, control-c, un-tar again will reproduce the bug (manifested as a 'Cannot open: File exists').
    Ensure we invalidate affected dentry leases (as well as dir I_COMPLETE) so we don't have valid (but incorrect) leases. Do the same, consistently, at other sites where I_COMPLETE is similarly cleared.
    Signed-off-by: Sage Weil <sage@newdream.net>
* ceph: fix race between aborted requests and fill_trace | Sage Weil | 2010-05-17 | 2 | -0/+13
    When we abort requests we need to prevent fill_trace et al from doing anything that relies on locks held by the VFS caller. This fixes a race between the reply handler and the abort code, ensuring that we continue holding the dir mutex until the reply handler completes.
    Signed-off-by: Sage Weil <sage@newdream.net>