linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	rbd: whitelist RBD_FEATURE_OPERATIONS feature bit	Ilya Dryomov	2018-01-29	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	This feature bit restricts older clients from performing certain maintenance operations against an image (e.g. clone, snap create). krbd does not perform maintenance operations. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
*	rbd: don't NULL out ->obj_request in rbd_img_obj_parent_read_full()	Ilya Dryomov	2018-01-29	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If rbd_img_request_submit() fails, parent_request->obj_request is NULLed out, triggering an assert in rbd_obj_request_put(): rbd_img_request_put(parent_request) rbd_parent_request_destroy rbd_obj_request_put(NULL) Just remove it -- parent_request->obj_request will be put in rbd_parent_request_destroy(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: use kmem_cache_zalloc() in rbd_img_request_create()	Ilya Dryomov	2018-01-29	1	-7/+2
\| \| \| \|	Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: obj_request->completion is unused	Ilya Dryomov	2018-01-29	1	-6/+1
\| \| \| \|	Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: set max_segments to USHRT_MAX	Ilya Dryomov	2018-01-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit d3834fefcfe5 ("rbd: bump queue_max_segments") bumped max_segments (unsigned short) to max_hw_sectors (unsigned int). max_hw_sectors is set to the number of 512-byte sectors in an object and overflows unsigned short for 32M (largest possible) objects, making the block layer resort to handing us single segment (i.e. single page or even smaller) bios in that case. Cc: stable@vger.kernel.org Fixes: d3834fefcfe5 ("rbd: bump queue_max_segments") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
*	rbd: reacquire lock should update lock owner client id	Florian Margaine	2018-01-09	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Otherwise, future operations on this RBD using exclusive-lock are going to require the lock from a non-existent client id. Cc: stable@vger.kernel.org Fixes: 14bb211d324d ("rbd: support updating the lock cookie without releasing the lock") Link: http://tracker.ceph.com/issues/19929 Signed-off-by: Florian Margaine <florian@platform.sh> [idryomov@gmail.com: rbd_set_owner_cid() call, __rbd_lock() helper] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: default to single-major device number scheme	Ilya Dryomov	2017-11-13	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's been 3.5 years, let's turn it on by default. Support in rbd(8) utility goes back to pre-firefly, "rbd map" has been loading the module with single_major=Y ever since. However, if the module is already loaded (whether by hand or at boot time), we end up with single_major=N. Also, some people don't install rbd(8) and use the sysfs interface directly. (With single-major=N, a major number is consumed for every mapping, imposing a limit of ~240 rbd images per host. single-major=Y allows mapping thousands of rbd images on a single machine.) Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
*	rbd: set discard_alignment to zero	David Disseldorp	2017-11-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RBD devices are currently incorrectly initialised with the block queue discard_alignment set to the underlying RADOS object size. As per Documentation/ABI/testing/sysfs-block: The discard_alignment parameter indicates how many bytes the beginning of the device is offset from the internal allocation unit's natural alignment. Correcting the discard_alignment parameter from the RADOS object size to zero (the blk_set_default_limits() default) has no effect on how discard requests are propagated through the block layer - @alignment in __blkdev_issue_discard() remains zero. However, it does fix the UNMAP granularity alignment value advertised to SCSI initiators via the Block Limits VPD. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: get rid of rbd_mapping::read_only	Ilya Dryomov	2017-11-13	1	-19/+4
\| \| \| \| \| \| \|	It is redundant -- rw/ro state is stored in hd_struct and managed by the block layer. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: fix and simplify rbd_ioctl_set_ro()	Ilya Dryomov	2017-11-13	1	-28/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	->open_count/-EBUSY check is bogus and wrong: when an open device is set read-only, blkdev_write_iter() refuses further writes with -EPERM. This is standard behaviour and all other block devices allow this. set_disk_ro() call is also problematic: we affect the entire device when called on a single partition. All rbd_ioctl_set_ro() needs to do is refuse ro -> rw transition for mapped snapshots. Everything else can be handled by generic code. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: use GFP_NOIO for parent stat and data requests	Ilya Dryomov	2017-11-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	rbd_img_obj_exists_submit() and rbd_img_obj_parent_read_full() are on the writeback path for cloned images -- we attempt a stat on the parent object to see if it exists and potentially read it in to call copyup. GFP_NOIO should be used instead of GFP_KERNEL here. Cc: stable@vger.kernel.org Link: http://tracker.ceph.com/issues/22014 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: David Disseldorp <ddiss@suse.de>
*	rbd: silence bogus uninitialized use warning in rbd_acquire_lock()	Kefeng Wang	2017-09-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	drivers/block/rbd.c: In function 'rbd_acquire_lock': drivers/block/rbd.c:3602:44: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized] Silence the warning, found it when built old kernel(3.10) with OBS(opensuse build service). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
*	rbd: use bio_clone_fast() instead of bio_clone()	NeilBrown	2017-06-18	1	-1/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bio_clone() makes a copy of the bi_io_vec, but rbd never changes that, so there is no need for a copy. bio_clone_fast() can be used instead, which avoids making the copy. This requires that we provide a bio_set. bio_clone() uses fs_bio_set, but it isn't, in general, safe to use the same bio_set at different levels of the stack, as that can lead to deadlocks. As filesystems use fs_bio_set, block devices shouldn't. As rbd never stacks, it is safe to have a single global bio_set for all rbd devices to use. So allocate that when the module is initialised, and use it with bio_clone_fast(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
*	Merge tag 'v4.12-rc5' into for-4.13/block	Jens Axboe	2017-06-12	1	-0/+2
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| *	rbd: implement REQ_OP_WRITE_ZEROES	Ilya Dryomov	2017-05-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 93c1defedcae ("rbd: remove the discard_zeroes_data flag") explicitly didn't implement REQ_OP_WRITE_ZEROES for rbd, while the following commit 48920ff2a5a9 ("block: remove the discard_zeroes_data flag") dropped ->discard_zeroes_data in favor of REQ_OP_WRITE_ZEROES. rbd does support efficient zeroing via CEPH_OSD_OP_ZERO opcode and will release either some or all blocks depending on whether the zeroing request is rbd_obj_bytes() aligned. This is how we currently implement discards, so REQ_OP_WRITE_ZEROES can be identical to REQ_OP_DISCARD for now. Caveats: - REQ_NOUNMAP is ignored, but AFAICT that's true of at least two other current implementations - nvme and loop - there is no ->write_zeroes_alignment and blk_bio_write_zeroes_split() is hence less helpful than blk_bio_discard_split(), but this can (and should) be fixed on the rbd side In the future we will split these into two code paths to respect REQ_NOUNMAP on zeroout and save on zeroing blocks that couldn't be released on discard. Fixes: 93c1defedcae ("rbd: remove the discard_zeroes_data flag") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* \|	blk-mq: switch ->queue_rq return value to blk_status_t	Christoph Hellwig	2017-06-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the same values for use for request completion errors as the return value from ->queue_rq. BLK_STS_RESOURCE is special cased to cause a requeue, and all the others are completed as-is. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* \|	block: introduce new block status code type	Christoph Hellwig	2017-06-09	1	-3/+5
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
*	Merge tag 'ceph-for-4.12-rc1' of git://github.com/ceph/ceph-client	Linus Torvalds	2017-05-10	1	-144/+215
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull ceph updates from Ilya Dryomov: "The two main items are support for disabling automatic rbd exclusive lock transfers from myself and the long awaited -ENOSPC handling series from Jeff. The former will allow rbd users to take advantage of exclusive lock's built-in blacklist/break-lock functionality while staying in control of who owns the lock. With the latter in place, we will abort filesystem writes on -ENOSPC instead of having them block indefinitely. Beyond that we've got the usual pile of filesystem fixes from Zheng, some refcount_t conversion patches from Elena and a patch for an ancient open() flags handling bug from Alexander" * tag 'ceph-for-4.12-rc1' of git://github.com/ceph/ceph-client: (31 commits) ceph: fix memory leak in __ceph_setxattr() ceph: fix file open flags on ppc64 ceph: choose readdir frag based on previous readdir reply rbd: exclusive map option rbd: return ResponseMessage result from rbd_handle_request_lock() rbd: kill rbd_is_lock_supported() rbd: support updating the lock cookie without releasing the lock rbd: store lock cookie rbd: ignore unlock errors rbd: fix error handling around rbd_init_disk() rbd: move rbd_unregister_watch() call into rbd_dev_image_release() rbd: move rbd_dev_destroy() call out of rbd_dev_image_release() ceph: when seeing write errors on an inode, switch to sync writes Revert "ceph: SetPageError() for writeback pages if writepages fails" ceph: handle epoch barriers in cap messages libceph: add an epoch_barrier field to struct ceph_osd_client libceph: abort already submitted but abortable requests when map or pool goes full libceph: allow requests to return immediately on full conditions if caller wishes libceph: remove req->r_replay_version ceph: make seeky readdir more efficient ...
\| *	rbd: exclusive map option	Ilya Dryomov	2017-05-04	1	-10/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support disabling automatic exclusive lock transfers to allow users to be in charge of which node should own the lock while being able to reuse exclusive lock's built-in blacklist/break-lock functionality. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: return ResponseMessage result from rbd_handle_request_lock()	Ilya Dryomov	2017-05-04	1	-14/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now it's just 0, but "no automatic exclusive lock transfers" mode code will need -EROFS. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: kill rbd_is_lock_supported()	Ilya Dryomov	2017-05-04	1	-11/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the exclusive lock is acquired only if the mapping is writable, i.e. an image HEAD mapped in rw mode. This means that we don't acquire the lock for executing a read from a snapshot or an image HEAD mapped in ro mode, even if lock_on_read is set. This is somewhat weird and inconsistent with "no automatic exclusive lock transfers" mode, where the lock is acquired unconditionally. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: support updating the lock cookie without releasing the lock	Ilya Dryomov	2017-05-04	1	-25/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As we no longer release the lock before potentially raising BLACKLISTED in rbd_reregister_watch(), the "either locked or blacklisted" assert in rbd_queue_workfn() needs to go: we can be both locked and blacklisted at that point now. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: store lock cookie	Ilya Dryomov	2017-05-04	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for supporting set_cookie method (or rather set_cookie fallback for older OSDs), store the lock cookie on lock and use it on unlock instead of recalculating from rbd_dev->watch_cookie. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: ignore unlock errors	Ilya Dryomov	2017-05-04	1	-18/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the lock_state is set to UNLOCKED (preventing further I/O), but RELEASED_LOCK notification isn't sent. Be consistent with userspace and treat ceph_cls_unlock() errors as the image is unlocked. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: fix error handling around rbd_init_disk()	Ilya Dryomov	2017-05-04	1	-43/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	add_disk() takes an extra reference on disk->queue, which is put in put_disk() -> disk_release(). Avoiding blk_cleanup_queue() (which also puts the queue) until add_disk() sets GENHD_FL_UP works for the queue itself, but leaks various queue internals. Conditioning tag_set freeing on GENHD_FL_UP is wrong too: all error paths after rbd_init_disk() leak the tag_set. Move the final "announce" steps out of rbd_dev_device_setup() so that it can be unwound like any other function. Leave "announce" steps to do_rbd_add/remove(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: move rbd_unregister_watch() call into rbd_dev_image_release()	Ilya Dryomov	2017-05-04	1	-15/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_dev->disk tear down vs rbd_watch_cb() race shouldn't be a problem anymore thanks to EXISTS and REMOVING checks in rbd_dev_update_size(). A similar race could occur on "rbd map", see commit 811c66887746 ("rbd: fix rbd map vs notify races"). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: move rbd_dev_destroy() call out of rbd_dev_image_release()	Ilya Dryomov	2017-05-04	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	... to simplify error handling in do_rbd_add(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	libceph, ceph: always advertise all supported features	Ilya Dryomov	2017-05-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	No reason to hide CephFS-specific features in the rbd case. Recent feature bits mix RADOS and CephFS-specific stuff together anyway. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* \|	fs: ceph: CURRENT_TIME with ktime_get_real_ts()	Deepa Dinamani	2017-05-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CURRENT_TIME is not y2038 safe. The macro will be deleted and all the references to it will be replaced by ktime_get_* apis. struct timespec is also not y2038 safe. Retain timespec for timestamp representation here as ceph uses it internally everywhere. These references will be changed to use struct timespec64 in a separate patch. The current_fs_time() api is being changed to use vfs struct inode* as an argument instead of struct super_block*. Set the new mds client request r_stamp field using ktime_get_real_ts() instead of using current_fs_time(). Also, since r_stamp is used as mtime on the server, use timespec_trunc() to truncate the timestamp, using the right granularity from the superblock. This api will be transitioned to be y2038 safe along with vfs. Link: http://lkml.kernel.org/r/1491613030-11599-5-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> M: Ilya Dryomov <idryomov@gmail.com> M: "Yan, Zheng" <zyan@redhat.com> M: Sage Weil <sage@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	blk-mq: update ->init_request and ->exit_request prototypes	Christoph Hellwig	2017-05-02	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the request_idx parameter, which can't be used safely now that we support I/O schedulers with blk-mq. Except for a superflous check in mtip32xx it was unused anyway. Also pass the tag_set instead of just the driver data - this allows drivers to avoid some code duplication in a follow on cleanup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* \|	rbd: remove the discard_zeroes_data flag	Christoph Hellwig	2017-04-08	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd only supports discarding on large alignments, so the zeroing code would always fall back to explicit writings of zeroes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* \|	blk-mq: constify struct blk_mq_ops	Eric Biggers	2017-03-31	1	-1/+1
\|/ \| \| \| \| \| \|	Constify all instances of blk_mq_ops, as they are never modified. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
*	rbd: supported_features bus attribute	Ilya Dryomov	2017-03-07	1	-4/+12
\| \| \| \| \| \| \| \|	... so that userspace can generate meaningful error messages and spell out unsupported features that need to be disabled. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
*	Merge tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client	Linus Torvalds	2017-02-28	1	-374/+227
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull ceph updates from Ilya Dryomov: "This time around we have: - support for rbd data-pool feature, which enables rbd images on erasure-coded pools (myself). CEPH_PG_MAX_SIZE has been bumped to allow erasure-coded profiles with k+m up to 32. - a patch for ceph_d_revalidate() performance regression introduced in 4.9, along with some cleanups in the area (Jeff Layton) - a set of fixes for unsafe ->d_parent accesses in CephFS (Jeff Layton) - buffered reads are now processed in rsize windows instead of rasize windows (Andreas Gerstmayr). The new default for rsize mount option is 64M. - ack vs commit distinction is gone, greatly simplifying ->fsync() and MOSDOpReply handling code (myself) ... also a few filesystem bug fixes from Zheng, a CRUSH sync up (CRUSH computations are still serialized though) and several minor fixes and cleanups all over" * tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client: (52 commits) libceph, rbd, ceph: WRITE \| ONDISK -> WRITE libceph: get rid of ack vs commit ceph: remove special ack vs commit behavior ceph: tidy some white space in get_nonsnap_parent() crush: fix dprintk compilation crush: do is_out test only if we do not collide ceph: remove req from unsafe list when unregistering it rbd: constify device_type structure rbd: kill obj_request->object_name and rbd_segment_name_cache rbd: store and use obj_request->object_no rbd: RBD_V{1,2}_DATA_FORMAT macros rbd: factor out __rbd_osd_req_create() rbd: set offset and length outside of rbd_obj_request_create() rbd: support for data-pool feature rbd: introduce rbd_init_layout() rbd: use rbd_obj_bytes() more rbd: remove now unused rbd_obj_request_wait() and helpers rbd: switch rbd_obj_method_sync() to ceph_osdc_call() libceph: pass reply buffer length through ceph_osdc_call() rbd: do away with obj_request in rbd_obj_read_sync() ...
\| *	libceph, rbd, ceph: WRITE \| ONDISK -> WRITE	Ilya Dryomov	2017-02-24	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CEPH_OSD_FLAG_ONDISK is set in account_request(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
\| *	rbd: constify device_type structure	Bhumika Goyal	2017-02-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Declare device_type structure as const as it is only stored in the type field of a device structure. This field is of type const, so add const to the declaration of device_type structure. File size before: text data bss dec hex filename 61546 11610 208 73364 11e94 drivers/block/rbd.o File size after: text data bss dec hex filename 61610 11578 208 73396 11eb4 drivers/block/rbd.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
\| *	rbd: kill obj_request->object_name and rbd_segment_name_cache	Ilya Dryomov	2017-02-20	1	-72/+7
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: store and use obj_request->object_no	Ilya Dryomov	2017-02-20	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	object_no can be trivially formatted into an object name. We already store object names in OSD requests with special care to avoid dynamic allocations for short names. Storing a name in obj_request, obtained as below (!), is a waste and will be removed in the next commit. name = kmem_cache_alloc(rbd_segment_name_cache, ...); snprintf(name, ...); obj_request->object_name = kstrdup(name); kmem_cache_free(rbd_segment_name_cache, name); ... ceph_oid_aprintf(..., "%s", obj_request->object_name); Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: RBD_V{1,2}_DATA_FORMAT macros	Ilya Dryomov	2017-02-20	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	... and also fix up the comment -- format 1 data objects have always been 12 hex digits long. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: factor out __rbd_osd_req_create()	Ilya Dryomov	2017-02-20	1	-63/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Factor OSD request allocation and initialization code out into __rbd_osd_req_create(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: set offset and length outside of rbd_obj_request_create()	Ilya Dryomov	2017-02-20	1	-15/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The allocation doesn't depend on offset and length. Both offset and length can be changed after obj_request is allocated, too. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: support for data-pool feature	Ilya Dryomov	2017-02-20	1	-2/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for RBD_FEATURE_DATA_POOL feature. rbd_dev->layout.pool_id now stores the data pool id. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: introduce rbd_init_layout()	Ilya Dryomov	2017-02-20	1	-7/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than initializing layout fields with some made up values in __rbd_dev_create(), move the initialization into rbd_init_layout() and call it after the header is actually populated. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: use rbd_obj_bytes() more	Ilya Dryomov	2017-02-20	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Returning u64 doesn't make sense: max header->obj_order is 25 and ceph_file_layout::object_size is u32. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: remove now unused rbd_obj_request_wait() and helpers	Ilya Dryomov	2017-02-20	1	-38/+0
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: switch rbd_obj_method_sync() to ceph_osdc_call()	Ilya Dryomov	2017-02-20	1	-95/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As explained in the previous commit, rbd_obj_request machinery (and rbd_osd_req_create() in particular) shouldn't be used for working with metadata objects. Switch to the recently added ceph_osdc_call(). It assumes single pages for outbound and inbound buffers, but that's OK - none of the callers need more than that. These pages need to be allocated (messenger is in dire need of proper iterator interface!), but we are swapping for pages[] and pagelist allocations in the existing code. Kill class_name argument - all rbd methods are under "rbd". Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: do away with obj_request in rbd_obj_read_sync()	Ilya Dryomov	2017-02-20	1	-49/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_obj_request machinery is completely unnecessary here; all that's being done is fetching a metadata object - no striping, cloning, etc. More importantly, rbd_osd_req_create() grabs pool id from layout and that is becoming a data pool id. Kill offset argument - all metadata objects are small and read in full. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: initialize rbd_dev->header_oloc early	Ilya Dryomov	2017-02-20	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No reason to delay it until image_id is known. This will be required by some rbd_obj_method_sync() callers, after rbd_obj_method_sync() is changed to take oloc. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: kill rbd_image_header::{crypt_type,comp_type}	Ilya Dryomov	2017-02-20	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Image format 1 is deprecated and format 2 doesn't have these. Also, __rbd_dev_create() takes care of zeroing (or otherwise initializing) format 2 specific fields. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
\| *	rbd: use kstrndup() in rbd_header_from_disk()	Ilya Dryomov	2017-02-20	1	-7/+3
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>