| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fixes from Andreas Gruenbacher:
"Two more gfs2 fixes"
* tag 'gfs2-v5.12-rc2-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: report "already frozen/thawed" errors
gfs2: Flag a withdraw if init_threads() fails
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Before this patch, gfs2's freeze function failed to report an error
when the target file system was already frozen as it should (and as
generic vfs function freeze_super does. Similarly, gfs2's thaw function
failed to report an error when trying to thaw a file system that is not
frozen, as vfs function thaw_super does. The errors were checked, but
it always returned a 0 return code.
This patch adds the missing error return codes to gfs2 freeze and thaw.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Interrupting mount with ^C quickly enough can cause the kthread_run()
calls in gfs2's init_threads() to fail and the error path leads to a
deadlock on the s_umount rwsem. The abridged chain of events is:
[mount path]
get_tree_bdev()
sget_fc()
alloc_super()
down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); [acquired]
gfs2_fill_super()
gfs2_make_fs_rw()
init_threads()
kthread_run()
( Interrupted )
[Error path]
gfs2_gl_hash_clear()
flush_workqueue(glock_workqueue)
wait_for_completion()
[workqueue context]
glock_work_func()
run_queue()
do_xmote()
freeze_go_sync()
freeze_super()
down_write(&sb->s_umount) [deadlock]
In freeze_go_sync() there is a gfs2_withdrawn() check that we can use to
make sure freeze_super() is not called in the error path, so add a
gfs2_withdraw_delayed() call when init_threads() fails.
Ref: https://bugzilla.kernel.org/show_bug.cgi?id=212231
Reported-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull block fixes from Jens Axboe:
"Mostly just random fixes all over the map.
The only odd-one-out change is finally getting the rename of
BIO_MAX_PAGES to BIO_MAX_VECS done. This should've been done with the
multipage bvec change, but it's been left.
Do it now to avoid hassles around changes piling up for the next merge
window.
Summary:
- NVMe pull request:
- one more quirk (Dmitry Monakhov)
- fix max_zone_append_sectors initialization (Chaitanya Kulkarni)
- nvme-fc reset/create race fix (James Smart)
- fix status code on aborts/resets (Hannes Reinecke)
- fix the CSS check for ZNS namespaces (Chaitanya Kulkarni)
- fix a use after free in a debug printk in nvme-rdma (Lv Yunlong)
- Follow-up NVMe error fix for NULL 'id' (Christoph)
- Fixup for the bd_size_lock being IRQ safe, now that the offending
driver has been dropped (Damien).
- rsxx probe failure error return (Jia-Ju)
- umem probe failure error return (Wei)
- s390/dasd unbind fixes (Stefan)
- blk-cgroup stats summing fix (Xunlei)
- zone reset handling fix (Damien)
- Rename BIO_MAX_PAGES to BIO_MAX_VECS (Christoph)
- Suppress uevent trigger for hidden devices (Daniel)
- Fix handling of discard on busy device (Jan)
- Fix stale cache issue with zone reset (Shin'ichiro)"
* tag 'block-5.12-2021-03-12-v2' of git://git.kernel.dk/linux-block:
nvme: fix the nsid value to print in nvme_validate_or_alloc_ns
block: Discard page cache of zone reset target range
block: Suppress uevent for hidden device when removed
block: rename BIO_MAX_PAGES to BIO_MAX_VECS
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a
nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done
nvme-core: check ctrl css before setting up zns
nvme-fc: fix racing controller reset and create association
nvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted
nvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()
nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()
nvme: simplify error logic in nvme_validate_ns()
nvme: set max_zone_append_sectors nvme_revalidate_zones
block: rsxx: fix error return code of rsxx_pci_probe()
block: Fix REQ_OP_ZONE_RESET_ALL handling
umem: fix error return code in mm_pci_probe()
blk-cgroup: Fix the recursive blkg rwstat
s390/dasd: fix hanging IO request during DASD driver unbind
s390/dasd: fix hanging DASD driver unbind
block: Try to handle busy underlying device on discard
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been
horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop
confusing users of the bio API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Patch fe3e397668775 ("gfs2: Rework the log space allocation logic")
changed gfs2_log_flush to reserve a set of journal blocks in case no
transaction is active. However, gfs2_log_flush also gets called in
cases where we don't have an active journal, for example, for spectator
mounts. In that case, trying to reserve blocks would sleep forever, but
we want gfs2_log_flush to be a no-op instead.
Fixes: fe3e397668775 ("gfs2: Rework the log space allocation logic")
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Before this patch, function signal_our_withdraw referenced the journal
inode immediately. But corrupt file systems may have some invalid
journals, in which case our attempt to read it in will withdraw and the
resulting signal_our_withdraw would dereference the NULL value.
This patch adds a check to signal_our_withdraw so that if the journal
has not yet been initialized, it simply returns and does the old-style
withdraw.
Thanks, Andy Price, for his analysis.
Reported-by: syzbot+50a8a9cf8127f2c6f5df@syzkaller.appspotmail.com
Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish")
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch adds code to function trans_drain to remove drained
bd elements from the ail lists, if queued, before freeing the bd.
If we don't remove the bd from the ail, function ail_drain will
try to reference the bd after it has been freed by trans_drain.
Thanks to Andy Price for his analysis of the problem.
Reported-by: Andy Price <anprice@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
| |
It fixes the following warning detected by coccinelle:
./fs/gfs2/super.c:592:5-10: Unneeded variable: "error". Return "0" on
line 628
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Assorted stuff pile - no common topic here"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
whack-a-mole: don't open-code iminor/imajor
9p: fix misuse of sscanf() in v9fs_stat2inode()
audit_alloc_mark(): don't open-code ERR_CAST()
fs/inode.c: make inode_init_always() initialize i_ino to 0
vfs: don't unnecessarily clone write access for writable fds
|
| |
| |
| |
| |
| |
| | |
several instances creeped back into the tree...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Log space and revoke accounting rework to fix some failed asserts.
- Local resource group glock sharing for better local performance.
- Add support for version 1802 filesystems: trusted xattr support and
'-o rgrplvb' mounts by default.
- Actually synchronize on the inode glock's FREEING bit during withdraw
("gfs2: fix glock confusion in function signal_our_withdraw").
- Fix parallel recovery of multiple journals ("gfs2: keep bios separate
for each journal").
- Various other bug fixes.
* tag 'gfs2-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (49 commits)
gfs2: Don't get stuck with I/O plugged in gfs2_ail1_flush
gfs2: Per-revoke accounting in transactions
gfs2: Rework the log space allocation logic
gfs2: Minor calc_reserved cleanup
gfs2: Use resource group glock sharing
gfs2: Allow node-wide exclusive glock sharing
gfs2: Add local resource group locking
gfs2: Add per-reservation reserved block accounting
gfs2: Rename rs_{free -> requested} and rd_{reserved -> requested}
gfs2: Check for active reservation in gfs2_release
gfs2: Don't search for unreserved space twice
gfs2: Only pass reservation down to gfs2_rbm_find
gfs2: Also reflect single-block allocations in rgd->rd_extfail_pt
gfs2: Recursive gfs2_quota_hold in gfs2_iomap_end
gfs2: Add trusted xattr support
gfs2: Enable rgrplvb for sb_fs_format 1802
gfs2: Don't skip dlm unlock if glock has an lvb
gfs2: Lock imbalance on error path in gfs2_recover_one
gfs2: Move function gfs2_ail_empty_tr
gfs2: Get rid of current_tail()
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In gfs2_ail1_flush, we're using I/O plugging to give the block layer a
better chance of merging I/O requests. If we're too aggressive here, we
can end up waiting on I/O to complete while still plugged. Fix that in
a way similar to writeback_sb_inodes, except that we can't use
blk_flush_plug because blk_flush_plug_list is not exported.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | \ | |
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git
Merge the resource group glock sharing feature and the revoke accounting rework.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In the log, revokes are stored as a revoke descriptor (struct
gfs2_log_descriptor), followed by zero or more additional revoke blocks
(struct gfs2_meta_header). On filesystems with a blocksize of 4k, the
revoke descriptor contains up to 503 revokes, and the metadata blocks
contain up to 509 revokes each. We've so far been reserving space for
revokes in transactions in block granularity, so a lot more space than
necessary was being allocated and then released again.
This patch switches to assigning revokes to transactions individually
instead. Initially, space for the revoke descriptor is reserved and
handed out to transactions. When more revokes than that are reserved,
additional revoke blocks are added. When the log is flushed, the space
for the additional revoke blocks is released, but we keep the space for
the revoke descriptor block allocated.
Transactions may still reserve more revokes than they will actually need
in the end, but now we won't overshoot the target as much, and by only
returning the space for excess revokes at log flush time, we further
reduce the amount of contention between processes.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The current log space allocation logic is hard to understand or extend.
The principle it that when the log is flushed, we may or may not have a
transaction active that has space allocated in the log. To deal with
that, we set aside a magical number of blocks to be used in case we
don't have an active transaction. It isn't clear that the pool will
always be big enough. In addition, we can't return unused log space at
the end of a transaction, so the number of blocks allocated must exactly
match the number of blocks used.
Simplify this as follows:
* When transactions are allocated or merged, always reserve enough
blocks to flush the transaction (err on the safe side).
* In gfs2_log_flush, return any allocated blocks that haven't been used.
* Maintain a pool of spare blocks big enough to do one log flush, as
before.
* In gfs2_log_flush, when we have no active transaction, allocate a
suitable number of blocks. For that, use the spare pool when
called from logd, and leave the pool alone otherwise. This means
that when the log is almost full, logd will still be able to do one
more log flush, which will result in more log space becoming
available.
This will make the log space allocator code easier to work with in
the future.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
No functional change.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Move this function further up in log.c so that we can use it in the next patch.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Keep the current value of the updated log tail in the super block as
sb_log_flush_tail instead of computing it on the fly. This avoids
unnecessary sd_ail_lock taking and cleans up the code.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Use a tighter bound for the number of blocks required by transactions in
gfs2_trans_begin: in the worst case, we'll have mixed data and metadata,
so we'll need a log desciptor for each type.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Wake up log waiters in gfs2_log_release when log space has actually become
available. This is a much better place for the wakeup than gfs2_logd.
Check if enough log space is immeditely available before anything else. If
there isn't, use io_wait_event to wait instead of open-coding it.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit 588bff95c94e added gfs2_write_log_header() and started using it in
clean_journal(), with an additional call to log_flush_wait() at the end of
gfs2_write_log_header() which is unnecessary for clean_journal(). Move
that call out of gfs2_write_log_header() to restore the previous behavior.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Move the read locking of sd_log_flush_lock from gfs2_log_reserve to
gfs2_trans_begin, and its unlocking from gfs2_log_release to
gfs2_trans_end. Use gfs2_log_release in two places in which it was open
coded before.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This counter and the associated wait queue are only used so that
gfs2_make_fs_ro can efficiently wait for all pending log space
allocations to fail after setting the filesystem to read-only. This
comes at the cost of waking up that wait queue very frequently.
Instead, when gfs2_log_reserve fails because the filesystem has become
read-only, Wake up sd_log_waitq. In gfs2_make_fs_ro, set the file
system read-only and then wait until all the log space has been
released. Give up and report the problem after a while. With that,
sd_reserving_log and sd_reserving_log_wait can be removed.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Replace the TR_ALLOCED flag by its inverse, TR_ONSTACK: that way, the flag only
needs to be set in the exceptional case of on-stack transactions. Split off
__gfs2_trans_begin from gfs2_trans_begin and use it to replace the open-coded
version in gfs2_ail_empty_gl.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit 2e60d7683c8d ("GFS2: update freeze code to use freeze/thaw_super
on all nodes") optimized away the sb_start_intwrite ... sb_end_intwrite
protection for the on-stack transactions in gfs2_ail_empty_gl with no
explanation. I can't think of a valid reason for doing that, so revert
that change. This simplifies the next commit.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Clean up the logic in ail2_empty (no functional change).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Function gfs2_write_revokes doesn't actually write any revokes; instead, it
adds revokes to the system transaction during a flush.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Split the assert in gfs2_trans_end into two parts.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The calc_reserved description claims that buf_limit is 502 (on 4k
filesystems), but it is actually 503. Fix / clarify the entire
description.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Clean up the computations in gfs2_write_revokes (no change in functionality).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The BUF_OFFSET and DATABUF_OFFSET definitions are only used in buf_limit
and databuf_limit, respectively, and the rounding done in those
definitions is immediately wiped out by dividing by the element size.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |/
| | | |
| | | |
| | | |
| | | |
| | | | |
Clean up this function to show that it is trivial.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch takes advantage of the new glock holder sharing feature for
resource groups. We have already introduced local resource group
locking in a previous patch, so competing accesses of local processes
are already under control.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Introduce a new LM_FLAG_NODE_SCOPE glock holder flag: when taking a
glock in LM_ST_EXCLUSIVE (EX) mode and with the LM_FLAG_NODE_SCOPE flag
set, the exclusive lock is shared among all local processes who are
holding the glock in EX mode and have the LM_FLAG_NODE_SCOPE flag set.
From the point of view of other nodes, the lock is still held
exclusively.
A future patch will start using this flag to improve performance with
rgrp sharing.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Prepare for treating resource group glocks as exclusive among nodes but
shared among all tasks running on a node: introduce another layer of
node-specific locking that the local tasks can use to coordinate their
accesses.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add a rs_reserved field to struct gfs2_blkreserv to keep track of the number of
blocks reserved by this particular reservation, and a rd_reserved field to
struct gfs2_rgrpd to keep track of the total number of reserved blocks in the
resource group. Those blocks are exclusively reserved, as opposed to the
rs_requested / rd_requested blocks which are tracked in the reservation tree
(rd_rstree) and which can be stolen if necessary.
When making a reservation with gfs2_inplace_reserve, rs_reserved is set to
somewhere between ap->min_target and ap->target depending on the number of free
blocks in the resource group. When allocating blocks with gfs2_alloc_blocks,
rs_reserved is decremented accordingly. Eventually, any reserved but not
consumed blocks are returned to the resource group by gfs2_inplace_release.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We keep track of what we've so far been referring to as reservations in
rd_rstree: the nodes in that tree indicate where in a resource group we'd
like to allocate the next couple of blocks for a particular inode. Local
processes take those as hints, but they may still "steal" blocks from those
extents, so when actually allocating a block, we must double check in the
bitmap whether that block is actually still free. Likewise, other cluster
nodes may "steal" such blocks as well.
One of the following patches introduces resource group glock sharing, i.e.,
sharing of an exclusively locked resource group glock among local processes to
speed up allocations. To make that work, we'll need to keep track of how many
blocks we've actually reserved for each inode, so we end up with two different
kinds of reservations.
Distinguish these two kinds by referring to blocks which are reserved but may
still be "stolen" as "requested". This rename also makes it more obvious that
rs_requested and rd_requested are strongly related.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In gfs2_release, check if the inode has an active reservation to avoid
unnecessary lock taking.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
If gfs2_inplace_reserve has chosen a resource group but it couldn't make a
reservation there, there are too many other reservations in that resource
group. In that case, don't even try to respect existing reservations in
gfs2_alloc_blocks.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Only pass the current reservation down to gfs2_rbm_find rather than the entire
inode; we don't need any of the other information.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pass a non-NULL minext to gfs2_rbm_find even for single-block allocations. In
gfs2_rbm_find, also set rgd->rd_extfail_pt when a single-block allocation
fails in a resource group: there is no reason for treating that case
differently. In gfs2_reservation_check_and_update, only check how many free
blocks we have if more than one block is requested; we already know there's at
least one free block.
In addition, when allocating N blocks fails in gfs2_rbm_find, we need to set
rd_extfail_pt to N - 1 rather than N: rd_extfail_pt defines the biggest
allocation that might still succeed.
Finally, reset rd_extfail_pt when updating the resource group statistics in
update_rgrp_lvb, as we already do in gfs2_rgrp_bh_get.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When reading a resource group from disk or when receiving the resource group
statistics from a Lock Value Block (LVB), set/clear the GBF_FULL flags of all
bitmaps in that resource group according to whether or not the resource group
is full.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Removing a reservation doesn't make any actual space available, so don't clear
the GBF_FULL flags in that case. Otherwise, we'll only spend more time
scanning the bitmaps unnecessarily.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This reverts commit e79e0e1428188b24c3b57309ffa54a33c4ae40c4.
It turns out that we're only setting the GBF_FULL flag of a bitmap if we've
been scanning from the beginning of the bitmap until the end and we haven't
found a single free block, and we're not skipping reservations in that process,
either. This means that in gfs2_rbm_find, we can always skip bitmaps with the
GBF_FULL flag set.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Clean up the reservation size computation logic in gfs2_inplace_reserve a
little.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Variable ndata is only used inside "if (!dinode)", so it can be replaced
entirely with *nblocks.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
GFS2 uses struct gfs2_rbm to represent a filesystem block number as a
bit position within a resource group. This representation is used in
the bitmap manipulation code to prevent excessive conversions between
block numbers and bit positions, but also in struct gfs2_blkreserv which
is part of struct gfs2_inode, to mark the start of a reservation. In
the inode, the bit position representation makes less sense: first, the
start position is used as a block number about as often as a bit
position; second, the bit position representation makes the code
unnecessarily complicated and difficult to read.
Therefore, change struct gfs2_blkreserv to represent the start of a
reservation as a block number instead of a bit position. (This requires
keeping track of the resource group in gfs2_blkreserv separately.) With
that change, various things can be slightly simplified, and struct
gfs2_rbm can be moved to rgrp.c.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | |/
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Change gfs2_rbm_incr to advance an rbm by a given number of blocks. Use that
in gfs2_reservation_check_and_update to save a gfs2_rbm_to_block ->
gfs2_rbm_from_block round trip.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When starting an iomap write, gfs2_quota_lock_check -> gfs2_quota_lock
-> gfs2_quota_hold is called from gfs2_iomap_begin. At the end of the
write, before unlocking the quotas, punch_hole -> gfs2_quota_hold can be
called again in gfs2_iomap_end, which is incorrect and leads to a failed
assertion. Instead, move the call to gfs2_quota_unlock before the call
to punch_hole to fix that.
Fixes: 64bc06bb32ee ("gfs2: iomap buffered write support")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|