bcachefs: gracefully unwind journal res slowpath on shutdown

bcachefs detects journal stuck conditions in a couple different places. If the logic in the journal reservation slow path happens to detect the problem, I've seen instances where the filesystem remains deadlocked even though it has been shut down. This is occasionally reproduced by generic/333, and usually manifests as one or more tasks stuck in the journal reservation slow path.

To help avoid this problem, repeat the journal error check in __journal_res_get() once under spinlock to cover the case where the previous lock holder might have triggered shutdown. This also helps avoid spurious/duplicate stuck reports. Also, wake the journal from the halt code to make sure blocked callers of the journal res slowpath have a chance to wake up and observe the pending error. This survives an overnight looping run of generic/333 without the aforementioned lockups.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
author: Brian Foster <bfoster@redhat.com> 2023-03-20 13:21:19 -0400
committer: Kent Overstreet <kent.overstreet@linux.dev> 2023-10-22 17:09:58 -0400
commit: 23fd4f4dc622c36124515401d223607baec01a0d (patch)
tree: b3e30d74fa9adcb371c8876281ad25be426f1872 /fs/bcachefs/journal.c
parent: 873555f04d81b49a96ea03b37dcd499c13e67742 (diff)
1 files changed, 7 insertions, 0 deletions
diff --git a/fs/bcachefs/journal.c b/fs/bcachefs/journal.c
index 801f09593e6b..43bb1d4002bd 100644
--- a/fs/bcachefs/journal.c
+++ b/fs/bcachefs/journal.c
@@ -162,6 +162,7 @@ void bch2_journal_halt(struct journal *j)
 	__journal_entry_close(j, JOURNAL_ENTRY_ERROR_VAL);
 	if (!j->err_seq)
 		j->err_seq = journal_cur_seq(j);
+	journal_wake(j);
 	spin_unlock(&j->lock);
 }
 
@@ -362,6 +363,12 @@ retry:
 
 	spin_lock(&j->lock);
 
+	/* check once more in case somebody else shut things down... */
+	if (bch2_journal_error(j)) {
+		spin_unlock(&j->lock);
+		return -BCH_ERR_erofs_journal_err;
+	}
+
 	/*
 	 * Recheck after taking the lock, so we don't race with another thread
 	 * that just did journal_entry_open() and call journal_entry_close()
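
The two hunks above work as a pair: the halt path wakes any sleepers, and the slow path rechecks the error state once it holds the lock. A minimal user-space sketch of that pattern follows. It is not the bcachefs code: the toy_journal type and its functions are hypothetical, a pthread mutex and condition variable stand in for j->lock and journal_wake(), and -EROFS stands in for -BCH_ERR_erofs_journal_err.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct journal; not the bcachefs layout. */
struct toy_journal {
	pthread_mutex_t	lock;	/* plays the role of j->lock */
	pthread_cond_t	wait;	/* plays the role of the journal waitqueue */
	bool		error;	/* set once by the halt path */
	unsigned	space;	/* reservation units still available */
};

void toy_journal_init(struct toy_journal *j, unsigned space)
{
	pthread_mutex_init(&j->lock, NULL);
	pthread_cond_init(&j->wait, NULL);
	j->error = false;
	j->space = space;
}

/* Halt: record the error, then wake every waiter so it can observe it. */
void toy_journal_halt(struct toy_journal *j)
{
	pthread_mutex_lock(&j->lock);
	j->error = true;
	pthread_cond_broadcast(&j->wait);	/* analogous to journal_wake(j) */
	pthread_mutex_unlock(&j->lock);
}

/*
 * Slow path: take the lock, then check the error state before anything
 * else, because a previous lock holder may have shut the journal down.
 * Returns 0 on success or -EROFS once the journal has been halted.
 */
int toy_journal_res_get(struct toy_journal *j)
{
	int ret = 0;

	pthread_mutex_lock(&j->lock);
	for (;;) {
		/* check once more in case somebody else shut things down */
		if (j->error) {
			ret = -EROFS;
			break;
		}
		if (j->space) {
			j->space--;
			break;
		}
		/* No space: sleep until space is freed or halt wakes us. */
		pthread_cond_wait(&j->wait, &j->lock);
	}
	pthread_mutex_unlock(&j->lock);
	return ret;
}
```

In this sketch, dropping the broadcast in toy_journal_halt() would leave a caller asleep in pthread_cond_wait() indefinitely after shutdown, which is the toy analogue of the stuck tasks described in the commit message.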