summaryrefslogtreecommitdiffstats
path: root/fs/dlm/lock.c
Commit message (Collapse)AuthorAgeFilesLines
* fs: dlm: use event based wait for pending removeAlexander Aring2021-12-071-7/+12
| | | | | | | | This patch will use an event based waitqueue to wait for a possible clash with the ls_remove_name field of dlm_ls instead of doing busy waiting. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: filter user dlm messages for kernel locksAlexander Aring2021-11-021-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following crash by receiving a invalid message: [ 160.672220] ================================================================== [ 160.676206] BUG: KASAN: user-memory-access in dlm_user_add_ast+0xc3/0x370 [ 160.679659] Read of size 8 at addr 00000000deadbeef by task kworker/u32:13/319 [ 160.681447] [ 160.681824] CPU: 10 PID: 319 Comm: kworker/u32:13 Not tainted 5.14.0-rc2+ #399 [ 160.683472] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.14.0-1.module+el8.6.0+12648+6ede71a5 04/01/2014 [ 160.685574] Workqueue: dlm_recv process_recv_sockets [ 160.686721] Call Trace: [ 160.687310] dump_stack_lvl+0x56/0x6f [ 160.688169] ? dlm_user_add_ast+0xc3/0x370 [ 160.689116] kasan_report.cold.14+0x116/0x11b [ 160.690138] ? dlm_user_add_ast+0xc3/0x370 [ 160.690832] dlm_user_add_ast+0xc3/0x370 [ 160.691502] _receive_unlock_reply+0x103/0x170 [ 160.692241] _receive_message+0x11df/0x1ec0 [ 160.692926] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 160.693700] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 160.694427] ? lock_acquire+0x175/0x400 [ 160.695058] ? do_purge.isra.51+0x200/0x200 [ 160.695744] ? lock_acquired+0x360/0x5d0 [ 160.696400] ? lock_contended+0x6a0/0x6a0 [ 160.697055] ? lock_release+0x21d/0x5e0 [ 160.697686] ? lock_is_held_type+0xe0/0x110 [ 160.698352] ? lock_is_held_type+0xe0/0x110 [ 160.699026] ? ___might_sleep+0x1cc/0x1e0 [ 160.699698] ? dlm_wait_requestqueue+0x94/0x140 [ 160.700451] ? dlm_process_requestqueue+0x240/0x240 [ 160.701249] ? down_write_killable+0x2b0/0x2b0 [ 160.701988] ? do_raw_spin_unlock+0xa2/0x130 [ 160.702690] dlm_receive_buffer+0x1a5/0x210 [ 160.703385] dlm_process_incoming_buffer+0x726/0x9f0 [ 160.704210] receive_from_sock+0x1c0/0x3b0 [ 160.704886] ? dlm_tcp_shutdown+0x30/0x30 [ 160.705561] ? lock_acquire+0x175/0x400 [ 160.706197] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 160.706941] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 160.707681] process_recv_sockets+0x32/0x40 [ 160.708366] process_one_work+0x55e/0xad0 [ 160.709045] ? pwq_dec_nr_in_flight+0x110/0x110 [ 160.709820] worker_thread+0x65/0x5e0 [ 160.710423] ? process_one_work+0xad0/0xad0 [ 160.711087] kthread+0x1ed/0x220 [ 160.711628] ? set_kthread_struct+0x80/0x80 [ 160.712314] ret_from_fork+0x22/0x30 The issue is that we received a DLM message for a user lock but the destination lock is a kernel lock. Note that the address which is trying to derefence is 00000000deadbeef, which is in a kernel lock lkb->lkb_astparam, this field should never be derefenced by the DLM kernel stack. In case of a user lock lkb->lkb_astparam is lkb->lkb_ua (memory is shared by a union field). The struct lkb_ua will be handled by the DLM kernel stack but on a kernel lock it will contain invalid data and ends in most likely crashing the kernel. It can be reproduced with two cluster nodes. node 2: dlm_tool join test echo "862 fooobaar 1 2 1" > /sys/kernel/debug/dlm/test_locks echo "862 3 1" > /sys/kernel/debug/dlm/test_waiters node 1: dlm_tool join test python: foo = DLM(h_cmd=3, o_nextcmd=1, h_nodeid=1, h_lockspace=0x77222027, \ m_type=7, m_flags=0x1, m_remid=0x862, m_result=0xFFFEFFFE) newFile = open("/sys/kernel/debug/dlm/comms/2/rawmsg", "wb") newFile.write(bytes(foo)) Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: add lkb waiters debugfs functionalityAlexander Aring2021-11-021-0/+15
| | | | | | | | | | This patch adds functionality to put a lkb to the waiters state. It can be useful to combine this feature with the "rawmsg" debugfs functionality. It will bring the DLM lkb into a state that a message will be parsed by the kernel. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: add lkb debugfs functionalityAlexander Aring2021-11-021-0/+46
| | | | | | | | | | | | | | | | | | | This patch adds functionality to add an lkb during runtime. This is a highly debugging feature only, wrong input can crash the kernel. It is a early state feature as well. The goal is to provide a user interface for manipulate dlm state and combine it with the rawmsg feature. It is debugfs functionality, we don't care about UAPI breakage. Even it's possible to add lkb's/rsb's which could never be exists in such wat by using normal DLM operation. The user of this interface always need to think before using this feature, not every crash which happens can really occur during normal dlm operation. Future there should be more functionality to add a more realistic lkb which reflects normal DLM state inside the kernel. For now this is enough. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: allow create lkb with specific id rangeAlexander Aring2021-11-021-2/+8
| | | | | | | This patch adds functionality to add a lkb with a specific id range. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: initial support for tracepointsAlexander Aring2021-11-021-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds initial support for dlm tracepoints. It will introduce tracepoints to dlm main functionality dlm_lock()/dlm_unlock() and their complete ast() callback or blocking bast() callback. The lock/unlock functionality has a start and end tracepoint, this is because there exists a race in case if would have a tracepoint at the end position only the complete/blocking callbacks could occur before. To work with eBPF tracing and using their lookup hash functionality there could be problems that an entry was not inserted yet. However use the start functionality for hash insert and check again in end functionality if there was an dlm internal error so there is no ast callback. In further it might also that locks with local masters will occur those callbacks immediately so we must have such functionality. I did not make everything accessible yet, although it seems eBPF can be used to access a lot of internal datastructures if it's aware of the struct definitions of the running kernel instance. We still can change it, if you do eBPF experiments e.g. time measurements between lock and callback functionality you can simple use the local lkb_id field as hash value in combination with the lockspace id if you have multiple lockspaces. Otherwise you can simple use trace-cmd for some functionality, e.g. `trace-cmd record -e dlm` and `trace-cmd report` afterwards. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: add union in dlm header for lockspace idAlexander Aring2021-05-251-4/+4
| | | | | | | | This patch adds union inside the lockspace id to handle it also for another use case for a different dlm command. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: add more midcomms hooksAlexander Aring2021-05-251-4/+4
| | | | | | | | | | | | | | | | | | | | | | This patch prepares hooks to redirect to the midcomms layer which will be used by the midcomms re-transmit handling. There exists the new concept of stateless buffers allocation and commits. This can be used to bypass the midcomms re-transmit handling. It is used by RCOM_STATUS and RCOM_NAMES messages, because they have their own ping-like re-transmit handling. As well these two messages will be used to determine the DLM version per node, because these two messages are per observation the first messages which are exchanged. Cluster manager events for node membership are added to add support for half-closed connections in cases that the peer connection get to an end of file but DLM still holds membership of the node. In this time DLM can still trigger new message which we should allow. After the cluster manager node removal event occurs it safe to close the connection. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* fs: dlm: use GFP_ZERO for page bufferAlexander Aring2021-03-091-2/+0
| | | | | | | | | This patch uses GFP_ZERO for allocate a page for the internal dlm sending buffer allocator instead of calling memset zero after every allocation. An already allocated space will never be reused again. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2020-08-231-1/+1
| | | | | | | | | | Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193Thomas Gleixner2019-05-301-3/+1
| | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this copyrighted material is made available to anyone wishing to use modify copy or redistribute it subject to the terms and conditions of the gnu general public license v 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 45 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528170027.342746075@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* dlm: memory leaks on error path in dlm_user_request()Vasily Averin2018-11-151-7/+7
| | | | | | | | | | | | | | According to comment in dlm_user_request() ua should be freed in dlm_free_lkb() after successful attach to lkb. However ua is attached to lkb not in set_lock_args() but later, inside request_lock(). Fixes 597d0cae0f99 ("[DLM] dlm: user locks") Cc: stable@kernel.org # 2.6.19 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: lost put_lkb on error path in receive_convert() and receive_unlock()Vasily Averin2018-11-151-0/+2
| | | | | | | | Fixes 6d40c4a708e0 ("dlm: improve error and debug messages") Cc: stable@kernel.org # 3.5 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: possible memory leak on error path in create_lkb()Vasily Averin2018-11-151-0/+1
| | | | | | | | Fixes 3d6aa675fff9 ("dlm: keep lkbs in idr") Cc: stable@kernel.org # 3.1 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: remove dlm_send_rcom_lookup_dumpDavid Teigland2017-10-091-1/+0
| | | | | | | | | | | This function was only for debugging. It would be called in a condition that should not happen, and should probably have been removed from the final version of the original commit. Remove it because it does mutex lock under spin lock. Signed-off-by: David Teigland <teigland@redhat.com>
* DLM: fix conversion deadlock when DLM_LKF_NODLCKWT flag is settsutomu.owa@toshiba.co.jp2017-09-251-19/+23
| | | | | | | | | | | | | | | | | | | When the DLM_LKF_NODLCKWT flag was set, even if conversion deadlock was detected, the caller of can_be_granted() was unknown. We change the behavior of can_be_granted() and change it to detect conversion deadlock regardless of whether the DLM_LKF_NODLCKWT flag is set or not. And depending on whether the DLM_LKF_NODLCKWT flag is set or not, we change the behavior at the caller of can_be_granted(). This fix has no effect except when using DLM_LKF_NODLCKWT flag. Currently, ocfs2 uses the DLM_LKF_NODLCKWT flag and does not expect a cancel operation from conversion deadlock when calling dlm_lock(). ocfs2 is implemented to perform a cancel operation by requesting BASTs (callback). Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: Delete an error message for a failed memory allocation in ↵Markus Elfring2017-08-071-3/+1
| | | | | | | | | | dlm_recover_waiters_pre() Omit an extra message for a memory allocation failure in this function. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: Improve a size determination in dlm_recover_waiters_pre()Markus Elfring2017-08-071-1/+1
| | | | | | | | | Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: Use kcalloc() in dlm_scan_waiters()Markus Elfring2017-08-071-1/+1
| | | | | | | | | | | A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kcalloc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David Teigland <teigland@redhat.com>
* ktime: Get rid of ktime_equal()Thomas Gleixner2016-12-251-3/+2
| | | | | | | | No point in going through loops and hoops instead of just comparing the values. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
* ktime: Cleanup ktime_set() usageThomas Gleixner2016-12-251-1/+1
| | | | | | | | | | ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a Seconds and Nanoseconds part of a time value needs to be converted. For anything where the Seconds argument is 0, this is pointless and can be replaced with a simple assignment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
* dlm: adopt orphan locksDavid Teigland2014-11-191-2/+74
| | | | | | | | | | | | | | | | | | | | | | | A process may exit, leaving an orphan lock in the lockspace. This adds the capability for another process to acquire the orphan lock. Acquiring the orphan just moves the lock from the orphan list onto the acquiring process's list of locks. An adopting process must specify the resource name and mode of the lock it wants to adopt. If a matching lock is found, the lock is moved to the caller's 's list of locks, and the lkid of the lock is returned like the lkid of a new lock. If an orphan with a different mode is found, then -EAGAIN is returned. If no orphan lock is found on the resource, then -ENOENT is returned. No async completion is used because the result is immediately available. Also, when orphans are purged, allow a zero nodeid to refer to the local nodeid so the caller does not need to look up the local nodeid. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: use INFO for recovery messagesDavid Teigland2014-02-141-3/+3
| | | | | | | | The log messages relating to the progress of recovery are minimal and very often useful. Change these to the KERN_INFO level so they are always available. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: silence a harmless use after free warningDan Carpenter2014-02-121-0/+1
| | | | | | | | We pass the freed "r" pointer back to the caller. It's harmless but it upsets the static checkers. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: Avoid LVB truncationBart Van Assche2013-06-261-4/+4
| | | | | | | | For lockspaces with an LVB length above 64 bytes, avoid truncating the LVB while exchanging it with another node in the cluster. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: convert to idr_alloc()Tejun Heo2013-02-271-12/+6
| | | | | | | | | | Convert to the much saner new idr interface. Error return values from recover_idr_add() mix -1 and -errno. The conversion doesn't change that but it looks iffy. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* dlm: avoid scanning unchanged toss listsDavid Teigland2013-01-071-0/+15
| | | | | | | | | | | Keep track of whether a toss list contains any shrinkable rsbs. If not, dlm_scand can avoid scanning the list for rsbs to shrink. Unnecessary scanning can otherwise waste a lot of time because the toss lists can contain a large number of rsbs that are non-shrinkable (directory records). Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix lvb invalidation conditionsDavid Teigland2012-11-161-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a node is removed that held a PW/EX lock, the existing master node should invalidate the lvb on the resource due to the purged lock. Previously, the existing master node was invalidating the lvb if it found only NL/CR locks on the resource during recovery for the removed node. This could lead to cases where it invalidated the lvb and shouldn't have, or cases where it should have invalidated and didn't. When recovery selects a *new* master node for a resource, and that new master finds only NL/CR locks on the resource after lock recovery, it should invalidate the lvb. This case was handled correctly (but was incorrectly applied to the existing master case also.) When a process exits while holding a PW/EX lock, the lvb on the resource should be invalidated. This was not happening. The lvb contents and VALNOTVALID flag should be recovered before granting locks in recovery so that the recovered lvb state is provided in the callback. The lvb was being recovered after the lock was granted. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix missing dir removeDavid Teigland2012-07-161-2/+68
| | | | | | | | | | | | | | | | | | I don't know exactly how, but in some cases, a dir record is not removed, or a new one is created when it shouldn't be. The result is that the dir node lookup returns a master node where the rsb does not exist. In this case, The master node will repeatedly return -EBADR for requests, and the lock requests will be stuck. Until all possible ways for this to happen can be eliminated, a simple and effective way to recover from this situation is for the supposed master node to send a standard remove message to the dir node when it receives a request for a resource it has no rsb for. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix conversion deadlock from recoveryDavid Teigland2012-07-161-15/+40
| | | | | | | | | | The process of rebuilding locks on a new master during recovery could re-order the locks on the convert queue, creating an "in place" conversion deadlock that would not be resolved. Fix this by not considering queue order when granting conversions after recovery. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix race between remove and lookupDavid Teigland2012-07-161-37/+144
| | | | | | | | | | | | | | | It was possible for a remove message on an old rsb to be sent after a lookup message on a new rsb, where the rsbs were for the same resource name. This could lead to a missing directory entry for the new rsb. It is fixed by keeping a copy of the resource name being removed until after the remove has been sent. A lookup checks if this in-progress remove matches the name it is looking up. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: use rsbtbl as resource directoryDavid Teigland2012-07-161-203/+819
| | | | | | | | | | | | | | | | | | | Remove the dir hash table (dirtbl), and use the rsb hash table (rsbtbl) as the resource directory. It has always been an unnecessary duplication of information. This improves efficiency by using a single rsbtbl lookup in many cases where both rsbtbl and dirtbl lookups were needed previously. This eliminates the need to handle cases of rsbtbl and dirtbl being out of sync. In many cases there will be memory savings because the dir hash table no longer exists. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fixes for nodir modeDavid Teigland2012-05-021-89/+197
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "nodir" mode (statically assign master nodes instead of using the resource directory) has always been highly experimental, and never seriously used. This commit fixes a number of problems, making nodir much more usable. - Major change to recovery: recover all locks and restart all in-progress operations after recovery. In some cases it's not possible to know which in-progess locks to recover, so recover all. (Most require recovery in nodir mode anyway since rehashing changes most master nodes.) - Change the way nodir mode is enabled, from a command line mount arg passed through gfs2, into a sysfs file managed by dlm_controld, consistent with the other config settings. - Allow recovering MSTCPY locks on an rsb that has not yet been turned into a master copy. - Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages from a previous, aborted recovery cycle. Base this on the local recovery status not being in the state where any nodes should be sending LOCK messages for the current recovery cycle. - Hold rsb lock around dlm_purge_mstcpy_locks() because it may run concurrently with dlm_recover_master_copy(). - Maintain highbast on process-copy lkb's (in addition to the master as is usual), because the lkb can switch back and forth between being a master and being a process copy as the master node changes in recovery. - When recovering MSTCPY locks, flag rsb's that have non-empty convert or waiting queues for granting at the end of recovery. (Rename flag from LOCKS_PURGED to RECOVER_GRANT and similar for the recovery function, because it's not only resources with purged locks that need grant a grant attempt.) - Replace a couple of unnecessary assertion panics with error messages. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: improve error and debug messagesDavid Teigland2012-04-261-85/+156
| | | | | | | | | Change some existing error/debug messages to collect more useful information, and add some new error/debug messages to address recently found problems. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: avoid unnecessary search in search_rsbDavid Teigland2012-04-261-0/+3
| | | | | | | | | | If the rsb is found in the "keep" tree, but is not the right type (i.e. not MASTER), we can return immediately with the result. There's no point in going on to search the "toss" list as if we hadn't found it. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix waiter recoveryDavid Teigland2012-04-261-12/+31
| | | | | | | | | | | | An outstanding remote operation (an lkb on the "waiter" list) could sometimes miss being resent during recovery. The decision was based on the lkb_nodeid field, which could have changed during an earlier aborted recovery, so it no longer represents the actual remote destination. The lkb_wait_nodeid is always the actual remote node, so it is the best value to use. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix QUECVT when convert queue is emptyDavid Teigland2012-04-231-0/+12
| | | | | | | | The QUECVT flag should not prevent conversions from being granted immediately when the convert queue is empty. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix slow rsb search in dir recoveryDavid Teigland2012-03-081-4/+4
| | | | | | | | | The function used to find an rsb during directory recovery was searching the single linear list of rsb's. This wasted a lot of time compared to using the standard hash table to find the rsb. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: convert rsb list to rb_treeBob Peterson2011-11-181-17/+70
| | | | | | | | | | Change the linked lists to rb_tree's in the rsb hash table to speed up searches. Slow rsb searches were having a large impact on gfs2 performance due to the large number of dlm locks gfs2 uses. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: use workqueue for callbacksDavid Teigland2011-07-151-12/+12
| | | | | | | | | Instead of creating our own kthread (dlm_astd) to deliver callbacks for all lockspaces, use a per-lockspace workqueue to deliver the callbacks. This eliminates complications and slowdowns from many lockspaces sharing the same thread. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: remove deadlock debug printDavid Teigland2011-07-141-3/+0
| | | | | | | gfs2 recently began using this feature heavily, creating more debug output than we want to see. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: improve rsb searchesDavid Teigland2011-07-121-37/+82
| | | | | | | | | | | | | | By pre-allocating rsb structs before searching the hash table, they can be inserted immediately. This avoids always having to repeat the search when adding the struct to hash list. This also adds space to the rsb struct for a max resource name, so an rsb allocation can be used by any request. The constant size also allows us to finally use a slab for the rsb structs. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: keep lkbs in idrDavid Teigland2011-07-111-45/+24
| | | | | | | | This is simpler and quicker than the hash table, and avoids needing to search the hash list for every new lkid to check if it's used. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: fix kmalloc argsDavid Teigland2011-07-111-1/+1
| | | | | | The gfp and size args were switched. Signed-off-by: David Teigland <teigland@redhat.com>
* dlm: don't do pointless NULL check, use kzalloc and fix order of argumentsJesper Juhl2011-07-111-6/+2
| | | | | | | | | | | | | | | | | | In fs/dlm/lock.c in the dlm_scan_waiters() function there are 3 small issues: 1) There's no need to test the return value of the allocation and do a memset if is succeedes. Just use kzalloc() to obtain zeroed memory. 2) Since kfree() handles NULL pointers gracefully, the test of 'warned' against NULL before the kfree() after the loop is completely pointless. Remove it. 3) The arguments to kmalloc() (now kzalloc()) were swapped. Thanks to Dr. David Alan Gilbert for pointing this out. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: David Teigland <teigland@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2011-05-241-40/+142
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: make plock operation killable dlm: remove shared message stub for recovery dlm: delayed reply message warning dlm: Remove superfluous call to recalc_sigpending()
| * dlm: remove shared message stub for recoveryDavid Teigland2011-04-051-33/+49
| | | | | | | | | | | | | | | | | | | | | | kmalloc a stub message struct during recovery instead of sharing the struct in the lockspace. This leaves the lockspace stub_ms only for faking downconvert replies, where it is never modified and sharing is not a problem. Also improve the debug messages in the same recovery function. Signed-off-by: David Teigland <teigland@redhat.com>
| * dlm: delayed reply message warningDavid Teigland2011-04-011-7/+93
| | | | | | | | | | | | | | | | Add an option (disabled by default) to print a warning message when a lock has been waiting a configurable amount of time for a reply message from another node. This is mainly for debugging. Signed-off-by: David Teigland <teigland@redhat.com>
* | Fix common misspellingsLucas De Marchi2011-03-311-1/+1
|/ | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* dlm: record full callback stateDavid Teigland2011-03-101-21/+17
| | | | | | | | | | | | | Change how callbacks are recorded for locks. Previously, information about multiple callbacks was combined into a couple of variables that indicated what the end result should be. In some situations, we could not tell from this combined state what the exact sequence of callbacks were, and would end up either delivering the callbacks in the wrong order, or suppress redundant callbacks incorrectly. This new approach records all the data for each callback, leaving no uncertainty about what needs to be delivered. Signed-off-by: David Teigland <teigland@redhat.com>