Commit message [Author, Age, Files, Lines]
* fscache: Attach the index key and aux data to the cookie [David Howells, 2018-04-04, 26 files, -786/+606]

    Attach copies of the index key and auxiliary data to the fscache cookie
    so that:

     (1) The callbacks to the netfs for this stuff can be eliminated. This
         can simplify things in the cache as the information is still
         available, even after the cache has relinquished the cookie.

     (2) Simplifies the locking requirements of accessing the information as
         we don't have to worry about the netfs object going away on us.

     (3) The cache can do lazy updating of the coherency information on
         disk. As long as the cache is flushed before reboot/poweroff,
         there's no need to update the coherency info on disk every time it
         changes.

     (4) Cookies can be hashed or put in a tree as the index key is easily
         available. This allows:

         (a) Checks for duplicate cookies can be made at the top fscache
             layer rather than down in the bowels of the cache backend.

         (b) Caching can be added to a netfs object that has a cookie if the
             cache is brought online after the netfs object is allocated.

    A certain amount of space is made in the cookie for inline copies of the
    data, but if it won't fit there, extra memory will be allocated for it.

    The downside of this is that live cache operation requires more memory.

    Signed-off-by: David Howells <dhowells@redhat.com>
    Acked-by: Anna Schumaker <anna.schumaker@netapp.com>
    Tested-by: Steve Dickson <steved@redhat.com>
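The "inline copy, else allocate" storage described in the commit above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: `struct demo_cookie`, `DEMO_INLINE_KEY_MAX` and the `demo_cookie_*` helpers are hypothetical names, and the inline capacity is made up because the commit message does not state the real one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity; the real fscache limit is not given above. */
#define DEMO_INLINE_KEY_MAX 16

struct demo_cookie {
    size_t key_len;
    union {
        unsigned char inline_key[DEMO_INLINE_KEY_MAX]; /* used when the key fits */
        unsigned char *key;                            /* heap copy otherwise */
    };
};

/* Copy the caller's key into the cookie. Returns 0 on success, -1 on ENOMEM. */
static int demo_cookie_set_key(struct demo_cookie *c, const void *key, size_t len)
{
    c->key_len = len;
    if (len <= DEMO_INLINE_KEY_MAX) {
        memcpy(c->inline_key, key, len);
        return 0;
    }
    c->key = malloc(len);
    if (!c->key)
        return -1;
    memcpy(c->key, key, len);
    return 0;
}

/* Return a pointer to the stored key, wherever it lives. */
static const unsigned char *demo_cookie_key(const struct demo_cookie *c)
{
    return c->key_len <= DEMO_INLINE_KEY_MAX ? c->inline_key : c->key;
}

/* Release the heap copy, if one was needed. */
static void demo_cookie_free_key(struct demo_cookie *c)
{
    if (c->key_len > DEMO_INLINE_KEY_MAX)
        free(c->key);
    c->key_len = 0;
}
```

Because the cookie owns its copy, the key stays readable after the caller's buffer has gone away, which is the property points (1) and (2) rely on; the cost is the extra memory the commit message mentions.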
* fscache: Add more tracepoints [David Howells, 2018-04-04, 6 files, -8/+330]

    Add more tracepoints to fscache, including:

     (*) fscache_page - Tracks netfs pages known to fscache.

     (*) fscache_check_page - Tracks the netfs querying whether a page is
         pending storage.

     (*) fscache_wake_cookie - Tracks cookies being woken up after a page
         completes/aborts storage in the cache.

     (*) fscache_op - Tracks operations being initialised.

     (*) fscache_wrote_page - Tracks return of the backend write_page op.

     (*) fscache_gang_lookup - Tracks lookup of pages to be stored in the
         write operation.

    Signed-off-by: David Howells <dhowells@redhat.com>
* fscache: Add tracepoints [David Howells, 2018-04-04, 12 files, -54/+731]

    Add some tracepoints to fscache:

     (*) fscache_cookie - Tracks a cookie's usage count.

     (*) fscache_netfs - Logs registration of a network filesystem,
         including the pointer to the cookie allocated.

     (*) fscache_acquire - Logs cookie acquisition.

     (*) fscache_relinquish - Logs cookie relinquishment.

     (*) fscache_enable - Logs enablement of a cookie.

     (*) fscache_disable - Logs disablement of a cookie.

     (*) fscache_osm - Tracks execution of states in the object state
         machine.

    and cachefiles:

     (*) cachefiles_ref - Tracks a cachefiles object's usage count.

     (*) cachefiles_lookup - Logs result of lookup_one_len().

     (*) cachefiles_mkdir - Logs result of vfs_mkdir().

     (*) cachefiles_create - Logs result of vfs_create().

     (*) cachefiles_unlink - Logs calls to vfs_unlink().

     (*) cachefiles_rename - Logs calls to vfs_rename().

     (*) cachefiles_mark_active - Logs an object becoming active.

     (*) cachefiles_wait_active - Logs a wait for an old object to be
         destroyed.

     (*) cachefiles_mark_inactive - Logs an object becoming inactive.

     (*) cachefiles_mark_buried - Logs the burial of an object.

    Signed-off-by: David Howells <dhowells@redhat.com>
* fscache: Fix hanging wait on page discarded by writeback [David Howells, 2018-04-04, 1 file, -4/+9]

    If the fscache asynchronous write operation elects to discard a page
    that's pending storage to the cache because the page would be over the
    store limit then it needs to wake the page as someone may be waiting on
    completion of the write.

    The problem is that the store limit may be updated by a different
    asynchronous operation - and so may miss the write - and that the store
    limit may not even get updated until later by the netfs.

    Fix the kernel hang by making fscache_write_op() mark as written any
    pages that are over the limit.

    Signed-off-by: David Howells <dhowells@redhat.com>
* fscache: Detect multiple relinquishment of a cookie [David Howells, 2018-04-04, 1 file, -1/+2]

    Report if an fscache cookie is relinquished multiple times by the netfs.

    Signed-off-by: David Howells <dhowells@redhat.com>
* fscache: Pass the correct cancelled indications to fscache_op_complete() [David Howells, 2018-04-04, 2 files, -7/+10]

    The last parameter to fscache_op_complete() is a bool indicating whether
    or not the operation was cancelled. A lot of the time the inverse value
    is given or no differentiation is made. Fix this.

    Signed-off-by: David Howells <dhowells@redhat.com>
* fscache, cachefiles: Fix checker warnings [David Howells, 2018-04-04, 2 files, -1/+1]

    Fix a couple of checker warnings in fscache and cachefiles:

     (1) fscache_n_op_requeue is never used, so get rid of it.

     (2) cachefiles_uncache_page() is passed in a lock that it releases, so
         this needs annotating.

    Signed-off-by: David Howells <dhowells@redhat.com>
* afs: Be more aggressive in retiring cached vnodes [David Howells, 2018-04-04, 1 file, -2/+3]

    When relinquishing cookies, either due to iget failure or to inode
    eviction, retire a cookie if we think the corresponding vnode got deleted
    on the server rather than just letting it lie in the cache.

    Signed-off-by: David Howells <dhowells@redhat.com>
* afs: Use the vnode ID uniquifier in the cache key not the aux data [David Howells, 2018-04-04, 1 file, -14/+8]

    AFS vnodes (files) are referenced by a triplet of { volume ID, vnode ID,
    uniquifier }. Currently, kafs is only using the vnode ID as the file key
    in the volume fscache index and checking the uniquifier on cookie
    acquisition against the contents of the auxiliary data stored in the
    cache.

    Unfortunately, this is subject to a race in which an FS.RemoveFile or
    FS.RemoveDir op is issued against the server but the local afs inode
    isn't torn down and disposed of before another thread issues something
    like FS.CreateFile. The latter then gets given the vnode ID that just
    got removed, but with a new uniquifier, and a cookie collision occurs in
    the cache because the cookie is only keyed on the vnode ID whereas the
    inode is keyed on the vnode ID plus the uniquifier.

    Fix this by keying the cookie on the uniquifier in addition to the vnode
    ID and dropping the uniquifier from the auxiliary data supplied.

    Signed-off-by: David Howells <dhowells@redhat.com>
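The collision described in the commit above can be shown with a small userspace sketch. The structures and function names here (`demo_key_old`, `demo_key_new`, `demo_vnode`, `demo_*_keys_collide`) are hypothetical and are not the kafs definitions; the sketch only contrasts a key built from the vnode ID alone with one that also includes the uniquifier.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical key layouts, for illustration only. */
struct demo_key_old { uint32_t vnode_id; };
struct demo_key_new { uint32_t vnode_id; uint32_t uniquifier; };

/* The part of the { volume ID, vnode ID, uniquifier } triplet used here. */
struct demo_vnode { uint32_t vnode_id; uint32_t uniquifier; };

/* Old scheme: key on the vnode ID only. Returns 1 if the two keys match. */
static int demo_old_keys_collide(const struct demo_vnode *a,
                                 const struct demo_vnode *b)
{
    struct demo_key_old ka = { a->vnode_id }, kb = { b->vnode_id };
    return memcmp(&ka, &kb, sizeof(ka)) == 0;
}

/* New scheme: key on the vnode ID plus the uniquifier. */
static int demo_new_keys_collide(const struct demo_vnode *a,
                                 const struct demo_vnode *b)
{
    struct demo_key_new ka = { a->vnode_id, a->uniquifier };
    struct demo_key_new kb = { b->vnode_id, b->uniquifier };
    return memcmp(&ka, &kb, sizeof(ka)) == 0;
}
```

A removed file and its replacement that reuse the same vnode ID with a different uniquifier collide under the first function and are distinct under the second, which matches how the inode itself is keyed.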
* afs: Invalidate cache on server data change [David Howells, 2018-04-04, 1 file, -0/+4]

    Invalidate any data stored in fscache for a vnode that changes on the
    server so that we don't end up with the cache in a bad state locally.

    Signed-off-by: David Howells <dhowells@redhat.com>
* Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace [Linus Torvalds, 2018-04-03, 15 files, -455/+346]
|\
    Pull namespace updates from Eric Biederman:
     "There was a lot of work this cycle fixing bugs that were discovered
      after the merge window and getting everything ready where we can
      reasonably support fully unprivileged fuse. The bug fixes you already
      have and much of the unprivileged fuse work is coming in via other
      trees.

      Still left for fully unprivileged fuse is figuring out how to cleanly
      handle .set_acl and .get_acl in the legacy case, and proper handling
      of evm xattrs on unprivileged mounts.

      Included in the tree is a cleanup from Alexey that replaced a linked
      list with a statically allocated fixed-size array for the pid caches,
      which simplifies and speeds things up.

      Then there are some cleanups and fixes for the ipc namespace. The
      motivation was that in reviewing other code it was discovered that
      accesses to ipc objects from different pid namespaces recorded pids in
      such a way that when asked the wrong pids were returned. In the worst
      case there has been a measured 30% performance impact for sysvipc
      semaphores. Other test cases showed no measurable performance impact.
      Manfred Spraul and Davidlohr Bueso who tend to work on sysvipc
      performance both gave the nod that this is good enough.

      Casey Schaufler and James Morris have given their approval to the LSM
      side of the changes.

      I simplified the types and the code dealing with sysvipc to pass just
      kern_ipc_perm for all three types of ipc. Which reduced the header
      dependencies throughout the kernel and simplified the lsm code.

      Which let me work on the pid fixes without having to worry about
      trivial changes causing complete kernel recompiles"

    * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
      ipc/shm: Fix pid freeing.
      ipc/shm: fix up for struct file no longer being available in shm.h
      ipc/smack: Tidy up from the change in type of the ipc security hooks
      ipc: Directly call the security hook in ipc_ops.associate
      ipc/sem: Fix semctl(..., GETPID, ...) between pid namespaces
      ipc/msg: Fix msgctl(..., IPC_STAT, ...) between pid namespaces
      ipc/shm: Fix shmctl(..., IPC_STAT, ...) between pid namespaces.
      ipc/util: Helpers for making the sysvipc operations pid namespace aware
      ipc: Move IPCMNI from include/ipc.h into ipc/util.h
      msg: Move struct msg_queue into ipc/msg.c
      shm: Move struct shmid_kernel into ipc/shm.c
      sem: Move struct sem and struct sem_array into ipc/sem.c
      msg/security: Pass kern_ipc_perm not msg_queue into the msg_queue security hooks
      shm/security: Pass kern_ipc_perm not shmid_kernel into the shm security hooks
      sem/security: Pass kern_ipc_perm not sem_array into the sem security hooks
      pidns: simpler allocation of pid_* caches
| * ipc/shm: Fix pid freeing. [Eric W. Biederman, 2018-03-28, 1 file, -2/+2]

      The 0day kernel test build report reported an oops:
      >
      > IP: put_pid+0x22/0x5c
      > PGD 19efa067 P4D 19efa067 PUD 0
      > Oops: 0000 [#1]
      > CPU: 0 PID: 727 Comm: trinity Not tainted 4.16.0-rc2-00010-g98f929b #1
      > RIP: 0010:put_pid+0x22/0x5c
      > RSP: 0018:ffff986719f73e48 EFLAGS: 00010202
      > RAX: 00000006d765f710 RBX: ffff98671a4fa4d0 RCX: ffff986719f73d40
      > RDX: 000000006f6e6125 RSI: 0000000000000000 RDI: ffffffffa01e6d21
      > RBP: ffffffffa0955fe0 R08: 0000000000000020 R09: 0000000000000000
      > R10: 0000000000000078 R11: ffff986719f73e76 R12: 0000000000001000
      > R13: 00000000ffffffea R14: 0000000054000fb0 R15: 0000000000000000
      > FS: 00000000028c2880(0000) GS:ffffffffa06ad000(0000) knlGS:0000000000000000
      > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      > CR2: 0000000677846439 CR3: 0000000019fc1005 CR4: 00000000000606b0
      > Call Trace:
      > ? ipc_update_pid+0x36/0x3e
      > ? newseg+0x34c/0x3a6
      > ? ipcget+0x5d/0x528
      > ? entry_SYSCALL_64_after_hwframe+0x52/0xb7
      > ? SyS_shmget+0x5a/0x84
      > ? do_syscall_64+0x194/0x1b3
      > ? entry_SYSCALL_64_after_hwframe+0x42/0xb7
      > Code: ff 05 e7 20 9b 03 58 c9 c3 48 ff 05 85 21 9b 03 48 85 ff 74 4f 8b 47 04 8b 17 48 ff 05 7c 21 9b 03 48 83 c0 03 48 c1 e0 04 ff ca <48> 8b 44 07 08 74 1f 48 ff 05 6c 21 9b 03 ff 0f 0f 94 c2 48 ff
      > RIP: put_pid+0x22/0x5c RSP: ffff986719f73e48
      > CR2: 0000000677846439
      > ---[ end trace ab8c5cb4389d37c5 ]---
      > Kernel panic - not syncing: Fatal exception

      In newseg when changing shm_cprid and shm_lprid from pid_t to struct
      pid* I misread the kvmalloc as kvzalloc and thought shp was initialized
      to 0. As that is not the case it is not safe for the error handling to
      address shm_cprid and shm_lprid before they are initialized.

      Therefore move the cleanup of shm_cprid and shm_lprid from the no_file
      error cleanup path to the no_id error cleanup path, ensuring that an
      early error exit won't cause the oops above.

      Reported-by: kernel test robot <fengguang.wu@intel.com>
      Reviewed-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
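The bug class fixed above (an error label releasing fields of a non-zeroed allocation before they are set) can be sketched in userspace C. Everything here is a hypothetical stand-in: `demo_newseg`, `demo_pid`, `demo_seg` and the failure flags are invented for the illustration, and `malloc()` plays the role of the non-zeroing `kvmalloc()`.

```c
#include <stdlib.h>

struct demo_pid { int refcount; };
struct demo_seg { struct demo_pid *cpid; struct demo_pid *lpid; };

static int demo_put_calls; /* counts how often the release helper runs */

static void demo_put_pid(struct demo_pid *p)
{
    demo_put_calls++;
    if (p)
        p->refcount--;
}

/*
 * Returns 0 on success, -1 on failure. The two flags simulate an error
 * before and after the pid fields have been initialised.
 */
static int demo_newseg(struct demo_pid *creator, int fail_before_init,
                       int fail_after_init)
{
    struct demo_seg *seg = malloc(sizeof(*seg)); /* contents are NOT zeroed */

    if (!seg)
        return -1;
    if (fail_before_init)
        goto no_file;           /* pid fields still hold garbage here */

    creator->refcount++;
    seg->cpid = creator;
    seg->lpid = NULL;
    if (fail_after_init)
        goto no_id;

    /* The demo tears the segment down at once; real code would keep it. */
    demo_put_pid(seg->cpid);
    demo_put_pid(seg->lpid);
    free(seg);
    return 0;

no_id:                          /* only reached once the fields are valid */
    demo_put_pid(seg->cpid);
    demo_put_pid(seg->lpid);
no_file:                        /* must not touch cpid or lpid */
    free(seg);
    return -1;
}
```

Placing the two `demo_put_pid()` calls under `no_file` instead would hand uninitialised pointers to the release helper on the early exit, which is the shape of the oops quoted in the commit message.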
| * ipc/shm: fix up for struct file no longer being available in shm.h [Stephen Rothwell, 2018-03-28, 1 file, -0/+2]

      Stephen Rothwell <sfr@canb.auug.org.au> wrote:
      > After merging the userns tree, today's linux-next build (powerpc
      > ppc64_defconfig) produced this warning:
      >
      > In file included from include/linux/sched.h:16:0,
      >                  from arch/powerpc/lib/xor_vmx_glue.c:14:
      > include/linux/shm.h:17:35: error: 'struct file' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
      >  bool is_file_shm_hugepages(struct file *file);
      >                                    ^~~~
      >
      > and many, many more (most warnings, but some errors - arch/powerpc is
      > mostly built with -Werror)

      I dug through this and I discovered that the error was caused by the
      removal of struct shmid_kernel from shm.h when building on powerpc.
      Except for observing the existence of "struct file *shm_file" in struct
      shmid_kernel I have no clue why the structure move would cause such a
      failure. I suspect shm.h always needed the forward declaration and
      something had been confusing gcc into not issuing the warning. --EWB

      Fixes: a2e102cd3cdd ("shm: Move struct shmid_kernel into ipc/shm.c")
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * ipc/smack: Tidy up from the change in type of the ipc security hooks [Eric W. Biederman, 2018-03-27, 1 file, -139/+58]

      Rename the variables shp, sma, msq to isp, as that is how the code
      already refers to those variables.

      Collapse smack_of_shm, smack_of_sem, and smack_of_msq into
      smack_of_ipc, as the three functions had become completely identical.

      Collapse smack_shm_alloc_security, smack_sem_alloc_security and
      smack_msg_queue_alloc_security into smack_ipc_alloc_security as the
      three functions had become identical.

      Collapse smack_shm_free_security, smack_sem_free_security and
      smack_msg_queue_free_security into smack_ipc_free_security as the three
      functions had become identical.

      Requested-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * ipc: Directly call the security hook in ipc_ops.associate [Eric W. Biederman, 2018-03-27, 3 files, -27/+3]

      After the last round of cleanups the shm, sem, and msg associate
      operations just became trivial wrappers around the appropriate security
      method. Simplify things further by just calling the security method
      directly.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * ipc/sem: Fix semctl(..., GETPID, ...) between pid namespaces [Eric W. Biederman, 2018-03-27, 1 file, -10/+12]

      Today the last process to update a semaphore is remembered and reported
      in the pid namespace of that process. If there are processes in any
      other pid namespace querying that process id with GETPID the result
      will be unusable nonsense as it does not make any sense in your own pid
      namespace.

      Due to ipc_update_pid I don't think you will be able to get System V
      ipc semaphores into a troublesome cache line ping-pong. Using struct
      pids from separate processes is not a problem because they do not share
      a cache line. Using struct pid from different threads of the same
      process is unlikely to be a problem as the reference count update can
      be avoided.

      Further, linux futexes are a much better tool for the job of mutual
      exclusion between processes than System V semaphores. So I expect
      programs that are performance limited by their interprocess mutual
      exclusion primitive will be using futexes.

      So while it is possible that enhancing the storage of the last process
      of a System V semaphore from an integer to a struct pid will cause a
      performance regression because of the effect of frequently updating the
      pid reference count, I don't expect that to happen in practice.

      This change updates semctl(..., GETPID, ...) to return the process id
      of the last process to update a semaphore in the pid namespace of the
      calling process.

      Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * ipc/msg: Fix msgctl(..., IPC_STAT, ...) between pid namespaces [Eric W. Biederman, 2018-03-27, 1 file, -10/+13]

      Today msg_lspid and msg_lrpid are remembered in the pid namespace of
      the creator and the processes that last sent or received a sysvipc
      message. If you have processes in multiple pid namespaces that is just
      wrong. The process ids reported will not make the least bit of sense.

      This fix is slightly more susceptible to a performance problem than the
      related fix for System V shared memory. By definition the pids are
      updated by msgsnd and msgrcv, the fast path of System V message queues.
      The only concern over the previous implementation is the incrementing
      and decrementing of the pid reference count. That is the only
      difference, and multiple updates of the task_tgid by threads in the
      same process have been shown in af_unix sockets to create a cache line
      ping-pong between cpus of the same processor.

      In this case I don't expect cache lines holding pid reference counts to
      ping-pong between cpus. As senders and receivers update different pids
      there is a natural separation there. Further, if multiple threads of
      the same process either send or receive messages the pid will be
      updated to the same value and ipc_update_pid will avoid the reference
      count update.

      Which means in the common case I expect msg_lspid and msg_lrpid to
      remain constant, and reference counts not to be updated when messages
      are sent.

      In rare cases it may be possible to trigger the issue which was
      observed for af_unix sockets, but it will require multiple processes
      with multiple threads to be either sending or receiving messages. It
      just does not feel likely that anyone would do that in practice.

      This change updates msgctl(..., IPC_STAT, ...) to return msg_lspid and
      msg_lrpid in the pid namespace of the process calling stat.

      This change also updates cat /proc/sysvipc/msg to print msg_lspid and
      msg_lrpid in the pid namespace of the process that opened the proc
      file.

      Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
      Reviewed-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * ipc/shm: Fix shmctl(..., IPC_STAT, ...) between pid namespaces. [Eric W. Biederman, 2018-03-27, 1 file, -10/+15]

      Today shm_cpid and shm_lpid are remembered in the pid namespace of the
      creator and the processes that last touched a sysvipc shared memory
      segment. If you have processes in multiple pid namespaces that is just
      wrong, and I don't know how this has been overlooked for so long.

      As only creation and shared memory attach and shared memory detach
      update the pids I do not expect there to be a repeat of the issues when
      struct pid was attached to each af_unix skb, which in some notable
      cases cut the performance in half. The problem was threads of the same
      process updating the same struct pid from different cpus causing the
      cache line to be highly contended and bounce between cpus.

      As creation, attach, and detach are expected to be rare operations for
      sysvipc shared memory segments I do not expect that kind of cache line
      ping-pong to cause problems. In addition because the pid is at a fixed
      location in the structure instead of being dynamic on a skb, the
      reference count of the pid does not need to be updated on each
      operation if the pid is the same. This ability to simply skip the pid
      reference count changes if the pid is unchanging further reduces the
      likelihood of a cache line holding a pid reference count ping-ponging
      between cpus.

      Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
      Reviewed-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * ipc/util: Helpers for making the sysvipc operations pid namespace aware [Eric W. Biederman, 2018-03-24, 2 files, -0/+20]

      Capture the pid namespace when /proc/sysvipc/msg, /proc/sysvipc/shm and
      /proc/sysvipc/sem are opened, and make it available through the new
      helper ipc_seq_pid_ns. This makes it possible to report the pids in
      these files in the pid namespace of the opener of the files.

      Implement ipc_update_pid, a simple inline helper that will only update
      a struct pid pointer if the new value does not equal the old value.
      This removes the need for wordy code sequences like:

          old = object->pid;
          object->pid = new;
          put_pid(old);

      and

          old = object->pid;
          if (old != new) {
                  object->pid = new;
                  put_pid(old);
          }

      Allowing the following to be written instead:

          ipc_update_pid(&object->pid, new);

      Which is easier to read and ensures that the pid reference count is not
      touched when the old and the new values are the same. Not touching the
      reference count in this case is important to help avoid issues like
      af_unix experienced, where multiple threads of the same process managed
      to bounce the struct pid between cpu cache lines by updating the pid's
      reference count.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
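The "only update when the value changes" behaviour of the helper described above can be mimicked in userspace as follows. This is a sketch, not the kernel source: `demo_pid`, `demo_get_pid`, `demo_put_pid` and `demo_update_pid` are hypothetical stand-ins for `struct pid`, `get_pid()`, `put_pid()` and `ipc_update_pid()`, and the assumption that the new pid gains a reference on assignment is part of the sketch.

```c
#include <stddef.h>

struct demo_pid { int refcount; };

static int demo_refcount_ops; /* counts every refcount modification */

/* Take a reference; NULL is allowed and ignored. */
static struct demo_pid *demo_get_pid(struct demo_pid *p)
{
    if (p) {
        p->refcount++;
        demo_refcount_ops++;
    }
    return p;
}

/* Drop a reference; NULL is allowed and ignored. */
static void demo_put_pid(struct demo_pid *p)
{
    if (p) {
        p->refcount--;
        demo_refcount_ops++;
    }
}

/* Replace *pos with new_pid, touching no refcount when they are equal. */
static void demo_update_pid(struct demo_pid **pos, struct demo_pid *new_pid)
{
    struct demo_pid *old = *pos;

    if (old != new_pid) {
        *pos = demo_get_pid(new_pid);
        demo_put_pid(old);
    }
}
```

Calling `demo_update_pid(&slot, pid)` repeatedly with the same `pid` leaves `demo_refcount_ops` unchanged after the first call, which is the property the commit message relies on to avoid cache line bouncing.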
| * ipc: Move IPCMNI from include/ipc.h into ipc/util.h [Eric W. Biederman, 2018-03-24, 2 files, -2/+1]

      The definition IPCMNI is only used in ipc/util.h and ipc/util.c. So
      there is no reason to keep it in a header file that the whole kernel
      can see. Move it into util.h to simplify future maintenance.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * msg: Move struct msg_queue into ipc/msg.c [Eric W. Biederman, 2018-03-24, 2 files, -18/+17]

      All of the users are now in ipc/msg.c so make the definition local to
      that file to make code maintenance easier. AKA to prevent rebuilding
      the entire kernel when struct msg_queue changes.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * shm: Move struct shmid_kernel into ipc/shm.c [Eric W. Biederman, 2018-03-24, 2 files, -22/+22]

      All of the users are now in ipc/shm.c so make the definition local to
      that file to make code maintenance easier. AKA to prevent rebuilding
      the entire kernel when struct shmid_kernel changes.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * sem: Move struct sem and struct sem_array into ipc/sem.c [Eric W. Biederman, 2018-03-22, 2 files, -39/+35]

      All of the users are now in ipc/sem.c so make the definitions local to
      that file to make code maintenance easier. AKA to prevent rebuilding
      the entire kernel when one of these files is changed.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * msg/security: Pass kern_ipc_perm not msg_queue into the msg_queue security hooks [Eric W. Biederman, 2018-03-22, 6 files, -65/+62]

      All of the implementations of security hooks that take msg_queue only
      access q_perm, the struct kern_ipc_perm member. This means the
      dependencies of the msg_queue security hooks can be simplified by
      passing the kern_ipc_perm member of msg_queue.

      Making this change will allow struct msg_queue to become private to
      ipc/msg.c.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
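The dependency reduction described above (hooks receive only the embedded permission member, so the containing structure can stay private) can be sketched like this. All names are hypothetical: `demo_ipc_perm`, `demo_msg_queue`, `demo_hook_msgctl` and `demo_msgctl` only imitate the relationship between `kern_ipc_perm`, `msg_queue` and a security hook, and the permission check itself is invented.

```c
/* Shared definition: the only type the hook implementations need to see. */
struct demo_ipc_perm {
    unsigned int uid;
    unsigned int mode;
};

/* Hook side: knows nothing about the queue structure. Returns 0 to allow. */
static int demo_hook_msgctl(const struct demo_ipc_perm *perm, unsigned int uid)
{
    return perm->uid == uid ? 0 : -1;
}

/* Queue side: this definition could live privately in a single .c file. */
struct demo_msg_queue {
    struct demo_ipc_perm q_perm;   /* embedded permission member */
    unsigned long q_qnum;          /* private detail hooks never touch */
};

/* The caller passes only the embedded member to the hook. */
static int demo_msgctl(const struct demo_msg_queue *q, unsigned int uid)
{
    return demo_hook_msgctl(&q->q_perm, uid);
}
```

With this split, changing `struct demo_msg_queue` would not force code that implements the hook to be rebuilt, which is the motivation the later "Move struct msg_queue into ipc/msg.c" commit gives.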
| * shm/security: Pass kern_ipc_perm not shmid_kernel into the shm security hooks [Eric W. Biederman, 2018-03-22, 6 files, -56/+52]

      All of the implementations of security hooks that take shmid_kernel
      only access shm_perm, the struct kern_ipc_perm member. This means the
      dependencies of the shm security hooks can be simplified by passing the
      kern_ipc_perm member of shmid_kernel.

      Making this change will allow struct shmid_kernel to become private to
      ipc/shm.c.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * sem/security: Pass kern_ipc_perm not sem_array into the sem security hooks [Eric W. Biederman, 2018-03-22, 6 files, -57/+53]

      All of the implementations of security hooks that take sem_array only
      access sem_perm, the struct kern_ipc_perm member. This means the
      dependencies of the sem security hooks can be simplified by passing the
      kern_ipc_perm member of sem_array.

      Making this change will allow struct sem and struct sem_array to become
      private to ipc/sem.c.

      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * pidns: simpler allocation of pid_* caches [Alexey Dobriyan, 2018-03-21, 1 file, -43/+24]

      Those pid_* caches are created on demand when a process advances to the
      new level of pid namespace. Which means pointers are stable, write only
      and thus can be packed into an array instead of spreading them over and
      using lists(!) to find them.

      Both first and subsequent clone/unshare(CLONE_NEWPID) become faster.

      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
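The idea of replacing a list walk with a fixed-size array indexed by namespace level can be sketched in userspace as below. The names (`demo_cache`, `demo_caches`, `demo_get_cache`) and the `DEMO_MAX_LEVEL` bound are hypothetical, and the sketch is single-threaded: it omits whatever locking the real code uses when creating a cache.

```c
#include <stdlib.h>

/* Hypothetical upper bound on namespace nesting for this sketch. */
#define DEMO_MAX_LEVEL 32

struct demo_cache {
    unsigned int level; /* nesting level this cache serves */
};

/* One slot per level; a slot is written once and then never changes. */
static struct demo_cache *demo_caches[DEMO_MAX_LEVEL];

/* Return the cache for a level (1-based), creating it on first use. */
static struct demo_cache *demo_get_cache(unsigned int level)
{
    struct demo_cache **slot;

    if (level == 0 || level > DEMO_MAX_LEVEL)
        return NULL;

    slot = &demo_caches[level - 1];
    if (!*slot) {
        struct demo_cache *c = malloc(sizeof(*c));

        if (!c)
            return NULL;
        c->level = level;
        *slot = c;
    }
    return *slot;
}
```

Lookup becomes a single array index on both the first and later calls, rather than a search through a list, which is the speed-up the commit message claims for clone/unshare(CLONE_NEWPID).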
* | Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq [Linus Torvalds, 2018-04-03, 5 files, -34/+93]
|\ \
    Pull workqueue updates from Tejun Heo:
     "rcu_work addition and a couple trivial changes"

    * 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
      workqueue: remove the comment about the old manager_arb mutex
      workqueue: fix the comments of nr_idle
      fs/aio: Use rcu_work instead of explicit rcu and work item
      cgroup: Use rcu_work instead of explicit rcu and work item
      RCU, workqueue: Implement rcu_work
| * | workqueue: remove the comment about the old manager_arb mutex [Lai Jiangshan, 2018-03-20, 1 file, -1/+0]

        The manager_arb mutex doesn't exist any more.

        Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
        Signed-off-by: Tejun Heo <tj@kernel.org>
| * | workqueue: fix the comments of nr_idle [Lai Jiangshan, 2018-03-20, 1 file, -3/+2]

        Since the worker rebinding behavior was refactored, there is no idle
        worker off the idle_list now. The comment is outdated and can be
        just removed.

        It also groups nr_workers and nr_idle together.

        Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
        Signed-off-by: Tejun Heo <tj@kernel.org>
| * | fs/aio: Use rcu_work instead of explicit rcu and work item [Tejun Heo, 2018-03-19, 1 file, -15/+6]

        Workqueue now has rcu_work. Use it instead of open-coding rcu -> work
        item bouncing.

        Signed-off-by: Tejun Heo <tj@kernel.org>
| * | cgroup: Use rcu_work instead of explicit rcu and work item [Tejun Heo, 2018-03-19, 2 files, -15/+8]

        Workqueue now has rcu_work. Use it instead of open-coding rcu -> work
        item bouncing.

        Signed-off-by: Tejun Heo <tj@kernel.org>
| * | RCU, workqueue: Implement rcu_work [Tejun Heo, 2018-03-19, 2 files, -0/+77]

        There are cases where RCU callback needs to be bounced to a sleepable
        context. This is currently done by the RCU callback queueing a work
        item, which can be cumbersome to write and confusing to read.

        This patch introduces rcu_work, a workqueue work variant which gets
        executed after a RCU grace period, and converts the open coded
        bouncing in fs/aio and kernel/cgroup.

        v3: Dropped queue_rcu_work_on(). Documented rcu grace period behavior
            after queue_rcu_work().

        v2: Use rcu_barrier() instead of synchronize_rcu() to wait for
            completion of previously queued rcu callback as per Paul.

        Signed-off-by: Tejun Heo <tj@kernel.org>
        Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
        Cc: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata [Linus Torvalds, 2018-04-03, 23 files, -135/+995]
|\ \ \
    Pull libata updates from Tejun Heo:
     "Nothing too interesting. The biggest change is refcnting fix for
      ata_host - the bug is recent and can only be triggered on controller
      hotplug, so very few are hitting it.

      There also are a number of trivial license / error message changes and
      some hardware specific changes"

    * 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (23 commits)
      ahci: imx: add the imx8qm ahci sata support
      libata: ensure host is free'd on error exit paths
      ata: ahci-platform: add reset control support
      ahci: imx: fix the build warning
      ata: add Amiga Gayle PATA controller driver
      ahci: imx: add the imx6qp ahci sata support
      ata: change Tegra124 to Tegra
      ata: ahci_tegra: Add AHCI support for Tegra210
      ata: ahci_tegra: disable DIPM
      ata: ahci_tegra: disable devslp for Tegra124
      ata: ahci_tegra: initialize regulators from soc struct
      ata: ahci_tegra: Update initialization sequence
      dt-bindings: Tegra210: add binding documentation
      libata: add refcounting to ata_host
      pata_bk3710: clarify license version and use SPDX header
      pata_falcon: clarify license version and use SPDX header
      pata_it821x: Delete an error message for a failed memory allocation in it821x_firmware_command()
      pata_macio: Delete an error message for a failed memory allocation in two functions
      pata_mpc52xx: Delete an error message for a failed memory allocation in mpc52xx_ata_probe()
      sata_dwc_460ex: Delete an error message for a failed memory allocation in sata_dwc_port_start()
      ...
| * | | ahci: imx: add the imx8qm ahci sata support [Richard Zhu, 2018-03-29, 1 file, -0/+332]

          - There are three PHY lanes on iMX8QM, which can be used in the
            following three cases:
            1. a two lanes PCIE_A, and a single lane SATA.
            2. a single lane PCIE_A, a single lane PCIE_B and a single lane
               SATA.
            3. a two lanes PCIE_A, and a single lane PCIE_B.
            The configuration of the iMX8QM AHCI SATA relies on the usage of
            the PCIE ports in cases 1 and 2. Use standalone iMX8 AHCI SATA
            probe and enable functions to enable iMX8QM AHCI SATA support.
          - To save power consumption, PHY CLKs can be gated off after the
            configurations are done.
          - The impedance ratio should be configured according to the
            different REXT values:
            0x6c <--> REXT value is 85Ohms
            0x80 (default value) <--> REXT value is 100Ohms.
            In general, the REXT value should be 85ohms in a standalone PCIE
            HW board design, and 100ohms in a standalone SATA HW board
            design. When the PCIE and the SATA are enabled simultaneously in
            the HW board design, the REXT value would be set to 85ohms.
            Configure the SATA PHY impedance ratio to 0x6c by default.

          Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
          Reviewed-by: Hans de Goede <hdegoede@redhat.com>
          Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | libata: ensure host is free'd on error exit paths [Colin Ian King, 2018-03-27, 1 file, -1/+3]

          The host structure is not being kfree'd on two error exit paths
          leading to memory leaks. Add in new err_free label and kfree host.

          Detected by CoverityScan, CID#1466103 ("Resource leak")

          Fixes: 2623c7a5f279 ("libata: add refcounting to ata_host")
          Signed-off-by: Colin Ian King <colin.king@canonical.com>
          Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: ahci-platform: add reset control support  Kunihiko Hayashi  2018-03-26  3  -3/+23
| | | | Add support to get and control a list of resets for the device
| | | | as optional and shared. These resets must be kept de-asserted until
| | | | the device is enabled.
| | | |
| | | | This is specified as shared because some SoCs like UniPhier series
| | | | have common reset controls with all ahci controller instances.
| | | |
| | | | Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
| | | | Reviewed-by: Hans de Goede <hdegoede@redhat.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ahci: imx: fix the build warning  Richard Zhu  2018-03-21  1  -0/+3
| | | | Add the default as the last entry to fix the following build warning
| | | | introduced by commit e5878732a521 ("ahci: imx: add the imx6qp ahci
| | | | sata support"):
| | | |
| | | |   drivers/ata/ahci_imx.c: In function 'imx_sata_disable':
| | | |   drivers/ata/ahci_imx.c:478:2: warning: enumeration value 'AHCI_IMX53' not handled in switch [-Wswitch]
| | | |     switch (imxpriv->type) {
| | | |     ^~~~~~
| | | |
| | | | Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
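The warning quoted above appears when a switch over an enum omits a named value and has no default label. A minimal sketch of the fix, with a hypothetical enum and function rather than the driver's own types:

```c
/* Hypothetical controller types; only the shape mirrors the warning above. */
enum demo_type {
	DEMO_IMX53,
	DEMO_IMX6Q,
	DEMO_IMX6QP,
};

/* Returns 1 if the (hypothetical) type needs an extra disable step. */
int demo_needs_disable_step(enum demo_type type)
{
	switch (type) {
	case DEMO_IMX6QP:
	case DEMO_IMX6Q:
		return 1;
	default:
		/*
		 * Covers DEMO_IMX53 and any value added later. Without
		 * this label, -Wswitch reports the unhandled enumerator.
		 */
		return 0;
	}
}
```

The trade-off is that a default label also hides the warning for enumerators added in the future, so it suits cases where "do nothing" is the right fallback.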
| * | | ata: add Amiga Gayle PATA controller driver  Bartlomiej Zolnierkiewicz  2018-03-19  3  -0/+232
| | | | Add Amiga Gayle PATA controller driver. It enables libata support
| | | | for the on-board IDE interfaces on some Amiga models (A600, A1200,
| | | | A4000 and A4000T) and also for IDE interfaces on the Zorro expansion
| | | | bus (M-Tech E-Matrix 530 expansion card).
| | | |
| | | | Thanks to John Paul Adrian Glaubitz and Michael Schmitz for help with
| | | | testing the driver.
| | | |
| | | | Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
| | | | Cc: Michael Schmitz <schmitzmic@gmail.com>
| | | | Cc: Geert Uytterhoeven <geert@linux-m68k.org>
| | | | Cc: Philippe Ombredanne <pombredanne@nexb.com>
| | | | Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
| | | | Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ahci: imx: add the imx6qp ahci sata support  Richard Zhu  2018-03-19  3  -4/+35
| | | | - Compared with imx6q ahci sata, imx6qp ahci sata has the reset
| | | |   mechanism. Add the imx6qp ahci sata support in this commit.
| | | | - Use the specific reset callback for imx53 sata, and use the
| | | |   default ahci_ops.softreset for the others.
| | | |
| | | | Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: change Tegra124 to Tegra  Preetham Ramchandra  2018-03-14  1  -2/+2
| | | | ahci_tegra driver now supports Tegra124, Tegra132 and Tegra210,
| | | | so change Tegra124 to Tegra.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: ahci_tegra: Add AHCI support for Tegra210  Preetham Ramchandra  2018-03-14  1  -1/+9
| | | | Add support for the AHCI-compliant Serial ATA host controller on
| | | | the Tegra210 system-on-chip.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: ahci_tegra: disable DIPM  Preetham Ramchandra  2018-03-14  1  -1/+1
| | | | Tegra does not support DIPM and it should be disabled.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: ahci_tegra: disable devslp for Tegra124  Preetham Ramchandra  2018-03-14  1  -0/+26
| | | | Tegra124 does not support devslp and it should be disabled.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: ahci_tegra: initialize regulators from soc struct  Preetham Ramchandra  2018-03-14  1  -10/+23
| | | | Get the regulator names to be initialized from soc structure
| | | | and initialize them.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | ata: ahci_tegra: Update initialization sequence  Preetham Ramchandra  2018-03-14  1  -64/+224
| | | | Update the controller initialization sequence and move Tegra124
| | | | specifics to tegra124_ahci_init.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | dt-bindings: Tegra210: add binding documentation  Preetham Ramchandra  2018-03-14  1  -12/+24
| | | | This adds bindings documentation for the AHCI controller on Tegra210.
| | | |
| | | | Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
| | | | Acked-by: Thierry Reding <treding@nvidia.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | libata: add refcounting to ata_host  Taras Kondratiuk  2018-03-13  4  -8/+43
| | | | After commit 9a6d6a2ddabb ("ata: make ata port as parent device of scsi
| | | | host") manual driver unbind/remove causes use-after-free.
| | | |
| | | | Unbind unconditionally invokes devres_release_all() which calls
| | | | ata_host_release() and frees ata_host/ata_port memory while it is still
| | | | being referenced as a parent of SCSI host. When SCSI host is finally
| | | | released scsi_host_dev_release() calls put_device(parent) and accesses
| | | | freed ata_port memory.
| | | |
| | | | Add reference counting to make sure that ata_host lives long enough.
| | | |
| | | | Bug report: https://lkml.org/lkml/2017/11/1/945
| | | | Fixes: 9a6d6a2ddabb ("ata: make ata port as parent device of scsi host")
| | | | Cc: Tejun Heo <tj@kernel.org>
| | | | Cc: Lin Ming <minggr@gmail.com>
| | | | Cc: linux-ide@vger.kernel.org
| | | | Cc: linux-kernel@vger.kernel.org
| | | | Signed-off-by: Taras Kondratiuk <takondra@cisco.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
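The lifetime rule described in the commit message above (the host must outlive every object that still points at it) can be sketched with a plain reference counter. This is a single-threaded user-space sketch under stated assumptions: all names are hypothetical, the `released_flag` field exists only so the demonstration can observe the free, and kernel code would use an atomic primitive such as `struct kref` rather than a bare integer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical host object; not the kernel's struct ata_host. */
struct demo_host {
	unsigned int refcount;	/* number of live references */
	int *released_flag;	/* demo only: set to 1 when the host is freed */
};

/* Create a host holding one reference, owned by the caller. */
struct demo_host *demo_host_alloc(int *released_flag)
{
	struct demo_host *host = calloc(1, sizeof(*host));

	if (!host)
		return NULL;
	host->refcount = 1;
	host->released_flag = released_flag;
	return host;
}

/* Take an extra reference, e.g. on behalf of a child object. */
void demo_host_get(struct demo_host *host)
{
	assert(host->refcount > 0);
	host->refcount++;
}

/* Drop a reference; the last put frees the host. */
void demo_host_put(struct demo_host *host)
{
	assert(host->refcount > 0);
	if (--host->refcount == 0) {
		if (host->released_flag)
			*host->released_flag = 1;
		free(host);
	}
}
```

In the scenario from the message, the child would hold a reference taken with `demo_host_get()`, so the unbind path's `demo_host_put()` no longer frees memory the child still uses; the free happens only when the child drops its own reference.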
| * | | pata_bk3710: clarify license version and use SPDX header  Bartlomiej Zolnierkiewicz  2018-03-01  1  -5/+3
| | | | - clarify license version (it should be GPL 2.0)
| | | | - use SPDX header
| | | |
| | | | Acked-by: Sekhar Nori <nsekhar@ti.com>
| | | | Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | pata_falcon: clarify license version and use SPDX header  Bartlomiej Zolnierkiewicz  2018-03-01  1  -5/+3
| | | | - clarify license version (it should be GPL 2.0)
| | | | - use SPDX header
| | | |
| | | | Cc: Michael Schmitz <schmitzmic@gmail.com>
| | | | Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
| | | | Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
| | | | Signed-off-by: Tejun Heo <tj@kernel.org>