summaryrefslogtreecommitdiffstats
path: root/fs/crypto
Commit message (Collapse)AuthorAgeFilesLines
* fscrypt: fix keyring memory leak on mount failureEric Biggers2022-10-191-6/+11
| | | | | | | | | | | | | | | | | Commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") moved the keyring destruction from __put_super() to generic_shutdown_super() so that the filesystem's block device(s) are still available. Unfortunately, this causes a memory leak in the case where a mount is attempted with the test_dummy_encryption mount option, but the mount fails after the option has already been processed. To fix this, attempt the keyring destruction in both places. Reported-by: syzbot+104c2a89561289cec13e@syzkaller.appspotmail.com Fixes: d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Link: https://lore.kernel.org/r/20221011213838.209879-1-ebiggers@kernel.org
* Merge tag 'statx-dioalign-for-linus' of ↵Linus Torvalds2022-10-031-25/+24
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull STATX_DIOALIGN support from Eric Biggers: "Make statx() support reporting direct I/O (DIO) alignment information. This provides a generic interface for userspace programs to determine whether a file supports DIO, and if so with what alignment restrictions. Specifically, STATX_DIOALIGN works on block devices, and on regular files when their containing filesystem has implemented support. An interface like this has been requested for years, since the conditions for when DIO is supported in Linux have gotten increasingly complex over time. Today, DIO support and alignment requirements can be affected by various filesystem features such as multi-device support, data journalling, inline data, encryption, verity, compression, checkpoint disabling, log-structured mode, etc. Further complicating things, Linux v6.0 relaxed the traditional rule of DIO needing to be aligned to the block device's logical block size; now user buffers (but not file offsets) only need to be aligned to the DMA alignment. The approach of uplifting the XFS specific ioctl XFS_IOC_DIOINFO was discarded in favor of creating a clean new interface with statx(). For more information, see the individual commits and the man page update[1]" Link: https://lore.kernel.org/r/20220722074229.148925-1-ebiggers@kernel.org [1] * tag 'statx-dioalign-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: xfs: support STATX_DIOALIGN f2fs: support STATX_DIOALIGN f2fs: simplify f2fs_force_buffered_io() f2fs: move f2fs_force_buffered_io() into file.c ext4: support STATX_DIOALIGN fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN vfs: support STATX_DIOALIGN on block devices statx: add direct I/O alignment information
| * fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGNEric Biggers2022-09-111-25/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To prepare for STATX_DIOALIGN support, make two changes to fscrypt_dio_supported(). First, remove the filesystem-block-alignment check and make the filesystems handle it instead. It previously made sense to have it in fs/crypto/; however, to support STATX_DIOALIGN the alignment restriction would have to be returned to filesystems. It ends up being simpler if filesystems handle this part themselves, especially for f2fs which only allows fs-block-aligned DIO in the first place. Second, make fscrypt_dio_supported() work on inodes whose encryption key hasn't been set up yet, by making it set up the key if needed. This is required for statx(), since statx() doesn't require a file descriptor. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220827065851.135710-4-ebiggers@kernel.org
* | fscrypt: work on block_devices instead of request_queuesChristoph Hellwig2022-09-211-40/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | request_queues are a block layer implementation detail that should not leak into file systems. Change the fscrypt inline crypto code to retrieve block devices instead of request_queues from the file system. As part of that, clean up the interaction with multi-device file systems by returning both the number of devices and the actual device array in a single method call. Signed-off-by: Christoph Hellwig <hch@lst.de> [ebiggers: bug fixes and minor tweaks] Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220901193208.138056-4-ebiggers@kernel.org
* | fscrypt: stop holding extra request_queue referencesEric Biggers2022-09-215-60/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the fscrypt_master_key lifetime has been reworked to not be subject to the quirks of the keyrings subsystem, blk_crypto_evict_key() no longer gets called after the filesystem has already been unmounted. Therefore, there is no longer any need to hold extra references to the filesystem's request_queue(s). (And these references didn't always do their intended job anyway, as pinning a request_queue doesn't necessarily pin the corresponding blk_crypto_profile.) Stop taking these extra references. Instead, just pass the super_block to fscrypt_destroy_inline_crypt_key(), and use it to get the list of block devices the key needs to be evicted from. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220901193208.138056-3-ebiggers@kernel.org
* | fscrypt: stop using keyrings subsystem for fscrypt_master_keyEric Biggers2022-09-215-307/+349
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The approach of fs/crypto/ internally managing the fscrypt_master_key structs as the payloads of "struct key" objects contained in a "struct key" keyring has outlived its usefulness. The original idea was to simplify the code by reusing code from the keyrings subsystem. However, several issues have arisen that can't easily be resolved: - When a master key struct is destroyed, blk_crypto_evict_key() must be called on any per-mode keys embedded in it. (This started being the case when inline encryption support was added.) Yet, the keyrings subsystem can arbitrarily delay the destruction of keys, even past the time the filesystem was unmounted. Therefore, currently there is no easy way to call blk_crypto_evict_key() when a master key is destroyed. Currently, this is worked around by holding an extra reference to the filesystem's request_queue(s). But it was overlooked that the request_queue reference is *not* guaranteed to pin the corresponding blk_crypto_profile too; for device-mapper devices that support inline crypto, it doesn't. This can cause a use-after-free. - When the last inode that was using an incompletely-removed master key is evicted, the master key removal is completed by removing the key struct from the keyring. Currently this is done via key_invalidate(). Yet, key_invalidate() takes the key semaphore. This can deadlock when called from the shrinker, since in fscrypt_ioctl_add_key(), memory is allocated with GFP_KERNEL under the same semaphore. - More generally, the fact that the keyrings subsystem can arbitrarily delay the destruction of keys (via garbage collection delay, or via random processes getting temporary key references) is undesirable, as it means we can't strictly guarantee that all secrets are ever wiped. - Doing the master key lookups via the keyrings subsystem results in the key_permission LSM hook being called. fscrypt doesn't want this, as all access control for encrypted files is designed to happen via the files themselves, like any other files. The workaround which SELinux users are using is to change their SELinux policy to grant key search access to all domains. This works, but it is an odd extra step that shouldn't really have to be done. The fix for all these issues is to change the implementation to what I should have done originally: don't use the keyrings subsystem to keep track of the filesystem's fscrypt_master_key structs. Instead, just store them in a regular kernel data structure, and rework the reference counting, locking, and lifetime accordingly. Retain support for RCU-mode key lookups by using a hash table. Replace fscrypt_sb_free() with fscrypt_sb_delete(), which releases the keys synchronously and runs a bit earlier during unmount, so that block devices are still available. A side effect of this patch is that neither the master keys themselves nor the filesystem keyrings will be listed in /proc/keys anymore. ("Master key users" and the master key users keyrings will still be listed.) However, this was mostly an implementation detail, and it was intended just for debugging purposes. I don't know of anyone using it. This patch does *not* change how "master key users" (->mk_users) works; that still uses the keyrings subsystem. That is still needed for key quotas, and changing that isn't necessary to solve the issues listed above. If we decide to change that too, it would be a separate patch. I've marked this as fixing the original commit that added the fscrypt keyring, but as noted above the most important issue that this patch fixes wasn't introduced until the addition of inline encryption support. Fixes: 22d94f493bfb ("fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl") Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220901193208.138056-2-ebiggers@kernel.org
* | fscrypt: stop using PG_error to track error statusEric Biggers2022-09-061-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | As a step towards freeing the PG_error flag for other uses, change ext4 and f2fs to stop using PG_error to track decryption errors. Instead, if a decryption error occurs, just mark the whole bio as failed. The coarser granularity isn't really a problem since it isn't any worse than what the block layer provides, and errors from a multi-page readahead aren't reported to applications unless a single-page read fails too. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> # for f2fs part Link: https://lore.kernel.org/r/20220815235052.86545-2-ebiggers@kernel.org
* | fscrypt: remove fscrypt_set_test_dummy_encryption()Eric Biggers2022-08-221-13/+0
|/ | | | | | | | | Now that all its callers have been converted to fscrypt_parse_test_dummy_encryption() and fscrypt_add_test_dummy_key() instead, fscrypt_set_test_dummy_encryption() can be removed. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220513231605.175121-6-ebiggers@kernel.org
* Merge tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds2022-08-114-21/+65
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull ceph updates from Ilya Dryomov: "We have a good pile of various fixes and cleanups from Xiubo, Jeff, Luis and others, almost exclusively in the filesystem. Several patches touch files outside of our normal purview to set the stage for bringing in Jeff's long awaited ceph+fscrypt series in the near future. All of them have appropriate acks and sat in linux-next for a while" * tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits) libceph: clean up ceph_osdc_start_request prototype libceph: fix ceph_pagelist_reserve() comment typo ceph: remove useless check for the folio ceph: don't truncate file in atomic_open ceph: make f_bsize always equal to f_frsize ceph: flush the dirty caps immediatelly when quota is approaching libceph: print fsid and epoch with osd id libceph: check pointer before assigned to "c->rules[]" ceph: don't get the inline data for new creating files ceph: update the auth cap when the async create req is forwarded ceph: make change_auth_cap_ses a global symbol ceph: fix incorrect old_size length in ceph_mds_request_args ceph: switch back to testing for NULL folio->private in ceph_dirty_folio ceph: call netfs_subreq_terminated with was_async == false ceph: convert to generic_file_llseek ceph: fix the incorrect comment for the ceph_mds_caps struct ceph: don't leak snap_rwsem in handle_cap_grant ceph: prevent a client from exceeding the MDS maximum xattr size ceph: choose auth MDS for getxattr with the Xs caps ceph: add session already open notify support ...
| * fscrypt: add fscrypt_context_for_new_inodeJeff Layton2022-08-031-6/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most filesystems just call fscrypt_set_context on new inodes, which usually causes a setxattr. That's a bit late for ceph, which can send along a full set of attributes with the create request. Doing so allows it to avoid race windows that where the new inode could be seen by other clients without the crypto context attached. It also avoids the separate round trip to the server. Refactor the fscrypt code a bit to allow us to create a new crypto context, attach it to the inode, and write it to the buffer, but without calling set_context on it. ceph can later use this to marshal the context into the attributes we send along with the create request. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_sizeJeff Layton2022-08-033-15/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For ceph, we want to use our own scheme for handling filenames that are are longer than NAME_MAX after encryption and Base64 encoding. This allows us to have a consistent view of the encrypted filenames for clients that don't support fscrypt and clients that do but that don't have the key. Currently, fs/crypto only supports encrypting filenames using fscrypt_setup_filename, but that also handles encoding nokey names. Ceph can't use that because it handles nokey names in a different way. Export fscrypt_fname_encrypt. Rename fscrypt_fname_encrypted_size to __fscrypt_fname_encrypted_size and add a new wrapper called fscrypt_fname_encrypted_size that takes an inode argument rather than a pointer to a fscrypt_policy union. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | fscrypt: Add HCTR2 support for filename encryptionNathan Huckleberry2022-06-103-4/+19
|/ | | | | | | | | | | | | | | | | | HCTR2 is a tweakable, length-preserving encryption mode that is intended for use on CPUs with dedicated crypto instructions. HCTR2 has the property that a bitflip in the plaintext changes the entire ciphertext. This property fixes a known weakness with filename encryption: when two filenames in the same directory share a prefix of >= 16 bytes, with AES-CTS-CBC their encrypted filenames share a common substring, leaking information. HCTR2 does not have this problem. More information on HCTR2 can be found here: "Length-preserving encryption with HCTR2": https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* fscrypt: add new helper functions for test_dummy_encryptionEric Biggers2022-05-093-68/+112
| | | | | | | | | | | | | | | | | | | | | | | | | Unfortunately the design of fscrypt_set_test_dummy_encryption() doesn't work properly for the new mount API, as it combines too many steps into one function: - Parse the argument to test_dummy_encryption - Check the setting against the filesystem instance - Apply the setting to the filesystem instance The new mount API has split these into separate steps. ext4 partially worked around this by duplicating some of the logic, but it still had some bugs. To address this, add some new helper functions that split up the steps of fscrypt_set_test_dummy_encryption(): - fscrypt_parse_test_dummy_encryption() - fscrypt_dummy_policies_equal() - fscrypt_add_test_dummy_key() While we're add it, also add a function fscrypt_is_dummy_policy_set() which will be useful to avoid some #ifdef's. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220501050857.538984-5-ebiggers@kernel.org
* fscrypt: factor out fscrypt_policy_to_key_spec()Eric Biggers2022-05-093-17/+25
| | | | | | | | | Factor out a function that builds the fscrypt_key_specifier for an fscrypt_policy. Before this was only needed when finding the key for a file, but now it will also be needed for test_dummy_encryption support. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220501050857.538984-4-ebiggers@kernel.org
* fscrypt: log when starting to use inline encryptionEric Biggers2022-04-133-3/+36
| | | | | | | | | | | When inline encryption is used, the usual message "fscrypt: AES-256-XTS using implementation <impl>" doesn't appear in the kernel log. Add a similar message for the blk-crypto case that indicates that inline encryption was used, and whether blk-crypto-fallback was used or not. This can be useful for debugging performance problems. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220414053415.158986-1-ebiggers@kernel.org
* fscrypt: split up FS_CRYPTO_BLOCK_SIZEEric Biggers2022-04-132-7/+14
| | | | | | | | | | | | | | | | | FS_CRYPTO_BLOCK_SIZE is neither the filesystem block size nor the granularity of encryption. Rather, it defines two logically separate constraints that both arise from the block size of the AES cipher: - The alignment required for the lengths of file contents blocks - The minimum input/output length for the filenames encryption modes Since there are way too many things called the "block size", and the connection with the AES block size is not easily understood, split FS_CRYPTO_BLOCK_SIZE into two constants FSCRYPT_CONTENTS_ALIGNMENT and FSCRYPT_FNAME_MIN_MSG_LEN that more clearly describe what they are. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220405010914.18519-1-ebiggers@kernel.org
* fs: Remove ->readpages address space operationMatthew Wilcox (Oracle)2022-04-011-1/+1
| | | | | | | | | | | All filesystems have now been converted to use ->readahead, so remove the ->readpages operation and fix all the comments that used to refer to it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds2022-03-222-0/+101
|\ | | | | | | | | | | | | | | | | | | | | | | | | Pull fscrypt updates from Eric Biggers: "Add support for direct I/O on encrypted files when blk-crypto (inline encryption) is being used for file contents encryption" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: update documentation for direct I/O support f2fs: support direct I/O with fscrypt using blk-crypto ext4: support direct I/O with fscrypt using blk-crypto iomap: support direct I/O with fscrypt using blk-crypto fscrypt: add functions for direct I/O support
| * fscrypt: add functions for direct I/O supportEric Biggers2022-02-082-0/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encrypted files traditionally haven't supported DIO, due to the need to encrypt/decrypt the data. However, when the encryption is implemented using inline encryption (blk-crypto) instead of the traditional filesystem-layer encryption, it is straightforward to support DIO. In preparation for supporting this, add the following functions: - fscrypt_dio_supported() checks whether a DIO request is supported as far as encryption is concerned. Encrypted files will only support DIO when inline encryption is used and the I/O request is properly aligned; this function checks these preconditions. - fscrypt_limit_io_blocks() limits the length of a bio to avoid crossing a place in the file that a bio with an encryption context cannot cross due to a DUN discontiguity. This function is needed by filesystems that use the iomap DIO implementation (which operates directly on logical ranges, so it won't use fscrypt_mergeable_bio()) and that support FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. Co-developed-by: Satya Tangirala <satyat@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220128233940.79464-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | block: pass a block_device and opf to bio_resetChristoph Hellwig2022-02-021-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | Pass the block_device that we plan to use this bio for and the operation to bio_reset to optimize the assigment. A NULL block_device can be passed, both for the passthrough case on a raw request_queue and to temporarily avoid refactoring some nasty code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220124091107.642561-20-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | block: pass a block_device and opf to bio_allocChristoph Hellwig2022-02-021-6/+7
|/ | | | | | | | | | | | | | | Pass the block_device and operation that we plan to use this bio for to bio_alloc to optimize the assignment. NULL/0 can be passed, both for the passthrough case on a raw request_queue and to temporarily avoid refactoring some nasty code. Also move the gfp_mask argument after the nr_vecs argument for a much more logical calling convention matching what most of the kernel does. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220124091107.642561-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* fscrypt: improve a few commentsEric Biggers2021-10-252-3/+13
| | | | | | | | | Improve a few comments. These were extracted from the patch "fscrypt: add support for hardware-wrapped keys" (https://lore.kernel.org/r/20211021181608.54127-4-ebiggers@kernel.org). Link: https://lore.kernel.org/r/20211026021042.6581-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: allow 256-bit master keys with AES-256-XTSEric Biggers2021-09-223-17/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fscrypt currently requires a 512-bit master key when AES-256-XTS is used, since AES-256-XTS keys are 512-bit and fscrypt requires that the master key be at least as long any key that will be derived from it. However, this is overly strict because AES-256-XTS doesn't actually have a 512-bit security strength, but rather 256-bit. The fact that XTS takes twice the expected key size is a quirk of the XTS mode. It is sufficient to use 256 bits of entropy for AES-256-XTS, provided that it is first properly expanded into a 512-bit key, which HKDF-SHA512 does. Therefore, relax the check of the master key size to use the security strength of the derived key rather than the size of the derived key (except for v1 encryption policies, which don't use HKDF). Besides making things more flexible for userspace, this is needed in order for the use of a KDF which only takes a 256-bit key to be introduced into the fscrypt key hierarchy. This will happen with hardware-wrapped keys support, as all known hardware which supports that feature uses an SP800-108 KDF using AES-256-CMAC, so the wrapped keys are wrapped 256-bit AES keys. Moreover, there is interest in fscrypt supporting the same type of AES-256-CMAC based KDF in software as an alternative to HKDF-SHA512. There is no security problem with such features, so fix the key length check to work properly with them. Reviewed-by: Paul Crowley <paulcrowley@google.com> Link: https://lore.kernel.org/r/20210921030303.5598-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: clean up comments in bio.cEric Biggers2021-09-201-15/+17
| | | | | | | | | The file comment in bio.c is almost completely irrelevant to the actual contents of the file; it was originally copied from crypto.c. Fix it up, and also add a kerneldoc comment for fscrypt_decrypt_bio(). Link: https://lore.kernel.org/r/20210909190737.140841-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: remove fscrypt_operations::max_namelenEric Biggers2021-09-201-2/+1
| | | | | | | | | The max_namelen field is unnecessary, as it is set to 255 (NAME_MAX) on all filesystems that support fscrypt (or plan to support fscrypt). For simplicity, just use NAME_MAX directly instead. Link: https://lore.kernel.org/r/20210909184513.139281-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: align Base64 encoding with RFC 4648 base64urlEric Biggers2021-07-251-41/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fscrypt uses a Base64 encoding to encode no-key filenames (the filenames that are presented to userspace when a directory is listed without its encryption key). There are many variants of Base64, but the most common ones are specified by RFC 4648. fscrypt can't use the regular RFC 4648 "base64" variant because "base64" uses the '/' character, which isn't allowed in filenames. However, RFC 4648 also specifies a "base64url" variant for use in URLs and filenames. "base64url" is less common than "base64", but it's still implemented in many programming libraries. Unfortunately, what fscrypt actually uses is a custom Base64 variant that differs from "base64url" in several ways: - The binary data is divided into 6-bit chunks differently. - Values 62 and 63 are encoded with '+' and ',' instead of '-' and '_'. - '='-padding isn't used. This isn't a problem per se, as the padding isn't technically necessary, and RFC 4648 doesn't strictly require it. But it needs to be properly documented. There have been two attempts to copy the fscrypt Base64 code into lib/ (https://lkml.kernel.org/r/20200821182813.52570-6-jlayton@kernel.org and https://lkml.kernel.org/r/20210716110428.9727-5-hare@suse.de), and both have been caught up by the fscrypt Base64 variant being nonstandard and not properly documented. Also, the planned use of the fscrypt Base64 code in the CephFS storage back-end will prevent it from being changed later (whereas currently it can still be changed), so we need to choose an encoding that we're happy with before it's too late. Therefore, switch the fscrypt Base64 variant to base64url, in order to align more closely with RFC 4648 and other implementations and uses of Base64. However, I opted not to implement '='-padding, as '='-padding adds complexity, is unnecessary, and isn't required by the RFC. Link: https://lore.kernel.org/r/20210718000125.59701-1-ebiggers@kernel.org Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: add fscrypt_symlink_getattr() for computing st_sizeEric Biggers2021-07-251-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a helper function fscrypt_symlink_getattr() which will be called from the various filesystems' ->getattr() methods to read and decrypt the target of encrypted symlinks in order to report the correct st_size. Detailed explanation: As required by POSIX and as documented in various man pages, st_size for a symlink is supposed to be the length of the symlink target. Unfortunately, st_size has always been wrong for encrypted symlinks because st_size is populated from i_size from disk, which intentionally contains the length of the encrypted symlink target. That's slightly greater than the length of the decrypted symlink target (which is the symlink target that userspace usually sees), and usually won't match the length of the no-key encoded symlink target either. This hadn't been fixed yet because reporting the correct st_size would require reading the symlink target from disk and decrypting or encoding it, which historically has been considered too heavyweight to do in ->getattr(). Also historically, the wrong st_size had only broken a test (LTP lstat03) and there were no known complaints from real users. (This is probably because the st_size of symlinks isn't used too often, and when it is, typically it's for a hint for what buffer size to pass to readlink() -- which a slightly-too-large size still works for.) However, a couple things have changed now. First, there have recently been complaints about the current behavior from real users: - Breakage in rpmbuild: https://github.com/rpm-software-management/rpm/issues/1682 https://github.com/google/fscrypt/issues/305 - Breakage in toybox cpio: https://www.mail-archive.com/toybox@lists.landley.net/msg07193.html - Breakage in libgit2: https://issuetracker.google.com/issues/189629152 (on Android public issue tracker, requires login) Second, we now cache decrypted symlink targets in ->i_link. Therefore, taking the performance hit of reading and decrypting the symlink target in ->getattr() wouldn't be as big a deal as it used to be, since usually it will just save having to do the same thing later. Also note that eCryptfs ended up having to read and decrypt symlink targets in ->getattr() as well, to fix this same issue; see commit 3a60a1686f0d ("eCryptfs: Decrypt symlink target for stat size"). So, let's just bite the bullet, and read and decrypt the symlink target in ->getattr() in order to report the correct st_size. Add a function fscrypt_symlink_getattr() which the filesystems will call to do this. (Alternatively, we could store the decrypted size of symlinks on-disk. But there isn't a great place to do so, and encryption is meant to hide the original size to some extent; that property would be lost.) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210702065350.209646-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: fix derivation of SipHash keys on big endian CPUsEric Biggers2021-06-051-8/+32
| | | | | | | | | | | | | | | | | | | | | | | | | Typically, the cryptographic APIs that fscrypt uses take keys as byte arrays, which avoids endianness issues. However, siphash_key_t is an exception. It is defined as 'u64 key[2];', i.e. the 128-bit key is expected to be given directly as two 64-bit words in CPU endianness. fscrypt_derive_dirhash_key() and fscrypt_setup_iv_ino_lblk_32_key() forgot to take this into account. Therefore, the SipHash keys used to index encrypted+casefolded directories differ on big endian vs. little endian platforms, as do the SipHash keys used to hash inode numbers for IV_INO_LBLK_32-encrypted directories. This makes such directories non-portable between these platforms. Fix this by always using the little endian order. This is a breaking change for big endian platforms, but this should be fine in practice since these features (encrypt+casefold support, and the IV_INO_LBLK_32 flag) aren't known to actually be used on any big endian platforms yet. Fixes: aa408f835d02 ("fscrypt: derive dirhash key for casefolded directories") Fixes: e3b1078bedd3 ("fscrypt: add support for IV_INO_LBLK_32 policies") Cc: <stable@vger.kernel.org> # v5.6+ Link: https://lore.kernel.org/r/20210605075033.54424-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: don't ignore minor_hash when hash is 0Eric Biggers2021-06-051-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When initializing a no-key name, fscrypt_fname_disk_to_usr() sets the minor_hash to 0 if the (major) hash is 0. This doesn't make sense because 0 is a valid hash code, so we shouldn't ignore the filesystem-provided minor_hash in that case. Fix this by removing the special case for 'hash == 0'. This is an old bug that appears to have originated when the encryption code in ext4 and f2fs was moved into fs/crypto/. The original ext4 and f2fs code passed the hash by pointer instead of by value. So 'if (hash)' actually made sense then, as it was checking whether a pointer was NULL. But now the hashes are passed by value, and filesystems just pass 0 for any hashes they don't have. There is no need to handle this any differently from the hashes actually being 0. It is difficult to reproduce this bug, as it only made a difference in the case where a filename's 32-bit major hash happened to be 0. However, it probably had the largest chance of causing problems on ubifs, since ubifs uses minor_hash to do lookups of no-key names, in addition to using it as a readdir cookie. ext4 only uses minor_hash as a readdir cookie, and f2fs doesn't use minor_hash at all. Fixes: 0b81d0779072 ("fs crypto: move per-file encryption from f2fs tree to fs/crypto") Cc: <stable@vger.kernel.org> # v4.6+ Link: https://lore.kernel.org/r/20210527235236.2376556-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* Merge branch 'linus' of ↵Linus Torvalds2021-04-261-8/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - crypto_destroy_tfm now ignores errors as well as NULL pointers Algorithms: - Add explicit curve IDs in ECDH algorithm names - Add NIST P384 curve parameters - Add ECDSA Drivers: - Add support for Green Sardine in ccp - Add ecdh/curve25519 to hisilicon/hpre - Add support for AM64 in sa2ul" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits) fsverity: relax build time dependency on CRYPTO_SHA256 fscrypt: relax Kconfig dependencies for crypto API algorithms crypto: camellia - drop duplicate "depends on CRYPTO" crypto: s5p-sss - consistently use local 'dev' variable in probe() crypto: s5p-sss - remove unneeded local variable initialization crypto: s5p-sss - simplify getting of_device_id match data ccp: ccp - add support for Green Sardine crypto: ccp - Make ccp_dev_suspend and ccp_dev_resume void functions crypto: octeontx2 - add support for OcteonTX2 98xx CPT block. crypto: chelsio/chcr - Remove useless MODULE_VERSION crypto: ux500/cryp - Remove duplicate argument crypto: chelsio - remove unused function crypto: sa2ul - Add support for AM64 crypto: sa2ul - Support for per channel coherency dt-bindings: crypto: ti,sa2ul: Add new compatible for AM64 crypto: hisilicon - enable new error types for QM crypto: hisilicon - add new error type for SEC crypto: hisilicon - support new error types for ZIP crypto: hisilicon - dynamic configuration 'err_info' crypto: doc - fix kernel-doc notation in chacha.c and af_alg.c ...
| * fscrypt: relax Kconfig dependencies for crypto API algorithmsArd Biesheuvel2021-04-221-8/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if FS encryption has strict functional dependencies on various crypto algorithms and chaining modes. those dependencies could potentially be satisified by other implementations than the generic ones, and no link time dependency exists on the 'depends on' claused defined by CONFIG_FS_ENCRYPTION_ALGS. So let's relax these clauses to 'imply', so that the default behavior is still to pull in those generic algorithms, but in a way that permits them to be disabled again in Kconfig. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | block: rename BIO_MAX_PAGES to BIO_MAX_VECSChristoph Hellwig2021-03-111-3/+3
|/ | | | | | | | | | | | Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop confusing users of the bio API. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* inode: make init and permission helpers idmapped mount awareChristian Brauner2021-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The inode_owner_or_capable() helper determines whether the caller is the owner of the inode or is capable with respect to that inode. Allow it to handle idmapped mounts. If the inode is accessed through an idmapped mount it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Similarly, allow the inode_init_owner() helper to handle idmapped mounts. It initializes a new inode on idmapped mounts by mapping the fsuid and fsgid of the caller from the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
* Merge tag 'f2fs-for-5.11-rc1' of ↵Linus Torvalds2020-12-173-6/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've made more work into per-file compression support. For example, F2FS_IOC_GET | SET_COMPRESS_OPTION provides a way to change the algorithm or cluster size per file. F2FS_IOC_COMPRESS | DECOMPRESS_FILE provides a way to compress and decompress the existing normal files manually. There is also a new mount option, compress_mode=fs|user, which can control who compresses the data. Chao also added a checksum feature with a mount option so that we are able to detect any corrupted cluster. In addition, Daniel contributed casefolding with encryption patch, which will be used for Android devices. Summary: Enhancements: - add ioctls and mount option to manage per-file compression feature - support casefolding with encryption - support checksum for compressed cluster - avoid IO starvation by replacing mutex with rwsem - add sysfs, max_io_bytes, to control max bio size Bug fixes: - fix use-after-free issue when compression and fsverity are enabled - fix consistency corruption during fault injection test - fix data offset for lseek - get rid of buffer_head which has 32bits limit in fiemap - fix some bugs in multi-partitions support - fix nat entry count calculation in shrinker - fix some stat information And, we've refactored some logics and fix minor bugs as well" * tag 'f2fs-for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits) f2fs: compress: fix compression chksum f2fs: fix shift-out-of-bounds in sanity_check_raw_super() f2fs: fix race of pending_pages in decompression f2fs: fix to account inline xattr correctly during recovery f2fs: inline: fix wrong inline inode stat f2fs: inline: correct comment in f2fs_recover_inline_data f2fs: don't check PAGE_SIZE again in sanity_check_raw_super() f2fs: convert to F2FS_*_INO macro f2fs: introduce max_io_bytes, a sysfs entry, to limit bio size f2fs: don't allow any writes on readonly mount f2fs: avoid race condition for shrinker count f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE f2fs: add compress_mode mount option f2fs: Remove unnecessary unlikely() f2fs: init dirty_secmap incorrectly f2fs: remove buffer_head which has 32bits limit f2fs: fix wrong block count instead of bytes f2fs: use new conversion functions between blks and bytes f2fs: rename logical_to_blk and blk_to_logical f2fs: fix kbytes written stat for multi-device case ...
| * fscrypt: Have filesystems handle their d_opsDaniel Rosenberg2020-12-023-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This shifts the responsibility of setting up dentry operations from fscrypt to the individual filesystems, allowing them to have their own operations while still setting fscrypt's d_revalidate as appropriate. Most filesystems can just use generic_set_encrypted_ci_d_ops, unless they have their own specific dentry operations as well. That operation will set the minimal d_ops required under the circumstances. Since the fscrypt d_ops are set later on, we must set all d_ops there, since we cannot adjust those later on. This should not result in any change in behavior. Signed-off-by: Daniel Rosenberg <drosen@google.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* | Merge branch 'linus' of ↵Linus Torvalds2020-12-142-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Add speed testing on 1420-byte blocks for networking Algorithms: - Improve performance of chacha on ARM for network packets - Improve performance of aegis128 on ARM for network packets Drivers: - Add support for Keem Bay OCS AES/SM4 - Add support for QAT 4xxx devices - Enable crypto-engine retry mechanism in caam - Enable support for crypto engine on sdm845 in qce - Add HiSilicon PRNG driver support" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (161 commits) crypto: qat - add capability detection logic in qat_4xxx crypto: qat - add AES-XTS support for QAT GEN4 devices crypto: qat - add AES-CTR support for QAT GEN4 devices crypto: atmel-i2c - select CONFIG_BITREVERSE crypto: hisilicon/trng - replace atomic_add_return() crypto: keembay - Add support for Keem Bay OCS AES/SM4 dt-bindings: Add Keem Bay OCS AES bindings crypto: aegis128 - avoid spurious references crypto_aegis128_update_simd crypto: seed - remove trailing semicolon in macro definition crypto: x86/poly1305 - Use TEST %reg,%reg instead of CMP $0,%reg crypto: x86/sha512 - Use TEST %reg,%reg instead of CMP $0,%reg crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg crypto: cpt - Fix sparse warnings in cptpf hwrng: ks-sa - Add dependency on IOMEM and OF crypto: lib/blake2s - Move selftest prototype into header file crypto: arm/aes-ce - work around Cortex-A57/A72 silion errata crypto: ecdh - avoid unaligned accesses in ecdh_set_secret() crypto: ccree - rework cache parameters handling crypto: cavium - Use dma_set_mask_and_coherent to simplify code crypto: marvell/octeontx - Use dma_set_mask_and_coherent to simplify code ...
| * | crypto: sha - split sha.h into sha1.h and sha2.hEric Biggers2020-11-202-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2, and <crypto/sha3.h> contains declarations for SHA-3. This organization is inconsistent, but more importantly SHA-1 is no longer considered to be cryptographically secure. So to the extent possible, SHA-1 shouldn't be grouped together with any of the other SHA versions, and usage of it should be phased out. Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and <crypto/sha2.h>, and make everyone explicitly specify whether they want the declarations for SHA-1, SHA-2, or both. This avoids making the SHA-1 declarations visible to files that don't want anything to do with SHA-1. It also prepares for potentially moving sha1.h into a new insecure/ or dangerous/ directory. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | fscrypt: allow deleting files with unsupported encryption policyEric Biggers2020-12-025-16/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently it's impossible to delete files that use an unsupported encryption policy, as the kernel will just return an error when performing any operation on the top-level encrypted directory, even just a path lookup into the directory or opening the directory for readdir. More specifically, this occurs in any of the following cases: - The encryption context has an unrecognized version number. Current kernels know about v1 and v2, but there could be more versions in the future. - The encryption context has unrecognized encryption modes (FSCRYPT_MODE_*) or flags (FSCRYPT_POLICY_FLAG_*), an unrecognized combination of modes, or reserved bits set. - The encryption key has been added and the encryption modes are recognized but aren't available in the crypto API -- for example, a directory is encrypted with FSCRYPT_MODE_ADIANTUM but the kernel doesn't have CONFIG_CRYPTO_ADIANTUM enabled. It's desirable to return errors for most operations on files that use an unsupported encryption policy, but the current behavior is too strict. We need to allow enough to delete files, so that people can't be stuck with undeletable files when downgrading kernel versions. That includes allowing directories to be listed and allowing dentries to be looked up. Fix this by modifying the key setup logic to treat an unsupported encryption policy in the same way as "key unavailable" in the cases that are required for a recursive delete to work: preparing for a readdir or a dentry lookup, revalidating a dentry, or checking whether an inode has the same encryption policy as its parent directory. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-10-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: unexport fscrypt_get_encryption_info()Eric Biggers2020-12-022-1/+2
| | | | | | | | | | | | | | | | | | | | | | Now that fscrypt_get_encryption_info() is only called from files in fs/crypto/ (due to all key setup now being handled by higher-level helper functions instead of directly by filesystems), unexport it and move its declaration to fscrypt_private.h. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-9-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: move fscrypt_require_key() to fscrypt_private.hEric Biggers2020-12-021-0/+26
| | | | | | | | | | | | | | | | | | | | fscrypt_require_key() is now only used by files in fs/crypto/. So reduce its visibility to fscrypt_private.h. This is also a prerequsite for unexporting fscrypt_get_encryption_info(). Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-8-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: move body of fscrypt_prepare_setattr() out-of-lineEric Biggers2020-12-021-0/+8
| | | | | | | | | | | | | | | | | | | | In preparation for reducing the visibility of fscrypt_require_key() by moving it to fscrypt_private.h, move the call to it from fscrypt_prepare_setattr() to an out-of-line function. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-7-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: introduce fscrypt_prepare_readdir()Eric Biggers2020-12-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last remaining use of fscrypt_get_encryption_info() from filesystems is for readdir (->iterate_shared()). Every other call is now in fs/crypto/ as part of some other higher-level operation. We need to add a new argument to fscrypt_get_encryption_info() to indicate whether the encryption policy is allowed to be unrecognized or not. Doing this is easier if we can work with high-level operations rather than direct filesystem use of fscrypt_get_encryption_info(). So add a function fscrypt_prepare_readdir() which wraps the call to fscrypt_get_encryption_info() for the readdir use case. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: simplify master key lockingEric Biggers2020-11-244-34/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stated reasons for separating fscrypt_master_key::mk_secret_sem from the standard semaphore contained in every 'struct key' no longer apply. First, due to commit a992b20cd4ee ("fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()"), fscrypt_get_encryption_info() is no longer called from within a filesystem transaction. Second, due to commit d3ec10aa9581 ("KEYS: Don't write out to userspace while holding key semaphore"), the semaphore for the "keyring" key type no longer ranks above page faults. That leaves performance as the only possible reason to keep the separate mk_secret_sem. Specifically, having mk_secret_sem reduces the contention between setup_file_encryption_key() and FS_IOC_{ADD,REMOVE}_ENCRYPTION_KEY. However, these ioctls aren't executed often, so this doesn't seem to be worth the extra complexity. Therefore, simplify the locking design by just using key->sem instead of mk_secret_sem. Link: https://lore.kernel.org/r/20201117032626.320275-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: remove unnecessary calls to fscrypt_require_key()Eric Biggers2020-11-241-18/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an encrypted directory, a regular dentry (one that doesn't have the no-key name flag) can only be created if the directory's encryption key is available. Therefore the calls to fscrypt_require_key() in __fscrypt_prepare_link() and __fscrypt_prepare_rename() are unnecessary, as these functions already check that the dentries they're given aren't no-key names. Remove these unnecessary calls to fscrypt_require_key(). Link: https://lore.kernel.org/r/20201118075609.120337-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: add fscrypt_is_nokey_name()Eric Biggers2020-11-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the encryption key. Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or sys_symlink()) can lookup the target filename while the directory's encryption key hasn't been added yet, resulting in a negative no-key dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or ->symlink()) because the dentry is negative. Normally, ->create() would return -ENOKEY due to the directory's key being unavailable. However, if the key was added between the dentry lookup and ->create(), then the filesystem will go ahead and try to create the file. If the target filename happens to already exist as a normal name (not a no-key name), a duplicate filename may be added to the directory. In order to fix this, we need to fix the filesystems to prevent ->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names. (->rename() and ->link() need it too, but those are already handled correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().) In preparation for this, add a helper function fscrypt_is_nokey_name() that filesystems can use to do this check. Use this helper function for the existing checks that fs/crypto/ does for rename and link. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: remove kernel-internal constants from UAPI headerEric Biggers2020-11-164-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There isn't really any valid reason to use __FSCRYPT_MODE_MAX or FSCRYPT_POLICY_FLAGS_VALID in a userspace program. These constants are only meant to be used by the kernel internally, and they are defined in the UAPI header next to the mode numbers and flags only so that kernel developers don't forget to update them when adding new modes or flags. In https://lkml.kernel.org/r/20201005074133.1958633-2-satyat@google.com there was an example of someone wanting to use __FSCRYPT_MODE_MAX in a user program, and it was wrong because the program would have broken if __FSCRYPT_MODE_MAX were ever increased. So having this definition available is harmful. FSCRYPT_POLICY_FLAGS_VALID has the same problem. So, remove these definitions from the UAPI header. Replace FSCRYPT_POLICY_FLAGS_VALID with just listing the valid flags explicitly in the one kernel function that needs it. Move __FSCRYPT_MODE_MAX to fscrypt_private.h, remove the double underscores (which were only present to discourage use by userspace), and add a BUILD_BUG_ON() and comments to (hopefully) ensure it is kept in sync. Keep the old name FS_POLICY_FLAGS_VALID, since it's been around for longer and there's a greater chance that removing it would break source compatibility with some program. Indeed, mtd-utils is using it in an #ifdef, and removing it would introduce compiler warnings (about FS_POLICY_FLAGS_PAD_* being redefined) into the mtd-utils build. However, reduce its value to 0x07 so that it only includes the flags with old names (the ones present before Linux 5.4), and try to make it clear that it's now "frozen" and no new flags should be added to it. Fixes: 2336d0deb2d4 ("fscrypt: use FSCRYPT_ prefix for uapi constants") Cc: <stable@vger.kernel.org> # v5.4+ Link: https://lore.kernel.org/r/20201024005132.495952-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: fix inline encryption not used on new filesEric Biggers2020-11-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new helper function fscrypt_prepare_new_inode() runs before S_ENCRYPTED has been set on the new inode. This accidentally made fscrypt_select_encryption_impl() never enable inline encryption on newly created files, due to its use of fscrypt_needs_contents_encryption() which only returns true when S_ENCRYPTED is set. Fix this by using S_ISREG() directly instead of fscrypt_needs_contents_encryption(), analogous to what select_encryption_mode() does. I didn't notice this earlier because by design, the user-visible behavior is the same (other than performance, potentially) regardless of whether inline encryption is used or not. Fixes: a992b20cd4ee ("fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()") Reviewed-by: Satya Tangirala <satyat@google.com> Link: https://lore.kernel.org/r/20201111015224.303073-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | fscrypt: remove reachable WARN in fscrypt_setup_iv_ino_lblk_32_key()Eric Biggers2020-11-061-3/+1
|/ | | | | | | | | | | | | | | | | I_CREATING isn't actually set until the inode has been assigned an inode number and inserted into the inode hash table. So the WARN_ON() in fscrypt_setup_iv_ino_lblk_32_key() is wrong, and it can trigger when creating an encrypted file on ext4. Remove it. This was sometimes causing xfstest generic/602 to fail on ext4. I didn't notice it before because due to a separate oversight, new inodes that haven't been assigned an inode number yet don't necessarily have i_ino == 0 as I had thought, so by chance I never saw the test fail. Fixes: a992b20cd4ee ("fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()") Reported-by: Theodore Y. Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20201031004556.87862-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: export fscrypt_d_revalidate()Eric Biggers2020-09-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Dentries that represent no-key names must have a dentry_operations that includes fscrypt_d_revalidate(). Currently, this is handled by fscrypt_prepare_lookup() installing fscrypt_d_ops. However, ceph support for encryption (https://lore.kernel.org/r/20200914191707.380444-1-jlayton@kernel.org) can't use fscrypt_d_ops, since ceph already has its own dentry_operations. Similarly, ext4 and f2fs support for directories that are both encrypted and casefolded (https://lore.kernel.org/r/20200923010151.69506-1-drosen@google.com) can't use fscrypt_d_ops either, since casefolding requires some dentry operations too. To satisfy both users, we need to move the responsibility of installing the dentry_operations to filesystems. In preparation for this, export fscrypt_d_revalidate() and give it a !CONFIG_FS_ENCRYPTION stub. Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200924054721.187797-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* fscrypt: rename DCACHE_ENCRYPTED_NAME to DCACHE_NOKEY_NAMEEric Biggers2020-09-232-5/+4
| | | | | | | | | | | | | | | | | Originally we used the term "encrypted name" or "ciphertext name" to mean the encoded filename that is shown when an encrypted directory is listed without its key. But these terms are ambiguous since they also mean the filename stored on-disk. "Encrypted name" is especially ambiguous since it could also be understood to mean "this filename is encrypted on-disk", similar to "encrypted file". So we've started calling these encoded names "no-key names" instead. Therefore, rename DCACHE_ENCRYPTED_NAME to DCACHE_NOKEY_NAME to avoid confusion about what this flag means. Link: https://lore.kernel.org/r/20200924042624.98439-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>