linux.git - Linux kernel mainline tree


author	Filipe Manana <fdmanana@suse.com>	2018-02-06 20:40:31 +0000
committer	David Sterba <dsterba@suse.com>	2018-03-26 15:09:40 +0200
commit	213e8c5520ed1ecc5401a3a0f716b51a7318bda9 (patch)
tree	b13cbe9e6ed1020ecdb40f531055dbb3c3a679be /fs/btrfs/send.c
parent	ed5d5f37e653b606c93b2d5f1cdd155be6fefce0 (diff)

Btrfs: skip writeback of last page when truncating file to same size

When we truncate a file to the same size and that size is not aligned
with the sector size, we end up triggering writeback (and waiting for it
to complete) of the last page. This is unnecessary, as we cannot have
delayed allocation beyond the inode's i_size, and the goal of truncating
a file to its own size is to discard prealloc extents (allocated via the
fallocate(2) system call). Besides the unnecessary IO start and wait, it
also breaks the opportunity for larger contiguous extents on disk, as
before the last dirty page there might be other dirty pages.

This scenario is probably not very common in general, however it is
common for btrfs receive implementations, because currently the send
stream always issues a truncate operation for each processed inode as
the last operation for that inode (this truncate operation is not always
needed, and the send implementation will be changed to avoid issuing it
when it is not).

So improve this by not starting and waiting for writeback of the inode's
last page when we are truncating to exactly the same size.

The following script was used to quickly measure the time a receive
operation takes:

  $ cat test_send.sh
  #!/bin/bash

  SRC_DEV=/dev/sdc
  DST_DEV=/dev/sdd
  SRC_MNT=/mnt/sdc
  DST_MNT=/mnt/sdd

  mkfs.btrfs -f $SRC_DEV >/dev/null
  mkfs.btrfs -f $DST_DEV >/dev/null
  mount $SRC_DEV $SRC_MNT
  mount $DST_DEV $DST_MNT

  echo "Creating source filesystem"

  for ((t = 0; t < 10; t++)); do
      (
          for ((i = 1; i <= 20000; i++)); do
              xfs_io -f -c "pwrite -S 0xab 0 5000" \
                  $SRC_MNT/file_$i > /dev/null
          done
      ) &
      worker_pids[$t]=$!
  done
  wait ${worker_pids[@]}

  echo "Creating and sending snapshot"
  btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null
  /usr/bin/time -f "send took %e seconds" \
      btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1
  /usr/bin/time -f "receive took %e seconds" \
      btrfs receive -f $SRC_MNT/send_file $DST_MNT

  umount $SRC_MNT
  umount $DST_MNT

The results for 5 runs were the following:

  * Without this change

    average receive time was 26.49 seconds
    standard deviation of 2.53 seconds

  * With this change

    average receive time was 12.51 seconds
    standard deviation of 0.32 seconds

Reported-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
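
The behaviour change described in the commit message can be modelled in user space roughly as follows. This is only an illustrative sketch, not the kernel code: the function name, its parameters, and the 4096-byte sector size used in the usage note are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model (not a kernel API): reports whether
 * truncating a file to its current size would start and wait for
 * writeback of the file's last page.
 *
 * size           - current file size, which is also the truncate target
 * sectorsize     - filesystem sector size in bytes (assumed non-zero)
 * skip_writeback - true models the behaviour with the change applied
 */
bool same_size_truncate_flushes_last_page(uint64_t size,
                                          uint32_t sectorsize,
                                          bool skip_writeback)
{
    /* With the change, a same-size truncate never flushes the last page. */
    if (skip_writeback)
        return false;

    /*
     * Behaviour described for the old code: the last page is flushed
     * only when the size is not aligned with the sector size.
     */
    return (size % sectorsize) != 0;
}
```

For the 5000-byte files created by the test script and an assumed sector size of 4096 bytes, the model reports a flush without the change and none with it. A sector-aligned size such as 8192 bytes reports no flush in either case.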


