linux.git - Linux kernel mainline tree

	Commit message	Author	Age	Files	Lines
*	pipe: add documentation and comments	Jens Axboe	2007-07-10	1	-0/+4
	As per Andrew Morton's request, here's a set of documentation for the generic pipe_buf_operations hooks and the pipe and pipe_buffer structures. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	pipe: change the ->pin() operation to ->confirm()	Jens Axboe	2007-07-10	1	-7/+7
	The name 'pin' was badly chosen: it doesn't pin a pipe buffer in the most commonly used sense in the kernel. So change the name to 'confirm', after debating this issue with Hugh Dickins a bit. A good return from ->confirm() means that the buffer is really there, and that the contents are good. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: completely document external interface with kerneldoc	Jens Axboe	2007-07-10	1	-24/+85
	Also add fs/splice.c as a kerneldoc target with a smaller blurb that should be expanded to better explain the overview of splice. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	pipe: allow passing around of ops private pointer	Jens Axboe	2007-07-10	1	-0/+1
	relay needs this for proper consumption handling, and the network receive support needs it as well to look up the sk_buff on pipe release. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: divorce the splice structure/function definitions from the pipe header	Jens Axboe	2007-07-10	1	-21/+5
	We need to move even more stuff into the header so that folks can use the splice_to_pipe() implementation instead of open-coding a lot of pipe knowledge (see relay implementation), so move to our own header file finally. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	vmsplice: add vmsplice-to-user support	Jens Axboe	2007-07-10	1	-28/+150
	A bit of a cheat: it actually just copies the data to userspace. But this makes the interface nice and symmetric and enables people to build on splice, with room for future improvement in performance. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: abstract out actor data	Jens Axboe	2007-07-10	1	-29/+70
	For direct splicing (or private splicing), the output may not be a file. So abstract out the handling into a specified actor function and put the data in the splice_desc structure earlier, so we can build on top of that. This is the first step in better splice handling for drivers, and also for implementing vmsplice _to_ user memory. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: only check do_wakeup in splice_to_pipe() for a real pipe	Jens Axboe	2007-06-15	1	-6/+7
	We only ever set do_wakeup to non-zero if the pipe has an inode backing, so it's pointless to check outside the pipe->inode check. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: fix leak of pages on short splice to pipe	Jens Axboe	2007-06-15	1	-1/+2
	If the destination pipe is full and we already transferred data, we break out instead of waiting for more pipe room. The exit logic looks at spd->nr_pages to see if we moved everything inside the spd container, but we decrement that variable in the loop to decide when spd has emptied. Instead we want to compare to the original page count in the spd, so cache that in a local variable. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: adjust balance_dirty_pages_ratelimited() call	Jens Axboe	2007-06-15	1	-2/+8
	As we have potentially dirtied more than 1 page, we should indicate that to the dirty page balancing. So call balance_dirty_pages_ratelimited_nr() and pass in the approximate number of pages we dirtied. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: __generic_file_splice_read: fix read/truncate race	Jens Axboe	2007-06-08	1	-23/+23
	Original patch and description from Neil Brown <neilb@suse.de>, merged and adapted to splice branch by me. Neil's text follows: __generic_file_splice_read() currently samples the i_size at the start and doesn't do so again unless it needs to call ->readpage to load a page. After ->readpage it has to re-sample i_size as a truncate may have caused that page to be filled with zeros, and the read() call should not see these. However there are other activities that might cause ->readpage to be called on a page between the time that __generic_file_splice_read() samples i_size and when it finds that it has an uptodate page. These include at least read-ahead and possibly another thread performing a read. So we must sample i_size after it has an uptodate page. Thus the current sampling at the start and after a read can be replaced with a sampling before page addition into spd. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: __generic_file_splice_read: fix i_size_read() length checks	Hugh Dickins	2007-06-08	1	-8/+10
	__generic_file_splice_read's partial page check, at eof after readpage, not only got its calculations wrong, but also reused the loff variable: causing data corruption when splicing from a non-0 offset in the file's last page (revealed by ext2 -b 1024 testing on a loop of a tmpfs file). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: move balance_dirty_pages_ratelimited() outside of splice actor	Jens Axboe	2007-06-08	1	-1/+2
	I've seen inode-related deadlocks, so move this call outside of the actor itself, which may hold the inode lock. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: remove do_splice_direct() symbol export	Jens Axboe	2007-06-08	1	-2/+0
	It's only supposed to be used by do_sendfile(), which is never modular. So kill the export. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	splice: move inode size check into generic_file_splice_read()	Jens Axboe	2007-06-08	1	-10/+9
	Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] splice: always call into page_cache_readahead()	Jens Axboe	2007-05-08	1	-5/+3
	Don't try to guess what the read-ahead logic will do, allow it to make its own decisions. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] splice(): fix interaction with readahead	Fengguang Wu	2007-05-08	1	-4/+5
	Eric Dumazet, thank you for disclosing this bug. Readahead logic somehow fails to populate the page range with data. It can be because:
	1) the readahead routine is not always called in the following lines of fs/splice.c:
		if (!loff || nr_pages > 1)
			page_cache_readahead(mapping, &in->f_ra, in, index, nr_pages);
	2) even when called, page_cache_readahead() won't guarantee the pages are there. It won't submit readahead I/O for pages already in the radix tree, or when (ra_pages == 0), or after 256 cache hits.
	In your case, it should be because of the retried reads, which lead to excessive cache hits and disable readahead at some point. And that _one_ failure of readahead blocks the whole read process. The application receives EAGAIN and retries the read, but __generic_file_splice_read() refuses to make progress:
	- in the previous invocation, it has allocated a blank page and inserted it into the radix tree, but never has the chance to start I/O for it: the test of SPLICE_F_NONBLOCK goes before that.
	- in the retried invocation, the readahead code will neither get out of the cache hit mode, nor will it submit I/O for an already existing page.
	Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] splice: partial write fix	Dmitriy Monakhov	2007-03-29	1	-9/+16
	Currently, if a partial write happens during ->commit_write(), the page isn't marked as accessed and rebalanced. Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Export __splice_from_pipe()	Mark Fasheh	2007-03-27	1	-3/+4
	Ocfs2 wants to implement its own splice write actor so that it can better manage cluster / page locks. This lets us re-use the rest of splice write while only providing our own code where it's actually important. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	2/2 splice: dont readpage	Nick Piggin	2007-03-27	1	-30/+0
	Splice does not need to readpage to bring the page uptodate before writing to it, because prepare_write will take care of that for us. Splice is also wrong to SetPageUptodate before the page is actually uptodate. This results in the old uninitialised memory leak. This gets fixed as a matter of course when removing the readpage logic. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	1/2 splice: dont steal	Nick Piggin	2007-03-27	1	-63/+38
	Stealing pages with splice is problematic because we cannot just insert an uptodate page into the pagecache and hope the filesystem can take care of it later. We also cannot just ClearPageUptodate, then hope prepare_write does not write anything into the page, because I don't think prepare_write gives that guarantee. Remove support for SPLICE_F_MOVE for now. If we really want to bring it back, we might be able to do so with the new filesystem buffered write aops APIs I'm working on. If we really don't want to bring it back, then we should decide that sooner rather than later, and remove the flag and all the stealing infrastructure before anybody starts using it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] constify pipe_buf_operations	Eric Dumazet	2006-12-13	1	-4/+4
	- pipe/splice should use const pipe_buf_operations and file_operations
	- struct pipe_inode_info has an unused field "start": get rid of it.
	Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] VFS: change struct file to use struct path	Josef "Jeff" Sipek	2006-12-08	1	-9/+9
	This patch changes struct file to use struct path instead of having independent pointers to struct dentry and struct vfsmount, and converts all users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}. Additionally, it adds two #define's to make the transition easier for users of the f_dentry and f_vfsmnt. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] splice: fix problem introduced with inode diet	Jens Axboe	2006-11-04	1	-6/+20
	After the inode slimming patch that unionised i_pipe/i_bdev/i_cdev, it's no longer enough to check for existence of ->i_pipe to verify that this is a pipe. Original patch from Eric Dumazet <dada1@cosmosbay.com>. Final solution suggested by Linus. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] mm: clean up pagecache allocation	Nick Piggin	2006-10-28	1	-5/+4
	- Consolidate page_cache_alloc
	- Fix splice: only the pagecache pages and filesystem data need to use mapping_gfp_mask.
	- Fix grab_cache_page_nowait: same as splice, also honour NUMA placement.
	Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] Remove SUID when splicing into an inode	Jens Axboe	2006-10-19	1	-4/+15
	Originally from Mark Fasheh <mark.fasheh@oracle.com>. generic_file_splice_write() does not remove S_ISUID or S_ISGID. This is inconsistent with the way we generally write to files. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] Introduce generic_file_splice_write_nolock()	Mark Fasheh	2006-10-19	1	-14/+66
	This allows file systems to manage their own i_mutex locking while still re-using the generic_file_splice_write() logic. OCFS2 in particular wants this so that it can order cluster locks within i_mutex. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] Take i_mutex in splice_from_pipe()	Mark Fasheh	2006-10-19	1	-13/+11
	The splice_actor may be calling ->prepare_write() and ->commit_write(). We want i_mutex on the inode being written to before calling those so that we don't race i_size changes. The double locking behavior is done elsewhere in splice.c, and if we eventually want _nolock variants of generic_file_splice_write(), fs modules might have to replicate the nasty locking code. We introduce inode_double_lock() and inode_double_unlock() to consolidate the locking rules into one set of functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] splice: fix pipe_to_file() ->prepare_write() error path	Jens Axboe	2006-10-12	1	-3/+3
	Don't jump to the unlock+release path, we already did that. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	[PATCH] Update axboe@suse.de email address	Jens Axboe	2006-09-30	1	-1/+1
	As people often look for the copyright in files to see who to mail, update the link to a neutral one. Signed-off-by: Jens Axboe <axboe@kernel.dk>
*	[PATCH] splice: fix problems with sys_tee()	Jens Axboe	2006-07-10	1	-105/+133
	Several issues noticed/fixed:
	- We cannot reliably block in link_pipe() while holding both input and output mutexes. So do preparatory checks before locking down both mutexes and doing the link.
	- The ipipe->nrbufs vs i check was bad, because we could have dropped the ipipe lock in-between. This causes us to potentially look at unknown buffers if we were racing with someone else reading this pipe.
	Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: retrieve mapping after locking the page	Jens Axboe	2006-06-23	1	-17/+29
	Otherwise we could be racing with truncate/mapping removal. Problem found/fixed by Nick Piggin <npiggin@suse.de>, logic rewritten by me. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: redo page lookup if add_to_page_cache() returns -EEXIST	Jens Axboe	2006-05-04	1	-0/+2
	This can happen quite easily, if several processes are trying to splice the same file at the same time. It's not a failure, it just means someone raced with us in allocating this file page. So just dump the allocated page and relookup the original. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: rename remaining info variables to pipe	Jens Axboe	2006-05-04	1	-10/+10
	Same thing was done in fs/pipe.c and most of fs/splice.c, but we had a few missing still. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: LRU fixups	Jens Axboe	2006-05-04	1	-22/+11
	Nick says that the current construct isn't safe. This goes back to the original, but sets PIPE_BUF_FLAG_LRU on user pages as well as they all seem to be on the LRU in the first place. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: fix unlocking of page on error ->prepare_write()	Jens Axboe	2006-05-04	1	-3/+16
	Looking at generic_file_buffered_write(), we need to unlock_page() if ->prepare_write() fails and it isn't due to racing with truncate(). Also trim the size if ->prepare_write() fails, if we have to. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] vmsplice: restrict stealing a little more	Jens Axboe	2006-05-02	1	-1/+1
	Apply the same rules as the anon pipe pages, only allow stealing if no one else is using the page. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: fix page LRU accounting	Jens Axboe	2006-05-02	1	-10/+21
	Currently we rely on the PIPE_BUF_FLAG_LRU flag being set correctly to know whether we need to fiddle with page LRU state after stealing it. However, for some origins we just don't know if the page is on the LRU list or not. So remove PIPE_BUF_FLAG_LRU and do this check/add manually in pipe_to_file() instead. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] vmsplice: fix badly placed end paranthesis	Jens Axboe	2006-05-02	1	-1/+1
	We need to use the minimum of {len, PAGE_SIZE-off}, not {len, PAGE_SIZE}-off. The latter doesn't make any sense, and could cause us to attempt negative length transfers... Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] vmsplice: allow user to pass in gift pages	Jens Axboe	2006-05-01	1	-3/+25
	If SPLICE_F_GIFT is set, the user is basically giving these pages away to the kernel. That means we can steal them for e.g. page cache uses instead of copying them. The data must be properly page aligned and also a multiple of the page size in length. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] pipe: enable atomic copying of pipe data to/from user space	Jens Axboe	2006-05-01	1	-2/+2
	The pipe ->map() method uses kmap() to virtually map the pages, which is both slow and has known scalability issues on SMP. This patch enables atomic copying of pipe pages, by pre-faulting data and using kmap_atomic() instead. lmbench bw_pipe and lat_pipe measurements agree this is a Good Thing. Here are results from that on a UP machine with highmem (1.5GiB of RAM), running first a UP kernel, SMP kernel, and SMP kernel patched.
	Vanilla-UP:
		Pipe bandwidth: 1622.28 MB/sec
		Pipe bandwidth: 1610.59 MB/sec
		Pipe bandwidth: 1608.30 MB/sec
		Pipe latency: 7.3275 microseconds
		Pipe latency: 7.2995 microseconds
		Pipe latency: 7.3097 microseconds
	Vanilla-SMP:
		Pipe bandwidth: 1382.19 MB/sec
		Pipe bandwidth: 1317.27 MB/sec
		Pipe bandwidth: 1355.61 MB/sec
		Pipe latency: 9.6402 microseconds
		Pipe latency: 9.6696 microseconds
		Pipe latency: 9.6153 microseconds
	Patched-SMP:
		Pipe bandwidth: 1578.70 MB/sec
		Pipe bandwidth: 1579.95 MB/sec
		Pipe bandwidth: 1578.63 MB/sec
		Pipe latency: 9.1654 microseconds
		Pipe latency: 9.2266 microseconds
		Pipe latency: 9.1527 microseconds
	Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: call handle_ra_miss() on failure to lookup page	Jens Axboe	2006-05-01	1	-0/+6
	Notify the readahead logic of the missing page. Suggested by Oleg Nesterov. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] pipe: introduce ->pin() buffer operation	Jens Axboe	2006-05-01	1	-61/+30
	The ->map() function is really expensive on highmem machines right now, since it has to use the slower kmap() instead of kmap_atomic(). Splice rarely needs to access the virtual address of a page, so it's a waste of time doing it. Introduce ->pin() to take over the responsibility of making sure the page data is valid. ->map() is then reduced to just kmap(). That way we can also share most of the pipe buffer ops between pipe.c and splice.c. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: fix bugs in pipe_to_file()	Jens Axboe	2006-05-01	1	-18/+19
	Found by Oleg Nesterov <oleg@tv-sign.ru>, fixed by me.
	- Only allow full pages to go to the page cache.
	- Check page != buf->page instead of using PIPE_BUF_FLAG_STOLEN.
	- Remember to clear 'stolen' if add_to_page_cache() fails.
	And as a cleanup on that:
	- Make the bottom fall-through logic a little less convoluted. Also make the steal path hold an extra reference to the page, so we don't have to differentiate between stolen and non-stolen at the end.
	Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: fix bugs with stealing regular pipe pages	Jens Axboe	2006-04-30	1	-1/+3
	- Check that page has suitable count for stealing in the regular pipes.
	- pipe_to_file() assumes that the page is locked on successful steal, so do that in the pipe steal hook.
	- Missing unlock_page() in add_to_page_cache() failure.
	Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: make the read-side do batched page lookups	Jens Axboe	2006-04-27	1	-30/+65
	Use the new find_get_pages_contig() to potentially look up the entire splice range in one single call. This speeds up generic_file_splice_read() quite a bit. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: switch to using page_cache_readahead()	Jens Axboe	2006-04-27	1	-2/+2
	Avoids doing useless work, when the file is fully cached. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: rearrange moving to/from pipe helpers	Jens Axboe	2006-04-26	1	-24/+11
	We need these for people writing their own ->splice_read/write hooks. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] Add support for the sys_vmsplice syscall	Jens Axboe	2006-04-26	1	-39/+253
	sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice() moves data to a pipe, with the input being a user address range instead. This uses an approach suggested by Linus, where we can hold partial ranges inside the pages[] map. Hopefully this will be useful for network receive support as well. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: fix offset problems	Jens Axboe	2006-04-26	1	-19/+27
	Make the move_from_pipe() actors return number of bytes processed, then move_from_pipe() can decide more cleverly when to move on to the next buffer. This fixes problems with pipe offset and differing file offset. Signed-off-by: Jens Axboe <axboe@suse.de>
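
Note on the "vmsplice: fix badly placed end paranthesis" entry (2006-05-02) in the log above: the following is a minimal user-space sketch, not the kernel code, of why the position of the closing parenthesis matters when computing how many bytes of a request fit in the current page. It assumes a 4096-byte page, and the names PAGE_SIZE_ASSUMED, min_long(), chunk_len_fixed(), chunk_len_misplaced() and check_examples() are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4096-byte pages; the real PAGE_SIZE is architecture-dependent. */
#define PAGE_SIZE_ASSUMED 4096L

/* Hypothetical helper standing in for the kernel's min() macro. */
static long min_long(long a, long b)
{
	return a < b ? a : b;
}

/* Intended form: min(len, PAGE_SIZE - off), the bytes that fit from off to page end. */
static long chunk_len_fixed(long len, long off)
{
	return min_long(len, PAGE_SIZE_ASSUMED - off);
}

/* Misplaced parenthesis: min(len, PAGE_SIZE) - off, which can go negative. */
static long chunk_len_misplaced(long len, long off)
{
	return min_long(len, PAGE_SIZE_ASSUMED) - off;
}

/* Worked example: 100 bytes requested, starting 4000 bytes into a page. */
static void check_examples(void)
{
	assert(chunk_len_fixed(100, 4000) == 96);        /* only 96 bytes fit */
	assert(chunk_len_misplaced(100, 4000) == -3900); /* negative length */
}
```

Calling check_examples() exercises both forms for a 100-byte request that starts 4000 bytes into the page: the intended form yields the 96 bytes remaining in the page, while the misplaced form yields a negative length, which is the failure mode the commit message describes.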