summaryrefslogtreecommitdiffstats
path: root/block
Commit message (Collapse)AuthorAgeFilesLines
* [PATCH] Audit block layer inlinesJens Axboe2006-09-302-5/+5
| | | | | | | Kill a few inlines that bring in too much code to more than one location Shrinks kernel text by about 300 bytes on 32-bit x86. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: use new io context counting mechanismJens Axboe2006-09-301-5/+7
| | | | | | | | It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: use new io context counting mechanismJens Axboe2006-09-301-4/+5
| | | | | | | | It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: kill cfq_exit_lockJens Axboe2006-09-302-38/+18
| | | | | | | | | | | | | | | | | | | | | cfq_exit_lock is protecting two things now: - The per-ioc rbtree of cfq_io_contexts - The per-cfqd linked list of cfq_io_contexts The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per cfqd as the queue lock is. The per-ioc rbtree is mainly used and updated by the process itself only. The only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookup completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed and lazily update the private cfq io contexts the next time io is queued, and we can remove this locking as well. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: cleanups, fixes, dead code removalJens Axboe2006-09-301-114/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | A collection of little fixes and cleanups: - We don't use the 'queued' sysfs exported attribute, since the may_queue() logic was rewritten. So kill it. - Remove dead defines. - cfq_set_active_queue() can be rewritten cleaner with else if conditions. - Several places had cfq_exit_cfqq() like logic, abstract that out and use that. - Annotate the cfqq kmem_cache_alloc() so the allocator knows that this is a repeat allocation if it fails with __GFP_WAIT set. Allows the allocator to start freeing some memory, if needed. CFQ already loops for this condition, so might as well pass the hint down. - Remove cfqd->rq_starved logic. It's not needed anymore after we dropped the crq allocation in cfq_set_request(). - Remove uneeded parameter passing. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ll_rw_blk: cleanup __make_request()Jens Axboe2006-09-301-15/+7
| | | | | | | | - Don't assign variables that are only used once. - Kill spin_lock() prefetching, it's opportunistic at best. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Drop useless bio passing in may_queue/set_request APIJens Axboe2006-09-304-14/+11
| | | | | | It's not needed for anything, so kill the bio passing. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove ->rq_status from struct requestJens Axboe2006-09-301-3/+0
| | | | | | | | | | | After Christophs SCSI change, the only usage left is RQ_ACTIVE and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing the request, so any check for RQ_INACTIVE in a driver is a bug and indicates use-after-free. So kill/clean the remaining users, straight forward. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove struct request_list from struct requestJens Axboe2006-09-301-8/+2
| | | | | | | | It is always identical to &q->rq, and we only use it for detecting whether this request came out of our mempool or not. So replace it with an additional ->flags bit flag. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove ->waiting member from struct requestJens Axboe2006-09-302-9/+7
| | | | | | | | As the comments indicates in blkdev.h, we can fold it into ->end_io_data usage as that is really what ->waiting is. Fixup the users of blk_end_sync_rq(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] as-iosched: kill arqJens Axboe2006-09-301-195/+118
| | | | | | | | | Get rid of the as_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any arq allocation/free for each request. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] cfq-iosched: kill crqJens Axboe2006-09-301-144/+95
| | | | | | | | Get rid of the cfq_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any crq allocation/free for each request. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: remove the crq flag functions/variableJens Axboe2006-09-301-42/+16
| | | | | | | There's just one flag currently (SYNC), and that one can be grabbed from the request. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] deadline-iosched: remove elevator private drq request typeJens Axboe2006-09-301-142/+52
| | | | | | | | A big win, we now save an allocation/free on each request! With the previous rb/hash abstractions, we can just reuse queuelist/donelist for the FIFO data and be done with it. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: remove arq->is_sync memberJens Axboe2006-09-301-22/+14
| | | | | | | We can track this in struct request. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] as-iosched: reuse rq for fifoJens Axboe2006-09-301-22/+12
| | | | | | | Saves some space in arq. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] cfq-iosched: convert to using the FIFO elevator definesJens Axboe2006-09-301-2/+1
| | | | Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] deadline-iosched: migrate to using the elevator rb functionsJens Axboe2006-09-301-136/+34
| | | | | | This removes the rbtree handling from deadline. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: migrate to using the elevator rb functionsJens Axboe2006-09-301-123/+41
| | | | | | This removes the rbtree handling from CFQ. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: migrate to using the elevator rb functionsJens Axboe2006-09-301-151/+29
| | | | | | | This removes the rbtree handling from AS. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] elevator: abstract out the rbtree sort handlingJens Axboe2006-09-302-17/+113
| | | | | | | | | | | | The rbtree sort/lookup/reposition logic is mostly duplicated in cfq/deadline/as, so move it to the elevator core. The io schedulers still provide the actual rb root, as we don't want to impose any sort of specific handling on the schedulers. Introduce the helpers and rb_node in struct request to help migrate the IO schedulers. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prevJens Axboe2006-09-301-2/+2
| | | | | | | The conditions got reserved. Also make rb_next() and rb_prev() check for the empty condition. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] elevator: move the backmerging logic into the elevator coreJens Axboe2006-09-305-360/+142
| | | | | | | | | | | Right now, every IO scheduler implements its own backmerging (except for noop, which does no merging). That results in duplicated code for essentially the same operation, which is never a good thing. This patch moves the backmerging out of the io schedulers and into the elevator core. We save 1.6kb of text and as a bonus get backmerging for noop as well. Win-win! Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Split struct request ->flags into two partsJens Axboe2006-09-304-83/+52
| | | | | | | | | | Right now ->flags is a bit of a mess: some are request types, and others are just modifiers. Clean this up by splitting it into ->cmd_type and ->cmd_flags. This allows introduction of generic Linux block message types, useful for sending generic Linux commands to block devices. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ifdef blktrace debugging fieldsAlexey Dobriyan2006-09-292-4/+5
| | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] block: handle subsystem_register() init errorsRandy Dunlap2006-09-291-2/+7
| | | | | | | | | Check and handle init errors. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_privateTheodore Ts'o2006-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge mulgrave-w:git/linux-2.6James Bottomley2006-09-231-0/+12
|\ | | | | | | | | | | | | | | Conflicts: include/linux/blkdev.h Trivial merge to incorporate tag prototypes.
| * Add a real API for dealing with blk_congestion_wait()Trond Myklebust2006-09-221-0/+12
| | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | [SCSI] block: add support for shared tag mapsJames Bottomley2006-08-311-21/+88
|/ | | | | | | | | | The current block queue implementation already contains most of the machinery for shared tag maps. The only remaining pieces are a way to allocate and destroy a tag map independently of the queues (so that the maps can be managed on the life cycle of the overseeing entity) Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* elv_unregister: fix possible crash on module unloadOleg Nesterov2006-08-221-1/+2
| | | | | | | | | | An exiting task or process which didn't do I/O yet have no io context, elv_unregister() should check it is not NULL. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* [PATCH] cfq_cic_link: fix usage of wrong cfq_io_contextOleg Nesterov2006-08-211-1/+1
| | | | | | | | Obviously, cfq_cic_link() shouldn't free a just allocated cfq_io_context? The dead key is from __cic, so drop that. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Fix current_io_context() vs set_task_ioprio() raceOleg Nesterov2006-08-211-0/+2
| | | | | | | | | | | I know nothing about io scheduler, but I suspect set_task_ioprio() is not safe. current_io_context() initializes "struct io_context", then sets ->io_context. set_task_ioprio() running on another cpu may see the changes out of order, so ->set_ioprio(ioc) may use io_context which was not initialized properly. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecsJens Axboe2006-07-251-1/+1
| | | | | | | | | | | | The CIC_SEEKY() test really wants to use the minimum of either: - 2 msecs (not jiffies) - or, the pending slice time So code it like that. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] blktrace: fix read-ahead bitMilton Miller2006-07-251-1/+1
| | | | | | It should be toggling the same bit on and off, fix it up. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] lockdep: annotate the BLKPG_DEL_PARTITION ioctlArjan van de Ven2006-07-141-2/+2
| | | | | | | | | | | The delete partition IOCTL takes the bd_mutex for both the disk and the partition; these have an obvious hierarchical relationship and this patch annotates this relationship for lockdep. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Only the first two bits in bio->bi_rw and rq->flags matchJens Axboe2006-07-061-2/+2
| | | | | | | Not three, as assumed. This causes the barrier bit to be needlessly set for some IO. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] blktrace: readahead supportNathan Scott2006-07-061-1/+4
| | | | | | | | Provide the needed kernel support for distinguishing readahead from regular read requests when tracing block devices. Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] lockdep: annotate on-stack completionsIngo Molnar2006-07-031-1/+1
| | | | | | | | | | | | lockdep needs to have the waitqueue lock initialized for on-stack waitqueues implicitly initialized by DECLARE_COMPLETION(). Annotate on-stack completions accordingly. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds2006-06-307-7/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: Remove obsolete #include <linux/config.h> remove obsolete swsusp_encrypt arch/arm26/Kconfig typos Documentation/IPMI typos Kconfig: Typos in net/sched/Kconfig v9fs: do not include linux/version.h Documentation/DocBook/mtdnand.tmpl: typo fixes typo fixes: specfic -> specific typo fixes in Documentation/networking/pktgen.txt typo fixes: occuring -> occurring typo fixes: infomation -> information typo fixes: disadvantadge -> disadvantage typo fixes: aquire -> acquire typo fixes: mecanism -> mechanism typo fixes: bandwith -> bandwidth fix a typo in the RTC_CLASS help text smb is no longer maintained Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
| * Remove obsolete #include <linux/config.h>Jörn Engel2006-06-307-7/+0
| | | | | | | | | | Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* | [PATCH] Light weight event countersChristoph Lameter2006-06-301-2/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only loose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows a building of linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single instruction increment instruction being emitted (i386, x86_64) or in the increment being hidden though instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] cpu hotplug: use hotplug version of cpu notifier in appropriate placesChandra Seetharaman2006-06-271-3/+1
| | | | | | | | | | Make use the of newly defined hotplug version of cpu_notifier functionality wherever appropriate. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] cpu hotplug: revert initdata patch submitted for 2.6.17Chandra Seetharaman2006-06-271-1/+1
| | | | | | | | | This patch reverts notifier_block changes made in 2.6.17 Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* spelling fixesAndreas Mohr2006-06-262-3/+3
| | | | | | | | | | | | acquired (aquired) contiguous (contigious) successful (succesful, succesfull) surprise (suprise) whether (weather) some other misspellings Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* [BLOCK] Fix bounce limit address checkAndi Kleen2006-06-231-1/+1
| | | | | | | | Do a safer check for when to enable DMA. Currently we enable ISA DMA for cases that do not need it, resulting in OOM conditions when ZONE_DMA runs out of space. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] rbtree: support functions used by the io schedulersJens Axboe2006-06-233-32/+20
| | | | | | | They all duplicate macros to check for empty root and/or node, and clearing a node. So put those in rbtree.h. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: rq update fixesJens Axboe2006-06-231-5/+5
| | | | | | | | | - Remember to set ->last_sector so that the cfq_choose_req() logic works correctly. - Remove redundant call to cfq_choose_req() Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: many performance fixesJens Axboe2006-06-231-40/+76
| | | | | | | | | | | | | | | | | | | | This is a collection of patches that greatly improve CFQ performance in some circumstances. - Change the idling logic to only kick in after a request is done and we are deciding what to do. Before the idling included the request service time, so it was hard to adjust. Now it's true think/idle time. - Take advantage of TCQ/NCQ/queueing for seeky sync workloads, but keep it in control for sync and sequential (or close to) workloads. - Expire queues immediately and move on to other busy queues, if we are not going to idle after the current one finishes. - Don't rearm idle timer if there are no busy queues. Just leave the system idle. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: correctly set ioprio on both targetsJens Axboe2006-06-231-3/+2
| | | | | | | | | | | | | | | | | Patch originally from Vasily Tarasov <vtaras@sw.ru> If you set io-priority of process 1 using sys_ioprio_set system call by another process 2 (like ionice do), then cfq_init_prio_data() function sets priority of process 2 (current) on queue of process 1 and clears the flag, that designates change of ioprio. So the process 1 will work like with priority of process 2. I propose not to call cfq_init_prio_data() on io-priority change, but only mark queue as queue with changed prority. Every time when new request comes cfq-scheduler checks for this flag and atomaticaly changes priority of queue to new value. Signed-off-by: Jens Axboe <axboe@suse.de>