summaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAgeFilesLines
* bugfix for memory cgroup controller: migration under memory controller fixKAMEZAWA Hiroyuki2008-02-071-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While using memory control cgroup, page-migration under it works as following. == 1. uncharge all refs at try to unmap. 2. charge regs again remove_migration_ptes() == This is simple but has following problems. == The page is uncharged and charged back again if *mapped*. - This means that cgroup before migration can be different from one after migration - If page is not mapped but charged as page cache, charge is just ignored (because not mapped, it will not be uncharged before migration) This is memory leak. == This patch tries to keep memory cgroup at page migration by increasing one refcnt during it. 3 functions are added. mem_cgroup_prepare_migration() --- increase refcnt of page->page_cgroup mem_cgroup_end_migration() --- decrease refcnt of page->page_cgroup mem_cgroup_page_migration() --- copy page->page_cgroup from old page to new page. During migration - old page is under PG_locked. - new page is under PG_locked, too. - both old page and new page is not on LRU. These 3 facts guarantee that page_cgroup() migration has no race. Tested and worked well in x86_64/fake-NUMA box. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memcontrol: move oom task exclusion to tasklist scanDavid Rientjes2008-02-071-0/+7
| | | | | | | | | | | | | | | | | | | | Creates a helper function to return non-zero if a task is a member of a memory controller: int task_in_mem_cgroup(const struct task_struct *task, const struct mem_cgroup *mem); When the OOM killer is constrained by the memory controller, the exclusion of tasks that are not a member of that controller was previously misplaced and appeared in the badness scoring function. It should be excluded during the tasklist scan in select_bad_process() instead. [akpm@linux-foundation.org: build fix] Cc: Christoph Lameter <clameter@sgi.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memcontrol: move mm_cgroup to header fileDavid Rientjes2008-02-071-2/+9
| | | | | | | | | | | | | | | | | Inline functions must preceed their use, so mm_cgroup() should be defined in linux/memcontrol.h. include/linux/memcontrol.h:48: warning: 'mm_cgroup' declared inline after being called include/linux/memcontrol.h:48: warning: previous declaration of 'mm_cgroup' was here [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: nuther build fix] Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: make charging gfp mask awareBalbir Singh2008-02-072-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Nick Piggin pointed out that swap cache and page cache addition routines could be called from non GFP_KERNEL contexts. This patch makes the charging routine aware of the gfp context. Charging might fail if the cgroup is over it's limit, in which case a suitable error is returned. This patch was tested on a Powerpc box. I am still looking at being able to test the path, through which allocations happen in non GFP_KERNEL contexts. [kamezawa.hiroyu@jp.fujitsu.com: problem with ZONE_MOVABLE] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: make page_referenced() cgroup awareBalbir Singh2008-02-072-2/+9
| | | | | | | | | | | | | | | | | | | | | Make page_referenced() cgroup aware. Without this patch, page_referenced() can cause a page to be skipped while reclaiming pages. This patch ensures that other cgroups do not hold pages in a particular cgroup hostage. It is required to ensure that shared pages are freed from a cgroup when they are not actively referenced from the cgroup that brought them in Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: add switch to control what type of pages to limitBalbir Singh2008-02-071-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Choose if we want cached pages to be accounted or not. By default both are accounted for. A new set of tunables are added. echo -n 1 > mem_control_type switches the accounting to account for only mapped pages echo -n 3 > mem_control_type switches the behaviour back [bunk@kernel.org: mm/memcontrol.c: clenups] [akpm@linux-foundation.org: fix sparc32 build] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: OOM handlingPavel Emelianov2008-02-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Out of memory handling for cgroups over their limit. A task from the cgroup over limit is chosen using the existing OOM logic and killed. TODO: 1. As discussed in the OLS BOF session, consider implementing a user space policy for OOM handling. [akpm@linux-foundation.org: fix build due to oom-killer changes] Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller improve user interfaceBalbir Singh2008-02-071-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | Change the interface to use bytes instead of pages. Page sizes can vary across platforms and configurations. A new strategy routine has been added to the resource counters infrastructure to format the data as desired. Suggested by David Rientjes, Andrew Morton and Herbert Poetzl Tested on a UML setup with the config for memory control enabled. [kamezawa.hiroyu@jp.fujitsu.com: possible race fix in res_counter] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: add per cgroup LRU and reclaimBalbir Singh2008-02-073-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the page_cgroup to the per cgroup LRU. The reclaim algorithm has been modified to make the isolate_lru_pages() as a pluggable component. The scan_control data structure now accepts the cgroup on behalf of which reclaims are carried out. try_to_free_pages() has been extended to become cgroup aware. [akpm@linux-foundation.org: fix warning] [Lee.Schermerhorn@hp.com: initialize all scan_control's isolate_pages member] [bunk@kernel.org: make do_try_to_free_pages() static] [hugh@veritas.com: memcgroup: fix try_to_free order] [kamezawa.hiroyu@jp.fujitsu.com: this unlock_page_cgroup() is unnecessary] Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: memory accountingBalbir Singh2008-02-071-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the accounting hooks. The accounting is carried out for RSS and Page Cache (unmapped) pages. There is now a common limit and accounting for both. The RSS accounting is accounted at page_add_*_rmap() and page_remove_rmap() time. Page cache is accounted at add_to_page_cache(), __delete_from_page_cache(). Swap cache is also accounted for. Each page's page_cgroup is protected with the last bit of the page_cgroup pointer, this makes handling of race conditions involving simultaneous mappings of a page easier. A reference count is kept in the page_cgroup to deal with cases where a page might be unmapped from the RSS of all tasks, but still lives in the page cache. Credits go to Vaidyanathan Srinivasan for helping with reference counting work of the page cgroup. Almost all of the page cache accounting code has help from Vaidyanathan Srinivasan. [hugh@veritas.com: fix swapoff breakage] [akpm@linux-foundation.org: fix locking] Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: <Valdis.Kletnieks@vt.edu> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: accounting setupPavel Emelianov2008-02-073-0/+43
| | | | | | | | | | | | | | | | | Basic setup routines, the mm_struct has a pointer to the cgroup that it belongs to and the the page has a page_cgroup associated with it. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: cgroups setupBalbir Singh2008-02-072-0/+26
| | | | | | | | | | | | | | | | | | Setup the memory cgroup and add basic hooks and controls to integrate and work with the cgroup. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: David Rientjes <rientjes@google.com> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Memory controller: resource countersPavel Emelianov2008-02-071-0/+102
| | | | | | | | | | | | | | | | | | | | | | | With fixes from David Rientjes <rientjes@google.com> Introduce generic structures and routines for resource accounting. Each resource accounting cgroup is supposed to aggregate it, cgroup_subsystem_state and its resource-specific members within. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Kirill Korotaev <dev@sw.ru> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* tty: Kill TTY_FLIPBUF_SIZEAlan Cox2008-02-071-7/+0
| | | | | | | | | This legacy define from the old buffer code is now only used in a single power pc driver than doesn't compile anyway. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VFS: swap do_ioctl and vfs_ioctl namesErez Zadok2008-02-071-1/+3
| | | | | | | | | | | | | | | | | | | Rename old vfs_ioctl to do_ioctl, because the comment above it clearly indicates that it is an internal function not to be exported to modules; therefore it should have a more traditional do_XXX name. The new do_ioctl is exported in fs.h but not to modules. Rename the old do_ioctl to vfs_ioctl because the names vfs_XXX should preferably be reserved to callable VFS functions which modules may call, as many other vfs_XXX functions already do. Export the new vfs_ioctl to GPL modules so others can use it (including Unionfs and eCryptfs). Add DocBook for new vfs_ioctl. [akpm@linux-foundation.org: fix build] Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* DS1WM: decouple host IRQ and INTR active state settingsPhilipp Zabel2008-02-071-0/+1
| | | | | | | | | | | | | The DS1WM driver incorrectly infers the IAS bit (1-wire interrupt active high) from IRQ settings. There are devices that have IAS=0 but still need the IRQ to trigger on a rising edge. With this patch, machines with DS1WM that need IAS=1 have to set .active_high=1 in the ds1wm_platform_data. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Acked-by: Matt Reimer <mreimer@vpop.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86Linus Torvalds2008-02-062-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix deadlock, make pgd_lock irq-safe virtio: fix trivial build bug x86: fix mttr trimming x86: delay CPA self-test and repeat it x86: fix 64-bit sections generic: add __FINITDATA x86: remove suprious ifdefs from pageattr.c x86: mark the .rodata section also NX x86: fix iret exception recovery on 64-bit cpuidle: dubious one-bit signed bitfield in cpuidle.h x86: fix sparse warnings in powernow-k8.c x86: fix sparse error in traps_32.c x86: trivial sparse/checkpatch in quirks.c x86 ptrace: disallow null cs/ss MAINTAINERS: RDC R-321x SoC maintainer brk randomization: introduce CONFIG_COMPAT_BRK brk: check the lower bound properly x86: remove X2 workaround x86: make spurious fault handler aware of large mappings x86: make traps on entry code be debuggable in user space, 64-bit
| * generic: add __FINITDATAIngo Molnar2008-02-061-0/+1
| | | | | | | | Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * cpuidle: dubious one-bit signed bitfield in cpuidle.hHarvey Harrison2008-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fix these sparse warnings: CHECK arch/x86/kernel/acpi/cstate.c include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield CHECK arch/x86/kernel/acpi/processor.c include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield CHECK arch/x86/kernel/cpu/cpufreq/powernow-k7.c include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield CHECK arch/x86/kernel/cpu/cpufreq/powernow-k8.c include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield CHECK arch/x86/kernel/cpu/cpufreq/longhaul.c include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield CHECK arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'async-tx-for-linus' of ↵Linus Torvalds2008-02-062-16/+26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://lost.foo-projects.org/~dwillia2/git/iop into fix * 'async-tx-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: async_tx: allow architecture specific async_tx_find_channel implementations async_tx: replace 'int_en' with operation preparation flags async_tx: kill tx_set_src and tx_set_dest methods async_tx: kill ASYNC_TX_ASSUME_COHERENT iop-adma: use LIST_HEAD instead of LIST_HEAD_INIT async_tx: use LIST_HEAD instead of LIST_HEAD_INIT async_tx: fix compile breakage, mark do_async_xor __always_inline
| * | async_tx: allow architecture specific async_tx_find_channel implementationsDan Williams2008-02-061-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | The source and destination addresses are included to allow channel selection based on address alignment. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| * | async_tx: replace 'int_en' with operation preparation flagsDan Williams2008-02-061-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass a full set of flags to drivers' per-operation 'prep' routines. Currently the only flag passed is DMA_PREP_INTERRUPT. The expectation is that arch-specific async_tx_find_channel() implementations can exploit this capability to find the best channel for an operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| * | async_tx: kill tx_set_src and tx_set_dest methodsDan Williams2008-02-061-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tx_set_src and tx_set_dest methods were originally implemented to allow an array of addresses to be passed down from async_xor to the dmaengine driver while minimizing stack overhead. Removing these methods allows drivers to have all transaction parameters available at 'prep' time, saves two function pointers in struct dma_async_tx_descriptor, and reduces the number of indirect branches.. A consequence of moving this data to the 'prep' routine is that multi-source routines like async_xor need temporary storage to convert an array of linear addresses into an array of dma addresses. In order to keep the same stack footprint of the previous implementation the input array is reused as storage for the dma addresses. This requires that sizeof(dma_addr_t) be less than or equal to sizeof(void *). As a consequence CONFIG_DMADEVICES now depends on !CONFIG_HIGHMEM64G. It also requires that drivers be able to make descriptor resources available when the 'prep' routine is polled. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com>
| * | async_tx: kill ASYNC_TX_ASSUME_COHERENTDan Williams2008-02-061-2/+0
| |/ | | | | | | | | | | | | | | | | | | Remove the unused ASYNC_TX_ASSUME_COHERENT flag. Async_tx is meant to hide the difference between asynchronous hardware and synchronous software operations, this flag requires clients to understand cache coherency consequences of the async path. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
* | Merge branch 'upstream-linus' of ↵Linus Torvalds2008-02-062-5/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: ata_piix.c:piix_init_one() must be __devinit sata_via.c: Remove missleading comment. libata-core: unblacklist HITACHI drives sata_nv: fix ATAPI issues with memory over 4GB (v7) ata: drivers/ata/sata_mv.c needs dmapool.h libata: kill now unused n_iter and fix sata_fsl ahci: fix CAP.NP and PI handling sata_mv: Support SoC controllers Rename: linux/pata_platform.h to linux/ata_platform.h
| * | libata: kill now unused n_iter and fix sata_fslJames Bottomley2008-02-061-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qc->n_iter was used for libata's own sg walking before sg chaining replaced it. During conversion, the field and its usage in sata_fsl were left behind. Kill the filed and update sata_fsl. tj: This was part of James's libata-use-block-layer-padding patch. Separated out by me. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Li Yang <leoli@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * | sata_mv: Support SoC controllersSaeed Bishara2008-02-061-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marvell's Orion SoC includes SATA controllers based on Marvell's PCI-to-SATA 88SX controllers. This patch extends the libATA sata_mv driver to support those controllers. [edited to use linux/ata_platform.h -jg] Signed-off-by: Saeed Bishara <saeed@marvell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * | Rename: linux/pata_platform.h to linux/ata_platform.hJeff Garzik2008-02-061-0/+0
| |/ | | | | | | Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-02-064-21/+21
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (35 commits) virtio net: fix oops on interface-up Fix PHY Lib support for gianfar and ucc_geth forcedeth: preserve registers forcedeth: phy status fix forcedeth: restart tx/rx ipvs: Make wrr "no available servers" error message rate-limited [PPPOL2TP]: Label unused warning when CONFIG_PROC_FS is not set. [NET_SCHED]: cls_flow: support classification based on VLAN tag [VLAN]: Constify skb argument to vlan_get_tag() [NET_SCHED]: cls_flow: fix key mask validity check [NET_SCHED]: em_meta: fix compile warning b43: Fix DMA for 30/32-bit DMA engines b43: fix build with CONFIG_SSB_PCIHOST=n mac80211: Is not EXPERIMENTAL anymore iwl3945-base.c: fix off-by-one errors b43legacy: fix DMA slot resource leakage b43legacy: drop packets we are not able to encrypt b43legacy: fix suspend/resume b43legacy: fix PIO crash Generic HDLC - use random_ether_addr() ...
| * \ Merge branch 'upstream-davem' of ↵David S. Miller2008-02-061-18/+7
| |\ \ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
| | * | Generic HDLC - remove now unneeded hdlc_device_descKrzysztof Halasa2008-02-051-18/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Removes now unneeded struct hdlc_device_desc Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * | | Merge branch 'fixes' of ↵David S. Miller2008-02-051-0/+9
| |\ \ \ | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
| | * | | b43: fix build with CONFIG_SSB_PCIHOST=nAndrew Morton2008-02-051-0/+9
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | m68k allmodconfig gives drivers/net/wireless/b43/main.c:251: error: implicit declaration of function 'mmiowb' because CONFIG_B43=m, CONFIG_SSB_PCIHOST=n. Might be Kconfig bustage, but this works... Cc: Michael Buesch <mb@bu3sch.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | | [NET_SCHED]: cls_flow: support classification based on VLAN tagPatrick McHardy2008-02-051-0/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | [VLAN]: Constify skb argument to vlan_get_tag()Patrick McHardy2008-02-051-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Required by next patch to use it from the flow classifier. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | jbd.h: hide kernel only codeOlaf Hering2008-02-061-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move a few kernel-only things into __KERNEL__. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | make jbd/journal.c:__journal_abort_hard() staticAdrian Bunk2008-02-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __journal_abort_hard() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | isapnp driver semaphore to mutexDaniel Walker2008-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changed the isapnp semaphore to a mutex. [akpm@linux-foundation.org: no externs-in-c] [akpm@linux-foundation.org: build fix] Signed-off-by: Daniel Walker <dwalker@mvista.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | md: change ITERATE_RDEV_GENERIC to rdev_for_each_list, and remove ↵NeilBrown2008-02-061-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ITERATE_RDEV_PENDING. Finish ITERATE_ to for_each conversion. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | md: change ITERATE_RDEV to rdev_for_eachNeilBrown2008-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As this is more in line with common practice in the kernel. Also swap the args around to be more like list_for_each. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | md: allow devices to be shared between md arraysNeilBrown2008-02-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, a given device is "claimed" by a particular array so that it cannot be used by other arrays. This is not ideal for DDF and other metadata schemes which have their own partitioning concept. So for externally managed metadata, just claim the device for md in general, require that "offset" and "size" are set properly for each device, and make sure that if a device is included in different arrays then the active sections do not overlap. This involves adding another flag to the rdev which makes it awkward to set "->flags = 0" to clear certain flags. So now clear flags explicitly by name when we want to clear things. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | md: allow a maximum extent to be set for resyncingNeilBrown2008-02-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows userspace to control resync/reshape progress and synchronise it with other activities, such as shared access in a SAN, or backing up critical sections during a tricky reshape. Writing a number of sectors (which must be a multiple of the chunk size if such is meaningful) causes a resync to pause when it gets to that point. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | md: support 'external' metadata for md arraysNeilBrown2008-02-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Add a state flag 'external' to indicate that the metadata is managed externally (by user-space) so important changes need to be left of user-space to handle. Alternates are non-persistant ('none') where there is no stable metadata - after the array is stopped there is no record of it's status - and internal which can be version 0.90 or version 1.x These are selected by writing to the 'metadata' attribute. - move the updating of superblocks (sync_sbs) to after we have checked if there are any superblocks or not. - New array state 'write_pending'. This means that the metadata records the array as 'clean', but a write has been requested, so the metadata has to be updated to record a 'dirty' array before the write can continue. This change is reported to md by writing 'active' to the array_state attribute. - tidy up marking of sb_dirty: - don't set sb_dirty when resync finishes as md_check_recovery calls md_update_sb when the sync thread finishes anyway. - Don't set sb_dirty in multipath_run as the array might not be dirty. - don't mark superblock dirty when switching to 'clean' if there is no internal superblock (if external, userspace can choose to update the superblock whenever it chooses to). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | md: Update md bitmap during resync.NeilBrown2008-02-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently an md array with a write-intent bitmap does not updated that bitmap to reflect successful partial resync. Rather the entire bitmap is updated when the resync completes. This is because there is no guarentee that resync requests will complete in order, and tracking each request individually is unnecessarily burdensome. However there is value in regularly updating the bitmap, so add code to periodically pause while all pending sync requests complete, then update the bitmap. Doing this only every few seconds (the same as the bitmap update time) does not notciably affect resync performance. [snitzer@gmail.com: export bitmap_cond_end_sync] Signed-off-by: Neil Brown <neilb@suse.de> Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | sm501fb: control panel pin usage with platform data flagsMagnus Damm2008-02-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes it possible to control panel pins usage with flags passed from the platform data. Without this patch the sm501fb driver always controls the VBIASEN and FPEN pins. The polarity and use of these pins are very platform specific, so this patch introduces the flags SM501FB_FLAG_PANEL_USE_VBIASEN and SM501FB_FLAG_PANEL_USE_FPEN which enable the use of these pins. This patch is needed to support the a Sharp LQ104V1DG21 lcd panel on SuperH platforms such as R2D-1 and R2D-PLUS boards. Letting the sm501fb driver control the FPEN and VBIASEN pins like today just results in lcd panel flicker. Signed-off-by: Magnus Damm <damm@igel.co.jp> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | gpio: rename pca953x symbolsGuennadi Liakhovetski2008-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This second part of an extension to support more pca953x chips renames the C and Kconfig symbols. All affected files were updated by sed, except for a couple of obvious exceptions. It also updates the Kconfig helptext. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | gpio: rename pca9539 driverGuennadi Liakhovetski2008-02-061-0/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First part of an extension to let the pca9539 driver support more chips, starting with pca9534, pca9535, pca9536, pca9537, and pca9538. This renames the files and modifies the Makefile. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | w1-gpio: add GPIO w1 bus master driverVille Syrjala2008-02-061-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a GPIO 1-wire bus master driver. The driver used the GPIO API to control the wire and the GPIO pin can be specified using platform data similar to i2c-gpio. The driver was tested with AT91SAM9260 + DS2401. Signed-off-by: Ville Syrjala <syrjala@sci.fi> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | kprobes: kretprobe user entry-handlerAbhishek Sagar2008-02-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide support to add an optional user defined callback to be run at function entry of a kretprobe'd function. Also modify the kprobe smoke tests to include an entry-handler during the kretprobe sanity test. Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Acked-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | system timer: fix crash in <100Hz system timerDavid Fries2008-02-061-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel has a divide by zero crash when trying to run the system timer less than 100Hz. The problem is x/(HZ/USER_HZ) and related. Now x*(USER_HZ/HZ) will be used if HZ<USER_HZ. I'm running the Linux kernel under qemu and went to run a slower system timer to take less CPU (and battery) on the host. I found that the kernel paniced under emulation because of a divide by zero in three places. Here is the patch. The base git was updated today 01-05-2008. I went for a 20Hz system time by adding config HZ_20 etc to kernel/Kconfig.hz. With this patch I verified the system timer by looking at /proc/interrupts. [akpm@linux-foundation.org: partially clean up the macro maze] Signed-off-by: David Fries <david@fries.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>