path: root/arch
Commit message | Author | Age | Files | Lines
* Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze | Linus Torvalds | 2015-09-10 | 1 | -1/+2

    Pull microblaze update from Michal Simek.

    * 'next' of git://git.monstr.eu/linux-2.6-microblaze:
      elf-em.h: move EM_MICROBLAZE to the common header
| * elf-em.h: move EM_MICROBLAZE to the common header | Mike Frysinger | 2015-09-10 | 1 | -1/+2

    The linux/audit.h header uses EM_MICROBLAZE in order to define AUDIT_ARCH_MICROBLAZE, but it's only available in the microblaze asm headers. Move it to the common elf-em.h header so that the define can be used on non-microblaze systems. Otherwise we get build errors that EM_MICROBLAZE isn't defined when we try to use the AUDIT_ARCH_MICROBLAZE symbol.

    Signed-off-by: Mike Frysinger <vapier@gentoo.org>
    Signed-off-by: Michal Simek <michal.simek@xilinx.com>
* | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel | Linus Torvalds | 2015-09-10 | 2 | -18/+1

    Pull hexagon updates from Richard Kuo:
     "Just two fixes -- one for a uapi header and one for a timer interface"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel:
      Revert "Hexagon: fix signal.c compile error"
      hexagon/time: Migrate to new 'set-state' interface
| * Revert "Hexagon: fix signal.c compile error" | Mike Frysinger | 2015-09-09 | 1 | -2/+0

    This reverts commit f3f601c1d2728f02544cfd143eaa82e5398b3e9b.

    UAPI headers cannot use "uapi/" in their paths by design -- when they're installed, they do not have the uapi/ prefix. Otherwise doing so breaks userland badly.

    Signed-off-by: Mike Frysinger <vapier@gentoo.org>
    Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
| * hexagon/time: Migrate to new 'set-state' interface | Viresh Kumar | 2015-09-08 | 1 | -16/+1

    Migrate hexagon driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now.

    This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED.

    We weren't doing anything in the ->set_mode() callback. So, this patch doesn't provide any set-state callbacks.

    Cc: linux-hexagon@vger.kernel.org
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
* | Merge tag 'metag-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag | Linus Torvalds | 2015-09-09 | 1 | -4/+6

    Pull metag updates from James Hogan:
     "Metag architecture changes for v4.3.

      Just a couple of changes for v4.3-rc1. A preparatory IRQ patch to prepare for moving irq_data struct members, and a tweak to Documentation/features since Meta2 could support THP"

    * tag 'metag-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
      Documentation/features/vm: Meta2 is capable of THP
      metag/irq: Use access helper irq_data_get_affinity_mask()
| * | metag/irq: Use access helper irq_data_get_affinity_mask() | Jiang Liu | 2015-07-14 | 1 | -4/+6

    This is a preparatory patch for moving irq_data struct members.

    Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: James Hogan <james.hogan@imgtec.com>
* | | Merge tag 'nios2-v4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 | Linus Torvalds | 2015-09-09 | 4 | -39/+359

    Pull nios2 updates from Ley Foon Tan:
     - add defconfig and device tree for max 10 support
     - migrate to new 'set-state' interface for timer
     - fix unaligned handler
     - MAINTAINERS: update nios2 git repo

    * tag 'nios2-v4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
      nios2: add Max10 defconfig
      nios2: Add Max10 device tree
      MAINTAINERS: update nios2 git repo
      nios2: remove unused statistic counters
      nios2: fixed variable imm16 to s16
      nios2/time: Migrate to new 'set-state' interface
| * | | nios2: add Max10 defconfig | Chee Nouk Phoon | 2015-09-08 | 1 | -0/+81

    Max10 is a FPGA device. This patch adds defconfig based on Max10 hardware reference design. Design is intended to run on Max10 development kit.

    Signed-off-by: Chee Nouk Phoon <cnphoon@altera.com>
    Signed-off-by: Ley Foon Tan <lftan@altera.com>
| * | | nios2: Add Max10 device tree | Chee Nouk Phoon | 2015-09-08 | 1 | -0/+248

    Max10 is a FPGA device. This patch adds Nios2 support for Max10. This device tree is based on Max10 hardware reference design.

    Signed-off-by: Chee Nouk Phoon <cnphoon@altera.com>
    Signed-off-by: Ley Foon Tan <lftan@altera.com>
| * | | nios2: remove unused statistic counters | Bernd Weiberg | 2015-09-08 | 1 | -18/+0

    Removed some statistic counters to improve the performance of the handler.

    Signed-off-by: Bernd Weiberg <bernd.weiberg@siemens.com>
    Signed-off-by: Ley Foon Tan <lftan@altera.com>
| * | | nios2: fixed variable imm16 to s16 | Bernd Weiberg | 2015-09-08 | 1 | -1/+1

    Fix variable imm16 to s16 instead of u16, offset might be negative.

    Signed-off-by: Bernd Weiberg <bernd.weiberg@siemens.com>
    Signed-off-by: Ley Foon Tan <lftan@altera.com>
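The effect of treating a 16-bit immediate as unsigned rather than signed can be sketched in plain C. This is an illustration only, not the nios2 unaligned handler: the function names are hypothetical, and the cast to `int16_t` assumes the usual two's-complement behaviour of common compilers.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not kernel code: compute base + offset for a
 * load/store whose 16-bit immediate field holds the offset.
 */

/* Treats the immediate as signed (s16), so 0xFFFC means -4. */
uint32_t effective_addr_signed(uint32_t base, uint16_t raw_imm16)
{
    int16_t imm16 = (int16_t)raw_imm16; /* assumes two's complement */
    return base + (uint32_t)(int32_t)imm16;
}

/* Treats the immediate as unsigned (u16), so 0xFFFC means +65532. */
uint32_t effective_addr_unsigned(uint32_t base, uint16_t raw_imm16)
{
    return base + raw_imm16;
}
```

With a base of 0x1000 and an immediate of 0xFFFC, the signed variant yields 0x0FFC while the unsigned variant yields 0x10FFC, which is why a negative offset needs the signed type.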
| * | | nios2/time: Migrate to new 'set-state' interface | Viresh Kumar | 2015-09-08 | 1 | -20/+29

    Migrate nios2 driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now.

    This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED.

    Cc: Ley Foon Tan <lftan@altera.com>
    Cc: Tobias Klauser <tklauser@distanz.ch>
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Cc: Dmitry Torokhov <dtor@chromium.org>
    Cc: nios2-dev@lists.rocketboards.org
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Acked-by: Ley Foon Tan <lftan@altera.com>
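The general shape of a per-state callback interface, as opposed to a single mode-switching callback, can be sketched with a small userspace mock. Everything prefixed `sketch_` is a hypothetical stand-in and not the kernel's `struct clock_event_device`; only the idea of one optional callback per state is taken from the commit text.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum sketch_state { SKETCH_SHUTDOWN, SKETCH_PERIODIC, SKETCH_ONESHOT };

struct sketch_clockevent {
    /* One optional callback per state instead of one set_mode() switch. */
    int (*set_state_shutdown)(struct sketch_clockevent *ce);
    int (*set_state_periodic)(struct sketch_clockevent *ce);
    int (*set_state_oneshot)(struct sketch_clockevent *ce);
    int timer_running;
    int periodic;
};

static int sketch_shutdown(struct sketch_clockevent *ce)
{
    ce->timer_running = 0;
    return 0;
}

static int sketch_periodic(struct sketch_clockevent *ce)
{
    ce->periodic = 1;
    ce->timer_running = 1;
    return 0;
}

static int sketch_oneshot(struct sketch_clockevent *ce)
{
    ce->periodic = 0;
    ce->timer_running = 0; /* armed later, when the next event is programmed */
    return 0;
}

/* Driver side: register only the states the hardware needs to act on. */
void sketch_timer_init(struct sketch_clockevent *ce)
{
    ce->set_state_shutdown = sketch_shutdown;
    ce->set_state_periodic = sketch_periodic;
    ce->set_state_oneshot = sketch_oneshot;
    ce->timer_running = 0;
    ce->periodic = 0;
}

/* Core side: dispatch to the matching callback; a missing one is a no-op. */
int sketch_switch_state(struct sketch_clockevent *ce, enum sketch_state state)
{
    int (*cb)(struct sketch_clockevent *) = NULL;

    switch (state) {
    case SKETCH_SHUTDOWN: cb = ce->set_state_shutdown; break;
    case SKETCH_PERIODIC: cb = ce->set_state_periodic; break;
    case SKETCH_ONESHOT:  cb = ce->set_state_oneshot;  break;
    }
    return cb ? cb(ce) : 0;
}
```

A driver that had nothing to do in its old mode callback, as the hexagon commit describes, would simply leave all callbacks unset in this model.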
* | | Merge branch 'akpm' (patches from Andrew) | Linus Torvalds | 2015-09-08 | 9 | -33/+73

    Merge second patch-bomb from Andrew Morton:
     "Almost all of the rest of MM. There was an unusually large amount of MM material this time"

    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (141 commits)
      zpool: remove no-op module init/exit
      mm: zbud: constify the zbud_ops
      mm: zpool: constify the zpool_ops
      mm: swap: zswap: maybe_preload & refactoring
      zram: unify error reporting
      zsmalloc: remove null check from destroy_handle_cache()
      zsmalloc: do not take class lock in zs_shrinker_count()
      zsmalloc: use class->pages_per_zspage
      zsmalloc: consider ZS_ALMOST_FULL as migrate source
      zsmalloc: partial page ordering within a fullness_list
      zsmalloc: use shrinker to trigger auto-compaction
      zsmalloc: account the number of compacted pages
      zsmalloc/zram: introduce zs_pool_stats api
      zsmalloc: cosmetic compaction code adjustments
      zsmalloc: introduce zs_can_compact() function
      zsmalloc: always keep per-class stats
      zsmalloc: drop unused variable `nr_to_migrate'
      mm/memblock.c: fix comment in __next_mem_range()
      mm/page_alloc.c: fix type information of memoryless node
      memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node()
      ...
| * | | mm: rename alloc_pages_exact_node() to __alloc_pages_node() | Vlastimil Babka | 2015-09-08 | 5 | -9/+5

    alloc_pages_exact_node() was introduced in commit 6484eb3e2a81 ("page allocator: do not check NUMA node ID when the caller knows the node is valid") as an optimized variant of alloc_pages_node(), that doesn't fall back to current node for nid == NUMA_NO_NODE. Unfortunately the name of the function can easily suggest that the allocation is restricted to the given node and fails otherwise. In truth, the node is only preferred, unless __GFP_THISNODE is passed among the gfp flags.

    The misleading name has led to mistakes in the past, see for example commits 5265047ac301 ("mm, thp: really limit transparent hugepage allocation to local node") and b360edb43f8e ("mm, mempolicy: migrate_to_node should only migrate to node").

    Another issue with the name is that there's a family of alloc_pages_exact*() functions where 'exact' means exact size (instead of page order), which leads to more confusion.

    To prevent further mistakes, this patch effectively renames alloc_pages_exact_node() to __alloc_pages_node() to better convey that it's an optimized variant of alloc_pages_node() not intended for general usage. Both functions get described in comments.

    It has been also considered to really provide a convenience function for allocations restricted to a node, but the major opinion seems to be that __GFP_THISNODE already provides that functionality and we shouldn't duplicate the API needlessly. The number of users would be small anyway.

    Existing callers of alloc_pages_exact_node() are simply converted to call __alloc_pages_node(), with the exception of sba_alloc_coherent() which open-codes the check for NUMA_NO_NODE, so it is converted to use alloc_pages_node() instead. This means it no longer performs some VM_BUG_ON checks, and since the current check for nid in alloc_pages_node() uses a 'nid < 0' comparison (which includes NUMA_NO_NODE), it may hide wrong values which would be previously exposed.

    Both differences will be rectified by the next patch.

    To sum up, this patch makes no functional changes, except temporarily hiding potentially buggy callers. Restricting the checks in alloc_pages_node() is left for the next patch which can in turn expose more existing buggy callers.

    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Robin Holt <robinmholt@gmail.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Christoph Lameter <cl@linux.com>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Greg Thelen <gthelen@google.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: Fenghua Yu <fenghua.yu@intel.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Gleb Natapov <gleb@kernel.org>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Cliff Whickman <cpw@sgi.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
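The difference between the two node-selection behaviours described in this commit can be sketched in a few lines of userspace C. The `sketch_` names and the `nr_nodes` parameter are hypothetical; the `nid < 0` comparison and the meaning of NUMA_NO_NODE come from the commit text, and the value -1 for it is an assumption of this sketch.

```c
#include <assert.h>

#define SKETCH_NUMA_NO_NODE (-1) /* assumed value, for illustration */

/*
 * Models the alloc_pages_node() behaviour described above: any negative
 * nid, including NUMA_NO_NODE, silently falls back to the current node.
 * This is what can hide a buggy caller passing some other negative value.
 */
int sketch_pick_node(int nid, int current_node)
{
    if (nid < 0)
        return current_node;
    return nid;
}

/*
 * Models the __alloc_pages_node() behaviour: the caller promises a valid
 * node, so there is no fallback, only a debug-style sanity check.
 */
int sketch_pick_node_checked(int nid, int nr_nodes)
{
    assert(nid >= 0 && nid < nr_nodes);
    return nid;
}
```

In both cases the returned node is only a preference in the sense the commit describes; restricting an allocation to that node is the job of `__GFP_THISNODE`, which this sketch does not model.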
| * | | x86: use generic early mem copy | Mark Salter | 2015-09-08 | 1 | -21/+1

    The early_ioremap library now has a generic copy_from_early_mem() function. Use the generic copy function for x86 relocate_initrd().

    [akpm@linux-foundation.org: remove MAX_MAP_CHUNK define, per Yinghai Lu]
    Signed-off-by: Mark Salter <msalter@redhat.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Russell King <rmk@arm.linux.org.uk>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | arm64: support initrd outside kernel linear map | Mark Salter | 2015-09-08 | 1 | -0/+62

    The use of mem= could leave part or all of the initrd outside of the kernel linear map. This will lead to an error when unpacking the initrd and a probable failure to boot. This patch catches that situation and relocates the initrd to be fully within the linear map.

    Signed-off-by: Mark Salter <msalter@redhat.com>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Russell King <rmk@arm.linux.org.uk>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | mem-hotplug: handle node hole when initializing numa_meminfo. | Tang Chen | 2015-09-08 | 1 | -2/+4

    When parsing SRAT, all memory ranges are added into numa_meminfo. In numa_init(), before entering numa_cleanup_meminfo(), all possible memory ranges are in numa_meminfo. And numa_cleanup_meminfo() removes all ranges over max_pfn or empty.

    But, this only works if the nodes are continuous. Let's have a look at the following example:

    We have an SRAT like this:
    SRAT: Node 0 PXM 0 [mem 0x00000000-0x5fffffff]
    SRAT: Node 0 PXM 0 [mem 0x100000000-0x1ffffffffff]
    SRAT: Node 1 PXM 1 [mem 0x20000000000-0x3ffffffffff]
    SRAT: Node 4 PXM 2 [mem 0x40000000000-0x5ffffffffff] hotplug
    SRAT: Node 5 PXM 3 [mem 0x60000000000-0x7ffffffffff] hotplug
    SRAT: Node 2 PXM 4 [mem 0x80000000000-0x9ffffffffff] hotplug
    SRAT: Node 3 PXM 5 [mem 0xa0000000000-0xbffffffffff] hotplug
    SRAT: Node 6 PXM 6 [mem 0xc0000000000-0xdffffffffff] hotplug
    SRAT: Node 7 PXM 7 [mem 0xe0000000000-0xfffffffffff] hotplug

    On boot, only node 0,1,2,3 exist.

    And the numa_meminfo will look like this:
    numa_meminfo.nr_blks = 9
    1. on node 0: [0, 60000000]
    2. on node 0: [100000000, 20000000000]
    3. on node 1: [20000000000, 40000000000]
    4. on node 4: [40000000000, 60000000000]
    5. on node 5: [60000000000, 80000000000]
    6. on node 2: [80000000000, a0000000000]
    7. on node 3: [a0000000000, a0800000000]
    8. on node 6: [c0000000000, a0800000000]
    9. on node 7: [e0000000000, a0800000000]

    And numa_cleanup_meminfo() will merge 1 and 2, and remove 8,9 because the end address is over max_pfn, which is a0800000000. But 4 and 5 are not removed because their end addresses are less than max_pfn. But in fact, node 4 and 5 don't exist.

    In a word, numa_cleanup_meminfo() is not able to handle holes between nodes.

    Since memory ranges in node 4 and 5 are in numa_meminfo, in numa_register_memblks(), node 4 and 5 will be mistakenly set to online.

    If you run lscpu, it will show:
    NUMA node0 CPU(s): 0-14,128-142
    NUMA node1 CPU(s): 15-29,143-157
    NUMA node2 CPU(s):
    NUMA node3 CPU(s):
    NUMA node4 CPU(s): 62-76,190-204
    NUMA node5 CPU(s): 78-92,206-220

    In this patch, we use memblock_overlaps_region() to check if ranges in numa_meminfo overlap with ranges in memory_block. Since memory_block contains all available memory at boot time, if they overlap, it means the ranges exist. If not, then remove them from numa_meminfo.

    After this patch, lscpu will show:
    NUMA node0 CPU(s): 0-14,128-142
    NUMA node1 CPU(s): 15-29,143-157
    NUMA node4 CPU(s): 62-76,190-204
    NUMA node5 CPU(s): 78-92,206-220

    Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
    Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Luiz Capitulino <lcapitulino@redhat.com>
    Cc: Xishi Qiu <qiuxishi@huawei.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Vladimir Murzin <vladimir.murzin@arm.com>
    Cc: Fabian Frederick <fabf@skynet.be>
    Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
    Cc: Baoquan He <bhe@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
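The overlap-based filtering described in this commit can be sketched as a small standalone routine. This is not the kernel's numa_cleanup_meminfo() or memblock_overlaps_region(); the struct and function names are hypothetical, and ranges are modelled as half-open intervals [start, end) for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one numa_meminfo-style block: [start, end) on node nid. */
struct sketch_range {
    uint64_t start;
    uint64_t end;
    int nid;
};

/* Two half-open ranges overlap when each starts before the other ends. */
static int sketch_overlaps(const struct sketch_range *a,
                           const struct sketch_range *b)
{
    return a->start < b->end && b->start < a->end;
}

/*
 * Keep only the blocks that overlap at least one range of memory that is
 * actually present at boot; compact the survivors to the front of blks[]
 * and return how many remain.
 */
size_t sketch_filter_present(struct sketch_range *blks, size_t nr_blks,
                             const struct sketch_range *present,
                             size_t nr_present)
{
    size_t kept = 0;

    for (size_t i = 0; i < nr_blks; i++) {
        int found = 0;

        for (size_t j = 0; j < nr_present && !found; j++)
            found = sketch_overlaps(&blks[i], &present[j]);

        if (found)
            blks[kept++] = blks[i];
    }
    return kept;
}
```

A block for a hot-pluggable node that sits in a hole between present nodes overlaps nothing in the present list and is dropped, regardless of whether its end address is below the highest present address.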
| * | | sparc32: do not include swap.h from pgtable_32.h | Michal Hocko | 2015-09-08 | 1 | -1/+1

    "memcg: export struct mem_cgroup" will add includes into linux/memcontrol.h which lead to further header dependency issues as reported by Guenter Roeck:

    In file included from include/linux/highmem.h:7:0,
                     from include/linux/bio.h:23,
                     from include/linux/writeback.h:192,
                     from include/linux/memcontrol.h:30,
                     from include/linux/swap.h:8,
                     from ./arch/sparc/include/asm/pgtable_32.h:17,
                     from ./arch/sparc/include/asm/pgtable.h:6,
                     from arch/sparc/kernel/traps_32.c:23:
    include/linux/mm.h: In function 'is_vmalloc_addr':
    include/linux/mm.h:371:17: error: 'VMALLOC_START' undeclared (first use in this function)
    include/linux/mm.h:371:17: note: each undeclared identifier is reported only once for each function it appears in
    include/linux/mm.h:371:41: error: 'VMALLOC_END' undeclared (first use in this function)
    include/linux/mm.h: In function 'maybe_mkwrite':
    include/linux/mm.h:556:3: error: implicit declaration of function 'pte_mkwrite'

    The issue is that pgtable_32.h depends on swap.h to get swap_entry_t but that goes all the way down to linux/mm.h which wants to have VMALLOC_* which is defined later in pgtable_32.h, though. swap_entry_t is defined in include/mm_types.h so it should be sufficient to include this header without more dependencies.

    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Reported-by: Guenter Roeck <linux@roeck-us.net>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Cc: David Miller <davem@davemloft.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'parisc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux | Linus Torvalds | 2015-09-08 | 5 | -21/+15

    Pull parisc updates from Helge Deller:
     "The most important changes in this patchset are:
      - re-enable 64bit PCI bus addresses which were temporarily disabled for PA-RISC in kernel 4.2
      - fix the 64bit CAS operation in the LWS path which now enables us to enable the 64bit gcc atomic builtins even on 32bit userspace with 64bit kernel
      - fix a long-standing bug which sometimes crashed kernel at bootup while serial interrupt wasn't registered yet"

    * 'parisc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
      parisc: Use platform_device_register_simple("rtc-generic")
      parisc: Drop CONFIG_SMP around update_cr16_clocksource()
      parisc: Use double word condition in 64bit CAS operation
      parisc: Filter out spurious interrupts in PA-RISC irq handler
      parisc: Additionally check for in_atomic() in page fault handler
      PCI,parisc: Enable 64-bit bus addresses on PA-RISC
      parisc: Define ioremap_uc and ioremap_wc
| * | | | parisc: Use platform_device_register_simple("rtc-generic") | Helge Deller | 2015-09-08 | 1 | -10/+4

    Signed-off-by: Helge Deller <deller@gmx.de>
| * | | | parisc: Drop CONFIG_SMP around update_cr16_clocksource() | Helge Deller | 2015-09-08 | 1 | -7/+0

    No need to use CONFIG_SMP around update_cr16_clocksource(). It checks for num_online_cpus() being greater than 1, which is always 1 in UP builds.

    Signed-off-by: Helge Deller <deller@gmx.de>
| * | | | parisc: Use double word condition in 64bit CAS operation | John David Anglin | 2015-09-08 | 1 | -1/+1

    The attached change fixes the condition used in the "sub" instruction. A double word comparison is needed. This fixes the 64-bit LWS CAS operation on 64-bit kernels.

    I can now enable 64-bit atomic support in GCC.

    Cc: <stable@vger.kernel.org>
    Signed-off-by: John David Anglin <dave.anglin>
    Signed-off-by: Helge Deller <deller@gmx.de>
| * | | | parisc: Filter out spurious interrupts in PA-RISC irq handler | Helge Deller | 2015-09-08 | 1 | -2/+7

    When detecting a serial port on newer PA-RISC machines (with iosapic) we have a long way to go to find the right IRQ line, registering it, then registering the serial port and the irq handler for the serial port. During this phase spurious interrupts for the serial port may happen which then crashes the kernel because the action handler might not have been set up yet.

    So, basically it's a race condition between the serial port hardware and the CPU which sets up the necessary fields in the irq structs. The main reason for this race is, that we unmask the serial port irqs too early without having set up everything properly before (which isn't easily possible because we need the IRQ number to register the serial ports).

    This patch is a work-around for this problem. It adds checks to the CPU irq handler to verify if the IRQ action field has been initialized already. If not, we just skip this interrupt (which isn't critical for a serial port at bootup). The real fix would probably involve rewriting all PA-RISC specific IRQ code (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of the irq chips and proper irq enabling along this line.

    This bug has been in the PA-RISC port since the beginning, but the crashes happened very rarely with currently used hardware. But on the latest machine which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900, 1GHz) and which has the largest possible L1 cache size (64MB each), the kernel crashed at every boot because of this race. So, without this patch the machine would currently be unusable.

    For the record, here is the flow logic:
    1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
    2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
    3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
    4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
    5. serial_init_chip() then registers the 8250 port.

    Problems:
    - In step 4 the CPU irq shouldn't have been registered yet, but after step 5
    - If serial irq happens between 4 and 5 have finished, the kernel will crash

    Signed-off-by: Helge Deller <deller@gmx.de>
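The work-around described here, skipping an interrupt whose action has not been registered yet, can be sketched with a tiny userspace model. The `sketch_` types and functions are hypothetical and do not reflect the real PA-RISC irq code or the kernel's irq descriptor layout.

```c
#include <stddef.h>

typedef void (*sketch_handler_t)(int irq);

/* Hypothetical descriptor: action stays NULL until the driver registers. */
struct sketch_irq_desc {
    sketch_handler_t action;
    unsigned long handled;
    unsigned long spurious;
};

/* Example handler that does nothing; stands in for a serial port handler. */
void sketch_noop_handler(int irq)
{
    (void)irq;
}

/*
 * Dispatch one interrupt. If the line was unmasked before its action was
 * set up, count it as spurious and skip it instead of calling through a
 * NULL pointer. Returns 1 if a handler ran, 0 if the interrupt was skipped.
 */
int sketch_do_irq(struct sketch_irq_desc *desc, int irq)
{
    if (desc->action == NULL) {
        desc->spurious++;
        return 0;
    }
    desc->action(irq);
    desc->handled++;
    return 1;
}
```

In terms of the flow above, an interrupt arriving between steps 4 and 5 hits the NULL check and is dropped; once registration completes, the same line is dispatched normally.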
| * | | | parisc: Additionally check for in_atomic() in page fault handler | Helge Deller | 2015-09-08 | 1 | -1/+1

    Craig Estey noticed that we didn't check for in_atomic() in our page fault handler like other architectures. This commit adds this check by using faulthandler_disabled() which includes a check for pagefault_disabled() and in_atomic().

    Reported-by: Craig Estey <cae370@gmail.com>
    Signed-off-by: Helge Deller <deller@gmx.de>
| * | | | parisc: Define ioremap_uc and ioremap_wc | Guenter Roeck | 2015-09-08 | 1 | -0/+2

    Commit 3cc2dac5be3f ("drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC") introduces calls to ioremap_wc and ioremap_uc. This causes build failures with parisc:allmodconfig. Map the missing functions to ioremap_nocache.

    Fixes: 3cc2dac5be3f ("drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC")
    Cc: Luis R. Rodriguez <mcgrof@suse.com>
    Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Helge Deller <deller@gmx.de>
* | | | Merge tag 'rtc-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux | Linus Torvalds | 2015-09-08 | 15 | -81/+41

    Pull RTC updates from Alexandre Belloni:
     "Core:
       - use is_visible() to control sysfs attributes
       - switch wakealarm attribute to DEVICE_ATTR_RW
       - make rtc_does_wakealarm() return boolean
       - properly manage lifetime of dev and cdev in rtc device
       - remove unnecessary device_get() in rtc_device_unregister
       - fix double free in rtc_register_device() error path

      New drivers:
       - NXP LPC24xx
       - Xilinx Zynq MP
       - Dialog DA9062

      Subsystem wide cleanups:
       - fix drivers that consider 0 as a valid IRQ in client->irq
       - Drop (un)likely before IS_ERR(_OR_NULL)
       - drop the remaining owner assignment for i2c_driver and platform_driver
       - module autoload fixes

      Drivers:
       - 88pm80x: add device tree support
       - abx80x: fix RTC write bit
       - ab8500: Add a sentinel to ab85xx_rtc_ids[]
       - armada38x: Align RTC set time procedure with the official errata
       - as3722: correct month value
       - at91sam9: cleanups
       - at91rm9200: get and use slow clock and cleanups
       - bq32k: remove redundant check
       - cmos: century support, proper fix for the spurious wakeup
       - ds1307: cleanups and wakeup irq support
       - ds1374: Remove unused variable
       - ds1685: Use module_platform_driver
       - ds3232: fix WARNING trace in resume function
       - gemini: fix ptr_ret.cocci warnings
       - mt6397: implement suspend/resume
       - omap: support internal and external clock enabling
       - opal: Enable alarms only when opal supports tpo
       - pcf2127: use OFS flag to detect unreliable date and warn the user
       - pl031: fix typo for author email
       - rx8025: huge cleanup and fixes
       - sa1100/pxa: share common code
       - s5m: fix to update ctrl register
       - s3c: fix clocks and wakeup, cleanup
       - sirfsoc: use regmap
       - nvram_read()/nvram_write() functions for cmos, ds1305, ds1307, ds1343, ds1511, ds1553, ds1742, m48t59, rp5c01, stk17ta8, tx4939
       - use rtc_valid_tm() error code when reading date/time instead of 0 for isl12022, pcf2123, pcf2127"

    * tag 'rtc-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (90 commits)
      rtc: abx80x: fix RTC write bit
      rtc: ab8500: Add a sentinel to ab85xx_rtc_ids[]
      rtc: ds1374: Remove unused variable
      rtc: Fix module autoload for OF platform drivers
      rtc: Fix module autoload for rtc-{ab8500,max8997,s5m} drivers
      rtc: omap: Add external clock enabling support
      rtc: omap: Add internal clock enabling support
      ARM: dts: AM437x: Add the internal and external clock nodes for rtc
      rtc: s5m: fix to update ctrl register
      rtc: add xilinx zynqmp rtc driver
      devicetree: bindings: rtc: add bindings for xilinx zynqmp rtc
      rtc: as3722: correct month value
      ARM: config: Switch PXA27x platforms to use PXA RTC driver
      ARM: mmp: remove unused RTC register definitions
      ARM: sa1100: remove unused RTC register definitions
      rtc: sa1100/pxa: convert to run-time register mapping
      ARM: pxa: add memory resource to SA1100 RTC device
      rtc: pxa: convert to use shared sa1100 functions
      rtc: sa1100: prepare to share sa1100_rtc_ops
      rtc: ds3232: fix WARNING trace in resume function
      ...
| * | | | ARM: dts: AM437x: Add the internal and external clock nodes for rtc | Keerthy | 2015-09-05 | 4 | -0/+33

    rtc can either be supplied from internal 32k clock or external crystal generated 32k clock. Internal clock is SOC specific and the external clock is board dependent. Adding the corresponding nodes.

    Signed-off-by: Keerthy <j-keerthy@ti.com>
    Acked-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
| * | | | ARM: config: Switch PXA27x platforms to use PXA RTC driver | Rob Herring | 2015-09-05 | 6 | -6/+6

    With the SA1100 and PXA RTC drivers being mutually exclusive and no longer sharing hardware, PXA27x/PXA3xx platforms must use the PXA RTC driver as the SA1100 platform device is no longer registered.

    This change should be almost transparent to userspace. Former users of pxa-rtc should be aware that 2 RTCs will be available on their kernels, rtc0 being sa1100-rtc and rtc1 being pxa-rtc. Any userspace relying on the fact that rtc0 was pxa-rtc should be fixed.

    As a consequence:
    - the first reboot after the switch will have the wrong time,
    - on dual boot platform where the other OS programs some logic into the sa1100 rtc IP, a lack of fix in userspace, ie. a kernel changing sa1100-rtc thinking it is pxa-rtc could have dire consequence, such as wiping the other OS data partition.

    (Thanks to Robert Jarzmik for help on the above commit text.)

    Signed-off-by: Rob Herring <robh@kernel.org>
    Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Cc: Daniel Mack <daniel@zonque.org>
    Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
    Cc: Sergey Lapin <slapin@ossfans.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Mike Rapoport <mike@compulab.co.il>
    Cc: Philipp Zabel <philipp.zabel@gmail.com>
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
| * | | | ARM: mmp: remove unused RTC register definitions | Rob Herring | 2015-09-05 | 1 | -23/+0

    Now that register definitions have been moved to the driver, regs-rtc.h is no longer used and can be removed.

    Signed-off-by: Rob Herring <robh@kernel.org>
    Cc: Eric Miao <eric.y.miao@gmail.com>
    Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: linux-arm-kernel@lists.infradead.org
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
| * | | | ARM: sa1100: remove unused RTC register definitionsRob Herring2015-09-051-34/+0
Now that register definitions have been moved to the driver, we can remove them from machine specific code.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
| * | | | ARM: pxa: add memory resource to SA1100 RTC deviceRob Herring2015-09-051-16/+2
The drivers for the SA1100 and PXA RTCs are now mutually exclusive, so add the memory resource for the sa1100-rtc device. Since the memory resource is already present in the pxa_rtc_resources, that makes sa1100_rtc_resources and pxa_rtc_resources equivalent, so use pxa_rtc_resources for both devices and remove the duplicate sa1100_rtc_resources.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Daniel Mack <daniel@zonque.org>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
| * | | | rtc: pxa: convert to use shared sa1100 functionsRob Herring2015-09-052-2/+0
Currently, the rtc-sa1100 and rtc-pxa drivers co-exist as rtc-pxa has a superset of functionality. Having 2 drivers sharing the same memory resource is not allowed by the driver model if resources are properly declared. This problem was avoided by not adding memory resources to the SA1100 RTC driver, but that prevents clean-up of the SA1100 driver.

This commit converts the PXA RTC to use the exported SA1100 RTC functions.

Now the sa1100-rtc and pxa-rtc devices are mutually exclusive, so we must remove the sa1100-rtc from pxa27x and pxa3xx.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Daniel Mack <daniel@zonque.org>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: rtc-linux@googlegroups.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
* | | | | Merge tag 'libnvdimm-for-4.3' of ↵Linus Torvalds2015-09-0826-191/+197
|\ \ \ \ \
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has appeared in a linux-next release. The changes outside of the typical drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and the introduction of ZONE_DEVICE + devm_memremap_pages().

Summary:

 - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic mechanism for adding device-driver-discovered memory regions to the kernel's direct map. This facility is used by the pmem driver to enable pfn_to_page() operations on the page frames returned by DAX ('direct_access' in 'struct block_device_operations'). For now, the 'memmap' allocation for these "device" pages comes from "System RAM". Support for allocating the memmap from device memory will arrive in a later kernel.

 - Introduce memremap() to replace usages of ioremap_cache() and ioremap_wt(). memremap() drops the __iomem annotation for these mappings to memory that do not have i/o side effects. The replacement of ioremap_cache() with memremap() is limited to the pmem driver to ease merging the api change in v4.3. Completion of the conversion is targeted for v4.4.

 - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem driver, update the VFS DAX implementation and PMEM api to provide persistence guarantees for kernel operations on a DAX mapping.

 - Convert the ACPI NFIT 'BLK' driver to map the block apertures as cacheable to improve performance.

 - Miscellaneous updates and fixes to libnvdimm including support for issuing "address range scrub" commands, clarifying the optimal 'sector size' of pmem devices, a clarification of the usage of the ACPI '_STA' (status) property for DIMM devices, and other minor fixes"

* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
  libnvdimm, pmem: direct map legacy pmem by default
  libnvdimm, pmem: 'struct page' for pmem
  libnvdimm, pfn: 'struct page' provider infrastructure
  x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
  add devm_memremap_pages
  mm: ZONE_DEVICE for "device memory"
  mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
  dax: drop size parameter to ->direct_access()
  nd_blk: change aperture mapping from WC to WB
  nvdimm: change to use generic kvfree()
  pmem, dax: have direct_access use __pmem annotation
  dax: update I/O path to do proper PMEM flushing
  pmem: add copy_from_iter_pmem() and clear_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: move x86 PMEM API to new pmem.h header
  libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
  pmem: switch to devm_ allocations
  devres: add devm_memremap
  libnvdimm, btt: write and validate parent_uuid
  ...
| * | | | | x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WBDan Williams2015-08-272-9/+2
Given that a write-back (WB) mapping plus non-temporal stores is expected to be the most efficient way to access PMEM, update the definition of ARCH_HAS_PMEM_API to imply arch support for WB-mapped-PMEM. This is needed as a pre-requisite for adding PMEM to the direct map and mapping it with struct page.

The above clarification for X86_64 means that memcpy_to_pmem() is permitted to use the non-temporal arch_memcpy_to_pmem() rather than needlessly fall back to default_memcpy_to_pmem() when the pcommit instruction is not available. When arch_memcpy_to_pmem() is not guaranteed to flush writes out of cache, i.e. on older X86_32 implementations where non-temporal stores may just dirty cache, ARCH_HAS_PMEM_API is simply disabled.

The default fall back for persistent memory handling remains. Namely, map it with the WT (write-through) cache-type and hope for the best.

arch_has_pmem_api() is updated to only indicate whether the arch provides the proper helpers to meet the minimum "writes are visible outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()". Code that cares whether wmb_pmem() actually flushes writes to pmem must now call arch_has_wmb_pmem() directly.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
[hch: set ARCH_HAS_PMEM_API=n on x86_32]
Reviewed-by: Christoph Hellwig <hch@lst.de>
[toshi: x86_32 compile fixes]
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | mm: ZONE_DEVICE for "device memory"Dan Williams2015-08-277-12/+13
While pmem is usable as a block device or via DAX mappings to userspace there are several usage scenarios that can not target pmem due to its lack of struct page coverage. In preparation for "hot plugging" pmem into the vmemmap add ZONE_DEVICE as a new zone to tag these pages separately from the ones that are subject to standard page allocations. Importantly "device memory" can be removed at will by userspace unbinding the driver of the device.

Having a separate zone prevents allocation and otherwise marks these pages that are distinct from typical uniform memory. Device memory has different lifetime and performance characteristics than RAM. However, since we have run out of ZONES_SHIFT bits this functionality currently depends on sacrificing ZONE_DMA.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Jerome Glisse <j.glisse@gmail.com>
[hch: various simplifications in the arch interface]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.hChristoph Hellwig2015-08-273-18/+0
Three architectures already define these, and we'll need them generically soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | dax: drop size parameter to ->direct_access()Dan Williams2015-08-271-1/+1
None of the implementations currently use it. The common bdev_direct_access() entry point handles all the size checks before calling ->direct_access().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | Merge branch 'pmem-api' into libnvdimm-for-nextDan Williams2015-08-2713-86/+175
| |\ \ \ \ \
| | * | | | | nd_blk: change aperture mapping from WC to WBRoss Zwisler2015-08-274-2/+5
This should result in a pretty sizeable performance gain for reads. For rough comparison I did some simple read testing using PMEM to compare reads of write combining (WC) mappings vs write-back (WB). This was done on a random lab machine.

PMEM reads from a write combining mapping:

  # dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
  100000+0 records in
  100000+0 records out
  409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s

PMEM reads from a write-back mapping:

  # dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
  1000000+0 records in
  1000000+0 records out
  4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s

To be able to safely support a write-back aperture I needed to add support for the "read flush" _DSM flag, as outlined in the DSM spec:

  http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag tells the ND BLK driver that it needs to flush the cache lines associated with the aperture after the aperture is moved but before any new data is read. This ensures that any stale cache lines from the previous contents of the aperture will be discarded from the processor cache, and the new data will be read properly from the DIMM. We know that the cache lines are clean and will be discarded without any writeback because either a) the previous aperture operation was a read, and we never modified the contents of the aperture, or b) the previous aperture operation was a write and we must have written back the dirtied contents of the aperture to the DIMM before the I/O was completed.

In order to add support for the "read flush" flag I needed to add a generic routine to invalidate cache lines, mmio_flush_range(). This is protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently only supported on x86.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
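The throughput figures that dd reports in the message above can be rechecked by hand. The following is only a small arithmetic sketch, not part of the commit; the helper name `throughput_mb_s` is made up for illustration, and it assumes dd's decimal units (1 MB = 10^6 bytes), which matches the "409600000 bytes (410 MB)" line.

```python
def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Return throughput in decimal megabytes per second (10**6 bytes)."""
    return total_bytes / seconds / 1e6


# Figures copied from the two dd runs quoted in the commit message.
wc = throughput_mb_s(4096 * 100_000, 9.2855)      # write-combining mapping
wb = throughput_mb_s(4096 * 1_000_000, 3.44034)   # write-back mapping

print(f"WC: {wc:.1f} MB/s")
print(f"WB: {wb:.1f} MB/s")
print(f"WB is roughly {wb / wc:.0f}x faster for reads")
```

Note that the two runs use different counts (100000 vs 1000000 blocks), so only the rates, not the elapsed times, are directly comparable.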
| | * | | | | pmem, dax: have direct_access use __pmem annotationRoss Zwisler2015-08-201-3/+4
Update the annotation for the kaddr pointer returned by direct_access() so that it is a __pmem pointer. This is consistent with the PMEM driver and with how this direct_access() pointer is used in the DAX code.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * | | | | pmem: add copy_from_iter_pmem() and clear_pmem()Ross Zwisler2015-08-201-0/+75
Add support for two new PMEM APIs, copy_from_iter_pmem() and clear_pmem(). copy_from_iter_pmem() is used to copy data from an iterator into a PMEM buffer. clear_pmem() zeros a PMEM memory range.

Both of these new APIs must be explicitly ordered using a wmb_pmem() function call and are implemented in such a way that the wmb_pmem() will make the stores to PMEM durable. Because both APIs are unordered they can be called as needed without introducing any unwanted memory barriers.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
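The ordering contract described above (unordered stores that only become durable at an explicit barrier) can be pictured with a toy model. This is a hedged user-space sketch, not kernel code: the class `PmemModel` and its `pending`/`durable` fields are invented for illustration, and the method names merely echo the kernel helpers named in the commit message.

```python
class PmemModel:
    """Toy model: stores stay pending until a barrier commits them."""

    def __init__(self, size: int) -> None:
        self.durable = bytearray(size)  # contents that would survive power loss
        self.pending = {}               # offset -> byte value, not yet durable

    def copy_to_pmem(self, offset: int, data: bytes) -> None:
        # Unordered store: queued, with no durability guarantee yet.
        for i, byte in enumerate(data):
            self.pending[offset + i] = byte

    def clear_pmem(self, offset: int, length: int) -> None:
        # Zeroing is modelled as an unordered store of zero bytes.
        self.copy_to_pmem(offset, bytes(length))

    def wmb_pmem(self) -> None:
        # Barrier: every store issued so far becomes durable.
        for offset, byte in self.pending.items():
            self.durable[offset] = byte
        self.pending.clear()


pmem = PmemModel(16)
pmem.copy_to_pmem(0, b"abcd")
pmem.clear_pmem(2, 2)
before = bytes(pmem.durable[:4])  # nothing is durable before the barrier
pmem.wmb_pmem()
after = bytes(pmem.durable[:4])   # both operations are durable after it
```

In this model several copy and clear calls share a single barrier, which is the benefit the commit message attributes to keeping the two APIs unordered.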
| | * | | | | pmem, x86: clean up conditional pmem includesRoss Zwisler2015-08-201-11/+2
Prior to this change x86_64 used the pmem defines in arch/x86/include/asm/pmem.h, and UM used the default ones at the top of include/linux/pmem.h. The inclusion or exclusion in linux/pmem.h was controlled by CONFIG_ARCH_HAS_PMEM_API, but the ones in asm/pmem.h were controlled by ARCH_HAS_NOCACHE_UACCESS.

Instead, control them both with CONFIG_ARCH_HAS_PMEM_API so that it's clear that they are related and we don't run into the possibility where they are both included or excluded.

Also remove a bunch of stale function prototypes meant for UM in asm/pmem.h - these just conflicted with the inline defaults in linux/pmem.h and gave compile errors.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * | | | | pmem: remove layer when calling arch_has_wmb_pmem()Ross Zwisler2015-08-201-1/+1
Prior to this change arch_has_wmb_pmem() was only called by arch_has_pmem_api(). Both arch_has_wmb_pmem() and arch_has_pmem_api() checked to make sure that CONFIG_ARCH_HAS_PMEM_API was enabled.

Instead, remove the old arch_has_wmb_pmem() wrapper to be rid of one extra layer of indirection and the redundant CONFIG_ARCH_HAS_PMEM_API check. Rename __arch_has_wmb_pmem() to arch_has_wmb_pmem() since we no longer have a wrapper, and just have arch_has_pmem_api() call the architecture specific arch_has_wmb_pmem() directly.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * | | | | pmem, x86: move x86 PMEM API to new pmem.h headerRoss Zwisler2015-08-202-71/+92
Move the x86 PMEM API implementation out of asm/cacheflush.h and into its own header asm/pmem.h. This will allow members of the PMEM API to be more easily identified on this and other architectures.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * | | | | pmem: convert to generic memremapDan Williams2015-08-141-5/+1
Kill arch_memremap_pmem() and just let the architecture specify the flags to be passed to memremap(). Default to write-through.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * | | | | arch: introduce memremap()Dan Williams2015-08-143-0/+3
Existing users of ioremap_cache() are mapping memory that is known in advance to not have i/o side effects. These users are forced to cast away the __iomem annotation, or otherwise neglect to fix the sparse errors thrown when dereferencing pointers to this memory. Provide memremap() as a non __iomem annotated ioremap_*() in the case when ioremap is otherwise a pointer to cacheable memory. Empirically, ioremap_<cacheable-type>() call sites are seeking memory-like semantics (e.g. speculative reads, and prefetching permitted).

memremap() is a break from the ioremap implementation pattern of adding a new memremap_<type>() for each mapping type and having silent compatibility fall backs. Instead, the implementation defines flags that are passed to the central memremap(), and if a mapping type is not supported by an arch, memremap() returns NULL.

We introduce a memremap prototype as a trivial wrapper of ioremap_cache() and ioremap_wt(). Later, once all ioremap_cache() and ioremap_wt() usage has been removed from drivers we teach archs to implement arch_memremap() with the ability to strictly enforce the mapping type.

Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
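The design point above, one central entry point that takes mapping-type flags and fails rather than silently falling back, can be sketched as a small model. This is an illustration under stated assumptions, not the kernel implementation: `memremap_model`, `ioremap_cache_model`, the flag values, and the string "mappings" are all hypothetical stand-ins, and trying the requested types in a fixed order is this sketch's own choice.

```python
from typing import Callable, Dict, Optional

# Flag values chosen for this model only.
MEMREMAP_WB = 1 << 0  # write-back mapping requested
MEMREMAP_WT = 1 << 1  # write-through mapping requested

Mapper = Callable[[int, int], str]


def memremap_model(offset: int, size: int, flags: int,
                   arch_mappers: Dict[int, Mapper]) -> Optional[str]:
    """Return a mapping for the first requested type the arch supports.

    Returns None (standing in for NULL) when no requested type is
    supported, instead of quietly substituting another mapping type.
    """
    for flag in (MEMREMAP_WB, MEMREMAP_WT):
        if flags & flag and flag in arch_mappers:
            return arch_mappers[flag](offset, size)
    return None


def ioremap_cache_model(offset: int, size: int) -> str:
    # Stand-in for a cacheable mapping; a string keeps the model simple.
    return f"wb@{offset:#x}+{size:#x}"


# A hypothetical arch that only provides write-back mappings.
arch_with_wb_only = {MEMREMAP_WB: ioremap_cache_model}

wb_mapping = memremap_model(0x1000, 0x100, MEMREMAP_WB, arch_with_wb_only)
wt_mapping = memremap_model(0x1000, 0x100, MEMREMAP_WT, arch_with_wb_only)
```

Here the write-through request yields None, so a caller has to handle the unsupported case explicitly, which is the contrast with the "silent compatibility fall backs" the message mentions.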
| | * | | | | cleanup IORESOURCE_CACHEABLE vs ioremap()Dan Williams2015-08-103-4/+3
Quoting Arnd:

    I was thinking the opposite approach and basically removing all uses of IORESOURCE_CACHEABLE from the kernel. There are only a handful of them, and we can probably replace them all with hardcoded ioremap_cached() calls in the cases they are actually useful.

All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of ioremap_nocache() if the resource is cacheable, however ioremap() is uncached by default. Clearly none of the existing usages care about the cacheability. Particularly devm_ioremap_resource() never worked as advertised since it always fell back to plain ioremap().

Clean this up as the new direction we want is to convert ioremap_<type>() usages to memremap(..., flags).

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * | | | | arch, drivers: don't include <asm/io.h> directly, use <linux/io.h> insteadDan Williams2015-08-102-2/+2
Preparation for uniform definition of ioremap, ioremap_wc, ioremap_wt, and ioremap_cache, tree-wide.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | | libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate optionDan Williams2015-08-194-74/+15
| |/ / / / /
We currently register a platform device for e820 type-12 memory and register a nvdimm bus beneath it. Registering the platform device triggers the device-core machinery to probe for a driver, but that search currently comes up empty. Building the nvdimm-bus registration into the e820_pmem platform device registration in this way forces libnvdimm to be built-in. Instead, convert the built-in portion of CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the rest of the logic to the driver for e820_pmem, for the following reasons:

1/ Letting e820_pmem support be a module allows building and testing libnvdimm.ko changes without rebooting

2/ All the normal policy around modules can be applied to e820_pmem (unbind to disable and/or blacklisting the module from loading by default)

3/ Moving the driver to a generic location and converting it to scan "iomem_resource" rather than "e820.map" means any other architecture can take advantage of this simple nvdimm resource discovery mechanism by registering a resource named "Persistent Memory (legacy)"

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>