summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2014-01-10The memblock current limit value is used to limit early bootSantosh Shilimkar2-3/+3
memory allocations below max low memory address by default, as the kernel can access only to the low memory. Hence, set memblock current limit value to the max mapped low memory address instead of max mapped memory address. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Tejun Heo <tj@kernel.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggableTang Chen1-0/+44
At very early time, the kernel have to use some memory such as loading the kernel image. We cannot prevent this anyway. So any node the kernel resides in should be un-hotpluggable. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10acpi-numa-mem_hotplug-mark-hotpluggable-memory-in-memblock-checkpatch-fixesAndrew Morton1-1/+1
WARNING: line over 80 characters #65: FILE: arch/x86/mm/srat.c:187: + (unsigned long long) start, (unsigned long long) end - 1); total: 0 errors, 1 warnings, 19 lines checked ./patches/acpi-numa-mem_hotplug-mark-hotpluggable-memory-in-memblock.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10acpi, numa, mem_hotplug: mark hotpluggable memory in memblockTang Chen2-0/+7
When parsing SRAT, we know that which memory area is hotpluggable. So we invoke function memblock_mark_hotplug() introduced by previous patch to mark hotpluggable memory in memblock. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10memblock-make-memblock_set_node-support-different-memblock_type-fixStephen Rothwell1-1/+1
After merging the final tree, today's linux-next build (powerpc allnoconfig) failed like this: arch/powerpc/mm/mem.c: In function 'do_init_bootmem': arch/powerpc/mm/mem.c:212:49: error: 'memblock_memory' undeclared (first us= e in this function) memblock_set_node(0, (phys_addr_t)ULLONG_MAX, &memblock_memory, 0); ^ Caused by commit 3a543893d46a ("memblock: make memblock_set_node() support different memblock_type") from the apm-current tree. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10memblock: make memblock_set_node() support different memblock_typeTang Chen10-15/+23
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10mm-show_mem-remove-show_mem_filter_page_count-fixAndrew Morton1-3/+0
fix parisc Cc: David Rientjes <rientjes@google.com> Cc: James Bottomley <jejb@parisc-linux.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10mm, show_mem: remove SHOW_MEM_FILTER_PAGE_COUNTMel Gorman6-179/+68
Commit 4b59e6c4 ("mm, show_mem: suppress page counts in non-blockable contexts") introduced SHOW_MEM_FILTER_PAGE_COUNT to suppress PFN walks on large memory machines. Commit c78e9363 ("mm: do not walk all of system memory during show_mem") avoided a PFN walk in the generic show_mem helper which removes the requirement for SHOW_MEM_FILTER_PAGE_COUNT in that case. This patch removes PFN walkers from the arch-specific implementations that report on a per-node or per-zone granularity. ARM and unicore32 still do a PFN walk as they report memory usage on each bank which is a much finer granularity where the debugging information may still be of use. As the remaining arches doing PFN walks have relatively small amounts of memory, this patch simply removes SHOW_MEM_FILTER_PAGE_COUNT. Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David Rientjes <rientjes@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10microblaze: extable: sort the exception table at build timeMichal Simek1-0/+1
Sort the exception table at build-time rather than during boot. Microblaze is the same case as AARCH64 that's why EM_MICROBLAZE conditional check was added to allow cross-compilation on machines which are not running the latest libc-dev. Inspired by AARCH64 commit adace89562c7a9645b ("arm64: extable: sort the exception table at build time"). Signed-off-by: Michal Simek <michal.simek@xilinx.com> Acked-by: David Daney <david.daney@cavium.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10arm: move arm_dma_limit to setup_dma_zoneVladimir Murzin2-4/+4
Since 4dcfa600 ("ARM: DMA-API: better handing of DMA masks for coherent allocations") arm_dma_limit_pfn has almost substituted the arm_dma_limit. The remaining user is dma_contiguous_reserve(). It is also referenced in setup_dma_zone() to calculate arm_dma_limit_pfn. Kill the global arm_dma_limit and equip setup_zone_dma with the local one. Signed-off-by: Vladimir Murzin <murzin.v@gmail.com> Reported-by: Vassili Karpov <av1474@comtv.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10mm: x86: revisit tlb_flushall_shift tuning for page flushes except on IvyBridgeMel Gorman2-11/+4
There was a large ebizzy performance regression that was bisected to commit 611ae8e3 (x86/tlb: enable tlb flush range support for x86). The problem was related to the tlb_flushall_shift tuning for IvyBridge which was altered. The problem is that it is not clear if the tuning values for each CPU family is correct as the methodology used to tune the values is unclear. This patch uses a conservative tlb_flushall_shift value for all CPU families except IvyBridge so the decision can be revisited if any regression is found as a result of this change. IvyBridge is an exception as testing with one methodology determined that the value of 2 is acceptable. Details are in the changelog for the patch "x86: mm: Change tlb_flushall_shift for IvyBridge". One important aspect of this to watch out for is Xen. The original commit log mentioned large performance gains on Xen. It's possible Xen is more sensitive to this value if it flushes small ranges of pages more frequently than workloads on bare metal typically do. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Shi <alex.shi@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H Peter Anvin <hpa@zytor.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10x86: mm: Change tlb_flushall_shift for IvyBridgeMel Gorman1-1/+1
There was a large performance regression that was bisected to commit 611ae8e3 (x86/tlb: enable tlb flush range support for x86). This patch simply changes the default balance point between a local and global flush for IvyBridge. In the interest of allowing the tests to be reproduced, this patch was tested using mmtests 0.15 with the following configurations configs/config-global-dhp__tlbflush-performance configs/config-global-dhp__scheduler-performance configs/config-global-dhp__network-performance Results are from two machines Ivybridge 4 threads: Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz Ivybridge 8 threads: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz Page fault microbenchmark showed nothing interesting. Ebizzy was configured to run multiple iterations and threads. Thread counts ranged from 1 to NR_CPUS*2. For each thread count, it ran 100 iterations and each iteration lasted 10 seconds. Ivybridge 4 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 6395.44 ( 0.00%) 6789.09 ( 6.16%) Mean 2 7012.85 ( 0.00%) 8052.16 ( 14.82%) Mean 3 6403.04 ( 0.00%) 6973.74 ( 8.91%) Mean 4 6135.32 ( 0.00%) 6582.33 ( 7.29%) Mean 5 6095.69 ( 0.00%) 6526.68 ( 7.07%) Mean 6 6114.33 ( 0.00%) 6416.64 ( 4.94%) Mean 7 6085.10 ( 0.00%) 6448.51 ( 5.97%) Mean 8 6120.62 ( 0.00%) 6462.97 ( 5.59%) Ivybridge 8 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 7336.65 ( 0.00%) 7787.02 ( 6.14%) Mean 2 8218.41 ( 0.00%) 9484.13 ( 15.40%) Mean 3 7973.62 ( 0.00%) 8922.01 ( 11.89%) Mean 4 7798.33 ( 0.00%) 8567.03 ( 9.86%) Mean 5 7158.72 ( 0.00%) 8214.23 ( 14.74%) Mean 6 6852.27 ( 0.00%) 7952.45 ( 16.06%) Mean 7 6774.65 ( 0.00%) 7536.35 ( 11.24%) Mean 8 6510.50 ( 0.00%) 6894.05 ( 5.89%) Mean 12 6182.90 ( 0.00%) 6661.29 ( 7.74%) Mean 16 6100.09 ( 0.00%) 6608.69 ( 8.34%) Ebizzy hits the worst case scenario for TLB range flushing every time and it shows for these Ivybridge CPUs at least that the default choice is a poor on. The patch addresses the problem. Next was a tlbflush microbenchmark written by Alex Shi at http://marc.info/?l=linux-kernel&m=133727348217113 . It measures access costs while the TLB is being flushed. The expectation is that if there are always full TLB flushes that the benchmark would suffer and it benefits from range flushing There are 320 iterations of the test per thread count. The number of entries is randomly selected with a min of 1 and max of 512. To ensure a reasonably even spread of entries, the full range is broken up into 8 sections and a random number selected within that section. iteration 1, random number between 0-64 iteration 2, random number between 64-128 etc This is still a very weak methodology. When you do not know what are typical ranges, random is a reasonable choice but it can be easily argued that the opimisation was for smaller ranges and an even spread is not representative of any workload that matters. To improve this, we'd need to know the probability distribution of TLB flush range sizes for a set of workloads that are considered "common", build a synthetic trace and feed that into this benchmark. Even that is not perfect because it would not account for the time between flushes but there are limits of what can be reasonably done and still be doing something useful. If a representative synthetic trace is provided then this benchmark could be revisited and the shift values retuned. Ivybridge 4 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 10.50 ( 0.00%) 10.50 ( 0.03%) Mean 2 17.59 ( 0.00%) 17.18 ( 2.34%) Mean 3 22.98 ( 0.00%) 21.74 ( 5.41%) Mean 5 47.13 ( 0.00%) 46.23 ( 1.92%) Mean 8 43.30 ( 0.00%) 42.56 ( 1.72%) Ivybridge 8 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 9.45 ( 0.00%) 9.36 ( 0.93%) Mean 2 9.37 ( 0.00%) 9.70 ( -3.54%) Mean 3 9.36 ( 0.00%) 9.29 ( 0.70%) Mean 5 14.49 ( 0.00%) 15.04 ( -3.75%) Mean 8 41.08 ( 0.00%) 38.73 ( 5.71%) Mean 13 32.04 ( 0.00%) 31.24 ( 2.49%) Mean 16 40.05 ( 0.00%) 39.04 ( 2.51%) For both CPUs, average access time is reduced which is good as this is the benchmark that was used to tune the shift values in the first place albeit it is now known *how* the benchmark was used. The scheduler benchmarks were somewhat inconclusive. They showed gains and losses and makes me reconsider how stable those benchmarks really are or if something else might be interfering with the test results recently. Network benchmarks were inconclusive. Almost all results were flat except for netperf-udp tests on the 4 thread machine. These results were unstable and showed large variations between reboots. It is unknown if this is a recent problems but I've noticed before that netperf-udp results tend to vary. Based on these results, changing the default for Ivybridge seems like a logical choice. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Alex Shi <alex.shi@linaro.org> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H Peter Anvin <hpa@zytor.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10x86: mm: eliminate redundant page table walk during TLB range flushingMel Gorman1-27/+1
When choosing between doing an address space or ranged flush, the x86 implementation of flush_tlb_mm_range takes into account whether there are any large pages in the range. A per-page flush typically requires fewer entries than would covered by a single large page and the check is redundant. There is one potential exception. THP migration flushes single THP entries and it conceivably would benefit from flushing a single entry instead of the mm. However, this flush is after a THP allocation, copy and page table update potentially with any other threads serialised behind it. In comparison to that, the flush is noise. It makes more sense to optimise balancing to require fewer flushes than to optimise the flush itself. This patch deletes the redundant huge page check. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Shi <alex.shi@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H Peter Anvin <hpa@zytor.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10x86: mm: clean up inconsistencies when flushing TLB rangesMel Gorman1-6/+6
NR_TLB_LOCAL_FLUSH_ALL is not always accounted for correctly and the comparison with total_vm is done before taking tlb_flushall_shift into account. Clean it up. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Alex Shi <alex.shi@linaro.org> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H Peter Anvin <hpa@zytor.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10x86: mm: account for TLB flushes only when debuggingMel Gorman3-12/+12
Bisection between 3.11 and 3.12 fingered commit 9824cf97 (mm: vmstats: tlb flush counters). The counters are undeniably useful but how often do we really need to debug TLB flush related issues? It does not justify taking the penalty everywhere so make it a debugging option. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Shi <alex.shi@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H Peter Anvin <hpa@zytor.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10arch-x86-mm-sratc-skip-numa_no_node-while-parsing-slit-fixToshi Kani1-1/+5
Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10arch/x86/mm/srat.c: skip NUMA_NO_NODE while parsing SLITToshi Kani1-2/+8
When ACPI SLIT table has an I/O locality (i.e. a locality unique to an I/O device), numa_set_distance() emits the warning message below. NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10 acpi_numa_slit_init() calls numa_set_distance() with pxm_to_node(), which assumes that all localities have been parsed with SRAT previously. SRAT does not list I/O localities, where as SLIT lists all localities including I/Os. Hence, pxm_to_node() returns NUMA_NO_NODE (-1) for an I/O locality. I/O localities are not supported and are ignored today, but emitting such warning message leads unnecessary confusion. Change acpi_numa_slit_init() to avoid calling numa_set_distance() with NUMA_NO_NODE. Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10MIPS: fix blast_icache32 on loongson2Aaro Koskinen2-21/+29
Commit 14bd8c082016cd1f6 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over arch/mips") failed to add Loongson2 specific blast_icache32 functions. Fix that. The patch fixes the following crash seen with 3.13-rc1: [ 3.652000] Reserved instruction in kernel code[#1]: [...] [ 3.652000] Call Trace: [ 3.652000] [<ffffffff802223c8>] blast_icache32_page+0x8/0xb0 [ 3.652000] [<ffffffff80222c34>] r4k_flush_cache_page+0x19c/0x200 [ 3.652000] [<ffffffff802d17e4>] do_wp_page.isra.97+0x47c/0xe08 [ 3.652000] [<ffffffff802d51b0>] handle_mm_fault+0x938/0x1118 [ 3.652000] [<ffffffff8021bd40>] __do_page_fault+0x140/0x540 [ 3.652000] [<ffffffff80206be4>] resume_userspace_check+0x0/0x10 [ 3.652000] [ 3.652000] Code: 00200825 64834000 00200825 <bc900000> bc900020 bc900040 bc900060 bc900080 bc9000a0 [ 3.656000] ---[ end trace cd8a48f722f5c5f7 ]--- Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Acked-by: John Crispin <blogic@openwrt.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-10MIPS: Fix case mismatch in local_r4k_flush_icache_range()Huacai Chen3-7/+7
Currently, Loongson-2 call protected_blast_icache_range() and others call protected_loongson23_blast_icache_range(), but I think the correct behavior should be the opposite. BTW, Loongson-3's cache-ops is compatible with MIPS64, but not compatible with Loongson-2. So, rename xxx_loongson23_yyy things to xxx_loongson2_yyy. The patch fixes early boot hang with 3.13-rc1, introduced in the commit 14bd8c082016cd1f67 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over arch/mips"). Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Acked-by: John Crispin <blogic@openwrt.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2014-01-08parisc: Ensure full cache coherency for kmap/kunmapJohn David Anglin3-46/+6
Helge Deller noted a few weeks ago problems with the AIO support on parisc. This change is the result of numerous iterations on how best to deal with this problem. The solution adopted here is to provide full cache coherency in a uniform manner on all parisc systems. This involves calling flush_dcache_page() on kmap operations and flush_kernel_dcache_page() on kunmap operations. As a result, the copy_user_page() and clear_user_page() functions can be removed and the overall code is simpler. The change ensures that both userspace and kernel aliases to a mapped page are invalidated and flushed. This is necessary for the correct operation of PA8800 and PA8900 based systems which do not support inequivalent aliases. With this change, I have observed no cache related issues on c8000 and rp3440. It is now possible for example to do kernel builds with "-j64" on four way systems. On systems using XFS file systems, the patch recently posted by Mikulas Patocka to "fix crash using XFS on loopback" is needed to avoid a hang caused by an uninitialized lock passed to flush_dcache_page() in the page struct. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Helge Deller <deller@gmx.de>
2014-01-06Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds8-11/+19
Pull ARM fixes from Russell King: "Another set of small fixes for ARM, covering various areas. Laura fixed a long standing issue with virt_addr_valid() failing to handle holes in memory. Steve found a problem with dcache flushing for compound pages. I fixed another bug in footbridge stuff causing time to tick slowly, and also a problem with the AES code which can cause linker errors. A patch from Rob which fixes Xen problems induced by a lack of consistency in our naming of ioremap_cache() - which thankfully has very few users" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 7933/1: rename ioremap_cached to ioremap_cache ARM: fix "bad mode in ... handler" message for undefined instructions CRYPTO: Fix more AES build errors ARM: 7931/1: Correct virt_addr_valid ARM: 7923/1: mm: fix dcache flush logic for compound high pages ARM: fix footbridge clockevent device
2014-01-05ARM: 7933/1: rename ioremap_cached to ioremap_cacheRob Herring2-2/+2
ioremap_cache is more aligned with other architectures. There are only 2 users of this in the kernel: pxa2xx-flash and Xen. This fixes Xen build failures on arm64: drivers/tty/hvc/hvc_xen.c:233:2: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration] drivers/xen/grant-table.c:1174:3: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration] drivers/xen/xenbus/xenbus_probe.c:778:4: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration] Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-05ARM: fix "bad mode in ... handler" message for undefined instructionsRussell King1-1/+7
The array was missing the final entry for the undefined instruction exception handler; this commit adds it. Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-05CRYPTO: Fix more AES build errorsRussell King2-2/+2
Building a multi-arch kernel results in: arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt': sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt' arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt': sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt' arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt': sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks' arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt': sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt' This code is already runtime-conditional on NEON being supported, so there's no point compiling it out depending on the minimum build architecture. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds5-7/+8
Pull sparc bugfixes from David Miller: 1) Missing include can lead to build failure, from Kirill Tkhai. 2) Use dev_is_pci() where applicable, from Yijing Wang. 3) Enable irqs after we enable preemption in cpu startup path, from Kirill Tkhai. 4) Revert a __copy_{to,from}_user_inatomic change that broke iov_iter_copy_from_user_atomic() and thus several tests in xfstests and LTP. From Dave Kleikamp. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: Revert "sparc64: Fix __copy_{to,from}_user_inatomic defines." sparc64: smp_callin: Enable irqs after preemption is disabled sparc/PCI: Use dev_is_pci() to identify PCI devices sparc64: Fix build regression
2014-01-04Revert "sparc64: Fix __copy_{to,from}_user_inatomic defines."Dave Kleikamp1-2/+2
This reverts commit 145e1c0023585e0e8f6df22316308ec61c5066b2. This commit broke the behavior of __copy_from_user_inatomic when it is only partially successful. Instead of returning the number of bytes not copied, it now returns 1. This translates to the wrong value being returned by iov_iter_copy_from_user_atomic. xfstests generic/246 and LTP writev01 both fail on btrfs and nfs because of this. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04sparc64: smp_callin: Enable irqs after preemption is disabledKirill Tkhai1-1/+2
Most of other architectures have below suggested order. So lets do the same to fit generic idle loop scheme better. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04sparc/PCI: Use dev_is_pci() to identify PCI devicesYijing Wang2-4/+3
Use dev_is_pci() instead of checking bus type directly. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-6/+5
Pull kvm bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: Unconditionally uninit the MMU on nested vmexit KVM: x86: Fix APIC map calculation after re-enabling
2014-01-02Merge branch 'akpm' (incoming from Andrew)Linus Torvalds1-0/+5
Merge patches from Andrew Morton: "Ten fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test() mm/memory-failure.c: transfer page count from head page to tail page after split thp MAINTAINERS: set up proper record for Xilinx Zynq mm: remove bogus warning in copy_huge_pmd() memcg: fix memcg_size() calculation mm: fix use-after-free in sys_remap_file_pages mm: munlock: fix deadlock in __munlock_pagevec() mm: munlock: fix a bug where THP tail page is encountered
2014-01-02sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to ↵Nobuhiro Iwamatsu1-0/+5
sh_ksyms_32.c Min_low_pfn and max_low_pfn were used in pfn_valid macro if defined CONFIG_FLATMEM. When the functions that use the pfn_valid is used in driver module, max_low_pfn and min_low_pfn is to undefined, and fail to build. ERROR: "min_low_pfn" [drivers/block/aoe/aoe.ko] undefined! ERROR: "max_low_pfn" [drivers/block/aoe/aoe.ko] undefined! make[2]: *** [__modpost] Error 1 make[1]: *** [modules] Error 2 This patch fix this problem. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Cc: Kuninori Morimoto <kuninori.morimoto.gx@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02Merge branch 'for-linus' of ↵Linus Torvalds5-10/+21
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Two small bug fixes and a follow-up to the CONFIG_NR_CPUS change. A kernel compiled with CONFIG_NR_CPUS=256 will waste quite a bit of memory for the per-cpu arrays. Under z/VM the maximum number of CPUs is 64, the code now limits the possible cpu mask to 64 if running under z/VM" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: obtain function handle in hotplug notifier s390/3270: fix allocation of tty3270_screen structure s390/smp: improve setup of possible cpu mask
2014-01-02KVM: nVMX: Unconditionally uninit the MMU on nested vmexitJan Kiszka1-2/+1
Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu anyway in case nested EPT wasn't in use. 2. this aligns VMX with SVM. But 3. is most important: nested_cpu_has_ept(vmcs12) queries the VMCS page, and if one guest VCPU manipulates the page of another VCPU in L2, we may be fooled to skip over the nested_ept_uninit_mmu_context, leaving mmu in nested state. That can crash the host later on if nested_ept_get_cr3 is invoked while L1 already left vmxon and nested.current_vmcs12 became NULL therefore. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-12-30KVM: x86: Fix APIC map calculation after re-enablingJan Kiszka1-4/+4
Update arch.apic_base before triggering recalculate_apic_map. Otherwise the recalculation will work against the previous state of the APIC and will fail to build the correct map when an APIC is hardware-enabled again. This fixes a regression of 1e08ec4a13. Cc: stable@vger.kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-12-30Merge branch 'merge' of ↵Linus Torvalds7-34/+60
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "A bit more endian problems found during testing of 3.13 and a few other simple fixes and regressions fixes" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Fix alignment of secondary cpu spin vars powerpc: Align p_end powernv/eeh: Add buffer for P7IOC hub error data powernv/eeh: Fix possible buffer overrun in ioda_eeh_phb_diag() powerpc: Make 64-bit non-VMX __copy_tofrom_user bi-endian powerpc: Make unaligned accesses endian-safe for powerpc powerpc: Fix bad stack check in exception entry powerpc/512x: dts: disable MPC5125 usb module powerpc/512x: dts: remove misplaced IRQ spec from 'soc' node (5125)
2013-12-30s390/pci: obtain function handle in hotplug notifierSebastian Ott1-0/+2
When using the CLP interface to enable or disable a pci device a valid function handle needs to be delivered. So far our assumption was that we always have an up-to-date version of the function handle (since it doesn't change when the device is in use). This assumption is incorrect if the pci device is enabled or disabled outside of our control. When we are notified about such a change we already receive the new function handle. Just use it. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-30Merge remote-tracking branch 'agust/merge' into mergeBenjamin Herrenschmidt1-1/+5
Anatolij writes: Please pull two DTS fixes for MPC5125 tower board. Without them the v3.13-rcX kernels do not boot.
2013-12-30powerpc: Fix alignment of secondary cpu spin varsOlof Johansson1-0/+1
Commit 5c0484e25ec0 ('powerpc: Endian safe trampoline') resulted in losing proper alignment of the spinlock variables used when booting secondary CPUs, causing some quite odd issues with failing to boot on PA Semi-based systems. This showed itself on ppc64_defconfig, but not on pasemi_defconfig, so it had gone unnoticed when I initially tested the LE patch set. Fix is to add explicit alignment instead of relying on good luck. :) [ It appears that there is a different issue with PA Semi systems however this fix is definitely correct so applying anyway -- BenH ] Fixes: 5c0484e25ec0 ('powerpc: Endian safe trampoline') Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=67811 Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powerpc: Align p_endAnton Blanchard1-0/+1
p_end is an 8 byte value embedded in the text section. This means it is only 4 byte aligned when it should be 8 byte aligned. Fix this by adding an explicit alignment. This fixes an issue where POWER7 little endian builds with CONFIG_RELOCATABLE=y fail to boot. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powernv/eeh: Add buffer for P7IOC hub error dataBrian W Hart2-14/+5
Prevent ioda_eeh_hub_diag() from clobbering itself when called by supplying a per-PHB buffer for P7IOC hub diagnostic data. Take care to inform OPAL of the correct size for the buffer. [Small style change to the use of sizeof -- BenH] Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powernv/eeh: Fix possible buffer overrun in ioda_eeh_phb_diag()Brian W Hart1-2/+3
PHB diagnostic buffer may be smaller than PAGE_SIZE, especially when PAGE_SIZE > 4KB. Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powerpc: Make 64-bit non-VMX __copy_tofrom_user bi-endianPaul E. McKenney1-15/+38
The powerpc 64-bit __copy_tofrom_user() function uses shifts to handle unaligned invocations. However, these shifts were designed for big-endian systems: On little-endian systems, they must shift in the opposite direction. This commit relies on the C preprocessor to insert the correct shifts into the assembly code. [ This is a rare but nasty LE issue. Most of the time we use the POWER7 optimised __copy_tofrom_user_power7 loop, but when it hits an exception we fall back to the base __copy_tofrom_user loop. - Anton ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powerpc: Make unaligned accesses endian-safe for powerpcRajesh B Prathipati1-1/+6
The generic put_unaligned/get_unaligned macros were made endian-safe by calling the appropriate endian dependent macros based on the endian type of the powerpc processor. Signed-off-by: Rajesh B Prathipati <rprathip@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powerpc: Fix bad stack check in exception entryMichael Neuling1-1/+1
In EXCEPTION_PROLOG_COMMON() we check to see if the stack pointer (r1) is valid when coming from the kernel. If it's not valid, we die but with a nice oops message. Currently we allocate a stack frame (subtract INT_FRAME_SIZE) before we check to see if the stack pointer is negative. Unfortunately, this won't detect a bad stack where r1 is less than INT_FRAME_SIZE. This patch fixes the check to compare the modified r1 with -INT_FRAME_SIZE. With this, bad kernel stack pointers (including NULL pointers) are correctly detected again. Kudos to Paulus for finding this. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-29Merge tag 'fixes-for-linus' of ↵Linus Torvalds8-10/+24
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Another smallish batch of fixes, it's been quiet due to the holidays. Nothing controversial here, a handful of things across the board" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: pxa: fix USB gadget driver compilation regression ARM: OMAP2+: Fix LCD panel backlight regression for LDP legacy booting ARM: OMAP2+: hwmod_data: fix missing OMAP_INTC_START in irq data ARM: DRA7: hwmod: Fix boot crash with DEBUG_LL ARM: shmobile: r8a7790: fix shdi resource sizes ARM: shmobile: bockw: fixup DMA mask ARM: shmobile: armadillo: Add PWM backlight power supply
2013-12-29Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "There is a small EFI fix and a big power regression fix in this batch. My queue also had a fix for downing a CPU when there are insufficient number of IRQ vectors available, but I'm holding that one for now due to recent bug reports" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Don't select EFI from certain special ACPI drivers x86 idle: Repair large-server 50-watt idle-power regression
2013-12-29ARM: 7931/1: Correct virt_addr_validLaura Abbott1-1/+2
The definition of virt_addr_valid is that virt_addr_valid should return true if and only if virt_to_page returns a valid pointer. The current definition of virt_addr_valid only checks against the virtual address range. There's no guarantee that just because a virtual address falls bewteen PAGE_OFFSET and high_memory the associated physical memory has a valid backing struct page. Follow the example of other architectures and convert to pfn_valid to verify that the virtual address is actually valid. The check for an address between PAGE_OFFSET and high_memory is still necessary as vmalloc/highmem addresses are not valid with virt_to_page. Cc: Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-29ARM: 7923/1: mm: fix dcache flush logic for compound high pagesSteven Capper1-3/+3
When given a compound high page, __flush_dcache_page will only flush the first page of the compound page repeatedly rather than the entire set of constituent pages. This error was introduced by: 0b19f93 ARM: mm: Add support for flushing HugeTLB pages. This patch corrects the logic such that all constituent pages are now flushed. Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-29ARM: fix footbridge clockevent deviceRussell King1-2/+3
The clockevents code was being told that the footbridge clock event device ticks at 16x the rate which it actually does. This leads to timekeeping problems since it allows the clocksource to wrap before the kernel notices. Fix this by using the correct clock. Fixes: 4e8d76373c9fd ("ARM: footbridge: convert to clockevents/clocksource") Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: <stable@vger.kernel.org>
2013-12-28Merge tag 'omap-for-v3.13/intc-ldp-fix' of ↵Olof Johansson39-169/+269
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes From Tony Lindgren: Fix a regression for wrong interrupt numbers for some devices after the sparse IRQ conversion, fix DRA7 console output for earlyprintk, and fix the LDP LCD backlight when DSS is built into the kernel and not as a loadable module. * tag 'omap-for-v3.13/intc-ldp-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Fix LCD panel backlight regression for LDP legacy booting ARM: OMAP2+: hwmod_data: fix missing OMAP_INTC_START in irq data ARM: DRA7: hwmod: Fix boot crash with DEBUG_LL + v3.13-rc5 Signed-off-by: Olof Johansson <olof@lixom.net>