Age | Commit message (Collapse) | Author | Files | Lines |
|
memory allocations below max low memory address by default, as
the kernel can access only to the low memory.
Hence, set memblock current limit value to the max mapped low
memory address instead of max mapped memory address.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
At very early time, the kernel have to use some memory such as loading the
kernel image. We cannot prevent this anyway. So any node the kernel
resides in should be un-hotpluggable.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liu Jiang <jiang.liu@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
WARNING: line over 80 characters
#65: FILE: arch/x86/mm/srat.c:187:
+ (unsigned long long) start, (unsigned long long) end - 1);
total: 0 errors, 1 warnings, 19 lines checked
./patches/acpi-numa-mem_hotplug-mark-hotpluggable-memory-in-memblock.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When parsing SRAT, we know that which memory area is hotpluggable. So we
invoke function memblock_mark_hotplug() introduced by previous patch to
mark hotpluggable memory in memblock.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liu Jiang <jiang.liu@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
After merging the final tree, today's linux-next build (powerpc
allnoconfig) failed like this:
arch/powerpc/mm/mem.c: In function 'do_init_bootmem':
arch/powerpc/mm/mem.c:212:49: error: 'memblock_memory' undeclared (first us=
e in this function)
memblock_set_node(0, (phys_addr_t)ULLONG_MAX, &memblock_memory, 0);
^
Caused by commit 3a543893d46a ("memblock: make memblock_set_node()
support different memblock_type") from the apm-current tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liu Jiang <jiang.liu@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
fix parisc
Cc: David Rientjes <rientjes@google.com>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit 4b59e6c4 ("mm, show_mem: suppress page counts in non-blockable
contexts") introduced SHOW_MEM_FILTER_PAGE_COUNT to suppress PFN walks on
large memory machines. Commit c78e9363 ("mm: do not walk all of system
memory during show_mem") avoided a PFN walk in the generic show_mem helper
which removes the requirement for SHOW_MEM_FILTER_PAGE_COUNT in that case.
This patch removes PFN walkers from the arch-specific implementations that
report on a per-node or per-zone granularity. ARM and unicore32 still do
a PFN walk as they report memory usage on each bank which is a much finer
granularity where the debugging information may still be of use. As the
remaining arches doing PFN walks have relatively small amounts of memory,
this patch simply removes SHOW_MEM_FILTER_PAGE_COUNT.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Sort the exception table at build-time rather than during boot.
Microblaze is the same case as AARCH64 that's why EM_MICROBLAZE
conditional check was added to allow cross-compilation on machines which
are not running the latest libc-dev.
Inspired by AARCH64 commit adace89562c7a9645b ("arm64: extable: sort the
exception table at build time").
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Since 4dcfa600 ("ARM: DMA-API: better handing of DMA masks for coherent
allocations") arm_dma_limit_pfn has almost substituted the arm_dma_limit.
The remaining user is dma_contiguous_reserve(). It is also referenced in
setup_dma_zone() to calculate arm_dma_limit_pfn.
Kill the global arm_dma_limit and equip setup_zone_dma with the local one.
Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
Reported-by: Vassili Karpov <av1474@comtv.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There was a large ebizzy performance regression that was bisected to
commit 611ae8e3 (x86/tlb: enable tlb flush range support for x86). The
problem was related to the tlb_flushall_shift tuning for IvyBridge which
was altered. The problem is that it is not clear if the tuning values for
each CPU family is correct as the methodology used to tune the values is
unclear.
This patch uses a conservative tlb_flushall_shift value for all CPU
families except IvyBridge so the decision can be revisited if any
regression is found as a result of this change. IvyBridge is an exception
as testing with one methodology determined that the value of 2 is
acceptable. Details are in the changelog for the patch "x86: mm: Change
tlb_flushall_shift for IvyBridge".
One important aspect of this to watch out for is Xen. The original commit
log mentioned large performance gains on Xen. It's possible Xen is more
sensitive to this value if it flushes small ranges of pages more
frequently than workloads on bare metal typically do.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H Peter Anvin <hpa@zytor.com>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There was a large performance regression that was bisected to commit
611ae8e3 (x86/tlb: enable tlb flush range support for x86). This patch
simply changes the default balance point between a local and global flush
for IvyBridge.
In the interest of allowing the tests to be reproduced, this patch was
tested using mmtests 0.15 with the following configurations
configs/config-global-dhp__tlbflush-performance
configs/config-global-dhp__scheduler-performance
configs/config-global-dhp__network-performance
Results are from two machines
Ivybridge 4 threads: Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz
Ivybridge 8 threads: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Page fault microbenchmark showed nothing interesting.
Ebizzy was configured to run multiple iterations and threads. Thread counts
ranged from 1 to NR_CPUS*2. For each thread count, it ran 100 iterations and
each iteration lasted 10 seconds.
Ivybridge 4 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 6395.44 ( 0.00%) 6789.09 ( 6.16%)
Mean 2 7012.85 ( 0.00%) 8052.16 ( 14.82%)
Mean 3 6403.04 ( 0.00%) 6973.74 ( 8.91%)
Mean 4 6135.32 ( 0.00%) 6582.33 ( 7.29%)
Mean 5 6095.69 ( 0.00%) 6526.68 ( 7.07%)
Mean 6 6114.33 ( 0.00%) 6416.64 ( 4.94%)
Mean 7 6085.10 ( 0.00%) 6448.51 ( 5.97%)
Mean 8 6120.62 ( 0.00%) 6462.97 ( 5.59%)
Ivybridge 8 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 7336.65 ( 0.00%) 7787.02 ( 6.14%)
Mean 2 8218.41 ( 0.00%) 9484.13 ( 15.40%)
Mean 3 7973.62 ( 0.00%) 8922.01 ( 11.89%)
Mean 4 7798.33 ( 0.00%) 8567.03 ( 9.86%)
Mean 5 7158.72 ( 0.00%) 8214.23 ( 14.74%)
Mean 6 6852.27 ( 0.00%) 7952.45 ( 16.06%)
Mean 7 6774.65 ( 0.00%) 7536.35 ( 11.24%)
Mean 8 6510.50 ( 0.00%) 6894.05 ( 5.89%)
Mean 12 6182.90 ( 0.00%) 6661.29 ( 7.74%)
Mean 16 6100.09 ( 0.00%) 6608.69 ( 8.34%)
Ebizzy hits the worst case scenario for TLB range flushing every time and
it shows for these Ivybridge CPUs at least that the default choice is a
poor on. The patch addresses the problem.
Next was a tlbflush microbenchmark written by Alex Shi at
http://marc.info/?l=linux-kernel&m=133727348217113 . It measures access
costs while the TLB is being flushed. The expectation is that if there
are always full TLB flushes that the benchmark would suffer and it
benefits from range flushing
There are 320 iterations of the test per thread count. The number of
entries is randomly selected with a min of 1 and max of 512. To ensure a
reasonably even spread of entries, the full range is broken up into 8
sections and a random number selected within that section.
iteration 1, random number between 0-64
iteration 2, random number between 64-128 etc
This is still a very weak methodology. When you do not know what are
typical ranges, random is a reasonable choice but it can be easily argued
that the opimisation was for smaller ranges and an even spread is not
representative of any workload that matters. To improve this, we'd need
to know the probability distribution of TLB flush range sizes for a set of
workloads that are considered "common", build a synthetic trace and feed
that into this benchmark. Even that is not perfect because it would not
account for the time between flushes but there are limits of what can be
reasonably done and still be doing something useful. If a representative
synthetic trace is provided then this benchmark could be revisited and the
shift values retuned.
Ivybridge 4 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 10.50 ( 0.00%) 10.50 ( 0.03%)
Mean 2 17.59 ( 0.00%) 17.18 ( 2.34%)
Mean 3 22.98 ( 0.00%) 21.74 ( 5.41%)
Mean 5 47.13 ( 0.00%) 46.23 ( 1.92%)
Mean 8 43.30 ( 0.00%) 42.56 ( 1.72%)
Ivybridge 8 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 9.45 ( 0.00%) 9.36 ( 0.93%)
Mean 2 9.37 ( 0.00%) 9.70 ( -3.54%)
Mean 3 9.36 ( 0.00%) 9.29 ( 0.70%)
Mean 5 14.49 ( 0.00%) 15.04 ( -3.75%)
Mean 8 41.08 ( 0.00%) 38.73 ( 5.71%)
Mean 13 32.04 ( 0.00%) 31.24 ( 2.49%)
Mean 16 40.05 ( 0.00%) 39.04 ( 2.51%)
For both CPUs, average access time is reduced which is good as this is the
benchmark that was used to tune the shift values in the first place albeit
it is now known *how* the benchmark was used.
The scheduler benchmarks were somewhat inconclusive. They showed gains
and losses and makes me reconsider how stable those benchmarks really are
or if something else might be interfering with the test results recently.
Network benchmarks were inconclusive. Almost all results were flat except
for netperf-udp tests on the 4 thread machine. These results were
unstable and showed large variations between reboots. It is unknown if
this is a recent problems but I've noticed before that netperf-udp results
tend to vary.
Based on these results, changing the default for Ivybridge seems like a
logical choice.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H Peter Anvin <hpa@zytor.com>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When choosing between doing an address space or ranged flush, the x86
implementation of flush_tlb_mm_range takes into account whether there are
any large pages in the range. A per-page flush typically requires fewer
entries than would covered by a single large page and the check is
redundant.
There is one potential exception. THP migration flushes single THP
entries and it conceivably would benefit from flushing a single entry
instead of the mm. However, this flush is after a THP allocation, copy
and page table update potentially with any other threads serialised behind
it. In comparison to that, the flush is noise. It makes more sense to
optimise balancing to require fewer flushes than to optimise the flush
itself.
This patch deletes the redundant huge page check.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H Peter Anvin <hpa@zytor.com>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
NR_TLB_LOCAL_FLUSH_ALL is not always accounted for correctly and the
comparison with total_vm is done before taking tlb_flushall_shift into
account. Clean it up.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H Peter Anvin <hpa@zytor.com>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Bisection between 3.11 and 3.12 fingered commit 9824cf97 (mm: vmstats: tlb
flush counters). The counters are undeniably useful but how often do we
really need to debug TLB flush related issues? It does not justify taking
the penalty everywhere so make it a debugging option.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H Peter Anvin <hpa@zytor.com>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When ACPI SLIT table has an I/O locality (i.e. a locality unique to an
I/O device), numa_set_distance() emits the warning message below.
NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
acpi_numa_slit_init() calls numa_set_distance() with pxm_to_node(), which
assumes that all localities have been parsed with SRAT previously. SRAT
does not list I/O localities, where as SLIT lists all localities including
I/Os. Hence, pxm_to_node() returns NUMA_NO_NODE (-1) for an I/O locality.
I/O localities are not supported and are ignored today, but emitting such
warning message leads unnecessary confusion.
Change acpi_numa_slit_init() to avoid calling numa_set_distance() with
NUMA_NO_NODE.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit 14bd8c082016cd1f6 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery
all over arch/mips") failed to add Loongson2 specific blast_icache32
functions. Fix that.
The patch fixes the following crash seen with 3.13-rc1:
[ 3.652000] Reserved instruction in kernel code[#1]:
[...]
[ 3.652000] Call Trace:
[ 3.652000] [<ffffffff802223c8>] blast_icache32_page+0x8/0xb0
[ 3.652000] [<ffffffff80222c34>] r4k_flush_cache_page+0x19c/0x200
[ 3.652000] [<ffffffff802d17e4>] do_wp_page.isra.97+0x47c/0xe08
[ 3.652000] [<ffffffff802d51b0>] handle_mm_fault+0x938/0x1118
[ 3.652000] [<ffffffff8021bd40>] __do_page_fault+0x140/0x540
[ 3.652000] [<ffffffff80206be4>] resume_userspace_check+0x0/0x10
[ 3.652000]
[ 3.652000] Code: 00200825 64834000 00200825 <bc900000> bc900020 bc900040 bc900060 bc900080 bc9000a0
[ 3.656000] ---[ end trace cd8a48f722f5c5f7 ]---
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: John Crispin <blogic@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently, Loongson-2 call protected_blast_icache_range() and others call
protected_loongson23_blast_icache_range(), but I think the correct
behavior should be the opposite. BTW, Loongson-3's cache-ops is
compatible with MIPS64, but not compatible with Loongson-2. So, rename
xxx_loongson23_yyy things to xxx_loongson2_yyy.
The patch fixes early boot hang with 3.13-rc1, introduced in the commit
14bd8c082016cd1f67 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all
over arch/mips").
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: John Crispin <blogic@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Helge Deller noted a few weeks ago problems with the AIO support on
parisc. This change is the result of numerous iterations on how best to
deal with this problem.
The solution adopted here is to provide full cache coherency in a
uniform manner on all parisc systems. This involves calling
flush_dcache_page() on kmap operations and flush_kernel_dcache_page() on
kunmap operations. As a result, the copy_user_page() and
clear_user_page() functions can be removed and the overall code is
simpler.
The change ensures that both userspace and kernel aliases to a mapped
page are invalidated and flushed. This is necessary for the correct
operation of PA8800 and PA8900 based systems which do not support
inequivalent aliases.
With this change, I have observed no cache related issues on c8000 and
rp3440. It is now possible for example to do kernel builds with "-j64"
on four way systems.
On systems using XFS file systems, the patch recently posted by Mikulas
Patocka to "fix crash using XFS on loopback" is needed to avoid a hang
caused by an uninitialized lock passed to flush_dcache_page() in the
page struct.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Pull ARM fixes from Russell King:
"Another set of small fixes for ARM, covering various areas.
Laura fixed a long standing issue with virt_addr_valid() failing to
handle holes in memory. Steve found a problem with dcache flushing
for compound pages. I fixed another bug in footbridge stuff causing
time to tick slowly, and also a problem with the AES code which can
cause linker errors.
A patch from Rob which fixes Xen problems induced by a lack of
consistency in our naming of ioremap_cache() - which thankfully has
very few users"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 7933/1: rename ioremap_cached to ioremap_cache
ARM: fix "bad mode in ... handler" message for undefined instructions
CRYPTO: Fix more AES build errors
ARM: 7931/1: Correct virt_addr_valid
ARM: 7923/1: mm: fix dcache flush logic for compound high pages
ARM: fix footbridge clockevent device
|
|
ioremap_cache is more aligned with other architectures.
There are only 2 users of this in the kernel: pxa2xx-flash and Xen.
This fixes Xen build failures on arm64:
drivers/tty/hvc/hvc_xen.c:233:2: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration]
drivers/xen/grant-table.c:1174:3: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration]
drivers/xen/xenbus/xenbus_probe.c:778:4: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration]
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The array was missing the final entry for the undefined instruction
exception handler; this commit adds it.
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Building a multi-arch kernel results in:
arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt':
sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt'
arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt':
sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt'
arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt':
sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks'
arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt':
sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt'
This code is already runtime-conditional on NEON being supported, so
there's no point compiling it out depending on the minimum build
architecture.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Pull sparc bugfixes from David Miller:
1) Missing include can lead to build failure, from Kirill Tkhai.
2) Use dev_is_pci() where applicable, from Yijing Wang.
3) Enable irqs after we enable preemption in cpu startup path, from
Kirill Tkhai.
4) Revert a __copy_{to,from}_user_inatomic change that broke
iov_iter_copy_from_user_atomic() and thus several tests in xfstests
and LTP. From Dave Kleikamp.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
Revert "sparc64: Fix __copy_{to,from}_user_inatomic defines."
sparc64: smp_callin: Enable irqs after preemption is disabled
sparc/PCI: Use dev_is_pci() to identify PCI devices
sparc64: Fix build regression
|
|
This reverts commit 145e1c0023585e0e8f6df22316308ec61c5066b2.
This commit broke the behavior of __copy_from_user_inatomic when
it is only partially successful. Instead of returning the number
of bytes not copied, it now returns 1. This translates to the
wrong value being returned by iov_iter_copy_from_user_atomic.
xfstests generic/246 and LTP writev01 both fail on btrfs and nfs
because of this.
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Most of other architectures have below suggested order.
So lets do the same to fit generic idle loop scheme better.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull kvm bugfixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: nVMX: Unconditionally uninit the MMU on nested vmexit
KVM: x86: Fix APIC map calculation after re-enabling
|
|
Merge patches from Andrew Morton:
"Ten fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL
sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c
drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test()
mm/memory-failure.c: transfer page count from head page to tail page after split thp
MAINTAINERS: set up proper record for Xilinx Zynq
mm: remove bogus warning in copy_huge_pmd()
memcg: fix memcg_size() calculation
mm: fix use-after-free in sys_remap_file_pages
mm: munlock: fix deadlock in __munlock_pagevec()
mm: munlock: fix a bug where THP tail page is encountered
|
|
sh_ksyms_32.c
Min_low_pfn and max_low_pfn were used in pfn_valid macro if defined
CONFIG_FLATMEM. When the functions that use the pfn_valid is used in
driver module, max_low_pfn and min_low_pfn is to undefined, and fail to
build.
ERROR: "min_low_pfn" [drivers/block/aoe/aoe.ko] undefined!
ERROR: "max_low_pfn" [drivers/block/aoe/aoe.ko] undefined!
make[2]: *** [__modpost] Error 1
make[1]: *** [modules] Error 2
This patch fix this problem.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Two small bug fixes and a follow-up to the CONFIG_NR_CPUS change.
A kernel compiled with CONFIG_NR_CPUS=256 will waste quite a bit of
memory for the per-cpu arrays. Under z/VM the maximum number of CPUs
is 64, the code now limits the possible cpu mask to 64 if running
under z/VM"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: obtain function handle in hotplug notifier
s390/3270: fix allocation of tty3270_screen structure
s390/smp: improve setup of possible cpu mask
|
|
Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu anyway
in case nested EPT wasn't in use. 2. this aligns VMX with SVM. But 3. is
most important: nested_cpu_has_ept(vmcs12) queries the VMCS page, and if
one guest VCPU manipulates the page of another VCPU in L2, we may be
fooled to skip over the nested_ept_uninit_mmu_context, leaving mmu in
nested state. That can crash the host later on if nested_ept_get_cr3 is
invoked while L1 already left vmxon and nested.current_vmcs12 became
NULL therefore.
Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Update arch.apic_base before triggering recalculate_apic_map. Otherwise
the recalculation will work against the previous state of the APIC and
will fail to build the correct map when an APIC is hardware-enabled
again.
This fixes a regression of 1e08ec4a13.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"A bit more endian problems found during testing of 3.13 and a few
other simple fixes and regressions fixes"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix alignment of secondary cpu spin vars
powerpc: Align p_end
powernv/eeh: Add buffer for P7IOC hub error data
powernv/eeh: Fix possible buffer overrun in ioda_eeh_phb_diag()
powerpc: Make 64-bit non-VMX __copy_tofrom_user bi-endian
powerpc: Make unaligned accesses endian-safe for powerpc
powerpc: Fix bad stack check in exception entry
powerpc/512x: dts: disable MPC5125 usb module
powerpc/512x: dts: remove misplaced IRQ spec from 'soc' node (5125)
|
|
When using the CLP interface to enable or disable a pci device a
valid function handle needs to be delivered. So far our assumption
was that we always have an up-to-date version of the function handle
(since it doesn't change when the device is in use). This assumption
is incorrect if the pci device is enabled or disabled outside of our
control. When we are notified about such a change we already receive
the new function handle. Just use it.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Anatolij writes:
Please pull two DTS fixes for MPC5125 tower board. Without
them the v3.13-rcX kernels do not boot.
|
|
Commit 5c0484e25ec0 ('powerpc: Endian safe trampoline') resulted in
losing proper alignment of the spinlock variables used when booting
secondary CPUs, causing some quite odd issues with failing to boot on
PA Semi-based systems.
This showed itself on ppc64_defconfig, but not on pasemi_defconfig,
so it had gone unnoticed when I initially tested the LE patch set.
Fix is to add explicit alignment instead of relying on good luck. :)
[ It appears that there is a different issue with PA Semi systems
however this fix is definitely correct so applying anyway -- BenH
]
Fixes: 5c0484e25ec0 ('powerpc: Endian safe trampoline')
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=67811
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
p_end is an 8 byte value embedded in the text section. This means it
is only 4 byte aligned when it should be 8 byte aligned. Fix this
by adding an explicit alignment.
This fixes an issue where POWER7 little endian builds with
CONFIG_RELOCATABLE=y fail to boot.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Prevent ioda_eeh_hub_diag() from clobbering itself when called by supplying
a per-PHB buffer for P7IOC hub diagnostic data. Take care to inform OPAL of
the correct size for the buffer.
[Small style change to the use of sizeof -- BenH]
Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
PHB diagnostic buffer may be smaller than PAGE_SIZE, especially when
PAGE_SIZE > 4KB.
Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The powerpc 64-bit __copy_tofrom_user() function uses shifts to handle
unaligned invocations. However, these shifts were designed for
big-endian systems: On little-endian systems, they must shift in the
opposite direction.
This commit relies on the C preprocessor to insert the correct shifts
into the assembly code.
[ This is a rare but nasty LE issue. Most of the time we use the POWER7
optimised __copy_tofrom_user_power7 loop, but when it hits an exception
we fall back to the base __copy_tofrom_user loop. - Anton ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The generic put_unaligned/get_unaligned macros were made endian-safe by
calling the appropriate endian dependent macros based on the endian type
of the powerpc processor.
Signed-off-by: Rajesh B Prathipati <rprathip@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
In EXCEPTION_PROLOG_COMMON() we check to see if the stack pointer (r1)
is valid when coming from the kernel. If it's not valid, we die but
with a nice oops message.
Currently we allocate a stack frame (subtract INT_FRAME_SIZE) before we
check to see if the stack pointer is negative. Unfortunately, this
won't detect a bad stack where r1 is less than INT_FRAME_SIZE.
This patch fixes the check to compare the modified r1 with
-INT_FRAME_SIZE. With this, bad kernel stack pointers (including NULL
pointers) are correctly detected again.
Kudos to Paulus for finding this.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Another smallish batch of fixes, it's been quiet due to the holidays.
Nothing controversial here, a handful of things across the board"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: pxa: fix USB gadget driver compilation regression
ARM: OMAP2+: Fix LCD panel backlight regression for LDP legacy booting
ARM: OMAP2+: hwmod_data: fix missing OMAP_INTC_START in irq data
ARM: DRA7: hwmod: Fix boot crash with DEBUG_LL
ARM: shmobile: r8a7790: fix shdi resource sizes
ARM: shmobile: bockw: fixup DMA mask
ARM: shmobile: armadillo: Add PWM backlight power supply
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"There is a small EFI fix and a big power regression fix in this batch.
My queue also had a fix for downing a CPU when there are insufficient
number of IRQ vectors available, but I'm holding that one for now due
to recent bug reports"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Don't select EFI from certain special ACPI drivers
x86 idle: Repair large-server 50-watt idle-power regression
|
|
The definition of virt_addr_valid is that virt_addr_valid should
return true if and only if virt_to_page returns a valid pointer.
The current definition of virt_addr_valid only checks against the
virtual address range. There's no guarantee that just because a
virtual address falls bewteen PAGE_OFFSET and high_memory the
associated physical memory has a valid backing struct page. Follow
the example of other architectures and convert to pfn_valid to
verify that the virtual address is actually valid. The check for
an address between PAGE_OFFSET and high_memory is still necessary
as vmalloc/highmem addresses are not valid with virt_to_page.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nico@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
When given a compound high page, __flush_dcache_page will only flush
the first page of the compound page repeatedly rather than the entire
set of constituent pages.
This error was introduced by:
0b19f93 ARM: mm: Add support for flushing HugeTLB pages.
This patch corrects the logic such that all constituent pages are now
flushed.
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The clockevents code was being told that the footbridge clock event
device ticks at 16x the rate which it actually does. This leads to
timekeeping problems since it allows the clocksource to wrap before
the kernel notices. Fix this by using the correct clock.
Fixes: 4e8d76373c9fd ("ARM: footbridge: convert to clockevents/clocksource")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
Fix a regression for wrong interrupt numbers for some devices after
the sparse IRQ conversion, fix DRA7 console output for earlyprintk,
and fix the LDP LCD backlight when DSS is built into the kernel and
not as a loadable module.
* tag 'omap-for-v3.13/intc-ldp-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: Fix LCD panel backlight regression for LDP legacy booting
ARM: OMAP2+: hwmod_data: fix missing OMAP_INTC_START in irq data
ARM: DRA7: hwmod: Fix boot crash with DEBUG_LL
+ v3.13-rc5
Signed-off-by: Olof Johansson <olof@lixom.net>
|