summaryrefslogtreecommitdiff
path: root/arch/arm64
AgeCommit message (Collapse)AuthorFilesLines
2014-02-07arm64: defconfig: Expand default enabled featuresMark Rutland2-4/+15
FPGA implementations of the Cortex-A57 and Cortex-A53 are now available in the form of the SMM-A57 and SMM-A53 Soft Macrocell Models (SMMs) for Versatile Express. As these attach to a Motherboard Express V2M-P1 it would be useful to have support for some V2M-P1 peripherals enabled by default. Additionally a couple of of features have been introduced since the last defconfig update (CMA, jump labels) that would be good to have enabled by default to ensure they are build and boot tested. This patch updates the arm64 defconfig to enable support for these devices and features. The arm64 Kconfig is modified to select HAVE_PATA_PLATFORM, which is required to enable support for the CompactFlash controller on the V2M-P1. A few options which don't need to appear in defconfig are trimmed: * BLK_DEV - selected by default * EXPERIMENTAL - otherwise gone from the kernel * MII - selected by drivers which require it * USB_SUPPORT - selected by default Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-07arm64: asm: remove redundant "cc" clobbersWill Deacon4-25/+21
cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers from inline asm blocks that only use these instructions to implement conditional branches. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-07arm64: atomics: fix use of acquire + release for full barrier semanticsWill Deacon5-18/+35
Linux requires a number of atomic operations to provide full barrier semantics, that is no memory accesses after the operation can be observed before any accesses up to and including the operation in program order. On arm64, these operations have been incorrectly implemented as follows: // A, B, C are independent memory locations <Access [A]> // atomic_op (B) 1: ldaxr x0, [B] // Exclusive load with acquire <op(B)> stlxr w1, x0, [B] // Exclusive store with release cbnz w1, 1b <Access [C]> The assumption here being that two half barriers are equivalent to a full barrier, so the only permitted ordering would be A -> B -> C (where B is the atomic operation involving both a load and a store). Unfortunately, this is not the case by the letter of the architecture and, in fact, the accesses to A and C are permitted to pass their nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the store-release on B). This is a clear violation of the full barrier requirement. The simple way to fix this is to implement the same algorithm as ARMv7 using explicit barriers: <Access [A]> // atomic_op (B) dmb ish // Full barrier 1: ldxr x0, [B] // Exclusive load <op(B)> stxr w1, x0, [B] // Exclusive store cbnz w1, 1b dmb ish // Full barrier <Access [C]> but this has the undesirable effect of introducing *two* full barrier instructions. A better approach is actually the following, non-intuitive sequence: <Access [A]> // atomic_op (B) 1: ldxr x0, [B] // Exclusive load <op(B)> stlxr w1, x0, [B] // Exclusive store with release cbnz w1, 1b dmb ish // Full barrier <Access [C]> The simple observations here are: - The dmb ensures that no subsequent accesses (e.g. the access to C) can enter or pass the atomic sequence. - The dmb also ensures that no prior accesses (e.g. the access to A) can pass the atomic sequence. - Therefore, no prior access can pass a subsequent access, or vice-versa (i.e. A is strictly ordered before C). - The stlxr ensures that no prior access can pass the store component of the atomic operation. The only tricky part remaining is the ordering between the ldxr and the access to A, since the absence of the first dmb means that we're now permitting re-ordering between the ldxr and any prior accesses. From an (arbitrary) observer's point of view, there are two scenarios: 1. We have observed the ldxr. This means that if we perform a store to [B], the ldxr will still return older data. If we can observe the ldxr, then we can potentially observe the permitted re-ordering with the access to A, which is clearly an issue when compared to the dmb variant of the code. Thankfully, the exclusive monitor will save us here since it will be cleared as a result of the store and the ldxr will retry. Notice that any use of a later memory observation to imply observation of the ldxr will also imply observation of the access to A, since the stlxr/dmb ensure strict ordering. 2. We have not observed the ldxr. This means we can perform a store and influence the later ldxr. However, that doesn't actually tell us anything about the access to [A], so we've not lost anything here either when compared to the dmb variant. This patch implements this solution for our barriered atomic operations, ensuring that we satisfy the full barrier requirements where they are needed. Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-06arm64: barriers: allow dsb macro to take option parameterWill Deacon1-1/+1
The dsb instruction takes an option specifying both the target access types and shareability domain. This patch allows such an option to be passed to the dsb macro, resulting in potentially more efficient code. Currently the option is ignored until all callers are updated (unlike ARM, the option is mandated by the assembler). Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: compat: Wire up new AArch32 syscallsCatalin Marinas1-1/+4
This patch enables sys_compat, sys_finit_module, sys_sched_setattr and sys_sched_getattr for compat (AArch32) applications. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSENathan Lynch1-2/+2
Update wall-to-monotonic fields in the VDSO data page unconditionally. These are used to service CLOCK_MONOTONIC_COARSE, which is not guarded by use_syscall. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: vdso: fix coarse clock handlingNathan Lynch1-1/+6
When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the caller has placed in x2 ("ret x2" to return from the fast path). Fix this by saving x30/LR to x2 only in code that will call __do_get_tspec, restoring x30 afterward, and using a plain "ret" to return from the routine. Also: while the resulting tv_nsec value for CLOCK_REALTIME and CLOCK_MONOTONIC must be computed using intermediate values that are left-shifted by cs_shift (x12, set by __do_get_tspec), the results for coarse clocks should be calculated using unshifted values (xtime_coarse_nsec is in units of actual nanoseconds). The current code shifts intermediate values by x12 unconditionally, but x12 is uninitialized when servicing a coarse clock. Fix this by setting x12 to 0 once we know we are dealing with a coarse clock id. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: simplify pgd_allocMark Rutland1-9/+2
Currently pgd_alloc has a redundant NULL check in its return path that can be removed with no ill effects. With that removed it's also possible to return early and eliminate the new_pgd temporary variable. This patch applies said modifications, making the logic of pgd_alloc correspond 1-1 with that of pgd_free. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: fix typo: s/SERRROR/SERROR/Mark Rutland2-2/+2
Somehow SERROR has acquired an additional 'R' in a couple of headers. This patch removes them before they spread further. As neither instance is in use yet, no other sites need to be fixed up. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: Invalidate the TLB when replacing pmd entries during bootCatalin Marinas1-2/+10
With the 64K page size configuration, __create_page_tables in head.S maps enough memory to get started but using 64K pages rather than 512M sections with a single pgd/pud/pmd entry pointing to a pte table. create_mapping() may override the pgd/pud/pmd table entry with a block (section) one if the RAM size is more than 512MB and aligned correctly. For the end of this block to be accessible, the old TLB entry must be invalidated. Cc: <stable@vger.kernel.org> Reported-by: Mark Salter <msalter@redhat.com> Tested-by: Mark Salter <msalter@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: Align CMA sizes to PAGE_SIZELaura Abbott1-0/+1
dma_alloc_from_contiguous takes number of pages for a size. Align up the dma size passed in to page size to avoid truncation and allocation failures on sizes less than PAGE_SIZE. Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: add DSB after icache flush in __flush_icache_all()Vinayak Kale1-0/+1
Add DSB after icache flush to complete the cache maintenance operation. The function __flush_icache_all() is used only for user space mappings and an ISB is not required because of an exception return before executing user instructions. An exception return would behave like an ISB. Signed-off-by: Vinayak Kale <vkale@apm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-04arm64: vdso: prevent ld from aligning PT_LOAD segments to 64kWill Deacon1-1/+1
Whilst the text segment for our VDSO is marked as PT_LOAD in the ELF headers, it is mapped by the kernel and not actually subject to demand-paging. ld doesn't realise this, and emits a p_align field of 64k (the maximum supported page size), which conflicts with the load address picked by the kernel on 4k systems, which will be 4k aligned. This causes GDB to fail with "Failed to read a valid object file image from memory" when attempting to load the VDSO. This patch passes the -n option to ld, which prevents it from aligning PT_LOAD segments to the maximum page size. Cc: <stable@vger.kernel.org> Reported-by: Kyle McMartin <kyle@redhat.com> Acked-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-31Merge tag 'arm64-upstream' of ↵Linus Torvalds8-43/+74
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pyll ARM64 patches from Catalin Marinas: - Build fix with DMA_CMA enabled - Introduction of PTE_WRITE to distinguish between writable but clean and truly read-only pages - FIQs enabling/disabling clean-up (they aren't used on arm64) - CPU resume fix for the per-cpu offset restoring - Code comment typos * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: Introduce PTE_WRITE arm64: mm: Remove PTE_BIT_FUNC macro arm64: FIQs are unused arm64: mm: fix the function name in comment of cpu_do_switch_mm arm64: fix build error if DMA_CMA is enabled arm64: kernel: fix per-cpu offset restore on resume arm64: mm: fix the function name in comment of __flush_dcache_area arm64: mm: use ubfm for dcache_line_size
2014-01-31arm64: mm: Introduce PTE_WRITESteve Capper1-23/+25
We have the following means for encoding writable or dirty ptes: PTE_DIRTY PTE_RDONLY !pte_dirty && !pte_write 0 1 !pte_dirty && pte_write 0 1 pte_dirty && !pte_write 1 1 pte_dirty && pte_write 1 0 So we can't distinguish between writable clean ptes and read only ptes. This can cause problems with ptes being incorrectly flagged as read only when they are writable but not dirty. This patch introduces a new software bit PTE_WRITE which allows us to correctly identify writable ptes. PTE_RDONLY is now only clear for valid ptes where a page is both writable and dirty. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-31arm64: mm: Remove PTE_BIT_FUNC macroSteve Capper1-10/+41
Expand out the pte manipulation functions. This makes our life easier when using things like tags and cscope. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-30arm64: FIQs are unusedNicolas Pitre2-8/+0
So any FIQ handling is superfluous at the moment. The functions to disable/enable FIQs is kept around if ever someone needs them in the future, but existing calling sites including arch_cpu_idle_prepare() may go for now. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-27arm64: mm: fix the function name in comment of cpu_do_switch_mmJingoo Han1-1/+1
Fix the function name of comment of cpu_do_switch_mm, because cpu_do_switch_mm is the correct name. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-27arm64: fix build error if DMA_CMA is enabledPankaj Dubey1-1/+0
arm64/include/asm/dma-contiguous.h is trying to include <asm-genric/dma-contiguous.h> which does not exist, and thus failing build for arm64 if we enable CONFIG_DMA_CMA. This patch fixes build error by removing unwanted header inclusion from arm64's dma-contiguous.h. Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com> Signed-off-by: Somraj Mani <somraj.mani@samsung.com> Acked-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-0/+1
Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...
2014-01-24arm64: kernel: fix per-cpu offset restore on resumeLorenzo Pieralisi1-0/+8
The introduction of percpu offset optimisation through tpidr_el1 in: Commit id :7158627686f02319c50c8d9d78f75d4c8 "arm64: percpu: implement optimised pcpu access using tpidr_el1" requires cpu_{suspend/resume} to restore the tpidr_el1 register upon resume so that percpu variables can be addressed correctly when a CPU comes out of reset from warm-boot. This patch fixes cpu_{suspend}/{resume} tpidr_el1 restoration on resume, by calling the set_my_cpu_offset C API, as it is done on primary and secondary CPUs on cold boot, so that, even if the register used to store the percpu offset is changed, the save and restore of general purpose registers does not have to be updated. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds7-18/+60
Pull KVM updates from Paolo Bonzini: "First round of KVM updates for 3.14; PPC parts will come next week. Nothing major here, just bugfixes all over the place. The most interesting part is the ARM guys' virtualized interrupt controller overhaul, which lets userspace get/set the state and thus enables migration of ARM VMs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits) kvm: make KVM_MMU_AUDIT help text more readable KVM: s390: Fix memory access error detection KVM: nVMX: Update guest activity state field on L2 exits KVM: nVMX: Fix nested_run_pending on activity state HLT KVM: nVMX: Clean up handling of VMX-related MSRs KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit KVM: nVMX: Leave VMX mode on clearing of feature control MSR KVM: VMX: Fix DR6 update on #DB exception KVM: SVM: Fix reading of DR6 KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS add support for Hyper-V reference time counter KVM: remove useless write to vcpu->hv_clock.tsc_timestamp KVM: x86: fix tsc catchup issue with tsc scaling KVM: x86: limit PIT timer frequency KVM: x86: handle invalid root_hpa everywhere kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub kvm: vfio: silence GCC warning KVM: ARM: Remove duplicate include arm/arm64: KVM: relax the requirements of VMA alignment for THP ...
2014-01-22Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "Usual rocket science stuff from trivial.git" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) neighbour.h: fix comment sched: Fix warning on make htmldocs caused by wait.h slab: struct kmem_cache is protected by slab_mutex doc: Fix typo in USB Gadget Documentation of/Kconfig: Spelling s/one/once/ mkregtable: Fix sscanf handling lp5523, lp8501: comment improvements thermal: rcar: comment spelling treewide: fix comments and printk msgs IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart() Documentation: update /proc/uptime field description Documentation: Fix size parameter for snprintf arm: fix comment header and macro name asm-generic: uaccess: Spelling s/a ny/any/ mtd: onenand: fix comment header doc: driver-model/platform.txt: fix a typo drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text doc: Fix typo (acces_process_vm -> access_process_vm) treewide: Fix typos in printk drivers/gpu/drm/qxl/Kconfig: reformat the help text ...
2014-01-22arm64: mm: fix the function name in comment of __flush_dcache_areaJingoo Han1-1/+1
Fix the function name of comment of __flush_dcache_area, because __flush_dcache_area is the correct name. Also, the missing variable 'size' is added to the comment. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-22arm64: mm: use ubfm for dcache_line_sizeJingoo Han1-2/+1
Use 'ubfm' for the bitfield move instruction; thus, single instruction can be used instead of two instructions, when getting the minimum D-cache line size from CTR_EL0 register. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-20Merge tag 'arm64-upstream' of ↵Linus Torvalds46-395/+1808
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull ARM64 updates from Catalin Marinas: - CPU suspend support on top of PSCI (firmware Power State Coordination Interface) - jump label support - CMA can now be enabled on arm64 - HWCAP bits for crypto and CRC32 extensions - optimised percpu using tpidr_el1 register - code cleanup * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits) arm64: fix typo in entry.S arm64: kernel: restore HW breakpoint registers in cpu_suspend jump_label: use defined macros instead of hard-coding for better readability arm64, jump label: optimize jump label implementation arm64, jump label: detect %c support for ARM64 arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions arm64: move encode_insn_immediate() from module.c to insn.c arm64: introduce interfaces to hotpatch kernel and module code arm64: introduce basic aarch64 instruction decoding helpers arm64: dts: Reduce size of virtio block device for foundation model arm64: Remove unused __data_loc variable arm64: Enable CMA arm64: Warn on NULL device structure for dma APIs arm64: Add hwcaps for crypto and CRC32 extensions. arm64: drop redundant macros from read_cpuid() arm64: Remove outdated comment arm64: cmpxchg: update macros to prevent warnings arm64: support single-step and breakpoint handler hooks ARM64: fix framepointer check in unwind_frame ARM64: check stack pointer in get_wchan ...
2014-01-20Merge branch 'core-locking-for-linus' of ↵Linus Torvalds1-0/+50
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking changes from Ingo Molnar: - futex performance increases: larger hashes, smarter wakeups - mutex debugging improvements - lots of SMP ordering documentation updates - introduce the smp_load_acquire(), smp_store_release() primitives. (There are WIP patches that make use of them - not yet merged) - lockdep micro-optimizations - lockdep improvement: better cover IRQ contexts - liblockdep at last. We'll continue to monitor how useful this is * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) futexes: Fix futex_hashsize initialization arch: Re-sort some Kbuild files to hopefully help avoid some conflicts futexes: Avoid taking the hb->lock if there's nothing to wake up futexes: Document multiprocessor ordering guarantees futexes: Increase hash table size for better performance futexes: Clean up various details arch: Introduce smp_load_acquire(), smp_store_release() arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE mutexes: Give more informative mutex warning in the !lock->owner case powerpc: Full barrier for smp_mb__after_unlock_lock() rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier Documentation/memory-barriers.txt: Document ACCESS_ONCE() Documentation/memory-barriers.txt: Prohibit speculative writes Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency" ...
2014-01-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c net/ipv4/tcp_metrics.c Overlapping changes between the "don't create two tcp metrics objects with the same key" race fix in net and the addition of the destination address in the lookup key in net-next. Minor overlapping changes in bnx2x driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-17Merge tag 'arm64-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache" We noticed that it breaks ioremap (and earlyprintk) with 64K page configuration" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"
2014-01-16Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"Catalin Marinas1-1/+1
This reverts commit 2f7dc6027522499582a520807cb9ffda589de47e. The above commit breaks the mapping type for Device memory because pgprot_default already contains a Normal memory type. pgprot_default is also not initialised early enough for earlyprintk resulting in an inconsistent memory mapping with 64K PAGE_SIZE configuration. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will.deacon@arm.com> Acked-by: Will Deacon <will.deacon@arm.com>
2014-01-15Merge tag 'kvm-arm-for-3.14' of ↵Paolo Bonzini1-0/+1
git://git.linaro.org/people/christoffer.dall/linux-kvm-arm into kvm-queue
2014-01-13arm64: fix typo in entry.SNeil Zhang1-1/+1
Commit 64681787 (arm64: let the core code deal with preempt_count) changed the code, but left the comments unchanged, fix it. Signed-off-by: Neil Zhang <zhangwm@marvell.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-13Merge tag 'v3.13-rc8' into core/lockingIngo Molnar2-24/+18
Refresh the tree with the latest fixes, before applying new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12arch: Introduce smp_load_acquire(), smp_store_release()Peter Zijlstra1-0/+50
A number of situations currently require the heavyweight smp_mb(), even though there is no need to order prior stores against later loads. Many architectures have much cheaper ways to handle these situations, but the Linux kernel currently has no portable way to make use of them. This commit therefore supplies smp_load_acquire() and smp_store_release() to remedy this situation. The new smp_load_acquire() primitive orders the specified load against any subsequent reads or writes, while the new smp_store_release() primitive orders the specifed store against any prior reads or writes. These primitives allow array-based circular FIFOs to be implemented without an smp_mb(), and also allow a theoretical hole in rcu_assign_pointer() to be closed at no additional expense on most architectures. In addition, the RCU experience transitioning from explicit smp_read_barrier_depends() and smp_wmb() to rcu_dereference() and rcu_assign_pointer(), respectively resulted in substantial improvements in readability. It therefore seems likely that replacing other explicit barriers with smp_load_acquire() and smp_store_release() will provide similar benefits. It appears that roughly half of the explicit barriers in core kernel code might be so replaced. [Changelog by PaulMck] Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-10arm64: kernel: restore HW breakpoint registers in cpu_suspendLorenzo Pieralisi2-23/+28
When a CPU resumes from low-power, it restores HW breakpoint and watchpoint slots through a CPU PM notifier. Since we want to enable debugging as early as possible in the resume path, the mdscr content is restored along the general purpose registers in the cpu_suspend API and debug exceptions are reenabled when cpu_suspend returns. Since the CPU PM notifier is run after a CPU has been resumed, we cannot expect HW breakpoint registers to contain sane values till the notifier is run, since the HW breakpoints registers content is unknown at reset; this means that the CPU might run with debug exceptions enabled, mdscr restored but HW breakpoint registers containing junk values that can trigger spurious debug exceptions. This patch fixes current HW breakpoints restore by moving the HW breakpoints registers restoration to the cpu_suspend API, before the debug exceptions are enabled. This way, as soon as the cpu_suspend function returns the kernel can resume debugging with sane values in HW breakpoint registers. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64, jump label: optimize jump label implementationJiang Liu4-0/+112
Optimize jump label implementation for ARM64 by dynamically patching kernel text. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functionsJiang Liu2-0/+50
Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which will be used to implement jump label on ARM64. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: move encode_insn_immediate() from module.c to insn.cJiang Liu3-110/+114
Function encode_insn_immediate() will be used by other instruction manipulate related functions, so move it into insn.c and rename it as aarch64_insn_encode_immediate(). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: introduce interfaces to hotpatch kernel and module codeJiang Liu2-1/+128
Introduce three interfaces to patch kernel and module code: aarch64_insn_patch_text_nosync(): patch code without synchronization, it's caller's responsibility to synchronize all CPUs if needed. aarch64_insn_patch_text_sync(): patch code and always synchronize with stop_machine() aarch64_insn_patch_text(): patch code and synchronize with stop_machine() if needed Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: introduce basic aarch64 instruction decoding helpersJiang Liu3-1/+169
Introduce basic aarch64 instruction decoding helper aarch64_get_insn_class() and aarch64_insn_hotpatch_safe(). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-06Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-24/+18
Conflicts: drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c net/ipv6/ip6_tunnel.c net/ipv6/ip6_vti.c ipv6 tunnel statistic bug fixes conflicting with consolidation into generic sw per-cpu net stats. qlogic conflict between queue counting bug fix and the addition of multiple MAC address support. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28Merge branch 'kvm-arm64/for-3.14' into kvm-arm64/nextMarc Zyngier6-18/+41
2013-12-28arm64: KVM: Force undefined exception for Guest SMC intructionsAnup Patel1-3/+0
The SMC-based PSCI emulation for Guest is going to be very different from the in-kernel HVC-based PSCI emulation hence for now just inject undefined exception when Guest executes SMC instruction. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: marc Zyngier <marc.zyngier@arm.com>
2013-12-28arm64: KVM: Support X-Gene guest VCPU on APM X-Gene hostAnup Patel3-14/+24
This patch allows us to have X-Gene guest VCPU when using KVM arm64 on APM X-Gene host. We add KVM_ARM_TARGET_XGENE_POTENZA for X-Gene Potenza compatible guest VCPU and we return KVM_ARM_TARGET_XGENE_POTENZA in kvm_target_cpu() when running on X-Gene host with Potenza core. [maz: sanitized the commit log] Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-12-28arm64: KVM: Add Kconfig option for max VCPUs per-GuestAnup Patel2-1/+17
Current max VCPUs per-Guest is set to 4 which is preventing us from creating a Guest (or VM) with 8 VCPUs on Host (e.g. X-Gene Storm SOC) with 8 Host CPUs. The correct value of max VCPUs per-Guest should be same as the max CPUs supported by GICv2 which is 8 but, increasing value of max VCPUs per-Guest can make things slower hence we add Kconfig option to let KVM users select appropriate max VCPUs per-Guest. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-12-21ARM/KVM: save and restore generic timer registersAndre Przywara1-0/+18
For migration to work we need to save (and later restore) the state of each core's virtual generic timer. Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export the three needed registers (control, counter, compare value). Though they live in cp15 space, we don't use the existing list, since they need special accessor functions and the arch timer is optional. Acked-by: Marc Zynger <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-20Merge tag 'arm64-fixes' of ↵Linus Torvalds1-20/+18
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 ptrace fix from Catalin Marinas. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events
2013-12-20Merge tag 'stable/for-linus-3.13-rc4-tag' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: - Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use scratch pages. - Fix block API header for ARM32 and ARM64 to have proper layout - On ARM when mapping guests, stick on PTE_SPECIAL - When using SWIOTLB under ARM, don't call swiotlb functions twice - When unmapping guests memory and if we fail, don't return pages which failed to be unmapped. - Grant driver was using the wrong address on ARM. * tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Seperate the auto-translate logic properly (v2) xen/block: Correctly define structures in public headers on ARM32 and ARM64 arm: xen: foreign mapping PTEs are special. xen/arm64: do not call the swiotlb functions twice xen: privcmd: do not return pages which we have failed to unmap XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
2013-12-20arm64: dts: Reduce size of virtio block device for foundation modelMark Brown1-1/+1
Will Deacon observed that kvmtool uses a size of 0x200 for virtio block memory region and that the virtio block spec only uses 31 bytes in the device specific region at 0x100 so reduce the region to a less wasteful 0x200. Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-20arm64: Remove unused __data_loc variableGeoff Levand2-12/+0
The __data_loc variable is an unused left over from the 32 bit arm implementation. Remove that variable and adjust the __mmap_switched startup routine accordingly. Signed-off-by: Geoff Levand <geoff@infradead.org> for Huawei, Linaro Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>