drm-tip - DRM current development and nightly trees

Age	Commit message (Collapse)	Author	Files	Lines
2019-09-24	KVM: vmx: Introduce handle_unexpected_vmexit and handle WAITPKG vmexit	Tao Xu	3	-17/+21
	As the latest Intel 64 and IA-32 Architectures Software Developer's Manual, UMWAIT and TPAUSE instructions cause a VM exit if the RDTSC exiting and enable user wait and pause VM-execution controls are both 1. Because KVM never enable RDTSC exiting, the vm-exit for UMWAIT and TPAUSE should never happen. Considering EXIT_REASON_XSAVES and EXIT_REASON_XRSTORS is also unexpected VM-exit for KVM. Introduce a common exit helper handle_unexpected_vmexit() to handle these unexpected VM-exit. Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL	Tao Xu	4	-0/+53
	UMWAIT and TPAUSE instructions use 32bit IA32_UMWAIT_CONTROL at MSR index E1H to determines the maximum time in TSC-quanta that the processor can reside in either C0.1 or C0.2. This patch emulates MSR IA32_UMWAIT_CONTROL in guest and differentiate IA32_UMWAIT_CONTROL between host and guest. The variable mwait_control_cached in arch/x86/kernel/cpu/umwait.c caches the MSR value, so this patch uses it to avoid frequently rdmsr of IA32_UMWAIT_CONTROL. Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Add support for user wait instructions	Tao Xu	5	-1/+27
	UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions. This patch adds support for user wait instructions in KVM. Availability of the user wait instructions is indicated by the presence of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may be executed at any privilege level, and use 32bit IA32_UMWAIT_CONTROL MSR to set the maximum time. The behavior of user wait instructions in VMX non-root operation is determined first by the setting of the "enable user wait and pause" secondary processor-based VM-execution control bit 26. If the VM-execution control is 0, UMONITOR/UMWAIT/TPAUSE cause an invalid-opcode exception (#UD). If the VM-execution control is 1, treatment is based on the setting of the â€œRDTSC exitingâ€ VM-execution control. Because KVM never enables RDTSC exiting, if the instruction causes a delay, the amount of time delayed is called here the physical delay. The physical delay is first computed by determining the virtual delay. If IA32_UMWAIT_CONTROL[31:2] is zero, the virtual delay is the value in EDX:EAX minus the value that RDTSC would return; if IA32_UMWAIT_CONTROL[31:2] is not zero, the virtual delay is the minimum of that difference and AND(IA32_UMWAIT_CONTROL,FFFFFFFCH). Because umwait and tpause can put a (psysical) CPU into a power saving state, by default we dont't expose it to kvm and enable it only when guest CPUID has it. Detailed information about user wait instructions can be found in the latest Intel 64 and IA-32 Architectures Software Developer's Manual. Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Add comments to document various emulation types	Sean Christopherson	1	-0/+30
	Document the intended usage of each emulation type as each exists to handle an edge case of one kind or another and can be easily misinterpreted at first glance. Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig	Sean Christopherson	3	-39/+36
	VMX's EPT misconfig flow to handle fast-MMIO path falls back to decoding the instruction to determine the instruction length when running as a guest (Hyper-V doesn't fill VMCS.VM_EXIT_INSTRUCTION_LEN because it's technically not defined for EPT misconfigs). Rather than implement the slow skip in VMX's generic skip_emulated_instruction(), handle_ept_misconfig() directly calls kvm_emulate_instruction() with EMULTYPE_SKIP, which intentionally doesn't do single-step detection, and so handle_ept_misconfig() misses a single-step #DB. Rework the EPT misconfig fallback case to route it through kvm_skip_emulated_instruction() so that single-step #DBs and interrupt shadow updates are handled automatically. I.e. make VMX's slow skip logic match SVM's and have the SVM flow not intentionally avoid the shadow update. Alternatively, the handle_ept_misconfig() could manually handle single- step detection, but that results in EMULTYPE_SKIP having split logic for the interrupt shadow vs. single-step #DBs, and split emulator logic is largely what led to this mess in the first place. Modifying SVM to mirror VMX flow isn't really an option as SVM's case isn't limited to a specific exit reason, i.e. handling the slow skip in skip_emulated_instruction() is mandatory for all intents and purposes. Drop VMX's skip_emulated_instruction() wrapper since it can now fail, and instead WARN if it fails unexpectedly, e.g. if exit_reason somehow becomes corrupted. Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Fixes: d391f12070672 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Remove emulation_result enums, EMULATE_{DONE,FAIL,USER_EXIT}	Sean Christopherson	5	-78/+49
	Deferring emulation failure handling (in some cases) to the caller of x86_emulate_instruction() has proven fragile, e.g. multiple instances of KVM not setting run->exit_reason on EMULATE_FAIL, largely due to it being difficult to discern what emulation types can return what result, and which combination of types and results are handled where. Now that x86_emulate_instruction() always handles emulation failure, i.e. EMULATION_FAIL is only referenced in callers, remove the emulation_result enums entirely. Per KVM's existing exit handling conventions, return '0' and '1' for "exit to userspace" and "resume guest" respectively. Doing so cleans up many callers, e.g. they can return kvm_emulate_instruction() directly instead of having to interpret its result. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: VMX: Remove EMULATE_FAIL handling in handle_invalid_guest_state()	Sean Christopherson	1	-21/+20
	Now that EMULATE_FAIL is completely unused, remove the last remaning usage where KVM does something functional in response to EMULATE_FAIL. Leave the check in place as a WARN_ON_ONCE to provide a better paper trail when EMULATE_{DONE,FAIL,USER_EXIT} are completely removed. Opportunistically remove the gotos in handle_invalid_guest_state(). With the EMULATE_FAIL handling gone there is no need to have a common handler for emulation failure and the gotos only complicate things, e.g. the signal_pending() check always returns '1', but this is far from obvious when glancing through the code. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Move triple fault request into RM int injection	Sean Christopherson	3	-16/+12
	Request triple fault in kvm_inject_realmode_interrupt() instead of returning EMULATE_FAIL and deferring to the caller. All existing callers request triple fault and it's highly unlikely Real Mode is going to acquire new features. While this consolidates a small amount of code, the real goal is to remove the last reference to EMULATE_FAIL. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Handle emulation failure directly in kvm_task_switch()	Sean Christopherson	3	-23/+11
	Consolidate the reporting of emulation failure into kvm_task_switch() so that it can return EMULATE_USER_EXIT. This helps pave the way for removing EMULATE_FAIL altogether. This also fixes a theoretical bug where task switch interception could suppress an EMULATE_USER_EXIT return. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Exit to userspace on emulation skip failure	Sean Christopherson	2	-4/+9
	Kill a few birds with one stone by forcing an exit to userspace on skip emulation failure. This removes a reference to EMULATE_FAIL, fixes a bug in handle_ept_misconfig() where it would exit to userspace without setting run->exit_reason, and fixes a theoretical bug in SVM's task_switch_interception() where it would overwrite run->exit_reason on a return of EMULATE_USER_EXIT. Note, this technically doesn't fully fix task_switch_interception() as it now incorrectly handles EMULATE_FAIL, but in practice there is no bug as EMULATE_FAIL will never be returned for EMULTYPE_SKIP. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Move #UD injection for failed emulation into emulation code	Sean Christopherson	1	-9/+5
	Immediately inject a #UD and return EMULATE done if emulation fails when handling an intercepted #UD. This helps pave the way for removing EMULATE_FAIL altogether. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Add explicit flag for forced emulation on #UD	Sean Christopherson	2	-2/+4
	Add an explicit emulation type for forced #UD emulation and use it to detect that KVM should unconditionally inject a #UD instead of falling into its standard emulation failure handling. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Move #GP injection for VMware into x86_emulate_instruction()	Sean Christopherson	4	-23/+14
	Immediately inject a #GP when VMware emulation fails and return EMULATE_DONE instead of propagating EMULATE_FAIL up the stack. This helps pave the way for removing EMULATE_FAIL altogether. Rename EMULTYPE_VMWARE to EMULTYPE_VMWARE_GP to document that the x86 emulator is called to handle VMware #GP interception, e.g. why a #GP is injected on emulation failure for EMULTYPE_VMWARE_GP. Drop EMULTYPE_NO_UD_ON_FAIL as a standalone type. The "no #UD on fail" is used only in the VMWare case and is obsoleted by having the emulator itself reinject #GP. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Don't attempt VMWare emulation on #GP with non-zero error code	Sean Christopherson	2	-2/+20
	The VMware backdoor hooks #GP faults on IN{S}, OUT{S}, and RDPMC, none of which generate a non-zero error code for their #GP. Re-injecting #GP instead of attempting emulation on a non-zero error code will allow a future patch to move #GP injection (for emulation failure) into kvm_emulate_instruction() without having to plumb in the error code. Reviewed-and-tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Refactor kvm_vcpu_do_singlestep() to remove out param	Sean Christopherson	1	-6/+6
	Return the single-step emulation result directly instead of via an out param. Presumably at some point in the past kvm_vcpu_do_singlestep() could be called with *r==EMULATE_USER_EXIT, but that is no longer the case, i.e. all callers are happy to overwrite their own return variable. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Clean up handle_emulation_failure()	Sean Christopherson	1	-6/+4
	When handling emulation failure, return the emulation result directly instead of capturing it in a local variable. Future patches will move additional cases into handle_emulation_failure(), clean up the cruft before so there isn't an ugly mix of setting a local variable and returning directly. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Relocate MMIO exit stats counting	Sean Christopherson	3	-3/+2
	Move the stat.mmio_exits update into x86_emulate_instruction(). This is both a bug fix, e.g. the current update flows will incorrectly increment mmio_exits on emulation failure, and a preparatory change to set the stage for eliminating EMULATE_DONE and company. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: nVMX: Check Host Address Space Size on vmentry of nested guests	Krish Sadhukhan	1	-0/+28
	According to section "Checks Related to Address-Space Size" in Intel SDM vol 3C, the following checks are performed on vmentry of nested guests: If the logical processor is outside IA-32e mode (if IA32_EFER.LMA = 0) at the time of VM entry, the following must hold: - The "IA-32e mode guest" VM-entry control is 0. - The "host address-space size" VM-exit control is 0. If the logical processor is in IA-32e mode (if IA32_EFER.LMA = 1) at the time of VM entry, the "host address-space size" VM-exit control must be 1. If the "host address-space size" VM-exit control is 0, the following must hold: - The "IA-32e mode guest" VM-entry control is 0. - Bit 17 of the CR4 field (corresponding to CR4.PCIDE) is 0. - Bits 63:32 in the RIP field are 0. If the "host address-space size" VM-exit control is 1, the following must hold: - Bit 5 of the CR4 field (corresponding to CR4.PAE) is 1. - The RIP field contains a canonical address. On processors that do not support Intel 64 architecture, checks are performed to ensure that the "IA-32e mode guest" VM-entry control and the "host address-space size" VM-exit control are both 0. Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: selftests: hyperv_cpuid: add check for NoNonArchitecturalCoreSharing bit	Vitaly Kuznetsov	1	-0/+27
	The bit is supposed to be '1' when SMT is not supported or forcefully disabled and '0' otherwise. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: hyper-v: set NoNonArchitecturalCoreSharing CPUID bit when SMT is ↵	Vitaly Kuznetsov	2	-1/+10
	impossible Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot guarantee that two virtual processors won't end up running on sibling SMT threads without knowing about it. This is done as an optimization as in this case there is nothing the guest can do to protect itself against MDS and issuing additional flush requests is just pointless. On bare metal the topology is known, however, when Hyper-V is running nested (e.g. on top of KVM) it needs an additional piece of information: a confirmation that the exposed topology (wrt vCPU placement on different SMT threads) is trustworthy. NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in TLFS as follows: "Indicates that a virtual processor will never share a physical core with another virtual processor, except for virtual processors that are reported as sibling SMT threads." From KVM we can give such guarantee in two cases: - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work as it can become re-enabled during the lifetime of the guest). - vCPUs are properly pinned so the scheduler won't put them on sibling SMT threads (when they're not reported as such). This patch reports NoNonArchitecturalCoreSharing bit in to userspace in the first case. The second case is outside of KVM's domain of responsibility (as vCPU pinning is actually done by someone who manages KVM's userspace - e.g. libvirt pinning QEMU threads). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	cpu/SMT: create and export cpu_smt_possible()	Vitaly Kuznetsov	2	-2/+11
	KVM needs to know if SMT is theoretically possible, this means it is supported and not forcefully disabled ('nosmt=force'). Create and export cpu_smt_possible() answering this question. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: hyperv: Fix Direct Synthetic timers assert an interrupt w/o lapic_in_kernel	Wanpeng Li	1	-2/+10
	Reported by syzkaller: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN RIP: 0010:__apic_accept_irq+0x46/0x740 arch/x86/kvm/lapic.c:1029 Call Trace: kvm_apic_set_irq+0xb4/0x140 arch/x86/kvm/lapic.c:558 stimer_notify_direct arch/x86/kvm/hyperv.c:648 [inline] stimer_expiration arch/x86/kvm/hyperv.c:659 [inline] kvm_hv_process_stimers+0x594/0x1650 arch/x86/kvm/hyperv.c:686 vcpu_enter_guest+0x2b2a/0x54b0 arch/x86/kvm/x86.c:7896 vcpu_run+0x393/0xd40 arch/x86/kvm/x86.c:8152 kvm_arch_vcpu_ioctl_run+0x636/0x900 arch/x86/kvm/x86.c:8360 kvm_vcpu_ioctl+0x6cf/0xaf0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2765 The testcase programs HV_X64_MSR_STIMERn_CONFIG/HV_X64_MSR_STIMERn_COUNT, in addition, there is no lapic in the kernel, the counters value are small enough in order that kvm_hv_process_stimers() inject this already-expired timer interrupt into the guest through lapic in the kernel which triggers the NULL deferencing. This patch fixes it by don't advertise direct mode synthetic timers and discarding the inject when lapic is not in kernel. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=1752fe0a600000 Reported-by: syzbot+dff25ee91f0c7d5c1695@syzkaller.appspotmail.com Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: Manually flush collapsible SPTEs only when toggling flags	Sean Christopherson	1	-1/+6
	Zapping collapsible sptes, a.k.a. 4k sptes that can be promoted into a large page, is only necessary when changing only the dirty logging flag of a memory region. If the memslot is also being moved, then all sptes for the memslot are zapped when it is invalidated. When a memslot is being created, it is impossible for there to be existing dirty mappings, e.g. KVM can have MMIO sptes, but not present, and thus dirty, sptes. Note, the comment and logic are shamelessly borrowed from MIPS's version of kvm_arch_commit_memory_region(). Fixes: 3ea3b7fa9af06 ("kvm: mmu: lazy collapse small sptes into large sptes") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: selftests: Remove duplicate guest mode handling	Peter Xu	3	-47/+26
	Remove the duplication code in run_test() of dirty_log_test because after some reordering of functions now we can directly use the outcome of vm_create(). Meanwhile, with the new VM_MODE_PXXV48_4K, we can safely revert b442324b58 too where we stick the x86_64 PA width to 39 bits for dirty_log_test. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: selftests: Introduce VM_MODE_PXXV48_4K	Peter Xu	6	-14/+67
	The naming VM_MODE_P52V48_4K is explicit but unclear when used on x86_64 machines, because x86_64 machines are having various physical address width rather than some static values. Here's some examples: - Intel Xeon E3-1220: 36 bits - Intel Core i7-8650: 39 bits - AMD EPYC 7251: 48 bits All of them are using 48 bits linear address width but with totally different physical address width (and most of the old machines should be less than 52 bits). Let's create a new guest mode called VM_MODE_PXXV48_4K for current x86_64 tests and make it as the default to replace the old naming of VM_MODE_P52V48_4K because it shows more clearly that the PA width is not really a constant. Meanwhile we also stop assuming all the x86 machines are having 52 bits PA width but instead we fetch the real vm->pa_bits from CPUID 0x80000008 during runtime. We currently make this exclusively used by x86_64 but no other arch. As a slight touch up, moving DEBUG macro from dirty_log_test.c to kvm_util.h so lib can use it too. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: selftests: Create VM earlier for dirty log test	Peter Xu	1	-3/+16
	Since we've just removed the dependency of vm type in previous patch, now we can create the vm much earlier. Note that to move it earlier we used an approximation of number of extra pages but it should be fine. This prepares for the follow up patches to finally remove the duplication of guest mode parsings. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: selftests: Move vm type into _vm_create() internally	Peter Xu	3	-20/+17
	Rather than passing the vm type from the top level to the end of vm creation, let's simply keep that as an internal of kvm_vm struct and decide the type in _vm_create(). Several reasons for doing this: - The vm type is only decided by physical address width and currently only used in aarch64, so we've got enough information as long as we're passing vm_guest_mode into _vm_create(), - This removes a loop dependency between the vm->type and creation of vms. That's why now we need to parse vm_guest_mode twice sometimes, once in run_test() and then again in _vm_create(). The follow up patches will move on to clean up that as well so we can have a single place to decide guest machine types and so. Note that this patch will slightly change the behavior of aarch64 tests in that previously most vm_create() callers will directly pass in type==0 into _vm_create() but now the type will depend on vm_guest_mode, however it shouldn't affect any user because all vm_create() users of aarch64 will be using VM_MODE_DEFAULT guest mode (which is VM_MODE_P40V48_4K) so at last type will still be zero. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: announce KVM_CAP_HYPERV_ENLIGHTENED_VMCS support only when it is ↵	Vitaly Kuznetsov	1	-2/+4
	available It was discovered that after commit 65efa61dc0d5 ("selftests: kvm: provide common function to enable eVMCS") hyperv_cpuid selftest is failing on AMD. The reason is that the commit changed _vcpu_ioctl() to vcpu_ioctl() in the test and this one can't fail. Instead of fixing the test is seems to make more sense to not announce KVM_CAP_HYPERV_ENLIGHTENED_VMCS support if it is definitely missing (on svm and in case kvm_intel.nested=0). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM: x86: svm: remove unneeded nested_enable_evmcs() hook	Vitaly Kuznetsov	1	-8/+1
	Since commit 5158917c7b019 ("KVM: x86: nVMX: Allow nested_enable_evmcs to be NULL") the code in x86.c is prepared to see nested_enable_evmcs being NULL and in VMX case it actually is when nesting is disabled. Remove the unneeded stub from SVM code. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM/Hyper-V/VMX: Add direct tlb flush support	Vitaly Kuznetsov	4	-0/+47
	Hyper-V provides direct tlb flush function which helps L1 Hypervisor to handle Hyper-V tlb flush request from L2 guest. Add the function support for VMX. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH	Tianyu Lan	4	-0/+23
	Hyper-V direct tlb flush function should be enabled for guest that only uses Hyper-V hypercall. User space hypervisor(e.g, Qemu) can disable KVM identification in CPUID and just exposes Hyper-V identification to make sure the precondition. Add new KVM capability KVM_CAP_ HYPERV_DIRECT_TLBFLUSH for user space to enable Hyper-V direct tlb function and this function is default to be disabled in KVM. Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	x86/Hyper-V: Fix definition of struct hv_vp_assist_page	Tianyu Lan	1	-5/+15
	The struct hv_vp_assist_page was defined incorrectly. The "vtl_control" should be u64[3], "nested_enlightenments _control" should be a u64 and there are 7 reserved bytes following "enlighten_vmentry". Fix the definition. Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24	kvm: x86: Add Intel PMU MSRs to msrs_to_save[]	Jim Mattson	1	-0/+41
	These MSRs should be enumerated by KVM_GET_MSR_INDEX_LIST, so that userspace knows that these MSRs may be part of the vCPU state. Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Eric Hankland <ehankland@google.com> Reviewed-by: Peter Shier <pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-23	Merge tag 'mfd-next-5.4' of ↵	Linus Torvalds	45	-727/+742
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "New Drivers: - Add support for Merrifield Basin Cove PMIC New Device Support: - Add support for Intel Tiger Lake to Intel LPSS PCI - Add support for Intel Sky Lake to Intel LPSS PCI - Add support for ST-Ericsson DB8520 to DB8500 PRCMU New Functionality: - Add RTC and PWRC support to MT6323 Fix-ups: - Clean-up include files; davinci_voicecodec, asic3, sm501, mt6397 - Ignore return values from debugfs_create(); ab3100-, ab8500-debugfs, aat2870-core - Device Tree changes; rn5t618, mt6397 - Use new I2C API; tps80031, 88pm860x-core, ab3100-core, bcm590xx, da9150-core, max14577, max77693, max77843, max8907, max8925-i2c, max8997, max8998, palmas, twl-core, - Remove obsolete code; da9063, jz4740-adc - Simplify semantics; timberdale, htc-i2cpld - Add 'fall-through' tags; omap-usb-host, db8500-prcmu - Remove superfluous prints; ab8500-debugfs, db8500-prcmu, fsl-imx25-tsadc, intel_soc_pmic_bxtwc, qcom_rpm, sm501 - Trivial rename/whitespace/typo fixes; mt6397-core, MAINTAINERS - Reorganise code structure; mt6397-* - Improve code consistency; intel-lpss - Use MODULE_SOFTDEP() helper; intel-lpss - Use DEFINE_RES_() helpers; mt6397-core Bug Fixes: - Clean-up resources; max77620 - Prevent input events being dropped on resume; intel-lpss-pci - Prevent sleeping in IRQ context; ezx-pcap" tag 'mfd-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (48 commits) mfd: mt6323: Add MT6323 RTC and PWRC mfd: mt6323: Replace boilerplate resource code with DEFINE_RES_* macros mfd: mt6397: Add mutex include dt-bindings: mfd: mediatek: Add MT6323 Power Controller dt-bindings: mfd: mediatek: Update RTC to include MT6323 dt-bindings: mfd: mediatek: mt6397: Change to relative paths mfd: db8500-prcmu: Support the higher DB8520 ARMSS mfd: intel-lpss: Use MODULE_SOFTDEP() instead of implicit request mfd: htc-i2cpld: Drop check because i2c_unregister_device() is NULL safe mfd: sm501: Include the GPIO driver header mfd: intel-lpss: Add Intel Skylake ACPI IDs mfd: intel-lpss: Consistently use GENMASK() mfd: Add support for Merrifield Basin Cove PMIC mfd: ezx-pcap: Replace mutex_lock with spin_lock mfd: asic3: Include the right header MAINTAINERS: altera-sysmgr: Fix typo in a filepath mfd: mt6397: Extract IRQ related code from core driver mfd: mt6397: Rename macros to something more readable mfd: Remove dev_err() usage after platform_get_irq() mfd: db8500-prcmu: Mark expected switch fall-throughs ...
2019-09-23	Merge tag 'backlight-next-5.4' of ↵	Linus Torvalds	11	-14/+120
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight Pull backlight updates from Lee Jones: "Core Frameworks - Obtain scale type through sysfs New Functionality: - Provide Device Tree functionality in rave-sp-backlight - Calculate if scale type is (non-)linear in pwm_bl Fix-ups: - Simplify code in lm3630a_bl - Trivial rename/whitespace/typo fixes in lms283gf05 - Remove superfluous NULL check in tosa_lcd - Fix power state initialisation in gpio_backlight - List supported file in MAINTAINERS Bug Fixes: - Kconfig - default to not building unless requested in {LED,BACKLIGHT}_CLASS_DEVICE" * tag 'backlight-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: backlight: pwm_bl: Set scale type for brightness curves specified in the DT backlight: pwm_bl: Set scale type for CIE 1931 curves backlight: Expose brightness curve type through sysfs MAINTAINERS: Add entry for stable backlight sysfs ABI documentation backlight: gpio-backlight: Correct initial power state handling video: backlight: tosa_lcd: drop check because i2c_unregister_device() is NULL safe video: backlight: Drop default m for {LCD,BACKLIGHT_CLASS_DEVICE} backlight: lms283gf05: Fix a typo in the description passed to 'devm_gpio_request_one()' backlight: lm3630a: Switch to use fwnode_property_count_uXX() backlight: rave-sp: Leave initial state and register with correct device
2019-09-23	Merge tag 'pci-v5.4-changes' of ↵	Linus Torvalds	99	-1186/+4201
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Consolidate _HPP/_HPX stuff in pci-acpi.c and simplify it (Krzysztof Wilczynski) - Fix incorrect PCIe device types and remove dev->has_secondary_link to simplify code that deals with upstream/downstream ports (Mika Westerberg) - After suspend, restore Resizable BAR size bits correctly for 1MB BARs (Sumit Saxena) - Enable PCI_MSI_IRQ_DOMAIN support for RISC-V (Wesley Terpstra) Virtualization: - Add ACS quirks for iProc PAXB (Abhinav Ratna), Amazon Annapurna Labs (Ali Saidi) - Move sysfs SR-IOV functions to iov.c (Kelsey Skunberg) - Remove group write permissions from sysfs sriov_numvfs, sriov_drivers_autoprobe (Kelsey Skunberg) Hotplug: - Simplify pciehp indicator control (Denis Efremov) Peer-to-peer DMA: - Allow P2P DMA between root ports for whitelisted bridges (Logan Gunthorpe) - Whitelist some Intel host bridges for P2P DMA (Logan Gunthorpe) - DMA map P2P DMA requests that traverse host bridge (Logan Gunthorpe) Amazon Annapurna Labs host bridge driver: - Add DT binding and controller driver (Jonathan Chocron) Hyper-V host bridge driver: - Fix hv_pci_dev->pci_slot use-after-free (Dexuan Cui) - Fix PCI domain number collisions (Haiyang Zhang) - Use instance ID bytes 4 & 5 as PCI domain numbers (Haiyang Zhang) - Fix build errors on non-SYSFS config (Randy Dunlap) i.MX6 host bridge driver: - Limit DBI register length (Stefan Agner) Intel VMD host bridge driver: - Fix config addressing issues (Jon Derrick) Layerscape host bridge driver: - Add bar_fixed_64bit property to endpoint driver (Xiaowei Bao) - Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC drivers separately (Xiaowei Bao) Mediatek host bridge driver: - Add MT7629 controller support (Jianjun Wang) Mobiveil host bridge driver: - Fix CPU base address setup (Hou Zhiqiang) - Make "num-lanes" property optional (Hou Zhiqiang) Tegra host bridge driver: - Fix OF node reference leak (Nishka Dasgupta) - Disable MSI for root ports to work around design problem (Vidya Sagar) - Add Tegra194 DT binding and controller support (Vidya Sagar) - Add support for sideband pins and slot regulators (Vidya Sagar) - Add PIPE2UPHY support (Vidya Sagar) Misc: - Remove unused pci_block_cfg_access() et al (Kelsey Skunberg) - Unexport pci_bus_get(), etc (Kelsey Skunberg) - Hide PM, VC, link speed, ATS, ECRC, PTM constants and interfaces in the PCI core (Kelsey Skunberg) - Clean up sysfs DEVICE_ATTR() usage (Kelsey Skunberg) - Mark expected switch fall-through (Gustavo A. R. Silva) - Propagate errors for optional regulators and PHYs (Thierry Reding) - Fix kernel command line resource_alignment parameter issues (Logan Gunthorpe)" * tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (112 commits) PCI: Add pci_irq_vector() and other stubs when !CONFIG_PCI arm64: tegra: Add PCIe slot supply information in p2972-0000 platform arm64: tegra: Add configuration for PCIe C5 sideband signals PCI: tegra: Add support to enable slot regulators PCI: tegra: Add support to configure sideband pins PCI: vmd: Fix shadow offsets to reflect spec changes PCI: vmd: Fix config addressing when using bus offsets PCI: dwc: Add validation that PCIe core is set to correct mode PCI: dwc: al: Add Amazon Annapurna Labs PCIe controller driver dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port PCI/VPD: Prevent VPD access for Amazon's Annapurna Labs Root Port PCI: Add ACS quirk for Amazon Annapurna Labs root ports PCI: Add Amazon's Annapurna Labs vendor ID MAINTAINERS: Add PCI native host/endpoint controllers designated reviewer PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries dt-bindings: PCI: tegra: Add sideband pins configuration entries PCI: tegra: Add Tegra194 PCIe support PCI: Get rid of dev->has_secondary_link flag ...
2019-09-23	Merge tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next	Linus Torvalds	2	-23/+23
	Pull smack updates from Casey Schaufler: "Four patches for v5.4. Nothing is major. All but one are in response to mechanically detected potential issues. The remaining patch cleans up kernel-doc notations" * tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next: smack: use GFP_NOFS while holding inode_smack::smk_lock security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb() smack: fix some kernel-doc notations Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
2019-09-23	Merge branch 'pci/trivial'	Bjorn Helgaas	13	-21/+7
	- Fix typos and whitespace errors (Bjorn Helgaas, Krzysztof Wilczynski) - Remove unnecessary "return" statements (Krzysztof Wilczynski) - Correct of_irq_parse_pci() function documentation (Lubomir Rintel) * pci/trivial: PCI: Remove unnecessary returns PCI: OF: Correct of_irq_parse_pci() documentation PCI: Fix typos and whitespace errors
2019-09-23	Merge branch 'remotes/lorenzo/pci/vmd'	Bjorn Helgaas	1	-10/+15
	- Fix VMD config addressing to ignore starting bus offset (Jon Derrick) - Fix VMD shadow offset scratchpad address (Jon Derrick) * remotes/lorenzo/pci/vmd: PCI: vmd: Fix shadow offsets to reflect spec changes PCI: vmd: Fix config addressing when using bus offsets
2019-09-23	Merge branch 'lorenzo/pci/tegra'	Bjorn Helgaas	20	-52/+2336
	- Fix Tegra OF node reference leak (Nishka Dasgupta) - Add #defines for PCIe Data Link Feature and Physical Layer 16.0 GT/s features (Vidya Sagar) - Disable MSI for Tegra Root Ports since they don't support using MSI for all Root Port events (Vidya Sagar) - Group DesignWare write-protected register writes together (Vidya Sagar) - Move DesignWare capability search interfaces so they can be used by both host and endpoint drivers (Vidya Sagar) - Add DesignWare extended capability search interfaces (Vidya Sagar) - Export dw_pcie_wait_for_link() so drivers can be modules (Vidya Sagar) - Add "snps,enable-cdm-check" DT binding for Configuration Dependent Module (CDM) register checking (Vidya Sagar) - Add DesignWare support for "snps,enable-cdm-check" CDM checking (Vidya Sagar) - Add "supports-clkreq" DT binding for host drivers to decide whether to advertise low power features (Vidya Sagar) - Add DT binding for Tegra194 (Vidya Sagar) - Add DT binding for Tegra194 P2U (PIPE to UPHY) block (Vidya Sagar) - Add support for Tegra194 P2U (PIPE to UPHY) (Vidya Sagar) - Add support for Tegra194 host controller (Vidya Sagar) - Add Tegra support for sideband PERST# and CLKREQ# for C5 (Vidya Sagar) - Add Tegra support for slot regulators for p2972-0000 platform (Vidya Sagar) * lorenzo/pci/tegra: arm64: tegra: Add PCIe slot supply information in p2972-0000 platform arm64: tegra: Add configuration for PCIe C5 sideband signals PCI: tegra: Add support to enable slot regulators PCI: tegra: Add support to configure sideband pins dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries dt-bindings: PCI: tegra: Add sideband pins configuration entries PCI: tegra: Add Tegra194 PCIe support phy: tegra: Add PCIe PIPE2UPHY support dt-bindings: PHY: P2U: Add Tegra194 P2U block dt-bindings: PCI: tegra: Add device tree support for Tegra194 dt-bindings: Add PCIe supports-clkreq property PCI: dwc: Add support to enable CDM register check dt-bindings: PCI: designware: Add binding for CDM register check PCI: dwc: Export dw_pcie_wait_for_link() API PCI: dwc: Add extended configuration space capability search API PCI: dwc: Move config space capability search API PCI: dwc: Group DBI registers writes requiring unlocking PCI: Disable MSI for Tegra root ports PCI: Add #defines for some of PCIe spec r4.0 features PCI: tegra: Fix OF node reference leak
2019-09-23	Merge branch 'remotes/lorenzo/pci/mobiveil'	Bjorn Helgaas	1	-3/+7
	- Fix mobiveil inbound window CPU base address setup (Hou Zhiqiang) * remotes/lorenzo/pci/mobiveil: PCI: mobiveil: Fix the CPU base address setup in inbound window
2019-09-23	Merge branch 'remotes/lorenzo/pci/misc'	Bjorn Helgaas	7	-23/+20
	- Propagate regulator_get_optional() errors so callers can distinguish real errors from optional regulators that are absent (Thierry Reding) - Propagate devm_of_phy_get() errors so callers can distinguish real errors from optional PHYs that are absent (Thierry Reding) - Add Andrew Murray as PCI native driver reviewer (Lorenzo Pieralisi) * remotes/lorenzo/pci/misc: MAINTAINERS: Add PCI native host/endpoint controllers designated reviewer PCI: iproc: Propagate errors for optional PHYs PCI: histb: Propagate errors for optional regulators PCI: armada8x: Propagate errors for optional PHYs PCI: imx6: Propagate errors for optional regulators PCI: exynos: Propagate errors for optional PHYs PCI: rockchip: Propagate errors for optional regulators
2019-09-23	Merge branch 'remotes/lorenzo/pci/mediatek'	Bjorn Helgaas	3	-0/+20
	- Add mediatek support for MT7629 (Jianjun Wang) * remotes/lorenzo/pci/mediatek: PCI: mediatek: Add controller support for MT7629 dt-bindings: PCI: Add support for MT7629
2019-09-23	Merge branch 'remotes/lorenzo/pci/layerscape'	Bjorn Helgaas	3	-3/+21
	- Mark Layerscape endpoint BARs 2 and 4 as 64-bit (Xiaowei Bao) - Add CONFIG_PCI_LAYERSCAPE_EP so EP/RC can be built separately (Xiaowei Bao) * remotes/lorenzo/pci/layerscape: PCI: layerscape: Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC separately PCI: layerscape: Add the bar_fixed_64bit property to the endpoint driver
2019-09-23	Merge branch 'remotes/lorenzo/pci/imx'	Bjorn Helgaas	1	-0/+33
	- Reduce i.MX 6Quad DBI register length to avoid aborts from accessing invalid registers (Stefan Agner) * remotes/lorenzo/pci/imx: PCI: imx6: Limit DBI register length
2019-09-23	Merge branch 'remotes/lorenzo/pci/hv'	Bjorn Helgaas	2	-15/+81
	- Fix Hyper-V use-after-free in pci_dev removal (Dexuan Cui) - Fix Hyper-V build error in non-sysfs config (Randy Dunlap) - Reallocate to avoid Hyper-V domain number collisions (Haiyang Zhang) - Use Hyper-V instance ID bytes 4-5 to reduce domain collisions (Haiyang Zhang) * remotes/lorenzo/pci/hv: PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers PCI: hv: Detect and fix Hyper-V PCI domain number collision PCI: pci-hyperv: Fix build errors on non-SYSFS config PCI: hv: Avoid use of hv_pci_dev->pci_slot after freeing it
2019-09-23	Merge branch 'remotes/lorenzo/pci/dwc'	Bjorn Helgaas	9	-23/+5
	- Make kirin_dw_pcie_ops constant (Nishka Dasgupta) - Make DesignWare "num-lanes" property optional and remove from relevant DTs (Hou Zhiqiang) * remotes/lorenzo/pci/dwc: arm64: dts: fsl: Remove num-lanes property from PCIe nodes ARM: dts: ls1021a: Remove num-lanes property from PCIe nodes PCI: dwc: Return directly when num-lanes is not found dt-bindings: PCI: designware: Remove the num-lanes from Required properties PCI: kirin: Make structure kirin_dw_pcie_ops constant
2019-09-23	Merge branch 'remotes/lorenzo/pci/al'	Bjorn Helgaas	9	-1/+495
	- Add driver for Amazon Annapurna Labs PCIe controller (Jonathan Chocron) - Disable MSI-X since Annapurna Labs advertises it, but it's broken (Jonathan Chocron) - Disable VPD since Annapurna Labs advertises it, but it's broken (Jonathan Chocron) - Add ACS quirk since Annapurna Labs doesn't support ACS but does provide some equivalent protections (Ali Saidi) * remotes/lorenzo/pci/al: PCI: dwc: Add validation that PCIe core is set to correct mode PCI: dwc: al: Add Amazon Annapurna Labs PCIe controller driver dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port PCI/VPD: Prevent VPD access for Amazon's Annapurna Labs Root Port PCI: Add ACS quirk for Amazon Annapurna Labs root ports PCI: Add Amazon's Annapurna Labs vendor ID # Conflicts: # drivers/pci/quirks.c
2019-09-23	Merge branch 'pci/resource'	Bjorn Helgaas	9	-30/+20
	- Convert pci_resource_to_user() to a weak function to remove HAVE_ARCH_PCI_RESOURCE_TO_USER #defines (Denis Efremov) - Use PCI_SRIOV_NUM_BARS for idiomatic loop structure (Denis Efremov) - Fix Resizable BAR size suspend/restore for 1MB BARs (Sumit Saxena) - Correct "pci=resource_alignment" example in documentation (Alexey Kardashevskiy) * pci/resource: PCI: Correct pci=resource_alignment parameter example PCI: Restore Resizable BAR size bits correctly for 1MB BARs PCI: Use PCI_SRIOV_NUM_BARS in loops instead of PCI_IOV_RESOURCE_END PCI: Convert pci_resource_to_user() to a weak function # Conflicts: # drivers/pci/pci.c
2019-09-23	Merge branch 'pci/pciehp'	Bjorn Helgaas	5	-80/+67
	- Cleanup pciehp LED/indicator control with a new consolidated pciehp_set_indicators() interface that controls both Attention and Power Indicators (Denis Efremov) * pci/pciehp: PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments PCI: pciehp: Remove pciehp_green_led_{on,off,blink}() PCI: pciehp: Remove pciehp_set_attention_status() PCI: pciehp: Combine adjacent indicator updates PCI: pciehp: Add pciehp_set_indicators() to set both indicators