Age | Commit message (Collapse) | Author | Files | Lines |
|
TLS 1.3 has to strip padding, and it starts out 16 bytes
from the end of the record. Make it clear this is because
of the auth tag.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar justification to previous change, the information
about decryption status belongs in the skb.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Original TLS implementation was handling one record at a time.
It stashed the type of the record inside tls context (per socket
structure) for convenience. When async crypto support was added
[1] the author had to use skb->cb to store the type per-message.
The use of skb->cb overlaps with strparser, however, so a hybrid
approach was taken where type is stored in context while parsing
(since we parse a message at a time) but once parsed its copied
to skb->cb.
Recently a workaround for sockmaps [2] exposed the previously
private struct _strp_msg and started a trend of adding user
fields directly in strparser's header. This is cleaner than
storing information about an skb in the context.
This change is not strictly necessary, but IMHO the ownership
of the context field is confusing. Information naturally
belongs to the skb.
[1] commit 94524d8fc965 ("net/tls: Add support for async decryption of tls records")
[2] commit b2c4618162ec ("bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - new code bugs:
- mctp: correct mctp_i2c_header_create result
- eth: fungible: fix reference to __udivdi3 on 32b builds
- eth: micrel: remove latencies support lan8814
Previous releases - regressions:
- bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
- vrf: fix packet sniffing for traffic originating from ip tunnels
- rxrpc: fix a race in rxrpc_exit_net()
- dsa: revert "net: dsa: stop updating master MTU from master.c"
- eth: ice: fix MAC address setting
Previous releases - always broken:
- tls: fix slab-out-of-bounds bug in decrypt_internal
- bpf: support dual-stack sockets in bpf_tcp_check_syncookie
- xdp: fix coalescing for page_pool fragment recycling
- ovs: fix leak of nested actions
- eth: sfc:
- add missing xdp queue reinitialization
- fix using uninitialized xdp tx_queue
- eth: ice:
- clear default forwarding VSI during VSI release
- fix broken IFF_ALLMULTI handling
- synchronize_rcu() when terminating rings
- eth: qede: confirm skb is allocated before using
- eth: aqc111: fix out-of-bounds accesses in RX fixup
- eth: slip: fix NPD bug in sl_tx_timeout()"
* tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
drivers: net: slip: fix NPD bug in sl_tx_timeout()
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
qede: confirm skb is allocated before using
net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n
net: phy: mscc-miim: reject clause 45 register accesses
net: axiemac: use a phandle to reference pcs_phy
dt-bindings: net: add pcs-handle attribute
net: axienet: factor out phy_node in struct axienet_local
net: axienet: setup mdio unconditionally
net: sfc: fix using uninitialized xdp tx_queue
rxrpc: fix a race in rxrpc_exit_net()
net: openvswitch: fix leak of nested actions
net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address()
net: openvswitch: don't send internal clone attribute to the userspace.
net: micrel: Fix KS8851 Kconfig
ice: clear cmd_type_offset_bsz for TX rings
ice: xsk: fix VSI state check in ice_xsk_wakeup()
...
|
|
The congestion status of a tcp flow may be updated since there
is congestion between tcp sender and receiver. It makes sense to
add tracepoint for congestion status set function to summate cc
status duration and evaluate the performance of network
and congestion algorithm. the backgound of this patch is below.
Link: https://github.com/iovisor/bcc/pull/3899
Signed-off-by: Ping Gan <jacky_gam_2001@163.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220406010956.19656-1-jacky_gam_2001@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Increment rx_otherhost_dropped counter when packet dropped due to
mismatched dest MAC addr.
An example when this drop can occur is when manually crafting raw
packets that will be consumed by a user space application via a tap
device. For testing purposes local traffic was generated using trafgen
for the client and netcat to start a server
Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
with "{eth(daddr=$INCORRECT_MAC...}", verified that iproute2 showed the
counter was incremented. (Also had to modify iproute2 to show the stat,
additional patch for that coming next.)
Signed-off-by: Jeffrey Ji <jeffreyji@google.com>
Reviewed-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220406172600.1141083-1-jeffreyjilinux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There's a number of functions and static variables used
under net/core/ but not from the outside. We currently
dump most of them into netdevice.h. That bad for many
reasons:
- netdevice.h is very cluttered, hard to figure out
what the APIs are;
- netdevice.h is very long;
- we have to touch netdevice.h more which causes expensive
incremental builds.
Create a header under net/core/ and move some declarations.
The new header is also a bit of a catch-all but that's
fine, if we create more specific headers people will
likely over-think where their declaration fit best.
And end up putting them in netdevice.h, again.
More work should be done on splitting netdevice.h into more
targeted headers, but that'd be more time consuming so small
steps.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Correctly propagate coherence information for VMbus devices (Michael
Kelley)
- Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng)
- Use barrier to prevent reording when reading ring buffer (Michael
Kelley)
- Use virt_store_mb in favour of smp_store_mb (Andrea Parri)
- Fix VMbus device object initialization (Andrea Parri)
- Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri)
- Fix a crash when unloading VMbus module (Guilherme G. Piccoli)
* tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
Drivers: hv: balloon: Disable balloon and hot-add accordingly
Drivers: hv: balloon: Support status report for larger page sizes
Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
PCI: hv: Propagate coherence from VMbus device to PCI device
Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
Drivers: hv: vmbus: Fix potential crash on module unload
Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register()
Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
|
|
idev->addr_list needs to be protected by idev->lock. However, it is not
always possible to do so while iterating and performing actions on
inet6_ifaddr instances. For example, multiple functions (like
addrconf_{join,leave}_anycast) eventually call down to other functions
that acquire the idev->lock. The current code temporarily unlocked the
idev->lock during the loops, which can cause race conditions. Moving the
locks up is also not an appropriate solution as the ordering of lock
acquisition will be inconsistent with for example mc_lock.
This solution adds an additional field to inet6_ifaddr that is used
to temporarily add the instances to a temporary list while holding
idev->lock. The temporary list can then be traversed without holding
idev->lock. This change was done in two places. In addrconf_ifdown, the
list_for_each_entry_safe variant of the list loop is also no longer
necessary as there is no deletion within that specific loop.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220403231523.45843-1-dossche.niels@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2022-04-06
We've added 8 non-merge commits during the last 8 day(s) which contain
a total of 9 files changed, 139 insertions(+), 36 deletions(-).
The main changes are:
1) rethook related fixes, from Jiri and Masami.
2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin.
3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf: selftests: Test fentry tracing a struct_ops program
bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
rethook: Fix to use WRITE_ONCE() for rethook:: Handler
selftests/bpf: Fix warning comparing pointer to 0
bpf: Fix sparse warnings in kprobe_multi_resolve_syms
bpftool: Explicit errno handling in skeletons
====================
Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We had various bugs over the years with code
breaking the assumption that tp->snd_cwnd is greater
than zero.
Lately, syzbot reported the WARN_ON_ONCE(!tp->prior_cwnd) added
in commit 8b8a321ff72c ("tcp: fix zero cwnd in tcp_cwnd_reduction")
can trigger, and without a repro we would have to spend
considerable time finding the bug.
Instead of complaining too late, we want to catch where
and when tp->snd_cwnd is set to an illegal value.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Link: https://lore.kernel.org/r/20220405233538.947344-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This allows hardware flow offloading from Ethernet to WLAN on MT7622 SoC
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Wireless Ethernet Dispatch subsystem on the MT7622 SoC can be
configured to intercept and handle access to the DMA queues and
PCIe interrupts for a MT7615/MT7915 wireless card.
It can manage the internal WDMA (Wireless DMA) controller, which allows
ethernet packets to be passed from the packet switch engine (PSE) to the
wireless card, bypassing the CPU entirely.
This can be used to implement hardware flow offloading from ethernet to
WLAN.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In include/uapi/linux/tipc_config.h, there's a comment that it includes
arpa/inet.h for ntohs; but ntohs is not defined in any UAPI header. For
now, reuse the definitions from include/linux/byteorder/generic.h, since
the various conversion functions do exist in UAPI headers:
include/uapi/linux/byteorder/big_endian.h
include/uapi/linux/byteorder/little_endian.h
We would like to get to the point where we can build UAPI header tests
with -nostdinc, meaning that kernel UAPI headers should not have a
circular dependency on libc headers.
Link: https://android-review.googlesource.com/c/platform/bionic/+/2048127
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb_recv_datagram() has two parameters 'flags' and 'noblock' that are
merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)'
As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags'
into 'flags' and 'noblock' with finally obsolete bit operations like this:
skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc);
And this is not even done consistently with the 'flags' parameter.
This patch removes the obsolete and costly splitting into two parameters
and only performs bit operations when really needed on the caller side.
One missing conversion thankfully reported by kernel test robot. I missed
to enable kunit tests to build the mctp code.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In [1], Will raised a potential issue that the cfg80211 code,
which does (from a locking perspective)
rtnl_lock()
wiphy_lock()
rtnl_unlock()
might be suspectible to ABBA deadlocks, because rtnl_unlock()
calls netdev_run_todo(), which might end up calling rtnl_lock()
again, which could then deadlock (see the comment in the code
added here for the scenario).
Some back and forth and thinking ensued, but clearly this can't
happen if the net_todo_list is empty at the rtnl_unlock() here.
Clearly, the code here cannot actually put an entry on it, and
all other users of rtnl_unlock() will empty it since that will
always go through netdev_run_todo(), emptying the list.
So the only other way to get there would be to add to the list
and then unlock the RTNL without going through rtnl_unlock(),
which is only possible through __rtnl_unlock(). However, this
isn't exported and not used in many places, and none of them
seem to be able to unregister before using it.
Therefore, add a WARN_ON() in the code to ensure this invariant
won't be broken, so that the cfg80211 (or any similar) code
stays safe.
[1] https://lore.kernel.org/r/Yjzpo3TfZxtKPMAG@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220404113847.0ee02e4a70da.Ic73d206e217db20fd22dcec14fe5442ca732804b@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull virtio fixes from Michael Tsirkin:
"Fixes and cleanups:
- A couple of mlx5 fixes related to cvq
- A couple of reverts dropping useless code (code that used it got
reverted earlier)"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa: mlx5: synchronize driver status with CVQ
vdpa: mlx5: prevent cvq work from hogging CPU
Revert "virtio_config: introduce a new .enable_cbs method"
Revert "virtio: use virtio_device_ready() in virtio_device_restore()"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull more tracing updates from Steven Rostedt:
- Rename the staging files to give them some meaning. Just
stage1,stag2,etc, does not show what they are for
- Check for NULL from allocation in bootconfig
- Hold event mutex for dyn_event call in user events
- Mark user events to broken (to work on the API)
- Remove eBPF updates from user events
- Remove user events from uapi header to keep it from being installed.
- Move ftrace_graph_is_dead() into inline as it is called from hot
paths and also convert it into a static branch.
* tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Move user_events.h temporarily out of include/uapi
ftrace: Make ftrace_graph_is_dead() a static branch
tracing: Set user_events to BROKEN
tracing/user_events: Remove eBPF interfaces
tracing/user_events: Hold event_mutex during dyn_event_add
proc: bootconfig: Add null pointer check
tracing: Rename the staging files for trace_events
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RT signal fix from Thomas Gleixner:
"Revert the RT related signal changes. They need to be reworked and
generalized"
* tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
|
|
Pull more dma-mapping updates from Christoph Hellwig:
- fix a regression in dma remap handling vs AMD memory encryption (me)
- finally kill off the legacy PCI DMA API (Christophe JAILLET)
* tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: move pgprot_decrypted out of dma_pgprot
PCI/doc: cleanup references to the legacy PCI DMA API
PCI: Remove the deprecated "pci-dma-compat.h" API
|
|
Pull kvm fixes from Paolo Bonzini:
- Only do MSR filtering for MSRs accessed by rdmsr/wrmsr
- Documentation improvements
- Prevent module exit until all VMs are freed
- PMU Virtualization fixes
- Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences
- Other miscellaneous bugfixes
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
KVM: x86: fix sending PV IPI
KVM: x86/mmu: do compare-and-exchange of gPTE via the user address
KVM: x86: Remove redundant vm_entry_controls_clearbit() call
KVM: x86: cleanup enter_rmode()
KVM: x86: SVM: fix tsc scaling when the host doesn't support it
kvm: x86: SVM: remove unused defines
KVM: x86: SVM: move tsc ratio definitions to svm.h
KVM: x86: SVM: fix avic spec based definitions again
KVM: MIPS: remove reference to trap&emulate virtualization
KVM: x86: document limitations of MSR filtering
KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr
KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
KVM: x86/pmu: Fix and isolate TSX-specific performance event logic
KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set
KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
KVM: x86: Trace all APICv inhibit changes and capture overall status
KVM: x86: Add wrappers for setting/clearing APICv inhibits
KVM: x86: Make APICv inhibit reasons an enum and cleanup naming
KVM: X86: Handle implicit supervisor access with SMAP
KVM: X86: Rename variable smap to not_smap in permission_fault()
...
|
|
After being merged, user_events become more visible to a wider audience
that have concerns with the current API.
It is too late to fix this for this release, but instead of a full
revert, just mark it as BROKEN (which prevents it from being selected in
make config). Then we can work finding a better API. If that fails,
then it will need to be completely reverted.
To not have the code silently bitrot, still allow building it with
COMPILE_TEST.
And to prevent the uapi header from being installed, then later changed,
and then have an old distro user space see the old version, move the
header file out of the uapi directory.
Surround the include with CONFIG_COMPILE_TEST to the current location,
but when the BROKEN tag is taken off, it will use the uapi directory,
and fail to compile. This is a good way to remind us to move the header
back.
Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
While user_events API is under development and has been marked for broken
to not let the API become fixed, move the header file out of the uapi
directory. This is to prevent it from being installed, then later changed,
and then have an old distro user space update with a new kernel, where
applications see the user_events being available, but the old header is in
place, and then they get compiled incorrectly.
Also, surround the include with CONFIG_COMPILE_TEST to the current
location, but when the BROKEN tag is taken off, it will use the uapi
directory, and fail to compile. This is a good way to remind us to move
the header back.
Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.home
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
ftrace_graph_is_dead() is used on hot paths, it just reads a variable
in memory and is not worth suffering function call constraints.
For instance, at entry of prepare_ftrace_return(), inlining it avoids
saving prepare_ftrace_return() parameters to stack and restoring them
after calling ftrace_graph_is_dead().
While at it using a static branch is even more performant and is
rather well adapted considering that the returned value will almost
never change.
Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool
by a static branch.
The performance improvement is noticeable.
Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Remove eBPF interfaces within user_events to ensure they are fully
reviewed.
Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/
Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.com
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
When looking for implementation of different phases of the creation of the
TRACE_EVENT() macro, it is pretty useless when all helper macro
redefinitions are in files labeled "stageX_defines.h". Rename them to
state which phase the files are for. For instance, when looking for the
defines that are used to create the event fields, seeing
"stage4_event_fields.h" gives the developer a good idea that the defines
are in that file.
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
It isn't OK to cache the dirty status of a page in internal structures
for an indefinite period of time.
Any time a vCPU exits the run loop to userspace might be its last; the
VMM might do its final check of the dirty log, flush the last remaining
dirty pages to the destination and complete a live migration. If we
have internal 'dirty' state which doesn't get flushed until the vCPU
is finally destroyed on the source after migration is complete, then
we have lost data because that will escape the final copy.
This problem already exists with the use of kvm_vcpu_unmap() to mark
pages dirty in e.g. VMX nesting.
Note that the actual Linux MM already considers the page to be dirty
since we have a writeable mapping of it. This is just about the KVM
dirty logging.
For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to
track which gfn_to_pfn_caches have been used and explicitly mark the
corresponding pages dirty before returning to userspace. But we would
have needed external tracking of that anyway, rather than walking the
full list of GPCs to find those belonging to this vCPU which are dirty.
So let's rely *solely* on that external tracking, and keep it simple
rather than laying a tempting trap for callers to fall into.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220303154127.202856-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Replace the guest_uses_pa and kernel_map booleans in the PFN cache code
with a unified enum/bitmask. Using explicit names makes it easier to
review and audit call sites.
Opportunistically add a WARN to prevent passing garbage; instantating a
cache without declaring its usage is either buggy or pointless.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220303154127.202856-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Don't actually set a request bit in vcpu->requests when making a request
purely to force a vCPU to exit the guest. Logging a request but not
actually consuming it would cause the vCPU to get stuck in an infinite
loop during KVM_RUN because KVM would see the pending request and bail
from VM-Enter to service the request.
Note, it's currently impossible for KVM to set KVM_REQ_GPC_INVALIDATE as
nothing in KVM is wired up to set guest_uses_pa=true. But, it'd be all
too easy for arch code to introduce use of kvm_gfn_to_pfn_cache_init()
without implementing handling of the request, especially since getting
test coverage of MMU notifier interaction with specific KVM features
usually requires a directed test.
Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus
to evict_vcpus. The purpose of the request is to get vCPUs out of guest
mode, it's supposed to _avoid_ waking vCPUs that are blocking.
Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to
what it wants to accomplish, and to genericize the name so that it can
used for similar but unrelated scenarios, should they arise in the future.
Add a comment and documentation to explain why the "no action" request
exists.
Add compile-time assertions to help detect improper usage. Use the inner
assertless helper in the one s390 path that makes requests without a
hardcoded request.
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220223165302.3205276-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
"Assorted bits and pieces"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aio: drop needless assignment in aio_read()
clean overflow checks in count_mounts() a bit
seq_file: fix NULL pointer arithmetic warning
uml/x86: use x86 load_unaligned_zeropad()
asm/user.h: killed unused macros
constify struct path argument of finish_automount()/do_add_mount()
fs: Remove FIXME comment in generic_write_checks()
|
|
Pull block driver fixes from Jens Axboe:
"Followup block driver updates and fixes for the 5.18-rc1 merge window.
In detail:
- NVMe pull request
- Fix multipath hang when disk goes live over reconnect (Anton
Eidelman)
- fix RCU hole that allowed for endless looping in multipath
round robin (Chris Leech)
- remove redundant assignment after left shift (Colin Ian King)
- add quirks for Samsung X5 SSDs (Monish Kumar R)
- fix the read-only state for zoned namespaces with unsupposed
features (Pankaj Raghav)
- use a private workqueue instead of the system workqueue in
nvmet (Sagi Grimberg)
- allow duplicate NSIDs for private namespaces (Sungup Moon)
- expose use_threaded_interrupts read-only in sysfs (Xin Hao)"
- nbd minor allocation fix (Zhang)
- drbd fixes and maintainer addition (Lars, Jakob, Christoph)
- n64cart build fix (Jackie)
- loop compat ioctl fix (Carlos)
- misc fixes (Colin, Dongli)"
* tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block:
drbd: remove check of list iterator against head past the loop body
drbd: remove usage of list iterator variable after loop
nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
MAINTAINERS: add drbd co-maintainer
drbd: fix potential silent data corruption
loop: fix ioctl calls using compat_loop_info
nvme-multipath: fix hang when disk goes live over reconnect
nvme: fix RCU hole that allowed for endless looping in multipath round robin
nvme: allow duplicate NSIDs for private namespaces
nvmet: remove redundant assignment after left shift
nvmet: use a private workqueue instead of the system workqueue
nvme-pci: add quirks for Samsung X5 SSDs
nvme-pci: expose use_threaded_interrupts read-only in sysfs
nvme: fix the read-only state for zoned namespaces with unsupposed features
n64cart: convert bi_disk to bi_bdev->bd_disk fix build
xen/blkfront: fix comment for need_copy
xen-blkback: remove redundant assignment to variable i
|
|
Pull block fixes from Jens Axboe:
"Either fixes or a few additions that got missed in the initial merge
window pull. In detail:
- List iterator fix to avoid leaking value post loop (Jakob)
- One-off fix in minor count (Christophe)
- Fix for a regression in how io priority setting works for an
exiting task (Jiri)
- Fix a regression in this merge window with blkg_free() being called
in an inappropriate context (Ming)
- Misc fixes (Ming, Tom)"
* tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-block:
blk-wbt: remove wbt_track stub
block: use dedicated list iterator variable
block: Fix the maximum minor value is blk_alloc_ext_minor()
block: restore the old set_task_ioprio() behaviour wrt PF_EXITING
block: avoid calling blkg_free() in atomic context
lib/sbitmap: allocate sb->map via kvzalloc_node
|
|
Pull io_uring fixes from Jens Axboe:
"A little bit all over the map, some regression fixes for this merge
window, and some general fixes that are stable bound. In detail:
- Fix an SQPOLL memory ordering issue (Almog)
- Accept fixes (Dylan)
- Poll fixes (me)
- Fixes for provided buffers and recycling (me)
- Tweak to IORING_OP_MSG_RING command added in this merge window (me)
- Memory leak fix (Pavel)
- Misc fixes and tweaks (Pavel, me)"
* tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-block:
io_uring: defer msg-ring file validity check until command issue
io_uring: fail links if msg-ring doesn't succeeed
io_uring: fix memory leak of uid in files registration
io_uring: fix put_kbuf without proper locking
io_uring: fix invalid flags for io_put_kbuf()
io_uring: improve req fields comments
io_uring: enable EPOLLEXCLUSIVE for accept poll
io_uring: improve task work cache utilization
io_uring: fix async accept on O_NONBLOCK sockets
io_uring: remove IORING_CQE_F_MSG
io_uring: add flag for disabling provided buffer recycling
io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly
io_uring: don't recycle provided buffer if punted to async worker
io_uring: fix assuming triggered poll waitqueue is the single poll
io_uring: bump poll refs to full 31-bits
io_uring: remove poll entry from list when canceling all
io_uring: fix memory ordering when SQPOLL thread goes to sleep
io_uring: ensure that fsnotify is always called
io_uring: recycle provided before arming poll
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM integrity shrink crash due to journal entry not being marked
unused.
- Fix DM bio polling to handle possibility that underlying device(s)
return BLK_STS_AGAIN during submission.
- Fix dm_io and dm_target_io flags race condition on Alpha.
- Add some pr_err debugging to help debug cases when DM ioctl structure
is corrupted.
* tag 'for-5.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix bio polling to handle possibile BLK_STS_AGAIN
dm: fix dm_io and dm_target_io flags race condition on Alpha
dm integrity: set journal entry unused when shrinking device
dm ioctl: log an error if the ioctl structure is corrupted
|
|
Pull more filesystem folio updates from Matthew Wilcox:
"A mixture of odd changes that didn't quite make it into the original
pull and fixes for things that did. Also the readpages changes had to
wait for the NFS tree to be pulled first.
- Remove ->readpages infrastructure
- Remove AOP_FLAG_CONT_EXPAND
- Move read_descriptor_t to networking code
- Pass the iocb to generic_perform_write
- Minor updates to iomap, btrfs, ext4, f2fs, ntfs"
* tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache:
btrfs: Remove a use of PAGE_SIZE in btrfs_invalidate_folio()
ntfs: Correct mark_ntfs_record_dirty() folio conversion
f2fs: Get the superblock from the mapping instead of the page
f2fs: Correct f2fs_dirty_data_folio() conversion
ext4: Correct ext4_journalled_dirty_folio() conversion
filemap: Remove AOP_FLAG_CONT_EXPAND
fs: Pass an iocb to generic_perform_write()
fs, net: Move read_descriptor_t to net.h
fs: Remove read_actor_t
iomap: Simplify is_partially_uptodate a little
readahead: Update comments
mm: remove the skip_page argument to read_pages
mm: remove the pages argument to read_pages
fs: Remove ->readpages address space operation
readahead: Remove read_cache_pages()
|
|
Pull XArray updates from Matthew Wilcox:
- Documentation update
- Fix test-suite build after move of bitmap.h
- Fix xas_create_range() when a large entry is already present
- Fix xas_split() of a shadow entry
* tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarray:
XArray: Update the LRU list in xas_split()
XArray: Fix xas_create_range() when multi-order entry present
XArray: Include bitmap.h from xarray.h
XArray: Document the locking requirement for the xa_state
|
|
Merge still more updates from Andrew Morton:
"16 patches.
Subsystems affected by this patch series: ofs2, nilfs2, mailmap, and
mm (madvise, mlock, mfence, memory-failure, kasan, debug, kmemleak,
and damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/damon: prevent activated scheme from sleeping by deactivated schemes
mm/kmemleak: reset tag when compare object pointer
doc/vm/page_owner.rst: remove content related to -c option
tools/vm/page_owner_sort.c: remove -c option
mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP
mm,hwpoison: unmap poisoned page before invalidation
mailmap: update Kirill's email
mm: kfence: fix objcgs vector allocation
mm/munlock: protect the per-CPU pagevec by a local_lock_t
mm/munlock: update Documentation/vm/unevictable-lru.rst
mm/munlock: add lru_add_drain() to fix memcg_stat_test
nilfs2: get rid of nilfs_mapping_init()
nilfs2: fix lockdep warnings during disk space reclamation
nilfs2: fix lockdep warnings in page operations for btree nodes
ocfs2: fix crash when mount with quota enabled
Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"
|
|
KASAN changes that added new GFP flags mistakenly updated
__GFP_BITS_SHIFT as the total number of GFP bits instead of as a shift
used to define __GFP_BITS_MASK.
This broke LOCKDEP, as __GFP_BITS_MASK now gets the 25th bit enabled
instead of the 28th for __GFP_NOLOCKDEP.
Update __GFP_BITS_SHIFT to always count KASAN GFP bits.
In the future, we could handle all combinations of KASAN and LOCKDEP to
occupy as few bits as possible. For now, we have enough GFP bits to be
inefficient in this quick fix.
Link: https://lkml.kernel.org/r/462ff52742a1fcc95a69778685737f723ee4dfb3.1648400273.git.andreyknvl@google.com
Fixes: 9353ffa6e9e9 ("kasan, page_alloc: allow skipping memory init for HW_TAGS")
Fixes: 53ae233c30a6 ("kasan, page_alloc: allow skipping unpoisoning for HW_TAGS")
Fixes: f49d9c5bb15c ("kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This flag is no longer used, so remove it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We can extract both the file pointer and the pos from the iocb.
This simplifies each caller as well as allowing generic_perform_write()
to see more of the iocb contents in the future.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
fs.h has no more need for this typedef; networking is now the sole user
of the read_descriptor_t.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This typedef is not used any more.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
All filesystems have now been converted to use ->readahead, so
remove the ->readpages operation and fix all the comments that
used to refer to it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
With no remaining users, remove this function and the related
infrastructure.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a few fixes that have been gathered since the previous pull:
- An additional fix for potential PCM deadlocks
- A series of HD-audio CS8409 codec patches for new models
- Other device specific fixes for HD-audio, ASoC mediatek, Intel,
fsl, rockchip"
* tag 'sound-fix-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock
ALSA: hda: Avoid unsol event during RPM suspending
ALSA: hda/realtek: Fix audio regression on Mi Notebook Pro 2020
ALSA: hda/cs8409: Add new Dolphin HW variants
ALSA: hda/cs8409: Disable HSBIAS_SENSE_EN for Cyborg
ALSA: hda/cs8409: Support new Warlock MLK Variants
ALSA: hda/cs8409: Fix Full Scale Volume setting for all variants
ALSA: hda/cs8409: Re-order quirk table into ascending order
ALSA: hda/cs8409: Fix Warlock to use mono mic configuration
ALSA: cs4236: fix an incorrect NULL check on list iterator
ALSA: hda/realtek: Enable headset mic on Lenovo P360
ASoC: SOF: Intel: Fix build error without SND_SOC_SOF_PCI_DEV
ALSA: hda/realtek: Add mute and micmut LED support for Zbook Fury 17 G9
ASoC: rockchip: i2s_tdm: Fixup config for SND_SOC_DAIFMT_DSP_A/B
ASoC: fsl-asoc-card: Fix jack_event() always return 0
ASoC: mediatek: mt6358: add missing EXPORT_SYMBOLs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- grammar and formatting fixes in comments for gpio-ts4900
- correct links in gpio-ts5500
- fix a warning in doc generation for the core GPIO documentation
* tag 'gpio-fixes-for-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: ts5500: Fix Links to Technologic Systems web resources
gpio: Properly document parent data union
gpio: ts4900: Fix comment formatting and grammar
|
|
Early alpha processors cannot write a single byte or short; they read 8
bytes, modify the value in registers and write back 8 bytes.
This could cause race condition in the structure dm_io - if the fields
flags and io_count are modified simultaneously.
Fix this bug by using 32-bit flags if we are on Alpha and if we are
compiling for a processor that doesn't have the byte-word-extension.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: bd4a6dd241ae ("dm: reduce size of dm_io and dm_target_io structs")
[snitzer: Jens allowed this change since Mikulas owns a relevant Alpha!]
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a revert of a patch resetting extra buttons on touchpads claiming to
be buttonpads as this caused regression on certain Dell devices
- a new driver for Mediatek MT6779 keypad
- a new driver for Imagis touchscreen
- rework of Google/Chrome OS "Vivaldi" keyboard handling
- assorted driver fixes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (31 commits)
Revert "Input: clear BTN_RIGHT/MIDDLE on buttonpads"
Input: adi - remove redundant variable z
Input: add Imagis touchscreen driver
dt-bindings: input/touchscreen: bindings for Imagis
Input: synaptics - enable InterTouch on ThinkPad T14/P14s Gen 1 AMD
Input: stmfts - fix reference leak in stmfts_input_open
Input: add bounds checking to input_set_capability()
Input: iqs5xx - use local input_dev pointer
HID: google: modify HID device groups of eel
HID: google: Add support for vivaldi to hid-hammer
HID: google: extract Vivaldi hid feature mapping for use in hid-hammer
Input: extract ChromeOS vivaldi physmap show function
HID: google: switch to devm when registering keyboard backlight LED
Input: mt6779-keypad - fix signedness bug
Input: mt6779-keypad - add MediaTek keypad driver
dt-bindings: input: Add bindings for Mediatek matrix keypad
Input: da9063 - use devm_delayed_work_autocancel()
Input: goodix - fix race on driver unbind
Input: goodix - use input_copy_abs() helper
Input: add input_copy_abs() function
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"The bulk of the patches are about replacing the uie_unsupported struct
rtc_device member by a feature bit.
Subsystem:
- remove uie_unsupported, all users have been converted to clear
RTC_FEATURE_UPDATE_INTERRUPT and provide a reason
- RTCs with an alarm with a resolution of a minute are now letting
the core handle rounding down the alarm time
- fix use-after-free on device removal
New driver:
- OP-TEE RTC PTA
Drivers:
- sun6i: Add H616 support
- cmos: Fix the AltCentury for AMD platforms
- spear: set range"
* tag 'rtc-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (56 commits)
rtc: check if __rtc_read_time was successful
rtc: gamecube: Fix refcount leak in gamecube_rtc_read_offset_from_sram
rtc: mc146818-lib: Fix the AltCentury for AMD platforms
rtc: optee: add RTC driver for OP-TEE RTC PTA
rtc: pm8xxx: Return -ENODEV if set_time disallowed
rtc: pm8xxx: Attach wake irq to device
clk: sunxi-ng: sun6i-rtc: include clk/sunxi-ng.h
rtc: remove uie_unsupported
rtc: xgene: stop using uie_unsupported
rtc: hym8563: switch to RTC_FEATURE_UPDATE_INTERRUPT
rtc: hym8563: let the core handle the alarm resolution
rtc: hym8563: switch to devm_rtc_allocate_device
rtc: efi: switch to RTC_FEATURE_UPDATE_INTERRUPT
rtc: efi: switch to devm_rtc_allocate_device
rtc: add new RTC_FEATURE_ALARM_WAKEUP_ONLY feature
rtc: spear: fix spear_rtc_read_time
rtc: spear: drop uie_unsupported
rtc: spear: set range
rtc: spear: switch to devm_rtc_allocate_device
rtc: pcf8563: switch to RTC_FEATURE_UPDATE_INTERRUPT
...
|