Pull spi updates from Mark Brown:
"The main focus of this release from a framework point of view has been
spi-mem where we've acquired support for a few new hardware features
which enable better performance on suitable hardware.
Otherwise mostly thanks to Arnd's cleanup efforts on old platforms
we've removed several obsolete drivers which just about balance out
the newer drivers we've added this cycle.
Summary:
- Allow drivers to flag if they are unidirectional.
- Support for DTR mode and hardware acceleration of dummy cycles in
spi-mem.
- Support for Allwinder H616, Intel Lightning Mountain, nVidia Tegra
QuadSPI, Realtek RTL838x and RTL839x.
- Removal of obsolete EFM32, Txx9 and SIRF Prima and Atlas drivers"
* tag 'spi-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (76 commits)
spi: Skip zero-length transfers in spi_transfer_one_message()
spi: dw: Avoid stack content exposure
spi: cadence-quadspi: Use spi_mem_dtr_supports_op()
spi: spi-mem: add spi_mem_dtr_supports_op()
spi: atmel-quadspi: Disable the QSPI IP at suspend()
spi: pxa2xx: Add IDs for the controllers found on Intel Lynxpoint
spi: pxa2xx: Fix the controller numbering for Wildcat Point
spi: Change provied to provided in the file spi.h
spi: mediatek: add set_cs_timing support
spi: support CS timing for HW & SW mode
spi: add power control when set_cs_timing
spi: stm32: make spurious and overrun interrupts visible
spi: stm32h7: replace private SPI_1HZ_NS with NSEC_PER_SEC
spi: stm32: defer probe for reset
spi: stm32: driver uses reset controller only at init
spi: stm32h7: ensure message are smaller than max size
spi: stm32: use bitfield macros
spi: stm32: do not mandate cs_gpio
spi: stm32: properly handle 0 byte transfer
spi: clps711xx: remove redundant white-space
...
Pull i2c updates from Wolfram Sang:
- mostly driver updates. Bigger ones for mlxcpld and iproc. But most of
them are all over the place.
- removal of the efm32, sirf, u300, and zte zx bus drivers because of
platform removal. So, we have a pleasant diffstat this time.
- first set of cleanups in the I2C core as preparation to increase
maximum length of SMBus transfers to 255 (as specified in the new
standard). Better documentation of struct i2c_msg and its flags stand
out here.
- the testunit can now respond to SMBus block process calls which is
the testcase when implementing the above new maximum length.
* 'i2c/for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (62 commits)
i2c: remove redundant error print in stm32f7_i2c_probe
i2c: testunit: add support for block process calls
i2c: busses: Replace spin_lock_irqsave with spin_lock in hard IRQ
dt-bindings: eeprom: at24: Document ROHM BR24G01
i2c: i801: Add support for Intel Alder Lake PCH-P
i2c: mv64xxx: Fix check for missing clock after adding RPM
i2c: mux: mlxcpld: Add callback to notify mux creation completion
i2c: mux: mlxcpld: Extend supported mux number
i2c: mux: mlxcpld: Extend driver to support word address space devices
i2c: mux: mlxcpld: Get rid of adapter numbers enforcement
i2c: mux: mlxcpld: Prepare mux selection infrastructure for two-byte support
i2c: mux: mlxcpld: Convert driver to platform driver
i2c: imx: Synthesize end of transaction events without idle interrupts
i2c: i2c-qcom-geni: Add shutdown callback for i2c
i2c: mv64xxx: Add runtime PM support
i2c: amd-mp2: Remove unused macro
i2c: amd-mp2: convert to PCI logging functions
i2c: mux: mlxcpld: Move header file out of x86 realm
platform/x86: mlxcpld: Update module license
i2c: mux: mlxcpld: Update module license
...
Pull x86 platform driver updates from Hans de Goede:
"Highlights:
- Microsoft Surface devices System Aggregator Module support
- SW_TABLET_MODE reporting improvements
- thinkpad_acpi keyboard language setting support
- platform / DPTF profile settings support:
- Base / userspace API parts merged from Rafael's acpi-platform
branch
- thinkpad_acpi and ideapad-laptop support through pdx86
- Remove support for some obsolete Intel MID platforms through
merging of the shared intel-mid-removal branch
- Big cleanup of the ideapad-laptop driver
- Misc other fixes / new hw support / quirks"
* tag 'platform-drivers-x86-v5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (99 commits)
platform/x86: intel_scu_ipc: Increase virtual timeout from 3 to 5 seconds
platform/surface: aggregator: Fix access of unaligned value
tools/power/x86/intel-speed-select: Update version to 1.8
tools/power/x86/intel-speed-select: Add new command to get/set TRL
tools/power/x86/intel-speed-select: Add new command turbo-mode
Platform: OLPC: Constify static struct regulator_ops
platform/surface: Add Surface Hot-Plug driver
platform/x86: intel_scu_wdt: Drop mistakenly added const
platform/x86: Kconfig: add missing selects for ideapad-laptop
platform/x86: acer-wmi: Don't use ACPI_EXCEPTION()
platform/x86: thinkpad_acpi: Replace ifdef CONFIG_ACPI_PLATFORM_PROFILE with depends on
platform/x86: thinkpad_acpi: Fix 'warning: no previous prototype for' warnings
platform/x86: msi-wmi: Fix variable 'status' set but not used compiler warning
platform/surface: surface3-wmi: Fix variable 'status' set but not used compiler warning
platform/x86: Move all dell drivers to their own subdirectory
Documentation/ABI: sysfs-platform-ideapad-laptop: conservation_mode attribute
Documentation/ABI: sysfs-platform-ideapad-laptop: update device attribute paths
platform/x86: ideapad-laptop: add "always on USB charging" control support
platform/x86: ideapad-laptop: add keyboard backlight control support
platform/x86: ideapad-laptop: send notification about touchpad state change to sysfs
...
Pull media updates from Mauro Carvalho Chehab:
- some core fixes in VB2 mem2mem support
- some improvements and cleanups in V4L2 async kAPI
- newer controls in V4L2 API for H-264 and HEVC codecs
- allegro-dvt driver was promoted from staging
- new i2c sendor drivers: imx334, ov5648, ov8865
- new automobile camera module: rdacm21
- ipu3 cio2 driver started gained support for some ACPI BIOSes
- new ATSC frontend: MaxLinear mxl692 VSB tuner/demod
- the SMIA/CCS driver gained more support for CSS standard
- several driver fixes, updates and improvements
* tag 'media/v5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (362 commits)
media: v4l: async: Fix kerneldoc documentation for async functions
media: i2c: max9271: Add MODULE_* macros
media: i2c: Kconfig: Make MAX9271 a module
media: imx334: 'ret' is uninitialized, should have been PTR_ERR()
media: i2c: Add imx334 camera sensor driver
media: dt-bindings: media: Add bindings for imx334
media: ov8856: Configure sensor for GRBG Bayer for all modes
media: i2c: imx219: Implement V4L2_CID_LINK_FREQ control
media: ov5675: fix vflip/hflip control
media: ipu3-cio2: Build bridge only if ACPI is enabled
media: Remove the legacy v4l2-clk API
media: ov6650: Use the generic clock framework
media: mt9m111: Use the generic clock framework
media: ov9640: Use the generic clock framework
media: pxa_camera: Drop the v4l2-clk clock register
media: mach-pxa: Register the camera sensor fixed-rate clock
media: i2c: imx258: get clock from device properties and enable it via runtime PM
media: i2c: imx258: simplify getting state container
media: i2c: imx258: add support for binding via device tree
media: dt-bindings: media: imx258: add bindings for IMX258 sensor
...
Pull KVM updates from Paolo Bonzini:
"x86:
- Support for userspace to emulate Xen hypercalls
- Raise the maximum number of user memslots
- Scalability improvements for the new MMU.
Instead of the complex "fast page fault" logic that is used in
mmu.c, tdp_mmu.c uses an rwlock so that page faults are concurrent,
but the code that can run against page faults is limited. Right now
only page faults take the lock for reading; in the future this will
be extended to some cases of page table destruction. I hope to
switch the default MMU around 5.12-rc3 (some testing was delayed
due to Chinese New Year).
- Cleanups for MAXPHYADDR checks
- Use static calls for vendor-specific callbacks
- On AMD, use VMLOAD/VMSAVE to save and restore host state
- Stop using deprecated jump label APIs
- Workaround for AMD erratum that made nested virtualization
unreliable
- Support for LBR emulation in the guest
- Support for communicating bus lock vmexits to userspace
- Add support for SEV attestation command
- Miscellaneous cleanups
PPC:
- Support for second data watchpoint on POWER10
- Remove some complex workarounds for buggy early versions of POWER9
- Guest entry/exit fixes
ARM64:
- Make the nVHE EL2 object relocatable
- Cleanups for concurrent translation faults hitting the same page
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Simplification of the early init hypercall handling
Non-KVM changes (with acks):
- Detection of contended rwlocks (implemented only for qrwlocks,
because KVM only needs it for x86)
- Allow __DISABLE_EXPORTS from assembly code
- Provide a saner follow_pfn replacements for modules"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (192 commits)
KVM: x86/xen: Explicitly pad struct compat_vcpu_info to 64 bytes
KVM: selftests: Don't bother mapping GVA for Xen shinfo test
KVM: selftests: Fix hex vs. decimal snafu in Xen test
KVM: selftests: Fix size of memslots created by Xen tests
KVM: selftests: Ignore recently added Xen tests' build output
KVM: selftests: Add missing header file needed by xAPIC IPI tests
KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c
KVM: SVM: Make symbol 'svm_gp_erratum_intercept' static
locking/arch: Move qrwlock.h include after qspinlock.h
KVM: PPC: Book3S HV: Fix host radix SLB optimisation with hash guests
KVM: PPC: Book3S HV: Ensure radix guest has no SLB entries
KVM: PPC: Don't always report hash MMU capability for P9 < DD2.2
KVM: PPC: Book3S HV: Save and restore FSCR in the P9 path
KVM: PPC: remove unneeded semicolon
KVM: PPC: Book3S HV: Use POWER9 SLBIA IH=6 variant to clear SLB
KVM: PPC: Book3S HV: No need to clear radix host SLB before loading HPT guest
KVM: PPC: Book3S HV: Fix radix guest SLB side channel
KVM: PPC: Book3S HV: Remove support for running HPT guest on RPT host without mixed mode support
KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWR
KVM: PPC: Book3S HV: Add infrastructure to support 2nd DAWR
...
Pull parisc updates from Helge Deller:
- Optimize parisc page table locks by using the existing
page_table_lock
- Export argv0-preserve flag in binfmt_misc for usage in qemu-user
- Fix interrupt table (IVT) checksum so firmware will call crash
handler (HPMC)
- Increase IRQ stack to 64kb on 64-bit kernel
- Switch to common devmem_is_allowed() implementation
- Minor fix to get_whan()
* 'parisc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
binfmt_misc: pass binfmt_misc flags to the interpreter
parisc: Optimize per-pagetable spinlocks
parisc: Replace test_ti_thread_flag() with test_tsk_thread_flag()
parisc: Bump 64-bit IRQ stack size to 64 KB
parisc: Fix IVT checksum calculation wrt HPMC
parisc: Use the generic devmem_is_allowed()
parisc: Drop out of get_whan() if task is running again
Pull performance event updates from Ingo Molnar:
- Add CPU-PMU support for Intel Sapphire Rapids CPUs
- Extend the perf ABI with PERF_SAMPLE_WEIGHT_STRUCT, to offer
two-parameter sampling event feedback. Not used yet, but is intended
for Golden Cove CPU-PMU, which can provide both the instruction
latency and the cache latency information for memory profiling
events.
- Remove experimental, default-disabled perfmon-v4 counter_freezing
support that could only be enabled via a boot option. The hardware is
hopelessly broken, we'd like to make sure nobody starts relying on
this, as it would only end in tears.
- Fix energy/power events on Intel SPR platforms
- Simplify the uprobes resume_execution() logic
- Misc smaller fixes.
* tag 'perf-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/rapl: Fix psys-energy event on Intel SPR platform
perf/x86/rapl: Only check lower 32bits for RAPL energy counters
perf/x86/rapl: Add msr mask support
perf/x86/kvm: Add Cascade Lake Xeon steppings to isolation_ucodes[]
perf/x86/intel: Support CPUID 10.ECX to disable fixed counters
perf/x86/intel: Add perf core PMU support for Sapphire Rapids
perf/x86/intel: Filter unsupported Topdown metrics event
perf/x86/intel: Factor out intel_update_topdown_event()
perf/core: Add PERF_SAMPLE_WEIGHT_STRUCT
perf/intel: Remove Perfmon-v4 counter_freezing support
x86/perf: Use static_call for x86_pmu.guest_get_msrs
perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info
perf/x86/intel/uncore: Store the logical die id instead of the physical die id.
x86/kprobes: Do not decode opcode in resume_execution()
Pull io_uring updates from Jens Axboe:
"Highlights from this cycles are things like request recycling and
task_work optimizations, which net us anywhere from 10-20% of speedups
on workloads that mostly are inline.
This work was originally done to put io_uring under memcg, which adds
considerable overhead. But it's a really nice win as well. Also worth
highlighting is the LOOKUP_CACHED work in the VFS, and using it in
io_uring. Greatly speeds up the fast path for file opens.
Summary:
- Put io_uring under memcg protection. We accounted just the rings
themselves under rlimit memlock before, now we account everything.
- Request cache recycling, persistent across invocations (Pavel, me)
- First part of a cleanup/improvement to buffer registration (Bijan)
- SQPOLL fixes (Hao)
- File registration NULL pointer fixup (Dan)
- LOOKUP_CACHED support for io_uring
- Disable /proc/thread-self/ for io_uring, like we do for /proc/self
- Add Pavel to the io_uring MAINTAINERS entry
- Tons of code cleanups and optimizations (Pavel)
- Support for skip entries in file registration (Noah)"
* tag 'for-5.12/io_uring-2021-02-17' of git://git.kernel.dk/linux-block: (103 commits)
io_uring: tctx->task_lock should be IRQ safe
proc: don't allow async path resolution of /proc/thread-self components
io_uring: kill cached requests from exiting task closing the ring
io_uring: add helper to free all request caches
io_uring: allow task match to be passed to io_req_cache_free()
io-wq: clear out worker ->fs and ->files
io_uring: optimise io_init_req() flags setting
io_uring: clean io_req_find_next() fast check
io_uring: don't check PF_EXITING from syscall
io_uring: don't split out consume out of SQE get
io_uring: save ctx put/get for task_work submit
io_uring: don't duplicate io_req_task_queue()
io_uring: optimise SQPOLL mm/files grabbing
io_uring: optimise out unlikely link queue
io_uring: take compl state from submit state
io_uring: inline io_complete_rw_common()
io_uring: move res check out of io_rw_reissue()
io_uring: simplify iopoll reissuing
io_uring: clean up io_req_free_batch_finish()
io_uring: move submit side state closer in the ring
...
Pull fsverity updates from Eric Biggers:
"Add an ioctl which allows reading fs-verity metadata from a file.
This is useful when a file with fs-verity enabled needs to be served
somewhere, and the other end wants to do its own fs-verity compatible
verification of the file. See the commit messages for details.
This new ioctl has been tested using new xfstests I've written for it"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: support reading signature with ioctl
fs-verity: support reading descriptor with ioctl
fs-verity: support reading Merkle tree with ioctl
fs-verity: add FS_IOC_READ_VERITY_METADATA ioctl
fs-verity: don't pass whole descriptor to fsverity_verify_signature()
fs-verity: factor out fsverity_get_descriptor()
Pull nfsd updates from Chuck Lever:
- Update NFSv2 and NFSv3 XDR decoding functions
- Further improve support for re-exporting NFS mounts
- Convert NFSD stats to per-CPU counters
- Add batch Receive posting to the server's RPC/RDMA transport
* tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (65 commits)
nfsd: skip some unnecessary stats in the v4 case
nfs: use change attribute for NFS re-exports
NFSv4_2: SSC helper should use its own config.
nfsd: cstate->session->se_client -> cstate->clp
nfsd: simplify nfsd4_check_open_reclaim
nfsd: remove unused set_client argument
nfsd: find_cpntf_state cleanup
nfsd: refactor set_client
nfsd: rename lookup_clientid->set_client
nfsd: simplify nfsd_renew
nfsd: simplify process_lock
nfsd4: simplify process_lookup1
SUNRPC: Correct a comment
svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom()
svcrdma: Reduce Receive doorbell rate
svcrdma: Deprecate stat variables that are no longer used
svcrdma: Restore read and write stats
svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter
svcrdma: Convert rdma_stat_recv to a per-CPU counter
svcrdma: Refactor svc_rdma_init() and svc_rdma_clean_up()
...
Pull namei updates from Al Viro:
"Most of that pile is LOOKUP_CACHED series; the rest is a couple of
misc cleanups in the general area...
There's a minor bisect hazard in the end of series, and normally I
would've just folded the fix into the previous commit, but this branch
is shared with Jens' tree, with stuff on top of it in there, so that
would've required rebases outside of vfs.git"
* 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy*
fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED
fs: add support for LOOKUP_CACHED
saner calling conventions for unlazy_child()
fs: make unlazy_walk() error handling consistent
fs/namei.c: Remove unlikely of status being -ECHILD in lookup_fast()
do_tmpfile(): don't mess with finish_open()
Pull USB and Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt driver changes for
5.12-rc1.
It's been an active set of development in these subsystems for the
past few months:
- loads of typec features added for new hardware
- xhci features and bugfixes
- dwc3 features added for more hardware support
- dwc2 fixes and new hardware support
- cdns3 driver updates for more hardware support
- gadget driver cleanups and minor fixes
- usb-serial fixes, new driver, and more devices supported
- thunderbolt feature additions for new hardware
- lots of other tiny fixups and additions
The chrome driver changes are in here as well, as they depended on
some of the typec changes, and the maintainer acked them.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (300 commits)
dt-bindings: usb: mediatek: musb: add mt8516 compatbile
dt-bindings: usb: mtk-xhci: add compatible for mt2701 and mt7623
dt-bindings: usb: mtk-xhci: add optional assigned clock properties
Documentation: connector: Update the description of sink-vdos
usb: misc: usb3503: Fix logic in usb3503_init()
dt-bindings: usb: usb-device: fix typo in required properties
usb: Replace lkml.org links with lore
dt-bindings: usb: dwc3: add description for rk3328
dt-bindings: usb: convert rockchip,dwc3.txt to yaml
usb: quirks: add quirk to start video capture on ELMO L-12F document camera reliable
USB: quirks: sort quirk entries
USB: serial: drop bogus to_usb_serial_port() checks
USB: serial: make remove callback return void
USB: serial: drop if with an always false condition
usb: gadget: Assign boolean values to a bool variable
usb: typec: tcpm: Get Sink VDO from fwnode
dt-bindings: connector: Add SVDM VDO properties
usb: typec: displayport: Fill the negotiated SVDM Version in the header
usb: typec: ucsi: Determine common SVDM Version
usb: typec: tcpm: Determine common SVDM Version
...
Pull tty/serial driver updates from Greg KH:
"Here is the big set of tty/serial driver changes for 5.12-rc1.
Nothing huge, just lots of good cleanups and additions:
- n_tty line discipline cleanups
- vt core cleanups and reworks to make the code more "modern"
- stm32 driver additions
- tty led support added to the tty core and led layer
- minor serial driver fixups and additions
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (54 commits)
serial: core: Remove BUG_ON(in_interrupt()) check
vt_ioctl: Remove in_interrupt() check
dt-bindings: serial: imx: Switch to my personal address
vt: keyboard, use new API for keyboard_tasklet
serial: stm32: improve platform_get_irq condition handling in init_port
serial: ifx6x60: Remove driver for deprecated platform
tty: fix up iterate_tty_read() EOVERFLOW handling
tty: fix up hung_up_tty_read() conversion
tty: fix up hung_up_tty_write() conversion
tty: teach the n_tty ICANON case about the new "cookie continuations" too
tty: teach n_tty line discipline about the new "cookie continuations"
tty: clean up legacy leftovers from n_tty line discipline
tty: implement read_iter
tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer
serial: remove sirf prima/atlas driver
serial: mxs-auart: Remove <asm/cacheflush.h>
serial: mxs-auart: Remove serial_mxs_probe_dt()
serial: fsl_lpuart: Use of_device_get_match_data()
dt-bindings: serial: renesas,hscif: Add r8a779a0 support
tty: serial: Drop unused efm32 serial driver
...
Pull ARM SoC driver updates from Arnd Bergmann:
"Updates for SoC specific drivers include a few subsystems that have
their own maintainers but send them through the soc tree:
SCMI firmware:
- add support for a completion interrupt
Reset controllers:
- new driver for BCM4908
- new devm_reset_control_get_optional_exclusive_released() function
Memory controllers:
- Renesas RZ/G2 support
- Tegra124 interconnect support
- Allow more drivers to be loadable modules
TEE/optee firmware:
- minor code cleanup
The other half of this is SoC specific drivers that do not belong into
any other subsystem, most of them living in drivers/soc:
- Allwinner/sunxi power management work
- Allwinner H616 support
- ASpeed AST2600 system identification support
- AT91 SAMA7G5 SoC ID driver
- AT91 SoC driver cleanups
- Broadcom BCM4908 power management bus support
- Marvell mbus cleanups
- Mediatek MT8167 power domain support
- Qualcomm socinfo driver support for PMIC
- Qualcomm SoC identification for many more products
- TI Keystone driver cleanups for PRUSS and elsewhere"
* tag 'arm-drivers-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (89 commits)
soc: aspeed: socinfo: Add new systems
soc: aspeed: snoop: Add clock control logic
memory: tegra186-emc: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
memory: samsung: exynos5422-dmc: Correct function names in kerneldoc
memory: ti-emif-pm: Drop of_match_ptr from of_device_id table
optee: simplify i2c access
drivers: soc: atmel: fix type for same7
tee: optee: remove need_resched() before cond_resched()
soc: qcom: ocmem: don't return NULL in of_get_ocmem
optee: sync OP-TEE headers
tee: optee: fix 'physical' typos
drivers: optee: use flexible-array member instead of zero-length array
tee: fix some comment typos in header files
soc: ti: k3-ringacc: Use of_device_get_match_data()
soc: ti: pruss: Refactor the CFG sub-module init
soc: mediatek: pm-domains: Don't print an error if child domain is deferred
soc: mediatek: pm-domains: Add domain regulator supply
dt-bindings: power: Add domain regulator supply
soc: mediatek: cmdq: Remove cmdq_pkt_flush()
soc: mediatek: pm-domains: Add support for mt8167
...
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add two helper functions to release one table and hooks from
the netns and netlink event path.
2) Add table ownership infrastructure, this new infrastructure allows
users to bind a table (and its content) to a process through the
netlink socket.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-02-16
The following pull-request contains BPF updates for your *net-next* tree.
There's a small merge conflict between 7eeba1706e ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f81 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:
[...]
lock_sock(sk);
err = tcp_zerocopy_receive(sk, &zc, &tss);
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
&zc, &len, err);
release_sock(sk);
[...]
We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).
The main changes are:
1) Adds support of pointers to types with known size among global function
args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.
2) Add bpf_iter for task_vma which can be used to generate information similar
to /proc/pid/maps, from Song Liu.
3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
rewriting bind user ports from BPF side below the ip_unprivileged_port_start
range, both from Stanislav Fomichev.
4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
as well as per-cpu maps for the latter, from Alexei Starovoitov.
5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
for sleepable programs, both from KP Singh.
6) Extend verifier to enable variable offset read/write access to the BPF
program stack, from Andrei Matei.
7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
query device MTU from programs, from Jesper Dangaard Brouer.
8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
tracing programs, from Florent Revest.
9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
otherwise too many passes are required, from Gary Lin.
10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
verification both related to zero-extension handling, from Ilya Leoshkevich.
11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.
12) Batch of AF_XDP selftest cleanups and small performance improvement
for libbpf's xsk map redirect for newer kernels, from Björn Töpel.
13) Follow-up BPF doc and verifier improvements around atomics with
BPF_FETCH, from Brendan Jackman.
14) Permit zero-sized data sections e.g. if ELF .rodata section contains
read-only data from local variables, from Yonghong Song.
15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It can be useful to the interpreter to know which flags are in use.
For instance, knowing if the preserve-argv[0] is in use would
allow to skip the pathname argument.
This patch uses an unused auxiliary vector, AT_FLAGS, to add a
flag to inform interpreter if the preserve-argv[0] is enabled.
Note by Helge Deller:
The real-world user of this patch is qemu-user, which needs to know
if it has to preserve the argv[0]. See Debian bug #970460.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: YunQiang Su <ysu@wavecomp.com>
URL: http://bugs.debian.org/970460
Signed-off-by: Helge Deller <deller@gmx.de>
A userspace daemon like firewalld might need to monitor for netlink
updates to detect its ruleset removal by the (global) flush ruleset
command to ensure ruleset persistency. This adds extra complexity from
userspace and, for some little time, the firewall policy is not in
place.
This patch adds the NFT_TABLE_F_OWNER flag which allows a userspace
program to own the table that creates in exclusivity.
Tables that are owned...
- can only be updated and removed by the owner, non-owners hit EPERM if
they try to update it or remove it.
- are destroyed when the owner closes the netlink socket or the process
is gone (implicit netlink socket closure).
- are skipped by the global flush ruleset command.
- are listed in the global ruleset.
The userspace process that sets on the NFT_TABLE_F_OWNER flag need to
leave open the netlink socket.
A new NFTA_TABLE_OWNER netlink attribute specifies the netlink port ID
to identify the owner from userspace.
This patch also updates error reporting when an unknown table flag is
specified to change it from EINVAL to EOPNOTSUPP given that EINVAL is
usually reserved to report for malformed netlink messages to userspace.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Johannes Berg says:
====================
Last set of updates:
* more minstrel work from Felix to reduce the
probing overhead
* QoS for nl80211 control port frames
* STBC injection support
* and a couple of small fixes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow userspace (mptcpd) to subscribe to mptcp genl multicast events.
This implementation reuses the same event API as the mptcp kernel fork
to ease integration of existing tools, e.g. mptcpd.
Supported events include:
1. start and close of an mptcp connection
2. start and close of subflows (joins)
3. announce and withdrawals of addresses
4. subflow priority (backup/non-backup) change.
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.
The SKB object is complex and the skb->len value (accessible from
BPF-prog) also include the length of any extra GRO/GSO segments, but
without taking into account that these GRO/GSO segments get added
transport (L4) and network (L3) headers before being transmitted. Thus,
this BPF-helper is created such that the BPF-programmer don't need to
handle these details in the BPF-prog.
The API is designed to help the BPF-programmer, that want to do packet
context size changes, which involves other helpers. These other helpers
usually does a delta size adjustment. This helper also support a delta
size (len_diff), which allow BPF-programmer to reuse arguments needed by
these other helpers, and perform the MTU check prior to doing any actual
size adjustment of the packet context.
It is on purpose, that we allow the len adjustment to become a negative
result, that will pass the MTU check. This might seem weird, but it's not
this helpers responsibility to "catch" wrong len_diff adjustments. Other
helpers will take care of these checks, if BPF-programmer chooses to do
actual size adjustment.
V14:
- Improve man-page desc of len_diff.
V13:
- Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff.
V12:
- Simplify segment check that calls skb_gso_validate_network_len.
- Helpers should return long
V9:
- Use dev->hard_header_len (instead of ETH_HLEN)
- Annotate with unlikely req from Daniel
- Fix logic error using skb_gso_validate_network_len from Daniel
V6:
- Took John's advice and dropped BPF_MTU_CHK_RELAX
- Returned MTU is kept at L3-level (like fib_lookup)
V4: Lot of changes
- ifindex 0 now use current netdev for MTU lookup
- rename helper from bpf_mtu_check to bpf_check_mtu
- fix bug for GSO pkt length (as skb->len is total len)
- remove __bpf_len_adj_positive, simply allow negative len adj
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul
The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup)
can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog
don't know the MTU value that caused this rejection.
If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it
need to know this MTU value for the ICMP packet.
Patch change lookup and result struct bpf_fib_lookup, to contain this MTU
value as output via a union with 'tot_len' as this is the value used for
the MTU lookup.
V5:
- Fixed uninit value spotted by Dan Carpenter.
- Name struct output member mtu_result
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul
Explicitly define reserved field and require it and any subsequent
fields to be zero-valued for now. Additionally, limit the valid CMSG
flags that tcp_zerocopy_receive accepts.
Fixes: 7eeba1706e ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: David Ahern <dsahern@gmail.com>
Suggested-by: Leon Romanovsky <leon@kernel.org>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reject the unsupported and invalid ct_state flags of cls flower rules.
Fixes: e0ace68af2 ("net/sched: cls_flower: Add matching on conntrack info")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce KVM_CAP_PPC_DAWR1 which can be used by QEMU to query whether
KVM supports 2nd DAWR or not. The capability is by default disabled
even when the underlying CPU supports 2nd DAWR. QEMU needs to check
and enable it manually to use the feature.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The flag indicates to user space that route offload failed.
Previous patch set added the ability to emit RTM_NEWROUTE notifications
whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, but if the offload
fails there is no indication to user-space.
The flag will be used in subsequent patches by netdevsim and mlxsw to
indicate to user space that route offload failed, so that users will
have better visibility into the offload process.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
This feature/cleanup patchset is an updated version of the pull request
of Feb 2nd (batadv-next-pullrequest-20210202) and includes the
following patches:
- Bump version strings, by Simon Wunderlich (added commit log)
- Drop publication years from copyright info, by Sven Eckelmann
(replaced the previous patch which updated copyright years, as per
our discussion)
- Avoid sizeof on flexible structure, by Sven Eckelmann (unchanged)
- Fix names for kernel-doc blocks, by Sven Eckelmann (unchanged)
* tag 'batadv-next-pullrequest-20210208' of git://git.open-mesh.org/linux-merge:
batman-adv: Fix names for kernel-doc blocks
batman-adv: Avoid sizeof on flexible structure
batman-adv: Drop publication years from copyright info
batman-adv: Start new development cycle
====================
Link: https://lore.kernel.org/r/20210208165938.13262-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for FS_VERITY_METADATA_TYPE_SIGNATURE to
FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to
retrieve the built-in signature (if present) of a verity file for
serving to a client which implements fs-verity compatible verification.
See the patch which introduced FS_IOC_READ_VERITY_METADATA for more
details.
The ability for userspace to read the built-in signatures is also useful
because it allows a system that is using the in-kernel signature
verification to migrate to userspace signature verification.
This has been tested using a new xfstest which calls this ioctl via a
new subcommand for the 'fsverity' program from fsverity-utils.
Link: https://lore.kernel.org/r/20210115181819.34732-7-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add support for FS_VERITY_METADATA_TYPE_DESCRIPTOR to
FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to
retrieve the fs-verity descriptor of a file for serving to a client
which implements fs-verity compatible verification. See the patch which
introduced FS_IOC_READ_VERITY_METADATA for more details.
"fs-verity descriptor" here means only the part that userspace cares
about because it is hashed to produce the file digest. It doesn't
include the signature which ext4 and f2fs append to the
fsverity_descriptor struct when storing it on-disk, since that way of
storing the signature is an implementation detail. The next patch adds
a separate metadata_type value for retrieving the signature separately.
This has been tested using a new xfstest which calls this ioctl via a
new subcommand for the 'fsverity' program from fsverity-utils.
Link: https://lore.kernel.org/r/20210115181819.34732-6-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add support for FS_VERITY_METADATA_TYPE_MERKLE_TREE to
FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to
retrieve the Merkle tree of a verity file for serving to a client which
implements fs-verity compatible verification. See the patch which
introduced FS_IOC_READ_VERITY_METADATA for more details.
This has been tested using a new xfstest which calls this ioctl via a
new subcommand for the 'fsverity' program from fsverity-utils.
Link: https://lore.kernel.org/r/20210115181819.34732-5-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add an ioctl FS_IOC_READ_VERITY_METADATA which will allow reading verity
metadata from a file that has fs-verity enabled, including:
- The Merkle tree
- The fsverity_descriptor (not including the signature if present)
- The built-in signature, if present
This ioctl has similar semantics to pread(). It is passed the type of
metadata to read (one of the above three), and a buffer, offset, and
size. It returns the number of bytes read or an error.
Separate patches will add support for each of the above metadata types.
This patch just adds the ioctl itself.
This ioctl doesn't make any assumption about where the metadata is
stored on-disk. It does assume the metadata is in a stable format, but
that's basically already the case:
- The Merkle tree and fsverity_descriptor are defined by how fs-verity
file digests are computed; see the "File digest computation" section
of Documentation/filesystems/fsverity.rst. Technically, the way in
which the levels of the tree are ordered relative to each other wasn't
previously specified, but it's logical to put the root level first.
- The built-in signature is the value passed to FS_IOC_ENABLE_VERITY.
This ioctl is useful because it allows writing a server program that
takes a verity file and serves it to a client program, such that the
client can do its own fs-verity compatible verification of the file.
This only makes sense if the client doesn't trust the server and if the
server needs to provide the storage for the client.
More concretely, there is interest in using this ability in Android to
export APK files (which are protected by fs-verity) to "protected VMs".
This would use Protected KVM (https://lwn.net/Articles/836693), which
provides an isolated execution environment without having to trust the
traditional "host". A "guest" VM can boot from a signed image and
perform specific tasks in a minimum trusted environment using files that
have fs-verity enabled on the host, without trusting the host or
requiring that the guest has its own trusted storage.
Technically, it would be possible to duplicate the metadata and store it
in separate files for serving. However, that would be less efficient
and would require extra care in userspace to maintain file consistency.
In addition to the above, the ability to read the built-in signatures is
useful because it allows a system that is using the in-kernel signature
verification to migrate to userspace signature verification.
Link: https://lore.kernel.org/r/20210115181819.34732-4-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Pull syscall entry fixes from Borislav Petkov:
- For syscall user dispatch, separate prctl operation from syscall
redirection range specification before the API has been made official
in 5.11.
- Ensure tasks using the generic syscall code do trap after returning
from a syscall when single-stepping is requested.
* tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry: Use different define for selector variable in SUD
entry: Ensure trap after single-step on system call return
The batman-adv source code was using the year of publication (to net-next)
as "last" year for the copyright statement. The whole source code mentioned
in the MAINTAINERS "BATMAN ADVANCED" section was handled as a single entity
regarding the publishing year.
This avoided having outdated (in sense of year information - not copyright
holder) publishing information inside several files. But since the simple
"update copyright year" commit (without other changes) in the file was not
well received in the upstream kernel, the option to not have a copyright
year (for initial and last publication) in the files are chosen instead.
More detailed information about the years can still be retrieved from the
SCM system.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Michael Kerrisk suggested that, from an API perspective, it is a bad
idea to share the PR_SYS_DISPATCH_ defines between the prctl operation
and the selector variable.
Therefore, define two new constants to be used by SUD's selector variable
and update the corresponding documentation and test cases.
While this changes the API syscall user dispatch has never been part of a
Linux release, it will show up for the first time in 5.11.
Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com
This reverts commit 9ab7e76aef.
This patch was committed without maintainer approval and despite a number
of unaddressed concerns from review. There are several issues that
impede the acceptance of this patch and that make a reversion of this
particular instance of these changes the best way forward:
i) the patch contains several logically separate changes that would be
better served as smaller patches (for review purposes)
ii) functionality like the handling of end markers has been introduced
without further explanation
iii) symmetry between the handling of GTPv0 and GTPv1 has been
unnecessarily broken
iv) the patchset produces 'broken' packets when extension headers are
included
v) there are no available userspace tools to allow for testing this
functionality
vi) there is an unaddressed Coverity report against the patch concering
memory leakage
vii) most importantly, the patch contains a large amount of superfluous
churn that impedes other ongoing work with this driver
This patch will be reworked into a series that aligns with other
ongoing work and facilitates review.
Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Instead of adding a plethora of new KVM_CAP_XEN_FOO capabilities, just
add bits to the return value of KVM_CAP_XEN_HVM.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
It turns out that we can't handle event channels *entirely* in userspace
by delivering them as ExtINT, because KVM is a bit picky about when it
accepts ExtINT interrupts from a legacy PIC. The in-kernel local APIC
has to have LVT0 configured in APIC_MODE_EXTINT and unmasked, which
isn't necessarily the case for Xen guests especially on secondary CPUs.
To cope with this, add kvm_xen_get_interrupt() which checks the
evtchn_pending_upcall field in the Xen vcpu_info, and delivers the Xen
upcall vector (configured by KVM_XEN_ATTR_TYPE_UPCALL_VECTOR) if it's
set regardless of LAPIC LVT0 configuration. This gives us the minimum
support we need for completely userspace-based implementation of event
channels.
This does mean that vcpu_enter_guest() needs to check for the
evtchn_pending_upcall flag being set, because it can't rely on someone
having set KVM_REQ_EVENT unless we were to add some way for userspace to
do so manually.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Allow the Xen emulated guest the ability to register secondary
vcpu time information. On Xen guests this is used in order to be
mapped to userspace and hence allow vdso gettimeofday to work.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
The vcpu info supersedes the per vcpu area of the shared info page and
the guest vcpus will use this instead.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>