blatt A1
Cosmin Ratiu e7d05cf159 xfrm_output: Force software GSO only in tunnel mode
[ Upstream commit 0aae2867aa6067f73d066bc98385e23c8454a1d7 ]

The cited commit fixed a software GSO bug with VXLAN + IPSec in tunnel
mode. Unfortunately, it is slightly broader than necessary, as it also
severely affects performance for Geneve + IPSec transport mode over a
device capable of both HW GSO and IPSec crypto offload. In this case,
xfrm_output unnecessarily triggers software GSO instead of letting the
HW do it. In simple iperf3 tests over Geneve + IPSec transport mode over
a back-to-back pair of NICs with MTU 1500, the performance was observed
to be up to 6x worse when doing software GSO compared to leaving it to
the hardware.

This commit makes xfrm_output trigger software GSO in crypto offload
cases only for already encapsulated packets in tunnel mode. Skipping it
there would let xfrm overwrite the inner tunnel's
skb->inner_network_header, which breaks software GSO for that packet
later if the device turns out not to be capable of HW GSO.

Taking a closer look at the conditions for the original bug, to better
understand the reasons for this change:
- vxlan_build_skb -> iptunnel_handle_offloads sets inner_protocol and
  inner network header.
- then, udp_tunnel_xmit_skb -> ip_tunnel_xmit adds outer transport and
  network headers.
- later in the xmit path, xfrm_output -> xfrm_outer_mode_output ->
  xfrm4_prepare_output -> xfrm4_tunnel_encap_add overwrites the inner
  network header with the one set in ip_tunnel_xmit before adding the
  second outer header.
- __dev_queue_xmit -> validate_xmit_skb checks whether GSO segmentation
  needs to happen based on dev features. In the original bug, the hw
  couldn't segment the packets, so skb_gso_segment was invoked.
- deep in the .gso_segment callback machinery, __skb_udp_tunnel_segment
  tries to use the wrong inner network header, expecting the one set in
  iptunnel_handle_offloads but getting the one set by xfrm instead.
- a bit later, ipv6_gso_segment accesses the wrong memory based on that
  wrong inner network header.
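
The sequence above can be replayed with a toy model of the header
offsets. This is only a sketch: `struct toy_skb`, the `toy_*` helpers
and the offsets 100/50/30 are hypothetical stand-ins invented for
illustration, not kernel code. It shows how the value recorded by the
tunnel driver is lost once xfrm tunnel encapsulation runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the offsets in struct sk_buff. */
struct toy_skb {
    uint16_t network_header;       /* offset of the current outermost IP header */
    uint16_t inner_network_header; /* offset tunnel GSO expects for the inner IP header */
    bool encapsulation;
};

/* Models iptunnel_handle_offloads: remember the inner header before encapsulating. */
static void toy_tunnel_handle_offloads(struct toy_skb *skb)
{
    skb->inner_network_header = skb->network_header;
    skb->encapsulation = true;
}

/* Models ip_tunnel_xmit: a new outer IP header is pushed in front of the packet. */
static void toy_push_outer_header(struct toy_skb *skb, uint16_t new_offset)
{
    skb->network_header = new_offset;
}

/* Models xfrm4_tunnel_encap_add: the current network header is saved as
 * "inner", overwriting whatever the tunnel driver stored there earlier. */
static void toy_xfrm_tunnel_encap(struct toy_skb *skb, uint16_t new_offset)
{
    skb->inner_network_header = skb->network_header;
    skb->network_header = new_offset;
}

/* Replays the three steps and returns the inner offset that GSO would see. */
static uint16_t toy_replay_xmit_path(void)
{
    struct toy_skb skb = { .network_header = 100 };

    toy_tunnel_handle_offloads(&skb);  /* inner = 100 (the real inner header) */
    toy_push_outer_header(&skb, 50);   /* VXLAN outer IP header at 50 */
    toy_xfrm_tunnel_encap(&skb, 30);   /* inner = 50, IPsec outer IP header at 30 */
    return skb.inner_network_header;
}
```

In this model the segmentation code would find offset 50 (the VXLAN
outer header) where it expects offset 100, which corresponds to the
"wrong inner network header" in the last two steps.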

With the new change, the original bug (or similar ones) cannot happen
again, as xfrm will now trigger software GSO before applying a tunnel.
This concern doesn't exist in packet offload mode, when the HW adds
encapsulation headers. For the non-offloaded packets (crypto in SW),
software GSO is still done unconditionally in the else branch.
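
The resulting decision can be summarized as a small predicate. This is
a sketch under stated assumptions, not the actual xfrm_output code: the
`toy_*` names are hypothetical, "crypto offload" here means crypto
offload is usable for this packet, and the packet offload case is
assumed never to force software GSO because the HW adds the
encapsulation headers.

```c
#include <stdbool.h>

/* Hypothetical enums; the kernel uses its own offload type and mode constants. */
enum toy_offload { TOY_OFFLOAD_NONE, TOY_OFFLOAD_CRYPTO, TOY_OFFLOAD_PACKET };
enum toy_mode { TOY_MODE_TRANSPORT, TOY_MODE_TUNNEL };

struct toy_pkt {
    bool is_gso;       /* packet still needs segmentation */
    bool encapsulated; /* already carries a tunnel header (e.g. VXLAN, Geneve) */
};

/* Returns true when software GSO would be forced before the xfrm transform. */
static bool toy_force_software_gso(enum toy_offload offload, enum toy_mode mode,
                                   const struct toy_pkt *pkt)
{
    if (!pkt->is_gso)
        return false; /* nothing to segment */

    switch (offload) {
    case TOY_OFFLOAD_PACKET:
        /* Assumption: HW adds the headers, so the inner header is not overwritten. */
        return false;
    case TOY_OFFLOAD_CRYPTO:
        /* Only tunnel mode overwrites the inner network header. */
        return pkt->encapsulated && mode == TOY_MODE_TUNNEL;
    case TOY_OFFLOAD_NONE:
    default:
        /* Crypto in SW: software GSO is done unconditionally. */
        return true;
    }
}
```

With this predicate, a Geneve packet in transport mode with crypto
offload is left for the hardware to segment, while a VXLAN packet in
tunnel mode with crypto offload is still segmented in software.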

Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Fixes: a204aef9fd ("xfrm: call xfrm_output_gso when inner_protocol is set in xfrm_output")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-28 21:58:59 +01:00
arch arm64: dts: freescale: tqma8mpql: Fix vqmmc-supply 2025-03-28 21:58:59 +01:00
block block: fix 'kmem_cache of name 'bio-108' already exists' 2025-03-28 21:58:53 +01:00
certs
crypto crypto: api - Add crypto_clone_tfm 2024-12-14 19:53:51 +01:00
Documentation sched/isolation: Prevent boot crash when the boot CPU is nohz_full 2025-03-28 21:58:48 +01:00
drivers firmware: imx-scu: fix OF node leak in .probe() 2025-03-28 21:58:59 +01:00
fs smb: client: fix potential UAF in cifs_dump_full_key() 2025-03-28 21:58:58 +01:00
include ASoC: ops: Consistently treat platform_max as control value 2025-03-28 21:58:57 +01:00
init rust: Disallow BTF generation with Rust + LTO 2025-03-28 21:58:57 +01:00
io_uring io_uring: fix corner case forgetting to vunmap 2025-03-28 21:58:53 +01:00
ipc ipc: fix memleak if msg_init_ns failed in create_ipc_ns 2024-12-14 19:54:06 +01:00
kernel sched: Clarify wake_up_q()'s write to task->wake_q.next 2025-03-28 21:58:51 +01:00
lib lib/buildid: Handle memfd_secret() files in build_id_parse() 2025-03-28 21:58:57 +01:00
LICENSES
mm mm: add nommu variant of vm_insert_pages() 2025-03-28 21:58:53 +01:00
net xfrm_output: Force software GSO only in tunnel mode 2025-03-28 21:58:59 +01:00
rust scripts: generate_rust_analyzer: provide cfgs for core and alloc 2025-03-28 21:58:57 +01:00
samples samples/landlock: Fix possible NULL dereference in parse_path() 2025-02-21 13:49:03 +01:00
scripts scripts: generate_rust_analyzer: add missing macros deps 2025-03-28 21:58:58 +01:00
security tomoyo: don't emit warning in tomoyo_write_control() 2025-02-21 13:49:31 +01:00
sound ASoC: codecs: wm0010: Fix error handling path in wm0010_spi_probe() 2025-03-28 21:58:57 +01:00
tools selftests: rtnetlink: update netdevsim ipsec output format 2025-02-21 13:50:11 +01:00
usr
virt KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin() 2024-06-27 13:46:21 +02:00
.clang-format
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore Remove *.orig pattern from .gitignore 2024-10-17 15:21:15 +02:00
.mailmap
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Remove Michal Marek from Kbuild maintainers 2022-11-16 14:53:00 +09:00
Kbuild
Kconfig
MAINTAINERS MAINTAINERS: add leah to 6.1 MAINTAINERS file 2024-05-17 11:56:16 +02:00
Makefile scripts: make rust-analyzer for out-of-tree modules 2025-03-28 21:58:57 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.