0x17: Talk, Zero Copy Receive using io_uring

6 Oct 2023


The war on memory bottlenecks[1][2] continues with this talk from
David Wei and Pavel Begunkov.
Socket recv() data is first copied/DMA'ed into kernel memory and
then copied again into user space memory, which adds pressure on overall
memory bandwidth and, of course, comes at a CPU cost.
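
For reference, the conventional copying path described above can be sketched
as follows. This is only an illustration and not anything from the speakers:
Python is used for brevity, a local socket pair stands in for a real network
connection, and the helper name copying_recv is hypothetical. The recv_into()
call is where the payload is copied from the kernel socket buffer into a
user space buffer, which is the copy that zero copy receive aims to remove.

```python
import socket


def copying_recv(payload: bytes) -> bytes:
    """Send payload over a local socket pair and read it back with a
    conventional recv, which copies kernel memory into user_buf.

    Hypothetical helper for illustration; keep payloads small, since the
    sender is not drained concurrently.
    """
    tx, rx = socket.socketpair()
    try:
        tx.sendall(payload)
        tx.shutdown(socket.SHUT_WR)  # signal end of stream to the reader

        user_buf = bytearray(len(payload))  # user space destination buffer
        view = memoryview(user_buf)
        received = 0
        while received < len(payload):
            # Kernel socket buffer -> user_buf: the copy under discussion.
            n = rx.recv_into(view[received:])
            if n == 0:  # peer closed before all data arrived
                break
            received += n
        return bytes(user_buf[:received])
    finally:
        tx.close()
        rx.close()


if __name__ == "__main__":
    print(len(copying_recv(b"x" * 4096)))
```

Every byte returned here has crossed memory at least twice on the receive
side, once into the kernel and once into user_buf.
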
While there are other approaches to avoid the second copy on recv(),
such as DPDK, RDMA etc, David and Pavel argue that all of them have
downsides - ranging from being proprietary or needing custom patched
drivers, to being difficult to debug and, worse, requiring a rewrite of
the applications...
Our esteemed speakers instead opt to keep using the network stack as
is, with io_uring as the user-facing API. So what do we need to
get this working?
As in [2], the NIC needs to support header splitting and RSS flow
steering. On incoming data the headers traverse the TCP stack while
the payload is DMA'ed not into kernel memory but into user space...
David and Pavel will discuss the overall approach they took and
describe in some detail the kernel infra as well as what the uAPI would
look like. They will further review existing in-kernel zero copy
approaches and how they plan to coexist with them. Last but not least,
they will dig into the limitations and challenges of zero copy receive
and how they overcome them to facilitate a real deployment.
[1] https://netdevconf.info/0x17/sessions/talk/congestion-control-architecture-f...
[2] https://netdevconf.info/0x17/sessions/talk/device-memory-tcp.html
cheers,
jamal
Reminder: 2 more days to go for early bird registration