"Brandeburg, Jesse" <jesse.brandeburg@intel.com> writes:
> Toke and I were chatting offline about this problem of power management in networking.
> We thought it might be a useful start to figure out a good set of benchmarks to demonstrate "power vs networking" problems. I have a couple in mind right away. One is "system is sleeping but I'm trying to run a latency-sensitive workload and the latency sucks". Two is "system is sleeping and my single-threaded bulk throughput benchmark (netperf/iperf2/neper/etc.) shows a lot of retransmits and/or receiver drops".
> Another thought is: how do I count these events and/or notice I have a problem?
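Counting these is partly already possible: retransmits and drops show up in existing counters (/proc/net/snmp, /proc/net/dev, `ethtool -S`), so a benchmark harness can just snapshot them before and after a run and diff. A rough Python sketch of the parsing, assuming the standard Linux /proc/net/snmp layout (the sample text below is abbreviated for illustration):

```python
# Sketch: count TCP retransmissions from /proc/net/snmp (Linux).
# The parser is split out so it can be exercised on sample text; the
# file alternates a header line and a value line per protocol.

def parse_snmp_field(snmp_text: str, proto: str, field: str) -> int:
    """Pick one counter out of /proc/net/snmp-style text."""
    lines = [l for l in snmp_text.splitlines() if l.startswith(proto + ":")]
    headers = lines[0].split()[1:]   # e.g. "Tcp: ... RetransSegs ..."
    values = lines[1].split()[1:]    # e.g. "Tcp: ... 42 ..."
    return int(dict(zip(headers, values))[field])

def tcp_retrans_segs(path: str = "/proc/net/snmp") -> int:
    with open(path) as f:
        return parse_snmp_field(f.read(), "Tcp", "RetransSegs")

# Two readings taken around a benchmark run give the retransmits
# attributable to that run (abbreviated sample, real file has more fields):
sample = (
    "Tcp: RtoAlgorithm RtoMin RtoMax RetransSegs\n"
    "Tcp: 1 200 120000 42\n"
)
print(parse_snmp_field(sample, "Tcp", "RetransSegs"))  # -> 42
```

Receiver drops would come the same way, from /proc/net/dev or per-driver stats via `ethtool -S`.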
> More thoughts on this from anyone?
Thank you for starting the on-list discussion. I'll add some high-level thoughts here and also reply to a couple of messages down-thread with some more specific comments.
The reason I mentioned benchmarking as a good starting point is that I believe visibility into power usage is the only way we can get people to actually use any tweaks we come up with. There's a lot of cargo-culting involved in tuning (of the "use these settings for the best latency/throughput/whatever" variety), and more precise measurements of the impact of settings are a way of combating that (and of empowering people to make better assessments of the tradeoffs involved).
And secondly, of course, if we are actually trying to improve something, we need some baseline metrics to improve against. I'm thinking this can be approached from both "ends", i.e., "here is the cost tradeoff of various tuning parameters" that you mention, and "here is the power consumption of workload X", which can then be a target for improvement.
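For the power side of those baselines, one readily available source on Intel systems is the RAPL counters exposed through the powercap sysfs interface: /sys/class/powercap/intel-rapl:0/energy_uj is a cumulative microjoule counter that wraps at max_energy_range_uj. A sketch of turning two readings into average watts, assuming that interface is present:

```python
# Sketch: average power over a benchmark run from two RAPL readings
# (Linux powercap sysfs, intel_rapl driver). energy_uj is cumulative
# microjoules and wraps at max_energy_range_uj.

def average_watts(energy_start_uj: int, energy_end_uj: int,
                  interval_s: float, max_range_uj: int) -> float:
    """Convert two cumulative microjoule readings into average watts."""
    delta = energy_end_uj - energy_start_uj
    if delta < 0:  # the counter wrapped during the interval
        delta += max_range_uj
    return delta / 1e6 / interval_s

def read_energy_uj(domain: str = "intel-rapl:0") -> int:
    with open(f"/sys/class/powercap/{domain}/energy_uj") as f:
        return int(f.read())

# E.g. a 10 s netperf run whose package domain consumed 150 J:
print(average_watts(1_000_000, 151_000_000, 10.0, 262_143_328_850))  # 15.0
```

That number alongside the throughput/latency result gives the "power consumption of workload X" baseline in one go (e.g. joules per megabyte transferred).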
Turning to areas for improvement, I can think of a couple of broad categories that seem promising to explore (some of which have already been mentioned down-thread):
- Smart task placement when scaling up/down (consolidating work on fewer cores to leave others idle enough that they can go to sleep).
- Forecasting the next packet arrival, and using this both to make smarter sleep state decisions and to do smarter batching (maybe we can defer waking up the userspace process if we expect another packet to arrive shortly, that sort of thing).
- General performance improvements in targeted areas (better performance should translate to less work done per packet, which means less power used, all other things being equal).
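As a strawman for the forecasting bullet above: even a simple EWMA over inter-arrival times could feed both the sleep-state and the batching decisions. This is a hypothetical sketch of the idea, not existing kernel code; the 0.25 smoothing factor is just borrowed from the classic SRTT weighting:

```python
# Hypothetical sketch: EWMA-based prediction of the next packet's
# inter-arrival gap, to inform "should we defer the wakeup / skip the
# deep sleep state" decisions. Not existing kernel code.

class ArrivalForecaster:
    def __init__(self, alpha: float = 0.25):
        self.alpha = alpha          # EWMA smoothing factor
        self.last_ts = None         # timestamp of previous packet (s)
        self.predicted_gap = None   # smoothed inter-arrival estimate (s)

    def observe(self, ts: float) -> None:
        """Feed one packet arrival timestamp into the estimator."""
        if self.last_ts is not None:
            gap = ts - self.last_ts
            if self.predicted_gap is None:
                self.predicted_gap = gap
            else:
                self.predicted_gap += self.alpha * (gap - self.predicted_gap)
        self.last_ts = ts

    def expect_packet_within(self, horizon_s: float) -> bool:
        """If another packet is likely within horizon_s, deferring the
        wakeup (or skipping a deep sleep state) is probably worth it."""
        return (self.predicted_gap is not None
                and self.predicted_gap <= horizon_s)

f = ArrivalForecaster()
for ts in (0.000, 0.001, 0.002, 0.003):  # steady 1 ms packet stream
    f.observe(ts)
print(f.expect_packet_within(0.005))  # -> True: hold off on deep sleep
```

A real implementation would of course need to be far cheaper (per-queue, fixed-point, updated from the NAPI path), but the signal itself can be this simple.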
Sorry if the above is a bit vague, but I'm hoping the brain dump can help spur some (more) discussion :)
-Toke