Hi all,
We’re releasing SPD v1.0.5 — a DPDK-based, software-only Ethernet packet distributor that adds a
Greedy Reshaper to reduce worker imbalance under elephant-flow skew.
What it is
• Software-only & portable: no NIC-specific features; all reshaping is done in user space. Suited for SDN deployments.
• Bounded, in-place edits: each interval flips a small number of RETA entries to move hot buckets
from overloaded workers to cold ones, keeping overhead predictable (sketched below).
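For a concrete picture of the user-space path, here is a minimal sketch of a software indirection table and a single in-place flip. All names here (soft_reta, dispatch_pkt, flip_bucket) are illustrative assumptions, not SPD's actual API:

```c
#include <stdint.h>

#define RETA_SIZE 128                 /* power of two: cheap masking */

static uint8_t soft_reta[RETA_SIZE];  /* bucket -> worker core id */

/* Steer one packet entirely in user space: mask the flow hash into a
 * bucket and read the current bucket->worker mapping.  No NIC
 * indirection table is involved. */
static inline uint8_t dispatch_pkt(uint32_t flow_hash)
{
    uint32_t bucket = flow_hash & (RETA_SIZE - 1);
    return soft_reta[bucket];         /* worker to enqueue to */
}

/* An in-place edit changes a single table entry, so the cost of one
 * reshaping move is one store, independent of traffic rate. */
static inline void flip_bucket(uint32_t bucket, uint8_t cold_worker)
{
    soft_reta[bucket] = cold_worker;
}
```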
Repo & docs
• Repository: github.com/mikechang-engr/software-packet-distributor (a software-only, flow-aware Ethernet packet distribution framework for DPDK that dynamically reshapes traffic and workload with a greedy algorithm to improve fairness and load balance across multiple cores in embedded networking systems; validated on the NXP LX2160A).
• README: overview, quick start, start script knobs (TARGET_MPPS/GBPS, ELEPHANTS, GREEDY),
metrics path (/var/log/software-packet-distributor/worker_stats_v105.csv), and core layout.
Overview
The Software Packet Distributor (SPD) is a DPDK-based packet distribution framework for embedded multicore networking systems. It addresses the limitations of static RSS by introducing a Greedy Reshaper that adaptively reassigns flow buckets to worker cores based on runtime telemetry—improving fairness, utilization, and stability without relying on NIC-specific features.
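To illustrate what "runtime telemetry" means here, the sketch below keeps per-bucket and per-worker packet counters on the dispatch path. The layout is a hypothetical one, assuming a single-writer dispatch core; these are not SPD's actual structures:

```c
#include <stdint.h>

#define RETA_SIZE   128
#define NUM_WORKERS 8

/* Telemetry sampled each interval by the reshaper.  In this sketch the
 * counters are written only by the dispatching core, so plain uint64_t
 * fields suffice (no atomics needed for a single writer). */
struct spd_telemetry {
    uint64_t bucket_pkts[RETA_SIZE];    /* hotness of each RSS bucket */
    uint64_t worker_pkts[NUM_WORKERS];  /* aggregate load per worker  */
};

static struct spd_telemetry tel;
static uint8_t soft_reta[RETA_SIZE];    /* bucket -> worker mapping   */

/* Hot-path accounting: one increment per packet into each table. */
static inline void account_pkt(uint32_t flow_hash)
{
    uint32_t bucket = flow_hash & (RETA_SIZE - 1);
    tel.bucket_pkts[bucket]++;
    tel.worker_pkts[soft_reta[bucket]]++;
}
```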
Motivation
Traditional RSS-based packet steering is fast but fundamentally static. RSS hashes traffic into buckets by 5-tuple, but it is blind to the volume imbalance between flows. In real network conditions, a small number of high-volume (elephant) flows can overwhelm specific CPU cores while others sit underutilized, hurting throughput, fairness, and stability.
Hardware solutions offer limited help: NIC indirection tables are vendor-specific and coarse-grained. Kernel-level steering mechanisms are not suitable for high-speed, user-space DPDK pipelines, where latency and overhead must remain tightly bounded.
The Software Packet Distributor (SPD) is motivated by the need for a fully software-defined, adaptable distribution path that remains portable and predictable across platforms. Its design is guided by three principles:
• Software-only — No dependence on NIC-specific capabilities or offloads; deterministic behavior across DPDK environments.
• Flow-aware — Continuously observes per-flow and per-bucket characteristics to identify hotspots and imbalances.
• Workload-aware — Dynamically reshapes bucket-to-core assignments based on real-time worker utilization, stabilizing system behavior under skewed traffic (a greedy-step sketch follows this list).
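Putting these principles together, a greedy reshaping step might look like the following: rank workers by observed load, move the hottest bucket from the most loaded worker to the least loaded one, and stop after a fixed flip budget. This is a sketch under the assumed layout above, not SPD's exact ranking logic:

```c
#include <stdint.h>

#define RETA_SIZE   128
#define NUM_WORKERS 8
#define MAX_FLIPS   4   /* bounded edits per interval */

/* reta maps buckets to workers; bucket_pkts/worker_pkts are the
 * interval's telemetry counters (illustrative names as above). */
static void greedy_reshape(uint8_t reta[RETA_SIZE],
                           const uint64_t bucket_pkts[RETA_SIZE],
                           uint64_t worker_pkts[NUM_WORKERS])
{
    for (int flip = 0; flip < MAX_FLIPS; flip++) {
        /* 1. Rank workers by observed load. */
        unsigned hot = 0, cold = 0;
        for (unsigned w = 1; w < NUM_WORKERS; w++) {
            if (worker_pkts[w] > worker_pkts[hot])  hot = w;
            if (worker_pkts[w] < worker_pkts[cold]) cold = w;
        }
        if (worker_pkts[hot] == worker_pkts[cold])
            break;                      /* already balanced */

        /* 2. Hottest bucket currently mapped to the overloaded worker. */
        int best = -1;
        for (unsigned b = 0; b < RETA_SIZE; b++)
            if (reta[b] == hot &&
                (best < 0 || bucket_pkts[b] > bucket_pkts[best]))
                best = (int)b;
        if (best < 0)
            break;                      /* nothing left to move */

        /* 3. In-place flip plus load-estimate update. */
        reta[best] = (uint8_t)cold;
        worker_pkts[hot]  -= bucket_pkts[best];
        worker_pkts[cold] += bucket_pkts[best];
    }
}
```

A real ranking could weight buckets by bytes rather than packets or skip moves that would overshoot the cold worker; the fixed MAX_FLIPS budget is what keeps the per-interval cost predictable.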
Validated setup (Arm)
• Board: NXP LX2160A-RDB (16× Cortex-A72 @ 2.2 GHz); LSDK 21.08; Linux 5.10.35; DPDK 19.11.7 (PCAP/NULL vdev).
• Hugepages: 1GiB (preferred) with 2MiB fallback.
What’s new in v1.0.5
• Build portability: nested function removed (lcg32_local is now defined at file scope; see the sketch after this list); runtime behavior unchanged.
• Perf CSV: new filename and location; start script: improved signal escalation and newline-safe logging.
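For context on the nested-function change: nested functions are a GNU C extension that other toolchains reject, so hoisting the helper to file scope is what restores portability. A hypothetical shape for such a helper follows; the constants are the classic Numerical Recipes LCG pair, assumed here for illustration, so check the repo for the real definition:

```c
#include <stdint.h>

/* Before: a GNU C nested function defined inside another function
 * (non-portable).  After: the same helper at file scope, which any
 * standard C compiler accepts.  Constants are the Numerical Recipes
 * 32-bit LCG parameters, assumed for illustration. */
static inline uint32_t lcg32_local(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}
```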
Call for feedback
• Looking for testers on additional SoCs/NICs and discussion about congestion-aware bucket ranking.
• If you can share worker_stats_v105.csv from a 60–120s run, we’ll generate comparison plots.
Thanks!
—Mike (author), repo link above