v0 — Rust 1.70+ · Linux x86_64 / aarch64

NUMA-first runtime for latency-critical Rust.

numaperf gives you explicit control over memory placement, thread pinning, and work scheduling on NUMA systems. Stop guessing where your data lives and start guaranteeing it.

Memory ratio (NUMA)
2–3×
Placement policies
4
Workspace crates
8
MSRV
1.70

Cross-node memory access latency on multi-socket servers is typically 2–3× local access. Source: project README.

What you get

A focused set of NUMA primitives — not a kitchen-sink runtime. Pick the facade or wire up only the crates you need.

topo

Topology discovery

Query NUMA nodes, CPUs, and inter-node distances at runtime. No more assuming a fixed layout across hardware.

affinity

Thread pinning

RAII-based CPU affinity via ScopedPin. Affinity is restored on drop — no leaks across scopes or worker pools.

mem

Memory placement

Explicit MemPolicy: Bind, Preferred, Interleave, Local. Pair with Prefault::Touch to make placement decisions stick at allocation time.

sched

Work scheduling

NumaExecutor with per-node worker pools and configurable work stealing. Schedule queries on data-local workers, not whatever core was free.

sharded

Sharded data

NumaSharded<T> for per-node data structures. ShardedCounter for lock-free counting that does not bounce a cache line across the box.

io

Device locality

Map NICs and NVMe devices to their NUMA nodes. Allocate packet buffers on the NIC-local node and skip the cross-node copy.

perf

Observability

Track locality ratios, generate health reports, identify cross-node traffic. Find the leak before it shows up in p99.

mode

Hard mode

Strict enforcement when you need guarantees, graceful degradation when you do not. The default behaviour is your choice, not the OS’s.

The 30-second mental model

Discover the topology. Pin your thread. Allocate locally with a policy you actually chose. Touch pages to force placement. Everything else is observability.

main.rs
use numaperf::{Topology, ScopedPin, NumaRegion, MemPolicy, NodeMask, Prefault};

fn main() -> Result<(), numaperf::NumaError> {
    // Discover NUMA topology
    let topo = Topology::discover()?;
    let node0 = topo.numa_nodes()[0].id();

    // Pin this thread to node 0's CPUs (restored on drop)
    let _pin = ScopedPin::to_node(&topo, node0)?;

    // Allocate 1 GiB bound to node 0
    let region = NumaRegion::anon(
        1024 * 1024 * 1024,
        MemPolicy::Bind(NodeMask::single(node0)),
        Default::default(),
        Prefault::Touch,
    )?;

    // region.as_mut_slice() is now guaranteed local to node 0
    println!("Allocated {} bytes on node {}", region.len(), node0);
    Ok(())
}

Example adapted directly from the project README.

Where it earns its keep

Database engines

Pin buffer pools to specific nodes, schedule queries on data-local workers.

Network processing

Allocate packet buffers on the NIC-local node, process without cross-node copies.

Scientific computing

Partition large arrays across nodes, compute with guaranteed locality.

Trading systems

Eliminate latency variance from NUMA effects with strict pinning and placement.

Crate layout

numaperf ships as a workspace. Use the facade for everything, or wire up only the crates you actually need.

CratePurpose
numaperf Facade — re-exports all public APIs
numaperf-topo Topology discovery
numaperf-affinity Thread pinning
numaperf-mem Memory placement
numaperf-sched Work scheduling
numaperf-sharded Per-node data structures
numaperf-io Device locality
numaperf-perf Observability

Platform support

PlatformSupport
Linux x86_64Full
Linux aarch64Full
macOSGraceful degradation (no NUMA hardware)

Windows is not currently supported. If you need it, open an issue describing the target.