Timber compiles tree-based ML models to native C99 code. No Python runtime overhead. No interpreter. No garbage collection pauses. Just raw CPU speed.
Median latency: ~50 µs (single-row inference)
Batch throughput: 1.38M rows/sec
P99 latency: <2 ms under sustained load
Per-row cost: 0.72 µs (10K batch inference)
Measured on Apple M-series (single core, Docker Desktop). Production deployments on bare-metal Linux are typically faster.
XGBoost — Breast Cancer (50 trees · 30 features)
Throughput: 1.38M rows/sec
Single-row inference: ~50 µs (0.05 ms median)
P99 latency under load: <2 ms
1K batch: 0.6 ms (0.6 µs per row)
10K batch: 7.2 ms (0.72 µs per row)
XGBoost — Fraud Detection (200 trees · 50 features)
Throughput: 625K rows/sec
Single-row inference: ~120 µs (0.12 ms median)
P99 latency under load: <4 ms
1K batch: 1.8 ms (1.8 µs per row)
10K batch: 16 ms (1.6 µs per row)
LightGBM — Click Prediction (150 trees · 80 features)
Throughput: 830K rows/sec
Single-row inference: ~90 µs (0.09 ms median)
P99 latency under load: <3 ms
1K batch: 1.4 ms (1.4 µs per row)
10K batch: 12 ms (1.2 µs per row)
Single-row inference latency for a 50-tree XGBoost model.
Ahead-of-Time C99 Compilation
Your model is compiled to native C code at deploy time — not interpreted at inference time. The resulting binary runs directly on the CPU with zero runtime overhead.
No Python in the Hot Path
Traditional serving loads your model in a Python process. Every prediction pays for the GIL, garbage collector, and interpreter overhead. Timber eliminates all of that.
Containerized Isolation
Each deployment runs in its own container with dedicated CPU and memory. No noisy neighbors. Predictable, consistent latency under any traffic pattern.
Deterministic & Reproducible
Compiled binaries produce bit-identical outputs. Every compilation is SHA-256 hashed. Auditable, versioned, and guaranteed consistent across environments.
Upload your model and see the difference in seconds.