VRP optimization kernel that runs anywhere
Surge is a rich Vehicle Routing Problem solver written in C11 with zero external dependencies. 25+ constraint types, BKS-competitive solution quality across standard academic benchmarks (88% vehicle match on GH-200, 62% on GH-400), compiles to a ~200KB static library or a 361KB WebAssembly module. Same solver runs in a browser, on the edge, or in a datacenter.
Every VRP solver makes you choose
The VRP solver landscape forces a trade-off. You get solution quality (HGS, LKH-3) but can't deploy it. You get deployment flexibility (VROOM) but sacrifice constraint depth. You get constraint richness (PTV, Ortec) but pay six figures and get locked into a JRE monolith. You get a familiar API (OR-Tools) but the solver chokes beyond 500 requests and ships a 50MB binary.
Surge refuses the trade-off.
The six-box test. Surge is the only VRP solver that is simultaneously: (1) competitive with academic BKS, (2) richer in constraint modeling than most commercial offerings, (3) embeddable anywhere, (4) zero external dependencies, (5) open source, and (6) written in C with universal ABI. No other solver checks all six.
An optimization kernel for vehicle routing
Surge solves the family of Vehicle Routing Problems: VRPTW, PDPTW, DARP, and rich VRP with arbitrary combinations of constraints. It uses an Adaptive Large Neighborhood Search (ALNS) metaheuristic built on the Arbor framework, with simulated annealing acceptance, population-based parallel search, and HGS-style infeasible-space exploration.
```
Problem JSON                                       Solution JSON
     │                                                   ▲
     ▼                                                   │
┌─────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────┐
│ sg_api  │───▶│ sg_solve │───▶│  Arbor   │───▶│ sg_solution  │
│ (parse) │    │ (phases) │    │  (ALNS)  │    │  (extract)   │
└─────────┘    └──────────┘    └──────────┘    └──────────────┘
                    │               │
                    ▼               ▼
              ┌──────────┐    ┌───────────┐
              │ Construct│    │  Destroy  │    random, worst,
              │ (greedy  │    │ + Repair  │    Shaw, route,
              │  insert) │    │ operators │    regret-k, greedy
              └──────────┘    └───────────┘
                                    │
                     ┌──────────────┼──────────────┐
                     ▼              ▼              ▼
               ┌───────────┐  ┌──────────┐   ┌──────────┐
               │Feasibility│  │   Cost   │   │  Local   │
               │ (timing,  │  │ (lexicog │   │  Search  │
               │ capacity, │  │  multi-  │   │ (2-opt*, │
               │ pairing)  │  │   obj)   │   │  or-opt, │
               └───────────┘  └──────────┘   │  cross)  │
                                             └──────────┘
```
Problem classes
- VRPTW — Capacitated VRP with time windows (Solomon benchmarks)
- PDPTW — Pickup and delivery with time windows (Li & Lim benchmarks)
- DARP — Dial-a-ride with max ride time constraints
- Rich VRP — Arbitrary constraint combinations: compartments, breaks, setup times, multi-trip, stacking, precedence, locking
Solve phases
- Construction — Greedy insertion builds initial feasible solution
- Phase 1 (Vehicle minimization) — ALNS with infeasible-space exploration, targets unassigned count then vehicle count
- Phase 1.5 (Crunch) — Aggressive vehicle removal with extended repair attempts
- Phase 2 (Distance polish) — ALNS focusing on total distance with fixed vehicle count
- Postprocess — Intra-route local search (2-opt*, or-opt, cross-exchange)
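The phase ordering reflects a lexicographic objective: unassigned requests dominate vehicle count, which dominates distance. A minimal sketch of that comparison (the `toy_` names are illustrative, not Surge's internal API):

```c
/* Lexicographic multi-objective comparison, mirroring the phase priorities:
 * unassigned requests, then vehicle count, then total distance. */
typedef struct {
    int    unassigned;  /* requests left unserved */
    int    vehicles;    /* routes in use */
    double distance;    /* total travel distance */
} toy_cost;

/* Returns 1 if a is strictly better than b under the lexicographic order. */
int toy_cost_better(toy_cost a, toy_cost b) {
    if (a.unassigned != b.unassigned) return a.unassigned < b.unassigned;
    if (a.vehicles   != b.vehicles)   return a.vehicles   < b.vehicles;
    return a.distance < b.distance - 1e-9;  /* epsilon for float noise */
}
```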
Competitive with best known solutions
All results use population mode (3 generations, auto threads), scale-aware iteration caps, and a deterministic seed (42), with scale-tuned ALNS (S22), instance-adaptive construction (S24), and generational SA reheat (S25). Gaps are measured against published Best Known Solutions (BKS).
Solomon VRPTW (56 instances, 100 customers each)
| Category | Instances | Vehicles Match BKS | Avg Veh Gap | Avg Dist Gap |
|---|---|---|---|---|
| C1xx (clustered, tight TW) | 9 | 9/9 | 0.00 | -0.0% |
| C2xx (clustered, wide TW) | 8 | 8/8 | 0.00 | +0.0% |
| R1xx (random, tight TW) | 12 | 7/12 | +0.75 | +5.5% |
| R2xx (random, wide TW) | 11 | 6/11 | +0.45 | +1.4% |
| RC1xx (mixed, tight TW) | 8 | 4/8 | +0.63 | +3.2% |
| RC2xx (mixed, wide TW) | 8 | 2/8 | +0.50 | +3.5% |
| Overall | 56 | 36/56 (64%) | +0.48 | +3.7% |
Li & Lim PDPTW (56 instances, 100 requests each)
| Category | Instances | Vehicles Match BKS | Avg Veh Gap | Avg Dist Gap |
|---|---|---|---|---|
| LC1xx (clustered, tight TW) | 9 | 5/9 | +1.22 | +15.6% |
| LC2xx (clustered, wide TW) | 8 | 8/8 | 0.00 | exact BKS |
| LR1xx (random, tight TW) | 12 | 9/12 | +0.83 | +3.0% |
| LR2xx (random, wide TW) | 11 | 11/11 | 0.00 | exact BKS |
| LRC1xx (mixed, tight TW) | 8 | 6/8 | +0.25 | +0.6% |
| LRC2xx (mixed, wide TW) | 8 | 8/8 | 0.00 | +0.4% |
| Overall | 56 | 47/56 (84%) | +0.41 | +3.3% |
Honest assessment. Solomon 100 uses a short 5s budget; longer runs would close the gap. The remaining vehicle gap is concentrated in the R1/RC1 (tight TW) and RC2 categories. LC1xx PDPTW remains hardest (+15.6% avg dist gap). On wide-TW and clustered instances, Surge matches BKS consistently. At scale, Surge matches BKS vehicle counts on 88% of GH-200 and 62% of GH-400 instances, and distance gaps widen with scale (+6.6% at 200, +12.1% at 400), which is typical for ALNS against specialized solvers with months of per-benchmark tuning.
Gehring-Homberger VRPTW (60 instances, 200 customers each)
Population mode, 60s time limit, MEDIUM_TUNE (S22), deterministic seed 42.
| Category | Instances | Vehicles Match BKS | Avg Veh Gap | Avg Dist Gap |
|---|---|---|---|---|
| C1_2 (clustered, tight TW) | 10 | 8/10 | +0.20 | +5.6% |
| C2_2 (clustered, wide TW) | 10 | 10/10 | 0.00 | +0.7% |
| R1_2 (random, tight TW) | 10 | 10/10 | 0.00 | +12.0% |
| R2_2 (random, wide TW) | 10 | 9/10 | +0.10 | +1.7% |
| RC1_2 (mixed, tight TW) | 10 | 8/10 | +0.20 | +18.2% |
| RC2_2 (mixed, wide TW) | 10 | 8/10 | +0.20 | +1.2% |
| Overall | 60 | 53/60 (88%) | +0.12 | +6.6% |
Gehring-Homberger VRPTW (60 instances, 400 customers each)
Population mode, 120s time limit, LARGE_TUNE (S22) + SA reheat 2.0× (S25), deterministic seed 42.
| Category | Instances | Vehicles Match BKS | Avg Veh Gap | Avg Dist Gap |
|---|---|---|---|---|
| C1_4 (clustered, tight TW) | 10 | 5/10 | +1.10 | +9.6% |
| C2_4 (clustered, wide TW) | 10 | 4/10 | +0.80 | +8.9% |
| R1_4 (random, tight TW) | 10 | 10/10 | 0.00 | +27.4% |
| R2_4 (random, wide TW) | 10 | 10/10 | 0.00 | +6.4% |
| RC1_4 (mixed, tight TW) | 10 | 4/10 | +0.70 | +18.7% |
| RC2_4 (mixed, wide TW) | 10 | 4/10 | +0.70 | +4.3% |
| Overall | 60 | 37/60 (62%) | +0.50 | +12.1% |
Scaling Summary
| Scale | Time Budget | Veh Match | Avg Dist Gap | Tune Profile |
|---|---|---|---|---|
| Solomon 100 | 5s | 36/56 (64%) | +3.7% | FAST |
| Li & Lim 100 | 5s | 47/56 (84%) | +3.3% | FAST |
| GH-200 | 60s | 53/60 (88%) | +6.6% | MEDIUM_TUNE |
| GH-400 | 120s | 37/60 (62%) | +12.1% | LARGE_TUNE + S25 reheat |
| GH-800 | 120s | 25/60 (42%) | +17.0% | LARGE_TUNE |
Performance
| Metric | Value | Notes |
|---|---|---|
| Solomon 100-customer | ~5s avg (population) | 5s time limit, 3 generations, auto threads |
| GH 200-customer | ~66s avg (population) | 60s time limit, MEDIUM_TUNE, auto threads |
| GH 400-customer | ~136s avg (population) | 120s time limit, LARGE_TUNE + SA reheat, auto threads |
| GH 800-customer | ~215s avg (population) | 120s time limit, LARGE_TUNE, auto threads |
| Malloc in hot loop | Zero | Arena allocators, pre-allocated scratch |
| Arena speedup | 21% faster (Solomon) | 6.06s → 4.79s from arenas alone |
| Startup time | Microseconds | No JIT warmup, no module loading |
| GC pauses | None | C, not JVM |
25+ constraint types, all composable
Every constraint is independently testable and freely composable. Adding a new constraint to ALNS requires one feasibility check and one cost component — not rewriting the solver. Compare with HGS, where each new constraint touches 5+ subsystems (Split, crossover, move evaluation, penalties, route updates).
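As a sketch of what "one feasibility check and one cost component" means in practice — all `toy_` names are invented for illustration; this is not Surge's internal API:

```c
/* A constraint is a pair: a hard feasibility predicate and a soft cost
 * component. Composition is then a conjunction of predicates and a sum
 * of penalties. */
#include <stddef.h>

typedef struct { double load; double capacity; double overtime; } toy_route_state;

typedef struct {
    int    (*feasible)(const toy_route_state *s);  /* hard: may this state exist? */
    double (*cost)(const toy_route_state *s);      /* soft: objective penalty */
} toy_constraint;

static int    toy_capacity_ok(const toy_route_state *s)  { return s->load <= s->capacity; }
static double toy_overtime_cost(const toy_route_state *s){ return 10.0 * s->overtime; }

/* A route is feasible iff every constraint agrees; its penalty is the
 * sum of all cost components. */
double toy_evaluate(const toy_constraint *cs, size_t n,
                    const toy_route_state *s, int *feasible_out) {
    double penalty = 0.0;
    int feasible = 1;
    for (size_t i = 0; i < n; i++) {
        if (cs[i].feasible && !cs[i].feasible(s)) feasible = 0;
        if (cs[i].cost) penalty += cs[i].cost(s);
    }
    *feasible_out = feasible;
    return penalty;
}
```

Adding a constraint under this contract means adding one `feasible`/`cost` pair to the array; the search loop is untouched.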
Re-optimization & Warm Start
- Warm start — provide initial solution, solver improves from there
- Request locking (FROZEN) — request stays on designated vehicle, position preserved
- Request locking (COMMITTED) — request must be served, but vehicle/position flexible
- Plan/ETA validation — validate fixed routes, compute ETAs, report constraint violations
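A warm-started, partially locked problem might look like the fragment below. Note that the `lock` and `initial_solution` field names are guesses for illustration, not documented schema; check the actual JSON schema before relying on them.

```json
{
  "requests": [
    {"id": 0, "delivery_task": 0, "lock": "FROZEN"},
    {"id": 1, "delivery_task": 1, "lock": "COMMITTED"},
    {"id": 2, "delivery_task": 2}
  ],
  "initial_solution": {
    "routes": [{"vehicle": 0, "requests": [0, 1]}]
  }
}
```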
Kernel-level comparison
Surge is a VRP optimization kernel, not a logistics platform. Comparing against platforms (PTV, HERE) on platform features (real-time traffic, driver apps) is a category error — those are application-layer concerns handled by other OTTO modules. This compares solvers as optimization kernels.
| | Surge | OR-Tools | PTV | HERE | VROOM | Ortec | cuOpt |
|---|---|---|---|---|---|---|---|
| 100-req quality | BKS-level | Good | Good | Decent | Decent | Good | Good |
| Constraint richness | 25+ types | Good | Good | Good | Limited | Excellent | Limited |
| Large-scale (200+) | 88% veh match (GH-200) | Degrades | Decent | Opaque | Good | Good | Fast but rigid |
| Embeddability | C/WASM, 0 deps | C++, 50MB+ | JRE monolith | API only | C++, OSRM | JRE + license | GPU + CUDA |
| Edge / offline | Native | Heavy | No | No | Needs server | No | Needs GPU |
| Tunability | Full source | Good | Black box | Black box | Limited | Black box | Limited |
| Cost | Open source | Open source | $$$$$ | $$$ API | Open source | $$$$$ | $$$ GPU |
Grade summary
| Dimension | Surge | VROOM | OR-Tools | OptaPlanner |
|---|---|---|---|---|
| API design | A | B+ | C+ | B |
| Solution quality | B+ | B | B+ | B |
| Speed | A- | A | B+ | C |
| Constraint richness | A | C+ | A- | B+ |
| Deployability | A+ | B+ | C | C- |
| Binary size | A+ | B | D | D |
| Auditability | A | B | D | C |
| Community | D | B | A | B+ |
Feature-by-feature constraint comparison
Y = production quality · P = partial / limited · N = not supported · ? = opaque / undocumented
Time & Scheduling
| Constraint | Surge | OR-Tools | PTV | HERE | VROOM | Ortec | cuOpt |
|---|---|---|---|---|---|---|---|
| Hard time windows | Y | Y | Y | Y | Y | Y | Y |
| Soft time windows (penalty) | Y | Y | P | ? | N | Y | N |
| Disjunct time windows | Y | P | P | N | N | P | N |
| Max route duration | Y | Y | Y | Y | N | Y | P |
| Max ride time (DARP) | Y | N | N | N | N | P | N |
| Waiting cost | Y | Y | P | ? | N | Y | N |
| Overtime cost | Y | Y | P | ? | N | Y | N |
| Time-dependent travel | Y | Y | Y | Y | N | Y | N |
| Time-indexed brackets | Y | P | P | ? | N | P | N |
| Per-vehicle travel profiles | Y | Y | P | N | P | P | N |
Capacity & Loading
| Constraint | Surge | OR-Tools | PTV | HERE | VROOM | Ortec | cuOpt |
|---|---|---|---|---|---|---|---|
| Multi-dimensional capacity | Y | Y | Y | P | Y | Y | P |
| Vehicle compartments | Y | N | P | N | N | Y | N |
| LIFO PD stacking | Y | P | N | N | N | Y | N |
| FIFO PD stacking | Y | P | N | N | N | P | N |
| Backhaul | Y | P | P | N | N | Y | N |
| Initial vehicle loads | Y | P | P | N | N | P | N |
| Commodity conflicts | Y | N | P | N | N | Y | N |
Vehicle, PD, Breaks, Sequencing
| Constraint | Surge | OR-Tools | PTV | HERE | VROOM | Ortec | cuOpt |
|---|---|---|---|---|---|---|---|
| Vehicle qualifications | Y | Y | Y | P | Y | Y | P |
| Multi-trip (depot reload) | Y | N | P | N | N | Y | N |
| PD pairing (same vehicle) | Y | Y | P | P | P | Y | P |
| Max ride time (DARP) | Y | N | N | N | N | P | N |
| Driver breaks | Y | P | Y | P | P | Y | N |
| Setup times | Y | P | P | N | N | Y | N |
| Inter-request precedence | Y | P | N | N | N | P | N |
| Exclusion groups | Y | N | N | N | N | P | N |
| Request locking (FROZEN) | Y | N | ? | N | N | P | N |
| Warm start | Y | Y | ? | N | N | Y | N |
| Plan validation mode | Y | N | P | Y | Y | Y | N |
Deployment & Integration
| | Surge | OR-Tools | PTV | HERE | VROOM | Ortec | cuOpt |
|---|---|---|---|---|---|---|---|
| Language | C11 | C++ | Java | REST | C++ | Java | CUDA |
| Binary size | ~200KB | 50MB+ | 100MB+ | N/A | ~5MB | 100MB+ | ~500MB |
| WASM target | Y | N | N | N | N | N | N |
| Zero dependencies | Y | N | N | N | N | N | N |
| Startup time | μs | ms | seconds | N/A | ms | seconds | seconds |
| GC pauses | None | None | Unpredictable | N/A | None | Unpredictable | None |
| Deterministic | Y | Y | ? | N | P | ? | P |
| Edge deployment | Y | Difficult | N | N | N | N | N |
| Source available | Y | Y | N | N | Y | N | N |
| JSON API | Y | P | Y | Y | Y | Y | Y |
| Python bindings | Y | Y | Y | Y | Y | Y | Y |
Domain-native, not framework ceremony
~80 C functions, all following the same pattern: `sg_<noun>_set_<property>(ctx, id, value)`.
No inheritance hierarchies, no builder patterns, no callback dimensions. You think in vehicles,
requests, tasks, and depots, not in solver internals.
Surge: 3 lines
```c
// Add a constrained vehicle
uint32_t v = sg_add_vehicle(ctx);
sg_vehicle_set_depots(ctx, v, depot, depot);
sg_vehicle_set_max_duration(ctx, v, 28800);
```
OR-Tools: dimensions + callbacks
```python
# Dimension + callback + slack ceremony
routing.AddDimension(transit_cb, 30, 28800, True, 'Time')
time_dim = routing.GetDimensionOrDie('Time')
time_dim.CumulVar(manager.NodeToIndex(loc)).SetRange(tw_start, tw_end)
```
JSON API
The same solver is accessible via a JSON API — POST a problem, get a solution. The transport-agnostic
sg_api_handle() function works identically over HTTP, WASM, Unix sockets, or direct C calls.
```sh
curl -X POST http://localhost:8085/api/v1/solve \
  -H "Content-Type: application/json" \
  -d '{
    "vehicles": [{"id": 0, "depot_start": 0, "depot_end": 0, "capacity": [20], "shift": [0, 1000]}],
    "depots": [{"id": 0, "x": 40.0, "y": 50.0, "tw": [0, 1000]}],
    "tasks": [
      {"id": 0, "x": 45.0, "y": 55.0, "tw": [0, 500], "service": 10, "demand": [5]},
      {"id": 1, "x": 42.0, "y": 58.0, "tw": [0, 500], "service": 10, "demand": [3]}
    ],
    "requests": [{"id": 0, "delivery_task": 0}, {"id": 1, "delivery_task": 1}],
    "config": {"max_iterations": 1000}
  }'
```
Integration paths
| Method | Use Case | Example |
|---|---|---|
| C API | Maximum performance, embedded systems | #include "surge.h" |
| JSON API (REST) | Microservice architecture | POST /api/v1/solve |
| Python (ctypes) | Data science, scripting | from surge import Surge |
| Node.js (WASM) | Serverless, edge functions | import { loadSurge } |
| Browser (WASM) | Client-side optimization | <script src="surge.js"> |
| Any FFI | Go (cgo), Rust (bindgen), Java (Panama) | C ABI is universal |
Why ALNS, not HGS
HGS (Hybrid Genetic Search by Vidal) is state-of-the-art on clean CVRP benchmarks. But its architecture makes three commitments that conflict with rich VRP:
- Giant tour + Split decoder — With PD precedence, Split becomes NP-hard. PyVRP explicitly does not support PDPTW for this reason.
- O(1) concatenation scheme — breaks down for setup times (which depend on the adjacent node), max ride time (non-decomposable), commodity conflicts (set membership), and driver breaks (a state machine). The scheme then falls back to O(n) per move, eliminating HGS's speed advantage.
- Crossover destroys structure — OX/SREX permute individual nodes without awareness of PD pairs, commodity conflicts, or exclusion groups. Repair required after nearly every crossover.
Adding a new constraint to ALNS touches 1–2 files (feasibility check + cost component). Adding a new constraint to HGS touches 5+ subsystems (Split, crossover, move evaluation, penalties, route updates). For 25+ simultaneous constraint types, ALNS is the right architecture.
Memory management
Zero malloc/free in the hot loop. Arena allocators for per-solution memory,
optimized single-memcpy solution copy, pre-allocated scratch buffers for feasibility
and local search. 21% speedup on Solomon from arenas alone.
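A bump-pointer arena of the kind described is only a few lines of C. This is an illustrative minimal version (names invented), not Surge's actual allocator:

```c
/* Minimal bump-pointer arena: one malloc up front, pointer-bump
 * allocations in the hot loop, O(1) reset between iterations. */
#include <stdlib.h>
#include <stddef.h>

typedef struct { unsigned char *base; size_t cap, used; } toy_arena;

int toy_arena_init(toy_arena *a, size_t cap) {
    a->base = malloc(cap);          /* the only malloc; done once at startup */
    a->cap = cap;
    a->used = 0;
    return a->base != NULL;
}

void *toy_arena_alloc(toy_arena *a, size_t n) {
    size_t aligned = (a->used + 15u) & ~(size_t)15u;  /* 16-byte alignment */
    if (aligned + n > a->cap) return NULL;            /* no hot-loop malloc */
    a->used = aligned + n;
    return a->base + aligned;
}

void toy_arena_reset(toy_arena *a)   { a->used = 0; }  /* free-all in O(1) */
void toy_arena_destroy(toy_arena *a) { free(a->base); a->base = NULL; }
```

Resetting between ALNS iterations replaces thousands of `free` calls with a single store, which is where the bulk of the measured speedup comes from.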
| Solver | Memory Strategy | Hot-loop Allocations |
|---|---|---|
| Surge | Arena + pre-allocated scratch | Zero |
| VROOM | STL containers, general allocator | Some (fast because it does less work, not better memory handling) |
| OR-Tools | Smart pointers, STL, framework overhead | Significant |
| OptaPlanner/jsprit | JVM with GC | Massive (12–16 byte object headers) |
| HGS | Lean C++ vectors, general allocator | Moderate (closest competitor) |
Codebase
27K lines of solver library code (Surge + Arbor). A competent C developer can read the entire solver in a day. No metaprogramming, no templates, no macros beyond basics. Deterministic: same seed = same output. ASan/UBSan clean. Per-operator telemetry built in.
Five ways to try Surge
1. Browser (zero install)
The API documentation page includes a live WASM demo. Click "Solve Problem" to run the actual solver in your browser.
2. Build from source
```sh
# Clone and build
git clone https://github.com/ottofleet/otto.git
cd otto && make surge

# Run tests (454 tests)
make test-surge

# Run Solomon benchmarks
make -C surge bench
```
3. REST API server
```sh
# Build and start the API server
make -C surge/api
./surge/api/surge-solver

# Solve a problem
curl -X POST http://localhost:8085/api/v1/solve \
  -H "Content-Type: application/json" \
  -d @problem.json
```
4. Python bindings
```sh
# Build shared library
make -C surge shared-lib
```

```python
# Use from Python
from surge import Surge

solver = Surge()
solver.add_vehicle(depot=0, capacity=[20], shift=(0, 1000))
solver.add_task(x=45.0, y=55.0, tw=(0, 500), demand=[5])
result = solver.solve(max_iterations=1000)
```
5. Node.js (WASM)
```js
import { loadSurge } from '@artalis/surge';

const surge = await loadSurge();
const result = surge.solve({
  vehicles: [{ id: 0, depot_start: 0, depot_end: 0, capacity: [20], shift: [0, 1000] }],
  depots: [{ id: 0, x: 40.0, y: 50.0, tw: [0, 1000] }],
  tasks: [{ id: 0, x: 45.0, y: 55.0, tw: [0, 500], service: 10, demand: [5] }],
  requests: [{ id: 0, delivery_task: 0 }],
  config: { max_iterations: 1000 }
});
```
Common questions
How does Surge compare to OR-Tools for large instances?
OR-Tools uses CP-SAT, which degrades badly beyond ~500 requests; its memory footprint explodes with distance matrix size. Surge's ALNS architecture scales by construction: destroy/repair operates on a fixed-size solution representation regardless of instance size. On Gehring-Homberger 200-customer instances, Surge matches the BKS vehicle count 88% of the time with a +6.6% avg distance gap. Instances of 400+ need per-iteration speedups (distance matrix caching, global time limits) to be practical.
Why C and not Rust?
Universal ABI. C is the only language callable from literally everything: Python (ctypes), Node (ffi-napi), Go (cgo), Rust (bindgen), Java (Panama), Swift, WebAssembly. Rust would provide memory safety at compile time, but the C ABI means zero FFI friction. Arena allocation mitigates the most dangerous class of memory bugs while delivering best-in-class performance.
Is WASM performance competitive with native?
WASM runs at 50–80% of native speed depending on the workload. For a 100-customer problem, that's the difference between 9 seconds and 14 seconds — both acceptable for batch optimization. The deployment advantage (run in any browser, no install) outweighs the speed penalty for most use cases.
Can I use this for real-time dispatch?
For re-optimization of an existing plan (insert 1–5 new requests), yes. Use request locking (FROZEN) to pin committed deliveries, then solve with a low iteration budget (100–500 iterations, sub-second). For full fleet re-planning from scratch with 1000+ requests, budget 30–120 seconds.
What's missing vs. commercial solvers?
At the solver kernel level: energy/EV cost model (experimental in OR-Tools, niche for trucking). Everything else — compartments, precedence, locking, breaks, multi-trip, setup times — is implemented. At the platform level: real-time traffic, driver apps, dashboards, reporting. Those are application-layer concerns handled by other OTTO modules or your own infrastructure.
What license? Can I embed this in a commercial product?
AGPLv3 with a Trucking Exception. If you're a trucking fleet using Surge for your own operations, the exception applies and you don't need to open-source your application code. If you're embedding Surge in a commercial SaaS product for resale, AGPL requires source disclosure — or contact us for a commercial license.