
Hybrid Solver

Automatic domain decomposition for large-scale problems. The Hybrid Solver breaks problems that exceed single-GPU memory into subproblems, solves the subproblems on GPUs in parallel, and stitches the results into a global solution.

Problem decomposition approach

The Hybrid Solver uses a variable interaction graph to detect clusters of strongly-coupled variables. Each cluster is extracted as a subproblem, boundary conditions are fixed from the previous pass, and subproblems are dispatched to GPU workers in parallel. After each GPU pass, the global solution is reconstructed by stitching subproblem results, and the process repeats for n_passes iterations.

This is a form of Large Neighborhood Search (fixing part of the solution while optimizing the rest), implemented at scale with GPU workers.
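The decompose, solve, and stitch loop described above can be sketched on a small QUBO. This is a minimal CPU-only illustration and not NEROX's implementation: the names `find_clusters`, `solve_subproblem`, and `hybrid_sketch` are hypothetical, clusters are assumed to be connected components of a thresholded coupling graph, a greedy 1-flip descent stands in for a GPU worker, and rejecting a stitched solution that is worse than the previous pass is a design choice of this sketch.

```python
import numpy as np


def qubo_energy(Q, x):
    """Objective value x^T Q x for a binary vector x."""
    return float(x @ Q @ x)


def find_clusters(Q, threshold):
    """Group variables into connected components of the interaction graph,
    keeping only couplings with |Q_ij + Q_ji| > threshold (an assumption)."""
    strong = np.abs(Q + Q.T) > threshold
    np.fill_diagonal(strong, False)
    seen = np.zeros(Q.shape[0], dtype=bool)
    clusters = []
    for start in range(Q.shape[0]):
        if seen[start]:
            continue
        seen[start] = True
        stack, members = [start], []
        while stack:  # depth-first walk over strong couplings
            v = stack.pop()
            members.append(v)
            for w in np.flatnonzero(strong[v]):
                if not seen[w]:
                    seen[w] = True
                    stack.append(int(w))
        clusters.append(sorted(members))
    return clusters


def solve_subproblem(Q, x, block, sweeps=20):
    """Greedy 1-flip descent over `block`; every variable outside the block
    stays fixed at its value in x. Stand-in for one GPU worker."""
    x = x.copy()
    S = Q + Q.T
    d = np.diag(Q)
    for _ in range(sweeps):
        improved = False
        for i in block:
            # Local field on x[i]: Q_ii plus couplings to all other variables.
            field = d[i] + S[i] @ x - S[i, i] * x[i]
            delta = (1 - 2 * x[i]) * field  # energy change if x[i] is flipped
            if delta < 0:
                x[i] = 1 - x[i]
                improved = True
        if not improved:
            break
    return x


def hybrid_sketch(Q, x0, n_passes=3, threshold=0.5):
    """Repeat: solve each cluster against the previous pass, then stitch."""
    x = x0.copy()
    clusters = find_clusters(Q, threshold)
    for _ in range(n_passes):
        snapshot = x.copy()  # boundary values come from the previous pass
        stitched = x.copy()
        for block in clusters:  # independent of each other, so parallelizable
            stitched[block] = solve_subproblem(Q, snapshot, block)[block]
        # Keep the stitched solution only if it is no worse (sketch-only rule).
        if qubo_energy(Q, stitched) <= qubo_energy(Q, x):
            x = stitched
    return x


# Two strongly coupled pairs joined by one weak coupling (Q[1, 2] = 0.1).
Q = np.array([
    [-1.0,  2.0,  0.0,  0.0],
    [ 0.0, -1.0,  0.1,  0.0],
    [ 0.0,  0.0, -1.0,  2.0],
    [ 0.0,  0.0,  0.0, -1.0],
])
x = hybrid_sketch(Q, np.zeros(4, dtype=int), n_passes=3)
print(find_clusters(Q, 0.5), x, qubo_energy(Q, x))
```

Because each subproblem only reads the snapshot from the previous pass, the inner loop over clusters has no data dependencies, which is what makes parallel dispatch possible.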

Usage

python
import nerox

client = nerox.Client()

# 1000-city TSP: too large for a single GPU pass.
# large_matrix is assumed to be a precomputed distance matrix defined elsewhere.
job = client.optimize.tsp(
    distance_matrix=large_matrix,
    solver="hybrid",
    n_passes=5,             # decomposition rounds (default 3)
    subproblem_size=2000,   # variables per subproblem (default auto)
)

result = job.wait(timeout=600)
print(f"Tour length: {result.objective:.1f}")

# Raw QUBO with 20,000 variables (large_Q is assumed to be defined elsewhere)
job2 = client.optimize.qubo(
    Q=large_Q,
    solver="hybrid",
)
result2 = job2.wait()

When to use Hybrid

Problem has > 10,000 binary variables
QUBO matrix exceeds single-GPU VRAM (80 GB on A100)
TSP with more than ~2,000 cities
VRP with more than ~500 customers
Supply chain network design with thousands of nodes
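The first two rules of thumb above can be checked before submitting a job. The sketch below is a hypothetical helper, not part of the nerox client: it assumes the QUBO matrix is stored densely with 4-byte (float32) entries, which the documentation does not state, and it reuses the 10,000-variable and 80 GB figures quoted above.

```python
A100_VRAM_BYTES = 80 * 10**9   # 80 GB, the A100 figure quoted above
MAX_DIRECT_VARIABLES = 10_000  # variable-count threshold quoted above


def dense_qubo_bytes(n_variables, bytes_per_entry=4):
    """Memory for a dense n x n QUBO matrix (float32 storage is an assumption)."""
    return n_variables * n_variables * bytes_per_entry


def needs_hybrid(n_variables, bytes_per_entry=4):
    """True if either the variable-count or the VRAM rule of thumb applies."""
    too_many_variables = n_variables > MAX_DIRECT_VARIABLES
    exceeds_vram = dense_qubo_bytes(n_variables, bytes_per_entry) > A100_VRAM_BYTES
    return too_many_variables or exceeds_vram


for n in (5_000, 20_000, 150_000):
    print(n, dense_qubo_bytes(n), needs_hybrid(n))
```

Under these assumptions the variable-count rule triggers long before the memory rule does, so the VRAM check mainly matters if a different storage format or threshold is used.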

Quality vs. single-GPU

On problems that fit in a single GPU, the Hybrid Solver typically yields a slightly larger (worse) optimality gap than direct GPU Annealing, because decomposition introduces errors at subproblem boundaries. For problems that genuinely require decomposition, direct GPU Annealing is not an option; there the Hybrid Solver consistently outperforms CPU-only decomposition approaches by 5–30×.
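To compare the two solvers on a problem small enough for both, the optimality gap can be computed against a reference objective. This is a minimal sketch that assumes a minimization problem and the common relative-gap definition; the function name and the choice of reference (best known value or lower bound) are illustrative and not taken from the NEROX API.

```python
def relative_gap(objective, reference):
    """Relative optimality gap of a minimization objective against a
    reference value (best known solution or lower bound)."""
    if reference == 0:
        raise ValueError("relative gap is undefined for a zero reference")
    return (objective - reference) / abs(reference)


# Hypothetical numbers: reference tour of length 100.0, found tour of 105.0.
print(f"{relative_gap(105.0, 100.0):.1%}")
```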