Prerequisites
You need at least one NVIDIA GPU with CUDA 12.1+ support. The NEROX solver image is tested on A100, H100, RTX 4090, and A10G GPUs. A minimum of 24 GB of VRAM is recommended for production workloads with more than 10,000 variables.
- Docker 24.0+
- NVIDIA Container Toolkit (replaces the deprecated nvidia-docker2)
- CUDA driver ≥ 525.60 (CUDA 12.1 compatible)
- Linux x86_64 (Ubuntu 22.04 LTS recommended)
- A NEROX Business or Enterprise license key
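A quick way to confirm the driver requirement above is met is to compare the installed driver version against the 525.60 minimum. This is an illustrative sketch (the helper names are ours, not part of NEROX); it assumes `nvidia-smi` is on `PATH`:

```python
import subprocess

MIN_DRIVER = (525, 60)  # minimum driver for CUDA 12.1 compatibility

def parse_driver_version(version: str) -> tuple:
    """Turn a driver string like '535.129.03' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def driver_ok(version: str) -> bool:
    """True if the given driver version meets the CUDA 12.1 minimum."""
    return parse_driver_version(version) >= MIN_DRIVER

def installed_driver_version() -> str:
    """Query the installed NVIDIA driver via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip().splitlines()[0]
```

Tuple comparison handles versions with two or three components (e.g. `"525.60"` and `"535.129.03"`) without extra parsing logic.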
Install NVIDIA Container Toolkit
```bash
# Ubuntu / Debian
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
Pull and run the solver container
```bash
# Authenticate with the NEROX registry
echo $NEROX_LICENSE_KEY | docker login registry.driftrail.com -u license --password-stdin

# Pull the solver image
docker pull registry.driftrail.com/nerox-solver:latest

# Run with all GPUs
docker run -d \
  --name nerox-solver \
  --gpus all \
  --restart unless-stopped \
  -p 8080:8080 \
  -e NEROX_LICENSE_KEY=$NEROX_LICENSE_KEY \
  -e NEROX_LOG_LEVEL=info \
  registry.driftrail.com/nerox-solver:latest
```
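Once the container is running, its `/health` endpoint (documented under "Health check and monitoring" below) reports solver status and GPU count. A small readiness check, sketched in Python; `is_ready` and `check` are illustrative helper names, not part of the NEROX client:

```python
import json
import urllib.request

def is_ready(payload: dict) -> bool:
    """Decide readiness from the /health JSON payload.

    Expected shape (from the health endpoint docs):
    {"status": "ok", "gpus": 2, "jobs_running": 1}
    """
    return payload.get("status") == "ok" and payload.get("gpus", 0) >= 1

def check(base_url: str = "http://localhost:8080") -> bool:
    """Fetch the health endpoint and report readiness."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return is_ready(json.load(resp))
```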
Configure the Python client to use your instance
```python
import nerox

client = nerox.Client(
    base_url="http://your-server:8080",
    api_key="nrx_sk_...",  # your API key, validated locally by the container
)

# Example input: symmetric pairwise distances between three stops
matrix = [
    [0, 10, 15],
    [10, 0, 20],
    [15, 20, 0],
]

job = client.optimize.tsp(distance_matrix=matrix, solver="gpu")
result = job.wait()
print(result.objective)
```

Docker Compose (recommended for single-node)
```yaml
# docker-compose.yml
version: "3.9"
services:
  nerox-solver:
    image: registry.driftrail.com/nerox-solver:latest
    restart: unless-stopped
    ports:
      - "8080:8080"
    environment:
      NEROX_LICENSE_KEY: ${NEROX_LICENSE_KEY}
      NEROX_LOG_LEVEL: info
      NEROX_MAX_CONCURRENT_JOBS: 4
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    volumes:
      - nerox-data:/var/nerox
  nerox-gateway:
    image: registry.driftrail.com/nerox-gateway:latest
    restart: unless-stopped
    ports:
      - "443:443"
    environment:
      UPSTREAM: http://nerox-solver:8080
      TLS_CERT: /certs/tls.crt
      TLS_KEY: /certs/tls.key
    volumes:
      - ./certs:/certs:ro
volumes:
  nerox-data:
```

Kubernetes deployment
GPU node pool
Label your GPU nodes and configure the NVIDIA device plugin before deploying. The NEROX solver is deployed as a StatefulSet; each replica owns a dedicated GPU.
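The StatefulSet below pulls the license key from a Secret named `nerox-license` under the key `key`. A matching Secret manifest might look like this sketch (the `stringData` value is a placeholder for your actual key):

```yaml
# secret.yaml - provides the license key referenced by the StatefulSet
apiVersion: v1
kind: Secret
metadata:
  name: nerox-license
  namespace: nerox
type: Opaque
stringData:
  key: "<your NEROX license key>"
```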
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nerox-solver
  namespace: nerox
spec:
  serviceName: nerox-solver  # StatefulSets require a matching headless Service
  replicas: 2
  selector:
    matchLabels:
      app: nerox-solver
  template:
    metadata:
      labels:
        app: nerox-solver
    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-a100
      containers:
        - name: solver
          image: registry.driftrail.com/nerox-solver:latest
          ports:
            - containerPort: 8080
          env:
            - name: NEROX_LICENSE_KEY
              valueFrom:
                secretKeyRef:
                  name: nerox-license
                  key: key
          resources:
            limits:
              nvidia.com/gpu: "1"
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
```

Environment variables reference
| Variable | Description |
| --- | --- |
| `NEROX_LICENSE_KEY` | Required. Your Business/Enterprise license key. |
| `NEROX_LOG_LEVEL` | `debug`, `info`, `warn`, or `error`. Default: `info`. |
| `NEROX_MAX_CONCURRENT_JOBS` | Max jobs running in parallel per instance. Default: `2`. |
| `NEROX_DATA_DIR` | Path for job result storage. Default: `/var/nerox`. |
| `NEROX_TLS_CERT` | Path to TLS certificate (optional; use the gateway instead). |
| `NEROX_METRICS_PORT` | Prometheus metrics endpoint port. Default: `9090`. |

Health check and monitoring
```bash
# Health endpoint
curl http://localhost:8080/health
# { "status": "ok", "gpus": 2, "jobs_running": 1 }

# Prometheus metrics (scraped by default at :9090/metrics)
# nerox_jobs_total, nerox_job_duration_seconds, nerox_gpu_utilization_percent

# View live GPU utilization
watch -n1 nvidia-smi
```
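The metrics endpoint on port 9090 (see `NEROX_METRICS_PORT` above) can be scraped with a standard Prometheus job. A minimal fragment, assuming the solver host is reachable as `nerox-host` (a placeholder hostname):

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "nerox-solver"
    static_configs:
      - targets: ["nerox-host:9090"]  # NEROX_METRICS_PORT default
```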