
DriftRail SDK Documentation

Learn how to integrate DriftRail into your application to monitor, classify, and audit every LLM interaction in real time.

3 Lines of Code

Drop-in integration with minimal configuration.

Fail-Open

Never breaks your production app. Async by default.

Multi-Language

Python, Node.js, and browser SDKs available.

Quickstart

Get up and running in under 5 minutes. Choose your preferred language:

# Install the SDK
pip install driftrail

# Import the client
from driftrail import DriftRail

# Initialize the client
client = DriftRail(
    api_key="dr_prod_...",
    app_id="my-app"
)

# Log an LLM interaction
response = client.ingest(
    model="gpt-5",
    provider="openai",
    input={"prompt": "What is the capital of France?"},
    output={"text": "The capital of France is Paris."}
)

print(f"Event ID: {response.event_id}")

API Keys

API keys are scoped by environment. We recommend using separate keys for development, staging, and production.

Key Formats

Environment   Prefix        Example
Production    dr_prod_      dr_prod_a1b2c3d4...
Staging       dr_staging_   dr_staging_e5f6g7h8...
Development   dr_dev_       dr_dev_i9j0k1l2...

Security Note: Never expose your API keys in client-side code or public repositories. Use environment variables.
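For example, you can load the key from the environment with the standard-library os module. This is a minimal sketch; the DRIFTRAIL_API_KEY variable name is just a convention we're assuming here, not something the SDK requires:

import os

from driftrail import DriftRail

# Read the key from the environment rather than hardcoding it.
# DRIFTRAIL_API_KEY is an assumed name; use whatever your deployment tooling provides.
api_key = os.environ.get("DRIFTRAIL_API_KEY")
if api_key is None:
    raise RuntimeError("DRIFTRAIL_API_KEY is not set")

client = DriftRail(api_key=api_key, app_id="my-app")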

Python SDK

Installation

pip install driftrail

# For async support
pip install driftrail[async]

Async Usage

import asyncio
from driftrail import DriftRailAsync

async def main():
    async with DriftRailAsync(api_key="...", app_id="my-app") as client:
        response = await client.ingest(
            model="claude-3",
            provider="anthropic",
            input={"prompt": "Hello"},
            output={"text": "Hi there!"}
        )

asyncio.run(main())

Fire-and-Forget (Non-blocking)

# Won't block your main thread
client.ingest_async(
    model="gpt-5",
    provider="openai",
    input={"prompt": "..."},
    output={"text": "..."}
)

With Metadata

import time

start = time.time()
# ... your LLM call ...
latency = int((time.time() - start) * 1000)

client.ingest(
    model="gpt-5",
    provider="openai",
    input={"prompt": "..."},
    output={"text": "..."},
    metadata={
        "latency_ms": latency,
        "tokens_in": 50,
        "tokens_out": 150,
        "temperature": 0.7
    }
)

Node.js SDK

Installation

npm install @driftrail/sdk
# or
yarn add @driftrail/sdk
# or
pnpm add @driftrail/sdk

Fire-and-Forget (Non-blocking)

// Won't block your code
client.ingestAsync({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: '...' },
  output: { text: '...' }
});

OpenAI Chat Completions Helper

// Convenience method for chat completions
await client.logChatCompletion({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  response: 'Hi there! How can I help you today?',
  latencyMs: 420,
  tokensIn: 25,
  tokensOut: 12
});

TypeScript Support

import type { 
  IngestParams, 
  IngestResponse, 
  Provider,
  InputPayload,
  OutputPayload 
} from '@driftrail/sdk';

Browser SDK

The browser SDK is designed for client-side applications. It uses batching and async delivery to minimize performance impact.

Initialization

import { initAIObservability, logInference } from '@driftrail/sdk';

initAIObservability({
  apiKey: 'dr_prod_...',
  appId: 'my-app',
  environment: 'prod',
  batchSize: 10,          // Send events in batches
  flushIntervalMs: 5000,  // Or every 5 seconds
  debug: false
});
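To make those two options concrete: events are queued until either batchSize is reached or flushIntervalMs elapses, whichever comes first. A rough Python sketch of that queue-and-flush pattern (illustrative only, not the SDK's actual implementation):

import threading
import time

class EventBatcher:
    """Illustrative queue-and-flush batcher, not the real SDK internals."""

    def __init__(self, send, batch_size=10, flush_interval_ms=5000):
        self.send = send                        # callable that ships a list of events
        self.batch_size = batch_size
        self.flush_interval = flush_interval_ms / 1000
        self.buffer = []
        self.lock = threading.Lock()
        threading.Thread(target=self._timer, daemon=True).start()

    def log(self, event):
        with self.lock:
            self.buffer.append(event)
            if len(self.buffer) >= self.batch_size:   # size trigger
                self._flush_locked()

    def _timer(self):
        while True:
            time.sleep(self.flush_interval)           # interval trigger
            with self.lock:
                self._flush_locked()

    def _flush_locked(self):
        if self.buffer:
            self.send(self.buffer)
            self.buffer = []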

Logging Events

logInference({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: 'User query here' },
  output: { text: 'AI response here' },
  metadata: {
    latencyMs: 350,
    temperature: 0.7
  }
});

Inference Events

An inference event captures a single LLM interaction, including the prompt, response, and metadata.

Event Schema

{
  "event_id": "evt_a1b2c3d4...",
  "timestamp": "2024-01-15T10:30:00Z",
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [...],
    "retrieved_sources": [...]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "tool_calls": [...]
  },
  "metadata": {
    "latency_ms": 420,
    "tokens_in": 25,
    "tokens_out": 12,
    "temperature": 0.7
  },
  "classification": {
    "risk_score": 15,
    "risk_level": "low",
    "detected_issues": []
  }
}

Risk Classification

Every event is automatically classified for risk using our AI-powered analysis engine.

Low (0-30)

Safe, no issues detected

Medium (31-60)

Minor concerns, review suggested

High (61-85)

Significant issues detected

Critical (86-100)

Immediate attention required
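Because the bands are contiguous score ranges, routing logic on your side can be a simple threshold check. A minimal sketch of the banding above:

def risk_level(risk_score: int) -> str:
    """Map a 0-100 risk_score to the documented bands."""
    if risk_score <= 30:
        return "low"
    if risk_score <= 60:
        return "medium"
    if risk_score <= 85:
        return "high"
    return "critical"

assert risk_level(15) == "low"        # matches the example event schema above
assert risk_level(92) == "critical"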

Detection Types

  • Hallucination detection
  • PII exposure (names, emails, SSNs)
  • Policy violations
  • Prompt injection attempts
  • Jailbreak attempts
  • Toxic/harmful content
  • Off-topic responses
  • RAG groundedness issues

RAG Sources

Track the sources your RAG system retrieves and verify that responses are grounded in those sources.

client.ingest(
    model="gpt-5",
    provider="openai",
    input={
        "prompt": "What does our refund policy say?",
        "retrieved_sources": [
            {"id": "doc-123", "content": "Refunds available within 30 days..."},
            {"id": "doc-456", "content": "Contact support for refund requests..."}
        ]
    },
    output={"text": "According to our policy, refunds are available within 30 days..."}
)

Inline Protection

Inline Guardrails

Block dangerous LLM outputs before they reach your users. The guard() method runs real-time AI classification and rule-based guardrails in under 50ms.

What it detects

  • PII exposure (emails, phones, SSNs)
  • Toxic or harmful content
  • Prompt injection attempts
  • Custom keyword/regex rules

Actions available

  • Block - Prevent output entirely
  • Redact - Mask sensitive data
  • Warn - Flag but allow
  • Allow - Safe content

Fail-Open by Default: If DriftRail is unavailable, content is allowed through. Your app never breaks. Configure guard_mode="fail_closed" for strict environments.

Python Guardrails

Basic Usage

from driftrail import DriftRail

client = DriftRail(api_key="dr_prod_...", app_id="my-app")

# Get response from your LLM
llm_response = your_llm_call(user_prompt)

# Guard it BEFORE returning to user
result = client.guard(
    output=llm_response,
    input=user_prompt,
    mode="strict"  # or "permissive"
)

if result.allowed:
    # Safe to return (may be redacted if PII was found)
    return result.output
else:
    # Content was blocked
    print(f"Blocked: {[t.reason for t in result.triggered]}")
    return "Sorry, I can't help with that."

Guard Modes

Mode         Blocks On                                                  Use Case
strict       Medium+ risk (PII, moderate toxicity, prompt injection)    Healthcare, finance, compliance-heavy apps
permissive   High risk only (severe toxicity, high-risk injection)      General apps, creative tools

Fail-Closed Mode

from driftrail import DriftRail, GuardBlockedError

# Fail-closed: raises exception if content blocked or service unavailable
client = DriftRail(
    api_key="dr_prod_...",
    app_id="my-app",
    guard_mode="fail_closed"
)

try:
    result = client.guard(output=llm_response)
    return result.output
except GuardBlockedError as e:
    # Content was blocked
    log_security_event(e.result.triggered)
    return "Content blocked for safety reasons."

Node.js Guardrails

Basic Usage

import { DriftRail } from '@driftrail/sdk';

const client = new DriftRail({ apiKey: 'dr_prod_...', appId: 'my-app' });

// Get response from your LLM
const llmResponse = await yourLLMCall(userPrompt);

// Guard it BEFORE returning to user
const result = await client.guard({
  output: llmResponse,
  input: userPrompt,
  mode: 'strict'
});

if (result.allowed) {
  return result.output; // May be redacted
} else {
  console.log('Blocked:', result.triggered.map(t => t.reason));
  return "Sorry, I can't help with that.";
}

Express Middleware Pattern

import express from 'express';
import { DriftRail } from '@driftrail/sdk';

const app = express();
const driftrail = new DriftRail({ apiKey: '...', appId: 'my-app' });

app.post('/api/chat', async (req, res) => {
  const { prompt } = req.body;
  
  // Get LLM response
  const llmResponse = await callLLM(prompt);
  
  // Guard before returning
  const guard = await driftrail.guard({
    output: llmResponse,
    input: prompt
  });
  
  if (!guard.allowed) {
    return res.status(400).json({ 
      error: 'Content blocked',
      reasons: guard.triggered.map(t => t.reason)
    });
  }
  
  res.json({ response: guard.output });
});

POST /api/guard

Real-time content safety check. Returns an allow/block/redact decision in under 50ms.

Request

curl -X POST https://api.driftrail.com/api/guard \
  -H "Authorization: Bearer dr_prod_..." \
  -H "X-App-Id: my-app" \
  -H "Content-Type: application/json" \
  -d '{
    "output": "The LLM response to check",
    "input": "Optional user prompt for context",
    "mode": "strict",
    "timeout_ms": 100
  }'

Request Body

Field        Type     Required   Description
output       string   Yes        The LLM output to check
input        string   No         User prompt (helps detect prompt injection)
mode         string   No         "strict" (default) or "permissive"
timeout_ms   number   No         Classification timeout in milliseconds (default: 100, max: 500)

Response

{
  "allowed": true,
  "action": "redact",
  "output": "Contact me at j***@***.com for details",
  "triggered": [
    {
      "type": "classification",
      "name": "PII Redaction",
      "reason": "Redacted email"
    }
  ],
  "classification": {
    "risk_score": 25,
    "pii": { "detected": true, "types": ["email"] },
    "toxicity": { "detected": false, "severity": "none" },
    "prompt_injection": { "detected": false, "risk": "none" }
  },
  "latency_ms": 42,
  "fallback": false
}
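If you're calling the REST endpoint directly rather than through an SDK, the same request is straightforward with Python's requests library. The URL, headers, and field names below come from the examples above; the key and app id are placeholders:

import requests

resp = requests.post(
    "https://api.driftrail.com/api/guard",
    headers={
        "Authorization": "Bearer dr_prod_...",
        "X-App-Id": "my-app",
    },
    json={
        "output": "The LLM response to check",
        "input": "Optional user prompt for context",
        "mode": "strict",
        "timeout_ms": 100,
    },
    timeout=5,
)
resp.raise_for_status()
result = resp.json()

if result["allowed"]:
    # "redact" actions still come back allowed, with a masked output
    final_text = result["output"]
else:
    final_text = "Sorry, I can't help with that."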

POST /api/ingest

Ingest a new inference event for monitoring and classification. Events are processed asynchronously.

Authentication

Include your API key in one of these headers:

  • Authorization: Bearer dr_prod_...
  • X-API-Key: dr_prod_...

Required Headers

POST /api/ingest HTTP/1.1
Content-Type: application/json
Authorization: Bearer dr_prod_...
X-App-Id: my-application

Request Body

{
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ],
    "retrievedSources": [
      { "id": "doc-123", "content": "France is a country..." }
    ]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "toolCalls": []
  },
  "metadata": {
    "latencyMs": 420,
    "tokensIn": 25,
    "tokensOut": 12,
    "temperature": 0.7
  }
}

Response (202 Accepted)

{
  "success": true,
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "job_id": "job_a1b2c3d4..."
}

Error Responses

Status   Description
400      Invalid JSON or schema validation failed
401      Missing or invalid API key
429      Rate limit or usage limit exceeded
500      Internal server error
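Since 429 covers both rate limits and plan usage limits, a simple backoff-and-retry wrapper is usually enough on the client side. A sketch against the raw endpoint (the retry policy here is our choice, not something the API mandates):

import time

import requests

def ingest_with_retry(payload, api_key, app_id, retries=3):
    """POST to /api/ingest, backing off on 429. Illustrative retry policy."""
    url = "https://api.driftrail.com/api/ingest"
    headers = {"Authorization": f"Bearer {api_key}", "X-App-Id": app_id}
    for attempt in range(retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=5)
        if resp.status_code == 202:
            return resp.json()        # {"success": true, "event_id": ..., "job_id": ...}
        if resp.status_code == 429 and attempt < retries - 1:
            time.sleep(2 ** attempt)  # back off 1s, 2s, ... before retrying
            continue
        resp.raise_for_status()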

GET /api/events

Retrieve and filter inference events. Supports pagination and filtering by app, environment, model, and time range.

Query Parameters

Parameter     Type       Description
event_id      string     Get a specific event by ID
app_id        string     Filter by application
environment   string     prod, staging, or dev
model         string     Filter by model name
start_time    ISO 8601   Start of time range
end_time      ISO 8601   End of time range
limit         integer    Max results (default: 100)
offset        integer    Pagination offset

Example Request

curl -X GET "https://api.driftrail.com/api/events?app_id=my-app&limit=50" \
  -H "Authorization: Bearer dr_prod_..."

Response

{
  "events": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "timestamp": "2024-01-15T10:30:00Z",
      "model": "gpt-5",
      "provider": "openai",
      "app_id": "my-app",
      "environment": "prod",
      "latency_ms": 420,
      "tokens_in": 25,
      "tokens_out": 12
    }
  ],
  "total": 1250,
  "limit": 50,
  "offset": 0
}
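The total, limit, and offset fields in the response are all you need to page through a full result set. A minimal pagination loop using requests against the documented endpoint:

import requests

def iter_events(api_key, app_id, page_size=100):
    """Yield every event for an app by walking limit/offset pages."""
    headers = {"Authorization": f"Bearer {api_key}"}
    offset = 0
    while True:
        resp = requests.get(
            "https://api.driftrail.com/api/events",
            headers=headers,
            params={"app_id": app_id, "limit": page_size, "offset": offset},
            timeout=10,
        )
        resp.raise_for_status()
        page = resp.json()
        yield from page["events"]
        offset += page_size
        if offset >= page["total"]:
            break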

GET /api/classifications

Retrieve risk classification results. Includes endpoints for risk distribution and high-risk event alerts.

Endpoints

GET /api/classifications

List all classifications with optional filters

GET /api/classifications/distribution

Get risk level distribution (low, medium, high, critical)

GET /api/classifications/high-risk

Get events above risk threshold (default: 70)

Risk Distribution Example

curl -X GET "https://api.driftrail.com/api/classifications/distribution?app_id=my-app" \
  -H "Authorization: Bearer dr_prod_..."

Response

{
  "low": 850,
  "medium": 280,
  "high": 95,
  "critical": 15,
  "total": 1240
}
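A quick sanity check on the numbers: 850 + 280 + 95 + 15 = 1,240, so total is the sum of the four bands, and per-level shares fall out directly:

dist = {"low": 850, "medium": 280, "high": 95, "critical": 15, "total": 1240}

for level in ("low", "medium", "high", "critical"):
    share = dist[level] / dist["total"] * 100
    print(f"{level}: {share:.1f}%")  # low: 68.5%, medium: 22.6%, high: 7.7%, critical: 1.2%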

High-Risk Events Query

curl -X GET "https://api.driftrail.com/api/classifications/high-risk?threshold=80&limit=20" \
  -H "Authorization: Bearer dr_prod_..."

Response

{
  "classifications": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "risk_score": 92,
      "risk_level": "critical",
      "detected_issues": ["pii_exposure", "hallucination"],
      "classified_at": "2024-01-15T10:30:05Z"
    }
  ]
}
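One common use is a lightweight alerting loop that polls the high-risk endpoint and forwards anything it hasn't seen before. A sketch (the polling cadence and in-memory dedup are our choices, not API requirements):

import time

import requests

def poll_high_risk(api_key, threshold=80, interval_s=60):
    """Poll /api/classifications/high-risk and report unseen events."""
    seen = set()
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        resp = requests.get(
            "https://api.driftrail.com/api/classifications/high-risk",
            headers=headers,
            params={"threshold": threshold, "limit": 20},
            timeout=10,
        )
        resp.raise_for_status()
        for c in resp.json()["classifications"]:
            if c["event_id"] not in seen:
                seen.add(c["event_id"])
                print(f"ALERT {c['risk_level']} ({c['risk_score']}): {c['detected_issues']}")
        time.sleep(interval_s)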

AI Playground

Dashboard Feature

Test AI models with real-time DriftRail safety monitoring. Every interaction runs through our full detection pipeline, giving you instant visibility into potential risks.

Features

  • Interactive chat with multiple AI models (Gemini 2.5 Flash Lite, Gemini 3 Flash, GPT-5 Nano, Claude 4.5 Haiku)
  • Real-time detection pipeline visualization
  • Guardrail testing with automatic blocking
  • Risk analysis for hallucination, PII, toxicity, prompt injection
  • Toggle detections and streaming on/off

Supported Models

Google Gemini 2.5 Flash Lite

Ultra fast, lowest cost

Google Gemini 3 Flash

Latest Gemini model

GPT-5 Nano

OpenAI's fastest model

Anthropic Claude 4.5 Haiku

Anthropic's efficient model

Usage Limits by Plan

Plan         Monthly Messages   Cost
Starter      25                 Free
Growth       500                $99/mo
Pro          2,500              $499/mo
Enterprise   10,000+            Custom

API Access

The playground is also available via API for programmatic testing:

curl -X POST "https://api.driftrail.com/api-playground" \
  -H "Authorization: Bearer dr_prod_..." \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "model": "gemini-flash-lite-latest",
    "runDetections": true
  }'

Response

{
  "content": "The capital of France is Paris.",
  "model": "gemini-2.5-flash-lite",
  "provider": "google",
  "detections": [
    {
      "type": "hallucination",
      "risk": "low",
      "confidence": 0.95,
      "details": "Response is factually accurate"
    }
  ],
  "latencyMs": 450,
  "tokensUsed": 25,
  "usage": {
    "current": 16,
    "limit": 500,
    "remaining": 484
  }
}
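Programmatic callers can watch the usage block in each response to avoid burning through a plan's monthly messages. A short sketch using the request and response fields shown above:

import requests

resp = requests.post(
    "https://api.driftrail.com/api-playground",
    headers={"Authorization": "Bearer dr_prod_...", "Content-Type": "application/json"},
    json={
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "model": "gemini-flash-lite-latest",
        "runDetections": True,
    },
    timeout=30,
)
resp.raise_for_status()
body = resp.json()

print(body["content"])
if body["usage"]["remaining"] < 10:
    print(f"Warning: only {body['usage']['remaining']} playground messages left this month")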

Try it now: Access the AI Playground from your dashboard to test models with real-time safety monitoring.

Ready to get started?

Create your free account and start monitoring your AI in under 5 minutes.

Get Started Free →