DriftRail SDK Documentation
Learn how to integrate DriftRail into your application to monitor, classify, and audit every LLM interaction in real-time.
3 Lines of Code
Drop-in integration with minimal configuration.
Fail-Open
Never breaks your production app. Async by default.
Multi-Language
Python, Node.js, and browser SDKs available.
Quickstart
Get up and running in under 5 minutes. Choose your preferred language:
# Install the SDK
pip install driftrail
from driftrail import DriftRail

# Initialize the client
client = DriftRail(
    api_key="dr_live_...",
    app_id="my-app"
)

# Log an LLM interaction
response = client.ingest(
    model="gpt-4o",
    provider="openai",
    input={"prompt": "What is the capital of France?"},
    output={"text": "The capital of France is Paris."}
)

print(f"Event ID: {response.event_id}")
API Keys
API keys are scoped by environment. We recommend using separate keys for development, staging, and production.
Getting Your API Key
When you sign up for DriftRail, an API key is automatically generated for your account.
Key Formats
| Environment | Prefix | Example |
|---|---|---|
| Production | dr_live_ | dr_live_a1b2c3d4... |
| Staging | dr_test_ | dr_test_e5f6g7h8... |
| Development | dr_test_ | dr_test_i9j0k1l2... |
Security Note: Never expose your API keys in client-side code or public repositories. Use environment variables.
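For example, a minimal pattern for loading the key from an environment variable in Python (the DRIFTRAIL_API_KEY variable name matches the one used in the serverless examples later in these docs):

import os
from driftrail import DriftRail

# Read the key from the environment instead of hard-coding it
client = DriftRail(
    api_key=os.environ["DRIFTRAIL_API_KEY"],  # e.g. dr_live_... in production
    app_id="my-app"
)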
Python SDK
The official Python SDK for DriftRail. Supports both sync and async usage patterns, with built-in inline guardrails and enterprise features.
Now Available: pip install driftrail. Version 2.1.0 on PyPI with 120+ enterprise methods.
Installation
pip install driftrail
# For async support
pip install driftrail[async]
Quick Start
from driftrail import DriftRail

client = DriftRail(
    api_key="dr_live_...",
    app_id="my-app"
)

# Log an LLM interaction
response = client.ingest(
    model="gpt-4o",
    provider="openai",
    input={"prompt": "What is the capital of France?"},
    output={"text": "The capital of France is Paris."},
    metadata={
        "latency_ms": 420,
        "tokens_in": 25,
        "tokens_out": 12,
        "temperature": 0.7
    }
)

print(f"Event ID: {response.event_id}")
Inline Guardrails
Block dangerous outputs before they reach users:
from driftrail import DriftRail

client = DriftRail(api_key="dr_live_...", app_id="my-app")

# Get response from your LLM
llm_response = your_llm_call(user_prompt)

# Guard it before returning to user
result = client.guard(
    output=llm_response,
    input=user_prompt,
    mode="strict"  # or "permissive"
)

if result.allowed:
    return result.output  # May be redacted if PII was found
else:
    print(f"Blocked: {[t.reason for t in result.triggered]}")
    return "Sorry, I can't help with that."
Async Usage
For async applications using aiohttp:
import asyncio
from driftrail import DriftRailAsync

async def main():
    async with DriftRailAsync(api_key="dr_live_...", app_id="my-app") as client:
        response = await client.ingest(
            model="claude-sonnet-4",
            provider="anthropic",
            input={"prompt": "Hello!"},
            output={"text": "Hi there!"}
        )
        print(f"Event ID: {response.event_id}")

asyncio.run(main())
Fire-and-Forget (Non-blocking)
For long-running server processes only:
# Won't block your main thread
client.ingest_async(
    model="gpt-4o",
    provider="openai",
    input={"prompt": "..."},
    output={"text": "..."}
)
⚠️ Serverless Warning: Do not use ingest_async() in AWS Lambda, Google Cloud Functions, or other serverless environments. Use the synchronous ingest() method instead.
RAG Source Tracking
Track which documents were used in RAG responses:
client.ingest(
    model="gpt-4o",
    provider="openai",
    input={
        "prompt": "What's our refund policy?",
        "retrieved_sources": [
            {"id": "doc-123", "content": "Refunds available within 30 days..."},
            {"id": "doc-456", "content": "Contact support for refund requests..."}
        ]
    },
    output={"text": "According to our policy, refunds are available within 30 days..."}
)
Fail-Open Architecture
By default, the SDK fails open. Errors are captured but won't crash your app:
client = DriftRail(
    api_key="dr_live_...",
    app_id="my-app",
    fail_open=True,         # Default: errors logged but don't raise
    guard_mode="fail_open"  # Default: if guard API unavailable, allow content
)

# Even if DriftRail is down, this won't raise
response = client.ingest(...)
if not response.success:
    print(f"Warning: {response.error}")
Enterprise Features
The DriftRailEnterprise client provides 120+ methods for full platform access:
from driftrail import DriftRailEnterprise
client = DriftRailEnterprise(api_key="dr_live_...", app_id="my-app")
# === Incident Management ===
stats = client.get_incident_stats()
incidents = client.list_incidents(status=["open"], severity=["critical"])
client.create_incident(title="High risk spike", severity="high", incident_type="risk_spike")
client.update_incident_status(incident_id, status="investigating")
# === Compliance & Reporting ===
compliance = client.get_compliance_status()
score = client.get_compliance_score()
reports = client.get_compliance_reports()
client.generate_compliance_report(framework="hipaa", format="pdf", include_ai_analysis=True)
client.create_custom_framework(name="Internal Policy", controls=[...])
# === Executive Dashboard ===
metrics = client.get_executive_metrics(period="7d")
targets = client.get_kpi_targets()
client.update_kpi_targets({"max_high_risk_percent": 5.0})
client.export_executive_metrics(period="30d", format="xlsx")
# === Model Analytics ===
summary = client.get_model_analytics_summary()
logs = client.get_historical_logs(model="gpt-4o", limit=100)
switches = client.get_model_switches()
client.record_model_switch(app_id="my-app", new_model="claude-4", new_provider="anthropic")
benchmarks = client.get_model_benchmarks()
client.calculate_model_benchmark(model="gpt-4o", days=7)
# === Drift Detection V3 ===
drift_score = client.get_drift_score()
heatmap = client.get_drift_heatmap(days=30)
thresholds = client.get_drift_thresholds()
client.update_drift_thresholds({"risk_score": {"warning": 15, "critical": 25}})
predictions = client.get_drift_predictions()
correlations = client.get_drift_deployment_correlations()
client.record_deployment(app_id="my-app", version="v2.1.0", deployment_type="release")
seasonality = client.get_seasonality_patterns()
distribution = client.get_distribution_analysis()
statistics = client.get_baseline_statistics()
# === Notification Channels ===
channels = client.get_notification_channels()
client.create_notification_channel(
    channel_type="slack",
    name="Alerts Channel",
    config={"webhook_url": "https://hooks.slack.com/..."},
    severity_filter=["critical", "warning"]
)
# === Drift Segments ===
segments = client.get_drift_segments()
client.create_drift_segment(name="Production GPT-4", filter_criteria={"model": "gpt-4o"})
# === Distributed Tracing ===
trace = client.start_trace(app_id="my-app", name="chat-completion", user_id="user-123")
span = client.start_span(trace_id=trace["trace_id"], name="llm-call", span_type="llm", model="gpt-4o")
client.end_span(span["span_id"], status="completed", tokens_in=100, tokens_out=50)
client.end_trace(trace["trace_id"])
traces = client.list_traces(status="completed", limit=50)
# === Prompt Management ===
prompt = client.create_prompt(name="Customer Support", content="You are a helpful assistant...")
version = client.create_prompt_version(prompt["prompt_id"], content="Updated prompt...", commit_message="v2")
client.deploy_prompt_version(version["version_id"], environment="production")
deployed = client.get_deployed_prompt(prompt["prompt_id"], environment="production")
# === Evaluation Framework ===
dataset = client.create_dataset(name="QA Test Set", schema_type="qa")
client.add_dataset_items(dataset["dataset_id"], items=[{"input": {...}, "expected_output": {...}}])
run = client.create_eval_run(dataset["dataset_id"], evaluators=[{"type": "exact_match"}])
results = client.get_eval_run(run["run_id"])
# === Semantic Caching ===
settings = client.get_cache_settings()
client.update_cache_settings({"is_enabled": True, "similarity_threshold": 0.95})
stats = client.get_cache_stats()
lookup = client.cache_lookup(input="What is the capital of France?")
client.cache_store(input="...", output="...", model="gpt-4o")
# === Agent Simulations ===
sim = client.create_simulation(name="Support Bot Test", scenario="User asks for refund")
run = client.run_simulation(sim["simulation_id"])
client.add_simulation_turn(run["run_id"], turn_number=1, role="user", content="I want a refund")
stats = client.get_simulation_stats()
# === Integrations ===
integrations = client.get_integrations()
client.create_integration(type="slack", webhook_url="https://...", events=["high_risk", "incident"])
client.test_integration(webhook_url="https://...", type="slack")
# === Benchmarks ===
industries = client.get_industries()
report = client.get_benchmark_report(industry="healthcare")
client.set_tenant_industry(industry="fintech")
# === Guardrails ===
guardrails = client.get_guardrails()
client.create_guardrail(name="PII Blocker", rule_type="pii", action="block")
stats = client.get_guardrail_stats()
# === Retention Policies ===
policies = client.get_retention_policies()
client.create_retention_policy(name="90 Day Retention", data_type="events", retention_days=90)
# === Audit Logs ===
logs = client.get_audit_logs(action="event.ingested", limit=100)
# === Events & Classifications ===
events = client.get_events(min_risk_score=70, limit=50)
event = client.get_event(event_id)
live = client.get_live_events()
classifications = client.get_classifications(min_score=0.8)
# === Custom Detections ===
detections = client.get_custom_detections()
client.create_custom_detection(name="Competitor Mention", detection_type="keyword", config={"keywords": [...]})
# === Webhooks ===
webhooks = client.get_webhooks()
client.create_webhook(url="https://...", events=["high_risk", "drift_alert"])
# === Stats ===
stats = client.get_stats(period="7d")
Available Enterprise Methods
Incidents & Compliance
- list_incidents, create_incident
- update_incident_status
- get_incident_stats
- get_compliance_status/score
- generate_compliance_report
- create_custom_framework
Drift Detection V3
- get_drift_score/heatmap
- get/update_drift_thresholds
- get_drift_predictions
- get_seasonality_patterns
- get_baseline_statistics
- record_deployment
Tracing & Prompts
- start/end_trace, start/end_span
- list_traces, get_trace
- create_prompt, create_prompt_version
- deploy_prompt_version
- get_deployed_prompt
- rollback_prompt
Evaluations & Cache
- create_dataset, add_dataset_items
- create_eval_run, get_eval_run
- submit_eval_result
- get/update_cache_settings
- cache_lookup, cache_store
- invalidate_cache, clear_cache
Simulations & Analytics
- create/run_simulation
- add_simulation_turn
- get_simulation_stats
- get_executive_metrics
- get_model_analytics_summary
- get_model_leaderboard
Configuration
- get/create_guardrail
- get/create_webhook
- get/create_integration
- get/create_retention_policy
- get/create_notification_channel
- get_audit_logs
Node.js SDK
Installation
npm install @drift_rail/sdk
# or
yarn add @drift_rail/sdk
# or
pnpm add @drift_rail/sdk
Fire-and-Forget (Non-blocking)
// Won't block your code
client.ingestAsync({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: '...' },
  output: { text: '...' }
});
OpenAI Chat Completions Helper
// Convenience method for chat completions
await client.logChatCompletion({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  response: 'Hi there! How can I help you today?',
  latencyMs: 420,
  tokensIn: 25,
  tokensOut: 12
});
TypeScript Support
import type {
  IngestParams,
  IngestResponse,
  Provider,
  InputPayload,
  OutputPayload
} from '@drift_rail/sdk';
Browser SDK
The browser SDK is designed for client-side applications. It uses batching and async delivery to minimize performance impact.
Initialization
import { initAIObservability, logInference } from '@drift_rail/sdk';
initAIObservability({
  apiKey: 'dr_live_...',
  appId: 'my-app',
  environment: 'prod',
  batchSize: 10,          // Send events in batches
  flushIntervalMs: 5000,  // Or every 5 seconds
  debug: false
});
Logging Events
logInference({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: 'User query here' },
  output: { text: 'AI response here' },
  metadata: {
    latencyMs: 350,
    temperature: 0.7
  }
});
Inference Events
An inference event captures a single LLM interaction, including the prompt, response, and metadata.
Event Schema
{
  "event_id": "evt_a1b2c3d4...",
  "timestamp": "2026-01-15T10:30:00Z",
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [...],
    "retrieved_sources": [...]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "tool_calls": [...]
  },
  "metadata": {
    "latency_ms": 420,
    "tokens_in": 25,
    "tokens_out": 12,
    "temperature": 0.7
  },
  "classification": {
    "risk_score": 15,
    "risk_level": "low",
    "detected_issues": []
  }
}
Risk Classification
Every event is automatically classified for risk using our AI-powered analysis engine. Risk scores are calculated using a weighted, confidence-adjusted algorithm that you can customize.
- Low (0-39) - Safe, no significant issues detected
- Medium (40-69) - Concerns detected, review suggested
- High (70-100) - Significant issues, attention required
Detection Types
| Detection | Default Weight | Description |
|---|---|---|
| hallucination | 0.20 | Detects fabricated or unsupported claims |
| policy_violation | 0.25 | Content that violates usage policies |
| toxicity | 0.20 | Harmful, offensive, or inappropriate content |
| prompt_injection | 0.15 | Attempts to manipulate the model |
| pii_detection | 0.15 | Personal data exposure (names, emails, SSNs) |
| factual_accuracy | 0.10 | Verifiable factual errors |
| confidence_degradation | 0.05 | Low model confidence in response |
| brand_safety | 0.10 | Content that may harm brand reputation |
Risk Scoring Algorithm
The final risk score is calculated using a production-ready algorithm:
1. Weighted Scoring - Each detection type contributes based on its configured weight (0-1). Weights can be customized per tenant.
2. Confidence Adjustment - Scores are scaled by √(confidence) to give diminishing returns for uncertain detections.
3. Compounding - Multiple significant detections (score ≥ 50) compound the final score by a configurable factor (default: 1.1x per detection, max 1.5x).
4. Custom Thresholds - Define your own boundaries for low/medium/high risk levels.
Pro Feature: Configure custom weights, thresholds, and compounding settings via the Risk Config API or in the Dashboard under Settings → Detections.
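To make these steps concrete, here is a small illustrative sketch of the computation in Python, using the default weights from the detection table above. It is a simplified model of the documented steps, not the exact production implementation, and the example detections are hypothetical:

import math

# Illustrative only: each detection has a raw score (0-100),
# a confidence (0-1), and a configured weight (0-1).
detections = [
    {"type": "pii_detection", "score": 80, "confidence": 0.9, "weight": 0.15},
    {"type": "toxicity", "score": 60, "confidence": 0.64, "weight": 0.20},
]

# Steps 1-2: weighted scoring, scaled by sqrt(confidence)
base_score = sum(
    d["score"] * d["weight"] * math.sqrt(d["confidence"]) for d in detections
)

# Step 3: each significant detection (raw score >= 50) compounds the
# result by 1.1x, capped at 1.5x overall
significant = sum(1 for d in detections if d["score"] >= 50)
multiplier = min(1.1 ** significant, 1.5)
risk_score = min(base_score * multiplier, 100)

# Step 4: thresholds map the score to a level (defaults shown above)
level = "low" if risk_score < 40 else "medium" if risk_score < 70 else "high"
print(round(risk_score, 1), level)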
RAG Sources
Track the sources your RAG system retrieves and verify that responses are grounded in those sources. DriftRail uses these sources to detect hallucinations and verify factual accuracy.
Source Schema
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique identifier for the source document |
| content | string | No | The retrieved text content (recommended for groundedness checks) |
| type | string | No | Source type (e.g., "document", "webpage", "database") |
| url | string | No | URL of the source document |
| metadata | object | No | Custom metadata (e.g., score, chunk_index) |
Python Example
client.ingest(
    model="gpt-5",
    provider="openai",
    input={
        "prompt": "What does our refund policy say?",
        "retrievedSources": [  # camelCase in API payload
            {
                "id": "doc-123",
                "content": "Refunds available within 30 days...",
                "type": "document",
                "url": "https://docs.example.com/refunds",
                "metadata": {"score": 0.95, "chunk_index": 2}
            },
            {
                "id": "doc-456",
                "content": "Contact support for refund requests..."
            }
        ]
    },
    output={"text": "According to our policy, refunds are available within 30 days..."}
)

# The Python SDK also accepts snake_case for convenience:
input={"retrieved_sources": [...]}
Node.js Example
await client.ingest({
  model: 'gpt-5',
  provider: 'openai',
  input: {
    prompt: 'What does our refund policy say?',
    retrievedSources: [
      { id: 'doc-123', content: 'Refunds available within 30 days...' },
      { id: 'doc-456', content: 'Contact support for refund requests...' }
    ]
  },
  output: { text: 'According to our policy, refunds are available within 30 days...' }
});
How RAG Groundedness Works
When you provide retrievedSources, DriftRail's classifier compares the LLM output against the source content to detect:
- Hallucinations - Claims not supported by sources
- Contradictions - Output that conflicts with source material
- False attributions - Incorrect citations or references
- Invented details - Specifics not present in sources
Note: For best groundedness detection, include the actual content of retrieved sources, not just IDs. Sources are automatically summarized if total context exceeds ~7,500 tokens.
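If you want to anticipate when summarization will kick in, a rough pre-check is possible with the common ~4 characters per token heuristic. This is an approximation only; the tokenizer DriftRail actually uses is not specified here:

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text
    return len(text) // 4

sources = [{"id": "doc-123", "content": "Refunds available within 30 days..."}]
total = sum(estimate_tokens(s.get("content", "")) for s in sources)
if total > 7500:
    print("Sources will likely be summarized before groundedness checks")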
Inline Guardrails
Block dangerous LLM outputs before they reach your users.
The guard() method runs real-time AI classification and rule-based guardrails in under 50ms.
What it detects
- PII exposure (emails, phones, SSNs)
- Toxic or harmful content
- Prompt injection attempts
- Custom keyword/regex rules
Actions available
- Block - Prevent output entirely
- Redact - Mask sensitive data
- Warn - Flag but allow
- Allow - Safe content
Fail-Open by Default: If DriftRail is unavailable, content is allowed through. Your app never breaks. Configure guard_mode="fail_closed" for strict environments.
Python Guardrails
Basic Usage
from driftrail import DriftRail

client = DriftRail(api_key="dr_live_...", app_id="my-app")

# Get response from your LLM
llm_response = your_llm_call(user_prompt)

# Guard it BEFORE returning to user
result = client.guard(
    output=llm_response,
    input=user_prompt,
    mode="strict"
)

if result.allowed:
    return result.output  # May be redacted
else:
    print(f"Blocked: {[t.reason for t in result.triggered]}")
    return "Sorry, I can't help with that."
Guard Modes
| Mode | Behavior |
|---|---|
| strict | Blocks on medium+ risk (PII, moderate toxicity, prompt injection) |
| permissive | Only blocks on high risk (severe toxicity, high-risk injection) |
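One common pattern (illustrative, not part of the SDK) is to pick the mode from your deployment environment. This sketch reuses the client, user_prompt, and llm_response names from the Basic Usage example above; the ENV variable name is an assumption:

import os

# Stricter blocking in production, permissive elsewhere
mode = "strict" if os.environ.get("ENV") == "production" else "permissive"
result = client.guard(output=llm_response, input=user_prompt, mode=mode)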
Fail-Open vs Fail-Closed
from driftrail import DriftRail, GuardBlockedError

# Fail-open (default): If DriftRail is unavailable, content is allowed
client = DriftRail(api_key="...", app_id="...", guard_mode="fail_open")

# Fail-closed: If DriftRail is unavailable, raises exception
client = DriftRail(api_key="...", app_id="...", guard_mode="fail_closed")

try:
    result = client.guard(output=llm_response)
except GuardBlockedError as e:
    # Content was blocked
    print(f"Blocked: {e.result.triggered}")
Guard Response
result = client.guard(output="...")

result.allowed         # bool - True if content can be shown to user
result.action          # "allow" | "block" | "redact" | "warn"
result.output          # Original or redacted content
result.triggered       # List of triggered guardrails/classifications
result.classification  # AI classification details (risk_score, pii, toxicity, etc.)
result.latency_ms      # Processing time
result.fallback        # True if classification failed (fail-open)
FastAPI Example
from fastapi import FastAPI, HTTPException
from driftrail import DriftRail

app = FastAPI()
driftrail = DriftRail(api_key="...", app_id="my-api")

@app.post("/api/chat")
async def chat(prompt: str):
    # Get LLM response
    llm_response = await call_llm(prompt)

    # Guard before returning
    guard = driftrail.guard(output=llm_response, input=prompt)
    if not guard.allowed:
        raise HTTPException(
            status_code=400,
            detail={"error": "Content blocked", "reasons": [t.reason for t in guard.triggered]}
        )

    return {"response": guard.output}
Node.js Guardrails
Basic Usage
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({ apiKey: 'dr_live_...', appId: 'my-app' });

// Get response from your LLM
const llmResponse = await yourLLMCall(userPrompt);

// Guard it BEFORE returning to user
const result = await client.guard({
  output: llmResponse,
  input: userPrompt,
  mode: 'strict'
});

if (result.allowed) {
  return result.output; // May be redacted
} else {
  console.log('Blocked:', result.triggered.map(t => t.reason));
  return "Sorry, I can't help with that.";
}
Express Middleware Pattern
import express from 'express';
import { DriftRail } from '@drift_rail/sdk';

const app = express();
const driftrail = new DriftRail({ apiKey: '...', appId: 'my-app' });

app.post('/api/chat', async (req, res) => {
  const { prompt } = req.body;

  // Get LLM response
  const llmResponse = await callLLM(prompt);

  // Guard before returning
  const guard = await driftrail.guard({
    output: llmResponse,
    input: prompt
  });

  if (!guard.allowed) {
    return res.status(400).json({
      error: 'Content blocked',
      reasons: guard.triggered.map(t => t.reason)
    });
  }

  res.json({ response: guard.output });
});
POST /api/guard
Real-time content safety check. Returns allow/block/redact decision in <50ms.
Request
curl -X POST https://api.driftrail.com/api/guard \
  -H "Authorization: Bearer dr_live_..." \
  -H "X-App-Id: my-app" \
  -H "Content-Type: application/json" \
  -d '{
    "output": "The LLM response to check",
    "input": "Optional user prompt for context",
    "mode": "strict",
    "timeout_ms": 100
  }'
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| output | string | Yes | The LLM output to check |
| input | string | No | User prompt (helps detect prompt injection) |
| mode | string | No | "strict" (default) or "permissive" |
| timeout_ms | number | No | Classification timeout (default: 100, max: 500) |
Response
{
  "allowed": true,
  "action": "redact",
  "output": "Contact me at j***@***.com for details",
  "triggered": [
    {
      "type": "classification",
      "name": "PII Redaction",
      "reason": "Redacted email"
    }
  ],
  "classification": {
    "risk_score": 25,
    "pii": { "detected": true, "types": ["email"] },
    "toxicity": { "detected": false, "severity": "none" },
    "prompt_injection": { "detected": false, "risk": "none" }
  },
  "latency_ms": 42,
  "fallback": false
}
POST /api/ingest
Ingest a new inference event for monitoring and classification. Events are processed asynchronously.
Authentication
Include your API key in one of these headers:
- Authorization: Bearer dr_live_...
- X-API-Key: dr_live_...
Required Headers
POST /api/ingest HTTP/1.1
Content-Type: application/json
Authorization: Bearer dr_live_...
X-App-Id: my-application
Request Body
{
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ],
    "retrievedSources": [
      { "id": "doc-123", "content": "France is a country..." }
    ]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "toolCalls": []
  },
  "metadata": {
    "latencyMs": 420,
    "tokensIn": 25,
    "tokensOut": 12,
    "temperature": 0.7
  }
}
Response (202 Accepted)
{
  "success": true,
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "job_id": "job_a1b2c3d4..."
}
Error Responses
| Status | Description |
|---|---|
| 400 | Invalid JSON or schema validation failed |
| 401 | Missing or invalid API key |
| 429 | Rate limit or usage limit exceeded |
| 500 | Internal server error |
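If you call the REST API directly rather than through the SDK (which fails open on errors), a reasonable way to handle 429 responses is exponential backoff. This is an illustrative sketch using the requests library; the retry count and backoff schedule are assumptions, not documented SDK behavior:

import time
import requests

def ingest_with_retry(payload: dict, api_key: str, app_id: str, retries: int = 3):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-App-Id": app_id,
        "Content-Type": "application/json",
    }
    for attempt in range(retries + 1):
        resp = requests.post("https://api.driftrail.com/api/ingest",
                             json=payload, headers=headers, timeout=5)
        if resp.status_code != 429:
            return resp  # 202 on success; other errors handled by the caller
        # Back off exponentially before retrying a rate-limited request
        time.sleep(2 ** attempt)
    return resp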
GET /api/events
Retrieve and filter inference events. Supports pagination and filtering by app, environment, model, and time range.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| event_id | string | Get a specific event by ID |
| app_id | string | Filter by application |
| environment | string | prod, staging, or dev |
| model | string | Filter by model name |
| start_time | ISO 8601 | Start of time range |
| end_time | ISO 8601 | End of time range |
| limit | integer | Max results (default: 100) |
| offset | integer | Pagination offset |
Example Request
curl -X GET "https://api.driftrail.com/api/events?app_id=my-app&limit=50" \
  -H "Authorization: Bearer dr_live_..."
Response
{
  "events": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "timestamp": "2026-01-15T10:30:00Z",
      "model": "gpt-5",
      "provider": "openai",
      "app_id": "my-app",
      "environment": "prod",
      "latency_ms": 420,
      "tokens_in": 25,
      "tokens_out": 12
    }
  ],
  "total": 1250,
  "limit": 50,
  "offset": 0
}
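Since the response includes total, limit, and offset, you can page through all events with repeated requests. A sketch using the requests library directly (illustrative; the SDK may offer its own pagination helpers):

import requests

def iter_events(api_key: str, app_id: str, page_size: int = 100):
    offset = 0
    while True:
        resp = requests.get(
            "https://api.driftrail.com/api/events",
            params={"app_id": app_id, "limit": page_size, "offset": offset},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=10,
        )
        data = resp.json()
        if not data["events"]:
            break
        yield from data["events"]
        offset += page_size
        if offset >= data["total"]:
            break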
GET /api/classifications
Retrieve risk classification results. Includes endpoints for risk distribution and high-risk event alerts.
Endpoints
- GET /api/classifications - List all classifications with optional filters
- GET /api/classifications/distribution - Get risk level distribution (low, medium, high, critical)
- GET /api/classifications/high-risk - Get events above risk threshold (default: 70)
Risk Distribution Example
curl -X GET "https://api.driftrail.com/api/classifications/distribution?app_id=my-app" \
  -H "Authorization: Bearer dr_live_..."
Response
{
  "low": 850,
  "medium": 280,
  "high": 95,
  "critical": 15,
  "total": 1240
}
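For example, you could flag when high and critical events exceed some share of traffic. An illustrative check against the distribution response above (the 5% threshold is an assumption):

# Using the distribution response shown above
dist = {"low": 850, "medium": 280, "high": 95, "critical": 15, "total": 1240}
high_share = (dist["high"] + dist["critical"]) / dist["total"]
if high_share > 0.05:  # alert if more than 5% of events are high/critical risk
    print(f"High-risk share: {high_share:.1%}")  # 8.9% here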
High-Risk Events Query
curl -X GET "https://api.driftrail.com/api/classifications/high-risk?threshold=80&limit=20" \
  -H "Authorization: Bearer dr_live_..."
Response
{
  "classifications": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "risk_score": 92,
      "risk_level": "critical",
      "detected_issues": ["pii_exposure", "hallucination"],
      "classified_at": "2026-01-15T10:30:05Z"
    }
  ]
}
AI Playground
Dashboard Feature: Test AI models with real-time DriftRail safety monitoring. Every interaction runs through our full detection pipeline, giving you instant visibility into potential risks.
Features
- Interactive chat with multiple AI models (Gemini Flash Lite, Gemini Flash, GPT-5 Nano, Claude 4.5 Haiku)
- Real-time detection pipeline visualization
- Guardrail testing with automatic blocking
- Risk analysis for hallucination, PII, toxicity, prompt injection
- Toggle detections and streaming on/off
Supported Models
- Gemini Flash Lite - Ultra fast, lowest cost
- Gemini Flash - Latest Gemini model
- GPT-5 Nano - OpenAI's fastest model
- Claude 4.5 Haiku - Anthropic's efficient model
Usage Limits by Plan
| Plan | Monthly Messages | Cost |
|---|---|---|
| Starter | 25 | Free |
| Growth | 500 | $99/mo |
| Pro | 2,500 | $499/mo |
| Enterprise | 10,000+ | Custom |
API Access
The playground is also available via API for programmatic testing:
curl -X POST "https://api.driftrail.com/api-playground" \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "model": "gemini-flash-lite-latest",
    "runDetections": true
  }'
Response
{
  "content": "The capital of France is Paris.",
  "model": "gemini-2.5-flash-lite",
  "provider": "google",
  "detections": [
    {
      "type": "hallucination",
      "risk": "low",
      "confidence": 0.95,
      "details": "Response is factually accurate"
    }
  ],
  "latencyMs": 450,
  "tokensUsed": 25,
  "usage": {
    "current": 16,
    "limit": 500,
    "remaining": 484
  }
}
Try it now: Access the AI Playground from your dashboard to test models with real-time safety monitoring.
Serverless Environments
Critical: When deploying to serverless platforms (Vercel, Netlify, AWS Lambda, Cloudflare Workers), you must await the ingest call. Fire-and-forget methods cause race conditions where events are silently lost.
⚠️ #1 Cause of Missing Events: Using ingestAsync() in serverless environments. The function terminates before the HTTP request completes, and events are silently dropped.
📖 Deep Dive: Read our blog post Why Your LLM Observability Breaks on Vercel for a detailed explanation of the race condition and solutions.
The Race Condition
Serverless functions terminate immediately after returning a response. Any "fire and forget" HTTP requests get killed mid-flight: no error, no warning, just missing data.
| Environment | ingest() (awaited) | ingestAsync() |
|---|---|---|
| Vercel Serverless | ✅ Works | ❌ Loses events |
| Vercel Edge | ✅ Works | ❌ Loses events |
| Netlify Functions | ✅ Works | ❌ Loses events |
| AWS Lambda | ✅ Works | ❌ Loses events |
| Cloudflare Workers | ✅ Works | ❌ Loses events |
| Express / Fastify | ✅ Works | ✅ Safe |
✅ Correct: Await in Serverless
// Vercel, Netlify, Lambda, Cloudflare Workers
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const response = await callLLM(prompt);

  // MUST await - ensures request completes before function terminates
  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt },
    output: { text: response }
  });

  return Response.json({ response });
}
❌ Wrong: Fire-and-Forget in Serverless
// DON'T do this in serverless!
export async function POST(req: Request) {
  const response = await callLLM(prompt);
  client.ingestAsync({...}); // Race condition! Function terminates first
  return Response.json({ response });
  // HTTP request to DriftRail is killed here
}
Platform Examples
▲ Vercel (Next.js App Router)
// app/api/chat/route.ts
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_API_KEY!,
  appId: 'my-nextjs-app'
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const startTime = Date.now();

  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages
  });
  const response = completion.choices[0].message.content;

  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: messages.at(-1).content, messages },
    output: { text: response },
    metadata: { latencyMs: Date.now() - startTime }
  });

  return Response.json({ response });
}
N Netlify Functions
// netlify/functions/chat.ts
import { Handler } from '@netlify/functions';
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_API_KEY!,
  appId: 'my-netlify-app'
});

export const handler: Handler = async (event) => {
  const { prompt } = JSON.parse(event.body || '{}');
  const response = await callYourLLM(prompt);

  await client.ingest({
    model: 'claude-3.5-sonnet',
    provider: 'anthropic',
    input: { prompt },
    output: { text: response }
  });

  return { statusCode: 200, body: JSON.stringify({ response }) };
};
When Fire-and-Forget is Safe
ingestAsync() is only safe in long-running server processes where the process stays alive:
// Express, Fastify, Koa, etc. - process stays alive
app.post('/chat', async (req, res) => {
  const response = await getLLMResponse(req.body);

  // Safe: Express process continues running after response
  client.ingestAsync({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: req.body.message },
    output: { text: response }
  });

  res.json({ response });
});
Recommended Configuration
For serverless, configure timeouts and fail-open behavior to prevent observability from blocking your app:
const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_API_KEY!,
  appId: 'my-app',
  timeout: 5000,  // 5s max - don't let logging block too long
  failOpen: true  // Default: errors logged but don't crash your app
});
Streaming / SSE Integration
Best Practice: When logging streaming LLM responses (Server-Sent Events), timing matters. Always log before closing the stream.
The Problem
If you call DriftRail logging after closing the stream controller, the logging may never execute because the response context is already gone.
✅ Correct: Log Before Close
async function handleStream(controller: ReadableStreamDefaultController) {
  let fullResponse = '';
  const startTime = Date.now();

  for await (const chunk of llmStream) {
    controller.enqueue(chunk);
    fullResponse += chunk;
  }

  // Log BEFORE closing the stream
  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: userMessage },
    output: { text: fullResponse },
    metadata: { latencyMs: Date.now() - startTime }
  });

  controller.close(); // Close AFTER logging
}
❌ Wrong: Log After Close
// DON'T do this!
controller.close(); // Stream ends here
await client.ingest({...}); // Too late - may not execute
Complete Next.js Example
// app/api/chat/route.ts
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_KEY!,
  appId: 'my-app'
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const startTime = Date.now();

  const stream = new ReadableStream({
    async start(controller) {
      let fullResponse = '';

      const llmStream = await openai.chat.completions.create({
        model: 'gpt-4o',
        messages,
        stream: true
      });

      for await (const chunk of llmStream) {
        const text = chunk.choices[0]?.delta?.content || '';
        fullResponse += text;
        controller.enqueue(new TextEncoder().encode(text));
      }

      // IMPORTANT: await in serverless + log before close
      await client.ingest({
        model: 'gpt-4o',
        provider: 'openai',
        input: {
          prompt: messages[messages.length - 1].content,
          messages
        },
        output: { text: fullResponse },
        metadata: { latencyMs: Date.now() - startTime }
      });

      controller.close();
    }
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' }
  });
}
Guardrails with Streaming
For streaming with guardrails, you have two options:
Option 1: Post-Stream Check (Flag Only)
Check after streaming completes. The content has already been sent, but violations are logged for review.
// After stream completes
const guardResult = await client.guard({
  output: fullResponse,
  input: userMessage
});

if (!guardResult.allowed) {
  console.warn('Guardrail triggered:', guardResult.triggered);
}

controller.close();
Option 2: Pre-Stream Guard (Block)
Get full response first, guard it, then stream if safe. Adds latency but enables blocking.
// Get full response first
const fullResponse = await getLLMResponse(messages);
const guardResult = await client.guard({ output: fullResponse });

if (!guardResult.allowed) {
  return new Response(JSON.stringify({
    error: 'Content blocked'
  }), { status: 403 });
}

// Safe to stream
const stream = new ReadableStream({...});
Alerts & Notifications
Real-time Monitoring: DriftRail automatically monitors your AI system and generates alerts when anomalies are detected. Alerts help you catch issues before they impact users.
How Alerts Work
1. Continuous Monitoring - Every inference event is analyzed against your baseline metrics and configured thresholds.
2. Anomaly Detection - When metrics deviate significantly from baseline, an alert is created with severity based on deviation.
3. Instant Notification - Alerts trigger webhooks and integrations (Slack, Teams, Discord) for immediate visibility.
Severity Levels
- Critical - Immediate attention required. High-severity anomalies that may impact users.
- Warning - Notable deviation from baseline. Should be investigated soon.
- Info - Informational alerts for tracking trends and minor changes.
Alert Types
- Risk increase - Triggered when the average risk score for your application exceeds the baseline threshold.
- Latency spike - Triggered when inference latency significantly exceeds historical averages, indicating potential model or infrastructure issues.
- Error rate - Triggered when the error rate exceeds baseline thresholds, indicating potential model or API issues.
- Hallucination rate - Triggered when the hallucination detection rate increases significantly from baseline.
- Volume anomaly - Triggered when request volume deviates significantly from expected patterns. Can indicate attacks, outages, or viral usage.
- Token usage - Triggered when average token usage per request deviates from baseline, indicating prompt or response changes.
Webhook Events
Configure webhooks to receive real-time notifications when alerts are created. Webhooks are signed with HMAC-SHA256 for security.
Alert Webhook Events
- alert.created - Fired when any new alert is created
- alert.critical - Fired only for critical severity alerts
- usage.threshold - Fired when usage approaches plan limits
- classification.high_risk - Fired for high-risk classification results
Webhook Payload
{
  "event_type": "alert.created",
  "timestamp": "2025-01-03T10:30:00.000Z",
  "tenant_id": "tenant_abc123",
  "data": {
    "alert_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "alert_type": "risk_increase",
    "severity": "critical",
    "app_id": "my-chatbot",
    "model": "gpt-4",
    "current_value": 85.5,
    "baseline_value": 25.0,
    "deviation_percent": 242.0,
    "details": {
      "affected_events": 150,
      "time_window": "1h"
    }
  }
}
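The deviation_percent in this payload is consistent with a simple relative change from baseline:

current_value = 85.5
baseline_value = 25.0

# (85.5 - 25.0) / 25.0 * 100 = 242.0
deviation_percent = (current_value - baseline_value) / baseline_value * 100
print(deviation_percent)  # 242.0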
Signature Verification
Verify webhook authenticity using the X-Webhook-Signature header:
import crypto from 'crypto';

function verifyWebhook(signature: string, body: string, secret: string): boolean {
  const [timestampPart, signaturePart] = signature.split(',');
  const timestamp = timestampPart.split('=')[1];
  const receivedSig = signaturePart.split('=')[1];

  const expectedSig = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${body}`)
    .digest('hex');

  return crypto.timingSafeEqual(
    Buffer.from(receivedSig, 'hex'),
    Buffer.from(expectedSig, 'hex')
  );
}
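An equivalent check in Python, assuming the same comma-separated timestamp/signature header layout that the TypeScript example parses:

import hashlib
import hmac

def verify_webhook(signature_header: str, body: str, secret: str) -> bool:
    # Header format assumed from the example above, e.g. "t=<ts>,v1=<sig>"
    timestamp_part, signature_part = signature_header.split(",")
    timestamp = timestamp_part.split("=")[1]
    received_sig = signature_part.split("=")[1]

    expected_sig = hmac.new(
        secret.encode(), f"{timestamp}.{body}".encode(), hashlib.sha256
    ).hexdigest()

    # Constant-time comparison, mirroring crypto.timingSafeEqual
    return hmac.compare_digest(received_sig, expected_sig)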
Slack & Teams Integration
Send rich alert notifications directly to your team's communication channels. Supports Slack, Microsoft Teams, and Discord.
- Slack - Rich Block Kit messages
- Microsoft Teams - Adaptive Cards
- Discord - Webhook embeds
Getting Your Webhook URL
Slack Setup
1. Go to api.slack.com/apps and click "Create New App"
2. Choose "From scratch", name it "DriftRail Alerts", select your workspace
3. In the sidebar, click "Incoming Webhooks" → Toggle "Activate Incoming Webhooks" ON
4. Click "Add New Webhook to Workspace" → Select your alerts channel
5. Copy the webhook URL (starts with https://hooks.slack.com/services/...)
Microsoft Teams Setup
1. Open Teams and go to the channel where you want alerts
2. Click the ••• menu next to the channel name → "Connectors"
3. Search for "Incoming Webhook" and click "Configure"
4. Name it "DriftRail", optionally upload an icon, click "Create"
5. Copy the webhook URL (starts with https://outlook.office.com/webhook/...)
Discord Setup
1. Open Discord and go to your server's channel settings (gear icon)
2. Click "Integrations" → "Webhooks" → "New Webhook"
3. Name it "DriftRail Alerts", select the channel
4. Click "Copy Webhook URL" (starts with https://discord.com/api/webhooks/...)
Setup via API
# Register a Slack integration
curl -X POST "https://api.driftrail.com/api/integrations" \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "type": "slack",
    "webhook_url": "https://hooks.slack.com/services/T.../B.../xxx",
    "channel_name": "#ai-alerts",
    "events": ["alert.created", "alert.critical"]
  }'
Test Your Integration
# Send a test notification
curl -X POST "https://api.driftrail.com/api/integrations/test" \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "webhook_url": "https://hooks.slack.com/services/T.../B.../xxx",
    "type": "slack"
  }'
Dashboard Setup: You can also configure integrations from Dashboard → Integrations with a visual interface.
Ready to get started?
Create your free account and start monitoring your AI in under 5 minutes.
Get Started Free →