DriftRail SDK Documentation
Learn how to integrate DriftRail into your application to monitor, classify, and audit every LLM interaction in real-time.
3 Lines of Code
Drop-in integration with minimal configuration.
Fail-Open
Never breaks your production app. Async by default.
Multi-Language
Python, Node.js, and browser SDKs available.
Quickstart
Get up and running in under 5 minutes. Choose your preferred language:
# Install the SDK
pip install driftrail
from driftrail import DriftRail

# Initialize the client
client = DriftRail(
    api_key="dr_live_...",
    app_id="my-app"
)

# Log an LLM interaction
response = client.ingest(
    model="gpt-4o",
    provider="openai",
    input={"prompt": "What is the capital of France?"},
    output={"text": "The capital of France is Paris."}
)

print(f"Event ID: {response.event_id}")
API Keys
API keys are scoped by environment. We recommend using separate keys for development, staging, and production.
Getting Your API Key
When you sign up for DriftRail, an API key is automatically generated for your account.
Key Formats
| Environment | Prefix | Example |
|---|---|---|
| Production | dr_live_ | dr_live_a1b2c3d4... |
| Staging | dr_test_ | dr_test_e5f6g7h8... |
| Development | dr_test_ | dr_test_i9j0k1l2... |
Security Note: Never expose your API keys in client-side code or public repositories. Use environment variables.
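For example, a minimal pattern for loading the key from an environment variable in Python (the DRIFTRAIL_API_KEY variable name matches the one used in the serverless examples later in these docs):

import os
from driftrail import DriftRail

# Read the key from the environment instead of hard-coding it
client = DriftRail(
    api_key=os.environ["DRIFTRAIL_API_KEY"],  # e.g. dr_live_... in production
    app_id="my-app"
)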
Python SDK
The official Python SDK for DriftRail. Supports both sync and async usage patterns, with built-in inline guardrails and enterprise features.
Now Available: pip install driftrail. Version 2.1.0 on PyPI with 120+ enterprise methods.
Installation
pip install driftrail
# For async support
pip install driftrail[async]
Quick Start
from driftrail import DriftRail

client = DriftRail(
    api_key="dr_live_...",
    app_id="my-app"
)

# Log an LLM interaction
response = client.ingest(
    model="gpt-4o",
    provider="openai",
    input={"prompt": "What is the capital of France?"},
    output={"text": "The capital of France is Paris."},
    metadata={
        "latency_ms": 420,
        "tokens_in": 25,
        "tokens_out": 12,
        "temperature": 0.7
    }
)

print(f"Event ID: {response.event_id}")
Inline Guardrails
Block dangerous outputs before they reach users:
from driftrail import DriftRail

client = DriftRail(api_key="dr_live_...", app_id="my-app")

# Get response from your LLM
llm_response = your_llm_call(user_prompt)

# Guard it before returning to user
result = client.guard(
    output=llm_response,
    input=user_prompt,
    mode="strict"  # or "permissive"
)

if result.allowed:
    return result.output  # May be redacted if PII was found
else:
    print(f"Blocked: {[t.reason for t in result.triggered]}")
    return "Sorry, I can't help with that."
Async Usage
For async applications using aiohttp:
import asyncio
from driftrail import DriftRailAsync

async def main():
    async with DriftRailAsync(api_key="dr_live_...", app_id="my-app") as client:
        response = await client.ingest(
            model="claude-sonnet-4",
            provider="anthropic",
            input={"prompt": "Hello!"},
            output={"text": "Hi there!"}
        )
        print(f"Event ID: {response.event_id}")

asyncio.run(main())
Fire-and-Forget (Non-blocking)
For long-running server processes only:
# Won't block your main thread
client.ingest_async(
    model="gpt-4o",
    provider="openai",
    input={"prompt": "..."},
    output={"text": "..."}
)
⚠️ Serverless Warning: Do not use ingest_async() in AWS Lambda, Google Cloud Functions, or other serverless environments. Use the synchronous ingest() method instead.
RAG Source Tracking
Track which documents were used in RAG responses:
client.ingest(
    model="gpt-4o",
    provider="openai",
    input={
        "prompt": "What's our refund policy?",
        "retrieved_sources": [
            {"id": "doc-123", "content": "Refunds available within 30 days..."},
            {"id": "doc-456", "content": "Contact support for refund requests..."}
        ]
    },
    output={"text": "According to our policy, refunds are available within 30 days..."}
)
Fail-Open Architecture
By default, the SDK fails open. Errors are captured but won't crash your app:
client = DriftRail(
    api_key="dr_live_...",
    app_id="my-app",
    fail_open=True,         # Default: errors logged but don't raise
    guard_mode="fail_open"  # Default: if guard API unavailable, allow content
)

# Even if DriftRail is down, this won't raise
response = client.ingest(...)
if not response.success:
    print(f"Warning: {response.error}")
Enterprise Features
The DriftRailEnterprise client provides 120+ methods for full platform access:
from driftrail import DriftRailEnterprise
client = DriftRailEnterprise(api_key="dr_live_...", app_id="my-app")
# === Incident Management ===
stats = client.get_incident_stats()
incidents = client.list_incidents(status=["open"], severity=["critical"])
client.create_incident(title="High risk spike", severity="high", incident_type="risk_spike")
client.update_incident_status(incident_id, status="investigating")
# === Compliance & Reporting ===
compliance = client.get_compliance_status()
score = client.get_compliance_score()
reports = client.get_compliance_reports()
client.generate_compliance_report(framework="hipaa", format="pdf", include_ai_analysis=True)
client.create_custom_framework(name="Internal Policy", controls=[...])
# === Executive Dashboard ===
metrics = client.get_executive_metrics(period="7d")
targets = client.get_kpi_targets()
client.update_kpi_targets({"max_high_risk_percent": 5.0})
client.export_executive_metrics(period="30d", format="xlsx")
# === Model Analytics ===
summary = client.get_model_analytics_summary()
logs = client.get_historical_logs(model="gpt-4o", limit=100)
switches = client.get_model_switches()
client.record_model_switch(app_id="my-app", new_model="claude-4", new_provider="anthropic")
benchmarks = client.get_model_benchmarks()
client.calculate_model_benchmark(model="gpt-4o", days=7)
# === Drift Detection V3 ===
drift_score = client.get_drift_score()
heatmap = client.get_drift_heatmap(days=30)
thresholds = client.get_drift_thresholds()
client.update_drift_thresholds({"risk_score": {"warning": 15, "critical": 25}})
predictions = client.get_drift_predictions()
correlations = client.get_drift_deployment_correlations()
client.record_deployment(app_id="my-app", version="v2.1.0", deployment_type="release")
seasonality = client.get_seasonality_patterns()
distribution = client.get_distribution_analysis()
statistics = client.get_baseline_statistics()
# === Notification Channels ===
channels = client.get_notification_channels()
client.create_notification_channel(
    channel_type="slack",
    name="Alerts Channel",
    config={"webhook_url": "https://hooks.slack.com/..."},
    severity_filter=["critical", "warning"]
)
# === Drift Segments ===
segments = client.get_drift_segments()
client.create_drift_segment(name="Production GPT-4", filter_criteria={"model": "gpt-4o"})
# === Distributed Tracing ===
trace = client.start_trace(app_id="my-app", name="chat-completion", user_id="user-123")
span = client.start_span(trace_id=trace["trace_id"], name="llm-call", span_type="llm", model="gpt-4o")
client.end_span(span["span_id"], status="completed", tokens_in=100, tokens_out=50)
client.end_trace(trace["trace_id"])
traces = client.list_traces(status="completed", limit=50)
# === Prompt Management ===
prompt = client.create_prompt(name="Customer Support", content="You are a helpful assistant...")
version = client.create_prompt_version(prompt["prompt_id"], content="Updated prompt...", commit_message="v2")
client.deploy_prompt_version(version["version_id"], environment="production")
deployed = client.get_deployed_prompt(prompt["prompt_id"], environment="production")
# === Evaluation Framework ===
dataset = client.create_dataset(name="QA Test Set", schema_type="qa")
client.add_dataset_items(dataset["dataset_id"], items=[{"input": {...}, "expected_output": {...}}])
run = client.create_eval_run(dataset["dataset_id"], evaluators=[{"type": "exact_match"}])
results = client.get_eval_run(run["run_id"])
# === Semantic Caching ===
settings = client.get_cache_settings()
client.update_cache_settings({"is_enabled": True, "similarity_threshold": 0.95})
stats = client.get_cache_stats()
lookup = client.cache_lookup(input="What is the capital of France?")
client.cache_store(input="...", output="...", model="gpt-4o")
# === Agent Simulations ===
sim = client.create_simulation(name="Support Bot Test", scenario="User asks for refund")
run = client.run_simulation(sim["simulation_id"])
client.add_simulation_turn(run["run_id"], turn_number=1, role="user", content="I want a refund")
stats = client.get_simulation_stats()
# === Integrations ===
integrations = client.get_integrations()
client.create_integration(type="slack", webhook_url="https://...", events=["high_risk", "incident"])
client.test_integration(webhook_url="https://...", type="slack")
# === Benchmarks ===
industries = client.get_industries()
report = client.get_benchmark_report(industry="healthcare")
client.set_tenant_industry(industry="fintech")
# === Guardrails ===
guardrails = client.get_guardrails()
client.create_guardrail(name="PII Blocker", rule_type="pii", action="block")
stats = client.get_guardrail_stats()
# === Retention Policies ===
policies = client.get_retention_policies()
client.create_retention_policy(name="90 Day Retention", data_type="events", retention_days=90)
# === Audit Logs ===
logs = client.get_audit_logs(action="event.ingested", limit=100)
# === Events & Classifications ===
events = client.get_events(min_risk_score=70, limit=50)
event = client.get_event(event_id)
live = client.get_live_events()
classifications = client.get_classifications(min_score=0.8)
# === Custom Detections ===
detections = client.get_custom_detections()
client.create_custom_detection(name="Competitor Mention", detection_type="keyword", config={"keywords": [...]})
# === Webhooks ===
webhooks = client.get_webhooks()
client.create_webhook(url="https://...", events=["high_risk", "drift_alert"])
# === Stats ===
stats = client.get_stats(period="7d")
Available Enterprise Methods
Incidents & Compliance
- list_incidents, create_incident
- update_incident_status
- get_incident_stats
- get_compliance_status/score
- generate_compliance_report
- create_custom_framework
Drift Detection V3
- get_drift_score/heatmap
- get/update_drift_thresholds
- get_drift_predictions
- get_seasonality_patterns
- get_baseline_statistics
- record_deployment
Tracing & Prompts
- start/end_trace, start/end_span
- list_traces, get_trace
- create_prompt, create_prompt_version
- deploy_prompt_version
- get_deployed_prompt
- rollback_prompt
Evaluations & Cache
- create_dataset, add_dataset_items
- create_eval_run, get_eval_run
- submit_eval_result
- get/update_cache_settings
- cache_lookup, cache_store
- invalidate_cache, clear_cache
Simulations & Analytics
- create/run_simulation
- add_simulation_turn
- get_simulation_stats
- get_executive_metrics
- get_model_analytics_summary
- get_model_leaderboard
Configuration
- get/create_guardrail
- get/create_webhook
- get/create_integration
- get/create_retention_policy
- get/create_notification_channel
- get_audit_logs
Node.js SDK
Installation
npm install @drift_rail/sdk
# or
yarn add @drift_rail/sdk
# or
pnpm add @drift_rail/sdk
Fire-and-Forget (Non-blocking)
// Won't block your code
client.ingestAsync({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: '...' },
  output: { text: '...' }
});
OpenAI Chat Completions Helper
// Convenience method for chat completions
await client.logChatCompletion({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  response: 'Hi there! How can I help you today?',
  latencyMs: 420,
  tokensIn: 25,
  tokensOut: 12
});
TypeScript Support
import type {
  IngestParams,
  IngestResponse,
  Provider,
  InputPayload,
  OutputPayload
} from '@drift_rail/sdk';
Browser SDK
The browser SDK is designed for client-side applications. It uses batching and async delivery to minimize performance impact.
Initialization
import { initAIObservability, logInference } from '@drift_rail/sdk';
initAIObservability({
  apiKey: 'dr_live_...',
  appId: 'my-app',
  environment: 'prod',
  batchSize: 10,          // Send events in batches
  flushIntervalMs: 5000,  // Or every 5 seconds
  debug: false
});
Logging Events
logInference({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: 'User query here' },
  output: { text: 'AI response here' },
  metadata: {
    latencyMs: 350,
    temperature: 0.7
  }
});
Inference Events
An inference event captures a single LLM interaction, including the prompt, response, and metadata.
Event Schema
{
  "event_id": "evt_a1b2c3d4...",
  "timestamp": "2026-01-15T10:30:00Z",
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [...],
    "retrieved_sources": [...]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "tool_calls": [...]
  },
  "metadata": {
    "latency_ms": 420,
    "tokens_in": 25,
    "tokens_out": 12,
    "temperature": 0.7
  },
  "classification": {
    "risk_score": 15,
    "risk_level": "low",
    "detected_issues": []
  }
}
Risk Classification
Every event is automatically classified for risk using our AI-powered analysis engine. Risk scores are calculated using a weighted, confidence-adjusted algorithm that you can customize.
- Low (0-39) - Safe, no significant issues detected
- Medium (40-69) - Concerns detected, review suggested
- High (70-100) - Significant issues, attention required
Detection Types
| Detection | Default Weight | Description |
|---|---|---|
| hallucination | 0.20 | Detects fabricated or unsupported claims |
| policy_violation | 0.25 | Content that violates usage policies |
| toxicity | 0.20 | Harmful, offensive, or inappropriate content |
| prompt_injection | 0.15 | Attempts to manipulate the model |
| pii_detection | 0.15 | Personal data exposure (names, emails, SSNs) |
| factual_accuracy | 0.10 | Verifiable factual errors |
| confidence_degradation | 0.05 | Low model confidence in response |
| brand_safety | 0.10 | Content that may harm brand reputation |
Risk Scoring Algorithm
The final risk score is calculated using a production-ready algorithm:
1. Weighted Scoring - Each detection type contributes based on its configured weight (0-1). Weights can be customized per tenant.
2. Confidence Adjustment - Scores are scaled by √(confidence) to give diminishing returns for uncertain detections.
3. Compounding - Multiple significant detections (score ≥ 50) compound the final score by a configurable factor (default: 1.1x per detection, max 1.5x).
4. Custom Thresholds - Define your own boundaries for low/medium/high risk levels.
Pro Feature: Configure custom weights, thresholds, and compounding settings via the Risk Config API or in the Dashboard under Settings → Detections.
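To make these steps concrete, here is a small illustrative sketch of the computation in Python, using the default weights from the detection table above. It is a simplified model of the documented steps, not the exact production implementation, and the example detections are hypothetical:

import math

# Illustrative only: each detection has a raw score (0-100),
# a confidence (0-1), and a configured weight (0-1).
detections = [
    {"type": "pii_detection", "score": 80, "confidence": 0.9, "weight": 0.15},
    {"type": "toxicity", "score": 60, "confidence": 0.64, "weight": 0.20},
]

# Steps 1-2: weighted scoring, scaled by sqrt(confidence)
base_score = sum(
    d["score"] * d["weight"] * math.sqrt(d["confidence"]) for d in detections
)

# Step 3: each significant detection (raw score >= 50) compounds the
# result by 1.1x, capped at 1.5x overall
significant = sum(1 for d in detections if d["score"] >= 50)
multiplier = min(1.1 ** significant, 1.5)
risk_score = min(base_score * multiplier, 100)

# Step 4: thresholds map the score to a level (defaults shown above)
level = "low" if risk_score < 40 else "medium" if risk_score < 70 else "high"
print(round(risk_score, 1), level)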
RAG Sources
Track the sources your RAG system retrieves and verify that responses are grounded in those sources. DriftRail uses these sources to detect hallucinations and verify factual accuracy.
Source Schema
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique identifier for the source document |
| content | string | No | The retrieved text content (recommended for groundedness checks) |
| type | string | No | Source type (e.g., "document", "webpage", "database") |
| url | string | No | URL of the source document |
| metadata | object | No | Custom metadata (e.g., score, chunk_index) |
Python Example
client.ingest(
    model="gpt-5",
    provider="openai",
    input={
        "prompt": "What does our refund policy say?",
        "retrievedSources": [  # camelCase in API payload
            {
                "id": "doc-123",
                "content": "Refunds available within 30 days...",
                "type": "document",
                "url": "https://docs.example.com/refunds",
                "metadata": {"score": 0.95, "chunk_index": 2}
            },
            {
                "id": "doc-456",
                "content": "Contact support for refund requests..."
            }
        ]
    },
    output={"text": "According to our policy, refunds are available within 30 days..."}
)

# The Python SDK also accepts snake_case for convenience:
input={"retrieved_sources": [...]}
Node.js Example
await client.ingest({
  model: 'gpt-5',
  provider: 'openai',
  input: {
    prompt: 'What does our refund policy say?',
    retrievedSources: [
      { id: 'doc-123', content: 'Refunds available within 30 days...' },
      { id: 'doc-456', content: 'Contact support for refund requests...' }
    ]
  },
  output: { text: 'According to our policy, refunds are available within 30 days...' }
});
How RAG Groundedness Works
When you provide retrievedSources, DriftRail's classifier compares the LLM output against the source content to detect:
- Hallucinations - Claims not supported by sources
- Contradictions - Output that conflicts with source material
- False attributions - Incorrect citations or references
- Invented details - Specifics not present in sources
Note: For best groundedness detection, include the actual content of retrieved sources, not just IDs. Sources are automatically summarized if total context exceeds ~7,500 tokens.
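If you want to anticipate when summarization will kick in, a rough pre-check is possible with the common ~4 characters per token heuristic. This is an approximation only; the tokenizer DriftRail actually uses is not specified here:

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text
    return len(text) // 4

sources = [{"id": "doc-123", "content": "Refunds available within 30 days..."}]
total = sum(estimate_tokens(s.get("content", "")) for s in sources)
if total > 7500:
    print("Sources will likely be summarized before groundedness checks")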
Inline Guardrails
Block dangerous LLM outputs before they reach your users.
The guard() method runs real-time AI classification and rule-based guardrails in under 50ms.
What it detects
- PII exposure (emails, phones, SSNs)
- Toxic or harmful content
- Prompt injection attempts
- Custom keyword/regex rules
Actions available
- Block - Prevent output entirely
- Redact - Mask sensitive data
- Warn - Flag but allow
- Allow - Safe content
Fail-Open by Default: If DriftRail is unavailable, content is allowed through. Your app never breaks. Configure guard_mode="fail_closed" for strict environments.
Python Guardrails
Basic Usage
from driftrail import DriftRail

client = DriftRail(api_key="dr_live_...", app_id="my-app")

# Get response from your LLM
llm_response = your_llm_call(user_prompt)

# Guard it BEFORE returning to user
result = client.guard(
    output=llm_response,
    input=user_prompt,
    mode="strict"
)

if result.allowed:
    return result.output  # May be redacted
else:
    print(f"Blocked: {[t.reason for t in result.triggered]}")
    return "Sorry, I can't help with that."
Guard Modes
| Mode | Behavior |
|---|---|
| strict | Blocks on medium+ risk (PII, moderate toxicity, prompt injection) |
| permissive | Only blocks on high risk (severe toxicity, high-risk injection) |
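One common pattern (illustrative, not part of the SDK) is to pick the mode from your deployment environment. This sketch reuses the client, user_prompt, and llm_response names from the Basic Usage example above; the ENV variable name is an assumption:

import os

# Stricter blocking in production, permissive elsewhere
mode = "strict" if os.environ.get("ENV") == "production" else "permissive"
result = client.guard(output=llm_response, input=user_prompt, mode=mode)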
Fail-Open vs Fail-Closed
from driftrail import DriftRail, GuardBlockedError

# Fail-open (default): If DriftRail is unavailable, content is allowed
client = DriftRail(api_key="...", app_id="...", guard_mode="fail_open")

# Fail-closed: If DriftRail is unavailable, raises exception
client = DriftRail(api_key="...", app_id="...", guard_mode="fail_closed")

try:
    result = client.guard(output=llm_response)
except GuardBlockedError as e:
    # Content was blocked
    print(f"Blocked: {e.result.triggered}")
Guard Response
result = client.guard(output="...")

result.allowed         # bool - True if content can be shown to user
result.action          # "allow" | "block" | "redact" | "warn"
result.output          # Original or redacted content
result.triggered       # List of triggered guardrails/classifications
result.classification  # AI classification details (risk_score, pii, toxicity, etc.)
result.latency_ms      # Processing time
result.fallback        # True if classification failed (fail-open)
FastAPI Example
from fastapi import FastAPI, HTTPException
from driftrail import DriftRail

app = FastAPI()
driftrail = DriftRail(api_key="...", app_id="my-api")

@app.post("/api/chat")
async def chat(prompt: str):
    # Get LLM response
    llm_response = await call_llm(prompt)

    # Guard before returning
    guard = driftrail.guard(output=llm_response, input=prompt)
    if not guard.allowed:
        raise HTTPException(
            status_code=400,
            detail={"error": "Content blocked", "reasons": [t.reason for t in guard.triggered]}
        )

    return {"response": guard.output}
Node.js Guardrails
Basic Usage
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({ apiKey: 'dr_live_...', appId: 'my-app' });

// Get response from your LLM
const llmResponse = await yourLLMCall(userPrompt);

// Guard it BEFORE returning to user
const result = await client.guard({
  output: llmResponse,
  input: userPrompt,
  mode: 'strict'
});

if (result.allowed) {
  return result.output; // May be redacted
} else {
  console.log('Blocked:', result.triggered.map(t => t.reason));
  return "Sorry, I can't help with that.";
}
Express Middleware Pattern
import express from 'express';
import { DriftRail } from '@drift_rail/sdk';

const app = express();
const driftrail = new DriftRail({ apiKey: '...', appId: 'my-app' });

app.post('/api/chat', async (req, res) => {
  const { prompt } = req.body;

  // Get LLM response
  const llmResponse = await callLLM(prompt);

  // Guard before returning
  const guard = await driftrail.guard({
    output: llmResponse,
    input: prompt
  });

  if (!guard.allowed) {
    return res.status(400).json({
      error: 'Content blocked',
      reasons: guard.triggered.map(t => t.reason)
    });
  }

  res.json({ response: guard.output });
});
POST /api/guard
Real-time content safety check. Returns allow/block/redact decision in <50ms.
Request
curl -X POST https://api.driftrail.com/api/guard \
  -H "Authorization: Bearer dr_live_..." \
  -H "X-App-Id: my-app" \
  -H "Content-Type: application/json" \
  -d '{
    "output": "The LLM response to check",
    "input": "Optional user prompt for context",
    "mode": "strict",
    "timeout_ms": 100
  }'
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| output | string | Yes | The LLM output to check |
| input | string | No | User prompt (helps detect prompt injection) |
| mode | string | No | "strict" (default) or "permissive" |
| timeout_ms | number | No | Classification timeout (default: 100, max: 500) |
Response
{
  "allowed": true,
  "action": "redact",
  "output": "Contact me at j***@***.com for details",
  "triggered": [
    {
      "type": "classification",
      "name": "PII Redaction",
      "reason": "Redacted email"
    }
  ],
  "classification": {
    "risk_score": 25,
    "pii": { "detected": true, "types": ["email"] },
    "toxicity": { "detected": false, "severity": "none" },
    "prompt_injection": { "detected": false, "risk": "none" }
  },
  "latency_ms": 42,
  "fallback": false
}
POST /api/ingest
Ingest a new inference event for monitoring and classification. Events are processed asynchronously.
Authentication
Include your API key in one of these headers:
- Authorization: Bearer dr_live_...
- X-API-Key: dr_live_...
Required Headers
POST /api/ingest HTTP/1.1
Content-Type: application/json
Authorization: Bearer dr_live_...
X-App-Id: my-application
Request Body
{
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ],
    "retrievedSources": [
      { "id": "doc-123", "content": "France is a country..." }
    ]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "toolCalls": []
  },
  "metadata": {
    "latencyMs": 420,
    "tokensIn": 25,
    "tokensOut": 12,
    "temperature": 0.7
  }
}
Response (202 Accepted)
{
  "success": true,
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "job_id": "job_a1b2c3d4..."
}
Error Responses
| Status | Description |
|---|---|
| 400 | Invalid JSON or schema validation failed |
| 401 | Missing or invalid API key |
| 429 | Rate limit or usage limit exceeded |
| 500 | Internal server error |
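If you call the REST API directly rather than through the SDK (which fails open on errors), a reasonable way to handle 429 responses is exponential backoff. This is an illustrative sketch using the requests library; the retry count and backoff schedule are assumptions, not documented SDK behavior:

import time
import requests

def ingest_with_retry(payload: dict, api_key: str, app_id: str, retries: int = 3):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-App-Id": app_id,
        "Content-Type": "application/json",
    }
    for attempt in range(retries + 1):
        resp = requests.post("https://api.driftrail.com/api/ingest",
                             json=payload, headers=headers, timeout=5)
        if resp.status_code != 429:
            return resp  # 202 on success; other errors handled by the caller
        # Back off exponentially before retrying a rate-limited request
        time.sleep(2 ** attempt)
    return resp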
GET /api/events
Retrieve and filter inference events. Supports pagination and filtering by app, environment, model, and time range.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| event_id | string | Get a specific event by ID |
| app_id | string | Filter by application |
| environment | string | prod, staging, or dev |
| model | string | Filter by model name |
| start_time | ISO 8601 | Start of time range |
| end_time | ISO 8601 | End of time range |
| limit | integer | Max results (default: 100) |
| offset | integer | Pagination offset |
Example Request
curl -X GET "https://api.driftrail.com/api/events?app_id=my-app&limit=50" \
  -H "Authorization: Bearer dr_live_..."
Response
{
  "events": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "timestamp": "2026-01-15T10:30:00Z",
      "model": "gpt-5",
      "provider": "openai",
      "app_id": "my-app",
      "environment": "prod",
      "latency_ms": 420,
      "tokens_in": 25,
      "tokens_out": 12
    }
  ],
  "total": 1250,
  "limit": 50,
  "offset": 0
}
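Since the response includes total, limit, and offset, you can page through all events with repeated requests. A sketch using the requests library directly (illustrative; the SDK may offer its own pagination helpers):

import requests

def iter_events(api_key: str, app_id: str, page_size: int = 100):
    offset = 0
    while True:
        resp = requests.get(
            "https://api.driftrail.com/api/events",
            params={"app_id": app_id, "limit": page_size, "offset": offset},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=10,
        )
        data = resp.json()
        if not data["events"]:
            break
        yield from data["events"]
        offset += page_size
        if offset >= data["total"]:
            break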
GET /api/classifications
Retrieve risk classification results. Includes endpoints for risk distribution and high-risk event alerts.
Endpoints
- GET /api/classifications - List all classifications with optional filters
- GET /api/classifications/distribution - Get risk level distribution (low, medium, high, critical)
- GET /api/classifications/high-risk - Get events above risk threshold (default: 70)
Risk Distribution Example
curl -X GET "https://api.driftrail.com/api/classifications/distribution?app_id=my-app" \
  -H "Authorization: Bearer dr_live_..."
Response
{
  "low": 850,
  "medium": 280,
  "high": 95,
  "critical": 15,
  "total": 1240
}
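For example, you could flag when high and critical events exceed some share of traffic. An illustrative check against the distribution response above (the 5% threshold is an assumption):

# Using the distribution response shown above
dist = {"low": 850, "medium": 280, "high": 95, "critical": 15, "total": 1240}
high_share = (dist["high"] + dist["critical"]) / dist["total"]
if high_share > 0.05:  # alert if more than 5% of events are high/critical risk
    print(f"High-risk share: {high_share:.1%}")  # 8.9% here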
High-Risk Events Query
curl -X GET "https://api.driftrail.com/api/classifications/high-risk?threshold=80&limit=20" \
  -H "Authorization: Bearer dr_live_..."
Response
{
  "classifications": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "risk_score": 92,
      "risk_level": "critical",
      "detected_issues": ["pii_exposure", "hallucination"],
      "classified_at": "2026-01-15T10:30:05Z"
    }
  ]
}
AI Playground
Dashboard Feature: Test AI models with real-time DriftRail safety monitoring. Every interaction runs through our full detection pipeline, giving you instant visibility into potential risks.
Features
- Interactive chat with multiple AI models (Gemini Flash Lite, Gemini Flash, GPT-5 Nano, Claude 4.5 Haiku)
- Real-time detection pipeline visualization
- Guardrail testing with automatic blocking
- Risk analysis for hallucination, PII, toxicity, prompt injection
- Toggle detections and streaming on/off
Supported Models
- Gemini Flash Lite - Ultra fast, lowest cost
- Gemini Flash - Latest Gemini model
- GPT-5 Nano - OpenAI's fastest model
- Claude 4.5 Haiku - Anthropic's efficient model
Usage Limits by Plan
| Plan | Monthly Messages | Cost |
|---|---|---|
| Starter | 25 | Free |
| Growth | 500 | $99/mo |
| Pro | 2,500 | $499/mo |
| Enterprise | 10,000+ | Custom |
API Access
The playground is also available via API for programmatic testing:
curl -X POST "https://api.driftrail.com/api-playground" \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "model": "gemini-flash-lite-latest",
    "runDetections": true
  }'
Response
{
  "content": "The capital of France is Paris.",
  "model": "gemini-2.5-flash-lite",
  "provider": "google",
  "detections": [
    {
      "type": "hallucination",
      "risk": "low",
      "confidence": 0.95,
      "details": "Response is factually accurate"
    }
  ],
  "latencyMs": 450,
  "tokensUsed": 25,
  "usage": {
    "current": 16,
    "limit": 500,
    "remaining": 484
  }
}
Try it now: Access the AI Playground from your dashboard to test models with real-time safety monitoring.
Serverless Environments
Critical: When deploying to serverless platforms (Vercel, Netlify, AWS Lambda, Cloudflare Workers), you must await the ingest call. Fire-and-forget methods cause race conditions where events are silently lost.
⚠️ #1 Cause of Missing Events: Using ingestAsync() in serverless environments. The function terminates before the HTTP request completes, and events are silently dropped.
📖 Deep Dive: Read our blog post Why Your LLM Observability Breaks on Vercel for a detailed explanation of the race condition and solutions.
The Race Condition
Serverless functions terminate immediately after returning a response. Any "fire and forget" HTTP requests get killed mid-flight: no error, no warning, just missing data.
| Environment | ingest() (awaited) | ingestAsync() |
|---|---|---|
| Vercel Serverless | ✅ Works | ❌ Loses events |
| Vercel Edge | ✅ Works | ❌ Loses events |
| Netlify Functions | ✅ Works | ❌ Loses events |
| AWS Lambda | ✅ Works | ❌ Loses events |
| Cloudflare Workers | ✅ Works | ❌ Loses events |
| Express / Fastify | ✅ Works | ✅ Safe |
✅ Correct: Await in Serverless
// Vercel, Netlify, Lambda, Cloudflare Workers
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const response = await callLLM(prompt);

  // MUST await - ensures request completes before function terminates
  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt },
    output: { text: response }
  });

  return Response.json({ response });
}
❌ Wrong: Fire-and-Forget in Serverless
// DON'T do this in serverless!
export async function POST(req: Request) {
  const response = await callLLM(prompt);
  client.ingestAsync({...}); // Race condition! Function terminates first
  return Response.json({ response });
  // HTTP request to DriftRail is killed here
}
Platform Examples
▲ Vercel (Next.js App Router)
// app/api/chat/route.ts
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_API_KEY!,
  appId: 'my-nextjs-app'
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const startTime = Date.now();

  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages
  });
  const response = completion.choices[0].message.content;

  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: messages.at(-1).content, messages },
    output: { text: response },
    metadata: { latencyMs: Date.now() - startTime }
  });

  return Response.json({ response });
}
N Netlify Functions
// netlify/functions/chat.ts
import { Handler } from '@netlify/functions';
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_API_KEY!,
  appId: 'my-netlify-app'
});

export const handler: Handler = async (event) => {
  const { prompt } = JSON.parse(event.body || '{}');
  const response = await callYourLLM(prompt);

  await client.ingest({
    model: 'claude-3.5-sonnet',
    provider: 'anthropic',
    input: { prompt },
    output: { text: response }
  });

  return { statusCode: 200, body: JSON.stringify({ response }) };
};
When Fire-and-Forget is Safe
ingestAsync() is only safe in long-running server processes where the process stays alive:
// Express, Fastify, Koa, etc. - process stays alive
app.post('/chat', async (req, res) => {
  const response = await getLLMResponse(req.body);

  // Safe: Express process continues running after response
  client.ingestAsync({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: req.body.message },
    output: { text: response }
  });

  res.json({ response });
});
Recommended Configuration
For serverless, configure timeouts and fail-open behavior to prevent observability from blocking your app:
const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_API_KEY!,
  appId: 'my-app',
  timeout: 5000,  // 5s max - don't let logging block too long
  failOpen: true  // Default: errors logged but don't crash your app
});
Streaming / SSE Integration
Best Practice: When logging streaming LLM responses (Server-Sent Events), timing matters. Always log before closing the stream.
The Problem
If you call DriftRail logging after closing the stream controller, the logging may never execute because the response context is already gone.
✅ Correct: Log Before Close
async function handleStream(controller: ReadableStreamDefaultController) {
  let fullResponse = '';
  const startTime = Date.now();

  for await (const chunk of llmStream) {
    controller.enqueue(chunk);
    fullResponse += chunk;
  }

  // Log BEFORE closing the stream
  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: userMessage },
    output: { text: fullResponse },
    metadata: { latencyMs: Date.now() - startTime }
  });

  controller.close(); // Close AFTER logging
}
❌ Wrong: Log After Close
// DON'T do this!
controller.close(); // Stream ends here
await client.ingest({...}); // Too late - may not execute
Complete Next.js Example
// app/api/chat/route.ts
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: process.env.DRIFTRAIL_KEY!,
  appId: 'my-app'
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const startTime = Date.now();

  const stream = new ReadableStream({
    async start(controller) {
      let fullResponse = '';

      const llmStream = await openai.chat.completions.create({
        model: 'gpt-4o',
        messages,
        stream: true
      });

      for await (const chunk of llmStream) {
        const text = chunk.choices[0]?.delta?.content || '';
        fullResponse += text;
        controller.enqueue(new TextEncoder().encode(text));
      }

      // IMPORTANT: await in serverless + log before close
      await client.ingest({
        model: 'gpt-4o',
        provider: 'openai',
        input: {
          prompt: messages[messages.length - 1].content,
          messages
        },
        output: { text: fullResponse },
        metadata: { latencyMs: Date.now() - startTime }
      });

      controller.close();
    }
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' }
  });
}
Guardrails with Streaming
For streaming with guardrails, you have two options:
Option 1: Post-Stream Check (Flag Only)
Check after streaming completes. The content has already been sent, but violations are logged for review.
// After stream completes
const guardResult = await client.guard({
  output: fullResponse,
  input: userMessage
});

if (!guardResult.allowed) {
  console.warn('Guardrail triggered:', guardResult.triggered);
}

controller.close();
Option 2: Pre-Stream Guard (Block)
Get full response first, guard it, then stream if safe. Adds latency but enables blocking.
// Get full response first
const fullResponse = await getLLMResponse(messages);
const guardResult = await client.guard({ output: fullResponse });

if (!guardResult.allowed) {
  return new Response(JSON.stringify({
    error: 'Content blocked'
  }), { status: 403 });
}

// Safe to stream
const stream = new ReadableStream({...});
Alerts & Notifications
Real-time Monitoring: DriftRail automatically monitors your AI system and generates alerts when anomalies are detected. Alerts help you catch issues before they impact users.
How Alerts Work
1. Continuous Monitoring - Every inference event is analyzed against your baseline metrics and configured thresholds.
2. Anomaly Detection - When metrics deviate significantly from baseline, an alert is created with severity based on deviation.
3. Instant Notification - Alerts trigger webhooks and integrations (Slack, Teams, Discord) for immediate visibility.
Severity Levels
- Critical - Immediate attention required. High-severity anomalies that may impact users.
- Warning - Notable deviation from baseline. Should be investigated soon.
- Info - Informational alerts for tracking trends and minor changes.
Alert Types
- Risk increase - Triggered when the average risk score for your application exceeds the baseline threshold.
- Latency spike - Triggered when inference latency significantly exceeds historical averages, indicating potential model or infrastructure issues.
- Error rate - Triggered when the error rate exceeds baseline thresholds, indicating potential model or API issues.
- Hallucination rate - Triggered when the hallucination detection rate increases significantly from baseline.
- Volume anomaly - Triggered when request volume deviates significantly from expected patterns. Can indicate attacks, outages, or viral usage.
- Token usage - Triggered when average token usage per request deviates from baseline, indicating prompt or response changes.
Webhook Events
Configure webhooks to receive real-time notifications when alerts are created. Webhooks are signed with HMAC-SHA256 for security.
Alert Webhook Events
- alert.created - Fired when any new alert is created
- alert.critical - Fired only for critical severity alerts
- usage.threshold - Fired when usage approaches plan limits
- classification.high_risk - Fired for high-risk classification results
Webhook Payload
{
  "event_type": "alert.created",
  "timestamp": "2025-01-03T10:30:00.000Z",
  "tenant_id": "tenant_abc123",
  "data": {
    "alert_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "alert_type": "risk_increase",
    "severity": "critical",
    "app_id": "my-chatbot",
    "model": "gpt-4",
    "current_value": 85.5,
    "baseline_value": 25.0,
    "deviation_percent": 242.0,
    "details": {
      "affected_events": 150,
      "time_window": "1h"
    }
  }
}
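The deviation_percent in this payload is consistent with a simple relative change from baseline:

current_value = 85.5
baseline_value = 25.0

# (85.5 - 25.0) / 25.0 * 100 = 242.0
deviation_percent = (current_value - baseline_value) / baseline_value * 100
print(deviation_percent)  # 242.0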
Signature Verification
Verify webhook authenticity using the X-Webhook-Signature header:
import crypto from 'crypto';

function verifyWebhook(signature: string, body: string, secret: string): boolean {
  const [timestampPart, signaturePart] = signature.split(',');
  const timestamp = timestampPart.split('=')[1];
  const receivedSig = signaturePart.split('=')[1];

  const expectedSig = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${body}`)
    .digest('hex');

  return crypto.timingSafeEqual(
    Buffer.from(receivedSig, 'hex'),
    Buffer.from(expectedSig, 'hex')
  );
}
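An equivalent check in Python, assuming the same comma-separated timestamp/signature header layout that the TypeScript example parses:

import hashlib
import hmac

def verify_webhook(signature_header: str, body: str, secret: str) -> bool:
    # Header format assumed from the example above, e.g. "t=<ts>,v1=<sig>"
    timestamp_part, signature_part = signature_header.split(",")
    timestamp = timestamp_part.split("=")[1]
    received_sig = signature_part.split("=")[1]

    expected_sig = hmac.new(
        secret.encode(), f"{timestamp}.{body}".encode(), hashlib.sha256
    ).hexdigest()

    # Constant-time comparison, mirroring crypto.timingSafeEqual
    return hmac.compare_digest(received_sig, expected_sig)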
Slack & Teams Integration
Send rich alert notifications directly to your team's communication channels. Supports Slack, Microsoft Teams, and Discord.
- Slack - Rich Block Kit messages
- Microsoft Teams - Adaptive Cards
- Discord - Webhook embeds
Getting Your Webhook URL
Slack Setup
1. Go to api.slack.com/apps and click "Create New App"
2. Choose "From scratch", name it "DriftRail Alerts", select your workspace
3. In the sidebar, click "Incoming Webhooks" → Toggle "Activate Incoming Webhooks" ON
4. Click "Add New Webhook to Workspace" → Select your alerts channel
5. Copy the webhook URL (starts with https://hooks.slack.com/services/...)
Microsoft Teams Setup
1. Open Teams and go to the channel where you want alerts
2. Click the ••• menu next to the channel name → "Connectors"
3. Search for "Incoming Webhook" and click "Configure"
4. Name it "DriftRail", optionally upload an icon, click "Create"
5. Copy the webhook URL (starts with https://outlook.office.com/webhook/...)
Discord Setup
1. Open Discord and go to your server's channel settings (gear icon)
2. Click "Integrations" → "Webhooks" → "New Webhook"
3. Name it "DriftRail Alerts", select the channel
4. Click "Copy Webhook URL" (starts with https://discord.com/api/webhooks/...)
Setup via API
# Register a Slack integration
curl -X POST "https://api.driftrail.com/api/integrations" \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "type": "slack",
    "webhook_url": "https://hooks.slack.com/services/T.../B.../xxx",
    "channel_name": "#ai-alerts",
    "events": ["alert.created", "alert.critical"]
  }'
Test Your Integration
# Send a test notification
curl -X POST "https://api.driftrail.com/api/integrations/test" \
  -H "Authorization: Bearer dr_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "webhook_url": "https://hooks.slack.com/services/T.../B.../xxx",
    "type": "slack"
  }'
Dashboard Setup: You can also configure integrations from Dashboard → Integrations with a visual interface.
Ready to get started?
Create your free account and start monitoring your AI in under 5 minutes.
Get Started Free →