DriftRail SDK Documentation
Learn how to integrate DriftRail into your application to monitor, classify, and audit every LLM interaction in real time.
3 Lines of Code
Drop-in integration with minimal configuration.
Fail-Open
Never breaks your production app. Async by default.
Multi-Language
Python, Node.js, and browser SDKs available.
Quickstart
Get up and running in under 5 minutes. Choose your preferred language:
# Install the SDK
pip install driftrail
from driftrail import DriftRail
# Initialize the client
client = DriftRail(
    api_key="dr_prod_...",
    app_id="my-app"
)
# Log an LLM interaction
response = client.ingest(
    model="gpt-5",
    provider="openai",
    input={"prompt": "What is the capital of France?"},
    output={"text": "The capital of France is Paris."}
)
print(f"Event ID: {response.event_id}")
API Keys
API keys are scoped by environment. We recommend using separate keys for development, staging, and production.
Key Formats
| Environment | Prefix | Example |
|---|---|---|
| Production | dr_prod_ | dr_prod_a1b2c3d4... |
| Staging | dr_staging_ | dr_staging_e5f6g7h8... |
| Development | dr_dev_ | dr_dev_i9j0k1l2... |
Security Note: Never expose your API keys in client-side code or public repositories. Use environment variables.
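For example, load the key from an environment variable instead of hardcoding it. A minimal sketch (DRIFTRAIL_API_KEY is a variable name chosen for this example, not one the SDK reads automatically):
import os
from driftrail import DriftRail

# DRIFTRAIL_API_KEY is an illustrative variable name for this example;
# set it in your deployment environment rather than in source code.
client = DriftRail(
    api_key=os.environ["DRIFTRAIL_API_KEY"],
    app_id="my-app"
)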
Python SDK
Installation
pip install driftrail
# For async support
pip install "driftrail[async]"
Async Usage
import asyncio
from driftrail import DriftRailAsync
async def main():
    async with DriftRailAsync(api_key="...", app_id="my-app") as client:
        response = await client.ingest(
            model="claude-3",
            provider="anthropic",
            input={"prompt": "Hello"},
            output={"text": "Hi there!"}
        )
asyncio.run(main())
Fire-and-Forget (Non-blocking)
# Won't block your main thread
client.ingest_async(
    model="gpt-5",
    provider="openai",
    input={"prompt": "..."},
    output={"text": "..."}
)
With Metadata
import time
start = time.time()
# ... your LLM call ...
latency = int((time.time() - start) * 1000)
client.ingest(
    model="gpt-5",
    provider="openai",
    input={"prompt": "..."},
    output={"text": "..."},
    metadata={
        "latency_ms": latency,
        "tokens_in": 50,
        "tokens_out": 150,
        "temperature": 0.7
    }
)
Node.js SDK
Installation
npm install @driftrail/sdk
# or
yarn add @driftrail/sdk
# or
pnpm add @driftrail/sdk
Fire-and-Forget (Non-blocking)
// Won't block your code
client.ingestAsync({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: '...' },
  output: { text: '...' }
});
OpenAI Chat Completions Helper
// Convenience method for chat completions
await client.logChatCompletion({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  response: 'Hi there! How can I help you today?',
  latencyMs: 420,
  tokensIn: 25,
  tokensOut: 12
});
TypeScript Support
import type {
  IngestParams,
  IngestResponse,
  Provider,
  InputPayload,
  OutputPayload
} from '@driftrail/sdk';
Browser SDK
The browser SDK is designed for client-side applications. It uses batching and async delivery to minimize performance impact.
Initialization
import { initAIObservability, logInference } from '@driftrail/sdk';
initAIObservability({
  apiKey: 'dr_prod_...',
  appId: 'my-app',
  environment: 'prod',
  batchSize: 10,          // Send events in batches
  flushIntervalMs: 5000,  // Or every 5 seconds
  debug: false
});
Logging Events
logInference({
  model: 'gpt-4',
  provider: 'openai',
  input: { prompt: 'User query here' },
  output: { text: 'AI response here' },
  metadata: {
    latencyMs: 350,
    temperature: 0.7
  }
});
Inference Events
An inference event captures a single LLM interaction, including the prompt, response, and metadata.
Event Schema
{
  "event_id": "evt_a1b2c3d4...",
  "timestamp": "2024-01-15T10:30:00Z",
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [...],
    "retrieved_sources": [...]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "tool_calls": [...]
  },
  "metadata": {
    "latency_ms": 420,
    "tokens_in": 25,
    "tokens_out": 12,
    "temperature": 0.7
  },
  "classification": {
    "risk_score": 15,
    "risk_level": "low",
    "detected_issues": []
  }
}
Risk Classification
Every event is automatically classified for risk using our AI-powered analysis engine.
| Risk Level | Score Range | Meaning |
|---|---|---|
| Low | 0-30 | Safe, no issues detected |
| Medium | 31-60 | Minor concerns, review suggested |
| High | 61-85 | Significant issues detected |
| Critical | 86-100 | Immediate attention required |
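If you need to bucket raw risk scores yourself, for example when aggregating events locally, a small helper that mirrors these bands might look like this sketch:
def risk_level(score: int) -> str:
    # Mirrors the documented bands: 0-30 low, 31-60 medium,
    # 61-85 high, 86-100 critical.
    if score <= 30:
        return "low"
    if score <= 60:
        return "medium"
    if score <= 85:
        return "high"
    return "critical"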
Detection Types
- Hallucination detection
- PII exposure (names, emails, SSNs)
- Policy violations
- Prompt injection attempts
- Jailbreak attempts
- Toxic/harmful content
- Off-topic responses
- RAG groundedness issues
RAG Sources
Track the sources your RAG system retrieves and verify that responses are grounded in those sources.
client.ingest(
    model="gpt-5",
    provider="openai",
    input={
        "prompt": "What does our refund policy say?",
        "retrieved_sources": [
            {"id": "doc-123", "content": "Refunds available within 30 days..."},
            {"id": "doc-456", "content": "Contact support for refund requests..."}
        ]
    },
    output={"text": "According to our policy, refunds are available within 30 days..."}
)
Inline Guardrails
Block dangerous LLM outputs before they reach your users.
The guard() method runs real-time AI classification and rule-based guardrails in under 50ms.
What it detects
- PII exposure (emails, phones, SSNs)
- Toxic or harmful content
- Prompt injection attempts
- Custom keyword/regex rules
Actions available
- Block - Prevent output entirely
- Redact - Mask sensitive data
- Warn - Flag but allow
- Allow - Safe content
Fail-Open by Default: If DriftRail is unavailable, content is allowed through.
Your app never breaks. Configure guard_mode="fail_closed" for strict environments.
Python Guardrails
Basic Usage
from driftrail import DriftRail

client = DriftRail(api_key="dr_prod_...", app_id="my-app")

def answer(user_prompt):
    # Get response from your LLM
    llm_response = your_llm_call(user_prompt)

    # Guard it BEFORE returning to user
    result = client.guard(
        output=llm_response,
        input=user_prompt,
        mode="strict"  # or "permissive"
    )

    if result.allowed:
        # Safe to return (may be redacted if PII was found)
        return result.output
    else:
        # Content was blocked
        print(f"Blocked: {[t.reason for t in result.triggered]}")
        return "Sorry, I can't help with that."
Guard Modes
| Mode | Blocks On | Use Case |
|---|---|---|
| strict | Medium+ risk (PII, moderate toxicity, prompt injection) | Healthcare, finance, compliance-heavy apps |
| permissive | High risk only (severe toxicity, high-risk injection) | General apps, creative tools |
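Continuing the Basic Usage example above, one practical pattern is to derive the mode from your deployment environment. A hypothetical sketch (APP_ENV is an illustrative variable name, not something the SDK reads):
import os

# Hypothetical: strict guarding in production, permissive elsewhere.
# APP_ENV is our own convention for this example.
mode = "strict" if os.environ.get("APP_ENV") == "prod" else "permissive"
result = client.guard(output=llm_response, input=user_prompt, mode=mode)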
Fail-Closed Mode
from driftrail import DriftRail, GuardBlockedError

# Fail-closed: raises an exception if content is blocked or the service is unavailable
client = DriftRail(
    api_key="dr_prod_...",
    app_id="my-app",
    guard_mode="fail_closed"
)

def safe_output(llm_response):
    try:
        result = client.guard(output=llm_response)
        return result.output
    except GuardBlockedError as e:
        # Content was blocked
        log_security_event(e.result.triggered)
        return "Content blocked for safety reasons."
Node.js Guardrails
Basic Usage
import { DriftRail } from '@driftrail/sdk';

const client = new DriftRail({ apiKey: 'dr_prod_...', appId: 'my-app' });

async function answer(userPrompt) {
  // Get response from your LLM
  const llmResponse = await yourLLMCall(userPrompt);

  // Guard it BEFORE returning to user
  const result = await client.guard({
    output: llmResponse,
    input: userPrompt,
    mode: 'strict'
  });

  if (result.allowed) {
    return result.output; // May be redacted
  } else {
    console.log('Blocked:', result.triggered.map(t => t.reason));
    return "Sorry, I can't help with that.";
  }
}
Express Middleware Pattern
import express from 'express';
import { DriftRail } from '@driftrail/sdk';

const app = express();
app.use(express.json()); // Needed so req.body is parsed

const driftrail = new DriftRail({ apiKey: '...', appId: 'my-app' });

app.post('/api/chat', async (req, res) => {
  const { prompt } = req.body;

  // Get LLM response
  const llmResponse = await callLLM(prompt);

  // Guard before returning
  const guard = await driftrail.guard({
    output: llmResponse,
    input: prompt
  });

  if (!guard.allowed) {
    return res.status(400).json({
      error: 'Content blocked',
      reasons: guard.triggered.map(t => t.reason)
    });
  }

  res.json({ response: guard.output });
});
POST /api/guard
Real-time content safety check. Returns allow/block/redact decision in <50ms.
Request
curl -X POST https://api.driftrail.com/api/guard \
  -H "Authorization: Bearer dr_prod_..." \
  -H "X-App-Id: my-app" \
  -H "Content-Type: application/json" \
  -d '{
    "output": "The LLM response to check",
    "input": "Optional user prompt for context",
    "mode": "strict",
    "timeout_ms": 100
  }'
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| output | string | Yes | The LLM output to check |
| input | string | No | User prompt (helps detect prompt injection) |
| mode | string | No | "strict" (default) or "permissive" |
| timeout_ms | number | No | Classification timeout (default: 100, max: 500) |
Response
{
  "allowed": true,
  "action": "redact",
  "output": "Contact me at j***@***.com for details",
  "triggered": [
    {
      "type": "classification",
      "name": "PII Redaction",
      "reason": "Redacted email"
    }
  ],
  "classification": {
    "risk_score": 25,
    "pii": { "detected": true, "types": ["email"] },
    "toxicity": { "detected": false, "severity": "none" },
    "prompt_injection": { "detected": false, "risk": "none" }
  },
  "latency_ms": 42,
  "fallback": false
}
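If you are not using one of the SDKs, you can call the endpoint directly from any HTTP client. A minimal Python sketch using requests (check_output is an illustrative helper, not part of the SDK):
from typing import Optional

import requests

def check_output(output: str, user_input: Optional[str] = None) -> dict:
    # Calls POST /api/guard directly, with the headers and body
    # documented above. "input" is only sent when provided.
    body = {"output": output, "mode": "strict"}
    if user_input is not None:
        body["input"] = user_input
    resp = requests.post(
        "https://api.driftrail.com/api/guard",
        headers={
            "Authorization": "Bearer dr_prod_...",
            "X-App-Id": "my-app",
            "Content-Type": "application/json",
        },
        json=body,
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()

result = check_output("The LLM response to check")
if not result["allowed"]:
    print("Blocked:", [t["reason"] for t in result["triggered"]])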
POST /api/ingest
Ingest a new inference event for monitoring and classification. Events are processed asynchronously.
Authentication
Include your API key in one of these headers:
- Authorization: Bearer dr_prod_...
- X-API-Key: dr_prod_...
Required Headers
POST /api/ingest HTTP/1.1
Content-Type: application/json
Authorization: Bearer dr_prod_...
X-App-Id: my-application
Request Body
{
  "model": "gpt-5",
  "provider": "openai",
  "input": {
    "prompt": "What is the capital of France?",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ],
    "retrievedSources": [
      { "id": "doc-123", "content": "France is a country..." }
    ]
  },
  "output": {
    "text": "The capital of France is Paris.",
    "toolCalls": []
  },
  "metadata": {
    "latencyMs": 420,
    "tokensIn": 25,
    "tokensOut": 12,
    "temperature": 0.7
  }
}
Response (202 Accepted)
{
  "success": true,
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "job_id": "job_a1b2c3d4..."
}
Error Responses
| Status | Description |
|---|---|
| 400 | Invalid JSON or schema validation failed |
| 401 | Missing or invalid API key |
| 429 | Rate limit or usage limit exceeded |
| 500 | Internal server error |
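Because the API returns 429 when rate or usage limits are exceeded, direct REST callers may want a simple retry with backoff. A sketch using requests (the backoff policy is illustrative, not a documented recommendation):
import time

import requests

def ingest_event(event: dict, retries: int = 3) -> dict:
    # POST /api/ingest returns 202 Accepted on success.
    for attempt in range(retries):
        resp = requests.post(
            "https://api.driftrail.com/api/ingest",
            headers={
                "Authorization": "Bearer dr_prod_...",
                "X-App-Id": "my-application",
                "Content-Type": "application/json",
            },
            json=event,
            timeout=5,
        )
        if resp.status_code == 429:
            # Rate limited: back off exponentially, then retry.
            time.sleep(2 ** attempt)
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("rate limited after retries")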
GET /api/events
Retrieve and filter inference events. Supports pagination and filtering by app, environment, model, and time range.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| event_id | string | Get a specific event by ID |
| app_id | string | Filter by application |
| environment | string | prod, staging, or dev |
| model | string | Filter by model name |
| start_time | ISO 8601 | Start of time range |
| end_time | ISO 8601 | End of time range |
| limit | integer | Max results (default: 100) |
| offset | integer | Pagination offset |
Example Request
curl -X GET "https://api.driftrail.com/api/events?app_id=my-app&limit=50" \
  -H "Authorization: Bearer dr_prod_..."
Response
{
  "events": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "timestamp": "2024-01-15T10:30:00Z",
      "model": "gpt-5",
      "provider": "openai",
      "app_id": "my-app",
      "environment": "prod",
      "latency_ms": 420,
      "tokens_in": 25,
      "tokens_out": 12
    }
  ],
  "total": 1250,
  "limit": 50,
  "offset": 0
}
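Since responses include total, limit, and offset, you can page through every event with a small loop. A sketch using requests (iter_events is an illustrative helper):
import requests

def iter_events(app_id: str, page_size: int = 100):
    # Pages through GET /api/events using the documented
    # limit/offset parameters until total is exhausted.
    offset = 0
    while True:
        resp = requests.get(
            "https://api.driftrail.com/api/events",
            headers={"Authorization": "Bearer dr_prod_..."},
            params={"app_id": app_id, "limit": page_size, "offset": offset},
            timeout=5,
        )
        resp.raise_for_status()
        data = resp.json()
        yield from data["events"]
        offset += page_size
        if offset >= data["total"]:
            break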
GET /api/classifications
Retrieve risk classification results. Includes endpoints for risk distribution and high-risk event alerts.
Endpoints
- GET /api/classifications: List all classifications with optional filters
- GET /api/classifications/distribution: Get risk level distribution (low, medium, high, critical)
- GET /api/classifications/high-risk: Get events above risk threshold (default: 70)
Risk Distribution Example
curl -X GET "https://api.driftrail.com/api/classifications/distribution?app_id=my-app" \
  -H "Authorization: Bearer dr_prod_..."
Response
{
  "low": 850,
  "medium": 280,
  "high": 95,
  "critical": 15,
  "total": 1240
}
High-Risk Events Query
curl -X GET "https://api.driftrail.com/api/classifications/high-risk?threshold=80&limit=20" \
  -H "Authorization: Bearer dr_prod_..."
Response
{
  "classifications": [
    {
      "event_id": "550e8400-e29b-41d4-a716-446655440000",
      "risk_score": 92,
      "risk_level": "critical",
      "detected_issues": ["pii_exposure", "hallucination"],
      "classified_at": "2024-01-15T10:30:05Z"
    }
  ]
}
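A common pattern is to poll the high-risk endpoint on a schedule and forward anything above your threshold to an alerting system. A sketch (the print call stands in for a real alert integration):
import requests

def alert_on_high_risk(threshold: int = 80) -> None:
    # Poll GET /api/classifications/high-risk as documented above.
    resp = requests.get(
        "https://api.driftrail.com/api/classifications/high-risk",
        headers={"Authorization": "Bearer dr_prod_..."},
        params={"threshold": threshold, "limit": 20},
        timeout=5,
    )
    resp.raise_for_status()
    for c in resp.json()["classifications"]:
        # Stand-in for a real alerting hook (Slack, PagerDuty, etc.).
        print(f"[{c['risk_level']}] event {c['event_id']}: "
              f"score={c['risk_score']} issues={c['detected_issues']}")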
AI Playground
Dashboard Feature
Test AI models with real-time DriftRail safety monitoring. Every interaction runs through our full detection pipeline, giving you instant visibility into potential risks.
Features
- Interactive chat with multiple AI models (Gemini 2.5 Flash Lite, Gemini 3 Flash, GPT-5 Nano, Claude 4.5 Haiku)
- Real-time detection pipeline visualization
- Guardrail testing with automatic blocking
- Risk analysis for hallucination, PII, toxicity, prompt injection
- Toggle detections and streaming on/off
Supported Models
- Gemini 2.5 Flash Lite: ultra fast, lowest cost
- Gemini 3 Flash: latest Gemini model
- GPT-5 Nano: OpenAI's fastest model
- Claude 4.5 Haiku: Anthropic's efficient model
Usage Limits by Plan
| Plan | Monthly Messages | Cost |
|---|---|---|
| Starter | 25 | Free |
| Growth | 500 | $99/mo |
| Pro | 2,500 | $499/mo |
| Enterprise | 10,000+ | Custom |
API Access
The playground is also available via API for programmatic testing:
curl -X POST "https://api.driftrail.com/api-playground" \
  -H "Authorization: Bearer dr_prod_..." \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "model": "gemini-flash-lite-latest",
    "runDetections": true
  }'
Response
{
  "content": "The capital of France is Paris.",
  "model": "gemini-2.5-flash-lite",
  "provider": "google",
  "detections": [
    {
      "type": "hallucination",
      "risk": "low",
      "confidence": 0.95,
      "details": "Response is factually accurate"
    }
  ],
  "latencyMs": 450,
  "tokensUsed": 25,
  "usage": {
    "current": 16,
    "limit": 500,
    "remaining": 484
  }
}
Try it now: Access the AI Playground from your dashboard to test models with real-time safety monitoring.
Ready to get started?
Create your free account and start monitoring your AI in under 5 minutes.
Get Started Free →