Guardrails and Monitors
Weave provides a robust framework for implementing safety controls and monitoring systems in LLM applications through a unified scoring system. This guide explains how to leverage scorers as both guardrails for active intervention and monitors for passive evaluation in production environments.
Core Concepts
The foundation of Weave's evaluation system is the `Scorer` class. This abstract base class defines a scoring interface through its `score` method, which concrete implementations use to provide specific evaluation metrics. For a comprehensive overview of available metrics and custom scorer implementation, see Evaluation Metrics.
Here's a basic example of a custom scorer:
```python
from weave import Scorer

class MyScorer(Scorer):
    def score(self, output: str) -> float:
        """
        Evaluate the given output and return a score between 0 and 1.

        Args:
            output: The LLM-generated content to evaluate

        Returns:
            float: Score indicating quality/safety (0 = fail, 1 = pass)
        """
        return 0.8  # Example score
```
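You can sanity-check a scorer by calling its `score` method directly before wiring it into an operation:

```python
scorer = MyScorer()
print(scorer.score(output="Some model output"))  # 0.8
```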
Applying Scorers
Scorers are applied to operations using the `apply_scorer` method, which returns an `ApplyScorerResult`:
```python
@dataclass
class ApplyScorerSuccess:
    result: Any       # The original operation result
    score_call: Call  # The scoring operation call object
```
Basic scorer application:
```python
# Get both operation result and Call object
result, call = op.call(user_input)

# Apply scorer and get evaluation results
evaluation = await call.apply_scorer(scorer)
```
Always use `op.call(user_input)` rather than direct invocation (`op(user_input)`) when working with scorers. This method returns both the operation result and a `Call` object required for scorer application.
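Putting these pieces together, here is a minimal end-to-end sketch. It assumes the `MyScorer` class from above; the `generate` op and the project name are placeholders for your own:

```python
import asyncio
import weave

weave.init("my-project")  # placeholder project name

@weave.op
def generate(user_input: str) -> str:
    # Stand-in for a real LLM call
    return f"Echo: {user_input}"

async def main():
    # .call() returns the output plus the Call object needed for scoring
    result, call = generate.call("hello")
    evaluation = await call.apply_scorer(MyScorer())
    print(result)
    print(evaluation.score_call)  # the scoring operation's Call object

asyncio.run(main())
```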
Guardrails
Guardrails provide active safety mechanisms by evaluating LLM outputs in real-time and intervening based on scorer results. They are essential for preventing inappropriate or harmful content generation in production systems.
Implementation
```python
async def process_with_guardrail(user_input: str) -> str:
    """
    Process user input with safety guardrails.

    Args:
        user_input: The user's input to process

    Returns:
        str: Processed result if guardrail passes, fallback response if it fails
    """
    result, call = op.call(user_input)
    evaluation = await call.apply_scorer(guardrail)

    if evaluation.score < 0.5:
        return handle_failed_guardrail(result)
    return result
```
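The `guardrail` object above is just a scorer instance, and `handle_failed_guardrail` is your own fallback logic. As an illustration, here is a sketch of a simple blocklist-based guardrail (the class, term list, and fallback message are hypothetical; production guardrails typically use model-based scorers for toxicity, PII, and similar checks):

```python
from weave import Scorer

BLOCKED_TERMS = ["password", "api key"]  # hypothetical blocklist

class BlocklistGuardrail(Scorer):
    def score(self, output: str) -> float:
        """Return 0.0 (fail) if the output mentions a blocked term, else 1.0 (pass)."""
        lowered = output.lower()
        return 0.0 if any(term in lowered for term in BLOCKED_TERMS) else 1.0

def handle_failed_guardrail(result: str) -> str:
    """Fallback response returned when the guardrail blocks an output."""
    return "Sorry, I can't help with that request."

guardrail = BlocklistGuardrail()
```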
Monitors
While guardrails provide active intervention, monitors offer passive evaluation and tracking of LLM operations. They are crucial for long-term quality assurance and system improvement.
Implementation
```python
import random

async def monitored_operation(user_input: str, sampling_rate: float = 0.25) -> str:
    """
    Execute operation with monitoring.

    Args:
        user_input: The input to process
        sampling_rate: Fraction of operations to monitor (0.0 to 1.0)

    Returns:
        str: Operation result
    """
    result, call = op.call(user_input)

    # Apply monitoring based on sampling rate
    if random.random() < sampling_rate:
        await call.apply_scorer(scorer)

    return result
```
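Calling the monitored operation is no different from calling any other async function; over many requests, roughly a `sampling_rate` fraction of calls get scored:

```python
import asyncio

async def main():
    for user_input in ["first request", "second request", "third request"]:
        response = await monitored_operation(user_input, sampling_rate=0.25)
        print(response)

asyncio.run(main())
```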
Scorer evaluations execute synchronously on the same machine as the operation. For high-throughput production environments, consider adjusting sampling rates based on load. Weave will soon support server-side scoring for high-throughput applications.
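For example, a deployment might choose its sampling rate from current traffic with a simple policy like the following (purely illustrative; the thresholds and rates are assumptions to tune for your workload):

```python
def sampling_rate_for_load(requests_per_minute: float) -> float:
    """Hypothetical policy: monitor less aggressively as traffic grows."""
    if requests_per_minute < 100:
        return 1.0    # low volume: score every call
    if requests_per_minute < 1000:
        return 0.25   # moderate volume: score a quarter of calls
    return 0.05       # high volume: spot-check only
```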
Analysis and Observability
Accessing Scorer Results
All scorer results are automatically logged as Feedback records in Weave, accessible through multiple interfaces:
- UI Dashboard: Access detailed scoring history in the Call details page
- Call Tables: Filter and analyze scores across operations
- Programmatic Access: Query results through API endpoints
Data Access Examples
HTTP API
```python
calls = client.server.calls_query_stream({
    # ... your filters
    "include_feedback": True,  # Include all scorer results
})
```
Python SDK
```python
# Retrieve comprehensive feedback data for a specific call
call = client.get_call(call_id)
feedback_data = call.feedback
```
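You can then iterate over the returned feedback records; the exact record fields depend on your Weave version, so treat the attribute names below as assumptions:

```python
# Each record corresponds to one feedback entry on the call, including
# scorer results. `feedback_type` and `payload` are assumed field names.
for record in feedback_data:
    print(record.feedback_type, record.payload)
```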
Next Steps
- Deep dive into Evaluation Metrics