weave
The top-level functions and classes for working with Weave.
API Overview
Classes
obj.Object
dataset.Dataset
: Dataset object with easy saving and automatic versioning.
model.Model
: Intended to capture a combination of code and data that operates on an input.
prompt.Prompt
prompt.StringPrompt
prompt.MessagesPrompt
eval.Evaluation
: Sets up an evaluation which includes a set of scorers and a dataset.
eval_imperative.EvaluationLogger
: Provides an imperative interface for logging evaluations.
scorer.Scorer
annotation_spec.AnnotationSpec
markdown.Markdown
: A Markdown renderable.
monitor.Monitor
: Sets up a monitor to score incoming calls automatically.
saved_view.SavedView
: A fluent-style class for working with SavedView objects.
Functions
api.init
: Initialize weave tracking, logging to a wandb project.
api.publish
: Save and version a python object.
api.ref
: Construct a Ref to a Weave object.
call_context.require_current_call
: Get the Call object for the currently executing Op, within that Op.
call_context.get_current_call
: Get the Call object for the currently executing Op, within that Op.
api.finish
: Stops logging to weave.
op.op
: A decorator to weave op-ify a function or method. Works for both sync and async.
api.attributes
: Context manager for setting attributes on a call.
function init
init(
project_name: 'str',
settings: 'UserSettings | dict[str, Any] | None' = None,
autopatch_settings: 'AutopatchSettings | None' = None,
global_postprocess_inputs: 'PostprocessInputsFunc | None' = None,
global_postprocess_output: 'PostprocessOutputFunc | None' = None,
global_attributes: 'dict[str, Any] | None' = None
) → WeaveClient
Initialize weave tracking, logging to a wandb project.
Logging is initialized globally, so you do not need to keep a reference to the return value of init.
Following init, calls of weave.op() decorated functions will be logged to the specified project.
Args:
project_name
: The name of the Weights & Biases project to log to.
settings
: Configuration for the Weave client generally.
autopatch_settings
: Configuration for autopatch integrations, e.g. openai.
global_postprocess_inputs
: A function that will be applied to all inputs of all ops.
global_postprocess_output
: A function that will be applied to all outputs of all ops.
global_attributes
: A dictionary of attributes that will be applied to all traces.
NOTE: Global postprocessing settings are applied to all ops after each op's own postprocessing. The order is always: 1. Op-specific postprocessing 2. Global postprocessing
Returns: A Weave client.
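For example, a minimal sketch (the project name and attribute values here are placeholders):

import weave

client = weave.init(
    "my-team/my-project",
    global_attributes={"env": "staging"},
)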
function publish
publish(obj: 'Any', name: 'str | None' = None) → ObjectRef
Save and version a python object.
If an object with the same name already exists and the content hash of obj does not match the latest version of that object, a new version will be created.
TODO: Need to document how name works with this change.
Args:
obj
: The object to save and version.
name
: The name to save the object under.
Returns: A weave Ref to the saved object.
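For example, a minimal sketch (the dataset contents and name are placeholders):

import weave

weave.init("my-team/my-project")
dataset = weave.Dataset(name="grammar", rows=[{"id": "0", "sentence": "Hello."}])
ref = weave.publish(dataset, name="grammar")
print(ref.uri())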
function ref
ref(location: 'str') → ObjectRef
Construct a Ref to a Weave object.
TODO: what happens if obj does not exist
Args:
location
: A fully-qualified weave ref URI, or if weave.init() has been called, "name:version" or just "name" ("latest" will be used for version in this case).
Returns: A weave Ref to the object.
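For example, a sketch assuming weave.init() has been called and an object named "grammar" was previously published:

dataset = weave.ref("grammar:latest").get()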
function require_current_call
require_current_call() → Call
Get the Call object for the currently executing Op, within that Op.
This allows you to access attributes of the Call such as its id or feedback while it is running.
@weave.op
def hello(name: str) -> None:
    print(f"Hello {name}!")
    current_call = weave.require_current_call()
    print(current_call.id)
It is also possible to access a Call after the Op has returned. If you have the Call's id, perhaps from the UI, you can use the get_call method on the WeaveClient returned from weave.init to retrieve the Call object.
client = weave.init("<project>")
mycall = client.get_call("<call_id>")
Alternately, after defining your Op you can use its call method. For example:
@weave.op
def add(a: int, b: int) -> int:
    return a + b

result, call = add.call(1, 2)
print(call.id)
Returns: The Call object for the currently executing Op
Raises:
NoCurrentCallError
: If tracking has not been initialized or this method is invoked outside an Op.
function get_current_call
get_current_call() → Call | None
Get the Call object for the currently executing Op, within that Op.
Returns: The Call object for the currently executing Op, or None if tracking has not been initialized or this method is invoked outside an Op.
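Unlike require_current_call, this returns None rather than raising, so it is safe in code paths that may run outside an Op:

call = weave.get_current_call()
if call is not None:
    print(call.id)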
function finish
finish() → None
Stops logging to weave.
Following finish, calls of weave.op() decorated functions will no longer be logged. You will need to run weave.init() again to resume logging.
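For example (the project name is a placeholder):

import weave

weave.init("my-team/my-project")
# ... run traced code ...
weave.finish()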
function op
op(
func: 'Callable[P, R] | None' = None,
name: 'str | None' = None,
call_display_name: 'str | CallDisplayNameFunc | None' = None,
postprocess_inputs: 'PostprocessInputsFunc | None' = None,
postprocess_output: 'PostprocessOutputFunc | None' = None,
tracing_sample_rate: 'float' = 1.0,
enable_code_capture: 'bool' = True
) → Callable[[Callable[P, R]], Op[P, R]] | Op[P, R]
A decorator to weave op-ify a function or method. Works for both sync and async. Automatically detects iterator functions and applies appropriate behavior.
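For example, a minimal sketch using a few of the parameters above (the op name and body are placeholders):

import weave

@weave.op(name="uppercase", tracing_sample_rate=0.25)
def uppercase(text: str) -> str:
    return text.upper()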
function attributes
attributes(attributes: 'dict[str, Any]') → Iterator
Context manager for setting attributes on a call.
Example:
with weave.attributes({'env': 'production'}):
    print(my_function.call("World"))
class Object
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
classmethod from_uri
from_uri(uri: str, objectify: bool = True) → Self
classmethod handle_relocatable_object
handle_relocatable_object(
v: Any,
handler: ValidatorFunctionWrapHandler,
info: ValidationInfo
) → Any
class Dataset
Dataset object with easy saving and automatic versioning
Examples:
# Create a dataset
dataset = Dataset(name='grammar', rows=[
    {'id': '0', 'sentence': "He no likes ice cream.", 'correction': "He doesn't like ice cream."},
    {'id': '1', 'sentence': "She goed to the store.", 'correction': "She went to the store."},
    {'id': '2', 'sentence': "They plays video games all day.", 'correction': "They play video games all day."}
])

# Publish the dataset
weave.publish(dataset)

# Retrieve the dataset
dataset_ref = weave.ref('grammar').get()

# Access a specific example
example_label = dataset_ref.rows[2]['sentence']
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
rows
:typing.Union[trace.table.Table, trace.vals.WeaveTable]
method add_rows
add_rows(rows: Iterable[dict]) → Dataset
Create a new dataset version by appending rows to the existing dataset.
This is useful for adding examples to large datasets without having to load the entire dataset into memory.
Args:
rows
: The rows to add to the dataset.
Returns: The updated dataset.
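For example, a sketch continuing the grammar dataset above (assuming it has been published and weave.init() has been called):

dataset = weave.ref("grammar").get()
dataset = dataset.add_rows([
    {"id": "3", "sentence": "Me want cookie.", "correction": "I want a cookie."},
])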
classmethod convert_to_table
convert_to_table(rows: Any) → Union[Table, WeaveTable]
classmethod from_calls
from_calls(calls: Iterable[Call]) → Self
classmethod from_obj
from_obj(obj: WeaveObject) → Self
classmethod from_pandas
from_pandas(df: 'DataFrame') → Self
method to_pandas
to_pandas() → DataFrame
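For example, a round-trip sketch (assuming pandas is installed):

import pandas as pd

df = pd.DataFrame([{"question": "What is 1 + 1?", "expected": "2"}])
dataset = Dataset.from_pandas(df)
df_again = dataset.to_pandas()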
class Model
Intended to capture a combination of code and data that operates on an input. For example, it might call an LLM with a prompt to make a prediction or generate text.
When you change the attributes or the code that defines your model, these changes will be logged and the version will be updated. This ensures that you can compare predictions across different versions of your model. Use this to iterate on prompts or to try the latest LLM and compare predictions across different settings.
Examples:
class YourModel(Model):
    attribute1: str
    attribute2: int

    @weave.op()
    def predict(self, input_data: str) -> dict:
        # Model logic goes here
        prediction = self.attribute1 + ' ' + input_data
        return {'pred': prediction}
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
method get_infer_method
get_infer_method() → Callable
class Prompt
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
method format
format(**kwargs: Any) → Any
class StringPrompt
method __init__
__init__(content: str)
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
content
:<class 'str'>
method format
format(**kwargs: Any) → str
classmethod from_obj
from_obj(obj: WeaveObject) → Self
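For example, a sketch assuming str.format-style placeholders in the content:

prompt = StringPrompt("Hello, {name}!")
greeting = prompt.format(name="World")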
class MessagesPrompt
method __init__
__init__(messages: list[dict])
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
messages
:list[dict]
method format
format(**kwargs: Any) → list
method format_message
format_message(message: dict, **kwargs: Any) → dict
classmethod from_obj
from_obj(obj: WeaveObject) → Self
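For example, a sketch assuming placeholders in message content are substituted by format:

prompt = MessagesPrompt([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Translate to French: {text}"},
])
messages = prompt.format(text="Hello")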
class Evaluation
Sets up an evaluation which includes a set of scorers and a dataset.
Calling evaluation.evaluate(model) will pass rows from the dataset into the model, matching the dataset's column names to the argument names of model.predict.
Then it will call all of the scorers and save the results in weave.
If you want to preprocess the rows from the dataset you can pass in a function to preprocess_model_input.
Examples:
import asyncio

import weave
from weave import Evaluation

# Collect your examples
examples = [
    {"question": "What is the capital of France?", "expected": "Paris"},
    {"question": "Who wrote 'To Kill a Mockingbird'?", "expected": "Harper Lee"},
    {"question": "What is the square root of 64?", "expected": "8"},
]

# Define any custom scoring function
@weave.op()
def match_score1(expected: str, model_output: dict) -> dict:
    # Here is where you'd define the logic to score the model output
    return {'match': expected == model_output['generated_text']}

@weave.op()
def function_to_evaluate(question: str):
    # here's where you would add your LLM call and return the output
    return {'generated_text': 'Paris'}

# Score your examples using scoring functions
evaluation = Evaluation(
    dataset=examples, scorers=[match_score1]
)

# Start tracking the evaluation
weave.init('intro-example')

# Run the evaluation
asyncio.run(evaluation.evaluate(function_to_evaluate))
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
dataset
:<class 'flow.dataset.Dataset'>
scorers
:typing.Optional[list[typing.Annotated[typing.Union[trace.op.Op, flow.scorer.Scorer], BeforeValidator(func=cast_to_scorer)]]]
preprocess_model_input
:typing.Optional[typing.Callable[[dict], dict]]
trials
:<class 'int'>
evaluation_name
:typing.Union[str, typing.Callable[[trace.weave_client.Call], str], NoneType]
method evaluate
evaluate(model: Union[Op, Model]) → dict
classmethod from_obj
from_obj(obj: WeaveObject) → Self
method get_eval_results
get_eval_results(model: Union[Op, Model]) → EvaluationResults
method predict_and_score
predict_and_score(model: Union[Op, Model], example: dict) → dict
method summarize
summarize(eval_table: EvaluationResults) → dict
class EvaluationLogger
This class provides an imperative interface for logging evaluations.
An evaluation is started automatically when the first prediction is logged using the log_prediction method, and finished when the log_summary method is called.
Each time you log a prediction, you will get back a ScoreLogger object. You can use this object to log scores and metadata for that specific prediction. For more information, see the ScoreLogger class.
Example:
ev = EvaluationLogger()
pred = ev.log_prediction(inputs={"question": "What is 2 + 2?"}, output="4")
pred.log_score("correctness", 1.0)
ev.log_summary({"correctness": 1.0})
Pydantic Fields:
name
:str | None
model
:flow.model.Model | dict | str
dataset
:flow.dataset.Dataset | list[dict] | str
property ui_url
method finish
finish() → None
Clean up the evaluation resources explicitly without logging a summary.
Ensures all prediction calls and the main evaluation call are finalized. This is automatically called if the logger is used as a context manager.
method log_prediction
log_prediction(inputs: 'dict', output: 'Any') → ScoreLogger
Log a prediction to the Evaluation, and return a reference.
The reference can be used to log scores which are attached to the specific prediction instance.
method log_summary
log_summary(summary: 'dict | None' = None) → None
Log a summary dict to the Evaluation.
This will calculate the summary, call the summarize op, and then finalize the evaluation, meaning no more predictions or scores can be logged.
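Because finish is called automatically when the logger is used as a context manager, the example above can also be written as the following sketch (inputs, output, and score values are placeholders):

with EvaluationLogger() as ev:
    pred = ev.log_prediction(inputs={"question": "What is 2 + 2?"}, output="4")
    pred.log_score("correctness", 1.0)
    ev.log_summary({"correctness": 1.0})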
class Scorer
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
column_map
:typing.Optional[dict[str, str]]
method model_post_init
model_post_init(_Scorer__context: Any) → None
method score
score(output: Any, **kwargs: Any) → Any
method summarize
summarize(score_rows: list) → Optional[dict]
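For example, a minimal custom scorer sketch (the class, argument, and key names are placeholders):

import weave
from weave import Scorer

class ExactMatch(Scorer):
    @weave.op
    def score(self, output: str, expected: str) -> dict:
        # Compare the model output against the expected answer.
        return {"match": output == expected}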
class AnnotationSpec
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
field_schema
:dict[str, typing.Any]
unique_among_creators
:<class 'bool'>
op_scope
:typing.Optional[list[str]]
classmethod preprocess_field_schema
preprocess_field_schema(data: dict[str, Any]) → dict[str, Any]
classmethod validate_field_schema
validate_field_schema(schema: dict[str, Any]) → dict[str, Any]
method value_is_valid
value_is_valid(payload: Any) → bool
Validates a payload against this annotation spec's schema.
Args:
payload
: The data to validate against the schema
Returns:
bool
: True if validation succeeds, False otherwise
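For example, a sketch assuming field_schema accepts a JSON-Schema-style dict:

spec = AnnotationSpec(
    name="quality",
    field_schema={"type": "integer", "minimum": 1, "maximum": 5},
)
spec.value_is_valid(3)   # True
spec.value_is_valid(10)  # False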
class Markdown
A Markdown renderable.
Args:
markup
(str): A string containing markdown.
code_theme
(str, optional): Pygments theme for code blocks. Defaults to "monokai". See https://pygments.org/styles/ for code themes.
justify
(JustifyMethod, optional): Justify value for paragraphs. Defaults to None.
style
(Union[str, Style], optional): Optional style to apply to markdown.
hyperlinks
(bool, optional): Enable hyperlinks. Defaults to True.
inline_code_lexer
(str, optional): Lexer to use if inline code highlighting is enabled. Defaults to None.
inline_code_theme
(Optional[str], optional): Pygments theme for inline code highlighting, or None for no highlighting. Defaults to None.
method __init__
__init__(
markup: 'str',
code_theme: 'str' = 'monokai',
justify: 'JustifyMethod | None' = None,
style: 'str | Style' = 'none',
hyperlinks: 'bool' = True,
inline_code_lexer: 'str | None' = None,
inline_code_theme: 'str | None' = None
) → None
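For example (the markup is a placeholder):

md = Markdown("# Results\n\nAll tests *passed*.")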
class Monitor
Sets up a monitor to score incoming calls automatically.
Examples:
import weave
from weave.scorers import ValidJSONScorer

json_scorer = ValidJSONScorer()

my_monitor = weave.Monitor(
    name="my-monitor",
    description="This is a test monitor",
    sampling_rate=0.5,
    op_names=["my_op"],
    query={
        "$expr": {
            "$gt": [
                {"$getField": "started_at"},
                {"$literal": 1742540400}
            ]
        }
    },
    scorers=[json_scorer],
)

my_monitor.activate()
Pydantic Fields:
name
:typing.Optional[str]
description
:typing.Optional[str]
ref
:typing.Optional[trace.refs.ObjectRef]
sampling_rate
:<class 'float'>
scorers
:list[flow.scorer.Scorer]
op_names
:list[str]
query
:typing.Optional[trace_server.interface.query.Query]
active
:<class 'bool'>
method activate
activate() → ObjectRef
Activates the monitor.
Returns: The ref to the monitor.
method deactivate
deactivate() → ObjectRef
Deactivates the monitor.
Returns: The ref to the monitor.
classmethod from_obj
from_obj(obj: WeaveObject) → Self
class SavedView
A fluent-style class for working with SavedView objects.
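Example, a fluent-chaining sketch (the column names, the filter operator string, and the project are placeholder assumptions, not a confirmed operator vocabulary):

import weave
from weave import SavedView

weave.init("my-team/my-project")
view = (
    SavedView("traces", label="Recent errors")
    .add_columns("op_name", "started_at")
    .add_filter("status", "equals", "error")
    .page_size(25)
    .save()
)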
method __init__
__init__(view_type: 'str' = 'traces', label: 'str' = 'SavedView') → None
property entity
property label
property project
property view_type
method add_column
add_column(path: 'str | ObjectPath', label: 'str | None' = None) → SavedView
method add_columns
add_columns(*columns: 'str') → SavedView
Convenience method for adding multiple columns to the grid.
method add_filter
add_filter(
field: 'str',
operator: 'str',
value: 'Any | None' = None
) → SavedView
method add_sort
add_sort(field: 'str', direction: 'SortDirection') → SavedView
method column_index
column_index(path: 'int | str | ObjectPath') → int
method filter_op
filter_op(op_name: 'str | None') → SavedView
method get_calls
get_calls(
limit: 'int | None' = None,
offset: 'int | None' = None,
include_costs: 'bool' = False,
include_feedback: 'bool' = False,
all_columns: 'bool' = False
) → CallsIter
Get calls matching this saved view's filters and settings.
method get_known_columns
get_known_columns(num_calls_to_query: 'int | None' = None) → list[str]
Get the set of columns that are known to exist.
method get_table_columns
get_table_columns() → list[TableColumn]
method hide_column
hide_column(col_name: 'str') → SavedView
method insert_column
insert_column(
idx: 'int',
path: 'str | ObjectPath',
label: 'str | None' = None
) → SavedView
classmethod load
load(ref: 'str') → SavedView
method page_size
page_size(page_size: 'int') → SavedView
method pin_column_left
pin_column_left(col_name: 'str') → SavedView
method pin_column_right
pin_column_right(col_name: 'str') → SavedView
method remove_column
remove_column(path: 'int | str | ObjectPath') → SavedView
method remove_columns
remove_columns(*columns: 'str') → SavedView
Remove columns from the saved view.
method remove_filter
remove_filter(index_or_field: 'int | str') → SavedView
method remove_filters
remove_filters() → SavedView
Remove all filters from the saved view.
method rename
rename(label: 'str') → SavedView
method rename_column
rename_column(path: 'int | str | ObjectPath', label: 'str') → SavedView
method save
save() → SavedView
Publish the saved view to the server.
method set_columns
set_columns(*columns: 'str') → SavedView
Set the columns to be displayed in the grid.
method show_column
show_column(col_name: 'str') → SavedView
method sort_by
sort_by(field: 'str', direction: 'SortDirection') → SavedView
method to_grid
to_grid(limit: 'int | None' = None) → Grid
method to_rich_table_str
to_rich_table_str() → str
method ui_url
ui_url() → str | None
URL to show this saved view in the UI.
Note this is the "result" page with traces etc, not the URL for the view object.
method unpin_column
unpin_column(col_name: 'str') → SavedView