Introduction
Weave is a lightweight toolkit for tracking and evaluating LLM applications, built by Weights & Biases.
Our goal is to bring rigor, best practices, and composability to the inherently experimental process of developing AI applications, without introducing cognitive overhead.
Get started by decorating Python functions with @weave.op().
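As a rough illustration of what decorating a function looks like, here is a minimal sketch. The function `count_words` and the project name `my-project` are hypothetical and chosen only for this example. The `weave.init` call is left commented out because it needs a Weights & Biases account, and the `ImportError` fallback is an assumption added so the sketch still runs where the `weave` package is not installed; it is not part of Weave itself.

```python
from types import SimpleNamespace

try:
    import weave  # the real package, installed with: pip install weave
except ImportError:
    # Assumption for this sketch only: a stand-in whose op() returns a
    # no-op decorator, so the example runs without Weave installed.
    weave = SimpleNamespace(op=lambda: (lambda fn: fn))

# Uncomment to send traces to a project (hypothetical name, needs a W&B login):
# weave.init("my-project")


@weave.op()
def count_words(text: str) -> int:
    """Hypothetical example function: count whitespace-separated words."""
    return len(text.split())


if __name__ == "__main__":
    # The decorated function is called like any ordinary Python function.
    print(count_words("tracking LLM calls with Weave"))
```

The decorated function keeps its normal call signature and return value, so existing call sites should not need to change.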
Seriously, try the 🍪 quickstart 🍪.
You can use Weave to:
- Log and debug language model inputs, outputs, and traces
- Build rigorous, apples-to-apples evaluations for language model use cases
- Organize all the information generated across the LLM workflow, from experimentation to evaluations to production
What's next?
Try the Quickstart to see Weave in action.