
Setting up a custom cost model

Weave calculates costs from the number of tokens used and the model that used them. It reads the usage and the model name from the call's output and associates them with the call.

Let's set up a simple custom model that calculates its own token usage and stores it in Weave.

Set up the environment

We install and import all needed packages. We set WANDB_API_KEY in our environment so that we can log in with wandb.login() (the key should be given to the Colab as a secret).

We set the W&B project we want to log to in name_of_wandb_project.

NOTE: name_of_wandb_project may also be in the format of {team_name}/{project_name} to specify a team to log the traces into.

We then fetch a Weave client by calling weave.init().

%pip install wandb weave --quiet
import os

import wandb
from google.colab import userdata

import weave

os.environ["WANDB_API_KEY"] = userdata.get("WANDB_API_KEY")
name_of_wandb_project = "custom-cost-model"

wandb.login()
weave_client = weave.init(name_of_wandb_project)

Setting up a model with Weave

from weave import Model


class YourModel(Model):
    attribute1: str
    attribute2: int

    def simple_token_count(self, text: str) -> int:
        # Rough heuristic: about three characters per token
        return len(text) // 3

    # This is a custom op that we are defining
    # It takes in a string, and outputs a dict with the usage counts, model name, and the output
    @weave.op()
    def custom_model_generate(self, input_data: str) -> dict:
        # Model logic goes here
        # Here is where you would have a custom generate function
        prediction = self.attribute1 + " " + input_data

        # Usage counts
        prompt_tokens = self.simple_token_count(input_data)
        completion_tokens = self.simple_token_count(prediction)

        # We return a dictionary with the usage counts, model name, and the output
        # Weave will automatically associate this with the trace
        # This object {usage, model, output} matches the output of an OpenAI call
        return {
            "usage": {
                "input_tokens": prompt_tokens,
                "output_tokens": completion_tokens,
            },
            "model": "your_model_name",
            "output": prediction,
        }

    # In our predict function we call our custom generate function, and return the output.
    @weave.op()
    def predict(self, input_data: str) -> str:
        # Here is where you would do any post processing of the data
        outputs = self.custom_model_generate(input_data)
        return outputs["output"]
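
To make the usage numbers concrete, here is a small standalone sketch with no Weave dependency that mirrors the token heuristic and return shape above. The module-level functions `simple_token_count` and `generate` are plain-Python stand-ins for the methods on YourModel (hypothetical, for illustration only), and the three-characters-per-token rule is just the rough heuristic from this example, not a real tokenizer.

```python
def simple_token_count(text: str) -> int:
    # Rough heuristic from the example: about three characters per token
    return len(text) // 3


def generate(prefix: str, input_data: str) -> dict:
    # Stand-in for YourModel.custom_model_generate, without the weave.op decorator
    prediction = prefix + " " + input_data
    return {
        "usage": {
            "input_tokens": simple_token_count(input_data),
            "output_tokens": simple_token_count(prediction),
        },
        "model": "your_model_name",
        "output": prediction,
    }


result = generate("Hello", "world")
print(result["usage"])   # {'input_tokens': 1, 'output_tokens': 3}
print(result["output"])  # Hello world
```

The "usage" and "model" keys are what the text above says Weave reads from the output; the "output" key is the only part that predict passes on to the caller.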

Add a custom cost

Here we add a custom cost. Once a custom cost exists and our calls carry usage, we can fetch the calls with include_costs=True, and each call will have its costs under summary.weave.costs.

model = YourModel(attribute1="Hello", attribute2=1)
model.predict("world")

# We then add a custom cost to our project
weave_client.add_cost(
    llm_id="your_model_name", prompt_token_cost=0.1, completion_token_cost=0.2
)

# We can then query for the calls, and with include_costs=True
# we receive the costs back attached to the calls
calls = weave_client.get_calls(filter={"trace_roots_only": True}, include_costs=True)

list(calls)
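
As a rough sanity check on the figures to expect, the sketch below does the arithmetic by hand. It assumes that cost is simply the token count multiplied by the per-token price registered with add_cost; the helper `estimate_cost` is hypothetical and not part of Weave, and how Weave computes or rounds costs internally may differ.

```python
def estimate_cost(usage: dict, prompt_token_cost: float, completion_token_cost: float) -> dict:
    # Assumption: cost is linear in token counts (tokens * price per token)
    prompt_cost = usage["input_tokens"] * prompt_token_cost
    completion_cost = usage["output_tokens"] * completion_token_cost
    return {
        "prompt_cost": prompt_cost,
        "completion_cost": completion_cost,
        "total_cost": prompt_cost + completion_cost,
    }


# Usage for predict("world") with attribute1="Hello":
# "world" is 5 characters and "Hello world" is 11, at 3 characters per token
usage = {
    "input_tokens": len("world") // 3,
    "output_tokens": len("Hello world") // 3,
}

costs = estimate_cost(usage, prompt_token_cost=0.1, completion_token_cost=0.2)
print(round(costs["total_cost"], 6))  # 0.7
```

If the costs returned with the calls look very different from an estimate like this, one thing worth checking is that the llm_id passed to add_cost matches the "model" value in the op's output.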