Evaluation

Class: Evaluation<R, E, M>

Sets up an evaluation which includes a set of scorers and a dataset.

Calling evaluation.evaluate({ model }) passes rows from the dataset into the model, matching the names of the dataset's columns to the argument names in model.predict.

It then calls each of the scorers and saves the results in Weave.

Example

import * as weave from 'weave';

// Collect your examples into a dataset
const dataset = new weave.Dataset({
  id: 'my-dataset',
  rows: [
    { question: 'What is the capital of France?', expected: 'Paris' },
    { question: 'Who wrote "To Kill a Mockingbird"?', expected: 'Harper Lee' },
    { question: 'What is the square root of 64?', expected: '8' },
  ],
});

// Define any custom scoring function.
// It receives the model's output and the dataset row that produced it.
const scoringFunction = weave.op(function isEqual({ modelOutput, datasetRow }) {
  return modelOutput === datasetRow.expected;
});

// Define the function to evaluate
const model = weave.op(async function alwaysParisModel({ question }) {
  return 'Paris';
});

// Start evaluating
const evaluation = new weave.Evaluation({
  id: 'my-evaluation',
  dataset: dataset,
  scorers: [scoringFunction],
});

const results = await evaluation.evaluate({ model });

Extends

WeaveObject

Type Parameters

R extends DatasetRow

E extends DatasetRow

M

Constructors

new Evaluation()

new Evaluation<R, E, M>(parameters): Evaluation<R, E, M>

Parameters

parameters: EvaluationParameters<R, E, M>

Returns

Evaluation<R, E, M>

Overrides

WeaveObject.constructor

Defined in

evaluation.ts:148

Properties

__savedRef?

optional __savedRef: ObjectRef | Promise<ObjectRef>

Inherited from

WeaveObject.__savedRef

Defined in

weaveObject.ts:49


_baseParameters

protected _baseParameters: WeaveObjectParameters

Inherited from

WeaveObject._baseParameters

Defined in

weaveObject.ts:51

Accessors

description

get description(): undefined | string

Returns

undefined | string

Inherited from

WeaveObject.description

Defined in

weaveObject.ts:89


id

get id(): string

Returns

string

Inherited from

WeaveObject.id

Defined in

weaveObject.ts:85

Methods

className()

className(): any

Returns

any

Inherited from

WeaveObject.className

Defined in

weaveObject.ts:53


evaluate()

evaluate(__namedParameters): Promise<Record<string, any>>

Parameters

__namedParameters

__namedParameters.maxConcurrency?: number = 5

__namedParameters.model: WeaveCallable<(...args) => Promise<M>>

__namedParameters.nTrials?: number = 1

Returns

Promise<Record<string, any>>

Defined in

evaluation.ts:163
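
Going by the signature above, a call such as evaluation.evaluate({ model, maxConcurrency: 2, nTrials: 3 }) overrides both defaults. The sketch below is one way to picture what those two options could mean: each dataset row is evaluated nTrials times, with at most maxConcurrency model calls in flight at once. It is a self-contained illustration under those assumptions, not Weave's implementation; runTrials, RunOptions, and Row are hypothetical names that do not exist in the SDK.

```typescript
// Illustrative only: runTrials, RunOptions and Row are hypothetical, not Weave APIs.
type Row = Record<string, unknown>;

interface RunOptions {
  maxConcurrency?: number; // upper bound on in-flight model calls
  nTrials?: number; // how many times each row is evaluated
}

async function runTrials<M>(
  rows: Row[],
  model: (row: Row) => Promise<M>,
  { maxConcurrency = 5, nTrials = 1 }: RunOptions = {},
): Promise<{ outputs: M[]; peakInFlight: number }> {
  // Assumption: a "trial" is one full pass over the dataset.
  const work: Row[] = [];
  for (let t = 0; t < nTrials; t++) work.push(...rows);

  const outputs: M[] = new Array(work.length);
  let next = 0; // index of the next unclaimed work item
  let inFlight = 0;
  let peakInFlight = 0;

  // Each worker claims the next index until no work is left.
  async function worker(): Promise<void> {
    while (next < work.length) {
      const i = next++;
      inFlight++;
      peakInFlight = Math.max(peakInFlight, inFlight);
      try {
        outputs[i] = await model(work[i]);
      } finally {
        inFlight--;
      }
    }
  }

  const workerCount = Math.max(1, Math.min(maxConcurrency, work.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return { outputs, peakInFlight };
}
```

Under this reading, raising nTrials multiplies the number of model calls, which is useful for non-deterministic models, while maxConcurrency only affects how many of those calls overlap.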


predictAndScore()

predictAndScore(__namedParameters): Promise<object>

Parameters

__namedParameters

__namedParameters.columnMapping?: ColumnMapping<R, E>

__namedParameters.example: R

__namedParameters.model: WeaveCallable<(...args) => Promise<M>>

Returns

Promise<object>

model_latency

model_latency: number = modelLatency

model_output

model_output: any = modelOutput

model_success

model_success: boolean = !modelError

scores

scores: object

Index Signature

[key: string]: any

Defined in

evaluation.ts:232
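
The returned object above has four fields: model_latency, model_output, model_success, and scores. The sketch below shows how a single predict-and-score step could produce that shape: time the model call, record whether it threw, then run each scorer on the output. It is a self-contained illustration rather than the SDK's code. predictAndScoreSketch and Scorer are hypothetical names, and two behaviours are assumptions: reporting latency in seconds and skipping scorers when the model call fails. The scorer argument shape ({ modelOutput, datasetRow }) follows the example at the top of this page.

```typescript
// Illustrative only: predictAndScoreSketch and Scorer are hypothetical, not Weave APIs.
type Scorer<R, M> = {
  name: string;
  score: (args: { modelOutput: M; datasetRow: R }) => unknown | Promise<unknown>;
};

async function predictAndScoreSketch<R, M>(
  example: R,
  model: (row: R) => Promise<M>,
  scorers: Scorer<R, M>[],
) {
  const start = Date.now();
  let modelOutput: M | undefined;
  let modelError = false;
  try {
    modelOutput = await model(example);
  } catch (_err) {
    modelError = true; // a failed call is recorded instead of rethrown
  }
  const modelLatency = (Date.now() - start) / 1000; // seconds (assumed unit)

  // Assumption: scorers only run when the model produced an output.
  const scores: Record<string, unknown> = {};
  if (!modelError) {
    for (const scorer of scorers) {
      scores[scorer.name] = await scorer.score({
        modelOutput: modelOutput as M,
        datasetRow: example,
      });
    }
  }

  return {
    model_success: !modelError,
    model_output: modelOutput,
    model_latency: modelLatency,
    scores,
  };
}
```

Keeping model_success separate from scores lets a failed model call be told apart from a call that succeeded but scored poorly.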


saveAttrs()

saveAttrs(): object

Returns

object

Inherited from

WeaveObject.saveAttrs

Defined in

weaveObject.ts:57