Evaluation
Class: Evaluation<R, E, M>
Sets up an evaluation which includes a set of scorers and a dataset.
Calling evaluation.evaluate({ model }) passes rows from the dataset into the model, matching the names of the dataset's columns to the argument names in model.predict.
It then calls each scorer and saves the results in Weave.
Example
import * as weave from 'weave';

// Collect your examples into a dataset
const dataset = new weave.Dataset({
  id: 'my-dataset',
  rows: [
    { question: 'What is the capital of France?', expected: 'Paris' },
    { question: 'Who wrote "To Kill a Mockingbird"?', expected: 'Harper Lee' },
    { question: 'What is the square root of 64?', expected: '8' },
  ],
});

// Define any custom scoring function; it receives the model output and the dataset row
const scoringFunction = weave.op(function isEqual({ modelOutput, datasetRow }) {
  return modelOutput === datasetRow.expected;
});

// Define the function to evaluate (this one ignores the question and always answers 'Paris')
const model = weave.op(async function alwaysParisModel({ question }) {
  return 'Paris';
});

// Start evaluating
const evaluation = new weave.Evaluation({
  id: 'my-evaluation',
  dataset: dataset,
  scorers: [scoringFunction],
});
const results = await evaluation.evaluate({ model });
Extends
Type Parameters
• R extends DatasetRow
• E extends DatasetRow
• M
Constructors
new Evaluation()
new Evaluation<R, E, M>(parameters): Evaluation<R, E, M>
Parameters
• parameters: EvaluationParameters<R, E, M>
Returns
Evaluation<R, E, M>
Overrides
Defined in
Properties
__savedRef?
optional __savedRef: ObjectRef | Promise<ObjectRef>
Inherited from
Defined in
_baseParameters
protected _baseParameters: WeaveObjectParameters
Inherited from
Defined in
Accessors
description
get description(): undefined | string
Returns
undefined | string
Inherited from
Defined in
id
get id(): string
Returns
string
Inherited from
Defined in
Methods
className()
className(): any
Returns
any
Inherited from
Defined in
evaluate()
evaluate(__namedParameters): Promise<Record<string, any>>
Parameters
• __namedParameters
• __namedParameters.maxConcurrency?: number = 5
• __namedParameters.model: WeaveCallable<(...args) => Promise<M>>
• __namedParameters.nTrials?: number = 1
Returns
Promise<Record<string, any>>
Defined in
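The maxConcurrency and nTrials parameters suggest that each dataset row is run nTrials times with at most maxConcurrency calls in flight. This page does not describe how weave schedules that work, so the following is only a minimal sketch of the general technique (a worker-pool concurrency limit). The helper runWithLimit and its option names are hypothetical and not part of the weave API; only the default values 5 and 1 are taken from the signature above.

```typescript
// Hypothetical helper for illustration; NOT part of the weave API.
// Runs `task` over every item, `nTrials` times each, with at most
// `maxConcurrency` calls in flight at any moment.
async function runWithLimit<T, U>(
  items: T[],
  task: (item: T) => Promise<U>,
  { maxConcurrency = 5, nTrials = 1 }: { maxConcurrency?: number; nTrials?: number } = {},
): Promise<U[]> {
  // Repeat the whole item list once per trial.
  const queue: T[] = [];
  for (let trial = 0; trial < nTrials; trial++) {
    queue.push(...items);
  }

  const results: U[] = new Array(queue.length);
  let next = 0;

  // Each worker claims the next unprocessed index until the queue is drained.
  async function worker(): Promise<void> {
    while (next < queue.length) {
      const index = next++;
      results[index] = await task(queue[index]);
    }
  }

  // Never start more workers than there are queued items (but at least one).
  const workerCount = Math.max(1, Math.min(maxConcurrency, queue.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results are stored by queue index, so the output order follows the input order even when tasks finish out of order.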
predictAndScore()
predictAndScore(__namedParameters): Promise<object>
Parameters
• __namedParameters
• __namedParameters.columnMapping?: ColumnMapping<R, E>
• __namedParameters.example: R
• __namedParameters.model: WeaveCallable<(...args) => Promise<M>>
Returns
Promise<object>
model_latency
model_latency: number = modelLatency
model_output
model_output: any = modelOutput
model_success
model_success: boolean = !modelError
scores
scores: object
Index Signature
[key: string]: any
Defined in
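The return shape above (model_output, model_success, model_latency, scores) can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the weave implementation: predictAndScoreSketch, Row, and Scorer are hypothetical names, latency is measured in milliseconds (this page does not state the unit), scorers are skipped when the model throws, and columnMapping is left out entirely.

```typescript
// Hypothetical types and helper for illustration; not the weave implementation.
type Row = Record<string, unknown>;
type Scorer<M> = (args: { modelOutput: M; datasetRow: Row }) => unknown | Promise<unknown>;

async function predictAndScoreSketch<M>(
  model: (row: Row) => Promise<M>,
  example: Row,
  scorers: Record<string, Scorer<M>>,
) {
  let modelOutput: M | undefined;
  let modelError = false;

  // Time the model call and record whether it threw.
  const start = Date.now();
  try {
    modelOutput = await model(example);
  } catch {
    modelError = true;
  }
  const modelLatency = Date.now() - start; // milliseconds (assumed unit)

  // Assumption: scorers only run when the model call succeeded.
  const scores: Record<string, unknown> = {};
  if (!modelError) {
    for (const [name, scorer] of Object.entries(scorers)) {
      scores[name] = await scorer({ modelOutput: modelOutput as M, datasetRow: example });
    }
  }

  // Field names mirror the documented return object.
  return {
    model_output: modelOutput,
    model_success: !modelError,
    model_latency: modelLatency,
    scores,
  };
}
```

Keying scores by scorer name matches the documented index signature ([key: string]: any), though how weave derives those keys is not shown on this page.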
saveAttrs()
saveAttrs(): object
Returns
object