Evaluation Module
class evaluation.Evaluation(reference, targets, metrics, subregions=None)

Container for running an evaluation.

An Evaluation is the running of one or more metrics on one or more target datasets and a (possibly optional) reference dataset. Evaluation can handle two types of metrics, unary and binary. The validity of an Evaluation depends on the number and type of metrics as well as the number of datasets.

A unary metric is a metric that runs over a single dataset. If you add a unary metric to the Evaluation you are only required to add a reference dataset or a target dataset. If there are multiple datasets in the evaluation then the unary metric is run over all of them.

A binary metric is a metric that runs over a reference dataset and a target dataset. If you add a binary metric you are required to add a reference dataset and at least one target dataset. The binary metrics are run over every (reference dataset, target dataset) pair in the Evaluation.

An Evaluation must have at least one metric to be valid.
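The unary/binary distinction above can be sketched with stand-in classes. This is an illustrative sketch only: the names mirror the behavior described here, but the bodies are hypothetical, not the real ocw implementation.

```python
class Bias:
    """Stand-in 'binary' metric: compares a target dataset against a reference."""
    def run(self, ref, target):
        return target - ref

class Mean:
    """Stand-in 'unary' metric: runs over a single dataset."""
    def run(self, dataset):
        return dataset

def run_binary_metrics(reference, targets, binary_metrics):
    # Binary metrics run over every (reference, target) pair,
    # producing a grid indexed as results[target][metric].
    return [[m.run(reference, t) for m in binary_metrics] for t in targets]

# With two target "datasets" (plain numbers here) and one binary metric:
results = run_binary_metrics(10, [12, 9], [Bias()])
```

Here `results[i][j]` holds metric `j` evaluated on target dataset `i`, matching the pairing behavior described above.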
Default Evaluation constructor.
Parameters: - reference (
dataset.Dataset
) – The reference Dataset for the evaluation. - targets (
list
ofdataset.Dataset
) – A list of one or more target datasets for the evaluation. - metrics (
list
ofmetrics
) – A list of one or more Metric instances to run in the evaluation. - subregions (
list
ofdataset.Bounds
) – (Optional) Subregion information to use in the evaluation. A subregion is specified with a Bounds object.
Raises: ValueError
add_dataset(target_dataset)

Add a Dataset to the Evaluation.

A target Dataset is compared against the reference dataset when the Evaluation is run with one or more metrics.

Parameters: target_dataset (dataset.Dataset) – The target Dataset to add to the Evaluation.
Raises: ValueError – If the dataset to add isn’t an instance of Dataset.
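The type check described above can be sketched as follows; both `Dataset` and `Evaluation` here are minimal stand-ins for illustration, not the real ocw classes.

```python
class Dataset:
    """Stand-in for dataset.Dataset."""
    pass

class Evaluation:
    """Minimal sketch: only the add_dataset validation behavior."""
    def __init__(self):
        self.target_datasets = []

    def add_dataset(self, target_dataset):
        # Reject anything that is not a Dataset instance, as documented.
        if not isinstance(target_dataset, Dataset):
            raise ValueError("add_dataset requires a dataset.Dataset instance")
        self.target_datasets.append(target_dataset)
```

Passing any non-Dataset value (a file path string, a raw array) raises ValueError rather than failing later during the run.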
add_datasets(target_datasets)

Add multiple Datasets to the Evaluation.

Parameters: target_datasets (list of dataset.Dataset) – The list of datasets that should be added to the Evaluation.
Raises: ValueError – If a dataset to add isn’t an instance of Dataset.
add_metric(metric)

Add a metric to the Evaluation.

A metric is an instance of a class which inherits from metrics.Metric.

Parameters: metric (metrics.Metric) – The metric instance to add to the Evaluation.
Raises: ValueError – If the metric to add isn’t an instance of a class that inherits from metrics.Metric.
add_metrics(metrics)

Add multiple metrics to the Evaluation.

A metric is an instance of a class which inherits from metrics.Metric.

Parameters: metrics (list of metrics.Metric) – The list of metric instances to add to the Evaluation.
Raises: ValueError – If a metric to add isn’t an instance of a class that inherits from metrics.Metric.
metrics = None

The list of “binary” metrics (a metric which takes two Datasets) that the Evaluation should use.
results = None

A list containing the results of running the regular metric evaluations. The shape of results is (num_target_datasets, num_metrics) if the user doesn’t specify subregion information. Otherwise the shape is (num_target_datasets, num_metrics, num_subregions).
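The two results layouts can be sketched with nested lists; the values below are dummy placeholders and only the nesting structure matters.

```python
num_target_datasets, num_metrics, num_subregions = 2, 3, 4

# Without subregions, results is indexed as results[target][metric].
results = [[0.0 for _ in range(num_metrics)]
           for _ in range(num_target_datasets)]

# With subregions, a third level is added:
# subregion_results[target][metric][subregion].
subregion_results = [[[0.0 for _ in range(num_subregions)]
                      for _ in range(num_metrics)]
                     for _ in range(num_target_datasets)]
```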
run()

Run the evaluation.

There are two phases to a run of the Evaluation. First, if there are any “binary” metrics they are run. Binary metrics are only run if there is a reference dataset and at least one target dataset.

If subregion information is provided then each dataset is subset before being run through the binary metrics.

.. note:: Only the binary metrics are subset with subregion information.

Next, if there are any “unary” metrics they are run. Unary metrics are only run if there is at least one target dataset or a reference dataset.
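The two phases and their guard conditions can be sketched as a plain function. The guards mirror the prose above; everything else (the metric objects, the datasets as bare values) is a hypothetical stand-in, and subregion subsetting is omitted for brevity.

```python
def run(reference, targets, binary_metrics, unary_metrics):
    results, unary_results = [], []

    # Phase 1: binary metrics require a reference and at least one target.
    if binary_metrics and reference is not None and targets:
        results = [[m.run(reference, t) for m in binary_metrics]
                   for t in targets]

    # Phase 2: unary metrics run over every dataset present,
    # the reference dataset included.
    if unary_metrics and (reference is not None or targets):
        all_datasets = ([reference] if reference is not None else []) + targets
        unary_results = [[m.run(d) for m in unary_metrics]
                         for d in all_datasets]

    return results, unary_results
```

If the guard for a phase fails (e.g. binary metrics present but no reference dataset), that phase simply produces no results.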
target_datasets = None

The target dataset(s) which should each be compared with the reference dataset when the evaluation is run.
unary_metrics = None

The list of “unary” metrics (a metric which takes one Dataset) that the Evaluation should use.
unary_results = None

A list containing the results of running the unary metric evaluations. The shape of unary_results is (num_targets, num_metrics) where num_targets = num_target_ds + (1 if ref_dataset != None else 0).
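The num_targets bookkeeping above can be written out directly; the helper name is hypothetical, used only to make the formula concrete.

```python
def unary_results_shape(num_target_ds, num_unary_metrics, has_reference):
    # The reference dataset counts as one extra target for unary metrics,
    # because unary metrics run over every dataset in the evaluation.
    num_targets = num_target_ds + (1 if has_reference else 0)
    return (num_targets, num_unary_metrics)
```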