
Commit 2a11e0b

Merge pull request #112612 from likebupt/update-evaluate-model-0424
update evaluate model
2 parents 43db800 + 986ff6e commit 2a11e0b

File tree

1 file changed: +9 -30 lines changed


articles/machine-learning/algorithm-module-reference/evaluate-model.md

Lines changed: 9 additions & 30 deletions
@@ -9,7 +9,7 @@ ms.topic: reference
 
 author: likebupt
 ms.author: keli19
-ms.date: 02/24/2020
+ms.date: 04/24/2020
 ---
 # Evaluate Model module
 
@@ -28,36 +28,15 @@ Use this module to measure the accuracy of a trained model. You provide a datase
 > If you are new to model evaluation, we recommend the video series by Dr. Stephen Elston, as part of the [machine learning course](https://blogs.technet.microsoft.com/machinelearning/2015/09/08/new-edx-course-data-science-machine-learning-essentials/) from EdX.
 
 
-There are three ways to use the **Evaluate Model** module:
+## How to use Evaluate Model
+1. Connect the **Scored dataset** output of the [Score Model](./score-model.md) to the left input port of **Evaluate Model**.
 
-+ Generate scores over your training data, and evaluate the model based on these scores
-+ Generate scores on the model, but compare those scores to scores on a reserved testing set
-+ Compare scores for two different but related models, using the same set of data
+2. [Optional] Connect the **Scored dataset** output of the [Score Model](./score-model.md) for the second model to the **right-hand** input of **Evaluate Model**. This lets you easily compare results from two different models on the same data, or compare scores from two different runs over the same data with different parameters. The two inputs should come from models of the same algorithm type.
 
-## Use the training data
+> [!NOTE]
+> Algorithm type refers to 'Two-class Classification', 'Multi-class Classification', 'Regression', and 'Clustering' under 'Machine Learning Algorithms'.
 
-To evaluate a model, you must connect a dataset that contains a set of input columns and scores. If no other data is available, you can use your original dataset.
-
-1. Connect the **Scored dataset** output of the [Score Model](./score-model.md) to the input of **Evaluate Model**.
-2. Click the **Evaluate Model** module, and run the pipeline to generate the evaluation scores.
-
-## Use testing data
-
-A common scenario in machine learning is to separate your original dataset into training and testing datasets, using the [Split](./split-data.md) module or the [Partition and Sample](./partition-and-sample.md) module.
-
-1. Connect the **Scored dataset** output of the [Score Model](score-model.md) to the input of **Evaluate Model**.
-2. Connect the output of the Split Data module that contains the testing data to the right-hand input of **Evaluate Model**.
-3. Click the **Evaluate Model** module, and select **Run selected** to generate the evaluation scores.
-
-## Compare scores from two models
-
-You can also connect a second set of scores to **Evaluate Model**. The scores might be a shared evaluation set that has known results, or a set of results from a different model for the same data.
-
-This feature is useful because you can easily compare results from two different models on the same data. Or, you might compare scores from two different runs over the same data with different parameters.
-
-1. Connect the **Scored dataset** output of the [Score Model](score-model.md) to the input of **Evaluate Model**.
-2. Connect the output of the Score Model module for the second model to the right-hand input of **Evaluate Model**.
-3. Submit the pipeline.
+3. Submit the pipeline to generate the evaluation scores.
 
 ## Results
 
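The comparison workflow the added steps describe can be pictured in a runnable form. Below is a minimal sketch in plain Python, assuming scikit-learn; the synthetic dataset, the two classifiers, and the accuracy/AUC metrics are illustrative stand-ins, not the module's internals. The point it demonstrates is the rule from the new step 2: both scored inputs come from models of the same algorithm type and are evaluated on the same data.

```python
# A conceptual analogue of comparing two scored models in Evaluate Model:
# both models are the same algorithm type (two-class classification) and
# are scored and evaluated on the same test data. Illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic two-class data, split into train and test sets.
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]  # plays the role of the "Scored dataset"
    labels = (scores >= 0.5).astype(int)        # scored labels at the default threshold
    print(f"{name}: accuracy={accuracy_score(y_test, labels):.3f}, "
          f"AUC={roc_auc_score(y_test, scores):.3f}")
```

Printing both models' metrics side by side mirrors what **Evaluate Model** shows when a second scored dataset is connected to its right-hand input.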
@@ -134,9 +113,9 @@ The following metrics are reported for evaluating clustering models.
 
 If the number of data points assigned to clusters is less than the total number of data points available, it means that the data points could not be assigned to a cluster.
 
-- The scores in the column, **Maximal Distance to Cluster Center**, represent the sum of the distances between each point and the centroid of that points cluster.
+- The scores in the column, **Maximal Distance to Cluster Center**, represent the sum of the distances between each point and the centroid of that point's cluster.
 
-If this number is high, it can mean that the cluster is widely dispersed. You should review this statistic together with the **Average Distance to Cluster Center** to determine the clusters spread.
+If this number is high, it can mean that the cluster is widely dispersed. You should review this statistic together with the **Average Distance to Cluster Center** to determine the cluster's spread.
 
 - The **Combined Evaluation** score at the bottom of each section of results lists the averaged scores for the clusters created in that particular model.
 
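The clustering statistics corrected in this hunk, distances from each point to the centroid of its assigned cluster, summarized per cluster, can be sketched in a few lines. A minimal sketch, assuming scikit-learn and NumPy; the blob data, the KMeans model, and the plain Euclidean aggregation are assumptions for illustration, since the excerpt does not specify the module's exact implementation.

```python
# Per-cluster distance statistics in the spirit of the metrics above.
# KMeans, the synthetic blobs, and Euclidean distance are illustrative
# assumptions, not the Evaluate Model module's exact implementation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Distance from each point to the centroid of that point's cluster.
dists = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

for k in range(km.n_clusters):
    d = dists[km.labels_ == k]
    # A max that is large relative to the average suggests a widely
    # dispersed cluster, as the note on cluster spread above describes.
    print(f"cluster {k}: n={d.size}, "
          f"max distance={d.max():.3f}, average distance={d.mean():.3f}")
```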