articles/machine-learning/algorithm-module-reference/evaluate-model.md — 9 additions & 30 deletions
@@ -9,7 +9,7 @@ ms.topic: reference
 author: likebupt
 ms.author: keli19
-ms.date: 02/24/2020
+ms.date: 04/24/2020
 ---
 # Evaluate Model module
@@ -28,36 +28,15 @@ Use this module to measure the accuracy of a trained model. You provide a datase
 > If you are new to model evaluation, we recommend the video series by Dr. Stephen Elston, as part of the [machine learning course](https://blogs.technet.microsoft.com/machinelearning/2015/09/08/new-edx-course-data-science-machine-learning-essentials/) from EdX.

-There are three ways to use the **Evaluate Model** module:
+## How to use Evaluate Model
+
+1. Connect the **Scored dataset** output of the [Score Model](./score-model.md) to the left input port of **Evaluate Model**.

-+ Generate scores over your training data, and evaluate the model based on these scores
-+ Generate scores on the model, but compare those scores to scores on a reserved testing set
-+ Compare scores for two different but related models, using the same set of data
+2. [Optional] Connect the **Scored dataset** output of the [Score Model](./score-model.md) for the second model to the **right-hand** input of **Evaluate Model**. You can easily compare results from two different models on the same data. The two input algorithms should be of the same algorithm type. Or, you might compare scores from two different runs over the same data with different parameters.

-## Use the training data
+   > [!NOTE]
+   > Algorithm type refers to 'Two-class Classification', 'Multi-class Classification', 'Regression', 'Clustering' under 'Machine Learning Algorithms'.

-To evaluate a model, you must connect a dataset that contains a set of input columns and scores. If no other data is available, you can use your original dataset.
-
-1. Connect the **Scored dataset** output of the [Score Model](./score-model.md) to the input of **Evaluate Model**.
-2. Click **Evaluate Model** module, and run the pipeline to generate the evaluation scores.
-
-## Use testing data
-
-A common scenario in machine learning is to separate your original data set into training and testing datasets, using the [Split](./split-data.md) module, or the [Partition and Sample](./partition-and-sample.md) module.
-
-1. Connect the **Scored dataset** output of the [Score Model](score-model.md) to the input of **Evaluate Model**.
-2. Connect the output of the Split Data module that contains the testing data to the right-hand input of **Evaluate Model**.
-3. Click **Evaluate Model** module, and select **Run selected** to generate the evaluation scores.
-
-## Compare scores from two models
-
-You can also connect a second set of scores to **Evaluate Model**. The scores might be a shared evaluation set that has known results, or a set of results from a different model for the same data.
-
-This feature is useful because you can easily compare results from two different models on the same data. Or, you might compare scores from two different runs over the same data with different parameters.
-
-1. Connect the **Scored dataset** output of the [Score Model](score-model.md) to the input of **Evaluate Model**.
-2. Connect the output of the Score Model module for the second model to the right-hand input of **Evaluate Model**.
-3. Submit the pipeline.
+3. Submit the pipeline to generate the evaluation scores.
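For readers who want to see what the revised steps compute, here is a minimal sketch of the same comparison outside the designer, assuming scikit-learn (the module computes its metrics internally, so this is not its implementation; the dataset, the two models, and the metric choices are illustrative): two classifiers of the same algorithm type are scored on one held-out set and compared on standard two-class metrics.

```python
# Hypothetical stand-in for the designer pipeline above: two models of the
# same algorithm type (two-class classification) are scored on the same test
# set (the left-hand and right-hand inputs of Evaluate Model) and compared.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    labels = model.predict(X_test)              # scored labels
    probs = model.predict_proba(X_test)[:, 1]   # scored probabilities
    print(
        f"{name}: accuracy={accuracy_score(y_test, labels):.3f}, "
        f"F1={f1_score(y_test, labels):.3f}, "
        f"AUC={roc_auc_score(y_test, probs):.3f}"
    )
```

Under this sketch, comparing the two printed metric rows plays the same role as reviewing the side-by-side results that **Evaluate Model** produces for its two scored-dataset inputs.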
 ## Results
@@ -134,9 +113,9 @@ The following metrics are reported for evaluating clustering models.
 If the number of data points assigned to clusters is less than the total number of data points available, it means that the data points could not be assigned to a cluster.

-- The scores in the column, **Maximal Distance to Cluster Center**, represent the sum of the distances between each point and the centroid of that point’s cluster.
+- The scores in the column, **Maximal Distance to Cluster Center**, represent the sum of the distances between each point and the centroid of that point's cluster.

-  If this number is high, it can mean that the cluster is widely dispersed. You should review this statistic together with the **Average Distance to Cluster Center** to determine the cluster’s spread.
+  If this number is high, it can mean that the cluster is widely dispersed. You should review this statistic together with the **Average Distance to Cluster Center** to determine the cluster's spread.

 - The **Combined Evaluation** score at the bottom of each section of results lists the averaged scores for the clusters created in that particular model.
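To make the distance statistics concrete, here is a small sketch, assuming scikit-learn's KMeans as a stand-in for the designer's clustering modules (the module computes these columns internally, and its exact aggregation, such as the sum described above for the **Maximal Distance** column, may differ; the cluster count and dataset are illustrative):

```python
# Illustrative distance-to-centroid statistics for a fitted clustering model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Distance from each point to the centroid of its assigned cluster.
assigned_centroids = kmeans.cluster_centers_[kmeans.labels_]
distances = np.linalg.norm(X - assigned_centroids, axis=1)

for k in range(kmeans.n_clusters):
    d = distances[kmeans.labels_ == k]
    # A max that is large relative to the average suggests a dispersed cluster.
    print(
        f"cluster {k}: points={d.size}, "
        f"average distance to center={d.mean():.3f}, "
        f"max distance to center={d.max():.3f}"
    )

# One way to form a combined score: average the per-point distances overall.
print(f"combined evaluation (overall average distance): {distances.mean():.3f}")
```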