This folder contains samples for using OCI Data Science ML Pipelines.
Machine learning pipelines are a crucial component of the modern data science workflow. They help automate the process of building, training, and deploying machine learning models, allowing data scientists to focus on more important tasks such as data exploration and model evaluation.
At a high level, a machine learning pipeline consists of several steps, each of which performs a specific task, working together to complete a workflow. For example, the first step might be data preprocessing, where raw data is cleaned and transformed into a format that can be fed into a machine learning algorithm. The next step might be model training, where the algorithm is trained on the processed data to learn the patterns and relationships within it. Steps can be executed in sequence or in parallel, speeding up the time to complete the workflow.
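The sequencing described above can be illustrated with plain Python. This is only a minimal sketch of the idea, not the OCI pipeline service itself: the step functions (`preprocess`, `train_model_a`, `train_model_b`) are hypothetical placeholders, and a thread pool stands in for the service running independent steps in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical step functions standing in for real pipeline steps.
def preprocess(raw):
    # Clean the raw records: drop empties, normalize case.
    return [r.strip().lower() for r in raw if r.strip()]

def train_model_a(data):
    return {"name": "model_a", "score": len(data)}

def train_model_b(data):
    return {"name": "model_b", "score": sum(len(d) for d in data)}

raw = ["Foo ", "", "BAR"]
data = preprocess(raw)  # sequential step: training depends on its output

# The two training steps are independent, so they can run in parallel.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda f: f(data), [train_model_a, train_model_b]))

# A final sequential step picks the best model from the parallel results.
best = max(results, key=lambda m: m["score"])
```

Only steps with no dependency on each other can be parallelized; the fan-out/fan-in shape above (one preprocessing step, parallel training, one selection step) is the pattern the samples in this folder follow.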
One of the key advantages of using machine learning pipelines is the ability to easily repeat and reproduce the entire workflow. This is important for ensuring the reliability and reproducibility of the results, and for making it easier to experiment with different algorithms and parameters to find the best model for a given problem.
Using pipelines, you can:
- Create an ML pipeline by defining the workflow of its steps.
- Write reusable code for each pipeline step or use existing ML Jobs as steps.
- Execute the pipeline, setting parameters for each run.
- Monitor the execution of the pipeline and review logs output by the steps.
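The define/execute/monitor cycle in the list above can be sketched as a tiny in-process pipeline runner. This is a stand-in to show the concepts only: `run_pipeline`, the step-table shape, and the step callables are all invented for this sketch and are not part of the OCI SDK.

```python
# Minimal sketch of a pipeline runner: each step names its dependencies,
# receives run parameters plus its dependencies' outputs, and emits a log line.
def run_pipeline(steps, params):
    """steps: dict mapping step name -> (list of dependency names, callable)."""
    done, logs = {}, []
    remaining = dict(steps)
    while remaining:
        for name, (deps, fn) in list(remaining.items()):
            if all(d in done for d in deps):  # ready once all deps finished
                done[name] = fn(params, {d: done[d] for d in deps})
                logs.append(f"step {name} finished")
                del remaining[name]
    return done, logs

# Define the workflow: a preprocessing step feeding a training step.
steps = {
    "preprocess": ([], lambda p, deps: [x * p["scale"] for x in [1, 2, 3]]),
    "train": (["preprocess"], lambda p, deps: sum(deps["preprocess"])),
}

# Execute with per-run parameters, then review outputs and logs.
outputs, logs = run_pipeline(steps, params={"scale": 2})
```

In the real service, step code runs in separate compute instances and logs go to the OCI Logging service rather than an in-memory list, but the dependency-driven execution order is the same idea.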
## Available Samples
### Simple pipeline with data sharing between steps
This is a full-featured pipeline with data processing, parallel training of models, evaluation of the models, and deployment of the best one as a real-time Model Deployment. It includes helper functions for passing data between pipeline steps.
The functions use a temporary file on OCI object storage to set and get data between steps in the pipeline.
The functions expect the environment variable DATA_LOCATION to be set to the OCI object storage location to use (don't forget the trailing slash /).
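A minimal sketch of how such set/get helpers could look. The helper names (`set_data`/`get_data`) and the DATA_LOCATION value shown in the comment are illustrative assumptions, and a local temporary directory stands in for OCI object storage so the sketch is runnable anywhere; the real sample reads and writes the object storage location instead.

```python
import json
import os
import tempfile

# In the pipeline, DATA_LOCATION would be an OCI object storage URI with a
# trailing slash, along the lines of (hypothetical values):
#   DATA_LOCATION=oci://<bucket>@<namespace>/pipeline-data/
# Here a local temp directory stands in so the example runs anywhere.
os.environ["DATA_LOCATION"] = tempfile.mkdtemp() + "/"

def set_data(name, value):
    # Serialize the value as JSON under the shared location; the trailing
    # slash on DATA_LOCATION makes simple concatenation work.
    with open(os.environ["DATA_LOCATION"] + name + ".json", "w") as f:
        json.dump(value, f)

def get_data(name):
    # Read back what an earlier step stored under the same name.
    with open(os.environ["DATA_LOCATION"] + name + ".json") as f:
        return json.load(f)

set_data("train_rows", [1, 2, 3])  # called in the producing step
rows = get_data("train_rows")      # called in a later, consuming step
```

Because every step runs in its own compute instance, shared storage addressed by a common environment variable is the simplest way to hand data from one step to the next.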