Commit 0566968

docs: Update docs
1 parent 931f809 commit 0566968

6 files changed: +577 −473 lines changed

.env.example

Lines changed: 3 additions & 4 deletions

```diff
@@ -1,6 +1,5 @@
-# Hopsworks API config
+# Your Hopsworks API key
 HOPSWORKS_API_KEY=

-# OpenAI API config
-OPENAI_MODEL_ID=gpt-4o-mini
-OPENAI_API_KEY=
+# Your OpenAI API key (Optional - required only for the last LLM lesson)
+OPENAI_API_KEY=
```
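Once `.env` is filled in, these keys presumably reach the code as ordinary environment variables; a minimal shell sketch of loading such a file (the temp path and demo value are stand-ins for illustration, not anything from the repo):

```shell
# Write a throwaway env file for illustration (stand-in value, not a real key)
printf 'HOPSWORKS_API_KEY=demo-key\nOPENAI_API_KEY=\n' > /tmp/demo.env

# Export every variable defined in the file into the current shell
set -a
. /tmp/demo.env
set +a

echo "$HOPSWORKS_API_KEY"
```

`set -a` marks every subsequently assigned variable for export, which is why sourcing the file is enough to make the keys visible to child processes.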

INSTALL_AND_USAGE.md

Lines changed: 24 additions & 13 deletions

````diff
@@ -26,18 +26,29 @@ This will:
 
 # Usage
 
-## Pipeline Components
+## Set environment variables
+
+The first step before running anything is to set up the environment variables required to access Hopsworks and OpenAI's API.
+
+Go to the root of the repository and run:
+```bash
+cp .env.example .env
+```
+
+Open your `.env` file and fill it in as explained in the comments.
+
+## Pipeline components
 
 The project consists of several pipeline components that can be run individually or all at once.
 
-### Running the Complete Pipeline
+### Running all the pipelines at once
 
-To run all pipeline components in sequence:
+To run all the pipelines at once, in sequence, run:
 ```bash
 make all
 ```
 
-This will execute the following steps in order:
+This will execute all the ML pipelines in the following order:
 1. Feature Engineering
 2. Retrieval Model Training
 3. Ranking Model Training
@@ -51,43 +62,41 @@ You can also run each component separately:
 
 ### Individual Pipeline Components
 
-You can also run each component separately:
-
 1. **Feature Engineering**
+Execute the feature engineering notebook (`notebooks/1_fp_computing_features.ipynb`):
 ```bash
 make feature-engineering
 ```
-Executes the feature engineering notebook (`notebooks/1_fp_computing_features.ipynb`)
 
 2. **Train Retrieval Model**
+Execute the retrieval model training notebook (`notebooks/2_tp_training_retrieval_model.ipynb`):
 ```bash
 make train-retrieval
 ```
-Trains the retrieval model using `notebooks/2_tp_training_retrieval_model.ipynb`
 
 3. **Train Ranking Model**
+Execute the ranking model training notebook (`notebooks/3_tp_training_ranking_model.ipynb`):
 ```bash
 make train-ranking
 ```
-Trains the ranking model using `notebooks/3_tp_training_ranking_model.ipynb`
 
 4. **Create Embeddings**
+Execute the embeddings computation notebook (`notebooks/4_fp_computing_item_embeddings.ipynb`):
 ```bash
 make create-embeddings
 ```
-Generates embeddings using `notebooks/4_fp_computing_item_embeddings.ipynb`
 
 5. **Create Deployments**
+Execute the deployments creation notebook (`notebooks/5_ip_creating_deployments.ipynb`):
 ```bash
 make create-deployments
 ```
-Sets up model deployments using `notebooks/5_ip_creating_deployments.ipynb`
 
 6. **Schedule Materialization Jobs**
+Execute the materialization jobs scheduling notebook (`notebooks/6_scheduling_materialization_jobs.ipynb`):
 ```bash
 make schedule-materialization-jobs
 ```
-Schedules materialization jobs using `notebooks/6_scheduling_materialization_jobs.ipynb`
 
 ### Notes
 - All notebooks are executed using IPython through the UV virtual environment
@@ -96,14 +105,16 @@ You can also run each component separately:
 
 ## Run Streamlit app
 
-To launch the Streamlit application that uses the feature store and fine-tuned models, run:
+To launch the Streamlit frontend application that uses the feature store and fine-tuned models, run:
 
 ```bash
 make start-ui
 ```
 
 ## Clean Hopsworks resources
 
+To clean all the Hopsworks resources created by the pipelines, run:
+
 ```bash
 make clean-hopsworks-resources
 ```
````
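For reference, the sequencing that `make all` performs can be sketched as a plain shell loop over the individual targets listed above (the `echo` is a stand-in for the real `make` invocation, so the sketch runs anywhere):

```shell
# Hypothetical sketch of the order in which `make all` runs the pipelines.
for target in feature-engineering train-retrieval train-ranking \
    create-embeddings create-deployments schedule-materialization-jobs; do
    echo "make $target"    # inside the repo this would be: make "$target"
done
```

Running each target by hand in this order is equivalent to `make all`, which is useful when one stage fails and you want to resume from it.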

README.md

Lines changed: 42 additions & 5 deletions

```diff
@@ -10,13 +10,50 @@
 </a>
 </p>
 
+## What will you learn?
+
 The **Hands-on H&M Real-Time Personalized Recommender** is a free course that will teach you how to build and deploy a real-time personalized recommender for H&M fashion articles using the 4-stage recommender architecture, the two-tower model design and the Hopsworks AI Lakehouse.
 
+You will learn:
+
+- building a recommender using the 4-stage recommender architecture
+- training a two-tower model for generating user and item embeddings
+- designing a scalable ML system using the FTI architecture
+- using MLOps best practices such as a feature store and model registry
+- deploying the real-time personalized recommender
+- enhancing recommendations with LLMs
+- implementing an interactive web interface
+
+## Who is this for?
+
+## Costs?
+
+## How will you learn?
+
+## Questions and troubleshooting
+
 ## Lessons
 
-* [Lesson 1: Building a TikTok-like recommender](https://decodingml.substack.com/p/33d3273e-b8e3-4d98-b160-c3d239343022)
-* Lesson 2: The feature pipeline (WIP)
-* Lesson 3: The training pipeline (WIP)
-* Lesson 4: The inference pipeline (WIP)
-* Lesson 5: Building real-time recommenders with LLMs (WIP)
+| Lesson | Title | Description | Local Notebooks | Colab Notebooks |
+|--------|-------|-------------|-----------------|-----------------|
+| 1 | [Building a TikTok-like recommender](https://decodingml.substack.com/p/33d3273e-b8e3-4d98-b160-c3d239343022) | Learn how to architect a recommender system using the 4-stage architecture and two-tower model. | - | - |
+| 2 | The feature pipeline | Learn how to build a scalable feature pipeline (WIP) | [1_fp_computing_features.ipynb](notebooks/1_fp_computing_features.ipynb) | - |
+| 3 | The training pipeline | Learn how to train and evaluate recommendation models (WIP) | [2_tp_training_retrieval_model.ipynb](notebooks/2_tp_training_retrieval_model.ipynb), [3_tp_training_ranking_model.ipynb](notebooks/3_tp_training_ranking_model.ipynb) | - |
+| 4 | The inference pipeline | Learn how to deploy models for real-time inference (WIP) | [4_fp_computing_item_embeddings.ipynb](notebooks/4_fp_computing_item_embeddings.ipynb), [5_ip_creating_deployments.ipynb](notebooks/5_ip_creating_deployments.ipynb) | - |
+| 5 | Building personalized real-time recommenders with LLMs | Learn how to enhance recommendations with LLMs (WIP) | - | - |
+
+## Folder structure
+
+## Install and usage
+
+To understand how to install and run the code, go to the dedicated [INSTALL_AND_USAGE](INSTALL_AND_USAGE.md) document.
+
+> [!NOTE]
+> Even though you can run everything solely using the INSTALL_AND_USAGE document, we recommend that you read the articles to fully understand how the personalized recommender works.
+
+## License
+
+This course is an open-source project released under the Apache-2.0 license. Thus, as long as you distribute our LICENSE and acknowledge that your project is based on our work, you can safely clone or fork this project and use it as a source of inspiration for your educational projects (e.g., university, college degree, personal projects, etc.).
+
+## Sponsors
 
```
