Commit dbc9e8e

Minor readability changes (substratusai#16)
* Minor readability changes
* refined to use the hf-based facebook model
1 parent eb0d56f commit dbc9e8e

6 files changed: +262 -333 lines changed

README.md

Lines changed: 18 additions & 21 deletions
Original file line number | Diff line number | Diff line change
@@ -3,73 +3,71 @@
33
This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.
44
You can view the live website here: [https://substratus.ai](https://substratus.ai)
55

6-
### Installation
6+
## Installation
77

8-
```
8+
```bash
99
yarn
1010
```
1111

12-
### Local Development
12+
## Local Development
1313

14-
```
14+
```bash
1515
yarn start
16-
# npm start
1716
```
1817

1918
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
2019

21-
#### Notebooks
20+
### Notebooks
2221

2322
A lot of the documents on this website are generated from Jupyter Notebooks. This allows for testing documentation.
2423

2524
To edit the notebook files, you can either start a notebook (see below) or use VSCode which can edit notebooks directly.
2625

27-
```
26+
```bash
2827
npm run notebook
2928
```
3029

3130
Convert the notebook files to markdown.
3231

33-
34-
```
32+
```bash
3533
npm run convert-notebooks
3634
```
3735

3836
You can clear notebook outputs:
3937

40-
```
38+
```bash
4139
npm run clear-notebooks
4240
```
4341

44-
### Build
42+
## Build
4543

46-
```
44+
```bash
4745
yarn build
4846
```
4947

5048
This command generates static content into the `build` directory and can be served using any static contents hosting service.
5149

52-
### Deployment
50+
## Deployment
5351

5452
Using SSH:
5553

56-
```
54+
```bash
5755
USE_SSH=true yarn deploy
5856
```
5957

6058
Not using SSH:
6159

62-
```
60+
```bash
6361
GIT_USER=<Your GitHub username> yarn deploy
6462
```
6563

6664
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.
6765

68-
### Assets
66+
## Assets
6967

70-
#### Main Icon
68+
### Main Icon
7169

72-
Source: https://favicon.io/favicon-generator/
70+
Source: <https://favicon.io/favicon-generator/>
7371

7472
See: `./scratch/favicon_io/`
7573

@@ -78,13 +76,12 @@ Settings:
7876
* Letter: "S" (capital)
7977
* Font: Saira Stencil One
8078

81-
#### General Icons
79+
### General Icons
8280

83-
Source: https://fonts.google.com/icons
81+
Source: <https://fonts.google.com/icons>
8482

8583
Settings:
8684

8785
* Weight: 100 (min)
8886
* Grade: 0 (middle)
8987
* Optical Size: 48 (max)
90-

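The README's `npm run clear-notebooks` step is not shown in this commit, so the following is only a minimal sketch of what clearing notebook outputs typically means, assuming the standard notebook JSON layout visible in `docs/quickstart.ipynb` (`cells`, `cell_type`, `outputs`, `execution_count`). The function name `clear_outputs` is hypothetical and is not part of the repository.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed.

    Hypothetical helper: the repository's real clear-notebooks script
    is not part of this diff and may work differently.
    """
    # A JSON round-trip gives a deep copy, so the caller's dict is untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny in-memory notebook used only for illustration.
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["## Build\n"]},
        {
            "cell_type": "code",
            "execution_count": 7,
            "outputs": [{"output_type": "stream", "text": ["done\n"]}],
            "source": ["!kubectl get ai"],
        },
    ]
}

print(json.dumps(clear_outputs(demo), indent=1))
```

Markdown cells pass through unchanged, which is consistent with the notebook in this commit, where every code cell already carries `"outputs": []` and `"execution_count": null`.
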
docs/introduction.md

Lines changed: 3 additions & 4 deletions
@@ -5,13 +5,13 @@ slug: "/"
55

66
# Introduction
77

8-
Substratus is a cross cloud substrate for training and serving ML models. Substratus extends the Kubernetes control plane to orchestrate ML operations through the addition of new API endpoints: Model, ModelServer, Dataset, and Notebook.
8+
Substratus is a cross-cloud substrate for training and serving ML models. Substratus extends the Kubernetes control plane to orchestrate ML operations through the addition of new API endpoints: Model, ModelServer, Dataset, and Notebook.
99

1010
## Why Substratus?
1111

12-
* Train and serve models from within your cloud account. Your data stays private.
12+
* Train and serve models from within your cloud account, on a portable platform. Your data stays private.
1313
* Leverage containers to avoid library lock-in and dependency wrangling.
14-
* Let substratus calculate your resource requirements and automatically provision GPUs, CPUs, and Memory.
14+
* Let substratus calculate your resource requirements and automatically provision GPUs, CPUs, Storage, and Memory.
1515
* Adopt best practice conventions by default.
1616
* Train pre-packaged state of the art models on your own datasets.
1717
* Leverage GitOps out of the box.
@@ -23,4 +23,3 @@ Substratus is a cross cloud substrate for training and serving ML models. Substr
2323
<div class="video-container">
2424
<iframe class="video" src="https://www.youtube.com/embed/dQw4w9WgXcQ" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
2525
</div>
26-

docs/quickstart.ipynb

Lines changed: 41 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
{
22
"cells": [
33
{
4+
"attachments": {},
45
"cell_type": "markdown",
56
"id": "708b18ae-c2ed-4299-9ddd-8375428521fb",
67
"metadata": {
@@ -15,7 +16,7 @@
1516
"\n",
1617
"<!-- THE MARKDOWN (.md) FILE IS GENERATED FROM THE NOTEBOOK (.ipynb) FILE -->\n",
1718
"\n",
18-
"In this quickstart guide, you will install Substratus into a Google Cloud project. You will then explore how Substratus can be used to build and deploy Open Source LLMs.\n",
19+
"In this quickstart guide, you will install Substratus into a Google Cloud Platform project. Then you'll explore how Substratus can be used to build and deploy Open Source LLMs.\n",
1920
"\n",
2021
"NOTE: Support for AWS ([GitHub Issue #12](https://github.com/substratusai/substratus/issues/12)) and Azure ([GitHub Issue #63](https://github.com/substratusai/substratus/issues/63)) is planned. Give those issues a thumbs up if you would like to see them prioritized.\n",
2122
"\n",
@@ -27,6 +28,7 @@
2728
]
2829
},
2930
{
31+
"attachments": {},
3032
"cell_type": "markdown",
3133
"id": "3d69245f-c46c-4a2a-9ea3-8762eac462e8",
3234
"metadata": {},
@@ -87,6 +89,7 @@
8789
]
8890
},
8991
{
92+
"attachments": {},
9093
"cell_type": "markdown",
9194
"id": "c58c96c7-b086-49ec-9963-e8ad6ff01d62",
9295
"metadata": {},
@@ -106,28 +109,30 @@
106109
"outputs": [],
107110
"source": [
108111
"!docker run -it \\\n",
109-
" -v $HOME/.kube:/root/.kube \\\n",
112+
" -v ${HOME}/.kube:/root/.kube \\\n",
110113
" -e PROJECT=$(gcloud config get project) \\\n",
111114
" -e TOKEN=$(gcloud auth print-access-token) \\\n",
112115
" substratusai/installer gcp-up.sh"
113116
]
114117
},
115118
{
119+
"attachments": {},
116120
"cell_type": "markdown",
117121
"id": "54555fb1-8bdf-47b0-b97c-177b3db4ddad",
118122
"metadata": {},
119123
"source": [
120-
"Your kubectl command should now be pointing at the substratus cluster."
124+
"`kubectl` should now be pointing at the substratus cluster."
121125
]
122126
},
123127
{
128+
"attachments": {},
124129
"cell_type": "markdown",
125130
"id": "e06320d8-daf9-4199-bf74-ac24129ebeda",
126131
"metadata": {},
127132
"source": [
128133
"## Build and Deploy an Open Source Model\n",
129134
"\n",
130-
"To keep things quick, a small model (125 million parameters) will be used."
135+
"To keep this quick, we'll use a small model (125 million parameters)."
131136
]
132137
},
133138
{
@@ -141,11 +146,12 @@
141146
]
142147
},
143148
{
149+
"attachments": {},
144150
"cell_type": "markdown",
145151
"id": "f6288b3b-112b-4a2f-afad-cd1728263d3c",
146152
"metadata": {},
147153
"source": [
148-
"A container build process is now running in the Substratus cluster. You can declare that you would like the built model to be deployed by applying a ModelServer manifest."
154+
"A container build process is now running in the Substratus cluster. Let's also deploy the built model by applying a ModelServer manifest. ModelServer should start serving shortly after the Model build finishes (~3 minutes)."
149155
]
150156
},
151157
{
@@ -159,6 +165,7 @@
159165
]
160166
},
161167
{
168+
"attachments": {},
162169
"cell_type": "markdown",
163170
"id": "f57db76a-ad3d-42c4-8612-dbab31b8d9c6",
164171
"metadata": {},
@@ -177,14 +184,16 @@
177184
]
178185
},
179186
{
187+
"attachments": {},
180188
"cell_type": "markdown",
181189
"id": "f8408d23-92c6-435d-a510-a5e0bae0ed01",
182190
"metadata": {},
183191
"source": [
184-
"When the ModelServer is reporting a `Ready` status, proceed to the next section to test it out."
192+
"When the ModelServer reports a `Ready` status, proceed to the next section to test it out."
185193
]
186194
},
187195
{
196+
"attachments": {},
188197
"cell_type": "markdown",
189198
"id": "239ef216-9318-4679-83f0-98fcbed2a32c",
190199
"metadata": {
@@ -193,7 +202,7 @@
193202
"source": [
194203
"## Testing out the Model Server\n",
195204
"\n",
196-
"The way every company chooses to expose a model will be different. In most cases models are integrated into other business applications and are rarely exposed directly to the internet. Substratus will only serve the model within the Kubernetes cluster (with a Kubernetes [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object). The choice of how to expose the model to your users is up to you.\n",
205+
"The way every company chooses to expose a model will be different. In most cases models are integrated into other business applications and are rarely exposed directly to the Internet. By default, substratus will only serve the model within the Kubernetes cluster (with a Kubernetes [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object). From here, it's up to you to expose the model to a wider network (e.g., the internal VPC network or the Internet) via annotated Service or Ingress objects.\n",
197206
"\n",
198207
"In order to access the model for exploratory purposes, forward ports from within the cluster to your local machine."
199208
]
@@ -209,14 +218,35 @@
209218
]
210219
},
211220
{
221+
"attachments": {},
212222
"cell_type": "markdown",
213223
"id": "8bca86a6-8550-4884-a7fc-83b545ffe278",
214224
"metadata": {},
215225
"source": [
216-
"The packaged model server ships with an API (for application integration) and a GUI interface (for debugging). You can now open up your browser at [http://localhost:8080](http://localhost:8080) and talk to your model!"
226+
"All substratus ModelServers ship with an API and interactive frontend. Open up your browser to [http://localhost:8080/](http://localhost:8080/) and talk to your model! Alternatively, request text generation via the HTTP API:"
217227
]
218228
},
219229
{
230+
"cell_type": "code",
231+
"execution_count": null,
232+
"id": "90825af1",
233+
"metadata": {},
234+
"outputs": [],
235+
"source": [
236+
"! curl http://localhost:8080/v1/completions \\\n",
237+
" -H \"Content-Type: application/json\" \\\n",
238+
" -d '{\n",
239+
" \"model\": \"facebook-opt-125m\",\n",
240+
" \"prompt\": \"The quick brown fox \",\n",
241+
" \"max_tokens\": 30,\n",
242+
" \"temperature\": 0\n",
243+
" }'\n",
244+
" # choices[0].text will vary\n",
245+
"{\"id\":\"cmpl-2d4c8871b20dc45e6ac98322\",\"object\":\"text_completion\",\"created\":1688628294,\"model\":\"facebook-opt-125m\",\"choices\":[{\"text\":\"I've read Patrick Beut, Richard Eichel, Elliot Gagné\",\"index\":0,\"logprobs\":null,\"finish_reason\":\"length\"}],\"usage\":{\"prompt_tokens\":1,\"completion_tokens\":16,\"total_tokens\":17}}"
246+
]
247+
},
248+
{
249+
"attachments": {},
220250
"cell_type": "markdown",
221251
"id": "95a9966d-5679-4208-9b28-0323aa80cd79",
222252
"metadata": {},
@@ -225,6 +255,7 @@
225255
]
226256
},
227257
{
258+
"attachments": {},
228259
"cell_type": "markdown",
229260
"id": "cdb508e5-a7e9-4d94-a4d6-f3b4b2cf4e3f",
230261
"metadata": {},
@@ -245,6 +276,7 @@
245276
]
246277
},
247278
{
279+
"attachments": {},
248280
"cell_type": "markdown",
249281
"id": "c6b78cea-8d41-4319-8769-2fa3ee0e02ff",
250282
"metadata": {},
@@ -266,6 +298,7 @@
266298
]
267299
},
268300
{
301+
"attachments": {},
269302
"cell_type": "markdown",
270303
"id": "a21c7867-4e73-429c-a8d8-3c0792eb52b3",
271304
"metadata": {},

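Comparing this notebook diff with the `docs/quickstart.md` diff suggests how cells map to the generated markdown: markdown cells are copied as-is, while code cells that start with the IPython shell escape `!` appear as `bash` fences without the `!`. The sketch below reproduces that mapping for a single cell. It is an assumption based only on the two diffs; the actual `npm run convert-notebooks` script is not shown here, and `cell_to_markdown` is a hypothetical name.

```python
# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def cell_to_markdown(cell: dict) -> str:
    """Render one notebook cell as markdown text (illustrative only)."""
    source = "".join(cell.get("source", []))
    if cell.get("cell_type") != "code":
        # Markdown cells are emitted unchanged.
        return source
    language = "python"
    if source.lstrip().startswith("!"):
        # IPython shell escape: drop the "!" and any whitespace after it.
        source = source.lstrip()[1:].lstrip()
        language = "bash"
    return f"{FENCE}{language}\n{source.rstrip()}\n{FENCE}"


shell_cell = {
    "cell_type": "code",
    "source": [
        "!docker run -it \\\n",
        "  substratusai/installer gcp-up.sh",
    ],
}
print(cell_to_markdown(shell_cell))
```

Stripping whitespace after the `!` covers both spellings used in the notebook (`!docker run` and `! curl`).
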
docs/quickstart.md

Lines changed: 22 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ sidebar_position: 2
66

77
<!-- THE MARKDOWN (.md) FILE IS GENERATED FROM THE NOTEBOOK (.ipynb) FILE -->
88

9-
In this quickstart guide, you will install Substratus into a Google Cloud project. You will then explore how Substratus can be used to build and deploy Open Source LLMs.
9+
In this quickstart guide, you will install Substratus into a Google Cloud Platform project. Then you'll explore how Substratus can be used to build and deploy Open Source LLMs.
1010

1111
NOTE: Support for AWS ([GitHub Issue #12](https://github.com/substratusai/substratus/issues/12)) and Azure ([GitHub Issue #63](https://github.com/substratusai/substratus/issues/63)) is planned. Give those issues a thumbs up if you would like to see them prioritized.
1212

@@ -49,24 +49,24 @@ Create a substratus GKE cluster along with supporting infrastructure (buckets, s
4949

5050
```bash
5151
docker run -it \
52-
-v $HOME/.kube:/root/.kube \
52+
-v ${HOME}/.kube:/root/.kube \
5353
-e PROJECT=$(gcloud config get project) \
5454
-e TOKEN=$(gcloud auth print-access-token) \
5555
substratusai/installer gcp-up.sh
5656
```
5757

58-
Your kubectl command should now be pointing at the substratus cluster.
58+
`kubectl` should now be pointing at the substratus cluster.
5959

6060
## Build and Deploy an Open Source Model
6161

62-
To keep things quick, a small model (125 million parameters) will be used.
62+
To keep this quick, we'll use a small model (125 million parameters).
6363

6464

6565
```bash
6666
kubectl apply -f https://raw.githubusercontent.com/substratusai/substratus/main/examples/facebook-opt-125m/model.yaml
6767
```
6868

69-
A container build process is now running in the Substratus cluster. You can declare that you would like the built model to be deployed by applying a ModelServer manifest.
69+
A container build process is now running in the Substratus cluster. Let's also deploy the built model by applying a ModelServer manifest. ModelServer should start serving shortly after the Model build finishes (~3 minutes).
7070

7171

7272
```bash
@@ -80,11 +80,11 @@ You can check on the progress of both processes using a single command.
8080
kubectl get ai
8181
```
8282

83-
When the ModelServer is reporting a `Ready` status, proceed to the next section to test it out.
83+
When the ModelServer reports a `Ready` status, proceed to the next section to test it out.
8484

8585
## Testing out the Model Server
8686

87-
The way every company chooses to expose a model will be different. In most cases models are integrated into other business applications and are rarely exposed directly to the internet. Substratus will only serve the model within the Kubernetes cluster (with a Kubernetes [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object). The choice of how to expose the model to your users is up to you.
87+
The way every company chooses to expose a model will be different. In most cases models are integrated into other business applications and are rarely exposed directly to the Internet. By default, substratus will only serve the model within the Kubernetes cluster (with a Kubernetes [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object). From here, it's up to you to expose the model to a wider network (e.g., the internal VPC network or the Internet) via annotated Service or Ingress objects.
8888

8989
In order to access the model for exploratory purposes, forward ports from within the cluster to your local machine.
9090

@@ -93,7 +93,21 @@ In order to access the model for exploratory purposes, forward ports from within
9393
kubectl port-forward service/facebook-opt-125m-modelserver 8080:8080
9494
```
9595

96-
The packaged model server ships with an API (for application integration) and a GUI interface (for debugging). You can now open up your browser at [http://localhost:8080](http://localhost:8080) and talk to your model!
96+
All substratus ModelServers ship with an API and interactive frontend. Open up your browser to [http://localhost:8080/](http://localhost:8080/) and talk to your model! Alternatively, request text generation via the HTTP API:
97+
98+
99+
```bash
100+
curl http://localhost:8080/v1/completions \
101+
-H "Content-Type: application/json" \
102+
-d '{
103+
"model": "facebook-opt-125m",
104+
"prompt": "The quick brown fox ",
105+
"max_tokens": 30,
106+
"temperature": 0
107+
}'
108+
# choices[0].text will vary
109+
{"id":"cmpl-2d4c8871b20dc45e6ac98322","object":"text_completion","created":1688628294,"model":"facebook-opt-125m","choices":[{"text":"I've read Patrick Beut, Richard Eichel, Elliot Gagné","index":0,"logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":1,"completion_tokens":16,"total_tokens":17}}
110+
```
97111

98112
If you are interested in continuing your journey through Substratus, take a look at the [Guided Walkthrough](./category/walkthrough).
99113

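For application integration, the same request can be made from Python. The sketch below uses only the standard library and assumes the port-forward from the quickstart is active on `localhost:8080`. The endpoint path, field names, and response shape are taken from the `curl` example in this diff; the helper names (`build_completion_request`, `first_completion_text`, `complete`) are hypothetical. Building the body with `json.dumps` avoids hand-quoting JSON in a shell string.

```python
import json
import urllib.request

# Assumes `kubectl port-forward service/facebook-opt-125m-modelserver 8080:8080`
# is running, as in the quickstart.
BASE_URL = "http://localhost:8080"


def build_completion_request(prompt, model="facebook-opt-125m",
                             max_tokens=30, temperature=0):
    """Build (but do not send) a POST request for /v1/completions."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_completion_text(response: dict) -> str:
    """Pull choices[0].text out of a decoded completion response."""
    return response["choices"][0]["text"]


def complete(prompt: str) -> str:
    """Send the request and return the generated text (needs a live server)."""
    request = build_completion_request(prompt)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return first_completion_text(json.load(resp))
```

With the port-forward running, `complete("The quick brown fox ")` returns the generated text; the two helper functions can be exercised without a server.
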
docusaurus.config.js

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -112,6 +112,10 @@ const config = {
112112
label: "Stack Overflow",
113113
href: "https://stackoverflow.com/questions/tagged/substratus",
114114
},
115+
{
116+
label: "Discord",
117+
href: "https://discord.gg/RcUShexGu8",
118+
},
115119
],
116120
},
117121
{
