|
1 | 1 | {
|
2 | 2 | "cells": [
|
3 | 3 | {
|
| 4 | + "attachments": {}, |
4 | 5 | "cell_type": "markdown",
|
5 | 6 | "id": "708b18ae-c2ed-4299-9ddd-8375428521fb",
|
6 | 7 | "metadata": {
|
|
15 | 16 | "\n",
|
16 | 17 | "<!-- THE MARKDOWN (.md) FILE IS GENERATED FROM THE NOTEBOOK (.ipynb) FILE -->\n",
|
17 | 18 | "\n",
|
18 |
| - "In this quickstart guide, you will install Substratus into a Google Cloud project. You will then explore how Substratus can be used to build and deploy Open Source LLMs.\n", |
| 19 | + "In this quickstart guide, you will install Substratus into a Google Cloud Platform project. Then you'll explore how Substratus can be used to build and deploy Open Source LLMs.\n", |
19 | 20 | "\n",
|
20 | 21 | "NOTE: Support for AWS ([GitHub Issue #12](https://github.com/substratusai/substratus/issues/12)) and Azure ([GitHub Issue #63](https://github.com/substratusai/substratus/issues/63)) is planned. Give those issues a thumbs up if you would like to see them prioritized.\n",
|
21 | 22 | "\n",
|
|
27 | 28 | ]
|
28 | 29 | },
|
29 | 30 | {
|
| 31 | + "attachments": {}, |
30 | 32 | "cell_type": "markdown",
|
31 | 33 | "id": "3d69245f-c46c-4a2a-9ea3-8762eac462e8",
|
32 | 34 | "metadata": {},
|
|
87 | 89 | ]
|
88 | 90 | },
|
89 | 91 | {
|
| 92 | + "attachments": {}, |
90 | 93 | "cell_type": "markdown",
|
91 | 94 | "id": "c58c96c7-b086-49ec-9963-e8ad6ff01d62",
|
92 | 95 | "metadata": {},
|
|
106 | 109 | "outputs": [],
|
107 | 110 | "source": [
|
108 | 111 | "!docker run -it \\\n",
|
109 |
| - " -v $HOME/.kube:/root/.kube \\\n", |
| 112 | + " -v ${HOME}/.kube:/root/.kube \\\n", |
110 | 113 | " -e PROJECT=$(gcloud config get project) \\\n",
|
111 | 114 | " -e TOKEN=$(gcloud auth print-access-token) \\\n",
|
112 | 115 | " substratusai/installer gcp-up.sh"
|
113 | 116 | ]
|
114 | 117 | },
|
115 | 118 | {
|
| 119 | + "attachments": {}, |
116 | 120 | "cell_type": "markdown",
|
117 | 121 | "id": "54555fb1-8bdf-47b0-b97c-177b3db4ddad",
|
118 | 122 | "metadata": {},
|
119 | 123 | "source": [
|
120 |
| - "Your kubectl command should now be pointing at the substratus cluster." |
| 124 | + "`kubectl` should now be pointing at the Substratus cluster." |
121 | 125 | ]
|
122 | 126 | },
|
123 | 127 | {
|
| 128 | + "attachments": {}, |
124 | 129 | "cell_type": "markdown",
|
125 | 130 | "id": "e06320d8-daf9-4199-bf74-ac24129ebeda",
|
126 | 131 | "metadata": {},
|
127 | 132 | "source": [
|
128 | 133 | "## Build and Deploy an Open Source Model\n",
|
129 | 134 | "\n",
|
130 |
| - "To keep things quick, a small model (125 million parameters) will be used." |
| 135 | + "To keep this quick, we'll use a small model (125 million parameters)." |
131 | 136 | ]
|
132 | 137 | },
|
133 | 138 | {
|
|
141 | 146 | ]
|
142 | 147 | },
|
143 | 148 | {
|
| 149 | + "attachments": {}, |
144 | 150 | "cell_type": "markdown",
|
145 | 151 | "id": "f6288b3b-112b-4a2f-afad-cd1728263d3c",
|
146 | 152 | "metadata": {},
|
147 | 153 | "source": [
|
148 |
| - "A container build process is now running in the Substratus cluster. You can declare that you would like the built model to be deployed by applying a ModelServer manifest." |
| 154 | + "A container build process is now running in the Substratus cluster. Let's also deploy the built model by applying a ModelServer manifest. The ModelServer should start serving shortly after the Model build finishes (~3 minutes)." |
149 | 155 | ]
|
150 | 156 | },
|
151 | 157 | {
|
|
159 | 165 | ]
|
160 | 166 | },
|
161 | 167 | {
|
| 168 | + "attachments": {}, |
162 | 169 | "cell_type": "markdown",
|
163 | 170 | "id": "f57db76a-ad3d-42c4-8612-dbab31b8d9c6",
|
164 | 171 | "metadata": {},
|
|
177 | 184 | ]
|
178 | 185 | },
|
179 | 186 | {
|
| 187 | + "attachments": {}, |
180 | 188 | "cell_type": "markdown",
|
181 | 189 | "id": "f8408d23-92c6-435d-a510-a5e0bae0ed01",
|
182 | 190 | "metadata": {},
|
183 | 191 | "source": [
|
184 |
| - "When the ModelServer is reporting a `Ready` status, proceed to the next section to test it out." |
| 192 | + "When the ModelServer reports a `Ready` status, proceed to the next section to test it out." |
185 | 193 | ]
|
186 | 194 | },
|
187 | 195 | {
|
| 196 | + "attachments": {}, |
188 | 197 | "cell_type": "markdown",
|
189 | 198 | "id": "239ef216-9318-4679-83f0-98fcbed2a32c",
|
190 | 199 | "metadata": {
|
|
193 | 202 | "source": [
|
194 | 203 | "## Testing out the Model Server\n",
|
195 | 204 | "\n",
|
196 |
| - "The way every company chooses to expose a model will be different. In most cases models are integrated into other business applications and are rarely exposed directly to the internet. Substratus will only serve the model within the Kubernetes cluster (with a Kubernetes [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object). The choice of how to expose the model to your users is up to you.\n", |
| 205 | + "The way every company chooses to expose a model will be different. In most cases models are integrated into other business applications and are rarely exposed directly to the Internet. By default, Substratus will only serve the model within the Kubernetes cluster (with a Kubernetes [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object). From here, it's up to you to expose the model to a wider network (e.g., the internal VPC network or the Internet) via annotated Service or Ingress objects.\n", |
197 | 206 | "\n",
|
198 | 207 | "In order to access the model for exploratory purposes, forward ports from within the cluster to your local machine."
|
199 | 208 | ]
|
|
209 | 218 | ]
|
210 | 219 | },
|
211 | 220 | {
|
| 221 | + "attachments": {}, |
212 | 222 | "cell_type": "markdown",
|
213 | 223 | "id": "8bca86a6-8550-4884-a7fc-83b545ffe278",
|
214 | 224 | "metadata": {},
|
215 | 225 | "source": [
|
216 |
| - "The packaged model server ships with an API (for application integration) and a GUI interface (for debugging). You can now open up your browser at [http://localhost:8080](http://localhost:8080) and talk to your model!" |
| 226 | + "All Substratus ModelServers ship with an API and an interactive frontend. Open up your browser to [http://localhost:8080/](http://localhost:8080/) and talk to your model! Alternatively, request text generation via the HTTP API:" |
217 | 227 | ]
|
218 | 228 | },
|
219 | 229 | {
|
| 230 | + "cell_type": "code", |
| 231 | + "execution_count": null, |
| 232 | + "id": "90825af1", |
| 233 | + "metadata": {}, |
| 234 | + "outputs": [], |
| 235 | + "source": [ |
| 236 | + "! curl http://localhost:8080/v1/completions \\\n", |
| 237 | + " -H \"Content-Type: application/json\" \\\n", |
| 238 | + "    -d '{\"model\": \"facebook-opt-125m\", \"prompt\": \"The quick brown fox \", \"max_tokens\": 30, \"temperature\": 0}'\n", |
| 244 | + "  # choices[0].text will vary between runs; example response:\n", |
| 245 | + "# {\"id\":\"cmpl-2d4c8871b20dc45e6ac98322\",\"object\":\"text_completion\",\"created\":1688628294,\"model\":\"facebook-opt-125m\",\"choices\":[{\"text\":\"I've read Patrick Beut, Richard Eichel, Elliot Gagné\",\"index\":0,\"logprobs\":null,\"finish_reason\":\"length\"}],\"usage\":{\"prompt_tokens\":1,\"completion_tokens\":16,\"total_tokens\":17}}" |
| 246 | + ] |
| 247 | + }, |
| 248 | + { |
| 249 | + "attachments": {}, |
220 | 250 | "cell_type": "markdown",
|
221 | 251 | "id": "95a9966d-5679-4208-9b28-0323aa80cd79",
|
222 | 252 | "metadata": {},
|
|
225 | 255 | ]
|
226 | 256 | },
|
227 | 257 | {
|
| 258 | + "attachments": {}, |
228 | 259 | "cell_type": "markdown",
|
229 | 260 | "id": "cdb508e5-a7e9-4d94-a4d6-f3b4b2cf4e3f",
|
230 | 261 | "metadata": {},
|
|
245 | 276 | ]
|
246 | 277 | },
|
247 | 278 | {
|
| 279 | + "attachments": {}, |
248 | 280 | "cell_type": "markdown",
|
249 | 281 | "id": "c6b78cea-8d41-4319-8769-2fa3ee0e02ff",
|
250 | 282 | "metadata": {},
|
|
266 | 298 | ]
|
267 | 299 | },
|
268 | 300 | {
|
| 301 | + "attachments": {}, |
269 | 302 | "cell_type": "markdown",
|
270 | 303 | "id": "a21c7867-4e73-429c-a8d8-3c0792eb52b3",
|
271 | 304 | "metadata": {},
|
|