Commit 50014a1

add miss lifecycle crd yaml (#312)
* add miss lifecycle crd yaml
* add miss api docs
1 parent 195729a commit 50014a1

3 files changed: +345 -196 lines changed

config/crd/bases/inference.llmaz.io_backendruntimes.yaml

+220
@@ -169,6 +169,226 @@ spec:
                   Image represents the default image registry of the backendRuntime.
                   It will work together with version to make up a real image.
                 type: string
+              lifecycle:
+                description: Lifecycle represents hooks executed during the lifecycle
+                  of the container.
+                properties:
+                  postStart:
+                    description: |-
+                      PostStart is called immediately after a container is created. If the handler fails,
+                      the container is terminated and restarted according to its restart policy.
+                      Other management of the container blocks until the hook completes.
+                      More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
+                    properties:
+                      exec:
+                        description: Exec specifies a command to execute in the container.
+                        properties:
+                          command:
+                            description: |-
+                              Command is the command line to execute inside the container, the working directory for the
+                              command is root ('/') in the container's filesystem. The command is simply exec'd, it is
+                              not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use
+                              a shell, you need to explicitly call out to that shell.
+                              Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
+                            items:
+                              type: string
+                            type: array
+                            x-kubernetes-list-type: atomic
+                        type: object
+                      httpGet:
+                        description: HTTPGet specifies an HTTP GET request to perform.
+                        properties:
+                          host:
+                            description: |-
+                              Host name to connect to, defaults to the pod IP. You probably want to set
+                              "Host" in httpHeaders instead.
+                            type: string
+                          httpHeaders:
+                            description: Custom headers to set in the request. HTTP
+                              allows repeated headers.
+                            items:
+                              description: HTTPHeader describes a custom header to
+                                be used in HTTP probes
+                              properties:
+                                name:
+                                  description: |-
+                                    The header field name.
+                                    This will be canonicalized upon output, so case-variant names will be understood as the same header.
+                                  type: string
+                                value:
+                                  description: The header field value
+                                  type: string
+                              required:
+                              - name
+                              - value
+                              type: object
+                            type: array
+                            x-kubernetes-list-type: atomic
+                          path:
+                            description: Path to access on the HTTP server.
+                            type: string
+                          port:
+                            anyOf:
+                            - type: integer
+                            - type: string
+                            description: |-
+                              Name or number of the port to access on the container.
+                              Number must be in the range 1 to 65535.
+                              Name must be an IANA_SVC_NAME.
+                            x-kubernetes-int-or-string: true
+                          scheme:
+                            description: |-
+                              Scheme to use for connecting to the host.
+                              Defaults to HTTP.
+                            type: string
+                        required:
+                        - port
+                        type: object
+                      sleep:
+                        description: Sleep represents a duration that the container
+                          should sleep.
+                        properties:
+                          seconds:
+                            description: Seconds is the number of seconds to sleep.
+                            format: int64
+                            type: integer
+                        required:
+                        - seconds
+                        type: object
+                      tcpSocket:
+                        description: |-
+                          Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept
+                          for backward compatibility. There is no validation of this field and
+                          lifecycle hooks will fail at runtime when it is specified.
+                        properties:
+                          host:
+                            description: 'Optional: Host name to connect to, defaults
+                              to the pod IP.'
+                            type: string
+                          port:
+                            anyOf:
+                            - type: integer
+                            - type: string
+                            description: |-
+                              Number or name of the port to access on the container.
+                              Number must be in the range 1 to 65535.
+                              Name must be an IANA_SVC_NAME.
+                            x-kubernetes-int-or-string: true
+                        required:
+                        - port
+                        type: object
+                    type: object
+                  preStop:
+                    description: |-
+                      PreStop is called immediately before a container is terminated due to an
+                      API request or management event such as liveness/startup probe failure,
+                      preemption, resource contention, etc. The handler is not called if the
+                      container crashes or exits. The Pod's termination grace period countdown begins before the
+                      PreStop hook is executed. Regardless of the outcome of the handler, the
+                      container will eventually terminate within the Pod's termination grace
+                      period (unless delayed by finalizers). Other management of the container blocks until the hook completes
+                      or until the termination grace period is reached.
+                      More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
+                    properties:
+                      exec:
+                        description: Exec specifies a command to execute in the container.
+                        properties:
+                          command:
+                            description: |-
+                              Command is the command line to execute inside the container, the working directory for the
+                              command is root ('/') in the container's filesystem. The command is simply exec'd, it is
+                              not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use
+                              a shell, you need to explicitly call out to that shell.
+                              Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
+                            items:
+                              type: string
+                            type: array
+                            x-kubernetes-list-type: atomic
+                        type: object
+                      httpGet:
+                        description: HTTPGet specifies an HTTP GET request to perform.
+                        properties:
+                          host:
+                            description: |-
+                              Host name to connect to, defaults to the pod IP. You probably want to set
+                              "Host" in httpHeaders instead.
+                            type: string
+                          httpHeaders:
+                            description: Custom headers to set in the request. HTTP
+                              allows repeated headers.
+                            items:
+                              description: HTTPHeader describes a custom header to
+                                be used in HTTP probes
+                              properties:
+                                name:
+                                  description: |-
+                                    The header field name.
+                                    This will be canonicalized upon output, so case-variant names will be understood as the same header.
+                                  type: string
+                                value:
+                                  description: The header field value
+                                  type: string
+                              required:
+                              - name
+                              - value
+                              type: object
+                            type: array
+                            x-kubernetes-list-type: atomic
+                          path:
+                            description: Path to access on the HTTP server.
+                            type: string
+                          port:
+                            anyOf:
+                            - type: integer
+                            - type: string
+                            description: |-
+                              Name or number of the port to access on the container.
+                              Number must be in the range 1 to 65535.
+                              Name must be an IANA_SVC_NAME.
+                            x-kubernetes-int-or-string: true
+                          scheme:
+                            description: |-
+                              Scheme to use for connecting to the host.
+                              Defaults to HTTP.
+                            type: string
+                        required:
+                        - port
+                        type: object
+                      sleep:
+                        description: Sleep represents a duration that the container
+                          should sleep.
+                        properties:
+                          seconds:
+                            description: Seconds is the number of seconds to sleep.
+                            format: int64
+                            type: integer
+                        required:
+                        - seconds
+                        type: object
+                      tcpSocket:
+                        description: |-
+                          Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept
+                          for backward compatibility. There is no validation of this field and
+                          lifecycle hooks will fail at runtime when it is specified.
+                        properties:
+                          host:
+                            description: 'Optional: Host name to connect to, defaults
+                              to the pod IP.'
+                            type: string
+                          port:
+                            anyOf:
+                            - type: integer
+                            - type: string
+                            description: |-
+                              Number or name of the port to access on the container.
+                              Number must be in the range 1 to 65535.
+                              Name must be an IANA_SVC_NAME.
+                            x-kubernetes-int-or-string: true
+                        required:
+                        - port
+                        type: object
+                    type: object
+                type: object
               livenessProbe:
                 description: |-
                   Periodic probe of backend liveness.
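
The lifecycle schema added in this hunk mirrors the Kubernetes container lifecycle handlers (exec, httpGet, sleep, and the deprecated tcpSocket). As a rough sketch of how it might be used, the fragment below sets a postStart exec hook and a preStop sleep hook on a BackendRuntime. The apiVersion, resource name, image, and hook values are assumptions for illustration, and any other spec fields the CRD may require are omitted.

```yaml
# Illustrative fragment only: names and values are hypothetical,
# and other BackendRuntime spec fields are left out.
apiVersion: inference.llmaz.io/v1alpha1   # assumed version for this group
kind: BackendRuntime
metadata:
  name: example-runtime                   # hypothetical name
spec:
  image: example.com/inference-server     # hypothetical image registry
  lifecycle:
    postStart:
      exec:
        # The command is exec'd directly, not run in a shell,
        # so a shell is invoked explicitly here.
        command: ["/bin/sh", "-c", "echo started > /tmp/started"]
    preStop:
      sleep:
        # Pause before termination; still bounded by the Pod's
        # termination grace period.
        seconds: 10
```

Each hook in this sketch sets a single handler. tcpSocket is avoided because the schema description states that it fails at runtime when specified.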

docs/reference/core.v1alpha1.md

+5 -5
@@ -74,19 +74,19 @@ in autoscaling. Right now, it will be used in two places:</p>
 <p>Name represents the flavor name, which will be used in model claim.</p>
 </td>
 </tr>
-<tr><td><code>requests</code><br/>
+<tr><td><code>limits</code><br/>
 <a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#resourcelist-v1-core"><code>k8s.io/api/core/v1.ResourceList</code></a>
 </td>
 <td>
-<p>Requests defines the required accelerators to serve the model for each replica,
-like &lt;nvidia.com/gpu: 8&gt;. For multi-hosts cases, the requests here indicates
+<p>Limits defines the required accelerators to serve the model for each replica,
+like &lt;nvidia.com/gpu: 8&gt;. For multi-hosts cases, the limits here indicates
 the resource requirements for each replica, usually equals to the TP size.
 Not recommended to set the cpu and memory usage here:</p>
 <ul>
 <li>if using playground, you can define the cpu/mem usage at backendConfig.</li>
 <li>if using inference service, you can define the cpu/mem at the container resources.
-However, if you define the same accelerator requests at playground/service as well,
-the requests will be overwritten by the flavor requests.</li>
+However, if you define the same accelerator resources at playground/service as well,
+the resources will be overwritten by the flavor limit here.</li>
 </ul>
 </td>
 </tr>
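
To illustrate the renamed field, a single flavor entry might look like the sketch below. The flavor name is hypothetical and the enclosing resource is omitted; only the `name` and `limits` fields and the `nvidia.com/gpu: 8` value come from the reference text in this hunk.

```yaml
# Sketch of one flavor entry; the enclosing resource is omitted.
- name: example-flavor       # hypothetical flavor name, referenced in the model claim
  limits:
    nvidia.com/gpu: 8        # accelerators per replica, usually equal to the TP size
```

CPU and memory are left out here, since the reference text recommends setting them at backendConfig (playground) or on the container resources (inference service) instead.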
