Originally posted by VaishnaviHire March 19, 2025
Hello everyone! Along with @leseb and @rhuss, I would like to initiate a discussion about enhancing Llama Stack’s deployment capabilities by introducing a Kubernetes Operator.
Why the operator?
An Operator can provide several advantages for Llama Stack server deployments:
simplified deployments: through declarative configuration (a CRD)
automated updates: when new versions are released
self-healing capabilities: recovering a failing instance
dynamic scaling
Together, these make Llama Stack easier to run in a Kubernetes environment.
How does it fit into Llama Stack?
The Operator could handle:
Provisioning required resources
Pulling and configuring the appropriate Llama Stack version and distribution
Monitoring server status and performing automated recovery
Scaling resources based on defined parameters
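The responsibilities listed above map onto the usual operator pattern of comparing a declared desired state with the observed state and acting on the difference. The following is only an illustrative sketch of that idea in plain Python, not the prototype mentioned in this proposal: the `DesiredState` and `ObservedState` fields, the `reconcile` function, and the action strings are all hypothetical names chosen for illustration, and no Kubernetes API is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DesiredState:
    """Hypothetical fields a user might declare in a custom resource."""
    distribution: str  # which Llama Stack distribution to run (placeholder)
    version: str       # which version to pull (placeholder)
    replicas: int      # how many server instances should be running


@dataclass(frozen=True)
class ObservedState:
    """Hypothetical snapshot of what is currently running in the cluster."""
    exists: bool
    version: str = ""
    healthy_replicas: int = 0
    total_replicas: int = 0


def reconcile(desired: DesiredState, observed: ObservedState) -> List[str]:
    """Return the actions needed to move the observed state toward the desired one."""
    # Provisioning: nothing is deployed yet, so create everything in one step.
    if not observed.exists:
        return [f"provision {desired.distribution}:{desired.version} x{desired.replicas}"]

    actions: List[str] = []
    # Version and distribution management: roll to the declared version.
    if observed.version != desired.version:
        actions.append(f"update {observed.version} -> {desired.version}")
    # Monitoring and recovery: restart instances that are not healthy.
    unhealthy = observed.total_replicas - observed.healthy_replicas
    if unhealthy > 0:
        actions.append(f"restart {unhealthy} unhealthy instance(s)")
    # Scaling: match the declared replica count.
    if observed.total_replicas != desired.replicas:
        actions.append(f"scale {observed.total_replicas} -> {desired.replicas}")
    return actions


if __name__ == "__main__":
    desired = DesiredState(distribution="example-distro", version="v2", replicas=2)
    observed = ObservedState(exists=True, version="v1", healthy_replicas=1, total_replicas=2)
    for action in reconcile(desired, observed):
        print(action)
```

A real operator would run this kind of comparison repeatedly against the cluster API and apply the changes itself; returning a list of actions here simply keeps the sketch easy to read and test.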
Repository Structure
To keep things modular, we propose hosting this operator in a separate GitHub repository (e.g., llama-stack-k8s-operator). A separate GitHub repository ensures separation of concerns between the Llama Stack server and deployment strategies. It will also provide a dedicated space for Kubernetes-specific issues and contributions.
We are seeking the community's feedback on leveraging Kubernetes operators for efficient deployment and management of Llama Stack server instances. Based on this feedback, we would like to share details of our prototype, including the Custom Resource Definition (CRD) and controller implementation, in subsequent discussions.
As made obvious by my PR linked above, I could not agree more.
I would love to know when this new repository has been created so I can contribute my Helm chart to it. I am also happy to get involved in the development of the operator.
Based on our recent conversations during the community call, this item is in progress. We should get the repo soon. Stay tuned!
Discussed in #1707