Simplifying Llama Stack Deployment with Kubernetes Operator #1814

Open
leseb opened this issue Mar 27, 2025 · 2 comments

@leseb
Collaborator

leseb commented Mar 27, 2025

Discussed in #1707

Originally posted by VaishnaviHire March 19, 2025
Hello everyone! I, along with @leseb and @rhuss, would like to initiate a discussion about enhancing Llama Stack’s deployment capabilities by introducing a Kubernetes Operator.

Why the operator?
An Operator can provide several advantages for Llama Stack server deployments:

  • Simplified deployments: through declarative configuration (CRDs)
  • Automated updates: when new versions are released
  • Self-healing capabilities: recovering a failing instance
  • Dynamic scaling

Together, these make Llama Stack easier to run in a Kubernetes environment.
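
To make "declarative configuration" a bit more concrete, here is a minimal Python sketch of the kind of desired-state document a custom resource might carry, plus the validation an operator would typically apply before acting on it. Every name in it is a hypothetical placeholder chosen for illustration (the `example.com/v1alpha1` API group, the `LlamaStackServer` kind, and the `distribution`, `version`, and `replicas` fields); it is not the prototype's actual CRD schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical desired state for one Llama Stack server deployment."""
    distribution: str  # assumed field: which distribution to run
    version: str       # assumed field: which release to pull
    replicas: int = 1  # assumed field: how many instances to keep running


def parse_custom_resource(resource: dict) -> ServerSpec:
    """Validate a custom-resource-shaped dict and return its typed spec."""
    if resource.get("kind") != "LlamaStackServer":  # hypothetical kind name
        raise ValueError(f"unexpected kind: {resource.get('kind')!r}")
    spec = resource.get("spec", {})
    missing = [key for key in ("distribution", "version") if key not in spec]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    replicas = spec.get("replicas", 1)
    if not isinstance(replicas, int) or replicas < 0:
        raise ValueError("replicas must be a non-negative integer")
    return ServerSpec(spec["distribution"], spec["version"], replicas)


# The user declares the end state only; working out the steps to reach it
# is left to the operator.
example_resource = {
    "apiVersion": "example.com/v1alpha1",  # hypothetical API group
    "kind": "LlamaStackServer",
    "metadata": {"name": "demo"},
    "spec": {"distribution": "example-distro", "version": "0.0.1", "replicas": 2},
}

print(parse_custom_resource(example_resource))
```

The point of the sketch is only that the user-facing surface is a small piece of data describing the desired outcome, which is what makes upgrades and scaling a matter of editing one field.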

How does it fit into Llama Stack?
The Operator could handle:

  • Provisioning required resources
  • Pulling and configuring the appropriate Llama Stack version and distribution
  • Monitoring server status and performing automated recovery
  • Scaling resources based on defined parameters
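
The responsibilities listed above map onto the usual reconcile pattern: compare the desired state with what is observed and derive the actions that close the gap. The Python sketch below illustrates that idea only. The `Desired` and `Observed` types and the action strings are hypothetical, and a real controller would call the Kubernetes API instead of returning text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Desired:
    """Hypothetical desired state taken from the custom resource."""
    image: str
    replicas: int


@dataclass(frozen=True)
class Observed:
    """Hypothetical snapshot of what is currently running in the cluster."""
    image: str
    ready_replicas: int
    total_replicas: int


def reconcile(desired: Desired, observed: Optional[Observed]) -> List[str]:
    """Return the actions needed to move the observed state toward the desired one."""
    if observed is None:
        # Nothing exists yet: provision the required resources.
        return [f"create deployment image={desired.image} replicas={desired.replicas}"]
    actions = []
    if observed.image != desired.image:
        # A different version or distribution was requested: roll it out.
        actions.append(f"update image {observed.image} -> {desired.image}")
    if observed.total_replicas != desired.replicas:
        # Replica count drifted from the defined parameter: scale.
        actions.append(f"scale {observed.total_replicas} -> {desired.replicas}")
    unhealthy = observed.total_replicas - observed.ready_replicas
    if unhealthy > 0:
        # Some instances are not ready: trigger automated recovery.
        actions.append(f"restart {unhealthy} unhealthy instance(s)")
    return actions


if __name__ == "__main__":
    want = Desired(image="img:v2", replicas=3)
    for step in reconcile(want, Observed(image="img:v1", ready_replicas=1, total_replicas=2)):
        print(step)
```

When the observed state already matches the desired state, `reconcile` returns an empty list, which is what lets a controller run the same function repeatedly without side effects.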

Repository Structure
To keep things modular, we propose hosting this operator in a separate GitHub repository (e.g., llama-stack-k8s-operator). A separate repository keeps the Llama Stack server and its deployment strategies cleanly separated, and it provides a dedicated space for Kubernetes-specific issues and contributions.

We are seeking the community's feedback on using Kubernetes operators to deploy and manage Llama Stack server instances. Based on this feedback, we would like to share details of our prototype, including the Custom Resource Definition (CRD) and controller implementation, in subsequent discussions.

@jland-redhat
Contributor

As made obvious by my PR linked above, I could not agree more.

I would love to know when this new repository has been created so that I can contribute my Helm chart to it. I am also happy to get involved in the development of the operator.

@leseb
Collaborator Author

leseb commented Mar 31, 2025

> As made obvious by my PR linked above I could not agree more.
>
> Would love to know when this new repository has been created to contribute my Helm Chart to it. And happy to get involved in the development of the operator.

Based on our recent conversations during the community call, this item is in progress. We should get the repo soon, stay tuned!
