HOLD FOR RELEASE: Updates for Embedded Cluster HA #3136
@@ -8,20 +8,16 @@ The topic describes managing nodes in clusters created with Replicated Embedded

Multi-node clusters with Embedded Cluster have the following limitations:

* Support for multi-node clusters with Embedded Cluster is Beta. Only single-node embedded clusters are Generally Available (GA).

* High availability for Embedded Cluster is an Alpha feature. This feature is subject to change, including breaking changes. For more information about this feature, reach out to Alex Parker at [[email protected]](mailto:[email protected]).

* The same Embedded Cluster data directory used at installation is used for all nodes joined to the cluster. This is either the default `/var/lib/embedded-cluster` directory or the directory set with the [`--data-dir`](/reference/embedded-cluster-install#flags) flag. You cannot choose a different data directory for Embedded Cluster when joining nodes. See the sketch after this list.

* Do not join more than one controller node at a time. When you join a controller node, a warning is printed explaining that you should not attempt to join another node until the controller node joins successfully.

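For example, the data directory is set once, on the first node, at install time (a minimal sketch; the `--data-dir` flag is documented in the install flag reference linked above, and the license filename, directory path, and join token are placeholders):

```bash
# Install the first node with a custom data directory
sudo ./APP_SLUG install --license license.yaml --data-dir /opt/embedded-cluster

# Nodes joined later reuse that same data directory automatically;
# --data-dir cannot be passed to the join command
sudo ./APP_SLUG join 10.128.0.80:30000 JOIN_TOKEN
```
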
## Add Nodes to a Cluster (Beta) {#add-nodes}
## Add Nodes to a Cluster {#add-nodes}

You can add nodes to create a multi-node cluster in online (internet-connected) and air-gapped (limited or no outbound internet access) environments. The Admin Console provides the join command that you use to join nodes to the cluster.

:::note
Multi-node clusters are not highly available by default. For information about enabling high availability, see [Enable High Availability for Multi-Node Clusters (Alpha)](#ha) below.
Multi-node clusters are not highly available by default. For information about enabling high availability, see [Enable High Availability for Multi-Node Clusters](#ha) below.
:::

To add nodes to a cluster:

@@ -50,11 +46,7 @@ To add nodes to a cluster:

* If the Embedded Cluster Config [roles](/reference/embedded-config#roles) key is not configured, all new nodes joined to the cluster are assigned the `controller` role by default. The `controller` role designates nodes that run the Kubernetes control plane. Controller nodes can also run other workloads, such as application or Replicated KOTS workloads.

* Roles are not updated or changed after a node is added. If you need to change a node’s role, reset the node and add it again with the new role.

* For multi-node clusters with high availability (HA), at least three `controller` nodes are required. You can assign both the `controller` role and one or more `custom` roles to the same node. For more information about creating HA clusters with Embedded Cluster, see [Enable High Availability for Multi-Node Clusters (Alpha)](#ha) below.

* To add non-controller or _worker_ nodes that do not run the Kubernetes control plane, select one or more `custom` roles for the node and deselect the `controller` role.
* The role cannot be changed after a node is added. If you need to change a node’s role, reset the node and add it again with the new role.

1. Do one of the following to make the Embedded Cluster installation assets available on the machine that you will join to the cluster:

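For an online environment, one option is to download and extract the release directly on the target node (a sketch only; the endpoint shape, the `stable` channel name, and the `APP_SLUG`/`LICENSE_ID` placeholders are assumptions, not confirmed by this diff):

```bash
# Download the Embedded Cluster release for the app (assumed endpoint shape)
curl -f "https://replicated.app/embedded/APP_SLUG/stable" \
  -H "Authorization: LICENSE_ID" \
  -o APP_SLUG.tgz

# Extract the binary and license on the node that will join the cluster
tar -xvzf APP_SLUG.tgz
```
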
@@ -83,13 +75,11 @@ To add nodes to a cluster:

1. Repeat these steps for each node you want to add.

## Enable High Availability for Multi-Node Clusters (Alpha) {#ha}
## High Availability for Multi-Node Clusters {#ha}

Multi-node clusters are not highly available by default. The first node of the cluster is special and holds important data for Kubernetes and KOTS, such that the loss of this node would be catastrophic for the cluster. Enabling high availability (HA) requires that at least three controller nodes are present in the cluster. Users can enable HA when joining the third node.
Multi-node clusters are not highly available by default. The first node of the cluster holds important data for Kubernetes and KOTS, such that the loss of this node would be catastrophic for the cluster. Enabling high availability requires that at least three controller nodes are present in the cluster.

:::important
High availability for Embedded Cluster is an Alpha feature. This feature is subject to change, including breaking changes. For more information about this feature, reach out to Alex Parker at [[email protected]](mailto:[email protected]).
:::
Users are automatically prompted to enable HA when joining the third controller node to a cluster. Alternatively, users can enable HA with the `enable-ha` command after adding three or more controller nodes.

### HA Architecture

@@ -101,20 +91,10 @@ For more information about the Embedded Cluster built-in extensions, see [Built-

Enabling high availability has the following requirements:

* High availability is supported with Embedded Cluster 1.4.1 or later.
* High availability is supported with Embedded Cluster **VERSION** and later.

> **Review comment:** ^ probably should change this to the GA version

* High availability is supported only for clusters where at least three nodes with the `controller` role are present.

### Limitations

Enabling high availability has the following limitations:

* High availability for Embedded Cluster is an Alpha feature. This feature is subject to change, including breaking changes. For more information about this feature, reach out to Alex Parker at [[email protected]](mailto:[email protected]).

* The `--enable-ha` flag serves as a feature flag during the Alpha phase. In the future, the prompt about migrating to high availability will display automatically if the cluster is not yet HA and you are adding the third or a later controller node.

* HA multi-node clusters use rqlite to store support bundles up to 100 MB in size. Bundles over 100 MB can cause rqlite to crash and restart.

### Best Practices for High Availability

Consider the following best practices and recommendations for creating HA clusters:

@@ -125,23 +105,38 @@ Consider the following best practices and recommendations for creating HA cluste

* You can have any number of _worker_ nodes in HA clusters. Worker nodes do not run the Kubernetes control plane, but can run workloads such as application or Replicated KOTS workloads.

### Create a Multi-Node HA Cluster
### Create a Multi-Node Cluster with High Availability {#create-ha}

You can enable high availability for a multi-node cluster when joining the third controller node. Alternatively, you can enable HA for an existing cluster with three or more controller nodes. For more information, see [Enable High Availability For an Existing Cluster](#enable-ha-existing) below.

> **Review comment:** ^ Split the options into two different headings: "Create a Multi-Node Cluster with High Availability" and "Enable High Availability for an Existing Cluster". I also tried combining them in the same procedure, but this felt easier to skim when looking at the TOC for the page.

To create a multi-node HA cluster:

1. Set up a cluster with at least two controller nodes. You can do an online (internet-connected) or air gap installation. For more information, see [Online Installation with Embedded Cluster](/enterprise/installing-embedded) or [Air Gap Installation with Embedded Cluster](/enterprise/installing-embedded-air-gap).

1. SSH onto a third node that you want to join to the cluster as a controller.

1. Run the join command provided in the Admin Console **Cluster Management** tab and pass the `--enable-ha` flag. For example:
1. On the third node, run the join command provided in the Admin Console **Cluster Management** tab.

   **Example:**

   ```bash
   sudo ./APP_SLUG join --enable-ha 10.128.0.80:30000 tI13KUWITdIerfdMcWTA4Hpf
   sudo ./APP_SLUG join 10.128.0.80:30000 tI13KUWITdIerfdMcWTA4Hpf
   ```

   Where `APP_SLUG` is the unique slug for the application.

1. After the third node joins the cluster, type `y` in response to the prompt asking if you want to enable high availability.
1. In response to the prompt asking if you want to enable high availability, type `y` or `yes`:

   > **Review comment:** The screenshot here is different, but we can make it a follow up to update it if you'd like.

   ![High availability prompt](/images/embedded-cluster-ha-prompt.png)

   [View a larger version of this image](/images/embedded-cluster-ha-prompt.png)

1. Wait for the migration to complete.
1. Wait for the migration to high availability to complete.

### Enable High Availability For an Existing Cluster {#enable-ha-existing}

To enable high availability for an existing Embedded Cluster installation with three or more controller nodes, run the following command:

```bash
sudo ./APP_SLUG enable-ha
```

Where `APP_SLUG` is the unique slug for the application.

@@ -22,17 +22,20 @@ kind: Config

spec:
  version: 2.1.3+k8s-1.30
  roles:
    controller:
      name: management
      labels:
        management: "true"
    controller:
      name: app
      labels:
        app: "true"
    custom:
    - name: app
    - name: gpu
      labels:
        gpu: "true"
    - name: database
      labels:
        app: "true"
  domains:
    proxyRegistryDomain: proxy.yourcompany.com
    replicatedAppDomain: updates.yourcompany.com
        database: "true"
  domains:
    proxyRegistryDomain: proxy.yourcompany.com
    replicatedAppDomain: updates.yourcompany.com
  extensions:
    helm:
      repositories:

@@ -72,25 +75,12 @@ For a full list of versions, see the Embedded Cluster [releases page](https://gi

You can optionally customize node roles in the Embedded Cluster Config using the `roles` key.

> **Review comment:** I reverted this back to the original description of how this feature works. As part of that, I also did a bit of a rewrite/reorg to simplify things. This meant reducing word count, using one full example for setting roles.controller and roles.custom with names and labels rather than a few scattered examples, and changing the subheadings so that "labels" isn't its own h3 anymore.

If the `roles` key is configured, users select one or more roles to assign to a node when it is joined to the cluster. A single node can be assigned:
* The `controller` role, which designates nodes that run the Kubernetes control plane
* One or more `custom` roles
* Both the `controller` role _and_ one or more `custom` roles
A common use case for customizing node roles is to assign workloads to specific nodes. For example, if your application has graphics processing unit (GPU) workloads, you could create a `custom` role that will add a `gpu=true` label to any node that is assigned the role. This allows you to then schedule GPU workloads on nodes labeled `gpu=true`.

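For instance, a workload could target those nodes with a standard Kubernetes `nodeSelector` (a minimal sketch; the Deployment name and image are hypothetical, and the `gpu: "true"` label is assumed to come from a `custom` role as described above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-worker            # hypothetical workload name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-worker
  template:
    metadata:
      labels:
        app: gpu-worker
    spec:
      # Schedule pods only on nodes labeled by the "gpu" custom role
      nodeSelector:
        gpu: "true"
      containers:
      - name: worker
        image: example.com/gpu-worker:latest   # hypothetical image
```
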
For more information about how to assign node roles in the Admin Console, see [Managing Multi-Node Clusters with Embedded Cluster](/enterprise/embedded-manage-nodes).
When the `roles` key is configured, users select one or more roles to assign to a node when it is joined to the cluster. For more information, see [Managing Multi-Node Clusters with Embedded Cluster](/enterprise/embedded-manage-nodes).

If the `roles` key is _not_ configured, all nodes joined to the cluster are assigned the `controller` role. The `controller` role designates nodes that run the Kubernetes control plane. Controller nodes can also run other workloads, such as application or Replicated KOTS workloads.

For more information, see the sections below.

### controller

By default, all nodes joined to a cluster are assigned the `controller` role.

You can customize the `controller` role in the following ways:
* Change the `name` that is assigned to controller nodes. By default, controller nodes are named “controller”. If you plan to create any `custom` roles, Replicated recommends that you change the default name for the `controller` role to a term that is easy to understand, such as "management". This is because, when you add `custom` roles, both the name of the `controller` role and the names of any `custom` roles are displayed to the user when they join a node.
* Add one or more `labels` to be assigned to all controller nodes. See [labels](#labels).

#### Example

@@ -99,45 +89,11 @@ kind: Config

```yaml
spec:
  roles:
    controller:
      name: management
      labels:
        management: "true" # Label applied to "management" nodes
```

### custom

You can add `custom` roles that users can assign to one or more nodes in the cluster. Each `custom` role that you add must have a `name` and can also have one or more `labels`. See [labels](#labels).

Adding `custom` node roles is useful if you need to assign application workloads to specific nodes in multi-node clusters. For example, if your application has graphics processing unit (GPU) workloads, you could create a `custom` role that will add a `gpu=true` label to any node that is assigned the role. This allows you to then schedule GPU workloads on nodes labeled `gpu=true`. Or, if your application includes any resource-intensive workloads (such as a database) that must be run on dedicated nodes, you could create a `custom` role that adds a `db=true` label to the node. This way, the database workload could be assigned to a certain node or nodes.

#### Example

```yaml
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  roles:
    custom:
    - name: app
      # Optionally change the name for the default controller role
      name: app
      labels:
        app: "true" # Label applied to "app" nodes
```

### labels

You can define Kubernetes labels for the default `controller` role and any `custom` roles that you add. When `labels` are defined, Embedded Cluster applies the label to any node in the cluster that is assigned the given role. Labels are useful for tasks like assigning workloads to nodes.

#### Example

```yaml
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  roles:
    controller:
      name: management
      labels:
        management: "true" # Label applied to "management" nodes
    # Custom roles
    custom:
    - name: db
      labels:
```

@@ -147,6 +103,22 @@ spec:

```yaml
        gpu: "true" # Label applied to "gpu" nodes
```

### roles.controller

In the `roles.controller` key, you can set the following fields to customize the default controller role:
* `name`: Set the name that is assigned to controller nodes. By default, controller nodes are named “controller”.

  :::note
  If you plan to create any custom roles, Replicated recommends that you change the default name for the controller role to a term that is easy to understand, such as "app". This is because, when you add custom roles, both the name of the controller role and the names of any custom roles are displayed to the user when they join a node.
  :::

* `labels`: Kubernetes labels that Embedded Cluster will apply to any node in the cluster that is assigned the given role.

### roles.custom

In the `roles.custom` key, you can add custom roles. Each custom role includes the following fields:
* `name`: (Required) A name for the custom role.
* `labels`: Kubernetes labels that Embedded Cluster will apply to any node in the cluster that is assigned the given role.

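Taken together, a minimal `roles` configuration that uses both keys might look like the following (a sketch based on the fields above; the role names and labels are illustrative):

```yaml
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  roles:
    controller:
      name: app          # Rename the default controller role
      labels:
        app: "true"      # Applied to every node assigned the "app" role
    custom:
    - name: gpu          # name is required for each custom role
      labels:
        gpu: "true"      # Applied to every node assigned the "gpu" role
```
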
## domains

Configure the `domains` key so that Embedded Cluster uses your custom domains for the Replicated proxy registry and Replicated app service.

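A minimal sketch of the `domains` key, reusing the custom domain values from the example at the top of this file:

```yaml
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  domains:
    # Custom domain for the Replicated proxy registry
    proxyRegistryDomain: proxy.yourcompany.com
    # Custom domain for the Replicated app service
    replicatedAppDomain: updates.yourcompany.com
```
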
@@ -178,17 +150,7 @@ Helm extensions are updated when new versions of your application are deployed f

The format for specifying Helm extensions is the same Helm extensions format used in the k0s configuration. For more information about these fields, see the [k0s documentation](https://docs.k0sproject.io/stable/helm-charts/#example).

### Limitation

If a Helm extension is removed from the Embedded Cluster Config, the associated Helm chart is not removed from the cluster.

### Requirements

* The `version` field is required. Failing to specify a chart version will cause problems for upgrades.

* If you need to install multiple charts in a particular order, set the `order` field to a value greater than or equal to 10. Numbers below 10 are reserved for use by Embedded Cluster to deploy things like a storage provider and the Admin Console. If an `order` is not provided, Helm extensions are installed with order 10.

### Example
#### Example

```yaml
apiVersion: embeddedcluster.replicated.com/v1beta1
```

@@ -223,6 +185,16 @@ spec:

```yaml
      digest: ""
```

### Limitation

If a Helm extension is removed from the Embedded Cluster Config, the associated Helm chart is not removed from the cluster.

### Requirements

* The `version` field is required. Failing to specify a chart version will cause problems for upgrades.

* If you need to install multiple charts in a particular order, set the `order` field to a value greater than or equal to 10. Numbers below 10 are reserved for use by Embedded Cluster to deploy things like a storage provider and the Admin Console. If an `order` is not provided, Helm extensions are installed with order 10. See the sketch after this list.

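For example, two charts could be given an explicit install order like this (a sketch using the k0s chart fields referenced above; the repository, chart names, and versions are placeholders):

```yaml
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  extensions:
    helm:
      repositories:
      - name: bitnami                  # placeholder repository
        url: https://charts.bitnami.com/bitnami
      charts:
      - name: postgresql
        chartname: bitnami/postgresql  # placeholder chart
        version: "12.1.7"              # version is required
        namespace: postgresql
        order: 10                      # installed first; numbers below 10 are reserved
      - name: redis
        chartname: bitnami/redis       # placeholder chart
        version: "17.3.7"
        namespace: redis
        order: 11                      # installed after the postgresql chart
```
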
## unsupportedOverrides

:::important

> **Review comment:** Replaced "Users can enable HA when joining the third node." with: