HOLD FOR RELEASE: Updates for Embedded Cluster HA #3136

Open: wants to merge 13 commits into base: main

paigecalvert (Contributor) commented Mar 26, 2025

https://deploy-preview-3136--replicated-docs.netlify.app/enterprise/embedded-manage-nodes#ha

Docs edits to the description of the roles key for clarity (no changes to the details of how that feature actually works): https://deploy-preview-3136--replicated-docs.netlify.app/reference/embedded-config#roles

replicated-ci added the type::docs (Improvements or additions to documentation) and type::feature labels Mar 26, 2025

netlify bot commented Mar 26, 2025

Deploy Preview for replicated-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | a132db0 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/replicated-docs/deploys/67fd7b344593070008c322f4 |
| 😎 Deploy Preview | https://deploy-preview-3136--replicated-docs.netlify.app |

netlify bot commented Mar 26, 2025

Deploy Preview for replicated-docs-upgrade ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | a132db0 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/replicated-docs-upgrade/deploys/67fd7b343ff28c00080b1b02 |
| 😎 Deploy Preview | https://deploy-preview-3136--replicated-docs-upgrade.netlify.app |

paigecalvert changed the title from "Updates for Embedded Cluster HA" to "WIP: Updates for Embedded Cluster HA" Mar 26, 2025

Multi-node clusters are not highly available by default. The first node of the cluster is special and holds important data for Kubernetes and KOTS, such that the loss of this node would be catastrophic for the cluster. Enabling high availability (HA) requires that at least three controller nodes are present in the cluster. Users can enable HA when joining the third node.

paigecalvert (Contributor, Author):

Replaced "Users can enable HA when joining the third node." with:

Users are automatically prompted to enable HA when joining the third controller node to a cluster. Alternatively, users can enable HA after adding three or more controller nodes with the enable-ha command.

@@ -101,47 +91,56 @@ For more information about the Embedded Cluster built-in extensions, see [Built-

Enabling high availability has the following requirements:

- * High availability is supported with Embedded Cluster 1.4.1 or later.
+ * High availability is supported with Embedded Cluster **VERSION** and later.

paigecalvert (Contributor, Author):

^ probably should change this to the GA version

- ### Create a Multi-Node HA Cluster
+ ### Create a Multi-Node Cluster with High Availability {#create-ha}

You can enable high availability for a multi-node cluster when joining the third controller node. Alternatively, you can enable HA for an existing cluster with three or more controller nodes. For more information, see [Enable High Availability For an Existing Cluster](#enable-ha-existing) below.

paigecalvert (Contributor, Author):

^ Split the options into two different headings: "Create a Multi-Node Cluster with High Availability" and "Enable High Availability for an Existing Cluster".

I also tried combining them in the same procedure but this felt easier to skim from looking at the TOC for the page

labels:
  management: "true" # Label applied to "management" nodes
```
## nodeRoles

paigecalvert (Contributor, Author):

rewrote this section and renamed to nodeRoles

ajp-io (Member) left a comment:

not final, but it looks like we might stick with our current node roles set up in the EC Config and continue letting people multi-select. I'll get back to you with a final word. I'm not confident enough that the proposal is making things better enough to warrant changing it.

ajp-io (Member) commented Mar 28, 2025

plan is to stick with the current roles setup, as well as the multi-select of roles per node

paigecalvert marked this pull request as ready for review April 8, 2025 19:32
paigecalvert requested a review from a team as a code owner April 8, 2025 19:32
paigecalvert changed the title from "WIP: Updates for Embedded Cluster HA" to "HOLD FOR RELEASE: Updates for Embedded Cluster HA" Apr 8, 2025

@@ -72,61 +75,12 @@ For a full list of versions, see the Embedded Cluster [releases page](https://gi

You can optionally customize node roles in the Embedded Cluster Config using the `roles` key.

paigecalvert (Contributor, Author):

I reverted this back to roles (from nodeRoles)

As part of reverting back to the original description of how this feature works, I also did a bit of a rewrite/reorg to simplify things. This meant reducing word count, using one full example for setting roles.controller and roles.custom with names and labels rather than a few scattered examples, and changing the subheadings so that "labels" isn't its own h3 anymore.


- 1. After the third node joins the cluster, type `y` in response to the prompt asking if you want to enable high availability.
+ 1. In response to the prompt asking if you want to enable high availability, type `y` or `yes`:

Member:

the screenshot here is different, but we can make it a follow up to update it if you'd like.

@@ -135,9 +89,11 @@ kind: Config
spec:
  roles:
    controller:
      # Optionally change the name for the default controller role
      name: management

Member:

thinking about it more, i don't like management as a name because that implies it's just the controller. but people's workloads run here too.

like in these examples, your db runs on dedicated db nodes, and everything else runs on controllers. so it's not really just a management node then.

i think app could be a good example name here.

paigecalvert (Contributor, Author):

Sounds good, I'll use app
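
For reference, a minimal sketch of how the `roles` example under discussion might read with the controller role renamed to `app`. Only `spec.roles.controller.name`, `labels`, and the existence of `roles.custom` come from this conversation; the `db` custom role name, every label key and value, and the `apiVersion` value are illustrative assumptions and are not taken from the PR diff.

```yaml
# Sketch only: values marked "assumed" do not appear in the PR.
apiVersion: embeddedcluster.replicated.com/v1beta1 # assumed
kind: Config
spec:
  roles:
    controller:
      # Optionally change the name for the default controller role
      name: app
      labels:
        app: "true" # assumed label for "app" nodes
    custom:
    - name: db # assumed custom role for dedicated database nodes
      labels:
        db: "true" # assumed label for "db" nodes
```

With a layout like this, a user joining a node would see both `app` and `db` as selectable roles, which is the situation the naming discussion above is about.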

In the `roles.controller` key, you can set the following fields to customize the default controller role:
* `name`: Set the name that is assigned to controller nodes. By default, controller nodes are named “controller”.
:::note
If you plan to create any custom roles, Replicated recommends that you change the default name for the controller role to a term that is easy to understand, such as "management". This is because, when you add custom roles, both the name of the controller role and the names of any custom roles are displayed to the user when they join a node.

Member:

i don't think i'd offer a recommendation like management. app is what i suggested for our examples above, and we could say that here if you think the guidance is helpful. but I don't know what people will run on controllers, so it's hard to give a good name.

paigecalvert (Contributor, Author):

We could get rid of that note too. It was a carry-over

Member:

the note might be worthwhile so people know that controller will be the role name otherwise. i guess they'll see that when they test anyway though. so we can remove if you think it's easier/clearer.

paigecalvert (Contributor, Author):

We'd still have "By default, controller nodes are named 'controller'" in the name description, so that might cover it

Labels: type::docs (Improvements or additions to documentation), type::feature
3 participants