Skip to content

Cluster Feature Gate in etcd #5189

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

siyuanfoundation
Copy link
Contributor

  • One-line PR description: Cluster Feature Gate in etcd
  • Other comments:

Copy link

linux-foundation-easycla bot commented Mar 5, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: siyuanfoundation / name: Siyuan Zhang (1acec52)

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: siyuanfoundation
Once this PR has been reviewed and has the lgtm label, please assign ahrtr for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from jmhbnz and serathius March 5, 2025 23:21
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 5, 2025
@siyuanfoundation siyuanfoundation force-pushed the cluster-fg branch 4 times, most recently from e43204c to 99b939a Compare March 6, 2025 21:51
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 6, 2025
@siyuanfoundation siyuanfoundation force-pushed the cluster-fg branch 2 times, most recently from eef674b to a68a074 Compare March 7, 2025 18:15
@siyuanfoundation siyuanfoundation marked this pull request as draft March 21, 2025 19:21
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 21, 2025
@siyuanfoundation siyuanfoundation force-pushed the cluster-fg branch 11 times, most recently from 5577722 to 1147812 Compare May 20, 2025 22:07
@siyuanfoundation siyuanfoundation force-pushed the cluster-fg branch 8 times, most recently from b5a2322 to 60d8ef9 Compare May 21, 2025 21:53
Signed-off-by: Siyuan Zhang <[email protected]>

Co-authored-by: Marek Siarkowicz <[email protected]>
@siyuanfoundation siyuanfoundation marked this pull request as ready for review May 21, 2025 21:59
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2025
@k8s-ci-robot k8s-ci-robot requested a review from ahrtr May 21, 2025 21:59
@siyuanfoundation
Copy link
Contributor Author

@k8s-ci-robot k8s-ci-robot requested a review from ivanvc May 21, 2025 22:00
@k8s-ci-robot
Copy link
Contributor

@siyuanfoundation: GitHub didn't allow me to request PR reviews from the following users: fuweid.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @serathius @ahrtr @jmhbnz @fuweid @ivanvc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


### Notes/Constraints/Caveats (Optional)

#### Should we allow users to change feature gate value from etcd endpoints?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#### Should we allow users to change feature gate value from etcd endpoints?
#### Should we allow users to dynamically change feature gate values at runtime via etcd endpoints?

Comment on lines +209 to +210
NOTE: even though we have `PreRelease` information in the feature spec, the purpose of cluster feature gate is not for lifecycle management. It is mainly for cluster level feature enablement. So a cluster feature can be `false`, and may not be removed even after graduation if we want to keep its togglability.
This is different from server level feature gate, where the feature gate is used for lifecycle tracking purpose, and a feature gate is expected to be removed eventually and permanent boolean flags would be added if needed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I would suggest to remove "the purpose of cluster feature gate is not for lifecycle management" to avoid unnecessary confusion. The purpose of a feature is for lifecycle management, no matter it's server level feature or cluster level feature. The only difference is cluster level feature needs extra consideration (consensus).
  • The practice "a server feature gate is expected to be removed eventually and permanent boolean flags would be added if needed" should be documented in the server feature KEP. Also such a boolean flag should be added in the first place instead of being added in a very late phase (i.e. when the feature has GAed and we we are planning to remove it). Also should we really need to remove a GAed feature gate?


A feature can be registered as server level feature or cluster level feature, but not both.
For data compatibility, we do not support changing a feature gate from one type to another.
If a developer finds out that their feature needs to be a cluster level feature instead of server level after a release, they should add a new feature and use a new feature name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's easy to follow in practice, as there might be overlap or conflict between the existing and new features. There are some uncertainties and potential risks. Hopefully we will never see such cases.


On the other hand, cluster feature gates would not affect rolling upgrade/downgrade/update of existing clusters. As long as the cluster version is determined, the cluster feature gate can be set consistently.

Here is how we guarantee no mixed-version new clusters:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should introduce additional restriction. The existing restriction of rejecting adding a lower version member into a cluster with higher cluster version should be already good enough. Also panicking mentioned below seems overkilled.


1. When a server starts, it sends its original flag values of `--cluster-feature-gates` and other cluster params through a new `ProposedClusterParams` field in the `MemberAttributes` update in `publishV3`. In the `ProposedClusterParams`, there is also a `Version` field set to the `ServerVersion.MajorMinor`. The server will bootstrap the `ClusterParams` with the default values at the `ServerVersion` with all Alpha and Beta cluster feature gates disabled (because we may want to change the default value of a Beta feature in patch version).

1. Whenever a server receives `MemberAttributes` update of any peer, it will check the new `ProposedClusterParams.Version` against its `ClusterParams.Version`, and will panic if the `ProposedClusterParams.Version` is less than `ClusterParams.Version`. This would make sure a brand new cluster always starts with the same version (Major.Minor) for all members.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should allow a server with a higher version to join a cluster with lower cluster version. We shouldn't break this use case.


### Set the Feature Gates

To set the correct user expectation, we will add a new flag `--cluster-feature-gates` (+`config-file` configuration support) similar to the server `--feature-gates` to set the values of cluster feature gates as expected by the local server. The server will use this flag value to set the `proposed_cluster_params` sent to the cluster. The final values of the cluster feature gate are actually decided through raft consensus.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final values of the cluster feature gate are actually decided through raft consensus

I think this description is inaccurate and confusing. Normally the raft messages are applied by majorities, instead of decided by majorities. The final value of the cluster feature gate should be decided by the etcdserver (the leader based on the section "Consensus Algorithm", which might be also problematic, i.e an old leader gets stuck steping down, non-atomic leader change etc.)

Comment on lines +326 to +327
// cluster level parameters for the whole cluster at the cluster version.
ClusterParams cluster_params = 2 [(versionpb.etcd_version_field)="3.7"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't allow dynamically change cluster level feature gates at runtime, then we don't need to add ClusterParams here. Can we just rely on the publishV3 to publish the ClusterParams during bootstrap?

Comment on lines +336 to +338
#### Restart a Member

When a member is restarted, its new flag settings would be published in `MemberAttributes` update through `publishV3`. The leader would re-reconcile the `ClusterParams` based on the updated list of voting members and update the `ClusterParams` if needed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine a case below,

  • assuming a cluster-level feature gate has been enabled by the leader, and the decision has been replicated to all members.
  • stop one member;
  • all other members apply one entry based on the enabled cluster-level feature.
  • start the member with cluster-level feature disabled.
    • you need to ensure that the member applies the entries with cluster-level feature enabled although it starts with cluster-level feature gate disabled flag.

Comment on lines +377 to +378
FeatureGates map[featuregate.Feature]bool
// other cluster level parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each cluster-level feature, it might not just be enabled / disabled, it might have related parameters. We need to ensure the parameters are consistent as well. How do you handle the case the parameters not consistent? (I can't find such an example for now, but theoretically it may happen in future)

Also how do you check other cluster level parameters, i.e max-learner? Are you going to cover it in a separate KEP?

The `etcdctl` commands could look like:
* `etcdctl endpoint featuregate $featureName` returns true if the feature is enabled

Because the feature gate of a cluster could change anytime, even if the client has queried the etcd cluster for feature availability before sending a request, feature availability can be changed by the time the request is sent. We are proposing to add a new `WithRequireFeature(ctx context.Context, featureName string) context.Context` function in `clientv3` package to pass metadada of the required feature into the context which could be checked against the server cluster feature gate to determined the request should be served.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clientv3 API should be stable. I don't expect any change on any of the existing clientv3 API. So I don't think we should add WithRequireFeature(ctx context.Context, featureName string) context.Context, at least not for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants