Which component are you using?:
cluster-autoscaler
What version of the component are you using?:
Component version: v1.29.0
What k8s version are you using (kubectl version)?:
$ kubectl version
Client Version: v1.32.3
Kustomize Version: v5.5.0
Server Version: v1.29.4+rke2r1
What environment is this in?:
Rancher managed Kubernetes cluster on an OpenStack based cloud at CloudFerro.
What did you expect to happen?:
The autoscaler should scale down the worker nodes.
What happened instead?:
The autoscaler was not able to scale down the worker nodes.
How to reproduce it (as minimally and precisely as possible):
Deploy a Rancher managed Kubernetes cluster on an OpenStack based cloud.
Start a workload that exceeds the OpenStack quota or exhausts OpenStack resources (see the sketch below)
Stop the workload
The autoscaler is no longer able to scale the cluster down
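To make the reproduction concrete, here is a minimal sketch of such a workload. The deployment name, image, replica count, and resource requests are only illustrative; the point is to request more capacity than the OpenStack project quota allows:
# Request far more capacity than the OpenStack project quota allows,
# so the autoscaler tries to add nodes and hits the quota.
$ kubectl create deployment quota-burst --image=registry.k8s.io/pause:3.9 --replicas=200
$ kubectl set resources deployment quota-burst --requests=cpu=2,memory=4Gi
# Stop the workload; the now-unneeded nodes should be scaled down, but are not.
$ kubectl delete deployment quota-burst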
Anything else we need to know?:
When the issue occurs, log entries like the following appear in the autoscaler pod logs:
I0326 12:39:22.919386 1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"k8s-worker-eo2a-xlarge-c29f50b3-s74qs.novalocal", UID:"c23b0b6d-4f0c-484d-a5c5-c7c896c08be1", APIVersion:"v1", ResourceVersion:"101114911", FieldPath:""}): type: 'Warning' reason: 'ScaleDownFailed' failed to delete empty node: failed to delete nodes from group worker-eo2a-xlarge: could not find providerID in machine: k8s-worker-eo2a-xlarge-78b857bb76x5hgc6-gz6t8/fleet-default
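For reference, one way to check for the missing providerID is to list the CAPI Machine objects on the Rancher management (local) cluster. This is a rough sketch, assuming the machines live in the fleet-default namespace as in the log above and that the kubeconfig points at the management cluster:
$ kubectl get machines.cluster.x-k8s.io -n fleet-default \
    -o custom-columns='NAME:.metadata.name,PROVIDERID:.spec.providerID'
Machines stuck without a spec.providerID show <none> in the second column and match the node group named in the ScaleDownFailed event.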
Previously, we logged #6778, which was closed because we thought the issue was fixed after upgrading. We have now noticed it again, so we are logging this ticket.
In the autoscaler Grafana dashboard you typically see that the autoscaler is aware of the unneeded nodes, but scaling down fails, probably due to this providerID issue.