Cluster Bare Autoscaler

Introduction

Cluster Bare Autoscaler (CBA) automatically adjusts the size of a bare-metal Kubernetes cluster by powering nodes off or on based on real-time resource usage, while safely cordoning and draining nodes before shutdown.

This project is similar to the official Kubernetes Cluster Autoscaler, but with key differences:

- CBA does not terminate or bootstrap instances.
- Instead, it powers bare-metal nodes down and wakes them up using Wake-on-LAN or other pluggable power controllers.
- Nodes are cordoned and drained safely before shutdown.

CBA uses a chainable strategy model for deciding when to scale down a node. Strategies can be enabled individually or used together:

Resource-aware strategy — checks CPU and memory requests and usage.
Load average strategy — evaluates /proc/loadavg via a per-node metrics DaemonSet.

It is especially suited for self-managed data centers, homelabs, or cloud-like bare-metal environments.
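
The chainable strategy model described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Strategy` interface, the `Chain` type, and the `fixed` stub are hypothetical names, and the sketch assumes that every strategy in the chain must agree before a node is scaled down, with evaluation stopping at the first refusal.

```go
package main

import "fmt"

// Strategy is a hypothetical interface for this sketch; the project's real
// types may differ.
type Strategy interface {
	Name() string
	// ShouldScaleDown reports whether the node may be powered off.
	ShouldScaleDown(node string) bool
}

// Chain evaluates strategies in order.
type Chain []Strategy

// ShouldScaleDown returns false as soon as one strategy refuses
// (short-circuit); later strategies are then not evaluated. It also returns
// the name of the refusing strategy, or "" if all strategies agreed.
func (c Chain) ShouldScaleDown(node string) (bool, string) {
	for _, s := range c {
		if !s.ShouldScaleDown(node) {
			return false, s.Name()
		}
	}
	return true, ""
}

// fixed is a stub strategy that always gives the same answer and counts
// how often it was consulted.
type fixed struct {
	name  string
	allow bool
	calls *int
}

func (f fixed) Name() string { return f.name }

func (f fixed) ShouldScaleDown(string) bool {
	*f.calls++
	return f.allow
}

func main() {
	calls := 0
	chain := Chain{
		fixed{"resource-aware", false, &calls},
		fixed{"load-average", true, &calls},
	}
	ok, refusedBy := chain.ShouldScaleDown("node-1")
	fmt.Println(ok, refusedBy, calls)
}
```

Ordering cheap checks first means the more expensive ones (for example, those that query per-node metrics) run only when the earlier strategies already allow the scale-down.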

Features

✅ Pluggable scale-down and scale-up strategies
- Multi-strategy chaining with short-circuit logic
- Dry-run mode for testing (--dry-run)
✅ Resource-aware scale-down
- Considers CPU and memory requests
- Optionally uses live usage metrics
✅ Load average-aware scale-down and scale-up using /proc/loadavg
- Supports aggregation modes: average, median, p75, p90
- Separate thresholds for scale-up and scale-down decisions
- CLI dry-run overrides:
  - --dry-run-cluster-load-down
  - --dry-run-cluster-load-up
✅ MinNodeCount-based scale-up to maintain minimum node count
✅ Cooldown tracking
- Global cooldown period
- Per-node boot/shutdown cooldowns
✅ Node eligibility filtering
- Ignores nodes with label cba.dev/was-powered-off: true
- Respects .Disabled flag in config.yaml
✅ Safe cordon and drain using Kubernetes eviction API
✅ Wake-on-LAN support for powering on bare-metal machines
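
To illustrate the aggregation modes listed above, here is a small sketch of how per-node load values might be reduced to a single cluster-wide figure. The function names `Aggregate` and `nearestRank` are hypothetical, and the nearest-rank percentile method is an assumption of this sketch; the autoscaler may compute percentiles differently.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Aggregate reduces per-node load values to one cluster-wide number.
// Mode names follow the README; the percentile method (nearest-rank) is an
// assumption of this sketch.
func Aggregate(mode string, loads []float64) (float64, error) {
	if len(loads) == 0 {
		return 0, fmt.Errorf("no load samples")
	}
	sorted := append([]float64(nil), loads...) // copy so the caller's slice is untouched
	sort.Float64s(sorted)

	switch mode {
	case "average":
		sum := 0.0
		for _, v := range sorted {
			sum += v
		}
		return sum / float64(len(sorted)), nil
	case "median":
		mid := len(sorted) / 2
		if len(sorted)%2 == 1 {
			return sorted[mid], nil
		}
		return (sorted[mid-1] + sorted[mid]) / 2, nil
	case "p75":
		return nearestRank(sorted, 75), nil
	case "p90":
		return nearestRank(sorted, 90), nil
	default:
		return 0, fmt.Errorf("unknown aggregation mode %q", mode)
	}
}

// nearestRank returns the smallest sample such that at least p percent of the
// samples are less than or equal to it. sorted must be ascending and non-empty.
func nearestRank(sorted []float64, p float64) float64 {
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

func main() {
	loads := []float64{0.2, 0.4, 0.6, 3.0}
	for _, mode := range []string{"average", "median", "p75", "p90"} {
		v, _ := Aggregate(mode, loads)
		fmt.Printf("%s=%.2f\n", mode, v)
	}
}
```

With one busy node among several idle ones, the average and the higher percentiles react strongly to the outlier while the median does not, which is one reason to choose the mode and the scale-up and scale-down thresholds together.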

Additional, optional services

✅ Metrics DaemonSet for per-node load average via /proc/loadavg
✅ Power-off DaemonSet for secure node shutdown via systemd socket activation
✅ Wake-on-LAN agent (wol-agent) for powering nodes via HTTP-triggered magic packets
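
For readers unfamiliar with Wake-on-LAN, the sketch below shows what a magic packet looks like: 6 bytes of `0xFF` followed by the target MAC address repeated 16 times. The function names `MagicPacket` and `Send` are hypothetical and are not taken from wol-agent; the broadcast address and UDP port 9 mentioned in the comments are common defaults, assumed here for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
)

// MagicPacket builds a Wake-on-LAN payload: 6 bytes of 0xFF followed by the
// target MAC address repeated 16 times (102 bytes in total).
func MagicPacket(mac string) ([]byte, error) {
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return nil, err
	}
	if len(hw) != 6 {
		return nil, fmt.Errorf("expected a 6-byte MAC, got %d bytes", len(hw))
	}
	pkt := bytes.Repeat([]byte{0xFF}, 6)
	for i := 0; i < 16; i++ {
		pkt = append(pkt, hw...)
	}
	return pkt, nil
}

// Send writes the packet to a UDP address, typically a broadcast address
// such as "255.255.255.255:9" (an assumed default, not a project setting).
func Send(pkt []byte, addr string) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(pkt)
	return err
}

func main() {
	pkt, err := MagicPacket("aa:bb:cc:dd:ee:ff")
	if err != nil {
		panic(err)
	}
	fmt.Println("magic packet length:", len(pkt))
	// To actually wake a machine: Send(pkt, "255.255.255.255:9")
}
```

Magic packets are usually sent as link-layer broadcasts, so the sender generally needs to sit on the same network segment as the machine being woken.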

Quickstart

Label your managed nodes
Apply a label to every node you want the autoscaler to manage:
```
kubectl label node <node-name> cba.dev/is-managed=true
```
(Optional) Manually annotate MAC addresses
This is needed if you use Wake-on-LAN (WOL) and cannot rely on auto-discovery:
```
kubectl annotate node <node-name> cba.dev/mac-address-override=aa:bb:cc:dd:ee:ff
```
If no override is annotated, the MAC address is discovered from the node's local power-off daemon Pod (via /mac) and stored in the cba.dev/mac-address annotation.

Install the autoscaler with Helm

helm install cba cluster-bare-autoscaler/cluster-bare-autoscaler -n <namespace> -f values.yaml

Observe behavior in dry-run mode (optional)
Run with --dry-run to observe decision logic without powering nodes.

Node Definitions (via Annotations)

CBA automatically discovers nodes based on labels and manages them based on annotations.

Required Label:
Nodes must be labeled to be considered managed:
cba.dev/is-managed: "true"
Optional Exclusion Label:
Add this label to skip specific nodes from scaling:
cba.dev/disabled: "true"

Annotations:

| Key | Purpose |
| --- | --- |
| `cba.dev/mac-address` | Auto-discovered MAC address for WOL |
| `cba.dev/mac-address-override` | Manually specified MAC (takes precedence) |
| `cba.dev/was-powered-off` | Marked by autoscaler after shutdown |
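
The label and annotation rules above can be summarized in a short sketch. The `Node` struct is a hypothetical, trimmed-down stand-in for a Kubernetes Node object, and `Managed` and `WakeMAC` are illustrative helper names rather than functions from the project; only the label and annotation keys come from this README.

```go
package main

import "fmt"

// Label and annotation keys as listed in this README.
const (
	LabelManaged    = "cba.dev/is-managed"
	LabelDisabled   = "cba.dev/disabled"
	AnnoMAC         = "cba.dev/mac-address"
	AnnoMACOverride = "cba.dev/mac-address-override"
)

// Node is a hypothetical, trimmed-down stand-in for a Kubernetes Node object.
type Node struct {
	Labels      map[string]string
	Annotations map[string]string
}

// Managed reports whether the node is labeled as managed and not disabled.
func (n Node) Managed() bool {
	return n.Labels[LabelManaged] == "true" && n.Labels[LabelDisabled] != "true"
}

// WakeMAC returns the MAC address to use for Wake-on-LAN. The manual override
// wins over the auto-discovered value; ok is false if neither is set.
func (n Node) WakeMAC() (mac string, ok bool) {
	if v := n.Annotations[AnnoMACOverride]; v != "" {
		return v, true
	}
	if v := n.Annotations[AnnoMAC]; v != "" {
		return v, true
	}
	return "", false
}

func main() {
	n := Node{
		Labels: map[string]string{LabelManaged: "true"},
		Annotations: map[string]string{
			AnnoMAC:         "11:22:33:44:55:66",
			AnnoMACOverride: "aa:bb:cc:dd:ee:ff",
		},
	}
	mac, _ := n.WakeMAC()
	fmt.Println(n.Managed(), mac)
}
```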

Metrics

The autoscaler exposes Prometheus metrics on port 9090 at the /metrics endpoint.
Metrics include evaluation counts, shutdown attempts/successes, eviction failures, and per-node powered-off status.

Installation

Using Helm

helm install cba cluster-bare-autoscaler/cluster-bare-autoscaler -n <namespace> -f values.yaml

Configuration

All configuration is passed via a ConfigMap in YAML form.
The root config keys are defined in values.yaml.

See the example config.yaml for details.

Development

Build the binary

make build_binary

Run tests

make test

Build container image (multi-arch with `ko`)

make build_image
make publish_image

Run locally

go run main.go --config=./config.yaml --dry-run

Roadmap & TODO

See TODO.md for the current feature backlog and development roadmap.

License

MIT
