etcd does not resolve TLS DNSname when checking SAN for client transport security certificate #19691

Open
4 tasks done
xadips opened this issue Mar 28, 2025 · 5 comments
Labels
priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. type/bug


xadips commented Mar 28, 2025

Bug report criteria

What happened?

I was trying to migrate etcd cluster member certificates, without downtime, from certificates issued by the old root CA to ones issued by a new root CA. I combined both CAs into a single trusted bundle that is used as trusted-ca-file in both peer-transport-security and client-transport-security, did a rolling restart so that the nodes trust certificates issued by either CA, and then started switching member certificates to the new ones, one member at a time.
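
For illustration, the bundle step amounts to concatenating the PEM-encoded CA certificates into one file. The following is a minimal Python sketch under that assumption; the function name `build_ca_bundle` and the file names in the usage note are hypothetical and not taken from the actual setup.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(ca_paths, bundle_path):
    """Concatenate PEM-encoded CA certificates into a single bundle file.

    Returns the number of certificates written to the bundle.
    """
    blocks = []
    for path in ca_paths:
        pem = Path(path).read_text().strip()
        # Cheap sanity check only; this does not validate the certificate.
        if PEM_HEADER not in pem:
            raise ValueError(f"{path} does not look like a PEM certificate")
        blocks.append(pem)
    # One certificate block after another, with a trailing newline.
    Path(bundle_path).write_text("\n".join(blocks) + "\n")
    return sum(block.count(PEM_HEADER) for block in blocks)
```

A call such as `build_ca_bundle(["old-ca.pem", "new-ca.pem"], "trusted-ca-bundle.crt")` would produce a file that could then be referenced by trusted-ca-file in both sections.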

I use the same certificate for client and peer communication.

The old certificates had both DNS and IP SAN of the member

            X509v3 Subject Alternative Name: 
                DNS:<hostname>,  IP Address:10.XX.XXX.XX

The new certificates had only the DNS SAN of the member

            X509v3 Subject Alternative Name: 
                DNS:<hostname>

The new certificate worked for peer transport security (while the old certificate was still in use for client transport security), but it did not work for client transport security and produced these warnings:

{"level":"warn","ts":"2025-03-28T20:20:41.116994Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"10.XX.XXX.XX:55972","server-name":"","error":"remote error: tls: bad certificate"}
{"level":"warn","ts":"2025-03-28T20:20:41.121504Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"10.XX.XXX.XX.:38738","server-name":"","error":"tls: first record does not look like a TLS handshake"}
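
When triaging warnings like these, it can help to group them by server name and error text. Below is a minimal Python sketch that assumes etcd is writing JSON log lines with the same fields as the two entries above; the function name `summarize_rejections` is hypothetical. An empty server-name would be consistent with clients that dial an IP address, since SNI only carries DNS names, but that is an inference and not something the logs state.

```python
import json
from collections import Counter


def summarize_rejections(log_lines):
    """Count 'rejected connection' warnings by (server-name, error)."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        if entry.get("msg") != "rejected connection":
            continue
        key = (entry.get("server-name", ""), entry.get("error", ""))
        counts[key] += 1
    return counts
```

Feeding the etcd log file line by line into `summarize_rejections` gives a quick view of which handshake errors dominate.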

What did you expect to happen?

According to the documentation, since etcd 3.2.0 the server resolves TLS DNSNames when checking SAN, so I expected DNS resolution to work for both peer and client certificates, but it only worked for peer communication. Is this expected?

How can we reproduce it (as minimally and precisely as possible)?

Use certificates with only DNS:hostname in the SAN field for both peer and client transport security.

Anything else we need to know?

Is this behaviour expected for client transport security? If yes, the documentation might be unclear on when the DNS resolution works.

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.5.21
Git SHA: a17edfd
Go Version: go1.23.7
Go OS/Arch: linux/amd64
$ etcdctl version
etcdctl version: 3.5.21
API version: 3.5

Etcd configuration (command line flags or environment variables)

# Human-readable name for this member.
name: node1

# Path to the data directory.
data-dir: /node1.etcd

# Number of committed transactions to trigger a snapshot to disk.
snapshot-count: 100000

# Time (in milliseconds) of a heartbeat interval.
heartbeat-interval: 100

# Time (in milliseconds) for an election to timeout.
election-timeout: 1000

# Raise alarms when backend size exceeds the given quota. 0 means use the
# default quota.
quota-backend-bytes: 2147483648

# List of comma separated URLs to listen on for peer traffic.
listen-peer-urls: https://10.XX.XXX.XX:2380

# List of comma separated URLs to listen on for client traffic.
listen-client-urls: https://10.XX.XXX.XX:2379

# Maximum number of snapshot files to retain (0 is unlimited).
max-snapshots: 5

# Maximum number of wal files to retain (0 is unlimited).
max-wals: 5

# List of this member's peer URLs to advertise to the rest of the cluster.
# The URLs needed to be a comma-separated list.
initial-advertise-peer-urls: https://10.XX.XXX.XX:2380

# List of this member's client URLs to advertise to the public.
# The URLs needed to be a comma-separated list.
advertise-client-urls: https://10.XX.XXX.XX:2379

# Discovery URL used to bootstrap the cluster.
discovery:

# Initial cluster configuration for bootstrapping.
initial-cluster: node2=https://10.XX.X.XXX:2380,node1=https://10.XX.XXX.XX:2380,node3=https://10.XX.XX.XXX:2380

# Initial cluster token for the etcd cluster during bootstrap.
initial-cluster-token: etcd-cluster

# Initial cluster state ('new' or 'existing').
initial-cluster-state: existing

# Reject reconfiguration requests that would cause quorum loss.
strict-reconfig-check: true

# Specify a v3 authentication token type and its options ('simple' or 'jwt').
auth-token: simple

# Accept etcd V2 client requests.
enable-v2: true

# Enable runtime profiling data via HTTP server.
enable-pprof: false

client-transport-security:
  # Path to the client server TLS cert file.
  # old cert that still works
  # cert-file: /etc/etcd/pki/etcd/etcd.pem

  # new cert that didn't work
  cert-file: /etc/etcd/server.crt

  # Path to the client server TLS key file.
  # old key that still works
  # key-file: /etc/etcd/pki/etcd/etcd-key.pem

  # new key that didn't work
  key-file: /etc/etcd/server.key

  # Enable client cert authentication
  client-cert-auth: true

  # Path to the client server TLS trusted CA cert file.
  trusted-ca-file: /etc/etcd/trusted-ca-bundle.crt

  # Client TLS using generated certificates.
  auto-tls: false

peer-transport-security:
  # Path to the peer server TLS cert file.
  # new cert
  cert-file: /etc/etcd/server.crt

  # Path to the peer server TLS key file.
  # new key
  key-file: /etc/etcd/server.key

  # Enable peer client cert authentication.
  client-cert-auth: true

  # Path to the peer server TLS trusted CA cert file.
  trusted-ca-file: /etc/etcd/trusted-ca-bundle.crt

  # Peer TLS using self-generated certificates if peer-key-file and peer-cert-file are not provided.
  auto-tls: false

# Enable debug-level logging for etcd.
debug: false

# Force to create a new one member cluster.
force-new-cluster: false

auto-compaction-mode: periodic

auto-compaction-retention: '48'

metrics: extensive

listen-metrics-urls: http://10.XX.XXX.XX:2378
@ivanvc
Member

ivanvc commented Mar 30, 2025

Hi @xadips, thanks for opening the issue. Can you confirm that you still have this issue using the latest patch version (v3.5.21) from the v3.5 minor?

v3.5.9 is almost two years old, and some backports may have fixed the issue. Confirming that it exists in the latest version can help with the triage.

Thanks again.

@ivanvc ivanvc added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Mar 30, 2025
@xadips xadips changed the title etcd 3.5.9 does not resolve TLS DNSname when checking SAN for client transport security certificate etcd 3.5.21 does not resolve TLS DNSname when checking SAN for client transport security certificate Mar 31, 2025
@xadips
Author

xadips commented Mar 31, 2025

Still present on v3.5.21; updating the title and description.

@serathius serathius changed the title etcd 3.5.21 does not resolve TLS DNSname when checking SAN for client transport security certificate etcd does not resolve TLS DNSname when checking SAN for client transport security certificate Mar 31, 2025
@serathius
Member

Mentioning an etcd version in the title implies that this is a regression introduced in that version.

@xadips
Author

xadips commented Mar 31, 2025

I found this while switching from cfssl-generated certificates to Vault PKI ones and thought there might be an issue with the new certificates, so I just tested again to confirm.

Manually generated a certificate with cfssl for a single member without IP in SAN and encountered the same problem:

cfssl gencert -profile=kubernetes -ca=ca.pem \
  -ca-key=ca_key.pem -config=ca-config.json \
  -hostname=<hostname> \
  etcd/etcd-csr.json | cfssljson -bare etcd

etcdctl warns about it when trying to access that member:
transport: authentication handshake failed: tls: failed to verify certificate: x509: cannot validate certificate for 10.XX.XX.XXX because it doesn't contain any IP SANs
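
The error text is consistent with standard TLS hostname verification on the client side: when the endpoint is an IP address, only IP SANs are compared, and DNS SANs are not resolved to look for a match. The following Python sketch is a simplified model of that rule, not etcd's or Go's actual code. It ignores wildcards, and the host name `node1.example.internal` and address `10.0.0.1` are made-up stand-ins for the redacted values.

```python
import ipaddress


def is_ip(host):
    """Return True if host is an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def san_covers(host, dns_sans=(), ip_sans=()):
    """Simplified SAN check.

    An IP literal is compared only with IP SANs; a name is compared only
    with DNS SANs (exact, case-insensitive match, no wildcard handling).
    """
    if is_ip(host):
        target = ipaddress.ip_address(host)
        return any(ipaddress.ip_address(ip) == target for ip in ip_sans)
    return any(host.lower() == name.lower() for name in dns_sans)


# Hypothetical values standing in for the redacted ones.
new_cert_dns = ["node1.example.internal"]
print(san_covers("10.0.0.1", dns_sans=new_cert_dns))                # IP endpoint, DNS-only cert
print(san_covers("node1.example.internal", dns_sans=new_cert_dns))  # DNS endpoint, DNS-only cert
```

Under this model, a DNS-only certificate is accepted only when the client is pointed at the host name, which matches the etcdctl failure when it is pointed at the member's IP.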

@ahrtr
Member

ahrtr commented Apr 14, 2025

This does not seem to be an etcd-specific issue; it looks like a generic certificate issue: https://serverfault.com/questions/611120/failed-tls-handshake-does-not-contain-any-ip-sans

What are the values of the following options?

  • --listen-client-urls
  • --advertise-client-urls
  • --listen-peer-urls
  • --initial-advertise-peer-urls

Do the peers communicate using DNS names, while client-server communication uses IP addresses?
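
One quick way to answer that question is to classify the host part of each configured URL as an IP literal or a DNS name. This is a minimal Python sketch; the function names and the example addresses are hypothetical. Redacted placeholders such as 10.XX.XXX.XX are not valid IP literals and would be reported as "dns", so the real values need to be substituted first.

```python
import ipaddress
from urllib.parse import urlsplit


def host_kind(url):
    """Return 'ip' if the URL host is an IP literal, otherwise 'dns'."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host found in URL: {url!r}")
    try:
        ipaddress.ip_address(host)
        return "ip"
    except ValueError:
        return "dns"


def classify_urls(options):
    """Map each option name to the kinds of hosts in its URL list.

    `options` maps an option name to a comma-separated URL string,
    the same shape etcd uses for its *-urls settings.
    """
    return {
        name: [host_kind(u.strip()) for u in value.split(",") if u.strip()]
        for name, value in options.items()
    }


# Hypothetical values for illustration only.
print(classify_urls({
    "listen-client-urls": "https://10.0.0.1:2379",
    "initial-advertise-peer-urls": "https://node1.example.internal:2380",
}))
```

If the client URLs come back as "ip" while the certificate carries only a DNS SAN, the handshake failure reported above would be the expected result.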
