histogram without distribution of values transmits bucket_counts but not explicit_bounds using the micrometer timer #7238

Open
niecore opened this issue Apr 1, 2025 · 2 comments
Labels
Bug Something isn't working

Comments

@niecore
niecore commented Apr 1, 2025

Describe the bug

I noticed that the histogram generated by the Micrometer 1.5 instrumentation does not match the specification. The HistogramDataPoint contains the fields count, sum, and bucket_counts (with array length 1) but no explicit_bounds (see the log below):

{\"name\":\"test.timer\",\"description\":\"description\",\"unit\":\"s\",\"histogram\":{\"dataPoints\":[{\"startTimeUnixNano\":\"1743493817029495356\",\"timeUnixNano\":\"1743493877039760941\",\"count\":\"60\",\"sum\":8.737000000000002,\"bucketCounts\":[\"60\"],\"min\":0.019,\"max\":0.293}],\"aggregationTemporality\":2}}

I found the following comment in the specification, which says that a HistogramDataPoint must have either both explicit_bounds and bucket_counts set (with corresponding lengths) or both omitted: https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L496

I am not sure whether this is specific to the OpenTelemetryMeterRegistry or affects all histogram metrics. I discovered the issue while receiving the histogram metric on the OTLP input of the Elastic APM Server: https://github.com/elastic/apm-data/blob/main/input/otlp/metrics.go#L367
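
To make the rule concrete, here is a minimal sketch of the consistency check as I read it from the proto comment: both fields empty, or both set with one more bucket count than bounds. `HistogramPointCheck` and `isConsistent` are hypothetical names for illustration only. They are not part of the OpenTelemetry SDK or of apm-data, and a real receiver may apply a different rule.

```java
import java.util.List;

/** Hypothetical helper for illustration; not an OpenTelemetry API. */
final class HistogramPointCheck {

    /**
     * Returns true when bucket_counts and explicit_bounds are both empty,
     * or both non-empty with bucketCounts.size() == explicitBounds.size() + 1.
     * This encodes the reading of the proto comment used in this issue (an assumption).
     */
    static boolean isConsistent(List<Long> bucketCounts, List<Double> explicitBounds) {
        if (bucketCounts.isEmpty() && explicitBounds.isEmpty()) {
            return true; // both fields omitted
        }
        if (bucketCounts.isEmpty() || explicitBounds.isEmpty()) {
            return false; // only one of the two fields is set
        }
        // N bounds define N + 1 buckets.
        return bucketCounts.size() == explicitBounds.size() + 1;
    }

    public static void main(String[] args) {
        // Shape of the data point in this report: one bucket count, no bounds.
        System.out.println(isConsistent(List.of(60L), List.of()));
        // Two bounds with three bucket counts.
        System.out.println(isConsistent(List.of(10L, 40L, 10L), List.of(5.0, 10.0)));
        // Both fields omitted.
        System.out.println(isConsistent(List.of(), List.of()));
    }
}
```

Under this reading, the first call reports the data point as inconsistent, while the other two pass.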

Steps to reproduce

  1. Instantiate an io.micrometer.core.instrument.Timer
  2. Set OTEL_INSTRUMENTATION_MICROMETER_ENABLED=true
  3. Observe outgoing data via OTEL_METRICS_EXPORTER=console

[java-app-otel] | [otel.javaagent 2025-04-01 13:34:56:783 +0000] [PeriodicMetricReader-1] INFO io.opentelemetry.exporter.logging.LoggingMetricExporter - metric: ImmutableMetricData{resource=Resource{schemaUrl=https://opentelemetry.io/schemas/1.24.0, attributes={container.id="a77637a48f8cf2b64630dbb2489a5a74093d5195953a77d8c8411662ccbb56c6", deployment.environment="qa", host.arch="aarch64", host.name="a77637a48f8c", os.description="Linux 6.12.7-200.fc41.aarch64", os.type="linux", process.command_args=[/opt/java/openjdk/bin/java, -javaagent:/app/elastic-otel-javaagent.jar, -Dotel.instrumentation.logback-appender.experimental.capture-marker-attribute=true, -jar, /app/app.jar], process.executable.path="/opt/java/openjdk/bin/java", process.pid=1, process.runtime.description="Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.14+7", process.runtime.name="OpenJDK Runtime Environment", process.runtime.version="17.0.14+7", service.instance.id="a93f4706-44e2-4fb3-b33b-c4fe204b0345", service.name="java-demo-app-otel", telemetry.distro.name="elastic", telemetry.distro.version="1.3.0", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.47.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.micrometer-1.5, version=null, schemaUrl=null, attributes={}}, name=test.timer, description=description, unit=s, type=HISTOGRAM, data=ImmutableHistogramData{aggregationTemporality=CUMULATIVE, points=[ImmutableHistogramPointData{getStartEpochNanos=1743514436723244038, getEpochNanos=1743514496732826544, getAttributes={}, getSum=7.910000000000002, getCount=60, hasMin=true, getMin=0.0, hasMax=true, getMax=0.294, getBoundaries=[], getCounts=[60], getExemplars=[]}]}}

Expected behavior

https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L498

Actual behavior

See description above

Javaagent or library instrumentation version

opentelemetry-micrometer-1.5

Environment

JDK: eclipse-temurin:17-jre
OS: alpine

Additional context

No response

@niecore niecore added the Bug Something isn't working label Apr 1, 2025
@laurit
Contributor

laurit commented Apr 2, 2025

@jack-berg could you take a look

@jack-berg
Member

Couple of things:

  • Micrometer timers come with or without bucket boundaries, based on whether sla(Duration...) is called, i.e.:

    Timer.builder("foo")
            .sla(Duration.ofSeconds(5), Duration.ofSeconds(10))
            .register(meterRegistry);
    
  • The micrometer instrumentation bridges micrometer timer bucket boundaries here and here. Note there's a small bug in this instrumentation: it exits early if !(builder instanceof ExtendedDoubleHistogramBuilder) (i.e. the incubator isn't present), but this check is no longer necessary now that explicit bucket boundary advice is stable.

  • If a timer doesn't have any buckets, the micrometer instrumentation calls setExplicitBucketBoundaries(Collections.emptyList()). This is valid, and downgrades the histogram to a "single bucket histogram" (i.e. all the measurements are in a single bucket) which turns it into what is traditionally called a summary metric (count, sum, min, max).

  • However, you correctly point out that the proto comment clearly says that "explicit_bounds" and "bucket_counts" must either be both present or both omitted. In the single bucket histogram case, we are setting "bucket_counts" but not "explicit_bounds", and this is a bug we should fix.
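
As a rough illustration of the single bucket case, the sketch below counts measurements into explicit buckets using plain Java. It is not the SDK's aggregator: `ExplicitBucketSketch` and `bucketCounts` are hypothetical names, and the sketch assumes sorted boundaries that act as inclusive upper bounds. With an empty boundary list, every measurement lands in the one remaining bucket, which gives the `getBoundaries=[]`, `getCounts=[60]` shape seen in the report.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of explicit-bucket counting; not the OpenTelemetry SDK implementation. */
final class ExplicitBucketSketch {

    /**
     * Counts measurements into boundaries.size() + 1 buckets.
     * Assumes boundaries are sorted ascending and are inclusive upper bounds.
     */
    static long[] bucketCounts(List<Double> boundaries, double[] measurements) {
        long[] counts = new long[boundaries.size() + 1];
        for (double value : measurements) {
            int index = 0;
            // Advance past every boundary the value exceeds.
            while (index < boundaries.size() && value > boundaries.get(index)) {
                index++;
            }
            counts[index]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] seconds = {0.019, 0.120, 0.293, 7.5};
        // Boundaries at 5s and 10s give three buckets.
        System.out.println(Arrays.toString(bucketCounts(List.of(5.0, 10.0), seconds))); // [3, 1, 0]
        // No boundaries: a single bucket holding every measurement.
        System.out.println(Arrays.toString(bucketCounts(List.of(), seconds))); // [4]
    }
}
```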

Transferring to opentelemetry-java since this is a bug in the SDK / OTLP serialization logic.

@jack-berg jack-berg transferred this issue from open-telemetry/opentelemetry-java-instrumentation Apr 2, 2025