Describe your environment
Environment
Kubernetes: v1.30
Local setup using kind
OTel Collector: v0.121.0
OpenTelemetry Python SDK: 1.31.0
Nginx Ingress Controller
Description
When using the OpenTelemetry collector directly with port forwarding (port 4318), metric export latency is normal (50-150 ms). However, when an Nginx ingress is introduced in front of the collector, the latency increases dramatically to 3-5 seconds per export.
Steps to Reproduce
A simple OTel Collector pipeline configuration is used
The OTLP HTTP exporter is used (port 4318)
No visible errors in the logs, just increased latency
Ingress rewrite rule in use: nginx.ingress.kubernetes.io/rewrite-target: /v1/metrics
Collector deployed as a StatefulSet with minimal processing (batch processor only)
Troubleshooting Attempted
Verified that the ingress configuration is correct by confirming metrics are received
Checked Nginx ingress controller logs for any errors or warnings
Confirmed that other services behind the same ingress controller don't experience similar latency issues
Used a minimal collector configuration with only the debug exporter
Noted that the ingress uses the nginx.ingress.kubernetes.io/rewrite-target annotation, which might affect routing (see the probe sketched below)
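For what it's worth, a raw HTTP probe along the following lines can compare the two routes without involving the SDK at all. This is only a sketch: the URLs are placeholders for my local setup (port-forward on localhost:4318, a hosts entry for otel-metrics.local pointing at the ingress), and it assumes the collector accepts an empty body as an empty ExportMetricsServiceRequest.

import time

import requests

ENDPOINTS = {
    "port-forward": "http://localhost:4318/v1/metrics",
    "ingress": "http://otel-metrics.local/v1/metrics",  # assumes a local hosts/DNS entry
}

for name, url in ENDPOINTS.items():
    for _ in range(5):
        start = time.monotonic()
        resp = requests.post(
            url,
            data=b"",  # empty OTLP protobuf payload
            headers={"Content-Type": "application/x-protobuf"},
            timeout=10,
        )
        elapsed_ms = (time.monotonic() - start) * 1000
        print(f"{name}: status={resp.status_code} latency={elapsed_ms:.1f} ms")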
Impact
This latency increase makes using an Nginx ingress in front of the OTel Collector impractical for production environments where timely metric export is critical.
# Simple OpenTelemetry Collector configuration
# Just receives metrics on port 4318 and outputs to stdout
global:
  defaultApplicationName: "metrics-local-kind"
  defaultSubsystemName: "metrics-local-kind"

nameOverride: "metrics-local-kind"
fullnameOverride: "metrics-local-kind"

mode: "statefulset" # Keeping statefulset as in original config

# Disable all presets that we don't need
presets:
  logsCollection:
    enabled: false
  hostMetrics:
    enabled: false
  kubernetesAttributes:
    enabled: false # Changed to false since we're just printing to stdout
  clusterMetrics:
    enabled: false
  kubeletMetrics:
    enabled: false

configMap:
  create: true

# The core configuration
config:
  exporters:
    # Only using debug exporter to print to stdout
    debug:
      verbosity: detailed # Print detailed metrics information
  extensions:
    health_check: {} # Keep health check for monitoring
  processors:
    batch: # Basic batch processor to efficiently handle metrics
      send_batch_size: 1024
      timeout: "1s"
  receivers:
    otlp: # OTLP receiver to get metrics
      protocols:
        http:
          endpoint: "0.0.0.0:4318" # Listen for HTTP OTLP metrics on port 4318
  service:
    extensions:
      - health_check
    pipelines:
      metrics: # Simple metrics pipeline
        receivers:
          - otlp
        processors:
          - batch
        exporters:
          - debug # Only export to debug (stdout)

# Container image configuration
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: "0.121.0" # Keeping your version

command:
  name: otelcol-contrib

# Basic setup for the service account
serviceAccount:
  create: true

# We don't need cluster role
clusterRole:
  create: false

# Restoring statefulset configuration from original
statefulset:
  persistentVolumeClaimRetentionPolicy:
    enabled: true
    whenDeleted: Delete
    whenScaled: Retain
  volumeClaimTemplates:
    - metadata:
        name: queue
      spec:
        storageClassName: standard
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "1Gi"

# Add pod identity as an environment variable for application use
extraVolumeMounts:
  - name: queue
    mountPath: /var/lib/storage/queue

initContainers:
  - name: init-fs
    image: busybox:latest
    command:
      - sh
      - "-c"
      - "chown -R 10001: /var/lib/storage/queue"
    volumeMounts:
      - name: queue
        mountPath: /var/lib/storage/queue

# Enable required ports
ports:
  otlp-http:
    enabled: true
    containerPort: 4318
    servicePort: 4318
    protocol: TCP
  metrics:
    enabled: true
    containerPort: 8888
    servicePort: 8888
    protocol: TCP

# Minimal resource requirements
resources:
  limits:
    memory: 200Mi
  requests:
    cpu: 200m
    memory: 200Mi

replicaCount: 1

# Simple ClusterIP service
service:
  type: ClusterIP

# Keeping the ingress configuration
ingress:
  enabled: true
  ingressClassName: nginx # Matches the NGINX Ingress Controller
  hosts:
    - host: otel-metrics.local # Dummy host for local testing in kind
      paths:
        - path: /
          pathType: Prefix
          port: 4318
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /v1/metrics # Rewrite to OTLP metrics endpoint
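Two notes on the ingress section above: with path: / and no regex capture group, the rewrite-target annotation should rewrite every request URI to /v1/metrics before it reaches the collector, and the only thing that changes between the two test runs is the exporter endpoint. A rough sketch of the two exporter configurations (the URLs are placeholders for my local setup, not exact values from the test):

from opentelemetry.exporter.otlp.proto.http import Compression
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter

# Direct route: kubectl port-forward service/otel-collector 4318:4318
direct_exporter = OTLPMetricExporter(
    endpoint="http://localhost:4318/v1/metrics",
    compression=Compression("none"),
)

# Ingress route: host taken from ingress.hosts[0] above; the rewrite-target
# annotation rewrites the request path to /v1/metrics at the ingress anyway.
ingress_exporter = OTLPMetricExporter(
    endpoint="http://otel-metrics.local/v1/metrics",  # placeholder local host entry
    compression=Compression("none"),
)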
Python app:

import logging
import random
import time

from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.http import Compression
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
from opentelemetry.metrics import set_meter_provider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import SERVICE_NAME, Attributes, Resource

logger = logging.getLogger("via_telemetry")

exporter = OTLPMetricExporter(
    endpoint='http://0.0.0.0:4318/v1/metrics',
    timeout=8,  # type: ignore[arg-type]
    compression=Compression("none"),
)
metric_readers = []
reader = PeriodicExportingMetricReader(exporter)
metric_readers.append(reader)

service_attr: Attributes = {SERVICE_NAME: "sdk_test", "team": "o11y"}
service_resource = Resource(attributes=service_attr)
meter_provider = MeterProvider(resource=service_resource, metric_readers=metric_readers)
set_meter_provider(meter_provider)

meter = metrics.get_meter("otel-tests")
process_counter = meter.create_counter(
    name="sdk_counter_tests",
    unit="invocation",
    description="Counts the number of process invocations with large increase",
)


def main():
    counter = 0
    logger.info("Function triggered successfully")
    labels = {
        'env': 'dev',
        'city_id': '123',
    }
    try:
        while True:
            rand_num = random.randrange(1, 10)
            logger.info("Function triggered successfully")
            process_counter.add(rand_num, labels)
            counter += rand_num
            time.sleep(1)
    except KeyboardInterrupt:
        start = time.time()
        meter_provider.force_flush(8000)  # timeout in milliseconds
        end = time.time()
        print(f"time to flush metrics: {end - start}")
        print(f'total counter: {counter}')


if __name__ == "__main__":
    main()
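To see where the time goes per export (rather than only at the final force_flush), a timing wrapper can be dropped around the exporter; this is a debugging sketch I added, not part of the original app, and it reuses the imports and logger defined above.

# Debugging sketch (not in the original app): logs how long each periodic export
# takes, so per-export latency can be compared between the two endpoints.
class TimedOTLPMetricExporter(OTLPMetricExporter):
    def export(self, metrics_data, *args, **kwargs):
        start = time.monotonic()
        result = super().export(metrics_data, *args, **kwargs)
        logger.info("OTLP export took %.1f ms", (time.monotonic() - start) * 1000)
        return result

# Drop-in replacement for the exporter defined at the top of the app:
# exporter = TimedOTLPMetricExporter(
#     endpoint='http://0.0.0.0:4318/v1/metrics',
#     timeout=8,
#     compression=Compression("none"),
# )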
What happened?
With direct port forwarding: 50-150ms latency
With Nginx ingress: 3-5s latency (30-100x increase)
Steps to Reproduce
Set up a kind cluster with Kubernetes 1.30
Deploy OTel collector v0.121.0 with a simple pipeline
Test direct export via port forwarding:
kubectl port-forward service/otel-collector 4318:4318
Result: Export latency is 50-150ms
Deploy Nginx ingress controller and configure it to route to the OTel collector
Export metrics through the ingress (see the sketch below for reaching it from outside the kind cluster)
Result: Export latency increases to 3-5 seconds
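For the ingress step, since the app runs outside the kind cluster, the ingress controller can be reached through its own port-forward while the Host header selects the otel-metrics.local rule. This is only a sketch: the namespace, service name, and local port below are assumptions about a default ingress-nginx install, not exact values from my setup.

# Sketch: reach the ingress from outside the kind cluster via a port-forward on the
# ingress controller, e.g. (names/port are assumptions for a default install):
#   kubectl port-forward -n ingress-nginx service/ingress-nginx-controller 8080:80
import requests

resp = requests.post(
    "http://localhost:8080/v1/metrics",
    data=b"",  # empty OTLP protobuf payload
    headers={
        "Host": "otel-metrics.local",  # matches ingress.hosts[0] in the values file
        "Content-Type": "application/x-protobuf",
    },
    timeout=10,
)
print(resp.status_code, resp.elapsed.total_seconds())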
Expected Result
The latency should remain comparable when using an ingress, perhaps with a slight increase but not a 30-100x degradation.
Additional Information
Actual Result
When using the OpenTelemetry collector directly with port forwarding (port 4318), metric export latency is normal (50-150 ms). However, when an Nginx ingress is introduced in front of the collector, the latency increases dramatically to 3-5 seconds per export.
Additional context
No response
Would you like to implement a fix?
None
Thank you @xrmx.
Actually, I'm not sure this belongs to nginx either; the delay only occurs when an OTel collector is behind the ingress.
Does anyone have any clue?