Description
Specifications
- Client Version: 1.48.0
- InfluxDB Version: 2 (influxdb:2 docker image)
- Platform: alpine (Python 3.12 docker image)
Code sample to reproduce problem
import pandas as pd
import time
import influxdb_client
from influxdb_client.client.write_api import PointSettings
ONE_SECOND = pd.Timedelta('1s')
# Time for the first measures
t = pd.Timestamp.now(tz='UTC') - pd.Timedelta('2min').floor('s')
# Define a first client which receives measures for odd numbers
# (influx_configs is a dict of connection settings, e.g. url, token and org; it is not shown here)
client_1 = influxdb_client.InfluxDBClient(**influx_configs)
w_api_1 = client_1.write_api()
q_api_1 = client_1.query_api()
# Write a point to the first client before the second client is created
point_1_1 = [
{"measurement": "test",
"tags": {"client_number": "1",
"numbers": "odds"},
"time": t.isoformat(),
"fields": {"value": 1.}}
]
w_api_1.write(bucket='test_tags_bug', record=point_1_1)
# Define a second client which receives measures for even numbers
# This client uses default tags
client_2_tags = {"client_number": "2", "numbers": "even" }
client_2 = influxdb_client.InfluxDBClient(
default_tags=client_2_tags, **influx_configs
)
# Define a write API for the second client with default arguments
# Pass point_settings=PointSettings() to write_api() here to see the difference
w_api_2 = client_2.write_api()
# Write a point to the second client
point_2_1 = [
{"measurement": "test",
"time": (t + ONE_SECOND).isoformat(),
"fields": {"value": 2.}}
]
w_api_2.write(bucket='test_tags_bug', record=point_2_1)
# Define a third client which receives measures for odd numbers and a half
client_3 = influxdb_client.InfluxDBClient(**influx_configs)
w_api_3 = client_3.write_api()
w_api_3_2 = client_3.write_api(point_settings=PointSettings())
point_3_1 = [
{"measurement": "test",
"tags": {"client_number": "3", "numbers": "odds and a half"},
"time": (t + 2*ONE_SECOND).isoformat(),
"fields": {"value": 1.5}}
]
w_api_3.write(bucket='test_tags_bug', record=point_3_1)
# Write a second point for each client, one minute after the first point
t_2 = (t + pd.Timedelta('1min'))
point_1_2 = [
{"measurement": "test",
"tags": {"client_number": "1", "numbers": "odds"},
"time": t_2.isoformat(),
"fields": {"value": 3.}}
]
point_2_2 = [
{"measurement": "test",
"time": (t_2 + ONE_SECOND).isoformat(),
"fields": {"value": 4.}}
]
point_3_2 = [
{"measurement": "test",
"tags": {"client_number": "3", "numbers": "odds and a half"},
"time": (t_2 + 2 * ONE_SECOND).isoformat(),
"fields": {"value": 3.5}}
]
# Clients 1 and 2 use the same write APIs as initially
w_api_1.write(bucket='test_tags_bug', record=point_1_2)
w_api_2.write(bucket='test_tags_bug', record=point_2_2)
# Client 3 uses a different write API, for which point settings were
# defined explicitly
w_api_3_2.write(bucket='test_tags_bug', record=point_3_2)
time.sleep(2)
query = (
    'from(bucket: "test_tags_bug") '
    '|> range(start: -1h) '
    '|> filter(fn: (r) => r["_measurement"] == "test") '
    '|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")'
)
q_api_1.query_data_frame(query)
Expected behavior
When using the `default_tags` argument of the `InfluxDBClient` object, I would expect every record written by this client to carry these tags by default, without affecting the tags of records written by other clients.
The expected result of the code example is:
result | table | _start | _stop | _time | _measurement | client_number | numbers | value |
---|---|---|---|---|---|---|---|---|
_result | 0 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:43:05.721710+00:00 | test | 1 | odds | 1.0 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:43:06.721710+00:00 | test | 2 | even | 2.0 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:43:07.721710+00:00 | test | 3 | odds and a half | 1.5 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:44:05.721710+00:00 | test | 1 | odds | 3.0 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:44:06.721710+00:00 | test | 2 | even | 4.0 |
_result | 2 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:44:07.721710+00:00 | test | 3 | odds and a half | 3.5 |
Actual behavior
When using the `default_tags` argument of the `InfluxDBClient` object, the tags of any existing or new `WriteApi` instance are altered, regardless of which client instance created it.
The problem arises for instances of `WriteApi` created using `InfluxDBClient.write_api()` with default kwarg values.
The actual result of the code example is:
result | table | _start | _stop | _time | _measurement | client_number | numbers | value |
---|---|---|---|---|---|---|---|---|
_result | 0 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:43:05.721710+00:00 | test | 1 | odds | 1.0 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:43:06.721710+00:00 | test | 2 | even | 2.0 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:43:07.721710+00:00 | test | 2 | even | 1.5 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:44:05.721710+00:00 | test | 2 | even | 3.0 |
_result | 1 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:44:06.721710+00:00 | test | 2 | even | 4.0 |
_result | 2 | 2025-02-03 09:45:07.739252+00:00 | 2025-02-03 10:45:07.739252+00:00 | 2025-02-03 10:44:07.721710+00:00 | test | 3 | odds and a half | 3.5 |
Additional info
Problem investigation
After investigating the code, the problem is caused by the fact that the default kwarg value for `point_settings` in `InfluxDBClient.write_api` is a mutable `PointSettings` instance, which is modified by the method.
As a reference to the same instance is passed to every call of this method, whatever the `InfluxDBClient` instance making the call, all `WriteApi` instances created using `InfluxDBClient.write_api()` use **the same instance of `PointSettings`**.
Following the code execution, the problem begins when the `client.write_api()` method is called:
- The client (say `a_client`) instantiates a `WriteApi`. As no value is passed for `point_settings`, it uses the `PointSettings` instance which is created when the function is loaded (see e.g. here).
- This `PointSettings` instance is consumed by `WriteApi.__init__`,
- which passes it to `_BaseWriteApi.__init__`, which itself binds it to its `_point_settings` attribute. That attribute is then modified in place by adding the default tags from the client.
- From this moment on, any existing or future `WriteApi` instance created using `InfluxDBClient.write_api()` has the default tags of `a_client`!
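The mechanism described in these steps can be reproduced without InfluxDB. The following is a minimal sketch using hypothetical stand-in classes (`FakePointSettings`, `FakeWriteApi` and `FakeClient` are not part of `influxdb_client`). It assumes only the behaviour described in this issue: a default `point_settings` object that is evaluated once at function definition and then mutated in place.

```python
class FakePointSettings:
    """Hypothetical stand-in for PointSettings: holds a dict of default tags."""

    def __init__(self):
        self.default_tags = {}

    def add_default_tag(self, key, value):
        self.default_tags[key] = value


class FakeWriteApi:
    """Hypothetical stand-in for WriteApi: binds the settings, then mutates them."""

    def __init__(self, client, point_settings):
        self._point_settings = point_settings
        for key, value in client.default_tags.items():
            self._point_settings.add_default_tag(key, value)


class FakeClient:
    """Hypothetical stand-in for InfluxDBClient."""

    def __init__(self, default_tags=None):
        self.default_tags = default_tags or {}

    # The default below is evaluated once, when the function is defined,
    # so every call that omits point_settings receives the same object.
    def write_api(self, point_settings=FakePointSettings()):
        return FakeWriteApi(self, point_settings)


# A write API created by a client without default tags...
api_1 = FakeClient().write_api()
tags_before = dict(api_1._point_settings.default_tags)

# ...is affected as soon as another client with default tags creates its own.
api_2 = FakeClient(default_tags={"client_number": "2"}).write_api()
tags_after = dict(api_1._point_settings.default_tags)

shared = api_1._point_settings is api_2._point_settings
print(tags_before, tags_after, shared)
```

In this sketch `api_1` picks up the tags of the second client even though it was created first, which mirrors what happens to `w_api_1` and `w_api_3` in the reproduction script.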
Possible workaround
Always provide a value for the `point_settings` kwarg of the `InfluxDBClient.write_api` method, as in the provided example when creating `w_api_3_2`.
Possible fix
Change the implementation so that the default kwarg value is not a mutable object, which is a known Python "gotcha".
Rather, use a pattern where `None` is the default value and the method body contains `point_settings = point_settings or PointSettings()`.
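As an illustration of the suggested pattern, here is a minimal sketch with hypothetical stand-in classes (`FakePointSettings` and `FakeClient` are not the library's classes, and this is not the library's actual code). For brevity, `write_api` returns the settings object directly instead of a write API wrapping it.

```python
class FakePointSettings:
    """Hypothetical stand-in for PointSettings."""

    def __init__(self):
        self.default_tags = {}


class FakeClient:
    """Hypothetical stand-in for InfluxDBClient."""

    def __init__(self, default_tags=None):
        self.default_tags = default_tags or {}

    def write_api(self, point_settings=None):
        # A fresh settings object is created on each call when none is supplied,
        # so nothing is shared between calls or between clients.
        if point_settings is None:
            point_settings = FakePointSettings()
        point_settings.default_tags.update(self.default_tags)
        return point_settings


settings_1 = FakeClient().write_api()
settings_2 = FakeClient(default_tags={"client_number": "2"}).write_api()

print(settings_1.default_tags, settings_2.default_tags)
```

The sketch uses an explicit `is None` check instead of `or`, so that an object passed by the caller is never replaced, even if it happens to evaluate as falsy. Either form avoids sharing a single default instance.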