[translator/azurelogs] Improve performance #39340

Open, wants to merge 5 commits into base: main

@constanca-m constanca-m commented Apr 11, 2025

Description

This PR only improves performance. It does not change any functionality or output, as the passing unit tests show.

These are the main changes:

  • Iterate over the Azure logs only once. Previously we kept a slice of keys and a map that stored all logs for the same resource ID.
  • Remove the field mapping maps, since looking up each field in a map is expensive. Instead, a function now checks the field name and adds the value to the attributes. This function is used from the start, because we know the category right away.
  • Use jsoniter's ConfigFastest configuration and a borrowed iterator.
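The second change above can be illustrated with a minimal sketch. The function name `putField`, the category `ExampleAccessLog`, and the field and attribute names are all hypothetical and are not taken from this PR; the sketch only shows the general shape of replacing a per-category lookup map with a function that switches on the field name.

```go
package main

import "fmt"

// putField is a hypothetical stand-in for the function described above.
// It checks the field name with a switch and writes the value directly
// into attrs under its mapped name, so no mapping table is consulted.
func putField(category, field string, value any, attrs map[string]any) {
	switch category {
	case "ExampleAccessLog": // hypothetical category name
		switch field {
		case "httpMethod":
			attrs["http.request.method"] = value
			return
		case "clientIp":
			attrs["client.address"] = value
			return
		}
	}
	// Fields without a dedicated case keep their original name.
	attrs[field] = value
}

func main() {
	attrs := make(map[string]any)
	putField("ExampleAccessLog", "httpMethod", "GET", attrs)
	putField("ExampleAccessLog", "trackingReference", "abc123", attrs)
	fmt.Println(attrs)
}
```

Because the category is known before the fields are visited, a function of this shape can be called while the record is being read, without a second pass over the data.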

Results:

goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
                             │   before    │               this PR               │
                             │   sec/op    │   sec/op     vs base                │
UnmarshalLogs/1000_record-16   2.226m ± 2%   1.590m ± 6%  -28.56% (p=0.000 n=10)
UnmarshalLogs/1_record-16      2.890µ ± 1%   2.155µ ± 1%  -25.45% (p=0.000 n=10)
UnmarshalLogs/100_record-16    217.9µ ± 2%   155.4µ ± 1%  -28.69% (p=0.000 n=10)
geomean                        111.9µ        81.05µ       -27.58%

                             │    before    │               this PR                │
                             │     B/op     │     B/op      vs base                │
UnmarshalLogs/1000_record-16   2.093Mi ± 0%   1.293Mi ± 0%  -38.24% (p=0.000 n=10)
UnmarshalLogs/1_record-16      2.484Ki ± 0%   1.506Ki ± 0%  -39.39% (p=0.000 n=10)
UnmarshalLogs/100_record-16    216.0Ki ± 0%   144.9Ki ± 0%  -32.91% (p=0.000 n=10)
geomean                        104.8Ki        66.11Ki       -36.91%

                             │   before    │               this PR               │
                             │  allocs/op  │  allocs/op   vs base                │
UnmarshalLogs/1000_record-16   38.05k ± 0%   20.03k ± 0%  -47.36% (p=0.000 n=10)
UnmarshalLogs/1_record-16       52.00 ± 0%    31.00 ± 0%  -40.38% (p=0.000 n=10)
UnmarshalLogs/100_record-16    3.835k ± 0%   2.025k ± 0%  -47.20% (p=0.000 n=10)
geomean                        1.965k        1.079k       -45.07%

Performance improved on every metric: time, bytes, and allocations per operation.
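For readers who want to produce numbers of the same shape, the following is a rough sketch of a benchmark with one sub-benchmark per record count. The payload layout, `makePayload`, and the placeholder `unmarshal` function are assumptions made for illustration; they are not the benchmark or the unmarshaler used in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// makePayload builds a hypothetical JSON payload holding n records.
func makePayload(n int) []byte {
	type rec struct {
		ResourceID string `json:"resourceId"`
		Category   string `json:"category"`
	}
	recs := make([]rec, n)
	for i := range recs {
		recs[i] = rec{ResourceID: "/subscriptions/example", Category: "Example"}
	}
	data, err := json.Marshal(map[string]any{"records": recs})
	if err != nil {
		panic(err)
	}
	return data
}

// unmarshal is a placeholder for the function under test. It only counts
// the records it finds in the payload.
func unmarshal(data []byte) (int, error) {
	var out struct {
		Records []json.RawMessage `json:"records"`
	}
	if err := json.Unmarshal(data, &out); err != nil {
		return 0, err
	}
	return len(out.Records), nil
}

// BenchmarkUnmarshalLogs runs one sub-benchmark per payload size and
// reports allocations alongside timings.
func BenchmarkUnmarshalLogs(b *testing.B) {
	for _, n := range []int{1, 100, 1000} {
		data := makePayload(n)
		b.Run(fmt.Sprintf("%d_record", n), func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				if _, err := unmarshal(data); err != nil {
					b.Fatal(err)
				}
			}
		})
	}
}

func main() {
	// testing.Init registers the testing flags when not running under "go test".
	testing.Init()
	res := testing.Benchmark(BenchmarkUnmarshalLogs)
	fmt.Println(res.String(), res.MemString())
}
```

In a real package the benchmark function would normally live in a `_test.go` file and be run with `go test -bench`, with the before and after outputs compared by a tool such as benchstat.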

Further improvements are possible, but I am leaving them out of this PR to keep it small:

  • We should not carry an attributes map. Instead, we should add attributes to the record as soon as possible.

These issues might also be affected by #39186 if it goes forward.

Link to tracking issue

Relates #39119.

Testing

Unit tests and benchmark.

@constanca-m constanca-m requested review from atoulme and a team as code owners April 11, 2025 15:20
@@ -227,26 +218,42 @@ func copyPropertiesAndApplySemanticConventions(category string, properties *any,
return
}

// TODO: check if this is a valid JSON string and parse it?

@constanca-m constanca-m Apr 11, 2025

I have removed this TODO: if we parse the value as JSON we will lose some fields, and that would be a breaking change at this point. I also think unmarshaling would perform worse than just iterating over the map. However, using any here is a performance issue that we should look into.
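To make the trade-off in this comment concrete, here is a minimal sketch of handling a properties value typed as `any` without parsing it as JSON. The function `copyProperties`, the fallback key `properties`, and the sample field are hypothetical; the actual function in this PR has a different signature and body.

```go
package main

import "fmt"

// copyProperties is a hypothetical, simplified handler for a properties
// value of type any. A map is iterated directly; any other value, such as
// a string that may or may not contain JSON, is stored as-is, not parsed.
func copyProperties(properties *any, attrs map[string]any) {
	if properties == nil || *properties == nil {
		return
	}
	switch p := (*properties).(type) {
	case map[string]any:
		for k, v := range p {
			attrs[k] = v
		}
	default:
		// Assumption for this sketch: keep the raw value under one key.
		attrs["properties"] = p
	}
}

func main() {
	var asMap any = map[string]any{"statusCode": 200}
	var asString any = `{"statusCode": 200}`

	fromMap, fromString := map[string]any{}, map[string]any{}
	copyProperties(&asMap, fromMap)
	copyProperties(&asString, fromString)

	fmt.Println(fromMap)
	fmt.Println(fromString)
}
```

The type switch is where the cost of `any` shows up: every value has to be inspected at run time before it can be copied.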

Comment on lines +319 to +324
// TODO Add unit test again once bug gets fixed.
// Bug https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/39186#issuecomment-2798517892
// "log_maximum": {
// logFilename: "log-maximum.json",
// expectedFilename: "log-maximum-expected.yaml",
// },

🚨

3 participants