Skip to content

ARROW-5950: [Rust][DataFusion] Add logger #8986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

Dandandan
Copy link
Contributor

It seems useful to me to do some extra (debug, info, etc) logging.

I added to the benchmark.
Executing with RUST_LOG="debug" gives (next to sqlparser logs):

[2020-12-21T23:04:59Z DEBUG datafusion::execution::context] Logical plan:
     Sort: #l_returnflag ASC NULLS FIRST, #l_linestatus ASC NULLS FIRST
      Projection: #l_returnflag, #l_linestatus, #SUM(l_quantity) AS sum_qty, #SUM(l_extendedprice) AS sum_base_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount) AS sum_disc_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount Multiply Int64(1) Plus l_tax) AS sum_charge, #AVG(l_quantity) AS avg_qty, #AVG(l_extendedprice) AS avg_price, #AVG(l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
        Aggregate: groupBy=[[#l_returnflag, #l_linestatus]], aggr=[[SUM(#l_quantity), SUM(#l_extendedprice), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount Multiply Int64(1) Plus #l_tax), AVG(#l_quantity), AVG(#l_extendedprice), AVG(#l_discount), COUNT(UInt8(1))]]
          Filter: #l_shipdate LtEq CAST(Utf8("1998-09-02") AS Date32(Day))
            TableScan: lineitem projection=None
[2020-12-21T23:04:59Z DEBUG datafusion::execution::context] Optimized logical plan:
     Sort: #l_returnflag ASC NULLS FIRST, #l_linestatus ASC NULLS FIRST
      Projection: #l_returnflag, #l_linestatus, #SUM(l_quantity) AS sum_qty, #SUM(l_extendedprice) AS sum_base_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount) AS sum_disc_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount Multiply Int64(1) Plus l_tax) AS sum_charge, #AVG(l_quantity) AS avg_qty, #AVG(l_extendedprice) AS avg_price, #AVG(l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
        Aggregate: groupBy=[[#l_returnflag, #l_linestatus]], aggr=[[SUM(#l_quantity), SUM(#l_extendedprice), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount Multiply Int64(1) Plus #l_tax), AVG(#l_quantity), AVG(#l_extendedprice), AVG(#l_discount), COUNT(UInt8(1))]]
          Filter: #l_shipdate LtEq CAST(Utf8("1998-09-02") AS Date32(Day))
            TableScan: lineitem projection=Some([4, 5, 6, 7, 8, 9, 10])
Query 1 iteration 9 took 802.7 ms
Query 1 avg time: 915.30 ms

@codecov-io
Copy link

Codecov Report

Merging #8986 (aacc33d) into master (38ba81b) will decrease coverage by 0.00%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8986      +/-   ##
==========================================
- Coverage   82.65%   82.65%   -0.01%     
==========================================
  Files         200      200              
  Lines       49675    49678       +3     
==========================================
+ Hits        41058    41060       +2     
- Misses       8617     8618       +1     
Impacted Files Coverage Δ
rust/arrow-flight/src/arrow.flight.protocol.rs 0.00% <ø> (ø)
rust/benchmarks/src/bin/tpch.rs 0.00% <0.00%> (ø)
rust/datafusion/src/execution/context.rs 89.45% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 38ba81b...aacc33d. Read the comment docs.

@github-actions
Copy link

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Yes, good idea.

FWIW for me we can add it to the bin, also.

@Dandandan
Copy link
Contributor Author

FWIW for me we can add it to the bin, also.

I think that's a good idea, but don't want to add a logger as dependency directly to datafusion. Doing so at the bin level is not yet supported rust-lang/cargo#1982 It will probably change though rust-lang/rfcs#3020
Maybe at some point they can move to their own project.

@github-actions github-actions bot added the needs-rebase A PR that needs to be rebased by the author label Dec 22, 2020
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the idea of allowing logging, as well as the implementation of using the standard logging system is a good idea. Thanks @Dandandan!

It looks like a rebase is needed but otherwise I think this PR is ready to go

@Dandandan
Copy link
Contributor Author

Thanks @alamb updated the PR

@github-actions github-actions bot removed the needs-rebase A PR that needs to be rebased by the author label Dec 22, 2020
@alamb alamb closed this in 0519c4c Dec 22, 2020
Dandandan added a commit to Dandandan/arrow that referenced this pull request Dec 22, 2020
It seems useful to me to do some extra (debug, info, etc) logging.

I added to the benchmark.
Executing with `RUST_LOG="debug"` gives (next to sqlparser logs):

```
[2020-12-21T23:04:59Z DEBUG datafusion::execution::context] Logical plan:
     Sort: #l_returnflag ASC NULLS FIRST, #l_linestatus ASC NULLS FIRST
      Projection: #l_returnflag, #l_linestatus, #SUM(l_quantity) AS sum_qty, #SUM(l_extendedprice) AS sum_base_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount) AS sum_disc_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount Multiply Int64(1) Plus l_tax) AS sum_charge, #AVG(l_quantity) AS avg_qty, #AVG(l_extendedprice) AS avg_price, #AVG(l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
        Aggregate: groupBy=[[#l_returnflag, #l_linestatus]], aggr=[[SUM(#l_quantity), SUM(#l_extendedprice), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount Multiply Int64(1) Plus #l_tax), AVG(#l_quantity), AVG(#l_extendedprice), AVG(#l_discount), COUNT(UInt8(1))]]
          Filter: #l_shipdate LtEq CAST(Utf8("1998-09-02") AS Date32(Day))
            TableScan: lineitem projection=None
[2020-12-21T23:04:59Z DEBUG datafusion::execution::context] Optimized logical plan:
     Sort: #l_returnflag ASC NULLS FIRST, #l_linestatus ASC NULLS FIRST
      Projection: #l_returnflag, #l_linestatus, #SUM(l_quantity) AS sum_qty, #SUM(l_extendedprice) AS sum_base_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount) AS sum_disc_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount Multiply Int64(1) Plus l_tax) AS sum_charge, #AVG(l_quantity) AS avg_qty, #AVG(l_extendedprice) AS avg_price, #AVG(l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
        Aggregate: groupBy=[[#l_returnflag, #l_linestatus]], aggr=[[SUM(#l_quantity), SUM(#l_extendedprice), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount Multiply Int64(1) Plus #l_tax), AVG(#l_quantity), AVG(#l_extendedprice), AVG(#l_discount), COUNT(UInt8(1))]]
          Filter: #l_shipdate LtEq CAST(Utf8("1998-09-02") AS Date32(Day))
            TableScan: lineitem projection=Some([4, 5, 6, 7, 8, 9, 10])
Query 1 iteration 9 took 802.7 ms
Query 1 avg time: 915.30 ms
```

Closes apache#8986 from Dandandan/add_logger

Lead-authored-by: Heres, Daniel <[email protected]>
Co-authored-by: Daniël Heres <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
It seems useful to me to do some extra (debug, info, etc) logging.

I added to the benchmark.
Executing with `RUST_LOG="debug"` gives (next to sqlparser logs):

```
[2020-12-21T23:04:59Z DEBUG datafusion::execution::context] Logical plan:
     Sort: #l_returnflag ASC NULLS FIRST, #l_linestatus ASC NULLS FIRST
      Projection: #l_returnflag, #l_linestatus, #SUM(l_quantity) AS sum_qty, #SUM(l_extendedprice) AS sum_base_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount) AS sum_disc_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount Multiply Int64(1) Plus l_tax) AS sum_charge, #AVG(l_quantity) AS avg_qty, #AVG(l_extendedprice) AS avg_price, #AVG(l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
        Aggregate: groupBy=[[#l_returnflag, #l_linestatus]], aggr=[[SUM(#l_quantity), SUM(#l_extendedprice), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount Multiply Int64(1) Plus #l_tax), AVG(#l_quantity), AVG(#l_extendedprice), AVG(#l_discount), COUNT(UInt8(1))]]
          Filter: #l_shipdate LtEq CAST(Utf8("1998-09-02") AS Date32(Day))
            TableScan: lineitem projection=None
[2020-12-21T23:04:59Z DEBUG datafusion::execution::context] Optimized logical plan:
     Sort: #l_returnflag ASC NULLS FIRST, #l_linestatus ASC NULLS FIRST
      Projection: #l_returnflag, #l_linestatus, #SUM(l_quantity) AS sum_qty, #SUM(l_extendedprice) AS sum_base_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount) AS sum_disc_price, #SUM(l_extendedprice Multiply Int64(1) Minus l_discount Multiply Int64(1) Plus l_tax) AS sum_charge, #AVG(l_quantity) AS avg_qty, #AVG(l_extendedprice) AS avg_price, #AVG(l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
        Aggregate: groupBy=[[#l_returnflag, #l_linestatus]], aggr=[[SUM(#l_quantity), SUM(#l_extendedprice), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount), SUM(#l_extendedprice Multiply Int64(1) Minus #l_discount Multiply Int64(1) Plus #l_tax), AVG(#l_quantity), AVG(#l_extendedprice), AVG(#l_discount), COUNT(UInt8(1))]]
          Filter: #l_shipdate LtEq CAST(Utf8("1998-09-02") AS Date32(Day))
            TableScan: lineitem projection=Some([4, 5, 6, 7, 8, 9, 10])
Query 1 iteration 9 took 802.7 ms
Query 1 avg time: 915.30 ms
```

Closes apache#8986 from Dandandan/add_logger

Lead-authored-by: Heres, Daniel <[email protected]>
Co-authored-by: Daniël Heres <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants