When datafusion.execution.parquet.coerce_int96 is set, timestamp type is still reported as Timestamp(nanoseconds) #15721

Closed
alamb opened this issue Apr 15, 2025 · 3 comments · Fixed by #15750
Labels
bug Something isn't working

Comments

@alamb
Contributor

alamb commented Apr 15, 2025

Describe the bug

The datafusion.execution.parquet.coerce_int96 setting is documented as follows:

If true, parquet reader will read columns of physical type int96 as originating from a different resolution than nanosecond. This is useful for reading data from systems like Spark which stores microsecond resolution timestamps in an int96 allowing it to write values with a larger date range than 64-bit timestamps with nanosecond resolution.
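To illustrate the range argument in that description, here is a minimal Python sketch. It assumes the common INT96 timestamp layout (8 little-endian bytes of nanoseconds within the day, followed by 4 bytes holding the Julian day number); the helper names `decode_int96` and `fits_in_i64` are hypothetical and are not part of DataFusion or the parquet crate.

```python
import struct
from datetime import date, datetime, timedelta

JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588          # Julian day number of 1970-01-01
NANOS_PER_DAY = 86_400 * 1_000_000_000
I64_MIN, I64_MAX = -(2**63), 2**63 - 1


def decode_int96(raw: bytes) -> int:
    """Decode a 12-byte INT96 timestamp into nanoseconds since the Unix epoch.

    Assumed layout: nanoseconds within the day (8 bytes), then Julian day (4 bytes).
    The result is an unbounded Python int, so it may exceed the i64 range.
    """
    nanos_of_day, julian_day = struct.unpack("<qI", raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day


def fits_in_i64(value: int) -> bool:
    """True if the value can be stored in a signed 64-bit timestamp."""
    return I64_MIN <= value <= I64_MAX


# Latest instant representable as i64 nanoseconds (computed via microseconds,
# since Python's timedelta has microsecond precision).
last_nanosecond_instant = datetime(1970, 1, 1) + timedelta(microseconds=I64_MAX // 1000)

# A midnight timestamp in the year 3000, encoded as INT96.
days_since_epoch = (date(3000, 1, 1) - date(1970, 1, 1)).days
raw = struct.pack("<qI", 0, JULIAN_DAY_OF_UNIX_EPOCH + days_since_epoch)

nanos = decode_int96(raw)
micros = nanos // 1000

print("last i64 nanosecond instant:", last_nanosecond_instant)
print("year 3000 fits as nanoseconds: ", fits_in_i64(nanos))
print("year 3000 fits as microseconds:", fits_in_i64(micros))
```

Under these assumptions, a 64-bit nanosecond timestamp runs out in the year 2262, while the same value expressed in microseconds still fits, which is why reading int96 columns at a coarser resolution is useful.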

However, when I set this to us (microseconds), the type is still reported as Timestamp(Nanosecond, None).

To Reproduce

-- Enable coercion of int96 to microseconds
set datafusion.execution.parquet.coerce_int96 = us;

-- Create external table
CREATE EXTERNAL TABLE int96_from_spark
STORED AS PARQUET
LOCATION 'parquet-testing/data/int96_from_spark.parquet';

-- Print schema
describe int96_from_spark;

Results in

+-------------+-----------------------------+-------------+
| column_name | data_type                   | is_nullable |
+-------------+-----------------------------+-------------+
| a           | Timestamp(Nanosecond, None) | YES         |
+-------------+-----------------------------+-------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

Expected behavior

I expect the reported data_type for column a to be Timestamp(Microsecond, None).

Additional context

@alamb
Contributor Author

alamb commented Apr 15, 2025

@andygrove
Member

Thanks @alamb. @mbutrovich fyi

@chenkovsky
Contributor

take
