Fix wasm32 build on version 46 #15102

Merged
merged 2 commits into from
Mar 11, 2025
Conversation

XiangpengHao
Contributor

@XiangpengHao XiangpengHao commented Mar 9, 2025

Which issue does this PR close?

Rationale for this change

I encountered a compile error when trying to upgrade DataFusion to v46 for parquet viewer.

I think the bug was accidentally introduced in #14464 cc @logan-keede

What changes are included in this PR?

Reverted the wasm32 gate, since the gated code should work on wasm32. I also enabled the parquet feature for wasm-test so that CI can catch this bug.

I also removed the fs feature on tokio: I don't think the parquet datasource relies on it, and it breaks the wasm build.

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the development-process Related to development process of DataFusion label Mar 9, 2025
      - name: Install dependencies
        run: |
          apt-get update -qq
          apt-get install -y -qq clang
Contributor Author

Without this I got: https://github.com/XiangpengHao/datafusion/actions/runs/13744579987/job/38437951143#step:6:144

warning: [email protected]+zstd.1.5.6: ToolExecError: Command LC_ALL="C" "sccache" "clang" "-O0" "-ffunction-sections" "-fdata-sections" "-fno-exceptions" "-g" "-fno-omit-frame-pointer" "--target=wasm32-unknown-unknown" "-I" "wasm-shim/" "-I" "zstd/lib/" "-I" "zstd/lib/common" "-fvisibility=hidden" "-DZSTD_LIB_DEPRECATED=0" "-DXXH_PRIVATE_API=" "-DZSTDLIB_VISIBILITY=" "-DZSTDERRORLIB_VISIBILITY=" "-o" "/__w/datafusion/datafusion/target/wasm32-unknown-unknown/debug/build/zstd-sys-789e7626e12bcd14/out/44ff4c55aa9e5133-debug.o" "-c" "zstd/lib/common/debug.c" with args clang did not execute successfully (status code exit status: 2).cargo:warning=Compiler family detection failed due to error: ToolNotFound: Failed to find tool. Is `clang` installed?
warning: [email protected]+zstd.1.5.6: Compiler family detection failed due to error: ToolNotFound: Failed to find tool. Is `clang` installed?
warning: [email protected]+zstd.1.5.6: Compiler family detection failed due to error: ToolNotFound: Failed to find tool. Is `clang` installed?
warning: [email protected]+zstd.1.5.6: Compiler family detection failed due to error: ToolNotFound: Failed to find tool. Is `clang` installed?
warning: [email protected]+zstd.1.5.6: sccache: error: failed to execute compile
warning: [email protected]+zstd.1.5.6: sccache: caused by: cannot find binary path

So I added clang here

@logan-keede
Contributor

Thanks for cleaning up after me @XiangpengHao.

Contributor

@alamb alamb left a comment

Thank you @XiangpengHao

It seems like this code is not covered and thus is likely to get broken again as part of refactoring.

We have a CI job that is supposed to check wasm like this:

  linux-wasm-pack:
    name: build with wasm-pack
    runs-on: ubuntu-latest
    container:
      image: amd64/rust
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust toolchain
        uses: ./.github/actions/setup-builder
        with:
          rust-version: stable
      - name: Install wasm-pack
        run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
      - name: Build with wasm-pack
        working-directory: ./datafusion/wasmtest

It has a special crate that compiles some sort of test
https://github.com/apache/datafusion/blob/main/datafusion/wasmtest

Can you help me understand what is different about the parquet viewer that isn't covered by the existing test?

@@ -45,7 +45,7 @@ chrono = { version = "0.4", features = ["wasmbind"] }
 # all the `std::fmt` and `std::panicking` infrastructure, so isn't great for
 # code size when deploying.
 console_error_panic_hook = { version = "0.1.1", optional = true }
-datafusion = { workspace = true }
+datafusion = { workspace = true, features = ["parquet"] }
Contributor Author

Can you help me understand what is different about the parquet viewer that isn't covered by the existing test?

Here's the secret, @alamb. I think the workspace datafusion dependency has all features disabled, so the existing test never compiles the parquet code. If we enable the parquet feature here, we should see the compile error.
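
To illustrate the point about disabled features, here is a minimal Rust sketch, not DataFusion's actual code. The feature name `parquet` is the one discussed in this thread, but `compiled_sources` is a hypothetical function invented for the example. Code behind `#[cfg(feature = "parquet")]` is stripped before type checking when the feature is off, so a crate that depends on `datafusion` without that feature never compiles the parquet code and cannot surface errors in it.

```rust
// Hypothetical example: `compiled_sources` is not a DataFusion API.

// Compiled only when the crate is built with `--features parquet`.
#[cfg(feature = "parquet")]
pub fn compiled_sources() -> Vec<&'static str> {
    vec!["csv", "parquet"]
}

// Fallback used when the `parquet` feature is off.
#[cfg(not(feature = "parquet"))]
pub fn compiled_sources() -> Vec<&'static str> {
    vec!["csv"]
}
```

Only one of the two definitions exists in any given build, which is why a CI job that leaves the feature off says nothing about the gated code.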

Contributor

Indeed you are right -- I verified I got a compile error without the code changes in this PR

andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion/datafusion/wasmtest$ wasm-pack build --dev

Compiling datafusion-datasource-csv v46.0.0 (/Users/andrewlamb/Software/datafusion/datafusion/datasource-csv)
error[E0432]: unresolved import `crate::file_format::coerce_file_schema_to_view_type`
   --> datafusion/datasource-parquet/src/opener.rs:23:40
    |
23  |     coerce_file_schema_to_string_type, coerce_file_schema_to_view_type,
    |                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |                                        |
    |                                        no `coerce_file_schema_to_view_type` in `file_format`
    |                                        help: a similar name exists in the module: `coerce_file_schema_to_string_type`
    |
note: found an item that was configured out
   --> datafusion/datasource-parquet/src/file_format.rs:470:8
    |
470 | pub fn coerce_file_schema_to_view_type(
    |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: the item is gated here
   --> datafusion/datasource-parquet/src/file_format.rs:469:1
    |
469 | #[cfg(not(target_arch = "wasm32"))]
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0425]: cannot find function `coerce_file_schema_to_view_type` in this scope
   --> datafusion/datasource-parquet/src/file_format.rs:726:27
    |
519 | / pub fn coerce_file_schema_to_string_type(
520 | |     table_schema: &Schema,
521 | |     file_schema: &Schema,
522 | | ) -> Option<Schema> {
...   |
571 | | }
    | |_- similarly named function `coerce_file_schema_to_string_type` defined here
...
726 |       if let Some(merged) = coerce_file_schema_to_view_type(&table_schema, &file...
    |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `coerce_file_schema_to_string_type`
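
The shape of this failure can be sketched in miniature. The names below (`coerce_to_view_type`, `open_file`) and the string type names are hypothetical stand-ins, not the real DataFusion signatures. The sketch shows the fixed state: the helper carries no target gate, so the ungated caller resolves it on every target. Restoring the gate shown in the comment would, on wasm32 only, remove the helper while leaving the caller in place, which is the kind of mismatch the compiler reports above.

```rust
// Hypothetical stand-ins; these are not the real DataFusion functions.

// Before the fix, an item like this carried
//     #[cfg(not(target_arch = "wasm32"))]
// so it was compiled out on wasm32 while its callers were not.
pub fn coerce_to_view_type(field_types: &[&str]) -> Vec<String> {
    field_types
        .iter()
        .map(|t| match *t {
            // Illustrative mapping from plain types to view types.
            "Utf8" => "Utf8View".to_string(),
            "Binary" => "BinaryView".to_string(),
            other => other.to_string(),
        })
        .collect()
}

// An ungated caller: it needs the helper on every target, wasm32 included.
pub fn open_file(field_types: &[&str]) -> Vec<String> {
    coerce_to_view_type(field_types)
}
```

A gate on an item is only safe when every use of that item sits behind the same gate; otherwise the affected target fails with unresolved-name errors.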

Contributor

@alamb alamb left a comment

🤔 maybe we should plan to make a DataFusion 46.0.1 release with this fix

@alamb
Contributor

alamb commented Mar 10, 2025

I have been thinking about this issue

I plan to:

  1. Make a ticket proposing a new patch release for datafusion 46
  2. Make a ticket to cover doing something with parquet via wasm (not just adding the feature flag as in this PR)

@XiangpengHao
Contributor Author

Make a ticket to cover doing something with parquet via wasm (not just adding the feature flag as in this PR)

Sounds great! I can also try to upgrade DataFusion in parquet viewer before every major release to see if I can help capture anything.

@alamb
Contributor

alamb commented Mar 11, 2025

BTW I think we should consider fixing this in a 46.0.1 release.

@alamb
Contributor

alamb commented Mar 11, 2025

Filed a ticket to track adding coverage

@alamb alamb merged commit 6f285d6 into apache:main Mar 11, 2025
27 checks passed
@alamb
Contributor

alamb commented Mar 11, 2025

Thanks again @XiangpengHao

Development

Successfully merging this pull request may close these issues.

regression: DataFusion 46 wasm compile error with parquet
3 participants