Skip to content

Use 'crc-fast' crate for better CRC performance #4111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 12, 2025

Conversation

onethumb
Copy link
Contributor

@onethumb onethumb commented May 2, 2025

Motivation and Context

The crc-fast crate provides better performance (>100GiB/s for CRC32 and CRC32C) and a unified solution for calculating CRC32, CRC32C, and CRC64NVME, the three supported CRC checksums.

Description

Replaces the current three separate implementations with a single implementation.

Testing

Drop-in minimal change, existing test coverage is comprehensive and still passes.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Replaces crc64fast-nvme with the new crc-fast crate.
Provides a massive performance improvement on modern hardware
(>100GiB/s, depending on the platform).

Also reduces the package dependencies, since crc-fast provides
accelerated calculation for all 3 supported CRC checksums.
@onethumb onethumb requested a review from a team as a code owner May 2, 2025 17:18
@onethumb
Copy link
Contributor Author

onethumb commented May 2, 2025

Based on the PR template, Contributing Guidelines, and reviewing some other open PRs, it wasn't obvious to me what I should do with a possible .changelog entry, so I omitted that portion. Happy to add it if someone can clarify what, if anything, I should be doing there.

Also not clear how some of the other Cargo.lock files get updated appropriately to pick up this change from aws-smithy-checksum. I'm happy to update these, if necessary, and someone can point me at docs for how. I can see that simply assembling the SDK with Gradle updates many, but not all, of these, so perhaps it's some other build process?

The remaining untouched lock files which reference the older packages are:

  • aws/rust-runtime/Cargo.lock
  • aws/sdk/Cargo.lock
  • aws/sdk/benchmarks/previous-release-comparison/Cargo.lock (which seems likely ok?)
  • aws/sdk/benchmarks/s3-express/Cargo.lock

Tagging @landonxjames since we've worked on checksums for this project in the past.

@landonxjames
Copy link
Contributor

landonxjames commented May 5, 2025

Thanks for the contribution! Massaged a few things that were impacting CI (version bump in aws-smithy-checksums crate, updated the Cargo.lock files with ./gradlew aws:sdk:cargoUpdateAllLockfiles, ran rustfmt in the checksums crate). Assuming that all works out it should take care of most of the issues.

I think all that remains would be the failures on our exotic platform tests. I suspect those probably require a software-only fallback implementation for platforms that don't support any of the expected intrinsics:

error: Unsupported architecture for SIMD CRC calculation
  --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.1.0/src/arch/mod.rs:75:9
   |
75 |         compile_error!("Unsupported architecture for SIMD CRC calculation");
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: Unsupported architecture for SIMD CRC calculation
  --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.1.0/src/arch/mod.rs:91:5
   |
91 |     compile_error!("Unsupported architecture for SIMD CRC calculation");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: unused import: `crate::algorithm`
 --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.1.0/src/arch/mod.rs:8:5
  |
8 | use crate::algorithm;
  |     ^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused imports: `Width32` and `Width64`
 --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.1.0/src/arch/mod.rs:9:33
  |
9 | use crate::structs::{CrcParams, Width32, Width64};
  |                                 ^^^^^^^  ^^^^^^^

error[E0308]: mismatched types
  --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.1.0/src/arch/mod.rs:74:5
   |
74 | /     {
75 | |         compile_error!("Unsupported architecture for SIMD CRC calculation");
76 | |     }
   | |_____^ expected `u64`, found `()`

error[E0308]: mismatched types
  --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.1.0/src/arch/mod.rs:79:24
   |
79 | pub fn get_target() -> String {
   |        ----------      ^^^^^^ expected `String`, found `()`
   |        |
   |        implicitly returns `()` as its body has no tail or `return` expression

For more information about this error, try `rustc --explain E0308`.
warning: `crc-fast` (lib) generated 2 warnings
error: could not compile `crc-fast` (lib) due to 4 previous errors; 2 warnings emitte

@onethumb
Copy link
Contributor Author

onethumb commented May 6, 2025

Great, thanks for the info. Is it easiest if I include a software fallback in the in crc_fast crate (which would be trivial to implement)?

It's possible the POWER8 VPMSUMW and VPMSUMD intrinsics are available for PowerPC/PPC64 in Rust, I'd have to check. They were introduced in POWER8 (2013), but I don't know what your exotic baseline is and whether it's earlier than that?

@landonxjames
Copy link
Contributor

landonxjames commented May 6, 2025

Great, thanks for the info. Is it easiest if I include a software fallback in the in crc_fast crate (which would be trivial to implement)?

I think that would be best for us. Although both of our exotic platform tests are for powerpc variants that is really just to cover that we support 32bit architectures and a less common 64bit one, they aren't really meant to be fully representational of all the platforms we strive to support.

I'll take a look at those two other failing CI checks in the coming days, they seem to be hitting some weird versioning stuff unrelated to your change.

Uses the `crc` crate to provide a table-based software fallback for
architectures which don’t support hardware acceleration.
@onethumb
Copy link
Contributor Author

onethumb commented May 8, 2025

@landonxjames Ok, I added a software fallback in crc-fast v1.2.0 and bumped the version here. Should pass the exotic tests now. Let me know if there's anything else.

Add some flags/transitive dep features to webassembly tests to get
getrandom crate to compile.

Note there are still some issues getting crc-fast crate to compile for
wasm due to use of the libc crate
@landonxjames landonxjames requested a review from a team as a code owner May 9, 2025 20:13
@landonxjames
Copy link
Contributor

Cleaned up most of the CI issues and it looks like there is still one problem compiling the crc-fast crate for the wasm32-unknown-unknown target (which I believe doesn't have a libc). The error suggest using the c_char type from std::ffi, so that might fix it, but I'll have to do some testing.

https://github.com/smithy-lang/smithy-rs/actions/runs/14936949828/job/41969146796?pr=4111

error[E0412]: cannot find type `c_char` in crate `libc`
   --> /opt/cargo/registry/src/index.crates.io-6f17d22bba15001f/crc-fast-1.2.0/src/ffi.rs:240:58
    |
240 | pub extern "C" fn crc_fast_get_version() -> *const libc::c_char {
    |                                                          ^^^^^^ not found in `libc`
    |
help: consider importing one of these type aliases
    |
8   + use std::ffi::c_char;
    |
8   + use std::os::raw::c_char;
    |
8   + use core::ffi::c_char;
    |
help: if you import `c_char`, refer to it directly
    |
240 - pub extern "C" fn crc_fast_get_version() -> *const libc::c_char {
240 + pub extern "C" fn crc_fast_get_version() -> *const c_char {
    |

@onethumb
Copy link
Contributor Author

@landonxjames Ok, I fixed the wasm32-unknown-unknown issue (building a C-compatible shared library for wasm targets isn't really helpful anyway) in crc-fast v1.2.1 and updated accordingly. Let me know if anything else comes up. 👍

@landonxjames
Copy link
Contributor

All the CI is passing so this looks good (that Semver failure always happens for external contributions that make changes to things included in our build image). Running the Canary against this PR as a final check: https://github.com/smithy-lang/smithy-rs/actions/runs/14978881530

@landonxjames landonxjames enabled auto-merge May 12, 2025 19:04
@aws-sdk-rust-ci aws-sdk-rust-ci disabled auto-merge May 12, 2025 19:05
@aws-sdk-rust-ci aws-sdk-rust-ci merged commit d8ba5d3 into smithy-lang:main May 12, 2025
40 of 42 checks passed
@landonxjames
Copy link
Contributor

Merged, thanks for the contribution! (To anyone from the future looking at this, I had to override and merge from the CI account because of the previously mentioned and expected Semver failure)

@atwam
Copy link

atwam commented May 19, 2025

While this looks like a great improvement, there is an issue with the use of the optimize_crc32_auto feature of crc-fast crate (see awslabs/aws-sdk-rust#1291 (comment))

In short, that breaks cross-compilation, where binaries built on a machine may not work on a different machine that has the same architecture but a different set of instructions.

@GPSnoopy
Copy link

@landonxjames I'm uncomfortable that:

  • The author of this PR is also the author of the crc-fast crate that is being introduced as a dependency.
  • The dependency has been added without properly reviewing the claims that it is faster.
  • The crc-fast crate has virtually zero downloads prior to being added to smithy-rs (and IMHO should be considered as not ready for inclusion into the AWS SDK).

Best case scenario, it's immature and therefore has its share of bugs. See the awslabs/aws-lambda-rust-runtime#993 and awslabs/aws-sdk-rust#1291, which includes the program being killed with "Illegal Instruction".

Worse case scenario (albeit unlikely - and I do not claim that this is the case), the dependency is malicious and contains security backdoors (see https://www.reddit.com/r/sysadmin/comments/1bqu3zx/backdoor_in_upstream_xzliblzma_leading_to_ssh/).

I would vote for rolling back this PR, and reconsider crc-fast once it has been battle hardened, and considered production ready by the wider Rust community.

@GPSnoopy
Copy link

Alternatively, experimental dependencies should be behind opt-in feature flags

@landonxjames
Copy link
Contributor

Hi @GPSnoopy, I understand your concern, and we regret the regression this has caused, but some of those claims are inaccurate.

The author of this PR is also the author of the crc-fast crate that is being introduced as a dependency.

The author was also the author of the crc64fast-nvme we were using previously. We have worked with them before and they are a trusted partner.

The dependency has been added without properly reviewing the claims that it is faster.

The implementation and performance was reviewed and benchmarked extensively internally (including by principal engineers with expertise in high performance checksum implementations)

The crc-fast crate has virtually zero downloads prior to being added to smithy-rs (and IMHO should be considered as not ready for inclusion into the AWS SDK).

We are aware of this and chose to take the dependency anyway after discussion with the author, careful review of the code, and consideration for lowering our total number of dependencies.

Best case scenario, it's immature and therefore has its share of bugs. See the awslabs/aws-lambda-rust-runtime#993 and awslabs/aws-sdk-rust#1291, which includes the program being killed with "Illegal Instruction".

We are aware of this issue and working to figure out a fix.

@GPSnoopy
Copy link

GPSnoopy commented May 19, 2025

Hi @landonxjames. Thanks for clarifying and giving more context. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants