Commit 74a6a8d

feat: Move shuffle block decompression and decoding to native code and add LZ4 & Snappy support (#1192)
* Implement native decoding and decompression
* revert some variable renaming for smaller diff
* fix oom issues?
* make NativeBatchDecoderIterator more consistent with ArrowReaderIterator
* fix oom and prep for review
* format
* Add LZ4 support
* clippy, new benchmark
* rename metrics, clean up lz4 code
* update test
* Add support for snappy
* format
* change default back to lz4
* make metrics more accurate
* format
* clippy
* use faster unsafe version of lz4_flex
* Make compression codec configurable for columnar shuffle
* clippy
* fix bench
* fmt
* address feedback
* address feedback
* address feedback
* minor code simplification
* cargo fmt
* overflow check
* rename compression level config
* address feedback
* address feedback
* rename constant
1 parent e72beb1 commit 74a6a8d

File tree

23 files changed

+524
-231
lines changed

common/src/main/scala/org/apache/comet/CometConf.scala

Lines changed: 12 additions & 11 deletions
@@ -272,18 +272,19 @@ object CometConf extends ShimCometConf {
       .booleanConf
       .createWithDefault(false)
 
-  val COMET_EXEC_SHUFFLE_COMPRESSION_CODEC: ConfigEntry[String] = conf(
-    s"$COMET_EXEC_CONFIG_PREFIX.shuffle.compression.codec")
-    .doc(
-      "The codec of Comet native shuffle used to compress shuffle data. Only zstd is supported. " +
-        "Compression can be disabled by setting spark.shuffle.compress=false.")
-    .stringConf
-    .checkValues(Set("zstd"))
-    .createWithDefault("zstd")
+  val COMET_EXEC_SHUFFLE_COMPRESSION_CODEC: ConfigEntry[String] =
+    conf(s"$COMET_EXEC_CONFIG_PREFIX.shuffle.compression.codec")
+      .doc(
+        "The codec of Comet native shuffle used to compress shuffle data. lz4, zstd, and " +
+          "snappy are supported. Compression can be disabled by setting " +
+          "spark.shuffle.compress=false.")
+      .stringConf
+      .checkValues(Set("zstd", "lz4", "snappy"))
+      .createWithDefault("lz4")
 
-  val COMET_EXEC_SHUFFLE_COMPRESSION_LEVEL: ConfigEntry[Int] =
-    conf(s"$COMET_EXEC_CONFIG_PREFIX.shuffle.compression.level")
-      .doc("The compression level to use when compression shuffle files.")
+  val COMET_EXEC_SHUFFLE_COMPRESSION_ZSTD_LEVEL: ConfigEntry[Int] =
+    conf(s"$COMET_EXEC_CONFIG_PREFIX.shuffle.compression.zstd.level")
+      .doc("The compression level to use when compressing shuffle files with zstd.")
       .intConf
       .createWithDefault(1)
 

docs/source/user-guide/configs.md

Lines changed: 2 additions & 2 deletions
@@ -50,8 +50,8 @@ Comet provides the following configuration settings.
 | spark.comet.exec.memoryPool | The type of memory pool to be used for Comet native execution. Available memory pool types are 'greedy', 'fair_spill', 'greedy_task_shared', 'fair_spill_task_shared', 'greedy_global' and 'fair_spill_global', By default, this config is 'greedy_task_shared'. | greedy_task_shared |
 | spark.comet.exec.project.enabled | Whether to enable project by default. | true |
 | spark.comet.exec.replaceSortMergeJoin | Experimental feature to force Spark to replace SortMergeJoin with ShuffledHashJoin for improved performance. This feature is not stable yet. For more information, refer to the Comet Tuning Guide (https://datafusion.apache.org/comet/user-guide/tuning.html). | false |
-| spark.comet.exec.shuffle.compression.codec | The codec of Comet native shuffle used to compress shuffle data. Only zstd is supported. Compression can be disabled by setting spark.shuffle.compress=false. | zstd |
-| spark.comet.exec.shuffle.compression.level | The compression level to use when compression shuffle files. | 1 |
+| spark.comet.exec.shuffle.compression.codec | The codec of Comet native shuffle used to compress shuffle data. lz4, zstd, and snappy are supported. Compression can be disabled by setting spark.shuffle.compress=false. | lz4 |
+| spark.comet.exec.shuffle.compression.zstd.level | The compression level to use when compressing shuffle files with zstd. | 1 |
 | spark.comet.exec.shuffle.enabled | Whether to enable Comet native shuffle. Note that this requires setting 'spark.shuffle.manager' to 'org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager'. 'spark.shuffle.manager' must be set before starting the Spark application and cannot be changed during the application. | true |
 | spark.comet.exec.sort.enabled | Whether to enable sort by default. | true |
 | spark.comet.exec.sortMergeJoin.enabled | Whether to enable sortMergeJoin by default. | true |
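
Reviewer note: the configuration semantics documented above (three codec names, a zstd-only level, and `spark.shuffle.compress=false` disabling compression) can be sketched as a small mapping onto the `CompressionCodec` variants that appear in this commit's benchmarks. This is a minimal sketch under stated assumptions, not the code from this commit: the enum is a local stand-in, `resolve_codec` is a hypothetical helper, and the mapping of the string `"lz4"` to `Lz4Frame` is inferred from the benchmark labels only.

```rust
/// Local stand-in for the codec enum. The variant names are taken from the
/// benchmarks in this commit; the payload type of `Zstd` is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum CompressionCodec {
    None,
    Lz4Frame,
    Snappy,
    Zstd(i32),
}

/// Hypothetical helper (not part of Comet): turn the documented settings into
/// a codec value.
///
/// * `shuffle_compress` models `spark.shuffle.compress`.
/// * `codec` models `spark.comet.exec.shuffle.compression.codec`.
/// * `zstd_level` models `spark.comet.exec.shuffle.compression.zstd.level`
///   and is ignored unless the codec is zstd.
pub fn resolve_codec(
    shuffle_compress: bool,
    codec: &str,
    zstd_level: i32,
) -> Result<CompressionCodec, String> {
    if !shuffle_compress {
        // Compression disabled: the codec name no longer matters.
        return Ok(CompressionCodec::None);
    }
    match codec {
        "lz4" => Ok(CompressionCodec::Lz4Frame),
        "snappy" => Ok(CompressionCodec::Snappy),
        "zstd" => Ok(CompressionCodec::Zstd(zstd_level)),
        other => Err(format!("unsupported shuffle compression codec: {}", other)),
    }
}
```

With the new defaults, `resolve_codec(true, "lz4", 1)` would yield `CompressionCodec::Lz4Frame`, and the level is carried only by the `Zstd` variant, which mirrors the rename of the level setting to `...compression.zstd.level`.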

native/Cargo.lock

Lines changed: 25 additions & 23 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

native/core/Cargo.toml

Lines changed: 3 additions & 0 deletions
@@ -52,6 +52,9 @@ serde = { version = "1", features = ["derive"] }
 lazy_static = "1.4.0"
 prost = "0.12.1"
 jni = "0.21"
+snap = "1.1"
+# we disable default features in lz4_flex to force the use of the faster unsafe encoding and decoding implementation
+lz4_flex = { version = "0.11.3", default-features = false }
 zstd = "0.11"
 rand = { workspace = true}
 num = { workspace = true }

native/core/benches/row_columnar.rs

Lines changed: 2 additions & 0 deletions
@@ -19,6 +19,7 @@ use arrow::datatypes::DataType as ArrowDataType;
 use comet::execution::shuffle::row::{
     process_sorted_row_partition, SparkUnsafeObject, SparkUnsafeRow,
 };
+use comet::execution::shuffle::CompressionCodec;
 use criterion::{criterion_group, criterion_main, Criterion};
 use tempfile::Builder;
 
@@ -77,6 +78,7 @@ fn benchmark(c: &mut Criterion) {
                 false,
                 0,
                 None,
+                &CompressionCodec::Zstd(1),
             )
             .unwrap();
         });

native/core/benches/shuffle_writer.rs

Lines changed: 35 additions & 6 deletions
@@ -35,23 +35,52 @@ fn criterion_benchmark(c: &mut Criterion) {
     group.bench_function("shuffle_writer: encode (no compression))", |b| {
         let batch = create_batch(8192, true);
         let mut buffer = vec![];
-        let mut cursor = Cursor::new(&mut buffer);
         let ipc_time = Time::default();
-        b.iter(|| write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::None, &ipc_time));
+        b.iter(|| {
+            buffer.clear();
+            let mut cursor = Cursor::new(&mut buffer);
+            write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::None, &ipc_time)
+        });
+    });
+    group.bench_function("shuffle_writer: encode and compress (snappy)", |b| {
+        let batch = create_batch(8192, true);
+        let mut buffer = vec![];
+        let ipc_time = Time::default();
+        b.iter(|| {
+            buffer.clear();
+            let mut cursor = Cursor::new(&mut buffer);
+            write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::Snappy, &ipc_time)
+        });
+    });
+    group.bench_function("shuffle_writer: encode and compress (lz4)", |b| {
+        let batch = create_batch(8192, true);
+        let mut buffer = vec![];
+        let ipc_time = Time::default();
+        b.iter(|| {
+            buffer.clear();
+            let mut cursor = Cursor::new(&mut buffer);
+            write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::Lz4Frame, &ipc_time)
+        });
     });
     group.bench_function("shuffle_writer: encode and compress (zstd level 1)", |b| {
         let batch = create_batch(8192, true);
         let mut buffer = vec![];
-        let mut cursor = Cursor::new(&mut buffer);
         let ipc_time = Time::default();
-        b.iter(|| write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::Zstd(1), &ipc_time));
+        b.iter(|| {
+            buffer.clear();
+            let mut cursor = Cursor::new(&mut buffer);
+            write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::Zstd(1), &ipc_time)
+        });
     });
     group.bench_function("shuffle_writer: encode and compress (zstd level 6)", |b| {
         let batch = create_batch(8192, true);
         let mut buffer = vec![];
-        let mut cursor = Cursor::new(&mut buffer);
         let ipc_time = Time::default();
-        b.iter(|| write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::Zstd(6), &ipc_time));
+        b.iter(|| {
+            buffer.clear();
+            let mut cursor = Cursor::new(&mut buffer);
+            write_ipc_compressed(&batch, &mut cursor, &CompressionCodec::Zstd(6), &ipc_time)
+        });
     });
     group.bench_function("shuffle_writer: end to end", |b| {
         let ctx = SessionContext::new();
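
Reviewer note: the benchmark change in `shuffle_writer.rs` replaces one cursor shared across all iterations with `buffer.clear()` plus a fresh `Cursor` per iteration. One plausible reading (the commit message only says "fix bench") is that a shared cursor keeps appending, so the output buffer grows with every iteration and the measurement includes reallocation of an ever larger `Vec`. The standard-library sketch below illustrates that difference; `write_payload` is a hypothetical stand-in for `write_ipc_compressed`, and the 64-byte payload is arbitrary.

```rust
use std::io::{Cursor, Write};

/// Hypothetical stand-in for an encoder such as `write_ipc_compressed`:
/// it writes a fixed 64-byte payload so buffer growth is easy to observe.
fn write_payload<W: Write>(out: &mut W) -> std::io::Result<usize> {
    let payload = [0u8; 64];
    out.write_all(&payload)?;
    Ok(payload.len())
}

/// Old pattern: a single cursor is reused, so each write lands after the
/// previous one and the underlying buffer keeps growing.
pub fn shared_cursor_len(iterations: usize) -> usize {
    let mut buffer: Vec<u8> = vec![];
    let mut cursor = Cursor::new(&mut buffer);
    for _ in 0..iterations {
        write_payload(&mut cursor).unwrap();
    }
    buffer.len()
}

/// New pattern: the buffer is cleared and a fresh cursor starts at offset 0,
/// so every iteration writes the same amount into the same capacity.
pub fn fresh_cursor_len(iterations: usize) -> usize {
    let mut buffer: Vec<u8> = vec![];
    for _ in 0..iterations {
        buffer.clear();
        let mut cursor = Cursor::new(&mut buffer);
        write_payload(&mut cursor).unwrap();
    }
    buffer.len()
}
```

After ten iterations the shared-cursor variant holds ten payloads while the fresh-cursor variant holds one. `Vec::clear` keeps the allocated capacity, so after the first iteration the second pattern writes into already allocated memory.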
