Description
The code below is from Apache Arrow C++ [1]. arrow-rs shows a similar phenomenon [2].
In short, when size is guaranteed to be less than or equal to 12, GCC inlines the memcpy and memset, but Clang does not perform this optimization. See the godbolt link [3]. The problem still exists when -ffreestanding is enabled.
c_type makeInline1(const char* data, int32_t size) {
  ARROW_COMPILER_ASSUME(size <= kInlineSize);  // __builtin_assume
  c_type out;
  out.inlined = {size, {}};
  // memcpy of 0 to 12 bytes
  memcpy(&out.inlined.data, data, size);
  return out;
}
Would this be a problem? If it can be fixed with compiler flags, which flags should I use?
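If no flag helps, one possible source-level workaround is to express the variable-length copy as a few fixed-size copies, which compilers generally lower to plain loads and stores. The following is only a minimal sketch under my own assumptions: CopyUpTo12 is a hypothetical helper, not something that exists in Arrow, it assumes 0 <= size <= 12 and that the source and destination buffers do not overlap, and I have not measured whether it is actually faster than the library memcpy call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Arrow): copies `size` bytes, 0 <= size <= 12,
// using only constant-size memcpy calls and single-byte stores.
// Assumes `dst` has room for 12 bytes and does not overlap `src`.
inline void CopyUpTo12(char* dst, const char* src, int32_t size) {
  assert(size >= 0 && size <= 12);
  if (size >= 8) {
    // 8..12 bytes: two 8-byte copies that may overlap in the middle.
    std::memcpy(dst, src, 8);
    std::memcpy(dst + size - 8, src + size - 8, 8);
  } else if (size >= 4) {
    // 4..7 bytes: two 4-byte copies that may overlap in the middle.
    std::memcpy(dst, src, 4);
    std::memcpy(dst + size - 4, src + size - 4, 4);
  } else if (size > 0) {
    // 1..3 bytes: first, middle and last byte cover every position.
    dst[0] = src[0];
    dst[size / 2] = src[size / 2];
    dst[size - 1] = src[size - 1];
  }
  // size == 0: nothing to copy.
}
```

In makeInline1 this would replace the memcpy call, e.g. CopyUpTo12(reinterpret_cast<char*>(&out.inlined.data), data, size), assuming out.inlined.data is a 12-byte buffer. Every read stays within [data, data + size), so it does not read past the end of the input.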
[1] https://github.com/apache/arrow/blob/63b34c97c5d3ca6d20dacb9e92b404986f1d7d62/cpp/src/arrow/util/binary_view_util.h#L28
[2] apache/arrow-rs#6034
[3] https://godbolt.org/z/47T8s69xK