remove vector type; support SIMD operations on arrays directly #23327

Closed · andrewrk opened this issue Mar 23, 2025 · 13 comments
Labels: breaking, proposal
Milestone: 0.15.0

Comments

@andrewrk (Member)

I could have sworn there was an issue for this already but I couldn't find it, so here it is.

Arrays coerce to vectors. Vectors coerce to arrays. What's the point of a separate type? I don't know of any type safety argument.

Typically, SIMD operations are performed on arrays. Converting between vector and array is a chore that does not accomplish anything.

There is also the awkwardness of @Vector as a way to create the type. There are multiple proposals (see below) trying to make the syntax more palatable.
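For illustration, here is the status-quo round trip that this chore entails (a minimal sketch; this is valid code today):

const a = [4]f32{ 1, 2, 3, 4 };
const b = [4]f32{ 5, 6, 7, 8 };
const va: @Vector(4, f32) = a; // array coerces to vector
const vb: @Vector(4, f32) = b;
const sum: [4]f32 = va + vb; // the vector result coerces back to an array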

Problems:

The vector type affects the ABI of C calling convention functions. This is load-bearing in compiler_rt for example:

const v2u64 = @Vector(2, u64);
fn __modti3_windows_x86_64(a: v2u64, b: v2u64) callconv(.c) v2u64 {

If anyone comes up with examples of how this could lead to worsened type safety (i.e. it could be easy to make a mistake and get a bug rather than a compile error), that would be a critical flaw in this proposal and would likely get it rejected.

Finally, alignment. On many CPUs, vectors have larger alignment requirements than arrays. This proposal keeps array alignment the same as the status quo. To upgrade code so that it lowers to exactly the same machine code as before, arrays that are used as vectors will need to be explicitly overaligned to vector alignment. Such extra alignment is a tradeoff: the more compact memory layout can help CPU cache efficiency, but vector alignment ensures that aligned vector load instructions can be selected rather than unaligned variants. Generally, programmers would be able to use default alignment, and then occasionally, after measuring, decide that aligned vector load instructions are worth an alignment annotation on the arrays used for SIMD operations.
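For illustration, a minimal sketch of such an annotation; the commented line is the hypothetical part under this proposal:

// Explicitly overaligned so aligned vector load instructions can be selected.
var samples: [4]f32 align(@alignOf(@Vector(4, f32))) = .{ 0, 0, 0, 0 };
// Hypothetical syntax under this proposal:
// samples = samples + samples;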

Related:

@andrewrk added the breaking and proposal labels and the 0.15.0 milestone Mar 23, 2025
@mlugg (Member) commented Mar 23, 2025

I've independently come up with more-or-less this proposal in conversations in the past, and I think it's a good idea. My version used builtins rather than directly allowing arithmetic operations, but I don't think I have any problems with allowing arithmetic ops directly on arrays.

My reasoning for this is that vectors are predominantly intended to be used locally, where the optimizer can use the hell out of SIMD registers, so details such as memory layout really shouldn't matter that much. The only context where it seems somewhat important to me is, as you mention, calling conventions. However, I don't think that's really a problem:

  • Within pure-Zig code, it shouldn't really matter what the returning / parameter passing convention for arrays/vectors is, because the optimizer is free to do whatever it wants. Any failure of the compiler to do so right now (I've not tested this!) I would view as an LLVM deficiency, because this doesn't seem particularly difficult (assuming the call isn't just inlined, consider the function body and look for loads which want overalignment, and overalign parameters correspondingly).
  • When interacting with C code, it seems reasonable to just have declarations in std.c (or similar) which "emulate" vector types in terms of calling convention; or, if that isn't possible (calling conventions are weird and I suspect some have vector CCs which can't be matched with other types), we could always add a new parameter annotation (similar syntax to noalias) to indicate a parameter should be passed as a vector (such an annotation would only be valid on functions with a non-Zig callconv). That is, something like extern fn foo(vector x: [3]u32) void. (This does reserve a keyword, which would be nice to avoid; perhaps with some kind of general attribute system which uses an enum of all available attributes.)

@silversquirl (Contributor)

examples of how this could lead to worsened type safety

One that comes to mind is accidentally typing a + b instead of a ++ b. It's quite specific though, since it would only work when a and b are the same length, and it's probably not a big deal in practice, since it's immediately very obvious something's gone wrong when you run the code.

@andrewrk (Member, Author)

That's a great example. To elaborate on the problem:

const std = @import("std");

const lhs = [_]i32{ 1, 2, 3, 4 };
const rhs = [_]i32{ 5, 6, 7, 8 };
const concatenated = lhs + rhs;

pub fn main() !void {
    for (concatenated) |elem| {
        std.log.info("elem: {d}", .{elem});
    }
}

With this proposal, this code would compile and run, and have incorrect behavior at runtime.
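For contrast, the intended operation:

const concatenated = lhs ++ rhs; // [8]i32{ 1, 2, 3, 4, 5, 6, 7, 8 }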

@alexrp (Member) commented Mar 23, 2025

  • we could always add a new parameter annotation (similar syntax to noalias) to indicate a parameter should be passed as a vector (such an annotation would only be valid on functions with a non-Zig callconv). That is, something like extern fn foo(vector x: [3]u32) void. (This does reserve a keyword, which would be nice to avoid; perhaps with some kind of general attribute system which uses an enum of all available attributes.)

What about vectors within aggregates, e.g. structs? Now you need to allow vector on fields or something to that effect. Doesn't it just start to look like vector types, only more awkward, at that point?

I think we also need to consider scalable vectors in all this, i.e. vectors whose size is scaled by some runtime-known constant. This is a relatively new concept that's seen on AArch64 and RISC-V. It's not obvious to me that there's a nice way to add support for these without dedicated vector type syntax.
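For context, status-quo vector code bakes the lane count into the type at compile time, which is exactly what scalable vectors lack (a minimal sketch):

fn sumLanes(comptime N: u32, v: @Vector(N, f32)) f32 {
    return @reduce(.Add, v); // N is statically known to codegen
}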

@jacobly0 (Member) commented Mar 23, 2025

I agree with @alexrp's points. Also, I would consider this proposal blocked on the fact that none of the backends handle alignment properly, so an over-aligned array would not currently be able to produce the same machine code.

aligned vector load instructions

Note that it isn't just aligned vs. unaligned instructions: a smaller array than the full vector width would prevent even unaligned instructions, for fear of reading unmapped memory.

Another problem is that array values would not have the correct alignment, and this would currently prevent vector instructions on even aligned arrays.

We would also lose a type to represent vector masks, which we currently use bool vectors for.
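For reference, a minimal status-quo example of a bool-vector mask:

const a: @Vector(4, i32) = .{ 1, 2, 3, 4 };
const b: @Vector(4, i32) = .{ 8, 1, 8, 1 };
const mask: @Vector(4, bool) = a > b; // lane-wise comparison yields a bool vector
const max = @select(i32, mask, a, b); // { 8, 2, 8, 4 }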

@spkgyk commented Mar 24, 2025

If arrays and vectors become a single type, would that mean something like:

pub fn addDataSIMD(
    comptime T: type,
    comptime vector_size: usize,
    data_a: []const T,
    data_b: []const T,
    result: []T,
) !void {
    const Vector = @Vector(vector_size, T);

    if (data_a.len != data_b.len) return error.UnequalLength;
    if (data_a.len != result.len) return error.ResultLengthMismatch;

    const full_blocks = data_a.len / vector_size;

    // Process full blocks using SIMD
    for (0..full_blocks) |i| {
        const start = i * vector_size;

        const vec_a: Vector = data_a[start..][0..vector_size].*;
        const vec_b: Vector = data_b[start..][0..vector_size].*;

        result[start..][0..vector_size].* = vec_a + vec_b;
    }

    // Handle remaining elements
    const remainder = full_blocks * vector_size;
    for (remainder..data_a.len) |i| {
        result[i] = data_a[i] + data_b[i];
    }
}

will convert to:

pub fn addDataSIMD(
    comptime T: type,
    comptime vector_size: usize,
    data_a: []const T,
    data_b: []const T,
    result: []T,
) !void {
    if (data_a.len != data_b.len) return error.UnequalLength;
    if (data_a.len != result.len) return error.ResultLengthMismatch;

    const full_blocks = data_a.len / vector_size;

    // Process full blocks using SIMD
    for (0..full_blocks) |i| {
        const start = i * vector_size;

        const vec_a: [vector_size]T = data_a[start..][0..vector_size].*;
        const vec_b: [vector_size]T = data_b[start..][0..vector_size].*;

        result[start..][0..vector_size].* = vec_a + vec_b;
    }

    // Handle remaining elements - would this need to change now?
    const remainder = full_blocks * vector_size;
    for (remainder..data_a.len) |i| {
        result[i] = data_a[i] + data_b[i];
    }
}

or, since @Vector automatically scales SIMD, would it simplify even further to:

pub fn addDataSIMD(
    comptime T: type,
    data_a: []const T,
    data_b: []const T,
    result: []T,
) !void {
    if (data_a.len != data_b.len) return error.UnequalLength;
    if (data_a.len != result.len) return error.ResultLengthMismatch;

    // Process everything using SIMD (hypothetical slice arithmetic)
    result = data_a + data_b;
}

And if that is the case, how would we tell the addDataSIMD function what vector size to use? (Sorry if I missed an explanation for this in the comments above!)

I think I might be over-complicating things by looking at slices instead of arrays, but I thought it was worth leaving here anyway as it would be useful for signal/video processing and machine learning.

@mlugg (Member) commented Mar 24, 2025

Your last step is invalid; this issue does not propose allowing arithmetic operators on slices, only arrays.

@Snektron (Collaborator) commented Mar 24, 2025

I think we also need to consider scalable vectors in all this, i.e. vectors whose size is scaled by some runtime-known constant. This is a relatively new concept that's seen on AArch64 and RISC-V. It's not obvious to me that there's a nice way to add support for these without dedicated vector type syntax.

This proposal could naturally extend to that by allowing the same set of operations on slices, as written above.

@alexrp (Member) commented Mar 24, 2025

This proposal could naturally extend to that by allowing the same set of operations on slices, like written above.

I don't think so; even if we assume that you have some @vscale() builtin that you can use to size your slices appropriately for the hardware you're running on, I don't see how the compiler would know to lower an operation like vscaled_result = vscaled_slice1 * vscaled_slice2 to actual hardware instructions using scalable vectors. By not encoding the length and vscale factor in the type, you've lost the necessary knowledge.

@AndrewKraevskii (Contributor)

I think we also need to consider scalable vectors in all this, i.e. vectors whose size is scaled by some runtime-known constant. This is a relatively new concept that's seen on AArch64 and RISC-V. It's not obvious to me that there's a nice way to add support for these without dedicated vector type syntax.

This proposal could naturally extend to that by allowing the same set of operations on slices, like written above.

Would it make sense in contexts other than GPUs? For cache reasons, you probably don't want to compute an operation over a full slice if you want to chain operations.

fn foo(a: []const u8, b: []const u8, out: []u8) void {
    out = a + b; // cache misses for all "out"
    out += b; // again cache misses
}
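A hedged sketch of the cache-friendlier alternative this alludes to: process in fixed-size blocks so that chained operations happen while the lanes are still in registers. The block width 16 and wrapping +% are arbitrary choices for illustration, and equal slice lengths are assumed:

fn fooChunked(a: []const u8, b: []const u8, out: []u8) void {
    const N = 16; // arbitrary block width for illustration
    var i: usize = 0;
    while (i + N <= out.len) : (i += N) {
        const va: @Vector(N, u8) = a[i..][0..N].*;
        const vb: @Vector(N, u8) = b[i..][0..N].*;
        out[i..][0..N].* = (va +% vb) +% vb; // both ops before touching memory again
    }
    while (i < out.len) : (i += 1) out[i] = (a[i] +% b[i]) +% b[i];
}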

@gingerBill commented Mar 24, 2025

From my experience designing Odin, it took a few years to finally figure out that I should keep fixed-length arrays and #simd vectors as separate types. I originally had arrays where array programming had to be SIMD, then removed the SIMD support and added basic array programming to all arrays, and then finally added a separate #simd type.

Here are the few reasons as to why I settled on what I did:

  • #simd vectors are similar to arrays but they do usually have very different alignment rules
    • Usually at least 16 bytes of alignment is required
    • align(16) [4]f32 might still be an array type, but that's not any clearer than @Vector(4, f32), which shows the intent of the type much better
  • #simd vectors have very different semantics when it comes to addressing/indexing. In Odin, you cannot index a lane from a #simd vector with normal-array syntax (i.e. a[i] or &a[i]) because how that maps down to instructions is very different from how a normal array works. So in Odin, it has this syntax:
    e := simd.extract(v, i)
    v = simd.replace(v, i, e)
    • This also helps guarantee the operations are SIMD-like rather than poorly optimized.
    • So v[i] = e is not allowed because it has to be made clear that when doing SIMD work, you are replacing a lane and then assigning the new vector to the original variable, not just changing one section of memory.
  • Conversions between #simd types of the same element type might not require any temporary memory, whilst if it were an array, a naive conversion would, and the optimizer might struggle in some cases to figure this out (e.g. aliasing issues)
  • Type conversions in Odin are typically explicit, and #simd <-> arrays are no different in that they require an explicit conversion. We supply simd.to_array, simd.from_array, simd.from_slice, et al.
    • I personally do not like the implicit conversions that Zig offers.
  • Array programming is defined for some operators but not all, for numerous pragmatic reasons. And for #simd in general, some operations must use the simd.*-based procedures rather than operators directly.
    • << and >> are not trivially defined for array-programming in Odin nor for #simd as they can lead to very annoying bugs or unwanted behaviour.
      • This is also coupled with how Odin defines << differently from C. In Odin, x << 2 is defined to have the same result as (x << 1) << 1, and that does not necessarily map well to #simd, so explicit procedures/intrinsics are required.
  • Things like x += y are not always what you want for certain operations depending on the type
  • When it comes to comparison-based operations, you will want separate calls for things like simd.lanes_eq rather than overloading ==
    • a == b is always defined to return an untyped boolean and not any form of array of booleans
    • And for something like simd.lanes_eq, you will want it to return something like #simd[N]i32/@Vector(N, i32) rather than @Vector(N, bool), as when doing a lot of SIMD work you want to treat the results like an integer
    • Order comparisons (<, <=, >, >=) are not defined for either arrays or #simd, and you must explicitly specify the kind of comparison you need in #simd (e.g. simd.lanes_lt)
  • You also do not want to allow any form of array programming on slices or other dynamic array types
    • too many checks (same length checks, alignment checks, etc)
    • doesn't scale well for different platforms
    • would require implicit allocations if you want the same syntax: slice_a + slice_b requires an allocation, or else always requires another buffer (out = a + b), but again needs too many checks to be useful as syntax; + is then too magical.
  • Odin's #simd[N]T restricts the element type to only integers, floats, or booleans
    • You could lift this restriction if you implemented the SIMD-like operations as functions, but then you don't get any of the syntax benefits.
  • Arrays and SIMD vectors might have completely different ABI requirements even if the alignments are the same
    • In Odin, [8]f32 and #simd[8]f32 might be passed very differently depending on the platform and ABI

Things relevant to Zig:

  • a + b for arrays could be easily confused for a ++ b, and vice versa
  • a * b for arrays could be easily confused for a ** b, and vice versa
  • If you are to keep @Vector, I'd strongly suggest improving the @splat syntax so it's less necessary (see the sketch after this list).
    • x * @as(@Vector(3, f32), @splat(2)) vs x * 2
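A minimal sketch of the status-quo forms referenced in that last bullet; result-location inference already shortens the splat a little:

const V = @Vector(3, f32);
const x: V = .{ 1, 2, 3 };
const two: V = @splat(2); // result-location form; still a separate step
const doubled = x * two; // vs. the hypothetical x * 2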

Things relevant to LLVM:

  • If you don't use vector types directly in LLVM, it sometimes will not generate the right code. The auto-vectorizer in LLVM is great, but it does need help sometimes, so don't rely on it fully
    • You will also need to enable target features per procedure for much SIMD work, and LLVM really wants you to write the source code with vectors, not fake-array code

@gingerBill

I also forgot to mention that there are scalable SIMD vectors too which don't trivially map to normal fixed-length arrays. See ARM's Scalable Vector Extension (SVE) as a brilliant example of this.

From a type system perspective, they're harder to represent with just arrays since they are a little different. And as a programmer who might want to use this without resorting to assembly, it's going to be nigh impossible to express without a dedicated SIMD type.

Relevant to LLVM:

  • LLVM now has a specialized SVE type which is of the form <vscale x N x eltty>.

@andrewrk (Member, Author)

Thanks for the discussion, all.

@andrewrk closed this as not planned Mar 25, 2025