Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: allow @splat for arrays #20433

Closed
mlugg opened this issue Jun 27, 2024 · 11 comments · Fixed by #21635
Closed

Proposal: allow @splat for arrays #20433

mlugg opened this issue Jun 27, 2024 · 11 comments · Fixed by #21635
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@mlugg
Copy link
Member

mlugg commented Jun 27, 2024

Background

It is quite common to want to initialize arrays to a fixed constant value. Today, this is typically done with the ** operator as follows:

var pixels: [pix_count]u32 = .{0} ** pix_count;

However, this usage is a little awkward: the array length is specified twice! In this case, we could omit the type annotation:

var pixels = [1]u32{0} ** pix_count;

However, this makes it less clear what the type of pixels is, and is not consistent with a general preference for type annotations over explicitly-typed expressions. The situation gets even worse in the case of nested arrays:

var pixels: [height][width]u32 = .{.{1} ** width} ** height;
// or:
var pixels = [1][width]u32{[1]u32{1} ** width} ** height;

This is really unpleasant. In both cases, the initialization expression is quite tricky to understand at a glance; plus, in the former case there's a lot of repetition, and the latter case is borderline unreadable.

When initializing vectors to a fixed value, we already have a solution: the @splat builtin.

var flags: @Vector(10, bool) = @splat(false);

However, no equivalent exists today for arrays. This awkwardness leads to people reaching for std.mem.zeroes in many cases, which we definitely want to discourage.

Proposal

Allow @splat to initialize arrays. Nothing else about the builtin changes; the way it works is fairly obvious. The above examples look like this:

// flat array
var pixels: [pix_count]u32 = @splat(0);
// nested array
var pixels: [height][width]u32 = @splat(@splat(0));

There's really nothing else to say here. This is a trivial expansion of an existing syntax form in an intuitive manner. The meaning of the newly-allowed code is obvious, and improves readability while discouraging questionable usages of std.mem.zeroes.

@mlugg mlugg added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jun 27, 2024
@mlugg mlugg added this to the 0.14.0 milestone Jun 27, 2024
@jeffkdev
Copy link

Would @splat be able to handle an array of structs too? I am thinking of this related issue: #6068

(copied from issue):

// a default-initialized struct
const Bla = struct {
    x: i32 = 0,
    y: i32 = 0,
    z: i32 = 0,
};

// and another struct with an embedded array of those default-initialized items:
const Blub = struct {
    items: [4]Bla = [_]Bla{ .{} } ** 4,
};

With this change could Blub could be initialized like this?

const Blub = struct {
    items: [4]Bla = @splat(.{}),
};

And then independent of struct support, could it be used to default-initialize the rest of an array using concatenation if there are some number of unique values?

const pix_count = 4;
var pixels0: [pix_count]u32 = .{1,2} ++ @splat(0); // 1, 2, 0, 0 (main use-case)
var pixels1: [pix_count]u32 = .{1} ++ @splat(undefined); // 1, undefined, undefined, undefined (supports undefined values)
var pixels2: [pix_count]u32 = .{1,2,3,4} ++ @splat(0); // 1, 2, 3, 4 (supports length of zero)

@mlugg
Copy link
Member Author

mlugg commented Jun 27, 2024

Yep, absolutely. The behaviour of @splat would just be defined to initialize an array or vector by setting all elements to it's operand, forwarding the appropriate result type. There are no restrictions on the inner type.

@Cloudef
Copy link
Contributor

Cloudef commented Jun 28, 2024

Could this also improve a codegen being a builtin? ** currently has scenarios where the output is not that nice.

@nektro
Copy link
Contributor

nektro commented Jun 28, 2024

I imagine it would be identical and ** would be removed.

@mlugg
Copy link
Member Author

mlugg commented Jun 28, 2024

Why on earth would ** be removed?

@Cloudef
Copy link
Contributor

Cloudef commented Jun 28, 2024

Unless the constant 42 here is 0, zig currently likes to insert the number into the binary N amount of times, worse is that the array here is even never used as it gets overwritten immediately. It seems 0 is special cased and compiles into memset.
https://godbolt.org/z/KGWYTcPY3

@JonathanHallstrom
Copy link
Contributor

Unless the constant 42 here is 0, zig currently likes to insert the number into the binary N amount of times, worse is that the array here is even never used as it gets overwritten immediately. It seems 0 is special cased and compiles into memset. https://godbolt.org/z/KGWYTcPY3

0 is not really special, memset is used whenever the 4 bytes of the int are all the same, see https://godbolt.org/z/sEcjncfG8.

@mlugg
Copy link
Member Author

mlugg commented Jul 17, 2024

@Vexu, could you elaborate on your thumbs-down? I ask only because I've not seen an argument against this so far and would like to be aware of potential downsides to this proposal.

@Vexu
Copy link
Member

Vexu commented Jul 17, 2024

I just don't like the idea of expanding a SIMD specific builtin for what looks to me like syntax sugar purposes that doesn't even fully replace std.mem.zeroes since it doesn't apply to structs.

@mlugg
Copy link
Member Author

mlugg commented Jul 17, 2024

Sorry, I should be clearer: it's not intended to fully replace std.mem.zeroes. That function is the correct thing to use if you really want C-API-style zero initialization of a struct. However, the issue with std.mem.zeroes is that people have started using it to handle cases which really should not require reaching into the standard library; specifically this case of default-initing an array. std.mem.zeroes should really be seldom used, but a Sourcegraph search shows that it is used in many places by Bun, TigerBeetle, Mach, zig-gamedev, zls, and more, so I think it's clear we have a usability problem here.

I personally don't see a problem with expanding this builtin, because it's very obvious what the expansion does, and I don't see a way it could lead to any confusion, harm readability, etc.

@kcbanner
Copy link
Contributor

I can see there being some confusion that the other vector builtins (@reduce, @select, @shuffle) don't work on arrays, but I do like the idea of a terse way to init an array like this.

@andrewrk andrewrk added the accepted This proposal is planned. label Oct 8, 2024
@andrewrk andrewrk modified the milestones: 0.14.0, 0.15.0 Oct 8, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 8, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 9, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 10, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 10, 2024
mlugg added a commit that referenced this issue Oct 10, 2024
@andrewrk andrewrk modified the milestones: 0.15.0, 0.14.0 Oct 11, 2024
der-teufel-programming pushed a commit to der-teufel-programming/zig that referenced this issue Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants