-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ArrayBuilder struct for safe/efficient dynamic array initialisation #3131
base: master
Are you sure you want to change the base?
Conversation
I think we should have a full-fledged |
There's already an RFC for that. Mentioned in the Prior Art. I would rather get something smaller scope in quickly so that it can start to be used |
There are four ArrayVec-like crates that provide this functionality now, so zero reason for "quickly" doing "stop-gaps". If anything, we need a fifth crate that unifies their various features, ala #2990 (comment) and can thus join std with minimal resistance. |
I was speaking with @BurntSushi earlier today, he wasn't so sure if it made sense to have such functionality in std just for the sake of it. But array initialisation is one of those basic things that does seem weird to not have. While they would be similar implementations. I do think the semantics are different. Thinking of their location too. |
Yes, it's possible they do not unify cleanly, which maybe his concern. I donno.. |
To clarify my current thoughts:
|
Thanks for putting this so succinctly. This is exactly what separates this RFC from the other. Having the minimum needed to enrich I personally don't mind if a full |
The I disagree with @BurntSushi's assessment of There are multiple crates with seemingly overlapping functionality, because:
The case 1 is not a problem any more. The case 2 can easily be declared out of scope. There are pros and cons of using the heap for extra growth, but for
I'm not insisting that Rust needs have this type in std/core. Rust already requires 3rd party crates for a lot of basic functionality, and arrayvec works as a crate just fine. But if Rust wanted to have a built-in solution to initializing arrays, adopting |
Just to be clear, I said I was unsure of how widely used it is, not that it isn't widely used. And download count is somewhat deceptive. For example, about 1/3 of
OK, so I finally had a chance to actually look at this RFC and the API is a lot bigger than what I thought it would be. When I said "Safe array initialization for non-Copy non-const types seems like a primitive we should have, at minimum." above, what I had in mind was something small and primitive like this: rust-lang/rust#75644 This RFC is a lot more than a simple primitive for building arrays. Ideally, the primitive would/could be used by the various ArrayVec crates to reduce some |
Ultimately, there's only 3 functions I care about here. impl<T, const N: usize> ArrayBuilder<T, N> {
pub fn new() -> Self;
pub unsafe fn push_unchecked(&mut self, t: T); // UB if array is full
pub unsafe fn build_unchecked(self) -> [T; N]; // UB if not full
} With these primitive functions, you can build fn array_from_fn<T, const N: usize>(f: impl FnMut(usize) -> T) -> [T; N] {
let mut array = ArrayBuilder::<T, N>::new();
for i in 0..N {
unsafe { array.push_unchecked(f(i)); } // safe because array won't be full
}
unsafe { array.build_unchecked() } // safe because array will be full
} Similarly, a impl<T, const N: usize> FromIterator<T> for [T; N] {
fn from_iter<I>(iter: I) -> Self where I: IntoIterator<Item=T> {
let mut iter = iter.into_iter();
for _ in 0..N {
unsafe { array.push_unchecked(iter.next().unwrap()); } // safe because array won't be full
}
unsafe { array.build_unchecked() } // safe because array will be full
}
} The rest of the functions presented in the RFC are just natural extensions. I'm impartial to the What this ultimately has over just having |
I have a working implementation |
I would like to reiterate support for My main justification for Plus, the ability to have an equivalent to Another big thing to point out is that we technically have a cursed equivalent to Plus, there are already a couple things in the compiler itself that should be using some equivalent of Additional note rereading this: a lot of the time the case where I want "0 to |
If I understand this RFC correctly, the main motivation is to be able to safely initialize an array, which can already be done with the soon-to-be stabilized The examples from the RFC written with use core::array;
let mut values = array::IntoIter::new(["a", "b", "c", "d"]);
let arr: [String; 4] = [(); 4].map(|_| {
values.next().unwrap().to_string()
}); let mut values = 1..=10;
let arr: [usize; 10] = [(); 10].map(|_| {
let i = values.next().unwrap();
i * i
}); The iterator example is relatively difficult to replicate and requires the #![feature(
try_trait_v2,
maybe_uninit_array_assume_init,
maybe_uninit_uninit_array
)]
use core::iter;
use core::mem::MaybeUninit;
use core::ops::Try;
pub trait ArrayExt<T, const N: usize> {
fn try_map<F, U, R, X>(self, f: F) -> R
where
X: Try<Output = U>,
R: Try<Output = [U; N], Residual = X::Residual>,
F: FnMut(T) -> X;
}
impl<T, const N: usize> ArrayExt<T, N> for [T; N] {
fn try_map<F, U, R, X>(self, mut f: F) -> R
where
X: Try<Output = U>,
R: Try<Output = [U; N], Residual = X::Residual>,
F: FnMut(T) -> X,
{
let mut array: [MaybeUninit<U>; N] = MaybeUninit::uninit_array();
let mut iterator = core::array::IntoIter::new(self);
for item in array.iter_mut() {
// NOTE: it is guranteed that this will not panic
let next = iterator.next().unwrap();
*item = MaybeUninit::new(f(next)?);
}
// SAFETY: because of the previous loops all values are guranteed to be
// initialized
let result: [U; N] = unsafe { MaybeUninit::array_assume_init(array) };
R::from_output(result)
}
}
// this should do almost the same thing as in the example, except for the "remaining" function
fn chunked_iterator<I, const N: usize>(mut iter: I) -> impl Iterator<Item = [I::Item; N]>
where
I: Iterator,
{
iter::from_fn(move || {
let values: Option<[_; N]> = [(); N].try_map(|_| iter.next());
values
})
} I also want to point out that people are currently planning an abstraction called |
The only thing that the array map misses that this RFC aims to provide (as you pointed out) is partial array state. Being able to take a slice to the initialised values if it's not a complete array. Similarly, I imagined the proposed type to be how you would implement such a impl<T, const N: usize> ArrayExt<T, N> for [T; N] {
fn try_map<F, U, R, X>(self, mut f: F) -> R
where
X: Try<Output = U>,
R: Try<Output = [U; N], Residual = X::Residual>,
F: FnMut(T) -> X,
{
let mut array: ArrayBuilder<U, N> = ArrayBuilder::new();
for item in core::array::IntoIter::new(self) {
// SAFETY: N elements have not yet been pushed
unsafe { array.push_unchecked(f(item)?); }
}
// SAFETY: previous loop guarantees N elements have been set
let result: [U; N] = unsafe { array.build_unchecked() };
R::from_output(result)
}
} |
This is really interesting. I'm a big fan of the proposed storage api. If that gets decided on this RFC is indeed irrelevant as a the |
Just linking here. - #2990 was closed, but I just opened #3316 which proposes to add an However - as mentioned on that thread, the new storages API would likely supplant any need for a separately-implemented |
I really wanted to use this API, but the codegen is not great, I compared the following 3 implementations: use core::mem::MaybeUninit;
use array_builder::ArrayBuilder;
pub fn write_array0(f: fn() -> u8) -> [u8; 1024] {
core::array::from_fn(|_|f())
}
pub fn write_array1(f: fn() -> u8) -> [u8; 1024] {
let mut uninit = [MaybeUninit::uninit(); 1024];
for i in 0..1024 {
uninit[i] = MaybeUninit::new(f());
}
unsafe { (uninit.as_ptr() as *const [u8; 1024]).read() }
}
pub fn write_array2(f: fn() -> u8) -> [u8; 1024] {
let mut builder = ArrayBuilder::new();
for _ in 0..1024 {
builder.push(f());
}
builder.build().unwrap()
} And the produced ASM is as follows: write_array0 push r15
push r14
push rbx
sub rsp, 2064
mov r15, rsi
mov r14, rdi
xor ebx, ebx
.LBB9_1:
call r15
mov byte, ptr, [rsp, +, rbx, +, 9], al
inc rbx
cmp rbx, 1024
jne .LBB9_1
lea r15, [rsp, +, 1033]
lea rsi, [rsp, +, 9]
mov rbx, qword, ptr, [rip, +, memcpy@GOTPCREL]
mov edx, 1024
mov rdi, r15
call rbx
mov edx, 1024
mov rdi, r14
mov rsi, r15
call rbx
mov rax, r14
add rsp, 2064
pop rbx
pop r14
pop r15
ret
.LBB9_4:
mov r14, rax
lea rsi, [rsp, +, 9]
mov rdi, rbx
call core::ptr::drop_in_place<core::array::Guard<u8,1024_usize>>
mov rdi, r14
call _Unwind_Resume
ud2 write_array1push r15
push r14
push r12
push rbx
sub rsp, 1032
mov r15, rsi
mov r14, rdi
xor ebx, ebx
.LBB10_1:
lea r12, [rbx, +, 1]
call r15
mov byte, ptr, [rsp, +, rbx, +, 8], al
mov rbx, r12
cmp r12, 1024
jne .LBB10_1
lea rsi, [rsp, +, 8]
mov edx, 1024
mov rdi, r14
call qword, ptr, [rip, +, memcpy@GOTPCREL]
mov rax, r14
add rsp, 1032
pop rbx
pop r12
pop r14
pop r15
ret write_array2 push rbp
push r15
push r14
push rbx
sub rsp, 2072
mov r15, rsi
mov r14, rdi
mov qword, ptr, [rsp, +, 1032], 0
mov ebp, 1024
xor ebx, ebx
.LBB11_1:
call r15
cmp rbx, 1024
jae .LBB11_3
mov byte, ptr, [rsp, +, rbx, +, 8], al
mov rbx, qword, ptr, [rsp, +, 1032]
inc rbx
mov qword, ptr, [rsp, +, 1032], rbx
dec ebp
jne .LBB11_1
cmp rbx, 1024
jne .LBB11_7
lea rsi, [rsp, +, 8]
mov edx, 1024
mov rdi, r14
call qword, ptr, [rip, +, memcpy@GOTPCREL]
mov rax, r14
add rsp, 2072
pop rbx
pop r14
pop r15
pop rbp
ret
.LBB11_3:
lea rdi, [rip, +, .Lanon.c820c69e724c4167816339e4a82ffdf6.0]
lea rdx, [rip, +, .Lanon.c820c69e724c4167816339e4a82ffdf6.2]
mov esi, 30
call qword, ptr, [rip, +, _ZN4core9panicking5panic17h90931f06a97cc5e0E@GOTPCREL]
jmp .LBB11_4
.LBB11_7:
lea r14, [rsp, +, 1040]
lea rsi, [rsp, +, 8]
mov edx, 1024
mov rdi, r14
call qword, ptr, [rip, +, memcpy@GOTPCREL]
mov qword, ptr, [rsp, +, 2064], rbx
lea rdi, [rip, +, .Lanon.c820c69e724c4167816339e4a82ffdf6.4]
lea rcx, [rip, +, .Lanon.c820c69e724c4167816339e4a82ffdf6.5]
lea r8, [rip, +, .Lanon.c820c69e724c4167816339e4a82ffdf6.14]
mov esi, 43
mov rdx, r14
call qword, ptr, [rip, +, _ZN4core6result13unwrap_failed17hdc73d4affce1d414E@GOTPCREL]
.LBB11_4:
ud2
.LBB11_9:
mov rbx, rax
jmp .LBB11_10
.LBB11_11:
jmp .LBB11_12
.LBB11_13:
.LBB11_12:
mov rbx, rax
lea r14, [rsp, +, 8]
.LBB11_10:
mov rdi, r14
call core::ptr::drop_in_place<array_builder::ArrayBuilder<u8,1024_usize>>
mov rdi, rbx
call _Unwind_Resume
ud2 |
rendered