String slices (#4996)
## Description

This PR introduces `string slices`.

The basic usage is at `test/src/ir_generation/tests/str_slice.sw`:

```sway
let a: str = "ABC";
```

Before this PR `a` would be of type `str[3]`, a string array.

Now both `string slices` and `string arrays` exist. This PR contains a
new intrinsic that converts from string literals to arrays.

```sway
let a: str = "ABC";
let b: str[3] = __to_str_array("ABC");
```

Runtime conversions can be done with `try_as_str_array` and `from_str_array`:

```sway
let a = "abcd";
let b: str[4] = a.try_as_str_array().unwrap();
let c = from_str_array(b);
```

Converting a string slice to a string array can fail, so
`try_as_str_array` returns `Option<str[N]>`; because it returns an
`Option`, it lives in `std`. The inverse, `from_str_array`, only fails
if `alloc` fails and lives in `core`.
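
The length check behind the fallible direction can be sketched in Rust, used here only as an analogy because the compiler sources touched by this commit are Rust. The function names mirror the Sway ones, but the bodies are illustrative and are not the `std` or `core` implementations. The exact-length rule is an assumption of this sketch.

```rust
/// Hypothetical analogue of Sway's `try_as_str_array`.
/// Assumption: the conversion succeeds only when the slice holds exactly
/// `N` bytes; otherwise it yields `None`.
fn try_as_str_array<const N: usize>(s: &str) -> Option<[u8; N]> {
    // `TryFrom<&[u8]> for [u8; N]` fails when the lengths differ.
    <[u8; N]>::try_from(s.as_bytes()).ok()
}

/// Hypothetical analogue of `from_str_array`: copies the fixed-size bytes
/// into a freshly allocated buffer, so allocation is the only step that
/// could fail.
fn from_str_array<const N: usize>(a: &[u8; N]) -> Vec<u8> {
    a.to_vec()
}
```

With these definitions, `try_as_str_array::<4>("abcd")` yields `Some(*b"abcd")`, while `try_as_str_array::<3>("abcd")` yields `None`.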

In this PR, `string slices` are forbidden in `configurable`, `storage`,
and `const` declarations, and as `main` arguments and return values. The
reason for these limitations is that the internal representation of a
`string slice` contains a `ptr`.

The optimized IR for initializing the slice is:

```
v0 = const string<3> "abc"
v1 = ptr_to_int v0 to u64, !2
v2 = get_local ptr { u64, u64 }, __anon_0, !2
v3 = const u64 0
v4 = get_elem_ptr v2, ptr u64, v3
store v1 to v4, !2

v5 = const u64 1
v6 = get_elem_ptr v2, ptr u64, v5
v7 = const u64 3
store v7 to v6, !2

v8 = get_local ptr slice, __anon_1, !2
mem_copy_bytes v8, v2, 16
```
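
The IR above writes two 64-bit words (the address of the string data, then its length) and copies 16 bytes into the slice local. A rough Rust model of that layout follows. The struct and function names are hypothetical and exist only for illustration; they are not part of the compiler.

```rust
/// Hypothetical model of a string slice as the two 64-bit words the IR
/// stores: word 0 is the address of the bytes, word 1 is the length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct RawStrSlice {
    ptr: u64,
    len: u64,
}

/// Builds the two-word representation from static string data.
fn make_slice(data: &'static str) -> RawStrSlice {
    RawStrSlice {
        ptr: data.as_ptr() as u64, // plays the role of `ptr_to_int v0 to u64`
        len: data.len() as u64,    // plays the role of `const u64 3` for "abc"
    }
}

/// Size of the model in bytes; matches the `16` passed to `mem_copy_bytes`.
fn slice_size_in_bytes() -> usize {
    std::mem::size_of::<RawStrSlice>()
}
```

Because the representation embeds an address, it only makes sense while the pointed-to data is reachable, which is consistent with the restrictions listed above for `configurable`, `storage`, and `const`.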

## Checklist

- [x] I have linked to any relevant issues.
- [x] I have commented my code, particularly in hard-to-understand
areas.
- [x] I have updated the documentation where relevant (API docs, the
reference, and the Sway book).
- [x] I have added tests that prove my fix is effective or that my
feature works.
- [x] I have added (or requested a maintainer to add) the necessary
`Breaking*` or `New Feature` labels where relevant.
- [x] I have done my best to ensure that my PR adheres to [the Fuel Labs
Code Review
Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.
xunilrj authored Sep 11, 2023
1 parent c3f3a36 commit f88bbf4
Showing 140 changed files with 1,098 additions and 509 deletions.
32 changes: 29 additions & 3 deletions docs/book/src/basics/built_in_types.md
Original file line number Diff line number Diff line change
@@ -18,6 +18,7 @@ Sway has the following primitive types:
1. `u32` (32-bit unsigned integer)
1. `u64` (64-bit unsigned integer)
1. `str[]` (fixed-length string)
1. `str` (string slices)
1. `bool` (Boolean `true` or `false`)
1. `b256` (256 bits (32 bytes), i.e. a hash)

@@ -67,21 +68,46 @@ fn returns_false() -> bool {
}
```

## String Type
## String Slices

<!-- This section should explain the string type in Sway -->
<!-- str:example:start -->
In Sway, static-length strings are a primitive type. This means that when you declare a string, its size is a part of its type. This is necessary for the compiler to know how much memory to give for the storage of that data. The size of the string is denoted with square brackets.
In Sway, string literals are stored as variable-length string slices, which means that they are stored as a pointer to the actual string data plus its length.
<!-- str:example:end -->

```sway
let my_string: str = "fuel";
```

Because they contain pointers, string slices have limited usage. They cannot be used in constants, storage, or configurables, nor as main function arguments or return values.

For these cases one must use string arrays, as described below.

## String Arrays

<!-- This section should explain the string type in Sway -->
<!-- str:example:start -->
In Sway, static-length strings are a primitive type. This means that when you declare a string array, its size is a part of its type. This is necessary for the compiler to know how much memory to give for the storage of that data. The size of the string is denoted with square brackets.
<!-- str:example:end -->

Let's take a look:

```sway
let my_string: str[4] = "fuel";
let my_string: str[4] = __to_str_array("fuel");
```

Because the string literal `"fuel"` is four letters, the type is `str[4]`, denoting a static length of 4 characters. Strings default to UTF-8 in Sway.

As described above, string literals are typed as string slices, which is why `__to_str_array` is needed to convert them to string arrays at compile time.

Conversion during runtime can be done with `from_str_array` and `try_as_str_array`. The latter can fail, given that the specified string array must be big enough for the string slice content.

```sway
let a: str = "abcd";
let b: str[4] = a.try_as_str_array().unwrap();
let c: str = from_str_array(b);
```

## Compound Types

_Compound types_ are types that group multiple values into one type. In Sway, we have arrays and tuples.
2 changes: 1 addition & 1 deletion docs/book/src/basics/variables.md
@@ -48,7 +48,7 @@ let foo: u32 = 5;
We have just declared the _type_ of the variable `foo` as a `u32`, which is an unsigned 32-bit integer. Let's take a look at a few other type annotations:

```sway
let bar: str[4] = "sway";
let bar: str[4] = __to_str_array("sway");
let baz: bool = true;
```

22 changes: 16 additions & 6 deletions docs/book/src/reference/compiler_intrinsics.md
@@ -25,20 +25,30 @@ __size_of<T>() -> u64
___

```sway
__size_of_str<T>() -> u64
__size_of_str_array<T>() -> u64
```

**Description:** Return the size of type `T` in bytes. This intrinsic differs from `__size_of` in the case of `str` type where the actual length in bytes of the string is returned without padding the byte size to the next word alignment. When `T` is not a string `0` is returned.
**Description:** Return the size of type `T` in bytes. This intrinsic differs from `__size_of` in the case of "string arrays" where the actual length in bytes of the string is returned without padding the byte size to the next word alignment. When `T` is not a string `0` is returned.

**Constraints:** None.
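
The padding difference described for `__size_of_str_array` can be illustrated with a small Rust sketch. It assumes an 8-byte FuelVM word, and the helper names are hypothetical; it models the arithmetic only and is not how the intrinsics are implemented.

```rust
/// Assumption: a FuelVM word is 8 bytes.
const WORD_SIZE_BYTES: u64 = 8;

/// Rounds a byte count up to the next multiple of the word size.
fn round_up_to_word(bytes: u64) -> u64 {
    (bytes + WORD_SIZE_BYTES - 1) / WORD_SIZE_BYTES * WORD_SIZE_BYTES
}

/// Models the unpadded length described for `__size_of_str_array::<str[N]>()`.
fn str_array_len_bytes(n: u64) -> u64 {
    n
}

/// Models the word-aligned size described for `__size_of::<str[N]>()`.
fn str_array_padded_bytes(n: u64) -> u64 {
    round_up_to_word(n)
}
```

For a `str[4]`, this model gives 4 bytes unpadded and 8 bytes once padded to the next word.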

___

```sway
__check_str_type<T>() -> u64
__assert_is_str_array<T>()
```

**Description:** Throws a compile error if type `T` is not a string.
**Description:** Throws a compile error if type `T` is not a "string array".

**Constraints:** None.

___

```sway
__to_str_array(s: str) -> str[N]
```

**Description:** Converts a "string slice" to "string array" at compile time. Parameter "s" must be a string literal.

**Constraints:** None.

@@ -55,10 +65,10 @@ __is_reference_type<T>() -> bool
___

```sway
__is_str_type<T>() -> bool
__is_str_array<T>() -> bool
```

**Description:** Returns `true` if `T` is a str type and `false` otherwise.
**Description:** Returns `true` if `T` is a string array and `false` otherwise.

**Constraints:** None.

2 changes: 1 addition & 1 deletion docs/reference/src/code/language/annotations/src/main.sw
@@ -28,7 +28,7 @@ fn read_write() {

fn example() {
// ANCHOR: example
let bar: str[4] = "sway";
let bar: str = "sway";
let baz: bool = true;
// ANCHOR_END: example
}
9 changes: 5 additions & 4 deletions docs/reference/src/code/language/built-ins/strings/src/lib.sw
@@ -2,15 +2,16 @@ library;

fn explicit() {
// ANCHOR: explicit
let fuel: str[4] = "fuel";
let blockchain: str[10] = "blockchain";
let crypto: str[6] = "crypto";
let fuel: str = "fuel";
let blockchain: str = "blockchain";
let crypto: str[6] = __to_str_array("crypto");
// ANCHOR_END: explicit
}

fn implicit() {
// ANCHOR: implicit
// The variable `fuel` has a length of 4
// The variable `fuel` is a string slice with length equals 4
let fuel = "fuel";
let crypto = __to_str_array("crypto");
// ANCHOR_END: implicit
}
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ library;
fn foo() {}

// Can import everything below because they are using the `pub` keyword
pub const ONE = "1";
pub const ONE = __to_str_array("1");

pub struct MyStruct {}

2 changes: 1 addition & 1 deletion docs/reference/src/code/language/variables/src/lib.sw
@@ -18,7 +18,7 @@ fn reassignment() {
// Set `foo` to take the value of `5` and the default `u64` type
let foo = 5;

// Reassign `foo` to be a `str[4]` with the value of `Fuel`
// Reassign `foo` to be a `str` with the value of `Fuel`
let foo = "Fuel";
// ANCHOR_END: reassignment
}
4 changes: 2 additions & 2 deletions docs/reference/src/code/operations/hashing/src/lib.sw
@@ -4,7 +4,7 @@ library;
use std::hash::*;
// ANCHOR_END: import
// ANCHOR: sha256
fn sha256_hashing(age: u64, name: str[5], status: bool) -> b256 {
fn sha256_hashing(age: u64, name: str, status: bool) -> b256 {
let mut hasher = Hasher::new();
age.hash(hasher);
hasher.write_str(name);
@@ -13,7 +13,7 @@ fn keccak256_hashing(age: u64, name: str[5], status: bool) -> b256 {
}
// ANCHOR_END: sha256
// ANCHOR: keccak256
fn keccak256_hashing(age: u64, name: str[5], status: bool) -> b256 {
fn keccak256_hashing(age: u64, name: str, status: bool) -> b256 {
let mut hasher = Hasher::new();
age.hash(hasher);
hasher.write_str(name);
Original file line number Diff line number Diff line change
@@ -17,6 +17,7 @@ Sway has the following primitive types:
2. [Boolean](boolean.md)
1. `bool` (true or false)
3. [Strings](string.md)
1. `str` (string slice)
1. `str[n]` (fixed-length string of size n)
4. [Bytes](b256.md)
1. `b256` (256 bits / 32 bytes, i.e. a hash)
2 changes: 1 addition & 1 deletion examples/configurable_constants/src/main.sw
@@ -15,7 +15,7 @@ configurable {
U8: u8 = 8u8,
BOOL: bool = true,
ARRAY: [u32; 3] = [253u32, 254u32, 255u32],
STR_4: str[4] = "fuel",
STR_4: str[4] = __to_str_array("fuel"),
STRUCT: StructWithGeneric<u8> = StructWithGeneric {
field_1: 8u8,
field_2: 16,
14 changes: 1 addition & 13 deletions examples/hashing/src/main.sw
@@ -2,18 +2,6 @@ script;

use std::hash::*;

impl Hash for str[4] {
fn hash(self, ref mut state: Hasher) {
state.write_str(self);
}
}

impl Hash for str[32] {
fn hash(self, ref mut state: Hasher) {
state.write_str(self);
}
}

impl Hash for Location {
fn hash(self, ref mut state: Hasher) {
match self {
@@ -55,7 +43,7 @@ enum Location {
}

struct Person {
name: str[4],
name: str,
age: u64,
alive: bool,
location: Location,
2 changes: 1 addition & 1 deletion examples/result/src/main.sw
@@ -16,6 +16,6 @@ fn main() -> Result<u64, str[4]> {
let result = divide(20, 2);
match result {
Ok(value) => Ok(value),
Err(MyContractError::DivisionByZero) => Err("Fail"),
Err(MyContractError::DivisionByZero) => Err(__to_str_array("Fail")),
}
}
2 changes: 1 addition & 1 deletion examples/structs/src/data_structures.sw
@@ -18,5 +18,5 @@ pub struct Line {
}

pub struct TupleInStruct {
nested_tuple: (u64, (u32, (bool, str[2]))),
nested_tuple: (u64, (u32, (bool, str))),
}
2 changes: 1 addition & 1 deletion forc-plugins/forc-doc/src/render/item/type_anchor.rs
@@ -86,7 +86,7 @@ pub(crate) fn render_type_anchor(
TypeInfo::UnknownGeneric { name, .. } => Ok(box_html! {
: name.as_str();
}),
TypeInfo::Str(len) => Ok(box_html! {
TypeInfo::StringArray(len) => Ok(box_html! {
: len.span().as_str();
}),
TypeInfo::UnsignedInteger(int_bits) => {
2 changes: 2 additions & 0 deletions sway-ast/src/expr/op_code.rs
@@ -213,6 +213,8 @@ define_op_codes!(
(Aloc, AlocOpcode, "aloc", (size: reg)),
(Cfei, CfeiOpcode, "cfei", (size: imm)),
(Cfsi, CfsiOpcode, "cfsi", (size: imm)),
(Cfe, CfeOpcode, "cfe", (size: reg)),
(Cfs, CfsOpcode, "cfs", (size: reg)),
(Lb, LbOpcode, "lb", (ret: reg, addr: reg, offset: imm)),
(Lw, LwOpcode, "lw", (ret: reg, addr: reg, offset: imm)),
(Mcl, MclOpcode, "mcl", (addr: reg, size: reg)),
19 changes: 11 additions & 8 deletions sway-ast/src/intrinsics.rs
@@ -3,11 +3,12 @@ use std::fmt;
#[derive(Eq, PartialEq, Debug, Clone, Hash)]
pub enum Intrinsic {
IsReferenceType,
IsStrType,
IsStrArray,
SizeOfType,
SizeOfVal,
SizeOfStr,
CheckStrType,
AssertIsStrArray,
ToStrArray,
Eq,
Gt,
Lt,
@@ -40,11 +41,12 @@ impl fmt::Display for Intrinsic {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let s = match self {
Intrinsic::IsReferenceType => "is_reference_type",
Intrinsic::IsStrType => "is_str_type",
Intrinsic::IsStrArray => "is_str_type",
Intrinsic::SizeOfType => "size_of",
Intrinsic::SizeOfVal => "size_of_val",
Intrinsic::SizeOfStr => "size_of_str",
Intrinsic::CheckStrType => "check_str_type",
Intrinsic::SizeOfStr => "size_of_str_array",
Intrinsic::AssertIsStrArray => "assert_is_str_array",
Intrinsic::ToStrArray => "to_str_array",
Intrinsic::Eq => "eq",
Intrinsic::Gt => "gt",
Intrinsic::Lt => "lt",
@@ -81,11 +83,12 @@ impl Intrinsic {
use Intrinsic::*;
Some(match raw {
"__is_reference_type" => IsReferenceType,
"__is_str_type" => IsStrType,
"__is_str_array" => IsStrArray,
"__size_of" => SizeOfType,
"__size_of_val" => SizeOfVal,
"__size_of_str" => SizeOfStr,
"__check_str_type" => CheckStrType,
"__size_of_str_array" => SizeOfStr,
"__assert_is_str_array" => AssertIsStrArray,
"__to_str_array" => ToStrArray,
"__eq" => Eq,
"__gt" => Gt,
"__lt" => Lt,
6 changes: 4 additions & 2 deletions sway-ast/src/ty/mod.rs
@@ -6,7 +6,8 @@ pub enum Ty {
Path(PathType),
Tuple(Parens<TyTupleDescriptor>),
Array(SquareBrackets<TyArrayDescriptor>),
Str {
StringSlice(StrToken),
StringArray {
str_token: StrToken,
length: SquareBrackets<Box<Expr>>,
},
@@ -29,7 +30,8 @@ impl Spanned for Ty {
Ty::Path(path_type) => path_type.span(),
Ty::Tuple(tuple_type) => tuple_type.span(),
Ty::Array(array_type) => array_type.span(),
Ty::Str { str_token, length } => Span::join(str_token.span(), length.span()),
Ty::StringSlice(str_token) => str_token.span(),
Ty::StringArray { str_token, length } => Span::join(str_token.span(), length.span()),
Ty::Infer { underscore_token } => underscore_token.span(),
Ty::Ptr { ptr_token, ty } => Span::join(ptr_token.span(), ty.span()),
Ty::Slice { slice_token, ty } => Span::join(slice_token.span(), ty.span()),
7 changes: 5 additions & 2 deletions sway-core/src/abi_generation/evm_abi.rs
@@ -83,7 +83,8 @@ pub fn abi_str(type_info: &TypeInfo, type_engine: &TypeEngine, decl_engine: &Dec
UnknownGeneric { name, .. } => name.to_string(),
Placeholder(_) => "_".to_string(),
TypeParam(n) => format!("typeparam({n})"),
Str(x) => format!("str[{}]", x.val()),
StringSlice => "str".into(),
StringArray(x) => format!("str[{}]", x.val()),
UnsignedInteger(x) => match x {
IntegerBits::Eight => "uint8",
IntegerBits::Sixteen => "uint16",
@@ -144,7 +145,9 @@ pub fn abi_param_type(
) -> ethabi::ParamType {
use TypeInfo::*;
match type_info {
Str(x) => ethabi::ParamType::FixedArray(Box::new(ethabi::ParamType::String), x.val()),
StringArray(x) => {
ethabi::ParamType::FixedArray(Box::new(ethabi::ParamType::String), x.val())
}
UnsignedInteger(x) => match x {
IntegerBits::Eight => ethabi::ParamType::Uint(8),
IntegerBits::Sixteen => ethabi::ParamType::Uint(16),
3 changes: 2 additions & 1 deletion sway-core/src/abi_generation/fuel_abi.rs
@@ -784,7 +784,8 @@ impl TypeInfo {
UnknownGeneric { name, .. } => name.to_string(),
Placeholder(_) => "_".to_string(),
TypeParam(n) => format!("typeparam({n})"),
Str(x) => format!("str[{}]", x.val()),
StringSlice => "str".into(),
StringArray(x) => format!("str[{}]", x.val()),
UnsignedInteger(x) => match x {
IntegerBits::Eight => "u8",
IntegerBits::Sixteen => "u16",
2 changes: 1 addition & 1 deletion sway-core/src/asm_generation/from_ir.rs
@@ -181,7 +181,7 @@ pub(crate) fn ir_type_size_in_bytes(context: &Context, ty: &Type) -> u64 {

pub(crate) fn ir_type_str_size_in_bytes(context: &Context, ty: &Type) -> u64 {
match ty.get_content(context) {
TypeContent::String(n) => *n,
TypeContent::StringArray(n) => *n,
_ => 0,
}
}
3 changes: 2 additions & 1 deletion sway-core/src/asm_generation/fuel/functions.rs
@@ -647,7 +647,8 @@ impl<'ir, 'eng> FuelAsmBuilder<'ir, 'eng> {
| TypeContent::Pointer(_) => 1,
TypeContent::Slice => 2,
TypeContent::B256 => 4,
TypeContent::String(n) => size_bytes_round_up_to_word_alignment!(n),
TypeContent::StringSlice => 2,
TypeContent::StringArray(n) => size_bytes_round_up_to_word_alignment!(n),
TypeContent::Array(..) | TypeContent::Struct(_) | TypeContent::Union(_) => {
size_bytes_in_words!(ir_type_size_in_bytes(self.context, &ptr_ty))
}