Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change str types to _not_ explicitly include the size. #1060

Closed
emilyaherbert opened this issue Feb 25, 2022 · 19 comments
Closed

Change str types to _not_ explicitly include the size. #1060

emilyaherbert opened this issue Feb 25, 2022 · 19 comments
Labels
compiler General compiler. Should eventually become more specific as the issue is triaged enhancement New feature or request lib: core Core library Needs RFC Features that require an RFC before proceeding with an implementation P: medium

Comments

@emilyaherbert
Copy link
Contributor

This Sway code errors:

let x = "hi";
let y = x == "hi";
@emilyaherbert emilyaherbert added good first issue Good for newcomers enhancement New feature or request labels Feb 25, 2022
@ControlCplusControlV
Copy link
Contributor

So a quick question as I look at this issue, each string length is it's own type (ie str[4] is different from str[6]) so is there a blanket way to use the same implementation for each string length?

The second question I have is what is the correct way to access an individual character in a string in Sway?

@otrho
Copy link
Contributor

otrho commented Mar 20, 2022

I think strings are almost the most neglected part of Sway at the moment. 🙂 At one point I wondered whether Sway even needs them.

Comparing strings should be done with the MEQ instruction eventually, although hopefully sooner rather than later we'll have some sort of memcmp() intrinsic coming which is preferable.

I assume we'll be adopting UTF8 for them which means accessing a specific character is non-trivial, just as with Rust, and requires some work in the compiler. Determining the byte size of the string suitable for MEQ is also therefore non-trivial.

Sorry, I haven't really provided any concrete answers here...

@ControlCplusControlV
Copy link
Contributor

Strings are often a waste of gas and I don't think many Solidity contracts use them outside of error messages, so I definitely get considering leaving them out entirely.

MEQ is actually very helpful as I planning on accessing and comparing each character, but by doing the entire memory range it avoids the need to do that. The only question I have left now is how to handle each string length being its own type, perhaps some generic function?

This would put us closer to Java though in that we have a specific method to compare strings, so I don't know if that would go against Sway's philosophy too much. (A bit different though, in just have a std library function like cmp_str(), instead of attached to the string type)

There could also be another way using a generic impl once those are working.

@adlerjohn
Copy link
Contributor

Strings are needed for things like verification of EIP-191 signatures, unless you want devs to convert their UTF-8 strings to bytes beforehand manually.

@ControlCplusControlV
Copy link
Contributor

Interesting, didn't know they saw use for that. Could also be helpful to add some functions help developers move between strings and byte arrays in the future then as well

@adlerjohn adlerjohn transferred this issue from FuelLabs/sway-lib-core Mar 29, 2022
@adlerjohn adlerjohn added the lib: core Core library label Mar 29, 2022
@nfurfaro nfurfaro moved this from Todo to In Progress in Fuel Network Apr 26, 2022
@nfurfaro
Copy link
Contributor

nfurfaro commented Apr 26, 2022

I was looking to pick this up as I'm in need of a project. Not sure how to approach the fact (as mentioned above) that every different length of str is its own type.
A simple generic function could be a simple approach, but I would prefer to implement the existing Eq trait if at all possible. It seems like we would need maybe trait constraints/bounds to make this work?
ie: impl Eq for str[n] where n can be any u64 .

@otrho
Copy link
Contributor

otrho commented Apr 27, 2022

Internally strings of different length are definitely of different types. So using Eq would enforce that both strings are the same length at the type checker. But yeah, we need higher level 'kinds' to match strings of all lengths/types for a single impl. Unfortunately this feels like we might need some compiler special magic to special case them in the type checker.

As per my comment above to actually compare them you'd use MEQ (or the fabled intrinsics which I swear will happen one day) but you need a length.

It would make sense for size_of to return the byte size of the strings and you could use that. OTOH there's still the UTF8 issue which needs resolving. Although both strings may have the same number of characters they could have very different byte sizes. So perhaps getting the size_of both strings, comparing sizes and then contents is the go.

@nfurfaro
Copy link
Contributor

nfurfaro commented Apr 27, 2022

So, in order to implement this now in the stdlib:

  • trying to impl Eq is not going to work in a generic way (for strings of different lengths. We would need to implement the trait for str[0], str[1], str[2], etc...
  • Adding a stdlib function should work now (without any compiler magic or higher level 'kinds').

something along the lines of:

pub fn are_strings_eq<T>(str_a: T, str_b: T) -> bool {
    if size_of(str_a) == size_of(str_b) {
        // ...compare contents using `meq`
    } else {
        false
    }
}

cc @otrho

@otrho
Copy link
Contributor

otrho commented Apr 28, 2022

The problem above is that T could be anything. We really need to change something in the type system to make this work, AFAICT. Either we make str a type of string of any fixed length, or we introduce a trait for strings and only strings which gives us .num_chars() and .num_bytes() and can also be used in where clauses.

pub fn are_strings_eq<S>(a: S, b: S) -> bool
    where S: StaticString
{
    ...
}

Do we have where clauses for functions yet, or just type declarations? That example is really clumsy either way.

@sezna This is where your expertise shines. How shall we use strings of unknown lengths with type safety?

@sezna
Copy link
Contributor

sezna commented Apr 28, 2022

We don't provide any native support for dynamically sized strings. That's Library Territory™.

We'd either need to introduce a mechanism for implementing traits over all strs, or, defer to libraries. I vote for the latter because higher level string support is not an explicit goal of Sway and is not generally a high priority for smart contract development.

@otrho
Copy link
Contributor

otrho commented Apr 28, 2022

Note that we're not talking about dynamically sized strings at all here. These problems are with Sway's static str.

So you're conceding that str as we have them now are essentially useless and we should instead construct them using libraries. We might as well remove str as a basic type altogether then.

@otrho
Copy link
Contributor

otrho commented Apr 28, 2022

Except it really makes sense to keep string literals. I honestly think we need to tone down the type system and copy Rust to a degree.

We should make str a static const string of any fixed length. The length should not be a part of the type. The have a byte length property which we can fetch with size_of (and optionally wrap in .len() in a library). They can't be indexed. UTF8 chars can be managed in libraries once we have iterators.

Thoughts?

@adlerjohn
Copy link
Contributor

They have a byte length property which we can fetch with size_of

Is that not the case currently?

@otrho
Copy link
Contributor

otrho commented Apr 29, 2022

Yep, probably? I was just trying to lay out everything we need to make it all work. @sezna Does the above seem fair?

@nfurfaro
Copy link
Contributor

Is that not the case currently?

I think so... I have WIP for this with tests, but blocked by MemoryOverflow.

@sezna
Copy link
Contributor

sezna commented Apr 29, 2022

It makes sense, but you're suggesting essentially keeping the restriction that all strs must have a size known at compile time, but they fall under the same type technically? That's pretty much exactly what Rust does, seems reasonable.

@nfurfaro nfurfaro moved this from In Progress to Blocked in Fuel Network Apr 29, 2022
@otrho
Copy link
Contributor

otrho commented Apr 30, 2022

OK, I'm renaming this issue since all of the discussion has been here already. This is a largish change though, as it should go right through from parsing to IR.

@otrho otrho changed the title Implement eq for str type Change str types to _not_ explicitly include the size. Apr 30, 2022
@otrho otrho added compiler General compiler. Should eventually become more specific as the issue is triaged and removed good first issue Good for newcomers labels Apr 30, 2022
@sezna
Copy link
Contributor

sezna commented Apr 30, 2022

Feel free to override me, but I'm putting p: medium on this since strings are not blocking any major use cases right now

@mohammadfawaz mohammadfawaz added the Needs RFC Features that require an RFC before proceeding with an implementation label Jun 27, 2022
@anton-trunov anton-trunov mentioned this issue Jul 20, 2023
7 tasks
@xunilrj
Copy link
Contributor

xunilrj commented Sep 13, 2023

We have a new PR #4996 that changed the semantics of string literals and a new issue #5110 to address improvements to string slices.
So I will close this issue.

@xunilrj xunilrj closed this as completed Sep 13, 2023
@github-project-automation github-project-automation bot moved this from Blocked to Done in Fuel Network Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler General compiler. Should eventually become more specific as the issue is triaged enhancement New feature or request lib: core Core library Needs RFC Features that require an RFC before proceeding with an implementation P: medium
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

8 participants