C#: Cleanup and sync StringExtensions with core #67031
Force-pushed from f6e5e96 to 5abd16d.
Regarding names like UTF8, I would prefer to use PascalCase as I explain here. UPPERCASE makes names harder to read when used as part of a PascalCase identifier.
This seems to be one of the cases where the .NET API is inconsistent. The properties in the Encoding class use UPPERCASE, while other methods seem to use PascalCase (e.g., Char.ConvertToUtf32 and Char.IsAscii).
// TODO: Could be more efficient if we get a char version of `IndexOf`.
// See https://github.com/dotnet/runtime/issues/44116
return instance.IndexOf(what.ToString(), from,
    caseSensitive ? StringComparison.Ordinal : StringComparison.OrdinalIgnoreCase);
We could do something like this:
if (caseSensitive)
return instance.IndexOf(what, from); // Ordinal, case sensitive
return CultureInfo.InvariantCulture.CompareInfo.IndexOf(instance, what, from, CompareOptions.OrdinalIgnoreCase);
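Wrapped up as a complete extension method, the suggestion above might look roughly like this. It is only a sketch: the class name `FindSketch` is made up for illustration, and the parameter names and defaults are assumed from the snippet under review rather than taken from the final PR.

```csharp
using System;
using System.Globalization;

public static class FindSketch
{
    // Hypothetical helper for illustration, not the actual Godot implementation.
    public static int Find(this string instance, char what, int from = 0, bool caseSensitive = true)
    {
        if (caseSensitive)
        {
            // string.IndexOf(char, int) always performs an ordinal, case-sensitive search.
            return instance.IndexOf(what, from);
        }

        // CompareInfo.IndexOf has a char overload that accepts CompareOptions,
        // so the char does not have to be converted to a temporary string.
        return CultureInfo.InvariantCulture.CompareInfo.IndexOf(
            instance, what, from, CompareOptions.OrdinalIgnoreCase);
    }
}
```

With this sketch, `"Hello".Find('L')` finds nothing and returns -1, while `"Hello".Find('L', caseSensitive: false)` returns 2.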
NativeFuncs.godotsharp_string_md5_buffer(instanceStr, out var md5Buffer);
using (md5Buffer)
    return Marshaling.ConvertNativePackedByteArrayToSystemArray(md5Buffer);
#pragma warning disable CA5351 // Do Not Use Broken Cryptographic Algorithms
Would be great if our method could be annotated in some way so that callers would get this warning, but it doesn't seem to be possible :(
Regarding the removal of methods, I think we should have some kind of table that could be used as a reference to look up the equivalent of a removed method. The removal of the following methods would be harmless:
I have some comments about the other suggestions:
str.Split(",", allowEmpty: false);
str.Split(",", StringSplitOptions.RemoveEmptyEntries);
Doesn't seem to be just a wrapper, unless our implementation is doing unnecessary things.
The purpose of this method is to return the same hash as the GDScript hash function. Not sure if it has any uses, though.
Just like with
I agree about
Slightly agree.
Our methods are less verbose, and it would be harder for a user to know about the replacement if they were removed, unless they are already very familiar with the .NET API. I agree that hiding these warnings is bad.
Agreed, but for this PR I'm following the current naming because I thought naming changes would grow its scope too much. I can rename them if you prefer.
I think that's a good idea, should I add it to https://docs.godotengine.org/en/stable/tutorials/scripting/c_sharp/c_sharp_differences.html?
Yes, I'm not sure how common that is though. Would it be enough to add documentation about this?
Our implementation just clamps the length to the length of the array to avoid
I don't know, it doesn't seem useful to me. Would love to see usages of this.
This is reported by the .NET analyzers (see CA1304 and/or CA1307). Also, I personally think the
I guess it is, I personally don't have an issue with it. Wouldn't it be more common for users to want the parsed value anyway?
I agree that these ones are more difficult to find for a less-experienced .NET user. Adding documentation could help here though, and I'd prefer if users learned about existing .NET APIs that they can also use in non-Godot .NET applications.
Bump, would be good to find consensus and merge.
Just to be clear, the discussion is about methods that were not touched by this PR. I tried to keep this PR uncontroversial so it could be merged quickly: it breaks compatibility, so I'd want to get it merged as early as possible during the betas. However, I can update the PR to include the changes that we've already agreed on in the discussion:
And:
The only pending change is my suggestion for
The UTF8 word case is not required to merge this, especially considering some of these were already named that way. It's something I would like to change in the future though.
Lastly, it's fine if you want to include the extra methods that can be trivially removed. But I'm wondering if it would be better to mark them as obsolete (with the error flag, instead of a warning). That could be helpful for users upgrading from 3.x. Then we remove them entirely in 4.1 or later.
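For reference, marking a method as obsolete with the error flag could look something like the sketch below. The class name `ObsoleteSketch`, the message text, and the choice of an ordinal comparison are assumptions made for illustration. Only the `[Obsolete(message, error)]` attribute itself is standard .NET.

```csharp
using System;

public static class ObsoleteSketch
{
    // Hypothetical deprecated wrapper; the message text is illustrative only.
    // Passing `true` as the second argument makes every call site a compile
    // error instead of a warning, while the message still points users at
    // the replacement API.
    [Obsolete("Use string.StartsWith instead.", true)]
    public static bool BeginsWith(this string instance, string text)
    {
        // Assumption: an ordinal comparison is the intended behavior.
        return instance.StartsWith(text, StringComparison.Ordinal);
    }
}
```

The method stays in the assembly, so users upgrading see a targeted error message rather than a generic "method not found" error, and it can be deleted in a later release.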
Force-pushed from 5abd16d to 19ccfa8.
- Moved `GetBaseName` to keep methods alphabetically sorted.
- Removed `Length`, users should just use the Length property.
- Removed `Insert`, string already has a method with the same signature that takes precedence.
- Removed `Erase`.
- Removed `ToLower` and `ToUpper`, string already has methods with the same signature that take precedence.
- Removed `FindLast` in favor of `RFind`.
- Replaced `RFind` and `RFindN` implementation with a call to `string.LastIndexOf` to avoid marshaling.
- Added `LPad` and `RPad`.
- Added `StripEscapes`.
- Replaced `LStrip` and `RStrip` implementation with a call to `string.TrimStart` and `string.TrimEnd`.
- Added `TrimPrefix` and `TrimSuffix`.
- Renamed `OrdAt` to `UnicodeAt`.
- Added `CountN` and moved the `caseSensitive` parameter of `Count` to the end.
- Added `Indent` and `Dedent`.
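As a rough illustration of replacing the `RFind` and `RFindN` implementation with `string.LastIndexOf`, a minimal version might look like this. The class name `RFindSketch` is hypothetical, the `from` parameter is left out for brevity, and the ordinal comparisons are an assumption rather than something confirmed by the commit message.

```csharp
using System;

public static class RFindSketch
{
    // Hypothetical sketch: search backwards from the end of the string,
    // entirely in managed code (no marshaling to native Godot strings).
    public static int RFind(this string instance, string what, bool caseSensitive = true)
    {
        return instance.LastIndexOf(what,
            caseSensitive ? StringComparison.Ordinal : StringComparison.OrdinalIgnoreCase);
    }

    // The `N` variant is simply the case-insensitive form.
    public static int RFindN(this string instance, string what)
    {
        return instance.LastIndexOf(what, StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, `"a.b.c".RFind(".")` returns 3, the index of the last dot.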
- Renamed `IsValidInteger` to `IsValidInt`.
- Added `IsValidFileName`.
- Added `IsValidHexNumber`.
- Added support for IPv6 to `IsValidIPAddress`.
- Added `ValidateNodeName`.
- Updated the documentation of the `IsValid*` methods.
- Replaced `MD5Buffer`, `MD5Text`, `SHA256Buffer` and `SHA256Text` implementation to use the `System.Security.Cryptography` classes and avoid marshaling.
- Added `SHA1Buffer` and `SHA1Text`.
- Renamed `ToUTF8` to `ToUTF8Buffer`.
- Renamed `ToAscii` to `ToASCIIBuffer`.
- Added `ToUTF16Buffer` and `ToUTF32Buffer`.
- Added `GetStringFromUTF16` and `GetStringFromUTF32`.
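A managed implementation of the hashing helpers could be sketched as follows. This assumes .NET 5 or later (for the static `HashData` methods and `Convert.ToHexString`), and it assumes that the string is hashed as UTF-8 bytes and that the text form is lowercase hex. The class name `HashSketch` and the private helper are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashSketch
{
#pragma warning disable CA5351 // Do Not Use Broken Cryptographic Algorithms
    public static byte[] MD5Buffer(this string instance)
    {
        // Assumption: the string is hashed as its UTF-8 encoding.
        return MD5.HashData(Encoding.UTF8.GetBytes(instance));
    }
#pragma warning restore CA5351

    public static string MD5Text(this string instance)
    {
        return ToLowerHex(instance.MD5Buffer());
    }

    public static byte[] SHA256Buffer(this string instance)
    {
        return SHA256.HashData(Encoding.UTF8.GetBytes(instance));
    }

    public static string SHA256Text(this string instance)
    {
        return ToLowerHex(instance.SHA256Buffer());
    }

    // Hypothetical helper: Convert.ToHexString produces uppercase hex,
    // so it is lowercased here (assumed to be the expected text format).
    private static string ToLowerHex(byte[] bytes)
    {
        return Convert.ToHexString(bytes).ToLowerInvariant();
    }
}
```

Because the pragma lives inside the library, callers of `MD5Buffer` never see CA5351 themselves, which is the trade-off raised in the review comment about hidden warnings.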
Force-pushed from 19ccfa8 to d0b166d.
Rebased and added
I'm fine with this option but I'm worried about users that have already upgraded to 4.0 and may miss the deprecation notice, although maybe that's to be expected when using beta software.
I'm fine with either of the two options, or even a mix of both. It doesn't really do any harm.
- Removed `UnicodeAt`
- Removed `EndsWith`
- Removed `LPad` and `RPad`
- Deprecated `BeginsWith` in favor of `string.StartsWith`
- Deprecated `LStrip` and `RStrip` in favor of `string.TrimStart` and `string.TrimEnd`
I have added a commit that:
Thanks!
I started with the intention of creating separate commits, and maybe separate PRs, but I gave up halfway: all of these changes end up being related to each other in one way or another, and I didn't think it was useful to keep them separated. I can still extract some of the simpler, less controversial changes into smaller PRs if that's preferred.
Changes
- Moved `GetBaseName` to keep methods alphabetically sorted.
- Removed `Length`, users should just use the Length property.
- Removed `Insert`, string already has a method with the same signature that takes precedence.
- Removed `ToLower` and `ToUpper`, string already has methods with the same signature that take precedence.
- Deprecated `BeginsWith` in favor of `string.StartsWith`.
- Removed `EndsWith`, string already has a method with the same signature that takes precedence.
- Removed `FindLast` in favor of `RFind` (Remove String::find_last (same as rfind) #40092).
- Replaced `RFind` and `RFindN` implementation with a call to `string.LastIndexOf` to avoid marshaling (this fixes a bug caused by using the wrong parameter in the NativeFuncs call).
- Added `LPad` and `RPad` ([Complex Text Layouts] Refactor `String` to use UTF-32 encoding. #40999).
- Added `StripEscapes` (Fix and expose String::strip_escapes(), use it in LineEdit paste #29347).
- Replaced `LStrip` and `RStrip` implementation with a call to `string.TrimStart` and `string.TrimEnd`.
- Deprecated `LStrip` and `RStrip` in favor of `string.TrimStart` and `string.TrimEnd`.
- Added `TrimPrefix` and `TrimSuffix` (Add string trim_prefix, trim_suffix, lstrip and rstrip methods #18176).
- Removed `Erase` (String: Remove `erase` method, bindings can't mutate String #54869 and Remove `String::erase` method declaration #64714).
- Renamed `OrdAt` to `UnicodeAt` (Renamed String.ord_at to unicode_at #43790).
- Removed `OrdAt`/`UnicodeAt` in favor of the string indexer.
- Added `IsValidFileName` (a20235a).
- Added `IsValidHexNumber` (Bind `is_valid_hex_number` string method to GDScript #24586).
- Renamed `IsValidInteger` to `IsValidInt` (Rename `is_valid_integer()` to `is_valid_int()` #49659).
- Added support for IPv6 to `IsValidIPAddress` (Adding IPv6 support #6925).
- Added `ValidateNodeName` (Relaxes node name sanitization in gltf documents. #45545).
- Updated the documentation of the `IsValid*` methods.
- Replaced `MD5Buffer`, `MD5Text`, `SHA256Buffer` and `SHA256Text` implementation to use the `System.Security.Cryptography` classes and avoid marshaling.
- Added `SHA1Buffer` and `SHA1Text` (Use wslay as a WebSocket library #30263).
- Renamed `ToUTF8` to `ToUTF8Buffer` (Refactored binding system for core types #42780).
- Renamed `ToAscii` to `ToASCIIBuffer` (Refactored binding system for core types #42780).
- Added `ToUTF16Buffer` and `ToUTF32Buffer` ([Complex Text Layouts] Refactor `String` to use UTF-32 encoding. #40999 and Refactored binding system for core types #42780).
- Added `GetStringFromUTF16` and `GetStringFromUTF32` ([Complex Text Layouts] Refactor `String` to use UTF-32 encoding. #40999).
- Added `Dedent` (Added String::dedent() to remove text indentation #12025).
- Added `Indent` (Make `--doctool` locale aware #55930).
- Added `CountN` (also moved the `caseSensitive` parameter in `Count` to follow the same pattern used by other methods that have an alt `N` method where the `caseSensitive` parameter is at the end) (Added String.count method #25090).
  - The methods that end with `N` could be removed since their "overloads" that don't end with `N` already have an optional `caseSensitive` parameter (this seems to be @neikeq's preference as well, see Added String.count method #25090 (comment)) but since we already have many methods with an `N` "overload" I decided to follow the established pattern.
Methods to consider removing
Adding extension methods can pollute the type, so we should consider whether the methods we add are really useful or necessary. Many of the existing methods don't add much, and their behavior can be achieved with a similar one-liner using the methods provided by the BCL. Often we add methods that already exist in the BCL under a different name, which means IntelliSense will show multiple methods with very similar names, and that can be confusing to users. I think it would be better to avoid providing those methods and instead recommend that users use the existing APIs. Those also benefit from existing analyzers, provided by Microsoft or third-party libraries, that understand the BCL APIs and can warn users of potentially incorrect or non-optimal usage.
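To give a sense of the kind of one-liner meant here, the sketch below pairs a few of the Godot-style names with a BCL call that covers the same ground. The class `BclEquivalents`, the parameter names, and the explicit ordinal and invariant-culture arguments are assumptions made for illustration. The exact signatures and edge-case behavior of the real extension methods may differ.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class BclEquivalents
{
    // BeginsWith -> string.StartsWith, with the comparison spelled out.
    public static bool BeginsWith(string s, string prefix) =>
        s.StartsWith(prefix, StringComparison.Ordinal);

    // Find -> string.IndexOf.
    public static int Find(string s, string what) =>
        s.IndexOf(what, StringComparison.Ordinal);

    // LPad -> string.PadLeft.
    public static string LPad(string s, int minLength, char pad = ' ') =>
        s.PadLeft(minLength, pad);

    // LStrip -> string.TrimStart with the set of characters to remove.
    public static string LStrip(string s, string chars) =>
        s.TrimStart(chars.ToCharArray());

    // IsValidInt -> int.TryParse, which also hands back the parsed value.
    public static bool IsValidInt(string s, out int value) =>
        int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);

    // ToUTF8Buffer / GetStringFromUTF8 -> System.Text.Encoding.
    public static byte[] ToUTF8Buffer(string s) => Encoding.UTF8.GetBytes(s);
    public static string GetStringFromUTF8(byte[] bytes) => Encoding.UTF8.GetString(bytes);
}
```

Each body is a single BCL call, which is the argument for pointing users at the BCL method directly.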
- `UnicodeAt` since it's just a wrapper over the indexer `string[int index]`.
- `BeginsWith` and `EndsWith` since they're just a wrapper over `string.StartsWith` and `string.EndsWith`.
  - `EndsWith` has the same signature as `string.EndsWith` and the instance method takes precedence so it's already unused.
- `Split` since it's just a wrapper over `string.Split`.
- `Substr` since it's just a wrapper over `string.Substring`.
- `Hash` since users should probably use `string.GetHashCode` instead.
- `Find`, `FindN`, `RFind` and `RFindN` since they're just wrappers over `string.IndexOf` and `string.LastIndexOf`.
- `CasecmpTo`, `NocasecmpTo` and `CompareTo`, users should probably use `string.Equals` and `string.Compare` both of which take a `StringComparison` parameter that allows specifying the culture and case sensitivity.
- `LPad` and `RPad` since they are just wrappers over `string.PadLeft` and `string.PadRight`.
- `LStrip` and `RStrip` since they are just wrappers over `string.TrimStart` and `string.TrimEnd`.
- `TrimPrefix` and `TrimSuffix` in a future .NET version where `RemovePrefix` and `RemoveSuffix` may be added (Add overloads to string trimming dotnet/runtime#14386).
- `IsValidFloat` and `IsValidInt` since they're just wrappers over `float.TryParse` and `int.TryParse`.
- `ToUTF8Buffer`, `ToUTF16Buffer`, `ToUTF32Buffer`, `ToASCIIBuffer`, `GetStringFromUTF8`, `GetStringFromUTF16`, `GetStringFromUTF32` and `GetStringFromASCII` since they are just wrappers over `System.Text.Encoding`.
- `SHA1Buffer`, `SHA1Text`, `SHA256Buffer`, `SHA256Text`, `MD5Buffer` and `MD5Text` since they are just wrappers over `System.Security.Cryptography` classes (and using our methods hides CA5350 and CA5351).
Methods not added
These methods are exposed in GDScript but don't currently exist in StringExtensions and haven't been added or exposed in this PR. We could add them in a future PR if we consider them useful.
- `humanize_size` because it's a static method and it takes an int. (Bind the `String::humanize_size` method #32546).
- `get_slice`, `get_slice_count` and `get_slicec`.
  - `GetSliceCount` is already implemented because it's used in `Capitalize` but not exposed.
  - `GetSliceCharacter` implements `get_slicec` because it's used in `Capitalize` but it's not exposed.
- `num`, `num_int64`, `num_scientific` and `num_uint64` because users should use `ToString` and/or `IFormattable`.
- `naturalnocasecmp_to`
- `rsplit` because we also don't really implement `split` with the same behavior as GDScript.
- `repeat` because users should probably use the string constructor or StringBuilder.
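For the last two groups, the BCL alternatives mentioned above might be used along these lines. The class `NotAddedSketch` and its method names are hypothetical, and the output is not claimed to match GDScript's `num` or `repeat` exactly (for example in how trailing zeros are handled).

```csharp
using System;
using System.Globalization;
using System.Text;

public static class NotAddedSketch
{
    // repeat, single character: the string(char, int) constructor.
    public static string RepeatChar(char c, int count) => new string(c, count);

    // repeat, longer string: StringBuilder.Insert(int, string, int)
    // inserts `count` copies of the value in one call.
    public static string Repeat(string s, int count) =>
        new StringBuilder(s.Length * count).Insert(0, s, count).ToString();

    // num-style formatting: ToString with a fixed-point format string and
    // an explicit culture, so the decimal separator is always '.'.
    public static string Num(double value, int decimals) =>
        value.ToString("F" + decimals, CultureInfo.InvariantCulture);
}
```

For instance, `NotAddedSketch.Repeat("ab", 3)` yields `"ababab"` and `NotAddedSketch.Num(3.14159, 2)` yields `"3.14"`.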