Expose a UTF-8 checking function that returns the index of the error #12168
Comments
If we have this, we'd probably also want to return the number of bytes corresponding to the invalid codepoint (e.g. the number of bytes starting from that index that should be replaced by |
For |
Related bug: #12113
@lifthrasiir I like your idea, except that I suspect the |
@pnkfelix Agreed. I'm not sure how to handle the same error case for |
I'm thinking of something like

pub struct Utf8Error {
    // if there's an error, produce a sanitized output with replacement chars
    recovered: ~str,
    // the byte position and exact sequence of each error
    errors: ~[(uint, ~[u8])],
}
pub fn from_utf8_owned_explain(vv: ~[u8]) -> Result<~str, Utf8Error> { ... }
pub fn from_utf8_explain<'a>(v: &'a [u8]) -> Result<&'a str, Utf8Error> { ... }

where the |
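For readers on current Rust, the proposal above might be sketched roughly as follows. This is only an illustration under assumptions: `~str` and `~[T]` are rendered as `String` and `Vec<T>`, `uint` as `usize`, and the body is one possible implementation built on the standard library's `std::str::from_utf8`, not anything that was actually merged.

```rust
/// Hypothetical modern rendering of the struct proposed in this comment.
/// (It is unrelated to `std::str::Utf8Error`, which has a different shape.)
#[derive(Debug, PartialEq)]
pub struct Utf8Error {
    /// Sanitized output with U+FFFD in place of each invalid sequence.
    pub recovered: String,
    /// Byte position and exact bytes of each invalid sequence.
    pub errors: Vec<(usize, Vec<u8>)>,
}

/// Sketch of the proposed `from_utf8_explain`; the body is an assumption.
pub fn from_utf8_explain(v: &[u8]) -> Result<&str, Utf8Error> {
    // Fast path: fully valid input is borrowed with no allocation.
    if let Ok(s) = std::str::from_utf8(v) {
        return Ok(s);
    }

    let mut recovered = String::new();
    let mut errors = Vec::new();
    let mut pos = 0;

    while pos < v.len() {
        match std::str::from_utf8(&v[pos..]) {
            Ok(s) => {
                // The rest of the input is valid; copy it and stop.
                recovered.push_str(s);
                break;
            }
            Err(e) => {
                let start = pos + e.valid_up_to();
                // The bytes before `start` were just validated by the call above.
                recovered.push_str(std::str::from_utf8(&v[pos..start]).unwrap());
                // `error_len()` is `None` when the input ends mid-sequence;
                // in that case treat everything that remains as the bad sequence.
                let len = e.error_len().unwrap_or(v.len() - start);
                errors.push((start, v[start..start + len].to_vec()));
                recovered.push('\u{FFFD}');
                pos = start + len;
            }
        }
    }

    Err(Utf8Error { recovered, errors })
}
```

A caller that only cares about the happy path pays nothing extra; the allocation for `recovered` and `errors` happens only once an invalid sequence has been found.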
I think this is far too complex. Why does If you need to be able to recover from invalid utf-8 and either know the valid utf8 prefix, or keep the |
I agree with @kballard: if more control over the encoding error is required, you need to use a separate library for that. I thought returning a valid prefix would not cause overhead (since it is either a |
cc @aturon, we were discussing this in the recent string stabilization meeting
I'm pulling a massive triage effort to get us ready for 1.0. As part of this, I'm moving stuff that's wishlist-like to the RFCs repo, as that's where major new things should get discussed/prioritized. This issue has been moved to the RFCs repo: rust-lang/rfcs#860
Note: this has actually been resolved.
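The thread does not say how it was resolved. In current stable Rust, though, `std::str::from_utf8` returns a `std::str::Utf8Error` whose `valid_up_to()` gives the index of the first invalid byte and whose `error_len()` gives the length of the invalid sequence, which covers both requests discussed here. A minimal sketch of recovering the valid prefix that way (the helper name `split_valid_prefix` is made up for illustration):

```rust
use std::str;

/// Hypothetical helper: splits `bytes` into the longest valid UTF-8 prefix
/// and the remaining bytes, which start at the first invalid sequence.
pub fn split_valid_prefix(bytes: &[u8]) -> (&str, &[u8]) {
    match str::from_utf8(bytes) {
        // Everything is valid, so the remainder is empty.
        Ok(s) => (s, &bytes[bytes.len()..]),
        Err(e) => {
            // `valid_up_to()` is the index of the first invalid byte.
            let (valid, rest) = bytes.split_at(e.valid_up_to());
            // The prefix was validated by the call above, so this cannot fail.
            (str::from_utf8(valid).expect("prefix validated above"), rest)
        }
    }
}
```

No copy is made in either branch: both halves borrow from the input slice, which matches the "no overhead" expectation voiced earlier in the thread.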
Use case:
Possible candidates:
Expose first_non_utf8_index() as public
Change from_utf8{_owned}'s return type from Option to Result<_, uint>
Since it's possible to recover from such an error, it would also be nice if these returned all encoding errors.
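As a rough illustration of the two candidates in today's Rust, both shapes can be written as thin wrappers over `std::str::from_utf8`. The function names below are hypothetical (the first borrows the name used in this issue, the second is invented), and `usize` stands in for the old `uint`.

```rust
/// Hypothetical public version of `first_non_utf8_index`:
/// `None` if `v` is valid UTF-8, otherwise the index of the first invalid byte.
pub fn first_non_utf8_index(v: &[u8]) -> Option<usize> {
    std::str::from_utf8(v).err().map(|e| e.valid_up_to())
}

/// Hypothetical `Result<_, uint>`-style variant: on failure the error
/// value is the byte index at which validation stopped.
pub fn from_utf8_indexed(v: &[u8]) -> Result<&str, usize> {
    std::str::from_utf8(v).map_err(|e| e.valid_up_to())
}
```

Both report only the first error. Returning every encoding error would require re-running validation on the bytes after each invalid sequence.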