P2286R8 Formatting Ranges #2919

Closed
StephanTLavavej opened this issue Jul 25, 2022 · 0 comments · Fixed by #4825
Labels: cxx23 (C++23 feature), fixed (Something works now, yay!), format (C++20/23 format), ranges (C++20/23 ranges)

StephanTLavavej commented Jul 25, 2022

WG21-P2286R8 Formatting Ranges
WG21-P2585R1 Improve Default Container Formatting
WG21-P2713R1 Escaping Improvements In std::format
LWG-3631 basic_format_arg(T&&) should use remove_cvref_t<T> throughout (affects formattable concept)
LWG-3750 Too many papers bump __cpp_lib_format
LWG-3839 range_formatter's set_separator, set_brackets, and underlying functions should be noexcept
LWG-3881 Incorrect formatting of container adaptors backed by std::string
LWG-3892 Incorrect formatting of nested ranges and tuples
LWG-3925 Concept formattable's definition is incorrect

Feature-test macro:

#define __cpp_lib_format_ranges 202207L
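
As a rough sketch of how user code might test this macro: the helper below uses range formatting when `__cpp_lib_format_ranges` is at least `202207L` and falls back to a hand-written loop otherwise. The name `describe` is hypothetical and not part of the library; the fallback only imitates the default `[1, 2, 3]` style for `int` elements.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#include <version> // defines library feature-test macros when available

#if defined(__cpp_lib_format_ranges) && __cpp_lib_format_ranges >= 202207L
#include <format>
#endif

// Hypothetical helper (illustration only): formats a vector of ints.
std::string describe(const std::vector<int>& values) {
#if defined(__cpp_lib_format_ranges) && __cpp_lib_format_ranges >= 202207L
    // Range formatting is available: sequences print as "[a, b, c]" by default.
    return std::format("{}", values);
#else
    // Fallback that imitates the default range format by hand.
    std::string out = "[";
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) {
            out += ", ";
        }
        out += std::to_string(values[i]);
    }
    out += "]";
    return out;
#endif
}
```

Calling `describe({1, 2, 3})` yields `[1, 2, 3]` on either path.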
StephanTLavavej added the cxx23 (C++23 feature) label Jul 25, 2022
StephanTLavavej added the format (C++20/23 format) label Jul 25, 2022
CaseyCarter added the ranges (C++20/23 ranges) label Aug 31, 2022
CaseyCarter pushed a commit that referenced this issue May 18, 2023
This implements [\[format.string.escaped\]], which is part of WG21-P2286R8 "Formatting Ranges" and modified by WG21-P2713R1 "Escaping Improvements In `std::format`". Works towards #2919.

To implement this feature, two arrays, `__printable_ranges` and `_Grapheme_Extend_ranges`, are added to `__msvc_format_ucd_tables.hpp`.

- `__printable_ranges` represents code points whose [`General_Category`] is in the groups `L`, `M`, `N`, `P`, `S` (that is, code points that are *not* from categories `Z` or `C`), plus the ASCII space character.
  - Characters outside of these ranges are always escaped, usually using the `\u{hex-digit-sequence}` format. ([\[format.string.escaped\]/(2.2.1.2.1)])
  - It might make sense to store the unmodified `General_Category`, instead of this invented property. This requires more storage and a new data structure, though.
- `_Grapheme_Extend_ranges` represents code points with the Unicode property `Grapheme_Extend=Yes`.
  - Characters in these ranges are escaped unless they immediately follow an unescaped character. ([\[format.string.escaped\]/(2.2.1.2.2)])
  - It would be more space efficient to reuse the existing data for `Grapheme_Cluster_Break`: `Grapheme_Extend=Yes` is `Grapheme_Cluster_Break=Extend` minus `Emoji_Modifier=Yes`, and `Emoji_Modifier=Yes` is just `1F3FB..1F3FF`. I chose to define a new array for simplicity.
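
To illustrate the kind of lookup such range tables allow, here is a minimal sketch that binary-searches a sorted array of inclusive code point ranges. The names `code_point_range`, `sample_printable_ranges`, `in_ranges`, `is_emoji_modifier`, and `is_grapheme_extend` are hypothetical, and the table is a tiny hand-picked excerpt covering only part of the Latin-1 range; it is not the data or layout used in `__msvc_format_ucd_tables.hpp`.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical representation: an inclusive range of code points.
struct code_point_range {
    char32_t first;
    char32_t last;
};

// Illustrative excerpt only, sorted by `first`: ASCII space and printable
// ASCII, then Latin-1 without U+00A0 (category Zs) and U+00AD (category Cf).
inline constexpr code_point_range sample_printable_ranges[] = {
    {0x20, 0x7E},
    {0xA1, 0xAC},
    {0xAE, 0xFF},
};

// Returns true if `cp` falls inside one of the sorted, non-overlapping ranges.
template <std::size_t N>
bool in_ranges(const code_point_range (&table)[N], char32_t cp) {
    // Find the first range that starts after `cp`, then step back one.
    auto it = std::upper_bound(std::begin(table), std::end(table), cp,
        [](char32_t value, const code_point_range& r) { return value < r.first; });
    if (it == std::begin(table)) {
        return false; // `cp` precedes every range
    }
    --it;
    return cp <= it->last;
}

// Emoji_Modifier=Yes is exactly 1F3FB..1F3FF, as noted above.
constexpr bool is_emoji_modifier(char32_t cp) {
    return cp >= 0x1F3FB && cp <= 0x1F3FF;
}

// The space-saving alternative mentioned above: derive Grapheme_Extend from a
// Grapheme_Cluster_Break=Extend result (`gcb_extend`, supplied by the caller).
constexpr bool is_grapheme_extend(bool gcb_extend, char32_t cp) {
    return gcb_extend && !is_emoji_modifier(cp);
}
```

With this sample table, `in_ranges(sample_printable_ranges, U'A')` is true, while U+007F and U+00AD are reported as not printable and would therefore be escaped.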

When the literal encoding is not UTF-8, UTF-16, or UTF-32, the set of "separator or non-printable characters" is implementation-defined. In this implementation, the set consists of all characters that correspond to non-printable Unicode code points (that is, code points outside of `__printable_ranges`, see above). A non-printable character is translated into `\u{XXXX}`, where `XXXX` is the hex value of the corresponding Unicode code point, not the character's original value in the literal encoding.

If a code unit sequence cannot be converted to a Unicode scalar value, the `\x{XX}` escape sequence is used.
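
The two escape forms can be sketched as follows. This is not the code from the PR; `hex_escape`, `escape_code_point`, and `escape_code_unit` are hypothetical names, and the sketch assumes the shortest lowercase hexadecimal representation, which is what `std::to_chars` produces for base 16.

```cpp
#include <charconv>
#include <string>
#include <string_view>

// Hypothetical helper: builds prefix + "{" + lowercase hex digits + "}".
std::string hex_escape(std::string_view prefix, unsigned long value) {
    char digits[16]; // more than enough for the hex digits of any value used here
    const auto result = std::to_chars(digits, digits + sizeof digits, value, 16);
    std::string out(prefix);
    out += '{';
    out.append(digits, result.ptr); // to_chars emits no leading zeros
    out += '}';
    return out;
}

// Non-printable code point: \u{hex-digit-sequence}
std::string escape_code_point(char32_t cp) {
    return hex_escape("\\u", cp);
}

// Code unit that is not part of a valid sequence: \x{hex-digit-sequence}
std::string escape_code_unit(unsigned char unit) {
    return hex_escape("\\x", unit);
}
```

For example, `escape_code_point(0x200B)` yields `\u{200b}`, and `escape_code_unit(0xC3)` yields `\x{c3}`.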

[`General_Category`]: https://www.unicode.org/reports/tr44/#GC_Values_Table
[\[format.string.escaped\]]: http://eel.is/c++draft/format.string.escaped
[\[format.string.escaped\]/(2.2.1.2.1)]: http://eel.is/c++draft/format.string.escaped#2.2.1.2.1
[\[format.string.escaped\]/(2.2.1.2.2)]: http://eel.is/c++draft/format.string.escaped#2.2.1.2.2
JMazurkiewicz added a commit to JMazurkiewicz/STL that referenced this issue Jul 30, 2023
* This PR implements debug-enabled standard `formatter` specializations,
* Drive-by: rename tests added in microsoft#3656 - use `text_formatting_` prefix instead of `formatting_ranges_` (for consistency),
* Towards microsoft#2919.
StephanTLavavej moved this from Done to Available in STL C++23 Features Mar 24, 2024
StephanTLavavej moved this from Available to Reviewing PR in STL C++23 Features Jul 10, 2024
github-project-automation bot moved this from Reviewing PR to Done in STL C++23 Features Aug 15, 2024
StephanTLavavej added the fixed (Something works now, yay!) label Aug 15, 2024