Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix strformat precision handling for strings #7941

Conversation

skilchen
Copy link
Contributor

@skilchen skilchen commented Jun 1, 2018

Another attempt to bring the precision handling for strings closer to what python does.
The relevant part of the documentation is almost a literal copy from python:

python:

The precision is a decimal number indicating how many digits should be displayed 
after the decimal point for a floating point value formatted with 'f' and 'F', or before 
and after the decimal point for a floating point value formatted with 'g' or 'G'. 
For non-number types the field indicates the maximum field size - in other words, 
how many characters will be used from the field content. The precision is not allowed 
for integer values.

Nim:

The 'precision' is a decimal number indicating how many digits should be displayed
after the decimal point in a floating point conversion. For non-numeric types the
field indicates the maximum field size - in other words, how many characters will
be used from the field content. The precision is ignored for integer conversions.

I think it makes sense to also copy python's behavior in this case. The possibility to limit the number of characters in a formatted string is useful to produce fixed size columns in a simple textual report.

This example:

import strformat

var str = "abc"
echo fmt"'>7.1 :: {str:>7.1}'", " == '>7.1 ::       a'"
echo fmt"'>7.2 :: {str:>7.2}'", " == '>7.2 ::      ab'"
echo fmt"'>7.3 :: {str:>7.3}'", " == '>7.3 ::     abc'"
echo fmt"'>7.9 :: {str:>7.9}'", " == '>7.9 ::     abc'"
echo fmt"'>7.0 :: {str:>7.0}'", " == '>7.0 ::        '"
echo fmt"' 7.1 :: {str:7.1}'", " == ' 7.1 :: a      '"
echo fmt"' 7.2 :: {str:7.2}'", " == ' 7.2 :: ab     '"
echo fmt"' 7.3 :: {str:7.3}'", " == ' 7.3 :: abc    '"
echo fmt"' 7.9 :: {str:7.9}'", " == ' 7.9 :: abc    '"
echo fmt"' 7.0 :: {str:7.0}'", " == ' 7.0 ::        '"
echo fmt"'^7.1 :: {str:^7.1}'", " == '^7.1 ::    a   '"
echo fmt"'^7.2 :: {str:^7.2}'", " == '^7.2 ::   ab   '"
echo fmt"'^7.3 :: {str:^7.3}'", " == '^7.3 ::   abc  '"
echo fmt"'^7.9 :: {str:^7.9}'", " == '^7.9 ::   abc  '"
echo fmt"'^7.0 :: {str:^7.0}'", " == '^7.0 ::        '"
str = "äöüe\u0309\u0319\u035Co\u0309"
echo fmt"'^7.1 :: {str:^7.1}'", " == '^7.1 ::    ä   '"
echo fmt"'^7.2 :: {str:^7.2}'", " == '^7.2 ::   äö   '"
echo fmt"'^7.3 :: {str:^7.3}'", " == '^7.3 ::   äöü  '"
echo fmt"'^7.0 :: {str:^7.0}'", " == '^7.0 ::        '"
echo "what follows is actually wrong, but the unicode"
echo "module has no support for graphemes"
echo fmt"'^7.4 :: {str:^7.4}'", " == '^7.4 ::  äöüe  '"
echo fmt"'^7.9 :: {str:^7.9}'", " == '^7.9 :: äöüe\u0309\u0319\u035Co\u0309'"
echo "ideally it should produce"
echo "'^7.4 ::  äöüe\u0309  '", " == '^7.4 ::  äöüe\u0309  '"
echo "'^7.9 ::  äöüe\u0309\u0319\u035Co\u0309 '", " == '^7.9 ::  äöüe\u0309\u0319\u035Co\u0309 '"

produces

'>7.1 ::       a' == '>7.1 ::       a'
'>7.2 ::      ab' == '>7.2 ::      ab'
'>7.3 ::     abc' == '>7.3 ::     abc'
'>7.9 ::     abc' == '>7.9 ::     abc'
'>7.0 ::        ' == '>7.0 ::        '
' 7.1 :: a      ' == ' 7.1 :: a      '
' 7.2 :: ab     ' == ' 7.2 :: ab     '
' 7.3 :: abc    ' == ' 7.3 :: abc    '
' 7.9 :: abc    ' == ' 7.9 :: abc    '
' 7.0 ::        ' == ' 7.0 ::        '
'^7.1 ::    a   ' == '^7.1 ::    a   '
'^7.2 ::   ab   ' == '^7.2 ::   ab   '
'^7.3 ::   abc  ' == '^7.3 ::   abc  '
'^7.9 ::   abc  ' == '^7.9 ::   abc  '
'^7.0 ::        ' == '^7.0 ::        '
'^7.1 ::    ä   ' == '^7.1 ::    ä   '
'^7.2 ::   äö   ' == '^7.2 ::   äö   '
'^7.3 ::   äöü  ' == '^7.3 ::   äöü  '
'^7.0 ::        ' == '^7.0 ::        '
what follows is actually wrong, but the unicode
module has no support for graphemes
'^7.4 ::  äöüe  ' == '^7.4 ::  äöüe  '
'^7.9 :: äöüẻ̙͜ỏ' == '^7.9 :: äöüẻ̙͜ỏ'
ideally it should produce
'^7.4 ::  äöüẻ  ' == '^7.4 ::  äöüẻ  '
'^7.9 ::  äöüẻ̙͜ỏ ' == '^7.9 ::  äöüẻ̙͜ỏ '

@kaushalmodi
Copy link
Contributor

I think it makes sense to also copy python's behavior in this case.

+1. That makes sense. You get to implement a feature without a "surprise API" for people crossing over to Nim.

case spec.typ
of 's', '\0': discard
else:
raise newException(ValueError,
"invalid type in format string for string, expected 's', but got " &
spec.typ)
if spec.precision != -1:
if spec.precision < runelen(value):
value = value.runeSubstr(0, spec.precision)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if, instead of allocating a new string via slicing, this set the length of value instead.

@@ -558,12 +558,16 @@ proc format*(value: string; specifier: string; res: var string) =
## sense to call this directly, but it is required to exist
## by the ``&`` macro.
let spec = parseStandardFormatSpecifier(specifier)
var value = value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems the wrong place to fix it, shouldn't alignString be changed instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because alignString is also used for floats that are already stringified to the requested "precision". For floats you don't want to deal with "precision" in alignString.
IMHO alignString should just align strings and not modify the passed in string in any way.

@Varriount Varriount merged commit fd102f3 into nim-lang:devel Jun 4, 2018
@skilchen skilchen deleted the fix_strformat_precision_handling_for_strings branch June 15, 2018 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants