Fixed bug #50224 where float without decimals were converted to integer when encode to JSON #642
I think that this patch could introduce some regression as there is extra reallocation (spprintf is called twice). Maybe introducing another flag for spprintf would be a better solution. I think that this shouldn't be merged until you test encoding big arrays with many *.0 values. Maybe I'm wrong and the regression is minimal but it should be definitely tested IMHO... |
@bukka Do you think replacing |
I think that it would be great to do some perf testing to find out if there is any regression. The best way would be to create a big array (e.g. |
I ran the performance test: the patch increased the encoding time for floats with no decimal point by 23%. So I did a refactor, and this time the increase dropped to less than 10%. Memory use increased by 1%. You can see the execution here: https://gist.github.com/jrbasso/11101696 PS: I rebased onto the latest master to get the tests working on Travis.
@@ -630,6 +638,14 @@ PHP_JSON_API void php_json_encode(smart_str *buf, zval *val, int options TSRMLS_

    if (!zend_isinf(dbl) && !zend_isnan(dbl)) {
        len = spprintf(&d, 0, "%.*k", (int) EG(precision), dbl);
        if (strchr(d, '.') == NULL) {
            char *nd = (char *)emalloc(len + 3);
            strcpy(nd, d);
memcpy(nd, d, len);
I think memcpy should be a bit faster than strcpy and strcat, but double-check that it's correct (I wrote it quickly without thinking... :) ). Just wondering whether it helps or whether the compiler optimizes it anyway...
Could you also try this variant?
|
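For readers following along, here is a minimal sketch of the memcpy idea from the review comment above. It is an illustration only, not the variant posted in the thread (which is not preserved here). The helper name `format_double_keep_fraction` is hypothetical, plain `malloc` and `snprintf` with `%g` stand in for PHP's `emalloc` and `spprintf` with `%k`, and the extra check for an exponent is an assumption added so the sketch never produces something like `1e+20.0`.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of php-src.
 * Formats a finite double and appends ".0" when the text has neither a
 * decimal point nor an exponent. Returns a malloc'd string that the caller
 * must free, or NULL on error. Assumes the "C" locale, where the decimal
 * separator is '.'.
 */
char *format_double_keep_fraction(double value, int precision)
{
    char tmp[64];
    char *out;
    int len;
    int needs_suffix;

    if (!isfinite(value)) {
        return NULL; /* the original patch also skips inf and nan */
    }

    len = snprintf(tmp, sizeof tmp, "%.*g", precision, value);
    if (len < 0 || (size_t)len >= sizeof tmp) {
        return NULL; /* formatting failed or output was truncated */
    }

    /* Assumption: treat an exponent as "already a float" as well. */
    needs_suffix = (strpbrk(tmp, ".eE") == NULL);

    out = malloc((size_t)len + 3); /* room for ".0" and the terminator */
    if (out == NULL) {
        return NULL;
    }

    /* The length is already known, so memcpy avoids rescanning the string. */
    memcpy(out, tmp, (size_t)len);
    if (needs_suffix) {
        memcpy(out + len, ".0", 2);
        len += 2;
    }
    out[len] = '\0';
    return out;
}
```

With a precision of 17, calling the helper with `10.0` yields `10.0`, while `1.5` stays `1.5`. Whether memcpy is measurably faster than strcpy plus strcat here depends on the compiler, as the comment above already suspects.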
I tried using the Summary:
I also ran a test with float value = 1.1 (only the extra
|
PS: I tried to use |
I was trying to optimize further and found a way to do so by replacing the original I didn't commit it because it changes the original implementation and goes beyond the bug fix. Also, I would like your approval before changing it. All tests are passing. In terms of performance, with this suggested change it results in Thoughts? |
@jrbasso I'd add a comment there stating where 2048 comes from. Otherwise, it looks ok to me. |
The new function is faster and makes the decimal point easier to be added
I pushed the change. I made the 2048 a constant with a reference for the source. |
Nice! Looks good to me too. ;) I think that 2048 is a bit too much for double conversion. It's taken from the Apache implementation, where it was chosen to cover all possible numeric conversions, but it will never be filled in this case IMHO... I was actually thinking about a similar implementation for jsond yesterday. I'm thinking of going a bit further and re-implementing One last note. The patch can change the checksum of the generated JSON string, so I'm not sure if it should go below 5.6. It's up to the RM to decide. I'm sure that Stas will decide wisely. ;) |
I can reduce the variable size if you guys feel comfortable. I also think 2048 is too much, but I followed that number as it is used in the original implementation and there is a comment saying it can't be smaller. @bukka About the invalid JSON, it is not fully true. Before the call to |
@jrbasso Yeah, you're right. I missed that part. :) In that case using Btw. the patch also fixes the annoying warning about |
You are welcome. :) It adds another warning of using Nice to hear about the optimized version on PECL, maybe it can be a part of the core in the future. :) |
That looks like a bug. You are passing pointer to array but it should be a pointer to char (pointer to first element in the array in this case - |
It's actually not a bug for gcc because the resulted address is the same but it suppresses the warning ;) |
Good point. I updated the code to fix it. Thanks. |
Hey, I finally got time to merge it into jsond in bukka/php-jsond@118b0ab . I changed it slightly and set a different length for the buffer. The optimal size should be |
@bukka That's cool. Do you think I should update this PR too? |
Even though they are a standard part of |
@bukka True, I didn't take that into consideration ;-) |
Please confirm with 5.6 RM as 5.6 is in freeze now and pretty close to the release point. |
@bukka and @smalyshev So what needs to be done with this PR? Change the code to use the |
@jrbasso I just committed bukka/php-jsond@19e14ee to jsond and added such ifdefs. All main platforms support |
You might notice that the default value is 1080. Not sure where I got 1089 - I just tested the value and it's 1077 on my platform and max value that I googled was 1079 so it should never be higher than 1080. I'm almost sure that if it does, the constants will be defined |
I changed the code to use |
After giving this some thought, I think the best course of action would be to introduce a new option constant for json_encode. |
sounds fine.
|
@Tyrael I disagree with adding an option for this. Imho it's pretty clear that the old behavior is a bug and as such I don't see reason to preserve it. If such an option is added it should be the default (at which point the option won't even help with BC concerns for tests comparing json_encode output.) |
I put the original code back when the option is not being used.
…PART is disabled.
I added the option to the code. It is up to you guys to decide whether we keep it or not. Reverting the commits is easy. |
@nikic I don't think that we can really call it a bug. JavaScript doesn't have separate types for integers and floats, and neither does the JSON spec: both only talk about numbers, which can have optional fraction parts. |
As I said, I don't think that it's a bug, exactly for the reason that Ferenc noted: the JSON spec does not specify a float type, and as such the value is correctly converted to a number. However, I think that it's useful (mainly for symmetrical encryption/decryption). Also, JS engines internally store numbers either as double or int, so I understand why someone could consider it a bug. The additional constant seems reasonable due to the BC issue for a minor version. However, I think that it should be discussed on @Internals, and if there are still objections from Nikita or others, then we should have an RFC. |
@bukka @Tyrael @nikic @smalyshev Do we have a consensus here or should I bring this discussion to internals' list? |
@bukka @Tyrael @nikic @smalyshev Any news? It has been almost 3 months since the last comment. What is the direction to take from here? |
I think we can have it in 5.6 as an option. Please drop a note on internals@; if there are no objections, then I'd merge it into 5.6. |
For the record I'm fine with having it in 5.6. |
I think that it would be a good idea to have it as default for PHP 7. If it becomes default, then this constant will be useful only for disabling it. However it means to do something like |
P.S. That doesn't mean that I'm for merging as default now. It's a BC break when it's default. I just think that there is not such a big need that we have to merge it now. If it could wait so long I think it can wait a bit longer (till PHP 7). Cheers. |
Thread created on internals list. http://marc.info/?l=php-internals&m=141507087629656&w=2 |
This is based on bug (feature request) PHP#50224 implemented in php/php-src#642
#if defined(DBL_MANT_DIG) && defined(DBL_MIN_EXP)
#define NUM_BUF_SIZE (3 + DBL_MANT_DIG - DBL_MIN_EXP)
#else
#define NUM_BUF_SIZE 1080
Is that not a tad, er, excessive?
@TazeTSchnitzel according to http://tigcc.ticalc.org/doc/float.html, when the constants are defined the value is 3 + 16 - -999 = 1018.
There is research from @bukka, which he explains in this comment:
You might notice that the default value is 1080. Not sure where I got 1089 - I just tested the value and it's 1077 on my platform and max value that I googled was 1079 so it should never be higher than 1080. I'm almost sure that if it does, the constants will be defined float.h so it won't be a problem...
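As a worked example of the arithmetic discussed here: on platforms where `double` is IEEE 754 binary64, `DBL_MANT_DIG` is 53 and `DBL_MIN_EXP` is -1021, so the macro evaluates to 3 + 53 + 1021 = 1077, which matches the value reported in the quoted comment. The sketch below repeats the macro from the diff (with the closing `#endif` that the hunk cuts off) and wraps it in a hypothetical accessor, `num_buf_size`, purely so the value can be inspected at run time.

```c
#include <float.h>

/* Same definition as in the diff above; 1080 is the fallback from the PR. */
#if defined(DBL_MANT_DIG) && defined(DBL_MIN_EXP)
#define NUM_BUF_SIZE (3 + DBL_MANT_DIG - DBL_MIN_EXP)
#else
#define NUM_BUF_SIZE 1080
#endif

/* Hypothetical accessor, for illustration only: exposes the computed size. */
int num_buf_size(void)
{
    return NUM_BUF_SIZE;
}
```

On other floating-point formats the constants differ, which is why the TI documentation linked above arrives at 1018 instead.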
We actually allocate less space on the stack than before this patch: spprintf (xbuf_format_converter) allocates 2048 plus additional space for other variables and the function stack. See http://lxr.php.net/xref/PHP_TRUNK/main/spprintf.c#xbuf_format_converter for more details. As I said, 1080 probably won't be used in any case, as DBL_MANT_DIG and DBL_MIN_EXP are almost always defined, so it will be mostly 1079 :). You might think that it's not necessary, as EG(precision) will always be smaller. Unfortunately we don't know its value at C compile time (dynamic allocation leads to worse perf, so we need to allocate on the stack). The only better way might be using alloca. I plan to experiment with that later to see if there is no perf penalty, but we need to have a fallback anyway for non-alloca platforms. That requires the max space for a double value, otherwise there would be a chance of a stack overflow...
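A minimal sketch of the trade-off described above, assuming nothing about the actual php-src code: the precision is only known at run time, so the buffer is declared on the stack with the compile-time worst-case size and the ".0" suffix is appended in place, with no heap allocation. The function name `write_double_on_stack` is hypothetical, and `snprintf` with `%g` stands in for PHP's own double conversion.

```c
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

#if defined(DBL_MANT_DIG) && defined(DBL_MIN_EXP)
#define NUM_BUF_SIZE (3 + DBL_MANT_DIG - DBL_MIN_EXP)
#else
#define NUM_BUF_SIZE 1080
#endif

/*
 * Hypothetical helper, not part of php-src.
 * Formats a finite double with a run-time precision into a fixed-size stack
 * buffer, appends ".0" when there is no decimal point or exponent, and
 * copies the result to out. Returns the string length, or -1 on error.
 */
int write_double_on_stack(char *out, size_t out_size, double value, int precision)
{
    char num[NUM_BUF_SIZE]; /* worst-case size chosen at compile time */
    int len;

    if (!isfinite(value) || precision < 1) {
        return -1;
    }

    /* Keep two bytes in reserve so ".0" always fits after the digits. */
    len = snprintf(num, sizeof num - 2, "%.*g", precision, value);
    if (len < 0 || (size_t)len >= sizeof num - 2) {
        return -1;
    }

    if (strpbrk(num, ".eE") == NULL) {
        num[len++] = '.';
        num[len++] = '0';
        num[len] = '\0';
    }

    if ((size_t)len >= out_size) {
        return -1; /* caller's buffer is too small */
    }
    memcpy(out, num, (size_t)len + 1); /* copy including the terminator */
    return len;
}
```

The buffer is far larger than any realistic precision needs, which is the point raised in the review; the fixed size simply removes the need for either a heap allocation or alloca.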
If this was C99 we could do dynamic stack allocations. Alas.
Btw, the constants are defined on Windows, Linux and Mac, so it will fall back to the hardcoded value only in rare cases.
merged |
Refs #635