BigDecimalParser.parse very slow #1153

Closed
sirnple opened this issue Dec 5, 2023 · 4 comments
Labels
performance Issue related to performance problems or enhancements

Comments

@sirnple

sirnple commented Dec 5, 2023

jackson-core version: 2.15.1

I have code like this, that can't finish in time:

BigDecimal result = BigDecimalParser.parse("\u000133000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0017\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u0017$\u0017\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\t\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000z\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0017\u0017\u0017\u0017\u0017nn\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000nn\u0017\u0017\u0017\u0017\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u000
0\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0001 \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0001\u0000\u0000|\u0000\u0000\u0000\u0000\u0000000000000000000002222222222330000000000000000000000000000003330000000003333322222222222222222222222222222222222222222222222222222222330000000000000000222222222222222222222222\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0017\u0017\u0017\u0017\u0017\u0000\u0000\u0000\u0000\u0000\u0000\u0017\u0017\u0017\u0017\u0017\u0017\u0017\u00000\u0017+.2222222222222222222222222222222222E222222242");

I have checked the issue list and found many performance improvements, but my input still triggers a slow parse.

Additional: this input was produced by a fuzz test.

@pjfanning
Member

pjfanning commented Dec 5, 2023

Parsing BigDecimals is slow. Performance is much worse than O(n). There is not much that can be done to make it fast. Use your favourite search engine to seek out the literature.

You might find that BigDecimalParser.parseWithFastParser is a little faster.

BigDecimalParser.parse uses the built-in Java BigDecimal code while BigDecimalParser.parseWithFastParser uses https://github.com/wrandelshofer/FastDoubleParser

You might find that BigDecimalParser.parse is faster with the latest Java releases (e.g. Java 21).
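One rough way to check how parse time grows with input length, and how it differs between Java releases, is to time the JDK's own `new BigDecimal(String)` on digit strings of doubling length. The sketch below uses only the JDK; the class name, helper names, and sizes are illustrative assumptions, and it is not a proper benchmark (no warm-up, no JMH), so treat the numbers as indicative only.

```java
import java.math.BigDecimal;

class BigDecimalParseTiming {
    // Builds a valid decimal string made of `digits` copies of '7'.
    static String digitString(int digits) {
        return "7".repeat(digits);
    }

    // Parses the input once and returns the elapsed wall-clock time in nanoseconds.
    static long timeParseNanos(String input) {
        long start = System.nanoTime();
        BigDecimal value = new BigDecimal(input);
        long elapsed = System.nanoTime() - start;
        // Use the result so the parse cannot be optimized away.
        if (value.signum() < 0) {
            throw new IllegalStateException("unexpected sign");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Doubling the length each round makes superlinear growth easy to spot.
        for (int digits = 10_000; digits <= 160_000; digits *= 2) {
            long nanos = timeParseNanos(digitString(digits));
            System.out.printf("%d digits: %.2f ms%n", digits, nanos / 1e6);
        }
    }
}
```

If the cost is worse than O(n), each doubling of the digit count should more than double the reported time. Running the same class on different JDKs gives a crude comparison between releases.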

BigDecimalParser is really an internal Jackson class and it is not recommended that it is used directly.

Jackson is built to parse and write JSON (and, via modules, some other formats like XML and CSV). If you use a Jackson ObjectMapper, the most recent versions of Jackson limit the size of the numbers it will parse. This limit can be tweaked, but the reason for capping the number of chars that Jackson will parse for a number is precisely the risk that a malicious JSON file could contain a very long number string.
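The idea behind such a limit can be sketched with the JDK alone: reject over-long input before doing any expensive parsing. This is not Jackson's implementation; `BoundedNumberParser`, `parseBounded`, and the cap of 1000 chars are hypothetical names and values chosen for illustration. In Jackson itself I believe the limit is configured through `StreamReadConstraints`, which this sketch does not use.

```java
import java.math.BigDecimal;

class BoundedNumberParser {
    // Illustrative cap, not a value read from any Jackson configuration.
    static final int MAX_NUMBER_LENGTH = 1000;

    // Rejects over-long input before any expensive parsing work is done.
    static BigDecimal parseBounded(String text) {
        if (text.length() > MAX_NUMBER_LENGTH) {
            throw new NumberFormatException(
                "Number length " + text.length() + " exceeds limit " + MAX_NUMBER_LENGTH);
        }
        return new BigDecimal(text);
    }

    public static void main(String[] args) {
        System.out.println(parseBounded("12.5e3"));
        try {
            parseBounded("9".repeat(MAX_NUMBER_LENGTH + 1));
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The length check is O(1), so a hostile input costs almost nothing to refuse, whatever the complexity of the parser behind it.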

@pjfanning
Member

@sirnple I think you have found a bug in BigDecimalParser.parse. I tried the fuzz value with new BigDecimal, BigDecimalParser.parseWithFastParser, etc. and these fail quickly due to this being an invalid number.

BigDecimalParser.parse is much slower. That code breaks up the string into smaller pieces but this string causes issues.
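To illustrate what "fail quickly" can look like for an invalid number, here is a minimal JDK-only sketch that does one linear scan for illegal characters before handing the string to a parser. It is not taken from Jackson's source and is not a description of any fix made there; `FailFastDecimal`, `hasOnlyNumberChars`, and `parseOrFail` are hypothetical names, and the sample string is a shortened stand-in for the fuzz input.

```java
import java.math.BigDecimal;

class FailFastDecimal {
    // Linear scan: true only if every char is an ASCII digit, sign, dot, or exponent marker.
    // This is a cheap necessary condition, not a full grammar check.
    static boolean hasOnlyNumberChars(String text) {
        if (text.isEmpty()) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean ok = (c >= '0' && c <= '9')
                || c == '+' || c == '-' || c == '.' || c == 'e' || c == 'E';
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical wrapper: reject obviously invalid input before parsing.
    static BigDecimal parseOrFail(String text) {
        if (!hasOnlyNumberChars(text)) {
            throw new NumberFormatException("Not a valid number (illegal character)");
        }
        // The JDK parser still rejects malformed inputs such as "1..2".
        return new BigDecimal(text);
    }

    public static void main(String[] args) {
        // Shortened stand-in for the fuzz value: control chars mixed with digits.
        String fuzzLike = "\u000133000\u0000\u0000\u0017+.2222E222222242";
        try {
            parseOrFail(fuzzLike);
        } catch (NumberFormatException e) {
            System.out.println("rejected quickly: " + e.getMessage());
        }
    }
}
```

The scan stops at the first illegal character, so inputs like the one reported here are refused in time proportional to the position of that character rather than to the cost of a full parse.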

@cowtowncoder
Member

Ok, so there may be an issue somewhere, but as @pjfanning said, BigDecimalParser is NOT part of the public API -- it is not to be used directly by code outside jackson-core, but only indirectly via the parsing API.
So in that sense the reproduction is not useful in itself. I may close the issue in the near future.

But if there is a way to trigger the issue through the API, that'd be a different story.
So leaving this open for the moment.

@cowtowncoder cowtowncoder added the performance Issue related to performance problems or enhancements label Dec 6, 2023
@cowtowncoder
Member

Since I think this issue itself is not necessarily valid (wrt passing invalid numbers), I'll close it -- but it did spawn #1157 to address a possibly relevant case, so it's good this was filed.

@cowtowncoder cowtowncoder reopened this Dec 13, 2023
@cowtowncoder cowtowncoder closed this as not planned Dec 13, 2023

3 participants