fix: memory leak in large files with lots of errors #180
🧰 Changes
This resolves a memory leak in how we handle very large, invalid API definitions that contain a lot of errors. For context, running validation against this spec right now either takes upwards of several minutes or crashes outright from memory exhaustion.
Upon investigation I discovered that our @readme/better-ajv-errors library uses @babel/code-frame, which, when given a large file and a large number of errors, runs code highlighting against that large file for every error it generates a code frame for. For example: with this spec, which has upwards of 1,000 errors, rendering only the first 20 took 27 seconds with this code highlighting work. Without it? 2 seconds.
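For illustration, here's a minimal sketch of why the highlighting dominates, assuming better-ajv-errors builds its frames through @babel/code-frame's `codeFrameColumns`; the error shape and function name are illustrative, not the library's actual internals:

```ts
import { codeFrameColumns } from '@babel/code-frame';

// Illustrative error shape — not what the library actually uses internally.
interface ValidationError {
  line: number;
  column: number;
  message: string;
}

function renderCodeFrames(rawSpec: string, errors: ValidationError[], highlight: boolean): string[] {
  return errors.map(err =>
    // With `highlightCode: true`, each call re-runs syntax highlighting over
    // the entire source string, so total work scales with
    // (spec size × number of errors) — hence the 27s vs 2s difference above.
    codeFrameColumns(
      rawSpec,
      { start: { line: err.line, column: err.column } },
      { highlightCode: highlight, message: err.message },
    ),
  );
}
```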
I've updated @readme/better-ajv-errors in readmeio/better-ajv-errors#166 to no longer run code highlighting on these files, as we're always handling them as JSON and highlighting everything as green adds nothing of value (the `highlightCode` flag in the sketch above is the lever). Even with that fix, however, this parsing library would still hit memory leaks.
To resolve this I've decided to cap the number of errors we render for very large specs: if the spec in question has a stringified length of 5,000,000 characters or more (a little less than half the size of the large spec above), we only render the first 20 error messages. Validation now completes in these situations.
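As a sketch of that gating logic (the constant and function names here are hypothetical, not necessarily what this PR uses):

```ts
// Hypothetical names; the thresholds match the ones described above.
const LARGE_SPEC_THRESHOLD = 5_000_000; // stringified spec length
const MAX_RENDERED_ERRORS = 20;

function capErrors<T>(stringifiedSpec: string, errors: T[]): T[] {
  // Very large specs only get their first 20 errors rendered; everything
  // else keeps the full list.
  return stringifiedSpec.length >= LARGE_SPEC_THRESHOLD
    ? errors.slice(0, MAX_RENDERED_ERRORS)
    : errors;
}
```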
Additionally, since we are not showing the full list of errors in these situations, I've added a new message to the bottom of the errors we do show, informing the user that they have additional errors that they will see once they re-run validation.
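Something along these lines, again with hypothetical naming and message wording:

```ts
// Hypothetical helper and wording for the trailing notice.
function truncationNotice(totalErrors: number, shownErrors: number): string {
  const remaining = totalErrors - shownErrors;
  return `Plus an additional ${remaining} errors. Fix these errors and re-run validation to see more.`;
}
```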
I don't really love this error handling work, because showing errors for this large spec still takes 10 seconds and we're not even showing everything, but thankfully(?) this situation will only be experienced by folks with very large API definitions that have a lot of errors.