[5.7] Cache distinct validation rule data #26509
Conversation
+1 on improving Laravel for large dataset processing. I'll be starting 2019 with a project that will expose an API that can take 1,000 rows at a time and insert them into the database. I haven't worked with distinct yet, but I already had to implement a BatchUnique and a BatchExists rule for performance reasons, even though the project is still just a prototype.
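BatchUnique and BatchExists are this commenter's own rules and their code isn't shown; a sketch of how a batch exists check along these lines could work (the class, column handling, and message are all assumptions) is to replace the built-in rule's per-row query with a single whereIn() query:

```php
use Illuminate\Contracts\Validation\Rule;
use Illuminate\Support\Facades\DB;

class BatchExists implements Rule
{
    protected $table;
    protected $column;

    public function __construct($table, $column)
    {
        $this->table = $table;
        $this->column = $column;
    }

    public function passes($attribute, $value)
    {
        // $value is the whole array of incoming values, so one query
        // covers every row instead of one query per row.
        $found = DB::table($this->table)
            ->whereIn($this->column, $value)
            ->distinct()
            ->count($this->column);

        return count(array_unique($value)) === $found;
    }

    public function message()
    {
        return 'One or more of the given values does not exist.';
    }
}
```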
This sounds like a huge assumption. How would this work with something like this?
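(The snippet referenced here wasn't preserved. Judging from the reply below, it was along these lines: validate, change the underlying state, then validate again, expecting the second run to fail. All names and rules below are illustrative, not the commenter's original code.)

```php
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Validator;

$input = ['users' => [['email' => 'a@example.com'], ['email' => 'b@example.com']]];

$validator = Validator::make($input, ['users.*.email' => 'distinct']);

$validator->passes();                         // first run builds the data set

DB::table('users')->insert($input['users']);  // state changes in between

$validator->passes();                         // would cached data now be stale?
```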
This does not affect the unique rule, only the distinct rule, so your pseudo code will succeed since distinct doesn't check the database. I share your concern, though, and that's why the cached data is reset every time you validate.
And what if the user isn't calling $validator->passes()? What if they are using the trait directly? If the Validator is responsible for the lifetime of the cache (and clearing it), then it would make more sense to put the caching logic in the Validator class, not the ValidatesAttributes trait. There's nothing in the code that makes sure users of the trait properly reset the cache when needed.
I get your point. What if we move the cache reset into the Validator class instead? What do you think about this?
I think it would look cleaner if you split extractDistinctValues out of validateDistinct in the trait, and then override that method in the Validator implementation. That way there are no conditionals in the trait about any possible cache, and the cache is handled entirely by code in the Validator class.
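A minimal sketch of that suggestion (method and property names are assumptions, and the extraction body is stubbed out, so this is not the framework's actual code): the trait exposes an uncached extraction hook, and the Validator overrides it with a memoized version whose lifetime it controls.

```php
trait ValidatesAttributes
{
    public function validateDistinct($attribute, $value, $parameters)
    {
        // Compare against the values returned by an overridable hook.
        // (The real rule also excludes the current attribute from the set.)
        return ! in_array($value, $this->extractDistinctValues($attribute));
    }

    // No caching here: users of the bare trait get the plain behaviour.
    protected function extractDistinctValues($attribute)
    {
        return []; // stand-in for walking $this->data for sibling values
    }
}

class Validator
{
    use ValidatesAttributes {
        extractDistinctValues as traitExtractDistinctValues;
    }

    /** Cached distinct data sets, keyed by attribute. */
    protected $distinctValues = [];

    // Memoizing override; the trait itself stays free of cache conditionals.
    protected function extractDistinctValues($attribute)
    {
        if (! array_key_exists($attribute, $this->distinctValues)) {
            $this->distinctValues[$attribute] = $this->traitExtractDistinctValues($attribute);
        }

        return $this->distinctValues[$attribute];
    }

    public function passes()
    {
        // Reset here so the cache never outlives a single validation run.
        $this->distinctValues = [];

        return true; // stand-in for actually running the rules
    }
}
```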
I'm not sure about that change. I understand that it looks cleaner, but when using the trait directly it means you can't use caching unless you implement it yourself again. With the current solution you would only have to add a property to enable the caching, with no code duplication.
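A sketch of this property-based alternative (names again assumed): the trait checks whether the consuming class has declared the cache property, so bare trait users keep the uncached behaviour and opt in by adding a single property.

```php
trait ValidatesAttributes
{
    protected function getDistinctValues($attribute)
    {
        // No opt-in property declared: plain, uncached behaviour.
        if (! property_exists($this, 'distinctValues')) {
            return $this->extractDistinctValues($attribute);
        }

        // Property present: memoize per attribute for this validation run.
        if (! array_key_exists($attribute, $this->distinctValues)) {
            $this->distinctValues[$attribute] = $this->extractDistinctValues($attribute);
        }

        return $this->distinctValues[$attribute];
    }
}
```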
While validating 1,000 records with the distinct rule I noticed a performance impact from this rule. Each time the rule runs, the same 'distinct data set' is created again: we build an array of 999 nearly identical items 1,000 times.
Because the data is not changed during validation, I think we should be able to cache this data set for the first record and reuse it for all following records.
This will be a huge performance improvement when validating bigger datasets.
Optionally, if you're not feeling comfortable with this being the default behaviour, we could implement it as configurable with a cached parameter?
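To make the cost concrete, a rough illustration (the payload shape and rule are made up for the example):

```php
use Illuminate\Support\Facades\Validator;

$rows = [];
for ($i = 0; $i < 1000; $i++) {
    $rows[] = ['email' => "user{$i}@example.com"];
}

$validator = Validator::make(['users' => $rows], [
    'users.*.email' => 'distinct',
]);

// Without caching, each of the 1,000 distinct checks rebuilds the
// comparison array of the other 999 values; with a per-run cache the
// data set is extracted once and reused for every row.
$validator->passes();
```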