As noted in the PPD performance investigation, the PPD application could be optimized for queries that return large result sets.
Test case: an interactive PPD query via the search form that could return a large number of results, for example all properties in Birmingham, with the "all results" option selected.
Current process: the app requests all results from the API/data store (which for the above example takes ~100s on a larger machine). However, it renders only the first 5,000 results to HTML, then scans the remainder to count the total transactions and properties it needs to render its response.
Proposed alternative: request with a limit of 5,000. If exactly 5,000 results come back (meaning more may be available), issue a separate count query with a higher limit (e.g. 1,000,000) to complete the render.
We would need to verify the time taken by the count queries, but it does appear that a significant part of the cost of PPD queries is the transport cost for large result sets. This would also require support for counting both properties and transactions in a single call.
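The proposed flow could be sketched roughly as follows. This is only an illustration, not the app's actual code: `run_query`, `run_count`, `Counts`, `RenderData` and the `property_id` row key are all hypothetical names, and the sketch assumes each returned row is one transaction and that a single count call can return both totals.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

RENDER_LIMIT = 5_000      # rows rendered to HTML, per the proposal
COUNT_LIMIT = 1_000_000   # example upper bound for the count query


@dataclass
class Counts:
    transactions: int
    properties: int


@dataclass
class RenderData:
    rows: Sequence[dict]
    counts: Counts
    truncated: bool  # True when more results exist than were fetched


def fetch_for_render(
    run_query: Callable[[int], Sequence[dict]],  # hypothetical: limit -> rows
    run_count: Callable[[int], Counts],          # hypothetical: limit -> both totals
    render_limit: int = RENDER_LIMIT,
    count_limit: int = COUNT_LIMIT,
) -> RenderData:
    # Fetch only as many rows as the HTML view will render.
    rows = run_query(render_limit)

    if len(rows) < render_limit:
        # The whole result set came back, so count locally
        # and skip the second round trip.
        counts = Counts(
            transactions=len(rows),
            properties=len({row["property_id"] for row in rows}),
        )
        return RenderData(rows, counts, truncated=False)

    # Exactly render_limit rows: there may be more, so ask the
    # data store for the totals instead of transporting every row.
    counts = run_count(count_limit)
    return RenderData(rows, counts, truncated=counts.transactions > render_limit)
```

The count query is issued only when the first request hits the limit, so small queries still need a single call.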
Notes:
In the current implementation, if you select "at most N results" it works like this: it issues a separate count query, but that query counts only transactions and limits the number scanned to 10k (for performance reasons).
When downloading all data, the 5,000 limit would need to be removed; the proposal applies only to the HTML-rendered form.
This is not a proposal to embark on this now; it is an option for future improvement that would need funding.