Allow resolution of Data View without resolving all fields #139340
Comments
Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)
Pinging @elastic/kibana-app-services (Team:AppServicesSv)
It is, but I'd like to have a thorough understanding before this is implemented. Generally speaking, we expect field list loading to be fast, so I'm curious about the cases where this isn't true. What is the priority on this? Is it tied to any high-priority items?
The In those situations the
@weltenwort Are the speed concerns still an issue with the current state of things? As far as I know, there's no way it should be taking tens of seconds. Taking a step back, I'm happy to provide data views with async field loading, but I want to make sure I understand our current limitations.
@weltenwort Which version of the stack is that? I might be interested to hear other relevant details: what does 'lightly loaded' mean? How many fields? I'm being stubborn about this because ideally we'd be seeing sub-second responses.
@dnhatn I noticed your work on benchmarks for the field caps API. Do we have a better idea of what we can expect performance-wise?
Wondering if @pugnascotia's ES tracing work might help confirm what we're waiting on for those 3.3s 🤔
It would at least give you an idea of what tasks are being executed.
The clusters are managed by the observability dev productivity team's tooling. These are the details I could find, where "production" is the cluster that my Kibana instance runs on and "remote" is the cluster that is accessed via CCS:
- production cluster
- remote cluster
Is there a way we can enable tracing on those clusters in a non-destructive way?
I think 3 seconds is possible if the cluster has 1000+ indices. We have another optimization in elastic/elasticsearch#86323. However, it's still unmerged. I will try to get it in this week. This optimization should reduce the latency to sub-second levels.
Right, this is not about a few seconds being too slow when the request hits that number of indices. It's about not being able to avoid it when loading a data view even when the component doesn't need the field list right away.
That sounds amazing, thank you.
I'm glad we had this discussion to help emphasize the importance of @dnhatn's optimization work.
Thanks for making this connection @mattkime. Please ping us whenever you hit this kind of problem around calling field_caps (or any other API, really), otherwise we don't even get to know that there are issues that you folks are looking to work around :)
So is the conclusion that we aim to make field resolution so fast that it's not an issue to resolve fields even when they're not needed at all times?
@miltonhultgren Yes, although these aren't necessarily mutually exclusive paths. What is the use case for loading a data view without the fields? I'd like to get into the details of what you're doing since I'll often learn something useful. Yes, I understand that initially you just need the index pattern and timestamp field, but I'd still like to learn more. It looks like the case that might have taken 3s will now take about 0.3s. Is 0.3s meaningful in this case? I'm unaware of time to load being optimized to this degree elsewhere. All the data view code assumes the field list exists once a DataView instance has been instantiated, so this would be a significant change. If we were to rewrite the data views code, I'd definitely defer loading the field list. I'm trying to figure out the priority of making this change.
We have two use cases today, one in Logs and one for a Lens table that shows host metrics. In Logs, we use the data view to resolve which indices to load logs from, and we use the timestamp field as a tie breaker for sorting (I think). In the Logs case we do want the fields, but at a later time, to suggest fields for filters or to change which fields to show from the log document; this doesn't need to block the initial page load. For the new Lens table, we don't need the fields at all, since we simply want to load the right metrics from the right index and no autocompletion needs to happen for that table (though later it might be filtered through unified search). In the rest of the Metrics UI we follow a similar pattern: initially we only load the metrics from the right index and defer the field resolution until a bit later when it's needed. So it really just boils down to wanting to defer work for later so that the initial render with useful data can happen quicker.
No, I don't think so.
Understood, I think we'd do best to wait and see how the optimization performs, especially in CCS setups with slower networks/remotes, and at what percentile we might see such load times. We'll also need to gather more accurate data on this, preferably from real deployments that are properly sized for the workload (the Edge cluster isn't).
Sounds good. I'll think about how we might do this as smaller efforts instead of one big push.
I have merged elastic/elasticsearch#86323. I think it should unblock the work here.
Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)
So the ask would be to e.g. add a param to dataViewsService.get(dataViewId) that allows getting the data view without resolving all fields, or to create a separate function for that.
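For illustration, a minimal sketch of what such an option could look like from the consumer side; the `fetchFields` option, the service shape, and the `refreshFields` call are assumptions for this sketch, not the actual data views API:

```ts
// Hypothetical sketch only: option and service shape are illustrative,
// not the real @kbn/data-views-plugin API.
interface DataViewLike {
  getIndexPattern(): string;
  timeFieldName?: string;
}

interface DataViewsServiceLike {
  // Assumed option: when fetchFields is false, skip the field_caps round trip
  // and return the data view with an empty field list.
  get(id: string, options?: { fetchFields?: boolean }): Promise<DataViewLike>;
  // Assumed helper to resolve the field list later, on demand.
  refreshFields(dataView: DataViewLike): Promise<void>;
}

async function loadForInitialRender(service: DataViewsServiceLike, id: string) {
  // Resolve only the index pattern and timestamp field up front.
  const dataView = await service.get(id, { fetchFields: false });

  // ...initial render uses dataView.getIndexPattern() and dataView.timeFieldName...

  // Later, when autocomplete or field filtering actually needs the field list:
  await service.refreshFields(dataView);
  return dataView;
}
```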
I thought I'd share a bit of experience from the field: we have now updated our production cluster to 8.6.1 with the latest field caps improvement (elastic/elasticsearch#86323). Unfortunately, field_caps performance is still not optimal for us. I also fear that the problem will get worse with TSDB and synthetic source. With the good compression ratio of TSDB combined with the planned primary shard cap at 200M documents (elastic/elasticsearch#87246), a single frozen node will be holding significantly more shards in the future. I would thus expect performance to deteriorate further.
Having frozen indices within the queried index pattern seems to be the problem. I think the solution should be to make sure the frozen indices are not matched by the `metricbeat-*` index pattern. Is this possible? Is something in the way?
I would expect that querying data on frozen nodes leads to a slowdown. However, the mere presence of data on frozen nodes outside of the queried time range should not have a performance impact; at least, that's what my team and I have assumed so far. We have Kubernetes and Prometheus metrics in that index pattern. Given that a field_caps query does not contain a time range parameter, I can somewhat see where the problem is coming from. However, as frozen nodes are not used for indexing new data, fields on them should be rather static and hopefully cacheable.
field_caps does support providing a time range filter, and runs the can_match phase to filter out irrelevant shards.
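For reference, a minimal sketch of such a filtered field_caps request using the Elasticsearch JS client; the node URL, index pattern, and time range below are placeholders:

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' }); // placeholder node

async function fieldCapsForRecentData() {
  // index_filter lets the can_match phase skip shards whose data falls
  // entirely outside the requested time range (e.g. old frozen shards).
  const response = await client.fieldCaps({
    index: 'metricbeat-*', // placeholder index pattern
    fields: '*',
    index_filter: {
      range: {
        '@timestamp': { gte: 'now-15m', lte: 'now' },
      },
    },
  });
  console.log(`${Object.keys(response.fields).length} fields resolved`);
}
```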
I suspect this is the main issue, as the performance improvements build on deduplication of mappings that share the same hash, which doesn't help if there are slight differences between the different indices. field_caps performance does not depend on the number of shards though, but rather on the number of indices with distinct mappings. It would be great to get more feedback here to see what we can improve further. Could you open an SDH around this?
Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)
Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)
Yes, we intend to do this; the next step in this direction will be
@miltonhultgren DataViewLazy has been partially implemented. Can you take a look and see if it's useful for your needs? Fields are only loaded as requested, potentially saving a lot of overhead compared to regular DataViews.
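For anyone evaluating this, a rough sketch of the consumer-side pattern described above; the method names (`getDataViewLazy`, `getFields`) are assumptions based on this comment and should be checked against the actual DataViewLazy API:

```ts
// Assumed shapes, for illustration only — verify against @kbn/data-views-plugin.
interface DataViewLazyLike {
  getIndexPattern(): string;
  timeFieldName?: string;
  // Fields are fetched on demand rather than during the initial get.
  getFields(options: { fieldName: string[] }): Promise<Record<string, unknown>>;
}

interface DataViewsStartLike {
  getDataViewLazy(id: string): Promise<DataViewLazyLike>;
}

async function renderThenLoadFields(dataViews: DataViewsStartLike, id: string) {
  // Cheap: enough to know which indices to query and which time field to sort by.
  const dataView = await dataViews.getDataViewLazy(id);
  const indices = dataView.getIndexPattern();
  const timeField = dataView.timeFieldName;

  // The expensive part runs only when a field list is actually needed,
  // e.g. to build filter suggestions.
  const fields = await dataView.getFields({ fieldName: ['*'] });
  return { indices, timeField, fields };
}
```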
I'm no longer involved in the apps where we used DataViews that led to me opening this issue. @weltenwort @neptunian Is this something that you guys could look at within the current Logs and Metrics code bases?
thanks for the pointer. we have #179128 to track its usage in the log threshold alert |
When calling `dataViewsService.get(dataViewId)`, the fields inside that data view are resolved at the same time, which adds a decent chunk to the time-to-resolution and blocks rendering until that is done. There are cases in the Logs and Metrics UIs where we would prefer to defer field resolution to a later stage yet still integrate with the Data Views service (for example, use the index pattern and timestamp field but not offer autocompletion until later).
Would it be possible to make field resolution optional until it is requested?