Globe picking is very slow #8481
Comments
This actually came up twice this week on the forum:
|
I have the same issue trying to use ArcGIS terrain. Is there a way to limit the resolution to get acceptable speed as a trade-off? I tried disabling the camera collision check and it did not make much difference. |
@stuartmarsden Unfortunately I don't think there's a way to limit the tile resolution, though adding a
But there has been some progress on this issue. I've implemented an octree acceleration structure that does very fast ray-tile intersection tests and hooks in with all terrain providers seamlessly, but it has to generate the octree at tile load time which can cause serious delays (ArcGIS has about 130k triangles per tile and takes 100-300ms to process, whereas Cesium World Terrain has about 10k triangles per tile and 3-20ms to process). Maybe the system could be optimized some more, but I'm feeling like generating the octree at tile load time won't work for ArcGIS. It might not even work for CWT.
But there might be a solution that doesn't involve an octree. Since we know the ArcGIS data is a heightmap and thus well structured, we can find a line of triangles within the tile that have potential to be intersected. We can start with one triangle and search along its neighbors that have vertices on either side of the ray, stopping once the edge of the tile is reached. It feels like we could get away with testing even fewer triangles than that, but I haven't thought it through yet. I don't think this will be faster than the octree intersection, but it will be a lot better than testing every triangle.
So ultimately there will be a different solution for CWT/quantized mesh and ArcGIS/heightmaps. CWT will probably include the octree structure in the tile payload and ArcGIS/heightmaps will use this heightmap-only algorithm. Not sure when all of this will be ready, but I'll keep working on it in the background. |
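To make the "line of triangles" idea concrete, here is a rough sketch of a grid walk over heightmap cells. It assumes a flat tile with unit-sized cells and a ray already projected into grid coordinates, which ignores the curved-surface problem entirely. The function name `cellsAlongRay` and its parameters are hypothetical and are not part of CesiumJS.

```typescript
// Hypothetical helper, not a CesiumJS API.
// Walks the heightmap cells crossed by a 2D ray, one cell at a time.
// Assumes unit-sized cells, a flat grid, and an origin inside the grid.
function cellsAlongRay(
  ox: number, oy: number,   // ray origin in grid coordinates
  dx: number, dy: number,   // ray direction in grid coordinates
  width: number, height: number
): [number, number][] {
  const cells: [number, number][] = [];
  let cx = Math.floor(ox);
  let cy = Math.floor(oy);
  const stepX = dx > 0 ? 1 : -1;
  const stepY = dy > 0 ? 1 : -1;
  // Ray parameter needed to cross one full cell along each axis.
  const tDeltaX = dx !== 0 ? Math.abs(1 / dx) : Infinity;
  const tDeltaY = dy !== 0 ? Math.abs(1 / dy) : Infinity;
  // Ray parameter at which the first cell boundary is crossed on each axis.
  let tMaxX = dx !== 0 ? (dx > 0 ? cx + 1 - ox : ox - cx) / Math.abs(dx) : Infinity;
  let tMaxY = dy !== 0 ? (dy > 0 ? cy + 1 - oy : oy - cy) / Math.abs(dy) : Infinity;
  while (cx >= 0 && cx < width && cy >= 0 && cy < height) {
    cells.push([cx, cy]);
    // Step across whichever boundary the ray reaches first.
    if (tMaxX < tMaxY) {
      tMaxX += tDeltaX;
      cx += stepX;
    } else {
      tMaxY += tDeltaY;
      cy += stepY;
    }
  }
  return cells;
}

// On a 4x4 grid this visits 6 of the 16 cells.
const visitedCells = cellsAlongRay(0.5, 0.5, 1, 0.5, 4, 4);
```

Each visited cell holds two triangles, so only those would need a ray-triangle test. A real version would also have to compare the ray's height against the cell's height range and work on the ellipsoid rather than a plane.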
It would seem that this is supposed to do what I want but it does not seem to make a difference. https://cesium.com/docs/cesiumjs-ref-doc/TerrainProvider.html#.heightmapTerrainQuality |
Ah I forgot about that option. I was messing around with it and couldn't get it to do much either. I think it's supposed to control which LODs get rendered, but since all ArcGIS tiles have the same number of triangles regardless of LOD, it will make no performance difference for picking. |
Is this something that is getting any attention? It definitely causes significant performance issues for us. |
Sorry for the lack of updates. The strategy is still the same but I haven't had time to work on this (and not sure when I will). The main piece that's missing is the ray-heightmap intersection which would be similar to the algorithm described here: https://www.gamedev.net/forums/topic/672036-intersect-ray-with-heightmap/5254121/. We're dealing with a 2D grid on a curved surface (earth), so everything needs to be done with |
It would be good in the interim to have a simple way to reduce the resolution for heightmaps. I don't know where to look in the code. I assume it just makes a simple grid based on the resolution of the heightmap image, so it should be possible to sample only every Nth pixel or, better, to average each block of pixels. You will get worse terrain but at least it would be usable. Obviously, expose the divisor so it can be tweaked. |
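As an illustration of the averaging idea, a minimal sketch might look like the following. It assumes the heightmap is a row-major `Float32Array`. The function name `downsampleHeightmap` and the `divisor` parameter are hypothetical and do not correspond to anything in the CesiumJS code base.

```typescript
// Hypothetical helper, not a CesiumJS API.
// Reduces a row-major heightmap by averaging each divisor x divisor block.
function downsampleHeightmap(
  heights: Float32Array,
  width: number,
  height: number,
  divisor: number
): { heights: Float32Array; width: number; height: number } {
  const outWidth = Math.ceil(width / divisor);
  const outHeight = Math.ceil(height / divisor);
  const out = new Float32Array(outWidth * outHeight);
  for (let oy = 0; oy < outHeight; oy++) {
    for (let ox = 0; ox < outWidth; ox++) {
      let sum = 0;
      let count = 0;
      // Blocks on the right and bottom edges may be smaller than divisor.
      const yEnd = Math.min((oy + 1) * divisor, height);
      const xEnd = Math.min((ox + 1) * divisor, width);
      for (let y = oy * divisor; y < yEnd; y++) {
        for (let x = ox * divisor; x < xEnd; x++) {
          sum += heights[y * width + x];
          count++;
        }
      }
      out[oy * outWidth + ox] = sum / count;
    }
  }
  return { heights: out, width: outWidth, height: outHeight };
}

// 4x4 grid holding 0..15, reduced to 2x2.
const sampleHeights = new Float32Array(16);
for (let i = 0; i < 16; i++) { sampleHeights[i] = i; }
const reduced = downsampleHeightmap(sampleHeights, 4, 4, 2);
```

Note that averaging changes the heights along tile edges, so neighboring tiles could show cracks unless the edges are handled separately. That is one reason this is only a sketch.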
The API gets in the way here. All terrain providers, heightmap or not, go through the public TerrainProvider interface. Any property we add to that has to be supported by non-heightmap terrain as well and that becomes a lot more difficult. We could potentially get around this by adding some One other hacky solution that nearly worked was to change the So I don't really see a great solution here. Adding a public |
Hi @IanLilleyT 😄 I had a further crack at the octree approach, continuing from where you left off on the
The immediate bottleneck I could see was just transferring the octree data structure across the worker threads. This caused delays both on the worker thread and on the main thread as it read the response.
I've spent the last few days learning what the heck an octree is and packing the data structure into a few array buffers before we transfer it across threads. After getting that working, we get really good worker thread transfer speeds 👍 (both on the send and receive sides) And picking time is still just incredible 👍 (hats off for getting the initial octree implementation working!)
As you've already mentioned, the bottleneck is now just creating the octree, as I've captured here: 👎 However, let's just keep in mind, we're talking about less than 10 frames worth of time here compared to the current For some reference points, these are the
And with the octree on ArcGIS
Compared to Cesium World Terrain without octree
Cesium World Terrain with octree:
This is quite the improvement for ArcGIS (and CWT) terrain picking performance compared to the currently unusable setup for us, even with the slow octree generation. What are your thoughts on this? From my perspective, I'd be happy to spend a week buttoning up the code and getting this into master in some form. Do you think we'd have a chance of merging something like this? I think with some more work we can improve the octree generation time 👍 Here's the branch diff if you're interested; it really is just a first pass at proving we could flatten the octree into a typed array. Apart from that, I've hacked it all together :) Thanks :) |
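For readers wondering what "flattening the octree into a typed array" can look like, here is a small sketch. It is not the code from the branch. The node shape, the four-integer record layout, and the names `packOctree` and `trianglesOfNode` are all assumptions made for illustration.

```typescript
// Hypothetical data layout, not the layout used in the actual branch.
interface OctreeNode {
  children?: OctreeNode[]; // absent or empty for leaf nodes
  triangles: number[];     // indices of triangles stored in this node
}

interface PackedOctree {
  nodes: Int32Array;       // NODE_STRIDE integers per node
  triangles: Uint32Array;  // all triangle indices, grouped by node
}

// Per-node record: [firstChildIndex or -1, childCount, triangleStart, triangleCount]
const NODE_STRIDE = 4;

function packOctree(root: OctreeNode): PackedOctree {
  const queue: OctreeNode[] = [root];
  const nodeData: number[] = [];
  const triangleData: number[] = [];
  // Breadth-first walk: node i in the queue becomes record i, and the
  // children of a node always land in consecutive records.
  for (let i = 0; i < queue.length; i++) {
    const node = queue[i];
    const children = node.children !== undefined ? node.children : [];
    const firstChild = children.length > 0 ? queue.length : -1;
    for (const child of children) { queue.push(child); }
    nodeData.push(firstChild, children.length, triangleData.length, node.triangles.length);
    for (const t of node.triangles) { triangleData.push(t); }
  }
  return { nodes: new Int32Array(nodeData), triangles: new Uint32Array(triangleData) };
}

// Reads the triangle indices of one node back out of the packed form.
function trianglesOfNode(packed: PackedOctree, nodeIndex: number): number[] {
  const base = nodeIndex * NODE_STRIDE;
  const start = packed.nodes[base + 2];
  const count = packed.nodes[base + 3];
  const result: number[] = [];
  for (let i = 0; i < count; i++) { result.push(packed.triangles[start + i]); }
  return result;
}

const exampleRoot: OctreeNode = {
  triangles: [],
  children: [
    { triangles: [0, 1] },
    { triangles: [2], children: [{ triangles: [3, 4, 5] }] },
  ],
};
const packedExample = packOctree(exampleRoot);
```

The point of the flat form is that the two underlying `ArrayBuffer`s can be listed in the transfer list of a worker's `postMessage` call, so they are moved between threads instead of being structured-cloned node by node.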
Wow, thanks for taking this further! Flattening the octree works wonders! FYI for better runtime performance testing, go to the High Dynamic Range sandcastle, which I've hijacked for picking. 10 frames of latency for generating an octree for ArcGIS... still too much I think. It just feels wrong that it takes up almost the entire worker time. Also we don't want to slow down Cesium World Terrain by too much for a feature that won't affect the average user (unless they are doing a lot of picking calls). Even a few frames there can make Cesium feel less responsive. But I feel like there are several areas for optimization. Here's some ideas that come to mind, not sure which will actually work:
There might even be ways of doing this that take a completely different approach than the one here. Open to ideas. I feel like it's possible to make the octree generation at least 4X faster. |
Also, feel free to clean up and make a PR with what you have or any new optimizations. It doesn't have to be final. That would at least help this get to the finish line. |
Thanks for all the suggestions @IanLilleyT Unfortunately I can't work on this during the day at the moment, so it's a very spare time thing right now. If yourself or someone else is keen to get this done; then feel free to continue from my additions or not :) My current plan/work in progress is getting some integration/unit tests working for a CWT tile(s) and ArcGIS tile(s); before I continue optimising the code; and make sure we have no regressions. I'll add my own two ideas to the optimisation list as well:
I like your ideas above:
I'll let you know when I get more time to work on this again 👍 |
Those two optimization ideas are good. I agree with the plan to optimize the JS version first, and then convert to WebAssembly if necessary. And yeah, draco is the only one so far. So then it's a matter of deciding what programming language to use, etc, etc and may involve input from other team members. So that's more for the distant future, even though I think it could speed things up a decent amount.
I'd like to avoid calling I'm in the same situation as you, only able to work on this in my spare time. I want to get back to this in around 2-3 weeks, but even then I'm not positive. I'll give an update then. |
@DanielLeone Hey - is there any chance we are getting back to this? Your work seems to bring a dramatic performance improvement, which is necessary for many applications. |
Indeed, this is necessary to even begin to use ArcGIS terrain. Please say that this will be implemented in some fashion... |
@jony89 @JoshuaKahn The most production-ready version of the code was here: #9961 but as per #9961 (comment), it was a requirement that it worked with the Dynamic Terrain Exaggeration feature, which it does not. I suggested some alternatives: #9961 (comment)
|
@DanielLeone If I didn't care about using Dynamic Terrain Exaggeration - would it still be possible to use that code branch for regular work, or is it too unstable/in dev to be used like that? |
You could try, but I wouldn't use it for anything important. It's fairly old now, potentially out of date. I didn't test it a great deal so there's definitely bugs somewhere. 🙏 |
The current approach for globe picking is to intersect the ray against all rendered tiles (using their bounding spheres) and sort the intersected tiles by distance along the ray. Then, for the closest tile, do ray-triangle intersection tests against all of its triangles and return the closest hit. If there is none, test the next tile, and so on.
This could be optimized some more.
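A simplified sketch of that brute-force loop, to show where the cost comes from. The types and function names here are hypothetical stand-ins rather than Cesium's actual classes, and the ray-triangle test is a plain Möller-Trumbore implementation, not the code behind IntersectionTests.rayTriangleParametric.

```typescript
// Hypothetical types and names, not the CesiumJS implementation.
type Vec3 = [number, number, number];
interface Ray { origin: Vec3; direction: Vec3; } // direction assumed normalized
interface Tile { center: Vec3; radius: number; triangles: [Vec3, Vec3, Vec3][]; }

const sub = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a: Vec3, b: Vec3): Vec3 =>
  [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]];

// Distance along the ray to the bounding sphere, or undefined on a miss.
function raySphere(ray: Ray, center: Vec3, radius: number): number | undefined {
  const oc = sub(ray.origin, center);
  const b = dot(oc, ray.direction);
  const disc = b * b - (dot(oc, oc) - radius * radius);
  if (disc < 0) { return undefined; }
  const root = Math.sqrt(disc);
  if (-b + root < 0) { return undefined; } // sphere is entirely behind the ray
  return Math.max(-b - root, 0);
}

// Moller-Trumbore ray-triangle test; returns the ray parameter t or undefined.
function rayTriangle(ray: Ray, p0: Vec3, p1: Vec3, p2: Vec3): number | undefined {
  const e1 = sub(p1, p0);
  const e2 = sub(p2, p0);
  const h = cross(ray.direction, e2);
  const det = dot(e1, h);
  if (Math.abs(det) < 1e-12) { return undefined; } // ray parallel to triangle
  const inv = 1 / det;
  const s = sub(ray.origin, p0);
  const u = dot(s, h) * inv;
  if (u < 0 || u > 1) { return undefined; }
  const q = cross(s, e1);
  const v = dot(ray.direction, q) * inv;
  if (v < 0 || u + v > 1) { return undefined; }
  const t = dot(e2, q) * inv;
  return t >= 0 ? t : undefined;
}

function pickGlobe(ray: Ray, tiles: Tile[]): number | undefined {
  // 1. Keep tiles whose bounding sphere is hit, nearest first.
  const hits: { tile: Tile; t: number }[] = [];
  for (const tile of tiles) {
    const t = raySphere(ray, tile.center, tile.radius);
    if (t !== undefined) { hits.push({ tile, t }); }
  }
  hits.sort((a, b) => a.t - b.t);
  // 2. Test every triangle of each candidate tile until one tile yields a hit.
  for (const hit of hits) {
    let best: number | undefined = undefined;
    for (const [p0, p1, p2] of hit.tile.triangles) {
      const t = rayTriangle(ray, p0, p1, p2);
      if (t !== undefined && (best === undefined || t < best)) { best = t; }
    }
    if (best !== undefined) { return best; }
  }
  return undefined;
}
```

The inner loop is linear in the triangle count of each candidate tile, which is why a tile with roughly 130k triangles is so much slower to pick than one with roughly 10k.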
ScreenSpaceCameraController does one or two globe picks per frame and that alone can slow down Cesium severely with detailed terrain providers like ArcGIS.
Some ideas on how to improve this:
TerrainEncoding quantization for faster position decode from vertex buffer (only affects nearby tiles)
IntersectionTests.rayTriangleParametric.
Sandcastle (console prints how long each pick takes with left click). Select ArcGIS from the dropdown for some seriously slow speeds.
WIP branch