feat(annotation): AnnotationLayerView now shows currently displayed annotations in MultiscaleAnnotationSource #504

Open · wants to merge 3 commits into master
Conversation

@chrisj (Contributor) commented Nov 29, 2023

Breaking apart #493

I left out the sorting by distance from because we don't want it for the cave datasource, would a boolean sortByDistance property on MultiscaleAnnotationSource be reasonable?

I left some comments in the code about the arguments to this.virtualListSource.changed!.dispatch.

I could do a diff between the old and new list to see which elements are identical to get a proper retainCount but I'm not sure what effect it would have.

@jbms (Collaborator) commented Nov 29, 2023

This is a really cool feature --- thanks. There seems to be a performance problem when there are many annotations visible, though:

http://127.0.0.1:8080/#!%7B%22dimensions%22:%7B%22x%22:%5B8e-9%2C%22m%22%5D%2C%22y%22:%5B8e-9%2C%22m%22%5D%2C%22z%22:%5B8e-9%2C%22m%22%5D%7D%2C%22position%22:%5B15610.5%2C22510.7265625%2C24104.685546875%5D%2C%22crossSectionScale%22:1%2C%22projectionOrientation%22:%5B-0.29930439591407776%2C-0.3578258454799652%2C0.8785958290100098%2C-0.10221008211374283%5D%2C%22projectionScale%22:48.928247537774176%2C%22layers%22:%5B%7B%22type%22:%22image%22%2C%22source%22:%22precomputed://gs://neuroglancer-janelia-flyem-hemibrain/emdata/clahe_yz/jpeg%22%2C%22tab%22:%22source%22%2C%22name%22:%22emdata%22%7D%2C%7B%22type%22:%22segmentation%22%2C%22source%22:%22precomputed://gs://neuroglancer-janelia-flyem-hemibrain/v1.0/segmentation%22%2C%22tab%22:%22segments%22%2C%22segments%22:%5B%22%211292971921%22%5D%2C%22name%22:%22segmentation%22%7D%2C%7B%22type%22:%22annotation%22%2C%22source%22:%22precomputed://gs://neuroglancer-janelia-flyem-hemibrain/v1.0/synapses%22%2C%22tab%22:%22annotations%22%2C%22shader%22:%22#uicontrol%20vec3%20preColor%20color%28default=%5C%22red%5C%22%29%5Cn#uicontrol%20vec3%20postColor%20color%28default=%5C%22blue%5C%22%29%5Cn#uicontrol%20float%20preConfidence%20slider%28min=0%2C%20max=1%2C%20default=0%29%5Cn#uicontrol%20float%20postConfidence%20slider%28min=0%2C%20max=1%2C%20default=0%29%5Cn%5Cnvoid%20main%28%29%20%7B%5Cn%20%20setColor%28defaultColor%28%29%29%3B%5Cn%20%20setEndpointMarkerColor%28%5Cn%20%20%20%20vec4%28preColor%2C%200.5%29%2C%5Cn%20%20%20%20vec4%28postColor%2C%200.5%29%29%3B%5Cn%20%20setEndpointMarkerSize%282.0%2C%202.0%29%3B%5Cn%20%20setLineWidth%282.0%29%3B%5Cn%20%20if%20%28prop_pre_synaptic_confidence%28%29%3C%20preConfidence%20%7C%7C%5Cn%20%20%20%20%20%20prop_post_synaptic_confidence%28%29%3C%20postConfidence%29%20discard%3B%5Cn%7D%5Cn%22%2C%22linkedSegmentationLayer%22:%7B%22pre_synaptic_cell%22:%22segmentation%22%2C%22post_synaptic_cell%22:%22segmentation%22%7D%2C%22filterBySegmentation%22:%5B%22post_synaptic_cell%22%2C%22pre_synaptic_cell%22%5D%2C%22name%22:%22synapse%22%7D%5D%2C%22showSlices%22:false%2C%22selectedLayer%22:%7B%22visible%22:true%2C%22layer%22:%22synapse%22%7D%2C%22layout%22:%22xy-3d%22%7D

Regarding retainCount: suppose you are scrolled so that a particular element in a long list is in view. We want the visible portion to basically stay the same if elements are added or removed outside of the viewport. If the list is completely reordered or changed, then it doesn't matter.
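For illustration, a minimal sketch of one way to derive a retain count by diffing the old and new lists; the helper name and the shape of the result are hypothetical, not the actual arguments of this.virtualListSource.changed!.dispatch:

```ts
// Hypothetical helper: given the old and new annotation id lists, compute how
// many leading elements are unchanged (the retained prefix) and how many
// elements were removed/inserted after it. Assumes annotation ids are stable.
function diffAnnotationLists(oldIds: readonly string[], newIds: readonly string[]) {
  let retainCount = 0;
  const minLength = Math.min(oldIds.length, newIds.length);
  while (retainCount < minLength && oldIds[retainCount] === newIds[retainCount]) {
    ++retainCount;
  }
  return {
    retainCount,
    deleteCount: oldIds.length - retainCount,
    insertCount: newIds.length - retainCount,
  };
}
```

With this, additions or removals past the visible portion leave retainCount covering the viewport, so the visible rows stay put; a full reorder simply yields retainCount = 0.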

As far as sorting, there are a few issues to consider:

  1. When viewing annotations associated with a particular segment id, there is no partial loading/rendering support: Neuroglancer always loads and attempts to render all of them, and offscreen ones just get clipped at rendering time. When not filtering by segment id, Neuroglancer instead uses the spatial index. In that case, the user-specified "Spacing" control determines how large a subset of the annotations to display, which both limits the levels of the spatial index that are used and causes subsetting of individual chunks. When subsetting an individual chunk, Neuroglancer simply renders a prefix of the full list of annotations in that chunk. For that to yield a uniformly sampled subset, the annotations within the chunk need to be randomly ordered, which is how precomputed annotation sources should normally be generated (see the sketch after this list). Therefore, without sorting, the order would be meaningless.
  2. When there are many annotation chunks visible, if we just want to display the closest annotations, we don't want to construct the full list of annotations just to sort it, as that would be too expensive. Instead we would need to use the chunk structure itself to determine which chunks could potentially contain the closest annotations.
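Regarding point 1, a minimal sketch of the prefix-subsetting idea, assuming a density value in (0, 1] already derived from the "Spacing" control (the helper is hypothetical, not Neuroglancer's actual code):

```ts
// Hypothetical sketch: because annotations within a precomputed chunk are
// stored in random order, rendering only the first k of them yields an
// approximately uniform spatial subsample of the chunk.
function chunkPrefixSubset<T>(annotations: readonly T[], density: number): readonly T[] {
  const k = Math.min(annotations.length, Math.ceil(annotations.length * density));
  return annotations.slice(0, k);
}
```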

@chrisj (Contributor, author) commented Nov 30, 2023

For the performance issues, I don't see anything grossly inefficient. I think we need debouncing (or more of it) and probably moving the computation to the backend thread. Do you agree with that and is there anything tricky with moving it to the backend?

@jbms (Collaborator) commented Dec 1, 2023

> For the performance issues, I don't see anything grossly inefficient. I think we need debouncing (or more of it) and probably moving the computation to the backend thread. Do you agree with that and is there anything tricky with moving it to the backend?

Moving to the backend thread is tricky because the annotation data is not accessible there once it is moved to the frontend.

In terms of efficiency, the current implementation has a few issues:

  1. For the spatial index, it considers all chunks, independent of whether they are visible.
  2. Even restricting to just the visible chunks, the total number of visible chunks may still be large. Fewer than about 100 annotations will be displayed in the list at any time, but an Annotation object is being created for every visible annotation; the total number of visible annotations may be very large, and iterating through all of them and creating an Annotation object for each takes a lot of time. A few strategies could be employed instead (see the sketch after this list):
     - For annotations indexed by segment id, since there is no spatial index, we could just impose a user-defined limit on the number of annotations shown, defaulting to e.g. ~200; if the total number is greater than this limit, the list would just contain an arbitrary subset.
     - For the spatial index, we can also impose a limit, but can use the chunk structure to try to iterate over chunks in order of their distance to the center position. Or we could just arbitrarily limit it, as in the segment id index case, which would not give a great result but would be simpler to implement.
     - Another option would be to just disable the feature when the number of annotations is large.
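A minimal sketch of the distance-ordered strategy for the spatial index, assuming chunk grid coordinates are available for each loaded chunk (the types and names here are hypothetical, not the MultiscaleAnnotationSource API):

```ts
// Hypothetical sketch: visit loaded spatial-index chunks in order of distance
// from the center position, stopping once `limit` annotations have been collected.
interface LoadedChunk {
  gridPosition: Float32Array; // chunk coordinates within its scale level
  annotationIds: string[];    // annotation ids stored in this chunk
}

function collectNearest(
  chunks: LoadedChunk[],
  centerChunkPosition: Float32Array,
  limit: number,
): string[] {
  const squaredDistance = (c: LoadedChunk) => {
    let sum = 0;
    for (let i = 0; i < centerChunkPosition.length; ++i) {
      const d = c.gridPosition[i] - centerChunkPosition[i];
      sum += d * d;
    }
    return sum;
  };
  const byDistance = chunks
    .map((chunk) => ({ chunk, d: squaredDistance(chunk) }))
    .sort((a, b) => a.d - b.d);
  const result: string[] = [];
  for (const { chunk } of byDistance) {
    for (const id of chunk.annotationIds) {
      if (result.length >= limit) return result;
      result.push(id);
    }
  }
  return result;
}
```

Because chunks near the center are visited first, the limit can be hit without ever touching annotations in far-away chunks.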

I imagine the use cases you built this for are expected to have only a small number of annotations, where this isn't an issue.

…hat the virtual list renders

started to work on sorting the list for spatial index but the chunk calculation is incorrect
@chrisj (Contributor, author) commented Jan 4, 2024

Update: I'm reconsidering the use of lazy deserialization. As I said below, it doesn't work in the case where we want to sort, and even with unsorted segmentFilteredSources, the performance issue there should be possible to handle by observing whether the segment list actually changed and the chunks were already loaded.

@jbms I wanted to try out lazy evaluation by holding off on deserializing annotations until the virtual list calls render. Performance was greatly improved, but it definitely adds some code complexity, and there is some cleanup/optimization left. The sorted spatial index needs to be deserialized immediately for coordinates. I am attempting to use the chunk structure to only collect and sort the annotations closest to our global position. I'll point that out in the code, but the chunk I calculate by dividing the global position by the chunk size is different from the chunks loaded by the 3D view; I am doing a very basic calculation compared to the 3D projection.

I added a debounce to the visibleChunksChanged listener. When we have active segmentFilteredSources, I should probably be able to ignore calls to AnnotationLayerView.updateView() if the visible segments haven't changed and all the chunks were previously loaded.

Most of the expense of AnnotationLayerView.updateView() is in the call to this.virtualListSource.changed!.dispatch. I want to analyze that a bit more to see if there are any potential performance improvements there.

Add/update/remove annotation element is disabled when the source is a MultiscaleAnnotationSource. Perhaps this is a good opportunity to add those as unimplemented methods on MultiscaleAnnotationSource? At some point we want to support editing our annotations within Neuroglancer.

Overall, do you think this lazy approach for handling large annotation lists for segment filtered sources is worth it?

```ts
const sortChunk = new Uint32Array(rank);
const {chunkDataSize} = source.spec;
for (let i = 0; i < rank; i++) {
  sortChunk[i] = Math.floor(sortByPosition.value[i] / chunkDataSize[i]);
}
```
chrisj (Contributor, author):

This is the basic chunk calculation, which doesn't match up with the loaded chunks. The idea is to start with the highest-level chunks and work outward until we have a chunk that is loaded. Eventually it would be nice to support neighboring chunks, particularly if the global position is near a chunk boundary.
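A hedged sketch of that search, reproducing the naive per-level division above (the types are hypothetical; as the following comments point out, this doesn't account for the layer's coordinate transforms, so the computed key may not match the chunks the 3D view actually loads):

```ts
// Hypothetical sketch: walk the scale levels, compute the chunk containing
// `position` at each level via simple division, and return the first level at
// which that chunk is actually loaded.
function findLoadedChunkForPosition(
  position: Float32Array,
  levels: { chunkDataSize: Float32Array; chunks: Map<string, unknown> }[],
): { level: number; key: string } | undefined {
  for (let level = 0; level < levels.length; ++level) {
    const { chunkDataSize, chunks } = levels[level];
    const coords: number[] = [];
    for (let i = 0; i < position.length; ++i) {
      coords.push(Math.floor(position[i] / chunkDataSize[i]));
    }
    const key = coords.join(); // same "x,y,z" key format as tempChunk.join()
    if (chunks.has(key)) return { level, key };
  }
  return undefined;
}
```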

jbms (Collaborator):

For sortByPosition you are just using the globalPosition, which is not necessarily in the same coordinate space. Instead you need to transform the relevant globalPosition and localPosition to the coordinate space of the annotation layer, as done here:

function getMousePositionInAnnotationCoordinates(

Then each scale level also has a coordinate transform.
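For illustration, a minimal sketch of applying such a transform, assuming a row-major rank x (rank + 1) affine matrix whose last column is the translation; the function name and layout are assumptions, not the actual getMousePositionInAnnotationCoordinates implementation:

```ts
// Hypothetical sketch: apply an affine transform (rank rows, rank + 1 columns,
// last column = translation) to map a global position into the annotation
// layer's coordinate space.
function transformPosition(
  globalPosition: Float32Array,
  transform: Float32Array, // row-major, rank x (rank + 1)
  rank: number,
): Float32Array {
  const out = new Float32Array(rank);
  for (let row = 0; row < rank; ++row) {
    let value = transform[row * (rank + 1) + rank]; // translation component
    for (let col = 0; col < rank; ++col) {
      value += transform[row * (rank + 1) + col] * globalPosition[col];
    }
    out[row] = value;
  }
  return out;
}
```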

chrisj (Contributor, author):

I converted getMousePositionInAnnotationCoordinates to getPositionInAnnotationCoordinates so I could pass in any Float32Array, and call that with:

```ts
const point = getPositionInAnnotationCoordinates(
  globalPosition.value,
  state,
);
```

I then copied the logic in forEachPossibleChunk:

private forEachPossibleChunk(

to make positionToMultiscaleChunks(position: Float32Array):
https://gist.github.com/chrisj/328828d13480627961166beb16cfcd18

This uses source.multiscaleToChunkTransform, which I assume is the per-scale coordinate transform that you mentioned. The issue is that it only ever produces [0,0,0], which exists as a chunk only at the largest scale. I'm not sure if I'm misunderstanding the code, but what happens in the case of a single point is that totalChunks = 1 (unless on a boundary) and remainder = 0; then for each dimension chunkIndex = 0, size = 1, remainder % size = 0, and remainder = 0 / 1 = 0, so you end up with tempChunk = [0,0,0], and that chunk usually only exists for the largest scale.
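To make that trace concrete, a hedged reconstruction of the arithmetic being described (not the actual forEachPossibleChunk code): when the position range covers a single chunk per dimension, every size is 1 and the only flat index is 0, so the decomposition always yields [0, 0, 0].

```ts
// Hypothetical reconstruction: decompose a flat chunk index into per-dimension
// grid coordinates given the number of candidate chunks in each dimension.
// With sizes = [1, 1, 1] and chunkIndex = 0, this returns [0, 0, 0].
function decomposeChunkIndex(chunkIndex: number, sizes: readonly number[]): number[] {
  const tempChunk = new Array<number>(sizes.length);
  let remainder = chunkIndex;
  for (let i = sizes.length - 1; i >= 0; --i) {
    tempChunk[i] = remainder % sizes[i];
    remainder = Math.floor(remainder / sizes[i]);
  }
  return tempChunk;
}
```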

jbms (Collaborator):

Can you explain a bit more what the issue is? It is indeed the case that at the largest scale, there is normally just a single chunk.

chrisj (Contributor, author):

I assume that MultiscaleAnnotationSource.forEachPossibleChunk is supposed to run the callback across all the scales that contain the annotation.

When I follow the logic, it seems to be stuck at [0,0,0] across all scales, so chunks.get(tempChunk.join()) only returns a result at the largest scale.

I am looking for the smallest scales that contain my target point.

chrisj (Contributor, author):

I added this branch that shows the problem I'm experiencing:
https://github.com/seung-lab/neuroglancer/tree/cj_multiscale_annotation_list_3

With this link http://127.0.0.1:8080/#!%7B%22dimensions%22:%7B%22x%22:%5B8e-9%2C%22m%22%5D%2C%22y%22:%5B8e-9%2C%22m%22%5D%2C%22z%22:%5B8e-9%2C%22m%22%5D%7D%2C%22position%22:%5B15610.5%2C22520.5%2C24095.5%5D%2C%22crossSectionScale%22:1%2C%22projectionOrientation%22:%5B-0.29930439591407776%2C-0.3578258454799652%2C0.8785958290100098%2C-0.10221008211374283%5D%2C%22projectionScale%22:12.370998330885378%2C%22layers%22:%5B%7B%22type%22:%22annotation%22%2C%22source%22:%22precomputed://gs://neuroglancer-janelia-flyem-hemibrain/v1.0/synapses%22%2C%22tab%22:%22annotations%22%2C%22shader%22:%22#uicontrol%20vec3%20preColor%20color%28default=%5C%22red%5C%22%29%5Cn#uicontrol%20vec3%20postColor%20color%28default=%5C%22blue%5C%22%29%5Cn#uicontrol%20float%20preConfidence%20slider%28min=0%2C%20max=1%2C%20default=0%29%5Cn#uicontrol%20float%20postConfidence%20slider%28min=0%2C%20max=1%2C%20default=0%29%5Cn%5Cnvoid%20main%28%29%20%7B%5Cn%20%20setColor%28defaultColor%28%29%29%3B%5Cn%20%20setEndpointMarkerColor%28%5Cn%20%20%20%20vec4%28preColor%2C%200.5%29%2C%5Cn%20%20%20%20vec4%28postColor%2C%200.5%29%29%3B%5Cn%20%20setEndpointMarkerSize%282.0%2C%202.0%29%3B%5Cn%20%20setLineWidth%282.0%29%3B%5Cn%20%20if%20%28prop_pre_synaptic_confidence%28%29%3C%20preConfidence%20%7C%7C%5Cn%20%20%20%20%20%20prop_post_synaptic_confidence%28%29%3C%20postConfidence%29%20discard%3B%5Cn%7D%5Cn%22%2C%22filterBySegmentation%22:%5B%22post_synaptic_cell%22%2C%22pre_synaptic_cell%22%5D%2C%22name%22:%22synapse%22%7D%5D%2C%22showSlices%22:false%2C%22selectedLayer%22:%7B%22size%22:426%2C%22visible%22:true%2C%22layer%22:%22synapse%22%7D%2C%22layout%22:%223d%22%2C%22statistics%22:%7B%22size%22:248%7D%7D

In the console you will see output such as:

```
addChunk 0,0,0
addChunk 28,31,35
addChunk 14,15,17
addChunk 0,1,1
addChunk 0,0,0
addChunk 1,0,1
addChunk 29,30,36
addChunk 28,32,36
addChunk 0,0,1
... (removed 20 lines)
addChunk 2,3,4
addChunk 7,7,9
addChunk 1,1,2
addChunk 1,1,1
5 no chunk for key 0,0,0
chunk exists 0,0,0 size 17209,15302,19814
```

@jbms (Collaborator) commented Jan 6, 2024

> Update: I'm reconsidering the use of lazy deserialization. As I said below, it doesn't work in the case where we want to sort, and even with unsorted segmentFilteredSources, the performance issue there should be possible to handle by observing whether the segment list actually changed and the chunks were already loaded.

Potentially you could sort without actually fully deserializing, by just creating an array of indices, [0, num_elements), and sorting that with a comparison function that grabs the coordinates directly from the serialized geometry data. I expect that may still be faster than materializing all of the annotations, but it might not be.
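A minimal sketch of that approach, assuming the serialized geometry is a Float32Array with a fixed stride per annotation and the position at a known offset (both assumptions; the real precomputed layout varies by annotation type):

```ts
// Hypothetical sketch: sort annotation indices by squared distance to `center`
// without materializing Annotation objects, reading coordinates directly from
// the serialized buffer. `stride` and `positionOffset` are assumed layout
// parameters, in float32 elements per annotation.
function sortIndicesByDistance(
  data: Float32Array,
  numAnnotations: number,
  stride: number,
  positionOffset: number,
  center: Float32Array,
): Uint32Array {
  const indices = new Uint32Array(numAnnotations);
  for (let i = 0; i < numAnnotations; ++i) indices[i] = i;
  const squaredDistance = (index: number) => {
    const base = index * stride + positionOffset;
    let sum = 0;
    for (let d = 0; d < center.length; ++d) {
      const diff = data[base + d] - center[d];
      sum += diff * diff;
    }
    return sum;
  };
  // Precomputing the distances into a separate array would avoid recomputing
  // them inside the comparator, at the cost of extra memory.
  return indices.sort((a, b) => squaredDistance(a) - squaredDistance(b));
}
```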

In general it seems to me that a top priority is ensuring that regardless of what is selected, the list doesn't degrade performance, and instead we should degrade the quality of the list (e.g. skip sorting, etc. if there are too many items) as needed to retain reasonable performance. If necessary there could be an additional user-configurable setting that trades off performance vs quality, though ideally we could avoid that extra complexity.

> @jbms I wanted to try out lazy evaluation by holding off on deserializing annotations until the virtual list calls render. Performance was greatly improved, but it definitely adds some code complexity, and there is some cleanup/optimization left. The sorted spatial index needs to be deserialized immediately for coordinates. I am attempting to use the chunk structure to only collect and sort the annotations closest to our global position. I'll point that out in the code, but the chunk I calculate by dividing the global position by the chunk size is different from the chunks loaded by the 3D view; I am doing a very basic calculation compared to the 3D projection.

> I added a debounce to the visibleChunksChanged listener. When we have active segmentFilteredSources, I should probably be able to ignore calls to AnnotationLayerView.updateView() if the visible segments haven't changed and all the chunks were previously loaded.

> Most of the expense of AnnotationLayerView.updateView() is in the call to this.virtualListSource.changed!.dispatch. I want to analyze that a bit more to see if there are any potential performance improvements there.

> Add/update/remove annotation element is disabled when the source is a MultiscaleAnnotationSource. Perhaps this is a good opportunity to add those as unimplemented methods on MultiscaleAnnotationSource? At some point we want to support editing our annotations within Neuroglancer.

It might make sense to wait to add the methods until there is at least one implementation.

> Overall, do you think this lazy approach for handling large annotation lists for segment filtered sources is worth it?

For unsorted lists filtered by segments, lazy deserialization will be a pretty big win, because in some cases we may have several hundred segments, and it will be expensive to deserialize all of those, compared to just deserializing the ~30 that are visible, which could be done quite efficiently. This could be a good fallback in the case where there are more than a certain number of annotations.

I suppose the lazy deserialization could still happen on a per-chunk basis, though, rather than deserializing individual elements within a chunk.
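A hedged sketch of per-chunk lazy deserialization for the list, with hypothetical types (not the actual MultiscaleAnnotationSource or virtual list interfaces): the list stores (chunk, index) references, and a chunk is only deserialized the first time one of its rows is rendered.

```ts
// Hypothetical sketch: the virtual list holds (chunkId, indexWithinChunk)
// references plus a cache of deserialized chunks; a chunk is deserialized the
// first time the list needs to render one of its annotations.
interface SerializedChunk { id: string; data: ArrayBuffer; count: number; }
interface AnnotationEntry { id: string; /* geometry, properties, ... */ }

class LazyAnnotationList {
  private cache = new Map<string, AnnotationEntry[]>();
  constructor(
    private chunks: Map<string, SerializedChunk>,
    private rows: { chunkId: string; index: number }[],
    private deserializeChunk: (chunk: SerializedChunk) => AnnotationEntry[],
  ) {}

  get length() { return this.rows.length; }

  // Called by the virtual list's render callback for visible rows only.
  get(rowIndex: number): AnnotationEntry {
    const { chunkId, index } = this.rows[rowIndex];
    let annotations = this.cache.get(chunkId);
    if (annotations === undefined) {
      annotations = this.deserializeChunk(this.chunks.get(chunkId)!);
      this.cache.set(chunkId, annotations);
    }
    return annotations[index];
  }
}
```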
