Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tags in events that will be added to metadata. #91

Merged
merged 6 commits into from
Jun 6, 2023

Conversation

nicost
Copy link
Member

@nicost nicost commented Jun 4, 2023

Implementation of issue #90 (closes #90). Tested by adding position name in MM acqengJ. Works.

@nicost nicost requested a review from henrypinkard June 4, 2023 03:59
@henrypinkard
Copy link
Member

henrypinkard commented Jun 4, 2023

Looks good. A couple things:

  1. You should probably also implement the toJSON and fromJSON functions. These allow the event to be serialized and deserialized for (e.g. sending to pycromanager)
  2. There's a test function that runs automatically on pushes to pycromanager that checks the cycle consistency of these functions
  3. Adding to the docs of acquisition events in PM is a probably a good idea as well as maybe describing how to add tags to acquisition events here
  4. Do you think it is worth adding the tags not all to the top level of image metadata, but maybe lower under a "User tags" or something similar field? I've always thought that having all the tags at the same level of the tree was confusing and chaotic, and it might be good to move away from that and take advantage of the hierarchical structure of JSON. This would also prevent collisions between "official" metadata names and user supplied ones

@nicost
Copy link
Member Author

nicost commented Jun 5, 2023

  1. Done
  2. Those test are nice, but need a complicated environment including Python to develop. I added test code in the Java layer to ensure that the serialization / deserialization works correctly. If you are set up to develop the Python test code, please add those lines needed to test tags.
  3. Not quite sure what that should look like. Does the following look correct?

Use these fields to add information to be propagated to the image metadata

'tags': [['Key', 'Value'],
['Key2', 'Value2']],

  1. Funnily enough, for one of my first uses of these tags (to add the correct Position name to the metadata in Micro-Manager), I need the key to be in the top level, so pushing them into a hardcoded lower level would defeat at least several uses of this functionality (and necessitate yet other data structure). We could make tags a structure that includes 3 parts: level-key-value, where null for the level will push the key value into the upper level. Not quite sure if now is the time to do this. @marktsuchida what do you think about this?

@henrypinkard
Copy link
Member

I just realized there's a much simpler way to do this that doesn't require additions to the API. You can just implement an ImageProcessor and add/modify tags as you see fit. If that covers everything you need to do I think that would be a better route for now. What do you think?

@nicost
Copy link
Member Author

nicost commented Jun 5, 2023

Does the event specify the processor? If so, where does that happen? I believe it to be crucial that the event specifies the tags to be added.

@henrypinkard
Copy link
Member

No the processor is added on a per-acquisition basis

If you're just trying to name the positions, then it should be a matter of just adding the metadata you want based on the value of position axis within the image processor

I believe it to be crucial that the event specifies the tags to be added.

In a way it already does if the metadata you want to add can be deterministically known from the Axes before the acquisition even runs

Were there other tags besides the position label you want to add?

@nicost
Copy link
Member Author

nicost commented Jun 5, 2023

If you're just trying to name the positions, then it should be a matter of just adding the metadata you want based on the value of position axis within the image processor

The DataSink then also needs knowledge of the PositionList, something the implementation in MM currently does not have (and I have to shudder about passing it multiple PositionLists and assigning things correctly when running multiple MDAs concurrently). It makes all these things so much more complex.

Were there other tags besides the position label you want to add?

Acquisition number for multiple acquisition. Doing this by adding a virtual axis works, but is a complete hack. It seems quite obvious to me that an event will want to add metadata that should find their way to the image. Doing this by relying on image coordinates and adding them later after the image is received is a much more indirect, non- intuitive, and less flexible way.

@henrypinkard
Copy link
Member

If you're just trying to name the positions, then it should be a matter of just adding the metadata you want based on the value of position axis within the image processor

The DataSink then also needs knowledge of the PositionList, something the implementation in MM currently does not have (and I have to shudder about passing it multiple PositionLists and assigning things correctly when running multiple MDAs concurrently). It makes all these things so much more complex.

I don't understand why this would be. Since events and image processors both get add to the Acquisition, why does the DataSink need to be involved at all here? Is it because you're using the same datasink for multiple acquisitions?

Were there other tags besides the position label you want to add?

Acquisition number for multiple acquisition. Doing this by adding a virtual axis works, but is a complete hack.

To me this seems like that belongs in the summary metadata, rather than the image metadata, since it is specific to each acquisition, not any particular image within them.

Are you implementing multiple (logical) acquisitions through a single Acquisition object? I would think if you have multiple acquisitions you can have multiple instances of this class, and each one has its own image processor that adds any metadata specific to it.

It seems quite obvious to me that an event will want to add metadata that should find their way to the image. Doing this by relying on image coordinates and adding them later after the image is received is a much more indirect, non- intuitive, and less flexible way.

I don't disagree that this would be a useful feature, but since it seems to be tied up in the much larger issue of how metadata should be structured, I think an alternative implementation (especially one that works with the existing API) is strongly preferably. I think that disorganized metadata has been a long-running and critical issue with the Java layer of MM, because it has allowed requirements to leak through API boundaries, making it harder to re-use modules in different contexts and integrate new ones. Solving this is definitely a good goal in the long term, but since (I'm guessing) you'd like to get something up and running relatively soon, now is not the time to take that on. And doubling down on the implicit assumptions made by higher level MMStudio code may create more work/breaking changes when later when moving towards a more general purpose acquisition engine.

@nicost
Copy link
Member Author

nicost commented Jun 5, 2023

I don't understand why this would be. Since events and image processors both get add to the Acquisition, why does the DataSink need to be involved at all here? Is it because you're using the same datasink for multiple acquisitions?

I have no understanding of ImageProcessors in the context of AcqEngJ. How do they get attached, what are their inputs and outputs?

Acquisition number for multiple acquisition. Doing this by adding a virtual axis works, but is a complete hack.

To me this seems like that belongs in the summary metadata, rather than the image metadata, since it is specific to each acquisition, not any particular image within them.

Are you implementing multiple (logical) acquisitions through a single Acquisition object? I would think if you have multiple acquisitions you can have multiple instances of this class, and each one has its own image processor that adds any metadata specific to it.

Yes. They all share the same time information, i.e need to be carried out simultanuously, but data need to go to different DataSinks. The DataSink needs information (in the image metadata) to know where to direct the image. I do not know which class "this class" referred to.

I don't disagree that this would be a useful feature, but since it seems to be tied up in the much larger issue of how metadata should be structured, I think an alternative implementation (especially one that works with the existing API) is strongly preferably.

I do not see how this is tied up in a larger issue of metadata structure. TaggedImage provides metadata as key-value pairs. There is a lot enabled by giving Acquisition Events the ability to add key-value pairs to the TaggedImage acquired as a result of the Acquisition Event. This is not a Micro-Manager specific feature, but others will benefit as well. I am happy to see this implemented in other ways, but I see no solutions given the current API (plus, by now I have invested a decent amount of work in this, and may soon loose steam).

@henrypinkard
Copy link
Member

I have no understanding of ImageProcessors in the context of AcqEngJ. How do they get attached, what are their inputs and outputs?

You add the processor to the acquisition before starting it. Then create a processor like this and just add the tags you want and return it.

Yes. They all share the same time information, i.e need to be carried out simultanuously, but data need to go to different DataSinks. The DataSink needs information (in the image metadata) to know where to direct the image. I do not know which class "this class" referred to.

"this class" == Acquisition. If you're running multiple acquisitions, I would think you would want multiple instances of the Acquisition class running in parallel

I do not see how this is tied up in a larger issue of metadata structure. TaggedImage provides metadata as key-value pairs. There is a lot enabled by giving Acquisition Events the ability to add key-value pairs to the TaggedImage acquired as a result of the Acquisition Event. This is not a Micro-Manager specific feature, but others will benefit as well. I am happy to see this implemented in other ways, but I see no solutions given the current API (plus, by now I have invested a decent amount of work in this, and may soon loose steam).

It is tied up in the larger issue of metadata structure because if something is implemented in the event that allows tags to be passed in, and those tags end up at the top level of the metadata, then we cannot, say, move them to another level of the hierarchy (e.g. another JSON object under a "UserTags" key) rather than keeping them at the top level without breaking backwards compatibility later on. I strongly feel that dumping all metadata into the same top level is a bad way of organizing metadata in general, because it makes it harder to parse when there are many values, it leaves open the possibility of overriding required keys (which can be nightmare to debug), and it provides no indication to the user/developer which keys are required and which are optional. In contrast, when doing it through an image processor, where in the metadata hierarchy things are new keys are inserted is entirely in user code, so it won't change with changes to the API.

I'm sorry if my initial comment led you in the wrong direction, though I think it is a sunk cost at this point and we should not merge this as it currently is owing to the aforementioned concerns. If there are further challenges with doing this through a processor, I'm happy to help take a look

@nicost
Copy link
Member Author

nicost commented Jun 5, 2023

You add the processor to the acquisition before starting it. Then create a processor like this and just add the tags you want and return it.

This can never help discerning tagged images that have the same axis designations (as needed when running different channel/slice settings at multiple locations). It only can add tags based on the existing tags. This could add the position name, but it would be a very complex mechanism involving lots of code for something that is done in the PR with a few lines of code.

"this class" == Acquisition. If you're running multiple acquisitions, I would think you would want multiple instances of the Acquisition class running in parallel

I have no idea how that would work (i.e, how to run multiple instances of the Acquisition class at the same time, seems a recipe for loads of conflicts and unintended behaviors). It works fine if I am given the ability to add Metadata in and event. Your work around to add a bogus axis also works, but that feels like a hack that is formalized in this PR.

In contrast, when doing it through an image processor, where in the metadata hierarchy things are new keys are inserted is entirely in user code, so it won't change with changes to the API.

Same could be done by letting the user add a JSONObject in the event. If you like that better, happy to change it that way.

The current API forces me to use hacks, and writes lots of code passing data to places that have no need to know, which all could be easily done by giving the event the ability to transfer metadata into the image. Not allowing this, leads to loads of ugliness in the "user" code.

@henrypinkard
Copy link
Member

Same could be done by letting the user add a JSONObject in the event. If you like that better, happy to change it that way.

I don't see how this would solve the problem I am describing. I don't think we are progressing on this. Maybe it is better to discuss next time on zoom

@nicost
Copy link
Member Author

nicost commented Jun 6, 2023

OK, I am turning around on this. So the rule is: the Acquisition engine produces tagged images with unique axis designations (are duplicates ever allowed?). All metadata can always be deduced from those unique axis coordinates. Running multiple acquisition settings at the same time then becomes adding another axis (and the fact that we currently need to remove that acquisition number axis more points to shortcomings in our down stream code).

@henrypinkard
Copy link
Member

henrypinkard commented Jun 6, 2023

(Edit: Was this before I saw your previous comment)

Here's a potential solution that just occurred to me:

  1. Add "tags" to the event as you done, but make the added tags appear in a "Tags": {} JSON object within the metadata rather than the top level
  2. Use an image processor in your plugin code to extract them them and add them to the top level as you need

This would address the issues I've pointed out, while allowing you to use the workflow you're describing. Thoughts?

@henrypinkard
Copy link
Member

All metadata can always be deduced from those unique axis coordinates.

No I think you are right previously that adding a mechanism to add arbitrary other metadata from the event is helpful

Running multiple acquisition settings at the same time then becomes adding another axis (and the fact that we currently need to remove that acquisition number axis more points to shortcomings in our down stream code).

It's one way to do it. Adding custom metadata is another way. Running separate acquisitions in parallel is another (Ivan is currently doing this). Ideally the engine is flexible and capable enough to support all 3

@nicost
Copy link
Member Author

nicost commented Jun 6, 2023

OK. Just pushed code that will add the Tags as JSONObjects under the key "Tags".

B.t.w., no image processor needed, the DataSink gets the image and can do the "translation".

Let me know if we'll go this route, so that I can finish the MM part.

@henrypinkard
Copy link
Member

Yes, sounds good. I can take care of adding to the PM docs and tests.

2. Those test are nice, but need a complicated environment including Python to develop. I added test code in the Java layer to ensure that the serialization / deserialization works correctly. If you are set up to develop the Python test code, please add those lines needed to test tags.

For future reference, the environment is quite simple. It is, a python installation + pip install pycromanger. Then you can just run the test from the terminal. It is good to keep the tests up to date, as they are built into lots of automation of the build system and make sure components work as expected

@nicost
Copy link
Member Author

nicost commented Jun 6, 2023

Awesome, ready to merge.

Then you can just run the test from the terminal.

How? What may look easy to one person may be a big obstacle to someone else....

@henrypinkard henrypinkard merged commit 3173e2b into micro-manager:main Jun 6, 2023
@henrypinkard
Copy link
Member

While MM is running, and having enabled ZMQ server, paste in the code of the test directly, or make a temp file and paste it in there and run with python your_script.py.

The relevant test is:

    events = [
        {'axes': {'channel': 'DAPI'},
         'config_group': ['Channel', 'DAPI']},
        {'axes': {'view': 0},
         'camera': 'Camera'},
        {'axes': {'z': 4},
         'z': 123.34},
        {'axes': {'time': 0},
         'exposure': 100.0},
        {'axes': {'time': 1},
         'properties': [['DeviceName', 'PropertyName', 'PropertyValue']]},
        {'axes': {'z': 1},
         'stage_positions': [['ZDeviceName', 123.45]]},

       # TODO add a JSON serialized version of the event to test here
    ]

    def hook_fn(event):
        test_event = events.pop(0)
        print('comparing\n', event, '\n', test_event)
        assert (event == test_event)
        return None  # cancel the event

    with Acquisition(show_display=False, pre_hardware_hook_fn=hook_fn) as acq:
        # copy list of events to avoid popping from original
        events_copy = [e for e in events]
        for test_event in events:
            acq.acquire(test_event)

If things aren't working as expected, the assert statement will throw an exception. Otherwise it will complete with no errors

The tests all run automatically on PRs to PM, so you don't have to run them all yourself if adding a new one.

Or you can forgo this and just have them run in the github action on Pycromanager PRs, but if something doesn't work you won't find that out until having waited 10 min

Does that make sense? do you want to give it a try or shall I do it?

@nicost
Copy link
Member Author

nicost commented Jun 7, 2023

I ran this code and get:

Traceback (most recent call last):
  File "C:\projects\personal\pycro-manager\pycromanager\test\tmp.py", line 24, in <module>
    with Acquisition(show_display=False, pre_hardware_hook_fn=hook_fn) as acq:
NameError: name 'Acquisition' is not defined

Is there a file that I can run that works?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add image tags to events
2 participants