Support marker timestamps (for FLIM, etc.) #44
Comments
@marktsuchida: This sounds like a reasonable suggestion.
Now the main issue is that @tritemio has moved to greener pastures and probably does not support photon-hdf5 anymore. Maybe @talaurence can help with that? |
I agree that these are reasonable suggestions, and getting them done is aligned with my current research work. I noticed that the original post is a few months old. Let us know if you remain interested in pursuing these additions. |
Thanks. I still have an interest in this happening, but I'm not sure when I will be able to get back to it. Happy to help if the timeframe is not that urgent. @smXplorer: I'm not 100% sure I understood your question about unused bits, but the devices I'm familiar with (or at least BH boards, which are fresher in my memory) do not have the ability to turn off any of the fixed number of (hardware) marker input lines. In other words, the bits in the vendor native format always map directly to the hardware input terminals (they may be scattered in a weird order across a 32-bit record, so that requires reordering). So, for a device that has 4 marker lines (0...3), we would map those to the 4 (= My thinking is that because we need additional metadata (to be specified separately; the "part 2" as I called it above) anyway to describe how the marker bits map to application-specific interpretations (pixel clock, line clock, etc., if the application is FLIM), it is not important how the bits are ordered and it is okay if (a reasonable number of) unused bits are included. Users of the file format could even choose to map to bits in a way that does not correspond to hardware lines (say, in order to follow some local standard), although personally I would prefer to minimize such transformations in my own work. This does mean that a specification for that additional metadata is needed before we can fully describe a FLIM experiment, though. |
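As a rough illustration of the reordering step described in the comment above, here is a minimal sketch that gathers marker bits scattered across a 32-bit vendor record into a compact pattern where bit i corresponds to hardware marker line i. The bit positions in `VENDOR_BIT_FOR_MARKER` and the function name are invented for the example and do not describe any real device format.

```python
# Hypothetical vendor layout: which bit of the 32-bit record carries each
# hardware marker line. These positions are made up for illustration only.
VENDOR_BIT_FOR_MARKER = (12, 13, 28, 29)  # marker 0..3


def extract_marker_bits(record: int) -> int:
    """Gather the scattered marker bits into a compact bit pattern.

    Bit i of the result is set when hardware marker line i is set in the record.
    """
    packed = 0
    for marker, vendor_bit in enumerate(VENDOR_BIT_FOR_MARKER):
        if (record >> vendor_bit) & 1:
            packed |= 1 << marker
    return packed


# Markers 0 and 3 set, plus unrelated low bits that must be ignored.
record = (1 << 12) | (1 << 29) | 0x3FF
print(format(extract_marker_bits(record), "04b"))  # 1001
```

Because every hardware line keeps its own bit, a line that is never used simply stays 0 in every element.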
@marktsuchida Sorry if I wasn't clear (and I wasn't). I meant to ask whether you were proposing a "flexible" scheme where, if, say, 3 markers are needed for an app, bit 0, 1 and 2 will be used, no matter whether the vendor uses, say, bit 1, 2 and 4. If you wanted to stick to the vendor's choice, you would have "unused" bits, but a lazy reader would be able to always assume, say, that bit 1 is always this marker, bit 2 that marker and bit 4 that other marker, without having to bother reading the definition of the markers (i.e. assuming that it is just the vendor's convention). The question comes from the discussion we had in the past about detector ID and as a matter of fact, re-reading that section of the specs: https://photon-hdf5.readthedocs.io/en/0.5.dev/phdata.html#setup-detectors-group, I realize that these "markers" were included in the discussion. In other words, a "marker" is a "signal" in the corresponding "detector". Please read that section and let me know if you understand it the way I do. |
@smXplorer Regarding the detector ID section, please see my comment preceding this issue at #41 (comment). In summary, I feel that storing markers as a special case of a detector is inconvenient at best. In this context I need to reiterate my earlier question about whether timestamps are monotonically increasing, or just monotonically non-decreasing. If I'm not mistaken, if detector 0 and detector 1 see a photon at the same time (i.e., within the timestamp counter period), the same timestamp is repeated in I think that probably works well enough for photon timestamps (though a format guaranteeing strict monotonic increase would have been nice), but a similar scheme for markers would make it harder to write correct code that assigns photons to pixels, because the order of markers stored in a file may not be the order in which they need to be processed. And it gets even more complicated if the same array ( Getting back to your first question about app-based bit assignment -- are you proposing something like bit 0 = pixel clock, bit 1 = line clock, and bit 2 = frame clock? Aside from the fact that you could still have unused bits (because any 1 of those 3 clocks is sufficient for recovering a FLIM image, and not all setups have all 3), the difficulty I see is that then it becomes hard for users to use this format to store additional or non-standard markers that are not (yet) provided by the format specification. Maybe they need to record the timing of some sort of experimental stimulus, or they are scanning in a non-raster pattern. Should they use bits 3 and above for that? But then the specification cannot evolve to introduce new standard markers in the future. If we use separate metadata to map the bits, non-standard markers could just be left unmapped (or maybe annotated with a human-readable string). Note that I don't care if the bits follow any vendor convention. 
Which hardware line is used for pixel/line/other clock is entirely up to the user, at least outside of very specific cases (even though the BH Handbook makes it hard to see this). I just want the file format to be capable of storing all of the raw data even before any interpretation or rearrangement takes place. Among other things, this can help a lot while setting up an imaging system or troubleshooting acquisition software, because you can first record and save the data and then look at all of the markers at a glance. If the file format would only store the markers you tell it to, and the result is wrong, then you need to figure out if you misconfigured or the software is buggy or the cables are connected incorrectly, and the file format fails to be a tool in the process. That seems like a lost opportunity. |
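The idea of keeping raw marker bits and describing them through separate metadata could look roughly like the sketch below. The `MARKER_ROLES` mapping and the role names are assumptions made for illustration, not part of any agreed specification. Markers missing from the mapping are still reported, just without an interpretation.

```python
# Hypothetical mapping from marker number to an application-level role.
# Markers missing from the mapping are kept, just not interpreted.
MARKER_ROLES = {0: "pixel", 1: "line", 2: "frame"}


def describe_markers(bits: int, num_bits: int, roles=MARKER_ROLES) -> list[str]:
    """List every marker set in `bits`, labelling unmapped ones explicitly."""
    labels = []
    for marker in range(num_bits):
        if bits & (1 << marker):
            labels.append(roles.get(marker, f"unmapped marker {marker}"))
    return labels


print(describe_markers(0b1010, num_bits=4))  # ['line', 'unmapped marker 3']
```

With this arrangement a future standard role only adds an entry to the mapping, and the stored bits do not need to change.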
@marktsuchida This is correct. Since the whole idea of this file format is to be as general as possible without bogging down users with too many arcane rules, adding a different type of data stream for those hardware markers sounds a bit counter-productive to me. On the contrary, it would seem to me that it is convenient to have all the timestamps ordered within each photon_data group, the task of the reader being to dispatch them into separate buckets before processing them to their heart's content (rather than having to check two different types of arrays to figure out their respective relation). I am not quite sure why having one timestamp tagged as "pixel", then "line" is such an impediment (it will generally not even be associated with a photon).
I am leaving this on the back-burner at this time, since it doesn't apply if markers are just one type of 'detector'.
I understand your concern. If I understand you correctly, you would want to have a quick way of looking only at the tags of interest. I can offer you a simple solution within the current framework: create a photon_data group that only includes those tags, no photon timestamps. Would that work? |
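The "dispatch into buckets" step mentioned in this comment might look like the following minimal sketch. It assumes markers are stored as extra detector IDs alongside photons in one time-ordered stream. The specific ID values (0 and 1 for photon detectors, 8 and 9 for pixel and line clocks) are invented for the example.

```python
from collections import defaultdict

# Assumption for this sketch: IDs 0 and 1 are photon detectors and
# IDs 8 and 9 are marker inputs (pixel and line clock). Made-up values.
timestamps = [10, 12, 12, 15, 20, 20, 23]
detectors = [0, 8, 1, 0, 9, 8, 1]


def dispatch(timestamps, detectors):
    """Split one time-ordered stream into per-ID lists of timestamps."""
    buckets = defaultdict(list)
    for t, d in zip(timestamps, detectors):
        buckets[d].append(t)
    return dict(buckets)


buckets = dispatch(timestamps, detectors)
print(buckets[8])  # [12, 20]
```

Each bucket stays sorted because the input stream is, so marker timestamps can be processed on their own afterwards.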
@smXplorer Apologies for the long delay. I think you have convinced me that it's better to put marker timestamps in the same array as the photon timestamps. I agree that the undefined ordering within elements at the same timestamp is a minor annoyance at most, and can be dealt with cleanly (esp. if the spec makes it clear that this is a requirement for readers). Another issue was with the fact that I guess then it's a matter of extending
No, they will not (in general) be in any particular order in the original data. But that's okay -- once it is made explicit that readers are responsible for scanning ahead to look at all elements with the same timestamp, it is no longer a problem. I think the reason I was bothered with this had more to do with the fact that the spec never mentions what happens when timestamps from different detectors collide; it wasn't initially clear to me that this was even allowed, because devices with a single TAC or TDC don't produce such data even if multiplexed; yet for markers it is crucial. (I was also operating under the general notion that formats that only allow a single representation for any given data are better, but that can be traded off.)
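A reader that "scans ahead" over all elements sharing a timestamp can be sketched with `itertools.groupby`, assuming only that timestamps are non-decreasing (so equal values are adjacent). The ID values are hypothetical. Sorting the IDs within each group is one simple way to make processing independent of the order in which colliding events happen to be stored.

```python
from itertools import groupby


def events_by_timestamp(timestamps, ids):
    """Yield (timestamp, sorted IDs) for each run of equal timestamps.

    Assumes timestamps are non-decreasing, so equal values are adjacent.
    """
    for t, run in groupby(zip(timestamps, ids), key=lambda pair: pair[0]):
        yield t, sorted(i for _, i in run)


timestamps = [5, 9, 9, 9, 14]
ids = [0, 9, 1, 8, 0]  # hypothetical: 8 = pixel marker, 9 = line marker
for t, group in events_by_timestamp(timestamps, ids):
    print(t, group)
```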
Not particularly. My main concern is whether one can save all of the raw data at hand, even when some of it is supposedly not used and/or the interpretation for some/all of it is not (yet) available. I'm not saying that people should in general write files that don't record which marker is what; my point is that it should be possible to do so, or else the format will be harder to use for developing new applications and troubleshooting work-in-progress systems, or to salvage data saved by misconfigured software. This can be cleanly addressed under a scheme where markers are stored in For example, there could be |
Of course
I am not following.
I was mulling over that after our previous exchange, trying to see whether we could dispense with defining an additional specification, but I haven't been able to come up with anything, so I would tend to agree about the need to define a new category in the
I have no objection to the above suggestion of field names, but they might be either overly general ( @talaurence maybe can chip in. |
@smXplorer I don't feel super strongly about zeroing and/or indicating unused elements of With that out of the way: The field names that I gave as an example were really meant just as an example: I wanted to convey the basic structure, not the names. I like the names containing |
The point is that zero (or any other value for that matter) is a perfectly valid As far as people trying to hijack the format to pass on proprietary or undocumented data, I am afraid that can't be prevented... Any suggestion to improve the
The denomination chn made sense for physical detectors collecting photons post "filtering" by some optics part. I am not so sure it conveys any meaning in the case of "markers". In fact, it might result in confusion instead. in the Note that while optional, the |
@smXplorer Thanks. I like the I cannot agree more that all relevant information in Note that I'm viewing this from the perspective of writing HDF5 files using my own code, not phconvert, because in my work the data will (hopefully) go straight from the device to a Photon-HDF5 file, without using vendor file formats for temporary storage. (Also the programming language is not Python.) In this setting, if the file format allows me to store "unused" or "uninterpreted" markers, it's extremely useful for troubleshooting (either during development or when setting up a new system). If the format doesn't allow this, then I would have no choice other than to come up with my own format (or use the vendor format) to save "raw" data, which seems unfortunate. Of course, once the software is debugged and configured, the software should correctly record the correspondence between marker number and pixel/line/frame (but might still record a 4th, unused marker that may be useful in forensic examination). In other words, I'm trying to make the point that this file format can start being useful even before one gets to the point of being able to record scientific data from successful experiments (and at least some of us spend more time building microscopes than acquiring data on them). I'm also fine with your scheme of having a string array I cannot, in the context of FLIM, think of any values for the elements of that field other than So unless other people have other ideas, I would be happy with only standardizing Moving on, since we are already talking about FLIM-specific markers, in order to completely describe a FLIM experiment there needs to be additional metadata in
These are central to FLIM and, among other things, allow FLIM image recovery from any dataset containing at least one of pixel/line/frame markers. Raster width and height should most certainly be required for FLIM datasets (only a data recovery tool should have to deal with datasets that don't record raster size). The pixel period (aka pixel time, but period is less ambiguous) should be more or less required, though images can be recovered without it if the Most datasets will have some redundancy of information between the marker timestamps, periods, and raster size, and there is more than one (nearly but not strictly equivalent) way to assign photons to pixels given such data. Most readers will only implement one or a few strategies, and they should simply check that the markers and parameters required for their operation are present and otherwise reject the file or try another strategy. A couple of additional fields that are not required for constructing FLIM images but might be worth including are
|
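To make the pixel-assignment discussion above concrete, here is a minimal sketch of one possible strategy: given line-marker timestamps, a pixel period, and the raster width, each photon is assigned to the most recent line and to a column computed from the elapsed time. This is only one of the nearly equivalent strategies mentioned. It assumes the line marker coincides with the start of the first pixel (no marker time offset) and that all values share the timestamp unit. The function and parameter names are hypothetical.

```python
from bisect import bisect_right


def assign_pixels(photon_times, line_starts, pixel_period, width):
    """Return (row, column) per photon, or None when outside any line.

    Assumes line_starts is sorted and that a photon belongs to the most
    recent line start at or before its timestamp.
    """
    result = []
    for t in photon_times:
        row = bisect_right(line_starts, t) - 1
        if row < 0:
            result.append(None)  # before the first line marker
            continue
        col = (t - line_starts[row]) // pixel_period
        # Photons past the last pixel of the line (e.g. retrace) are dropped.
        result.append((row, col) if col < width else None)
    return result


line_starts = [100, 200]  # line marker timestamps
photons = [50, 100, 125, 149, 150, 215]
print(assign_pixels(photons, line_starts, pixel_period=10, width=5))
# [None, (0, 0), (0, 2), (0, 4), None, (1, 1)]
```

A reader built around pixel or frame markers instead would need different inputs, which is why checking that the required markers and parameters are present comes first.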
@marktsuchida Using photon-HDF5 (ph5) doesn't preclude you from developing your alternate or intermediate format. It was never designed to be as flexible as a programming language. It is here to exchange data in a reproducible way. As a matter of fact, we are using an intermediate hdf5 format to store raw data, which Re: additional @talaurence Any comments/suggestions? |
@smXplorer Sounds good; I think that is a good plan, especially seeing that I will start working on a PR so that we have something more concrete to hammer on (it will probably take me a few weeks). |
Based on discussion in #41, here is a proposal for supporting "marker" events generated by TCSPC hardware (sometimes called external markers). Among other things, this will allow storing all raw data needed for FLIM and related experiments.
This can be considered a "part 1" for supporting FLIM data. A "part 2" (not included here) would need to specify additional metadata fields to fully describe a FLIM experiment (such as: pixel rate, marker-to-scan-clock correspondence, and marker time offsets), but we first need a specification for storing the marker data itself.
Proposed additions:
A photon dataset can have associated marker events. Markers are typically recorded by TCSPC hardware based on external (or internal) trigger or clock signals, such as (but not limited to) raster scan pixel/line/frame clock signals. Marker events share the same time axis with photon events. If multiple marker channels are used, they are known as marker 0, marker 1, etc.
Add group /photon_data/markers_specs/.
Add /photon_data/markers_specs/marker_num_bits, an integer, which must be positive. Required if storing marker timestamps.
Add dataset /photon_data/marker_timestamps, an array of (typically) int64. This array contains the timestamps at which marker events occurred. Units are the same as the photon timestamps (/photon_data/timestamps_specs/timestamps_unit). Values must increase monotonically. The length of this array is not related to the length of timestamps, nanotimes, etc.
Add dataset /photon_data/marker_bits, an array of integers (typically uint8). The array must have the same length as marker_timestamps. Each element is the bit pattern of the marker events recorded at the corresponding timestamp. The least significant bit corresponds to marker 0. Multiple bits may be set if more than one marker was recorded at the same timestamp. Elements must not be 0, and only the N least significant bits may be non-zero, where N is the value of marker_num_bits. Required if storing marker timestamps and marker_num_bits is greater than 1.
For multi-spot data, each of the /photon_data[n]/ groups can contain marker timestamps as specified above.
Points that may need discussion:
How to handle multiple markers at the same timestamp. We either record a single bitmask (as proposed above) and forbid duplicate timestamps, or repeat the same timestamp for each marker, in which case the marker array can use marker number instead of a bitmask. /photon_data/timestamps needs to be monotonically increasing or monotonically non-decreasing. If multiple photons (obviously from different detectors) at the same timestamp are currently allowed (can anyone clarify?), there may be an argument for using a similar format for markers, just for the sake of uniformity.
How to handle markers recorded by multiple (synchronized) TCSPC devices (for single spot), such as a Becker & Hickl multi-module system. It's probably unusual to need to record markers from more than one module even with such a setup, but should the need arise, they can be combined into a single packed bitmask.
Are there use cases for markers in multi-spot measurements? If so, will each spot have its own markers or will there be common markers for all spots? If the latter is true, it may be desirable to specify a way to store common markers outside of the /photon_data[n]/ groups. Personally, I think this can be left unsupported (requiring duplication of the marker timestamps for each spot) until a concrete use case arises.
I'm prepared to create a pull request if there is interest and agreement on the format.
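The constraints proposed above for marker_timestamps and marker_bits can be sketched as a small checker, together with a conversion from the bitmask form to the alternative "one entry per marker" form raised in the first discussion point. This is a minimal sketch on plain Python lists rather than HDF5 datasets. The function names are hypothetical, and "increase monotonically" is read here as strictly increasing, which is an assumption based on the proposal forbidding duplicate timestamps.

```python
def validate_markers(marker_timestamps, marker_bits, marker_num_bits):
    """Raise ValueError if the arrays break the constraints proposed above."""
    if marker_num_bits < 1:
        raise ValueError("marker_num_bits must be positive")
    if len(marker_timestamps) != len(marker_bits):
        raise ValueError("marker_bits and marker_timestamps differ in length")
    allowed = (1 << marker_num_bits) - 1
    previous = None
    for t, bits in zip(marker_timestamps, marker_bits):
        # Assumption: duplicates are forbidden, so require a strict increase.
        if previous is not None and t <= previous:
            raise ValueError(f"timestamp {t} does not increase")
        if bits == 0 or bits & ~allowed:
            raise ValueError(f"invalid bit pattern {bits:#b} at timestamp {t}")
        previous = t


def expand_markers(marker_timestamps, marker_bits, marker_num_bits):
    """Convert bitmask form to one (timestamp, marker number) pair per event."""
    return [(t, m)
            for t, bits in zip(marker_timestamps, marker_bits)
            for m in range(marker_num_bits) if bits & (1 << m)]


ts, bits = [100, 250, 400], [0b001, 0b011, 0b100]
validate_markers(ts, bits, marker_num_bits=3)
print(expand_markers(ts, bits, 3))  # [(100, 0), (250, 0), (250, 1), (400, 2)]
```

The expanded form repeats timestamp 250 once per marker, which is the representation that would use marker numbers instead of a bitmask.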