
Add Profile Measurement Class #199

Open · wants to merge 17 commits into main
Conversation

eloiseyang
Collaborator

This is the last portion of PR 143 that hasn't been added yet. I'm setting up this draft pull request as I work on pulling the measurement class up to speed with the changes to the fitting and image classes.

phys-cgarnier and others added 15 commits June 12, 2024 12:02
Added GaussianModel class using MethodBase
Added ProjectionFit class
Added tests for all the above including datasets
Separated from 8c85a56 on phys-cgarnier/lcls-tools/dev
Added ROI Classes for background subtraction and ROI cropping
Added tests and datasets for both classes
Separated from phys-cgarnier/lcls-tools/dev 8c85a56
Added measurement.py
Updated relevant files with new measurement structures
Added test for screen_beam_profile_measurement.py
Separated from phys-cgarnier/lcls-tools/dev 8c85a56
Added GaussianModel class using MethodBase
Added ProjectionFit class
Added tests for all the above including datasets
Separated from 8c85a56 on phys-cgarnier/lcls-tools/dev
…lving params, might have to remove priors still but its working
@eloiseyang
Collaborator Author

I think the tests should run now. I'll do my own review of this soon since this isn't my code. It would be nice to coordinate with Shamin too, since he's working on another measurement at the same time.

@eloiseyang
Collaborator Author

Don't be alarmed by the number of lines. The Jupyter notebook file contains an image, and GitHub encodes the image as text, resulting in a huge number of lines that aren't actually relevant.

@roussel-ryan roussel-ryan mentioned this pull request Nov 27, 2024
@nneveu nneveu assigned robnagler and unassigned robnagler Nov 27, 2024
@eloiseyang eloiseyang marked this pull request as ready for review November 27, 2024 19:16
Member

@nneveu nneveu left a comment

Thanks for submitting this @eloiseyang! Let's continue the discussion in comments.

Member

Does this need to be a notebook? I'd prefer not to have python notebooks on the repo if possible.

Collaborator Author

I honestly didn't look at it. I can remove it.

from pydantic import BaseModel, DirectoryPath


class Measurement(BaseModel, ABC):
"""
Member

Let's try the google style RadiaSoft recommended? https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html

Collaborator

BTW, this is already in use in other places in this repo. pykern.pkexample demonstrates in detail this style (and why in many cases).
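For reference, a minimal sketch of the Google docstring style being suggested, applied to a stripped-down stand-in for the measurement class (the class body and attribute names here are hypothetical, and pydantic/ABC are omitted to keep the sketch self-contained):

```python
class Measurement:
    """Base class for a beam measurement.

    Attributes:
        device_name (str): name of the device being measured.
    """

    device_name = "OTRS:HYPOTHETICAL"  # placeholder, not a real device name

    def measure(self, n_shots: int = 1) -> dict:
        """Take repeated measurements from the device.

        Args:
            n_shots (int): number of images to acquire. Defaults to 1.

        Returns:
            dict: the acquired images, keyed under "images".
        """
        # Placeholder body; a real implementation would talk to the device.
        return {"images": [None] * n_shots}
```

The Args/Returns sections render cleanly under sphinxcontrib-napoleon, which is what the linked example page documents.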

def measure(self, n_shots: int = 1) -> dict:
"""
Measurement function that takes in n_shots as argumentment, where n_shots is the number of
image profiles (is this what I want to call it?)
Member

argumentment -> argument. I think `images` is the right term here vs. profiles. We're taking n pictures of the beam.

is results for the image fitting process stored as a dictionary.
----
Currently under work, what do we want to do with the images?
Needs dump_controller
Member

Do you mean the save images / projections to file here?

import numpy as np


class ScreenBeamProfileMeasurement(Measurement):
Member

Still not sure beam is necessary in this name...or screen either? A Profile could be measured with a wire as well, but was the thought a wire profile measurement would be separate code?

Collaborator Author

I think this file was written with a screen profile in mind, but it could be generalized.

images = []
while len(images) < n_shots:
measurement = self.single_measure()
if len(measurement):
Member

If a measurement returns a bad image while len(image) < n_shots, there's no catch here? i.e. if the 3rd out of 10 image measurement returns nothing, it keeps going but would only return 9 good images? Or maybe I'm reading that wrong?

Collaborator Author

As the code is written right now, it will keep trying to grab an image until it has n images, regardless of the result. I assume this would only happen if something went wrong in CA. How are we handling CA errors in devices right now?

Member

Ah yes, I see what you mean. My holiday brain...
We're not doing any sophisticated error handling, just a few places with PV checks and attribute errors.
So it would cause problems if an EPICS error were returned. We need to look into that in future PRs.

Collaborator

It's always better to surface errors unless you know how to handle them. For example, there may be a correct read of an image (no exception), but the beam may not be on. It's unlikely someone wants a fit of a blank image so you'd write code to throw an exception. Since this is not being done, it's just fine to let any PV exceptions cascade to the higher level.
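If bounded retrying is ever wanted instead of the unconditional loop, a sketch along these lines (helper names hypothetical) would still let PV/CA exceptions cascade untouched while refusing to spin forever on empty results:

```python
def acquire_images(single_measure, n_shots: int = 1, max_retries: int = 3) -> list:
    """Collect n_shots measurements, tolerating a bounded number of empty results.

    Unlike a bare `while len(images) < n_shots` loop, this cannot hang forever
    if the device keeps returning nothing. Exceptions raised inside
    single_measure (e.g. PV errors) are deliberately not caught here; they
    cascade to the caller, who knows how to handle them.
    """
    images = []
    consecutive_failures = 0
    while len(images) < n_shots:
        measurement = single_measure()
        if measurement:
            images.append(measurement)
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= max_retries:
                raise RuntimeError(
                    f"gave up after {max_retries} consecutive empty measurements"
                )
    return images
```

The bound turns a silent infinite loop into a loud, catchable error, consistent with the "surface errors unless you know how to handle them" point above.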

final_results += [temp]

# what should I do with results now?
results["fits"] = final_results
Member

We'd want to be able to return and save to file associated w/ the images as you have in the next few lines.

Collaborator Author

We're using HDF5 to save, right? Is there an example of how we're saving within LCLS-Tools right now?

Member

There's a `_write_image_to_hdf5` function in the screen class, although this will need some more thought and will get cleaned up in the demo RadiaSoft is planning.

self.test_instantiate_pydantic_objects()

def create_test_image(self, size: tuple, center: list, radius: int):
# make img that is a circle in the center of the image with known
Member

I would prefer if we use a test file. That way functionality changes would not impact us. We would know we are always testing against the same image.

self.assertIsInstance(perform_measure, list)
self.assertIsInstance(perform_measure[0], dict)
if isinstance(self.gauss_model, GaussianModel):
for key, val in perform_measure[0].items():
Member

I'm not a fan of this...

Collaborator Author

Should we just remove the value checks? Regression checking the fitting algorithm could be a more complicated task, and this code doesn't really accomplish that I think.

Member

Sure, let's go with that and write an issue.

@nneveu
Member

nneveu commented Nov 27, 2024

Also, can you please pull in changes from main? Thanks

Collaborator Author

@eloiseyang eloiseyang left a comment

Looking at this PR again, I don't want to merge it. The Screen Beam Profile Measurement class has a lot of overlap with the "Profile Monitor GUI Model" class I will eventually include in LCLS-Tools. I think we can put more thought into a class that grabs, processes, and fits data all in one place. That kind of class should probably be broken up into multiple data classes as well.

I'd like to preserve this code in an example script. It's a good demonstration of how the projection fit class works.

"""
images = []
while len(images) < n_shots:
measurement = self.single_measure()
Collaborator Author

I would rather let this function take measurements as an input. The user of this function will probably want to store and manipulate the images on their own.

Collaborator

Agreed.

images += [measurement]

results = {"images": images}
if self.fit_profile:
Collaborator Author

I'd like to remove this if statement. This should just be a wrapper for the fitting function.

Collaborator

Indeed. If fit_profile is False, the function will fail with a NameError, because final_results will be undefined.
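A minimal sketch of one way to avoid that failure mode, with hypothetical names: define the results dict before the conditional so every code path sees it defined:

```python
def collect_results(images, fit_profile: bool, fit_fn=None) -> dict:
    """Assemble the results dict so it is valid whether or not fitting runs."""
    # Defined unconditionally: no NameError when fit_profile is False.
    results = {"images": images, "fits": []}
    if fit_profile:
        results["fits"] = [fit_fn(image) for image in images]
    return results
```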

projection_x = np.array(np.sum(ele["processed_image"], axis=0))
projection_y = np.array(np.sum(ele["processed_image"], axis=1))

for key, param in self.fitting_tool.fit_projection(
Collaborator Author

I don't love this syntax.

Collaborator

I think there's an impedance mismatch between the way the function is defined and the implementation, plus extra code for things that aren't yet implemented or don't actually happen (single_measure returning an empty dict).

Below I changed the function definition to yield a dict per image. An iterator in this case has a number of advantages, e.g. memory efficiency:

    def take_and_fit_images(self, n_shots: int = 1) -> dict:
        """Get `n_shots` from the screen and fit the processed images

        Args:
            n_shots (int): number of valid images to take [1]
        Yields:
            dict: `raw_image`, `processed_image`, and `fits`
        """

        def _fit(processed, axis):
            return self.fitting_tool.fit_projection(
                np.array(np.sum(processed, axis=axis))
            )

        for _ in range(n_shots):
            rv = self.single_measure()
            rv["fits"] = dict(
                {k: _fit(rv["processed_image"], v) for k, v in (("x", 0), ("y", 1))}
            )
            yield rv

If the shots all need to be taken at once (without fitting in between each shot), then the higher-level code should pass in the shots (per @eloiseyang's suggestion), controlling the speed, and this function could remain an iterator and would be even simpler:


return final_results

def single_measure(self) -> dict:
Collaborator Author

I don't think this function should be in this file.

Collaborator

@robnagler robnagler left a comment

Here are my comments. Hopefully they are useful.


@@ -0,0 +1,99 @@
from lcls_tools.common.devices.screen import Screen
Collaborator

I don't know the standard, but it's very nice to have a standardized header in a file. SLACTwin has a standard header with the SLAC FOSS license.

We also recommend using qualified imports with exceptions of course :).

One of the problems with deep hierarchies like this is that qualified imports become a problem. We avoid names like lcls_tools.common.measurements.measurement.Measurement. Typically a common package indicates the repo is getting too large. measurements.measurement.Measurement could be simplified to lcls_tools.measurements.Measurement, or maybe even measurements.Base. It used to be that the string representation of this class in Python was <class 'Base'>, which was not very informative in error messages. In recent versions of Python it's <class 'lcls_tools.measurements.Base'>, which supports this more logical and less redundant style of naming.
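As a small illustration of the qualified-import style, using only stdlib modules (a style sketch, not lcls_tools code):

```python
import collections
import json


def summarize(counts_json: str) -> collections.Counter:
    # json.loads and collections.Counter stay fully qualified, so the
    # origin of each name is visible at the call site, with no aliases
    # (np, sio, mod, ...) for the reader to decode.
    return collections.Counter(json.loads(counts_json))
```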

from lcls_tools.common.image.processing import ImageProcessor
from lcls_tools.common.data.fit.projection import ProjectionFit
from lcls_tools.common.measurements.measurement import Measurement
import numpy as np
Collaborator

Aliases cause cognitive overload (and namespace collisions sometimes). I realize that np is extremely common in scientific code (to save 3 letters). It's just that this style leads to other aliases which are less obvious: utils, sio, and mod.

results["fits"] = final_results

# no attribute dump controller
if self.save_data:
Collaborator

save_data is defined generally in Measurement, and only used here, where it does nothing (for now, I understand). When there's a plan for how to save data, add it as a parameter to this specific function. Until then, remove it. If you want a reminder, add a (Google-style) #TODO.



"""
images = []
while len(images) < n_shots:
Collaborator

If single_measure actually returned an empty dict (it doesn't), this has the potential for being an infinite loop if for whatever reason the image is always zero length. This might happen (guessing) if the device is off. It's hard for users to detect infinite loops (halting problem). As it is, the test should be removed.



self.radius = 50
self.size = (800, 800)
self.center = [400, 400]
self.extent = [300, 300]
Collaborator

Many of these are unused(?). With this many unnecessary intermediate variables (why on self?), this function is hard to read. Put the mocked objects in a separate function so the intent (test) of this function is clear.

self.assertIsInstance(perform_measure[0], dict)
if isinstance(self.gauss_model, GaussianModel):
for key, val in perform_measure[0].items():
if key == "amplitude_x":
Collaborator

Ideally, have these be exact values. Write the results to a file (e.g. json or yaml) and then diff that file with an expected output. You can use [ndiff] if the numbers are fuzzy. However, testing the fitting function itself is outside the scope of this module, so inserting a blank image, for example, would yield a precise value after the fit (0). You would only need to test, say, that mean_x and mean_y were both 0.
