Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Richardson Lucy Parallelization #237

Closed
wants to merge 27 commits into from
Closed

Conversation

avalluvan
Copy link

Code Modifications
Backend (updates to existing files)

  1. Image_deconvolution.py:
    New deconvolution algorithm class added to dictionary: “RLparallel”
    Parameter filepath is propagated through to deconvolution algorithm
  2. RichardsonLucy.py and RichardsonLucySimple.py:
    Tiny changes in init definition to accept parameter filepath string
    User Interface
    Define number of nodes/processors to use in parameter in config.yml deconvolution:parameter:numproc

RichardsonLucyParallel.py
Maintains standard user interface: initialize() and run_deconvolution()
No changes to initialize() call tree
Run_deconvolution() invokes the following in RichardsonLucyParallel.py:

  1. initialization() → writes event, bg model and response files to disk to prepare for parallel execution. Event and bg model are stored as dense p-vectors in .csv. Response file is stored as p⊗q matrix as .h5.
  2. iteration() → intended functionality has been disabled as MPI script cannot work within the framework. Need a shell call → subprocess.run call to RLparallelscript.py which includes all iteration steps.
  3. finalization() → removes files written to disk.

Employing the Parallel Code
Extremely simple:

  1. Change imagedeconvolution_parfile_gal_511keV.yml deconvolution:algorithm to "RLparallel."
  2. Specify the number of processors/nodes to use through imagedeconvolution_parfile_gal_511keV.yml deconvolution:parameter:numproc to an integer appropriate to your system.
    Right now, the tutorial notebook has not been explicitly modified as we do not have plans to deploy this with DC3. Nevertheless, please let me know if that is recommended.

Limitations

  1. No access to interactive parameter overrides yet
  2. Current method is yet to be tested on a supercomputer where environment variables are less flexible making subprocess calls less transferable
  3. This is an alpha version of parallel RL → No smoothing, bg normalization, etc.
  4. Results are not in the same format
    Model and delta_model at every iteration are saved in separate .csv files
    No direct methods to return result object without saving results to disk. Could potentially load results during finalization() step.
    Result format is primarily a product of input format (simple dense vectors vs histogram objects based implementation in serial code).

Next Steps

  1. Test implementation on UCSD supercomputer
  2. Incorporate RL optimizations in parallel version
  3. Add interactive parameter overrides to config.yml
  4. Move iteration loop in run_deconvolution() to respective algos (Thoughts?)

New files: RichardsonLucyParallel.py and RLparallelscript.py
Potentially modified files: dataIF_COSI_DC2.py
deconvolution_algorithm_base.py
image_deconvolution_data_interface_base.py
image_deconvolution.py
model_base.py
RichardsonLucySimple.py and RichardsonLucy.py were modified to include the propagation of the config file from the user facing image_deconvolution object to the respective deconvolution algorithms
@avalluvan avalluvan added enhancement New feature or request good first issue Good for newcomers labels Aug 29, 2024
@avalluvan avalluvan added this to the v4.0 - DC4 milestone Aug 29, 2024
@avalluvan avalluvan self-assigned this Aug 29, 2024
Copy link

codecov bot commented Aug 29, 2024

Codecov Report

Attention: Patch coverage is 12.01717% with 205 lines in your changes missing coverage. Please review.

Project coverage is 69.17%. Comparing base (d02846a) to head (3036a7a).

Files with missing lines Patch % Lines
cosipy/image_deconvolution/RLparallelscript.py 0.00% 156 Missing ⚠️
...sipy/image_deconvolution/RichardsonLucyParallel.py 28.98% 49 Missing ⚠️
Files with missing lines Coverage Δ
cosipy/image_deconvolution/RichardsonLucy.py 96.03% <100.00%> (ø)
cosipy/image_deconvolution/RichardsonLucySimple.py 94.33% <100.00%> (ø)
cosipy/image_deconvolution/image_deconvolution.py 96.15% <100.00%> (+0.19%) ⬆️
...sipy/image_deconvolution/RichardsonLucyParallel.py 28.98% <28.98%> (ø)
cosipy/image_deconvolution/RLparallelscript.py 0.00% <0.00%> (ø)

@hiyoneda
Copy link
Contributor

Hi, Anaya. Thank you for submitting this PR. I have reviewed the changes and noticed a few issues that I want to discuss with you.

  1. Currently, most of the essential parts of the RL algorithm are performed in RLparallelscript.py. I am concerned about this because it may reduce the flexibility to maintain the code. In actual image analysis, it is very likely to use several different modified RL algorithms, such as standard RL, accelerated RL, MAP RL, maximum entropy, etc, and compare the results. So, the main purpose of DeconvolutionAlgorithmBase is to allow us to use these different algorithms in the same manner to make things easier. But with RLparallelscript.py, we have to duplicate these algorithm classes for the parallel calculation.

  2. The data handling of the response matrix, like preparing intermediate files, is performed in RichardsonLucyParallel.py. I suggest separating these parts from the algorithm classes because I need to modify the algorithm classes every time the response format changes. As for now, we assume that the response matrix is a single matrix, but we may use a different format (neural network, matrix with different parameterization, a combination of several matrices, etc.). So, I prefer to prepare a child class for the data interface class and hide such a data handling part there. Ideally, the parallel calculation can also be implemented in the data interface. I think that such implementation would also be a good first step in thinking about parallel calculation in another case, like model fitting.

Can you please consider these issues first and tell me the potential challenges for them? If you agree with them, we can discuss how to address them together.

@avalluvan
Copy link
Author

Hi Hiroki,

  1. As we have discussed before, it might not be possible to invoke an MPI-based script without performing a shell execute as it requires an mpiexec python <filename> call as opposed to a python <filename> call. MPI-based code could potentially be run on just the master node, keeping all the child nodes waiting, unless a specific, parallel-code needs to be executed. I think this would require some changes from the ground up such that MPI initializations are a part of the root modules being called (such as init.py in the cosipy/cosipy/* directory.). However, I am not sure how that will interfere (and if there are any workarounds) with Jupyter Notebooks. I can take a look at these aspects and revert back.
  2. I agree on this. Now may be a good time to revisit these sections and generalize them, and I think I will be able to do this myself.
    Thanks for your inputs and we can also continue these discussions in future meetings.

@israelmcmc
Copy link
Collaborator

@avalluvan It took me a while (sorry!), but I finally got to look at your code in detail. Good job! I didn't have experience with MPI and I got to learn a lot from your work :)

A couple of things:

  1. I truncated your branch to keep only the MPI RL stuff --i.e. excluding the new response handling-- and open a new PR with it (Auxiliary PR: Richardson Lucy Parallelization  #271). It's good to keep one feature per PR. I also removed the parameter_file input (since the parameter Configurator class already keeps track of the file path) and removed some other small changes that were not needed by the parallel code (just to make reviewing easier). Please take a look.
  2. I created this script to illustrate @hiyoneda 's point (which I agree with). It's the parallel versions of this other one that I created to test the current version of the imaging module (PR Major updates on the image deconvolution modules #188). You can execute it with e.g. mpiexec -n 4 python toy_ParallelImageDeconvolutionDataInterfaceBase.py and it will plot the deconvolved model (from random data generated internally). Instead of re-implementing the RichardsonLucy class, you create a ImageDeconvolutionDataInterfaceBase child class that performs the computation in parallel. Instead of spawning multiple mpiexec internally, you just let the user do, as usual when working with clusters --it's OK if this doesn't work within a Jupyter notebook. This allows you to use all current and future improvements to the deconvolution algorithm(s) without duplicating code. Please take a look and let us know what you think.

@israelmcmc israelmcmc self-assigned this Dec 5, 2024
@avalluvan
Copy link
Author

avalluvan commented Dec 9, 2024

Hi Israel. This pull request is unlikely to include any response reparametrization code. However, my response reparametrization pull request unfortunately contains this pull request's content along with response handling modifications.

Thanks for pointing out the parameter.absolute_path feature.

I will work with the script that you have provided and get back on this.

@israelmcmc
Copy link
Collaborator

Thanks @avalluvan.

To get a branch that contains only the detector response handling and not the MPI RL stuff look into git rebase, in particular around this section:

Screenshot 2024-12-09 at 12 08 40 PM

In your case "Topic A" is the parallelization stuff, and "Topic B" is the respect handling stuff.

@avalluvan
Copy link
Author

Thanks for the inputs. I have significantly restructured the code and will open a new pull request for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants