Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow birefringence reconstruction for z-stacks #102

Closed
edyoshikun opened this issue Jan 6, 2023 · 6 comments · Fixed by mehta-lab/recOrder#322
Closed

Slow birefringence reconstruction for z-stacks #102

edyoshikun opened this issue Jan 6, 2023 · 6 comments · Fixed by mehta-lab/recOrder#322
Assignees

Comments

@edyoshikun
Copy link

The time to reconstruct z-stacks is not linear and quickly becomes an issue reconstructing >50 slice z-stacks. These stacks have shapes (5,61,2048,2048) for 5-state polarization, Z,Y, X.

The issue arises mainly in the reconstruct_qlipp_stokes() function where we get:
1 slice
Stokes Elapsed Time: 1.6s
Birefringence Elapsed Time: 0.4
5 slice
Stokes Elapsed Time: 10.2s
Birefringence Elapsed Time: 2s
10 slice
Stokes Elapsed Time: 24.6s
Birefringence Elapsed Time: 4.6s
61 slice
Stokes Elapsed Time: 210s
Birefringence Elapsed Time: 327s

#%%
from waveorder.io.reader import WaveorderReader
from waveorder.io.writer import WaveorderWriter
from recOrder.io.utils import load_bg
from recOrder.compute.reconstructions import (
    initialize_reconstructor,
    reconstruct_qlipp_stokes,
    reconstruct_qlipp_birefringence,
    reconstruct_phase3D,
)
from datetime import datetime
import numpy as np
import os
import sys
import multiprocessing as mp  

#debugging
import time

# %%
#Data paths
data_root = '/hpc/projects/comp_micro/rawdata/falcon/Ed/2022_12_09_zebrafish_GFP_ISG15_dtomat_MPEG'
# data_folder = 'infected_uninfected_fish_5'
data_folder = 'recOrderPluginSnap_0/multiposition_intoto_1'
dataset_folder = os.path.join(data_root,data_folder)
print('data:' + dataset_folder)
# Background folder name
bg_folder = os.path.join(data_root,'BG')
print('bg_folder:' + bg_folder)
# %%
#Setup Readers
reader = WaveorderReader(dataset_folder)
print(reader.shape)

t = 0
pos = 0
pol_data = reader.get_array(0)[t, :5,25:26]
# channels = ['state0','state1','state2','state3','state4','gfp','dtomato']
# pol_data = data[:5,25:30]
C, Z, Y, X  = pol_data.shape

# fluor_data = data[6:]
bg_data = load_bg(bg_folder, height= Y, width= X)

# %%
# Get first position
print("Start Reconstructor")
reconstructor_args = {
    "image_dim": (Y, X),
    "mag": 8.25,  # magnification   is 200/165
    "pixel_size_um": 3.45,  # pixel size in um
    "n_slices": Z,  # number of slices in z-stack
    "z_step_um": 5,  # z-step size in um
    "wavelength_nm": 532,
    "swing": 0.1,
    "calibration_scheme": "5-State",  # "4-State" or "5-State"
    "NA_obj": 0.55,  # numerical aperture of objective
    "NA_illu": 0.2,  # numerical aperture of condenser
    "n_obj_media": 1.0,  # refractive index of objective immersion media
    "pad_z": 5,  # slices to pad for phase reconstruction boundary artifacts
    "bg_correction": "local_fit",  # BG correction method: "None", "local_fit", "global"
    "mode": "3D",  # phase reconstruction mode, "2D" or "3D"
    "use_gpu": False,
    "gpu_id": 0,
}

reconstructor = initialize_reconstructor(
    pipeline="birefringence", **reconstructor_args
)

proj_folder =  '/hpc/projects/comp_micro/projects/zebrafish-infection/2012_12_09_zebrafish_GFPisg15_dtomato_mpeg'
timestamp = datetime.now().strftime("%Y_%m_%d_%H%M%S")
proj_folder = os.path.join(proj_folder, timestamp)
if not os.path.exists(proj_folder):
    os.mkdir(proj_folder)
output_folder = os.path.join(proj_folder, 'multiposition_intoto_1')
print(output_folder)

writer = WaveorderWriter(output_folder)

timestamp_short  = datetime.now().strftime("%Y%m%d")
writer.create_zarr_root(timestamp_short + "_birefringence_pos_" + str(pos))
    
# Reconstruct data Stokes w/ background correction
print("Begin stokes calculation")
start_time = time.time()
bg_stokes = reconstruct_qlipp_stokes(bg_data,reconstructor)
stokes = reconstruct_qlipp_stokes(pol_data, reconstructor,bg_stokes)
print(f'Stokes elapsed time:{time.time() - start_time}')
print(f"Shape of background corrected data Stokes: {np.shape(stokes)}")
# Reconstruct Birefringence from Stokes
# Shape of the output birefringence will be (C, Z, Y, X) where
# Channel order = Retardance [nm], Orientation [rad], Brightfield (S0), Degree of Polarization
print("Begin birefringence calculation")
bire_start_time = time.time()
birefringence = reconstruct_qlipp_birefringence(stokes, reconstructor)
birefringence[0] = (
    birefringence[0] / (2 * np.pi) * reconstructor_args["wavelength_nm"]
)
print(f'Bire elapsed time:{time.time() - bire_start_time}')
print(f'Total elapsed time:{time.time() - start_time}')
# %%

I am documenting as an issue for further deep debugging. Thank you @talonchandler for testing and confirming similar results.

@talonchandler
Copy link
Collaborator

Thanks for writing this up @edyoshikun. I agree that we should expect ~linear reconstruction times with increasing FOV size since this is a voxel-by-voxel reconstruction.

We'll need to do some debugging to figure out if this bottleneck is caused by:

  • saturating our compute resources? (I don't think we're filling RAM, but we may be filling a cache or other resource)
  • an accidental bottleneck somewhere in the birefringence code...an accidental non-linear step?

A "trivial" (but more work intensive) temporary workaround might be to split the large stacks into pieces across CPUs/jobs.

@edyoshikun
Copy link
Author

edyoshikun commented Jan 6, 2023

Ok, so thanks to @ziw-liu suggestion to run the individual z-planes separately. I used the multiprocess library to allocate 61 processes and got them down from 547s to 22s processing time. This is a short-term fix.

I used the cProfiler and snakeviz visualizer and this points out to Pol_recon and the Stokes_recon function.

Start Reconstructor
Initializing Reconstructor...
Finished Initializing Reconstructor (0.00 min)
/hpc/projects/comp_micro/projects/zebrafish-infection/2012_12_09_zebrafish_GFPisg15_dtomato_mpeg/2023_01_06_141529/multiposition_intoto_1
No existing directory found. Creating new directory at /hpc/projects/comp_micro/projects/zebrafish-infection/2012_12_09_zebrafish_GFPisg15_dtomato_mpeg/2023_01_06_141529/multiposition_intoto_1
Creating new zarr store at /hpc/projects/comp_micro/projects/zebrafish-infection/2012_12_09_zebrafish_GFPisg15_dtomato_mpeg/2023_01_06_141529/multiposition_intoto_1/20230106_birefringence_pos_0.zarr
Begin stokes calculation
Stokes elapsed time:9.410654783248901
Shape of background corrected data Stokes: (5, 5, 2048, 2048)
Begin birefringence calculation
Bire elapsed time:2.1207518577575684
Total elapsed time:11.531463384628296
         70 function calls in 9.411 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    9.411    9.411 reconstructions.py:259(reconstruct_qlipp_stokes)
        1    8.549    8.549    8.549    8.549 waveorder_reconstructor.py:1049(Stokes_recon)
        1    0.834    0.834    0.834    0.834 waveorder_reconstructor.py:1113(Stokes_transform)
        5    0.000    0.000    0.026    0.005 {built-in method numpy.core._multiarray_umath.implement_array_function}
        1    0.000    0.000    0.025    0.025 <__array_function__ internals>:177(copy)
        1    0.000    0.000    0.025    0.025 function_base.py:870(copy)
        1    0.025    0.025    0.025    0.025 {built-in method numpy.array}`
```

@mattersoflight
Copy link
Member

Thanks for logging this @edyoshikun. For deeper debugging and proper fix, you can profile both reconstruct_qlipp_stokes and reconstruct_qlipp_birefringence with snakeviz, which uses cProfile module of python.

@mattersoflight
Copy link
Member

mattersoflight commented Feb 17, 2023

Hi @edyoshikun , @talonchandler All the operations we apply to compute birefringence are straightforward with linear complexity (inverse instrument matrix, background correction, square roots).

My observations/ questions:

  • It makes sense that Stokes Elapsed Time increases linearly.
  • It doesn't make sense that Birefringence Elapsed Time increases exponentially.
  • Does reconstructor = initialize_reconstructor(pipeline="birefringence", **reconstructor_args) compute the phase OTF and operate on it anyway? If it does, that is a bug that can explain the exponential increase in compute time.
  • If you set "mode": "3D" parameter to 2D, it should not affect the birefringence. The parameter currently affects only phase reconstruction as far as I remember.

@edyoshikun
Copy link
Author

  • Yes, it doesn't make sense that by increasing the number of slices the code is exponentially slower.
  • The birefringence calculations are just linear as you mention and pixel to pixel, and I've checked the flags in recorder and waveorder related to the OTF calculations.
  • I had the dictionary of arguments such as mode:3D, pad_z,z_step,n_slice which are not needed for birefringence but left them in place as in theory they don't change the reconstruction pipeline and are used for QLIPP. I can take them out and take a closer look if that makes things different.

@talonchandler
Copy link
Collaborator

@edyoshikun you can try out recOrder #322 or you can replace
birefringence = reconstruct_qlipp_birefringence(stokes, reconstructor)
with
birefringence = reconstructor.Polarization_Recon(stokes) to skip all of the transposes and copies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants