Improvements to NonStationaryConvolve2D #466

Open
mrava87 opened this issue Jan 17, 2023 · 7 comments
Comments

mrava87 (Collaborator) commented Jan 17, 2023

This issue collects a few improvements to NonStationaryConvolve2D raised in #465 (plus a couple of others):

  • Refactor bilinear interpolation
    The CUDA bilinear interpolation is identical to the CPU one, so it makes sense to wrap the shared code in a single function. This can be done by writing a pure Python function and then deriving a device function from it. For example:

```python
from numba import cuda

# CPU
def fun(a, b):
    return a + b

# GPU
dev_fun = cuda.jit(device=True)(fun)
```
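To make the idea concrete, here is a minimal, hypothetical sketch of such a shared kernel (names like `bilinear_weights` are illustrative, not the actual PyLops code): a pure Python function that could equally be jitted with `numba.jit` or turned into a device function with `cuda.jit(device=True)`.

```python
# Hypothetical shared bilinear-interpolation kernel: indices of the four
# surrounding filters and their bilinear weights. Pure Python, so the
# same source can be reused on CPU and GPU.

def bilinear_weights(x, z, ox, dx, oz, dz, nx, nz):
    """Return the lower-left filter indices and the four bilinear weights."""
    fx = (x - ox) / dx
    fz = (z - oz) / dz
    ix = min(max(int(fx), 0), nx - 2)   # clamp to the filter grid
    iz = min(max(int(fz), 0), nz - 2)
    wx = fx - ix
    wz = fz - iz
    return ix, iz, (1 - wx) * (1 - wz), (1 - wx) * wz, wx * (1 - wz), wx * wz

# CPU (plain Python/NumPy):
#   interp_cpu = bilinear_weights
# CPU (jitted), if numba is available:
#   interp_numba = numba.jit(nopython=True)(bilinear_weights)
# GPU device function:
#   interp_cuda = numba.cuda.jit(device=True)(bilinear_weights)

print(bilinear_weights(1.5, 0.5, 0.0, 1.0, 0.0, 1.0, 4, 4))
# (1, 0, 0.25, 0.25, 0.25, 0.25)
```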
  • Move CPU functions to an auxiliary file
    Since the GPU functions already live in a separate file, and given the refactor above, it makes sense to move the CPU functions to a separate file as well.

  • Improve thread launch parameters and iteration model
    Currently, there is a limit on the model size, imposed by the fixed number of blocks and threads in the kernel call. There are two approaches to fixing this. The first is to make the total number of threads depend on the model: for example, if the model is 1000 x 1000, launch enough blocks to cover 1024 x 1024 threads. A problem appears for very large models, however: the number of threads required exceeds what the GPU can provide.
    Grid-stride looping helps here, as it allows arbitrarily sized models to be processed by a smaller, fixed number of threads: launch 1024 x 1024 threads and a model of any size can be processed. The issue with naive grid-stride looping is that the thread count may be far from optimal; for example, a 32 x 32 model would still launch 1024 x 1024 threads.
    A solution is to combine both techniques: use an input-dependent number of blocks/threads (rounded to powers of 2), but cap the total number of threads launched. The cap could even be set per GPU, but it is probably fine to hardcode a reasonable number derived from the MAX_GRID_DIM_X attribute (see here).
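The combined strategy can be sketched as follows (helper names are illustrative, and the grid-stride loop is simulated on the CPU for clarity; a real numba.cuda kernel would use `cuda.grid(1)` as the start index and `cuda.gridsize(1)` as the stride):

```python
# Sketch: input-dependent thread count rounded up to a power of two and
# capped, with a grid-stride loop covering whatever the launch leaves out.

def next_pow2(n):
    """Smallest power of two >= n."""
    return 1 << max(n - 1, 0).bit_length()

def launch_threads(n, cap=1 << 20):
    """Total threads to launch for an n-element axis, capped (e.g., by a
    value derived from the device's MAX_GRID_DIM_X attribute)."""
    return min(next_pow2(n), cap)

def grid_stride_visit(n, nthreads):
    """CPU simulation of the indices each thread visits in a grid-stride
    loop over an n-element axis."""
    visited = []
    for tid in range(nthreads):  # one pass per launched thread
        i = tid
        while i < n:             # stride by the total thread count
            visited.append(i)
            i += nthreads
    return visited

# A 1000-element axis launches 1024 threads; a 32-element axis only 32.
print(launch_threads(1000), launch_threads(32))  # 1024 32
# Every element is covered even when n exceeds the number of threads.
print(sorted(grid_stride_visit(10, 4)) == list(range(10)))  # True
```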

  • Create an operator that takes the filters as input
    Currently, our operator takes a generic 2D signal as input and the filters as a parameter. However, an alternative operator that takes the 2D signal as a parameter and the filters as input is desirable, as it could be used to estimate the non-stationary filters that best match a given dataset and input signal. This is roughly equivalent to pylops.avo.prestack.PrestackWaveletModelling in the 1D case. I suggest making a completely new operator and calling it NonStationaryWavelets2D or NonStationaryFilters2D.
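The role swap can be sketched in the simpler 1D stationary setting (class and method names below are hypothetical, not PyLops API): the signal is fixed at construction time and the filter becomes the model, so inverting the operator estimates the filter that best matches given data.

```python
import numpy as np

class WaveletEstimation1D:
    """Hypothetical sketch: full 1D convolution with the signal fixed as
    a parameter and the filter acting as the model."""

    def __init__(self, x, nh):
        self.x = np.asarray(x, dtype=float)  # fixed input signal
        self.nh = nh                         # filter length (model size)

    def _matvec(self, h):
        # model (filter) -> data
        return np.convolve(self.x, h, mode="full")

    def _rmatvec(self, y):
        # data -> model: valid correlation of the data with the signal
        return np.correlate(y, self.x, mode="valid")

# Dot test: <A h, y> == <h, A' y>
rng = np.random.default_rng(0)
Op = WaveletEstimation1D(rng.standard_normal(8), nh=3)
h = rng.standard_normal(3)
y = rng.standard_normal(8 + 3 - 1)
print(np.isclose(np.dot(Op._matvec(h), y), np.dot(h, Op._rmatvec(y))))  # True
```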

mrava87 (Collaborator, Author) commented Jan 17, 2023

@cako can you look at the last point and let me know what you think :)

cako (Collaborator) commented Jan 17, 2023

Yup, that is a good idea. In fact, that is what I first thought the operator did, hahaha. So it would be nice to have both.

mrava87 (Collaborator, Author) commented Jan 21, 2023

@cako I am struggling a bit to implement a bilinear interpolator that is called by all versions (numpy, numba, cuda-numba)...

More specifically, setting cuda-numba aside for a moment, I would like to have one function that performs the interpolation (_interpolate_h) and one that does the matvec/rmatvec (_matvec_rmatvec). Both will be NumPy-based, and then I can create equivalent jitted versions for numba. However, a jitted function can only call other jitted functions, so I cannot have something like this:

```python
from typing import Tuple

from numba import jit, prange
from numpy.typing import NDArray

# jit_message and numba_opts come from pylops' numba setup

def _interpolate_h(hs, ix, iz, ohx, dhx, nhx, ohz, dhz, nhz):
    """Find the closest filters, linearly interpolate between them, and interpolate the psf"""
    ...

if jit_message is None:
    _interpolate_h_numba = jit(**numba_opts)(_interpolate_h)

def _matvec_rmatvec(
    x: NDArray,
    y: NDArray,
    hs: NDArray,
    hshape: Tuple[int, int],
    xdims: Tuple[int, int],
    ohx: float,
    ohz: float,
    dhx: float,
    dhz: float,
    nhx: int,
    nhz: int,
    rmatvec: bool = False,
    numba: bool = False,
) -> NDArray:
    for ix in prange(xdims[0]):
        for iz in range(xdims[1]):

            # find filter
            if numba:
                h = _interpolate_h_numba(hs, ix, iz, ohx, dhx, nhx, ohz, dhz, nhz)
            else:
                h = _interpolate_h(hs, ix, iz, ohx, dhx, nhx, ohz, dhz, nhz)
            ...

if jit_message is None:
    _matvec_rmatvec_numba = jit(**numba_opts)(_matvec_rmatvec)
```

Any smart idea?

cako (Collaborator) commented Jan 24, 2023

How about this:

```python
from math import floor

import numba
import numpy as np

class InterpolateAt:
    # A JIT function cannot be a standard method (takes `self`,
    # which Numba can't handle) or a classmethod (takes `cls`,
    # which Numba can't handle). At most, it can be a
    # staticmethod. So, here are three options:
    #     1. Create a non-JIT and a JIT staticmethod, and bind
    #        the correct one at runtime depending on the engine.
    #     2. Create two classes, e.g., InterpolateAtNumpy and
    #        InterpolateAtNumba, and return the appropriate one
    #        based on the engine upon the call to self.__new__.
    #     3. Bind only the non-JIT functions as staticmethods
    #        when possible, and create all other JIT-ed functions
    #        when calling self.__init__.
    # See: https://santoshk.dev/posts/2022/__init__-vs-__new__-and-when-to-use-them/
    #
    # This class follows approach 3.

    def __init__(self, i, engine="numpy"):
        # Local, possibly JIT-ed interpolation.
        _interpolate_local = (
            numba.jit(nopython=True)(self._interpolate)
            if engine == "numba"
            else self._interpolate
        )

        # This is not necessary but replaces the original
        # staticmethod. The replacement is not a staticmethod,
        # but that probably doesn't matter.
        self._interpolate = _interpolate_local
        # Alternatively, you can apply staticmethod, but
        # you lose the Numba attributes.
        # self._interpolate = staticmethod(_interpolate_local)

        # Creates a function with `i` bound. This allows
        # compilation with Numba.
        def _matvec_rmatvec_local(x):
            return _interpolate_local(x, i)

        # Attaches the possibly JIT-ed `_matvec_rmatvec_local`
        # to the instance, so it can be accessed by external or
        # internal methods.
        self._matvec_rmatvec = (
            numba.jit(nopython=True)(_matvec_rmatvec_local)
            if engine == "numba"
            else _matvec_rmatvec_local
        )

    # This staticmethod decorator can be removed if not
    # required. Alternatively, _interpolate can be defined
    # outside the class if it needs to be used by other classes.
    @staticmethod
    def _interpolate(ary, i):
        ifloor = floor(i)
        w = i - ifloor
        if ifloor < 0 or ifloor + 1 > len(ary) - 1:
            return np.nan
        return (1 - w) * ary[ifloor] + w * ary[ifloor + 1]

    # Cannot be JIT-ed, but is a thin wrapper over a JIT-ed
    # function.
    def _matvec(self, x):
        return self._matvec_rmatvec(x)

print(InterpolateAt._interpolate)
# <function InterpolateAt._interpolate at 0x7ff8fc2b9360>
# Note: InterpolateAt._matvec_rmatvec does not exist yet;
# accessing it at class level raises an AttributeError.

x = np.arange(10)

IOpPy = InterpolateAt(1.5, engine="numpy")
assert (
    IOpPy._matvec_rmatvec(x) == IOpPy._matvec(x) == IOpPy._interpolate(x, 1.5)
)
print(IOpPy._interpolate(x, 1.5))
# 1.5
print(type(IOpPy._matvec_rmatvec))
# <class 'function'>
print(type(IOpPy._interpolate))
# <class 'function'>

IOpNB = InterpolateAt(1.5, engine="numba")
assert (
    IOpNB._matvec_rmatvec(x) == IOpNB._matvec(x) == IOpNB._interpolate(x, 1.5)
)
print(IOpNB._interpolate(x, 1.5))
# 1.5
print(type(IOpNB._matvec_rmatvec))
# <class 'numba.core.registry.CPUDispatcher'>
print(type(IOpNB._interpolate))
# <class 'numba.core.registry.CPUDispatcher'>
print(IOpNB._matvec_rmatvec.signatures)
# [(array(int64, 1d, C),)]
```

mrava87 (Collaborator, Author) commented Jan 24, 2023

Mmmh, I am not sure I understand how this fits into the bigger picture. Are you suggesting making a new class just for the interpolation part, and then feeding it to the main NonStationaryConvolve2D class?

I just worry that, to avoid repeating 20 lines of code between the CPU and GPU implementations, we will end up with 200 more that are very complicated to understand and more likely to break in the future if any change is introduced to numba...

Also, remembering the end goal (a shared interpolate function for the numpy, numba, and numba-cuda implementations), are you sure this can easily be integrated within the numba-cuda code? I have never tried to work with classes in numba-cuda, so it may be ok...

mrava87 (Collaborator, Author) commented Jan 24, 2023

Could you try taking the current nonstatconvolve2d file and show how this would be implemented?

cako (Collaborator) commented Jan 24, 2023

Yup, I can have a look :)

mrava87 added a commit to mrava87/pylops that referenced this issue Jan 27, 2023
First implementation of non-stationary filter estimation operators
(point 4 in PyLops#466).
cako pushed a commit that referenced this issue Feb 17, 2023
* feat: Added NonStationaryFilters1D and NonStationaryFilters2D
First implementation of non-stationary filter estimation operators
(point 4 in #466).

* minor: added tests for nonstatfilters operators

* minor: added checks that filter locations are within dims

* minor: reintroduced parallel in rmatvec