Update Multinterp branch, JY reviewer PR, edit main.md, add comments_figs_nbs.md #1

Open
wants to merge 24 commits into
base: multinterp
80 changes: 80 additions & 0 deletions papers/alan_lujan/comments_figs_nbs.md
@@ -0,0 +1,80 @@
# multinterp comments, figures and notebooks

Please feel free to incorporate into the paper or ignore at your discretion. :-) -- Jennifer Yoon --

Screenshot PNG files with hand-drawn markups are stored in ` .\figures\`

The Curvenotes build preview with animated figures looks very nice, cool!

## Table of Contents, Jupyter Notebooks
(in order of appearance in paper)

### 1. Multivariate_Interpolation.ipynb

cells 3 and 4:
"Suppose we are trying to approximate the following function at a set of points: "
```
def squared_coords(x, y):
    return x**2 + y**2
```
comment: It would be helpful to give a short reason for choosing the squared coordinates function.
Example: A function of the squared x and y coordinates is used because its surface
looks like a curved sheet in 3D projection.

Example cont: This closed-form function, whose value is known at every point on the surface, serves as the
baseline model. We draw sample points from it to simulate an unknown function with a similar shape, and then
fill in the points in between using an interpolation method.
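
A minimal sketch of the sample-then-interpolate idea described above, assuming SciPy's `RegularGridInterpolator` as a stand-in interpolator (the notebook may use a different routine). The grid size and query points are made up for illustration:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator


def squared_coords(x, y):
    return x**2 + y**2


# Sample the known (closed-form) function on a coarse regular grid.
x = np.linspace(-1.0, 1.0, 11)
y = np.linspace(-1.0, 1.0, 11)
xx, yy = np.meshgrid(x, y, indexing="ij")
samples = squared_coords(xx, yy)

# Pretend the function is unknown and keep only the samples.
# RegularGridInterpolator defaults to (multi)linear interpolation.
interp = RegularGridInterpolator((x, y), samples)

# Fill in values between the grid points and compare with the true function.
query = np.array([[0.05, 0.05], [0.33, -0.72]])
approx = interp(query)
exact = squared_coords(query[:, 0], query[:, 1])
max_error = np.max(np.abs(approx - exact))
print(max_error)
```

Because the true function is available in closed form, the interpolation error can be measured directly, which is the main reason to use such a baseline.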

First 2 figures need titles and descriptions.
Blue colors are too uniform. Is there a way to add a color gradation, so the curvature will be exaggerated?
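
One possible way to get a gradation, sketched with Matplotlib under the assumption that the figures are drawn with `plot_surface` (the notebook may use a different plotting call). The colormap, title, and labels are only suggestions:

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs without a display

import matplotlib.pyplot as plt
import numpy as np


def squared_coords(x, y):
    return x**2 + y**2


x = np.linspace(-1.0, 1.0, 50)
y = np.linspace(-1.0, 1.0, 50)
xx, yy = np.meshgrid(x, y, indexing="ij")
zz = squared_coords(xx, yy)

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})

# A colormap maps height to color, so the curvature shows up as a gradient.
surface = ax.plot_surface(xx, yy, zz, cmap="viridis")

# Title and axis labels, as suggested elsewhere in these comments.
ax.set_title("Figure 1: 3D projection of the squared coordinates function")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("x**2 + y**2")
fig.colorbar(surface, ax=ax, shrink=0.6, label="function value")

fig.savefig("squared_coords_surface.png")  # hypothetical output filename
```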

It's not obvious what the difference is between the 2 figures; they look identical. Figure 2 (the interpolated output)
may benefit from a small boxed area where the pixelation is greatly enhanced.

missing title fig 1, ex: figure 1: 3D projection of squared coordinates function.

screenshot3.PNG <img src="\figures\screenshot3.PNG" width="600" >

missing title fig 2, ex: figure 2: interpolated 3D projection, using sampled points from figure 1.
body text figure description, ex: Image is pixelated in fig 2 because outputs are interpolations.

screenshot4.PNG <img src="\figures\screenshot4.PNG" width="600" >

### 2. Multivariate_Interpolation_with_Derivatives.ipynb

It is difficult to see the relationship between the first group of 2 figures and the second group of 2 figures
(partial derivatives).

Axes need labels on all 4 figures.

screenshot5.PNG <img src="\figures\screenshot5.PNG" width="600" >

### 3. Multivalued_Interpolation.ipynb

none. Could use grid lines or pop-out box with enhanced pixelation to make the
differences between plots easier to eye-ball.


### 4. Curvilinear_Interpolation.ipynb

none. Could use grid lines or pop-out box with enhanced pixelation to make the
differences between plots easier to eye-ball.


### 5. Unstructured_Interpolation.ipynb

For figures after the 1st one, (group 1: nearest, linear, cubic, radial basis) and (group 2: original,
gaussian process regression), eye-balling the differences in the images may be easier with grid lines
drawn in white or black ink.

Or a small boxed area can be blown up and its pixelation exaggerated in the interpolated image for viewing
contrast.

Perhaps say something like, "The boxed area's pixelation has been enhanced on the simulated (interpolated)
figure to maximize visual difference. The boxed-area image does not reflect the actual smoothness of the model output."

screenshot1.PNG <img src="\figures\screenshot1.PNG" width="700" >

screenshot2.PNG <img src="\figures\screenshot2.PNG" width="700" >

Should "Gaussian Process" be "Gaussian Process Regression" for title, last figure?
Binary file added papers/alan_lujan/figures/screenshot1.PNG
Binary file added papers/alan_lujan/figures/screenshot2.PNG
Binary file added papers/alan_lujan/figures/screenshot3.PNG
Binary file added papers/alan_lujan/figures/screenshot4.PNG
Binary file added papers/alan_lujan/figures/screenshot5.PNG
35 changes: 29 additions & 6 deletions papers/alan_lujan/main.md
@@ -3,11 +3,15 @@
title: multinterp
subtitle: A Unified Interface for Multivariate Interpolation in the Scientific Python Ecosystem
abstract: |
Multivariate interpolation is a fundamental tool in scientific computing, yet the Python ecosystem offers a fragmented landscape of specialized tools. This fragmentation hinders code reusability, experimentation, and efficient deployment across diverse hardware. To address this challenge, I've developed the `multinterp` package. It provides a unified interface for regular/rectilinear interpolation, supports serial (NumPy/SciPy), parallel (Numba), and GPU (CuPy, PyTorch, JAX) backends, and includes tools for multivalued interpolation and interpolation of derivatives.
Multivariate interpolation is a fundamental tool in scientific computing, yet the Python ecosystem offers a fragmented landscape of specialized tools. This fragmentation hinders code reusability, experimentation, and efficient deployment across diverse hardware. The `multinterp` package was developed to address this challenge. It provides a unified interface for regular/rectilinear interpolation, supports serial (NumPy/SciPy), parallel (Numba), and GPU (CuPy, PyTorch, JAX) backends, and includes tools for multivalued interpolation and interpolation of derivatives.
exports:
- format: pdf
---

`[[JY reviewer notes: my comments will be inside double brackets. Please feel free to incorporate or ignore at your discretion. All of my comments are intended to improve readability for a general reader.]]`

`[[Question on abstract: did you mean "regular/irregular interpolation" in your abstract? Removed "I've" in abstract.]]`

## Introduction

The scientific Python ecosystem has a number of diverse tools for multivariate interpolation. However, these tools are scattered across multiple packages, each constructed for a specific purpose that prevents them from being easily used in other contexts.
@@ -18,9 +22,10 @@ This project aims to develop a comprehensive framework for multivariate interpol

## Grid Interpolation

Functions are powerful mappings between sets of inputs and outputs, indicating how one set of values is related to another. Functions, however, are also infinitely dimensional, in that the inputs can range over an infinite number of values each mapping 1-to-1 (typically) to an infinite number of outputs. This makes it difficult to represent non-analytic functions in a computational environment, as we can only store a finite number of values in memory. For this reason, interpolation is a powerful tool in scientific computing, as it allows us to represent functions with a finite number of values and to approximate the function's behavior between these values.
Functions are powerful mappings between sets of inputs and outputs, indicating how one set of values is related to another. Functions, however, are also infinitely dimensional, in that the inputs can range over an infinite number of values each mapping 1-to-1 (typically) to an infinite number of outputs. This makes it difficult to represent non-analytic functions `[[ (i.e., without closed-form solutions) ]]` in a computational environment, as we can only store a finite number of values in memory. For this reason, interpolation is a powerful tool in scientific computing, as it allows us to represent functions with a finite number of values and to approximate the function's behavior between these values.

The set of input values on which we know the function's output values is called a **grid**. A grid (or input grid) of values can be represented in many ways, depending on its underlying structure. The broadest categories of grids are regular or structured grids, and irregular or unstructured grids. Regular grids are those where the input values are arranged in a regular pattern, such as a triangle or a quadrangle. Irregular grids are those where the input values are not arranged in a particularly structured way and can seem to be scattered randomly across the input space.
The set of input values on which we know the function's output values is called a **grid**. A grid (or input grid) of values can be represented in many ways, depending on its underlying structure. The broadest categories of grids are regular or structured grids, and irregular or unstructured grids.
Regular grids are those where the input values are arranged in a regular pattern, such as a triangle or a quadrangle. Irregular grids are those where the input values are not arranged in a particularly structured way and can seem to be scattered randomly across the input space.

As we might imagine, interpolation on regular grids is much easier than interpolation on irregular grids as we are able to exploit the structure of the grid to make predictions about the function's behavior between known values. Irregular grid interpolation is much more difficult, and often requires *regularizing* and/or *regression* techniques to make predictions about the function's behavior between known values. `multinterp` aims to provide a comprehensive set of tools for both regular and irregular grid interpolation, and we will discuss some of these tools in the following sections.

@@ -43,7 +48,8 @@ As we might imagine, interpolation on regular grids is much easier than interpol

## Rectilinear Interpolation

A *rectilinear* grid is a regular grid where the input values are arranged in a *rectangular* (in 2D) or *hyper-rectangular* (in higher dimensions) pattern. Moreover, they can be represented by the tensor product of monotonically increasing vectors along each dimension. For example, a 2D rectilinear grid can be represented by two 1D arrays of increasing values, such as $x = [x_0, x_1, x_2, \cdots, x_n]$ and $y = [y_0, y_1, y_2, \cdots, y_m]$, where $x_i > x_j$ and $y_i > y_j$ $\forall i > j$, and the input grid is then represented by $x \times y$ of dimensions $n \times m$. This allows for a very simple and efficient interpolation algorithm, as we can easily find and use the nearest known values to make predictions about the function's behavior in the unknown space.
A *rectilinear* grid is a regular grid where the input values are arranged in a *rectangular* (in 2D) or *hyper-rectangular* (in higher dimensions) pattern. Moreover, they can be represented by the tensor product of monotonically increasing vectors along each dimension. For example, a 2D rectilinear grid can be represented by two 1D arrays of increasing values, such as $x = [x_0, x_1, x_2, \cdots, x_n]$ and $y = [y_0, y_1, y_2, \cdots, y_m]$, where $x_i > x_j$ and $y_i > y_j$ $\forall i > j$ `[[Given your later example, it might be better to make j > i. Also suggest using words "for all i > j" for non-math readers instead of symbol.]]`, and the input grid is then represented by $x \times y$ of dimensions $n \times m$.
This allows for a very simple and efficient interpolation algorithm, as we can easily find and use the nearest known values to make predictions about the function's behavior in the unknown space.
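
As an illustration only (not taken from the paper or the `multinterp` source), the tensor-product construction of a rectilinear grid can be sketched with NumPy. The vectors `x` and `y` below are hypothetical values:

```python
import numpy as np

# Two strictly increasing 1D vectors, one per dimension (spacing may be non-uniform).
x = np.array([0.0, 0.5, 2.0, 4.0])
y = np.array([1.0, 3.0, 4.0])

# Check the monotonicity condition that a rectilinear grid relies on.
is_increasing = bool(np.all(np.diff(x) > 0) and np.all(np.diff(y) > 0))

# The tensor product of the two vectors gives every (x_i, y_j) grid node.
grid_x, grid_y = np.meshgrid(x, y, indexing="ij")

print(is_increasing)   # whether both vectors are strictly increasing
print(grid_x.shape)    # one entry per grid node: (len(x), len(y))
```

Only the two 1D vectors need to be stored; the full grid of nodes can always be rebuilt from them.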

```{figure} figures/BilinearInterpolation
:label: bilinear
@@ -52,10 +58,11 @@ A *rectilinear* grid is a regular grid where the input values are arranged in a

A non-uniformly spaced rectilinear grid can be transformed into a uniformly spaced coordinate grid (and vice versa).
```
`[[Screen compatibility error: the last line above does not wrap when viewed directly in the GitHub repo on some devices. If possible, keep line width to 80 characters.]]`

### Multilinear Interpolation

`multinterp` provides a simple and efficient implementation of *multilinear interpolation* for various backends (`numpy` (@Harris2020), `scipy` (@Virtanen2020), `numba` (@Lam2015), `cupy` (@Okuta2017), `pytorch` (@Paszke2019), and `jax` (@Bradbury2018)) via its `multinterp` function. From the remaining of this section, `multinterp` refers to the `multinterp` function in `multinterp` package, unless otherwise specified.
`multinterp` provides a simple and efficient implementation of *multilinear interpolation* for various backends (`numpy` (@Harris2020), `scipy` (@Virtanen2020), `numba` (@Lam2015), `cupy` (@Okuta2017), `pytorch` (@Paszke2019), and `jax` (@Bradbury2018)) via its `multinterp` function. `[[ For the remainder of this section... ]]` From the remaining of this section, `multinterp` refers to the `multinterp` function in `multinterp` package, unless otherwise specified.

The main workhorse of `multinterp` is `scipy.ndimage`'s `map_coordinates` function. This function takes an array of **input** values and an array of **coordinates**, and returns the interpolated values at those coordinates. More specifically, the `input` array is the array of known values on the coordinate (index) grid, such that `input[i,j,k]` is the known value at the coordinate `(i,j,k)`. The `coordinates` array is an array of fractional coordinates at which we wish to know the values of the function, such as `coordinates[0] = (1.5, 2.3, 3.1)`. This indicates that we wish to know the value of the function between input index $i \in [1,2]$, $j \in [2,3]$, and $k \in [3,4]$. While `map_coordinates` is a powerful tool for coordinate grid interpolation, a typical function in question may not be defined on a coordinate grid. For this reason, we first need to find a mapping between the functional input grid and the coordinate grid, and then use `map_coordinates` to interpolate the function on the coordinate grid.
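
The two-step procedure described above can be sketched as follows. This is an illustrative sketch, not the `multinterp` implementation: it assumes a 2D rectilinear grid and uses `np.interp` to map functional coordinates to fractional indices, which is only one possible choice of mapping. The grid and query values are hypothetical.

```python
import numpy as np
from scipy.ndimage import map_coordinates

# Hypothetical non-uniform rectilinear grid in functional (x, y) space.
x_grid = np.array([0.0, 1.0, 3.0, 6.0])
y_grid = np.array([0.0, 2.0, 5.0])

# Known values on the grid: values[i, j] = f(x_grid[i], y_grid[j]).
xx, yy = np.meshgrid(x_grid, y_grid, indexing="ij")
values = 2.0 * xx + 3.0 * yy

# Points in functional space where we want the function's value.
x_query = np.array([0.5, 2.0, 4.5])
y_query = np.array([1.0, 3.5, 4.0])

# Step 1: map functional coordinates to fractional index coordinates.
i_frac = np.interp(x_query, x_grid, np.arange(x_grid.size))
j_frac = np.interp(y_query, y_grid, np.arange(y_grid.size))

# Step 2: interpolate on the index grid; order=1 requests multilinear interpolation.
coordinates = np.stack([i_frac, j_frac])
result = map_coordinates(values, coordinates, order=1, mode="nearest")

print(result)
```

The test function here is linear in `x` and `y`, so multilinear interpolation should reproduce `2.0 * x_query + 3.0 * y_query` up to floating-point error, which makes the sketch easy to check.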

@@ -91,7 +98,7 @@ A *curvilinear* grid is a regular grid whose input coordinates are *curved* or *

A curvilinear grid can be transformed into a rectilinear grid by a simple remapping of its vertices.
```

`[[Screen compatibility error: the last line above does not wrap when viewed directly in the GitHub repo on some devices. If possible, keep line width to 80 characters.]]`
```{include} notebooks/Curvilinear_Interpolation.ipynb
```

@@ -104,13 +111,15 @@ A curvilinear grid can be transformed into a rectilinear grid by a simple remapp

Unstructured grids are irregular and often require a triangulation step which might be computationally expensive and time-consuming.
```
`[[Screen compatibility error: the last line above does not wrap when viewed directly in the GitHub repo on some devices. If possible, keep line width to 80 characters.]]`

```{include} notebooks/Unstructured_Interpolation.ipynb
```

## Conclusion

Multivariate interpolation is a cornerstone of scientific computing, yet the Python ecosystem (@Oliphant2007) presents a fragmented landscape of tools. While individually powerful, these packages often lack a unified interface. This fragmentation makes it difficult for researchers to experiment with different interpolation methods, optimize performance across diverse hardware, and handle varying data structures (regular, rectilinear, curvilinear, unstructured).
`[[Is there a way to give a nod to PyTorch and TensorFlow for having call-backs, hooks, and function wrappers to allow a user to swap out an optimization function or module mid-stream? They do not cover all the use cases of the `multinterp` package, but some effort went into developing a layered API to cover varying use cases. NumPy also has a structured data type that can be used for custom data types and hierarchical data structures. It can be seen as an attempt to provide a flexible (customizable) user interface, even though its aim and scope are different from 'multinterp.' (The goal of NumPy's structured data types seems to be mostly interfacing with C code, optimized C modules, or C numerical recipes, and explicit control of memory layout.) The general reader may appreciate having some context. This package may be viewed as a further development of previous efforts at a flexible user interface for users of varying data types and data geometries. (See structured arrays in https://numpy.org/doc/stable/user/basics.rec.html) ]]`

The `multinterp` project seeks to change this. Its goal is to provide a unified, comprehensive, and flexible framework for multivariate interpolation in Python. This framework will streamline workflows by offering:

@@ -119,3 +128,17 @@ The `multinterp` project seeks to change this. Its goal is to provide a unified,
- Broad Functionality: Tools for regular/rectilinear interpolation, multivalued interpolation, and derivative calculations, addressing a wide range of scientific problems.

The multinterp package (<https://github.com/alanlujan91/multinterp>) is currently in its beta stage. It offers a strong foundation but welcomes community contributions to reach its full potential. We invite collaboration to improve documentation, expand the test suite, and ensure the codebase aligns with the highest standards of Python package development.

`[[On Curvenotes build preview: the bottom of the right column shows the list of files used in the article. These should be listed in the order of their appearance in the paper. See suggestion below.]]`

`[[Supporting Documents
(shown in the order of appearance)
1. Multivariate_Interpolation.ipynb
2. Multivariate_Interpolation_with_Derivatives.ipynb
3. Multivalued_Interpolation.ipynb
4. Curvilinear_Interpolation.ipynb
5. Unstructured_Interpolation.ipynb
6. manim_notebook.ipynb (figure animation on Curvenotes build)
7. figures.ipynb (add?) (build figure grid scaffolds)
]]`