Paper: multinterp: A Unified Interface for Multivariate Interpolation in the Scientific Python Ecosystem #937

Merged: 12 commits, Sep 25, 2024

alanlujan91
Contributor

@alanlujan91 alanlujan91 commented Jun 7, 2024

If you are creating this PR in order to submit a draft of your paper, please name your PR with Paper: <title>. An editor will then add a paper label and GitHub Actions will be run to check and build your paper.

See the project readme for more information.

Editor: Amey Ambade @ameyxd

Reviewers:

@ameyxd ameyxd self-assigned this Jun 8, 2024
@ameyxd ameyxd added the paper This indicates that the PR in question is a paper label Jun 8, 2024

github-actions bot commented Jun 10, 2024

Curvenote Preview

| Directory | Preview | Checks | Updated (UTC) |
| --- | --- | --- | --- |
| papers/alan_lujan | 🔍 Inspect | 34 checks passed (7 optional) | Sep 3, 2024, 1:14 AM |

@ameyxd ameyxd removed their assignment Jun 11, 2024
@ameyxd
Contributor

ameyxd commented Jun 11, 2024

@alanlujan91 can you recheck the DOIs that fail checks? As long as they are valid DOIs we should be okay.

If some citations have no DOI, you may be able to list the citation keys you want to ignore in myst.yml under error_rules:

  - rule: doi-exists
    severity: ignore
    keys:
      - abc
      - def01

@ameyxd
Contributor

ameyxd commented Jun 13, 2024

@alanlujan91 are you skipping valid DOIs? If so, please refrain and try the checks again. 😅

@alanlujan91
Contributor Author

@ameyxd I believe the ones I'm skipping don't have DOIs. I can try to look again

@ameyxd
Contributor

ameyxd commented Jun 13, 2024

Okay! please check "Paszke2019" in particular. lmk when complete and passing checks.

@alanlujan91
Contributor Author

@ameyxd Oh I remember what happened.

I found https://dl.acm.org/doi/10.5555/3454287.3455008
so I put that in my bib and pushed, but mystmd didn't like the DOI, probably because of this:
[image]
So then I set it to ignore, but mystmd still didn't like it, so I removed it from the bib altogether.
@rowanc1 any advice?

@ameyxd
Contributor

ameyxd commented Jun 20, 2024

Review reminders sent to @gcdeshpande and @aparoha

@cbcunc
Member

cbcunc commented Jun 21, 2024

so then I set to ignore, but mystmd still didn't like it, so I removed it altogether from the bib @rowanc1 any advice?

@rowanc1 Any comment?


@aparoha aparoha left a comment


Thanks for raising your PR. The overall concepts look promising. I have also provided in-line comments. Overall comments:

  1. The text could benefit from clearer organization and structure. For instance, some sections seem to repeat concepts without introducing new information or advancing the narrative.
  2. The introduction of technical terms and concepts could be more gradual and systematic, ensuring the audience can follow the content smoothly.


As we might imagine, interpolation on regular grids is much easier than interpolation on irregular grids as we are able to exploit the structure of the grid to make predictions about the function's behavior between known values. Irregular grid interpolation is much more difficult, and often requires *regularizing* and/or *regression* techniques to make predictions about the function's behavior between known values. `multinterp` aims to provide a comprehensive set of tools for both regular and irregular grid interpolation, and we will discuss some of these tools in the following sections.
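
To make the contrast in the quoted paragraph concrete, here is a minimal sketch that uses SciPy directly rather than `multinterp`, whose API is not shown in this thread. The test function `f`, the grid sizes, and the random seed are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator, griddata


def f(x, y):
    # Hypothetical smooth test function used only for this illustration.
    return x**2 + y**2


# Regular (rectilinear) grid: one 1-D axis per dimension is enough.
x = np.linspace(0.0, 1.0, 11)
y = np.linspace(0.0, 1.0, 11)
xx, yy = np.meshgrid(x, y, indexing="ij")
regular = RegularGridInterpolator((x, y), f(xx, yy), method="linear")

# Irregular grid: scattered points with no axis structure to exploit.
# The four corners are added so every query lies inside the convex hull.
rng = np.random.default_rng(0)
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
points = np.vstack([corners, rng.uniform(0.0, 1.0, size=(200, 2))])
values = f(points[:, 0], points[:, 1])

query = np.array([[0.25, 0.75], [0.5, 0.5]])
on_regular = regular(query)
on_scattered = griddata(points, values, query, method="linear")
```

On the regular grid the interpolator only needs the two axes, while `griddata` with `method="linear"` first has to triangulate the scattered points, which is the extra work the paragraph alludes to.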

```{list-table} Grids and structures implemented in "multinterp".
```

Question: Do you have plans to support other interpolation methods, e.g. spline, inverse distance weighting, natural neighbor, etc.?


Functions are powerful mappings between sets of inputs and outputs, indicating how one set of values is related to another. Functions, however, are also infinitely dimensional, in that the inputs can range over an infinite number of values each mapping 1-to-1 (typically) to an infinite number of outputs. This makes it difficult to represent non-analytic functions in a computational environment, as we can only store a finite number of values in memory. For this reason, interpolation is a powerful tool in scientific computing, as it allows us to represent functions with a finite number of values and to approximate the function's behavior between these values.

The set of input values on which we know the function's output values is called a **grid**. A grid (or input grid) of values can be represented in many ways, depending on its underlying structure. The broadest categories of grids are regular or structured grids, and irregular or unstructured grids. Regular grids are those where the input values are arranged in a regular pattern, such as a triangle or a quadrangle. Irregular grids are those where the input values are not arranged in a particularly structured way and can seem to be scattered randomly across the input space.

The introduction of technical terms and concepts could be more gradual and systematic, ensuring the audience can follow the content smoothly. The second paragraph jumps directly to grid interpolation without first setting the context.

- Hardware Adaptability: Seamless support for CPU (NumPy, SciPy), parallel (Numba), and GPU (CuPy, PyTorch, JAX) backends, empowering users to optimize performance based on their computational resources.
- Broad Functionality: Tools for regular/rectilinear interpolation, multivalued interpolation, and derivative calculations, addressing a wide range of scientific problems.

The multinterp package (<https://github.com/alanlujan91/multinterp>) is currently in its beta stage. It offers a strong foundation but welcomes community contributions to reach its full potential. We invite collaboration to improve documentation, expand the test suite, and ensure the codebase aligns with the highest standards of Python package development.

While the text mentions documentation and community contributions, specific links or guidelines for how users can contribute or access detailed documentation are not provided. Adding clear references or links to supplementary materials would enhance the text's utility.

@rowanc1
Contributor

rowanc1 commented Jun 25, 2024

Hi @alanlujan91 - we talked about this today in our editorial team meeting and would love to showcase your article with computational capabilities and reproducibility (as per this note in the readme!).

I think following the review comment by @aparoha "some sections seem to repeat concepts without introducing new information or advancing the narrative.", it looks as though you are including the full notebooks, rather than individual cells. There is a way in MyST to include individual cells or outputs that keep links to the original location of the supporting materials. The documentation is here. That could be a way to change how you are showcasing the information and better highlight and link to your supporting materials.

If you are open to it, I would love to work with you to change the use of MyST slightly to better highlight the computational aspects. Let me know if you have time this week or next week?

@alanlujan91
Contributor Author

@rowanc1 that would be great!

I am at a conference this week, so I haven't had time to address the previous comments, but will be available next week to discuss how we can improve this work

@rowanc1
Contributor

rowanc1 commented Jun 25, 2024

Fantastic - I will reach out to you separately and get a meeting scheduled soon. Very very excited to show off your content in the SciPy Proceedings this year, and elevate the use of Jupyter Notebooks and reproducibility. 🚀

@JennEYoon

Re: alanlujan91#1

Dear Alan @alanlujan91,
Thank you for the opportunity to review the multinterp paper for the SciPy Conference Proceedings 2024. All of my comments are meant to improve readability for the general reader. Please feel free to incorporate or ignore any of my suggestions.

File edit: main.md [[ suggestions enclosed between double brackets ]]
Add file: comments_figs_nbs.md (suggestions on figures)
Screenshots (hand markup drawings) in `.\figures\`

Sincerely,
Jennifer Yoon, @JennEYoon
July 24, 2024 PR1 submit

@ameyxd
Contributor

ameyxd commented Jul 31, 2024

@JennEYoon thanks for your reviews, please leave your commit as comments in this thread. You may refer to the work by other reviewers. Let us know if you have issues!


@JennEYoon JennEYoon left a comment


Review comments moved from a PR on author's multinterp branch
( https://github.com/alanlujan91/scipy_proceedings/pulls )
to scipy_proceedings/pull/937 comments section.
Jennifer
Sunday August 4, 2024 update

## Conclusion

Multivariate interpolation is a cornerstone of scientific computing, yet the Python ecosystem (@Oliphant2007) presents a fragmented landscape of tools. While individually powerful, these packages often lack a unified interface. This fragmentation makes it difficult for researchers to experiment with different interpolation methods, optimize performance across diverse hardware, and handle varying data structures (regular, rectilinear, curvilinear, unstructured).


Is there a way to give a nod to PyTorch and TensorFlow for having callbacks, hooks, and function wrappers that allow a user to swap out an optimization function or module mid-stream? They do not cover all the use cases of the multinterp package, but some effort went into developing a layered API to cover varying use cases.

NumPy also has a structured data type that can be used for custom data types and hierarchical data structures. It can be seen as an attempt to provide a flexible (customizable) user interface, even though its aim and scope are different from those of 'multinterp'. (The goal of NumPy's structured data type seems to be mostly interfacing with C code, optimized C modules, or C numerical recipes, and explicit control of memory layout.) The general reader may appreciate having some context. This package may be viewed as a further development of previous efforts at a flexible user interface for users of varying data types and data geometries.
(See structured arrays in https://numpy.org/doc/stable/user/basics.rec.html)
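
For readers unfamiliar with the feature mentioned above, the following is a small sketch of a NumPy structured array. The field names `x`, `y`, and `value` are hypothetical and chosen only to resemble interpolation samples; nothing here reflects how `multinterp` stores its data.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# One record per sample: two coordinates and a function value.
# Field names are illustrative assumptions, not part of any library API.
sample_dtype = np.dtype(
    [("x", np.float64), ("y", np.float64), ("value", np.float64)]
)

samples = np.zeros(4, dtype=sample_dtype)
samples["x"] = [0.0, 1.0, 0.0, 1.0]
samples["y"] = [0.0, 0.0, 1.0, 1.0]
samples["value"] = samples["x"] ** 2 + samples["y"] ** 2

# Convert the coordinate fields to a plain (n, 2) float array, the layout
# that most array-based interpolation routines expect for scattered points.
points = rfn.structured_to_unstructured(samples[["x", "y"]])
```

The structured dtype fixes the memory layout of each record (three 8-byte floats here), which is the kind of explicit layout control the comment describes.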

:label: unstructured
:alt: Unstructured grids are irregular and often require a triangulation step which might be computationally expensive and time-consuming.
:align: center


Memo: the alt-text sections in code blocks do not wrap properly when viewed on an iPad/tablet device. The Curvenote build version works properly, so this issue only comes up when the file is viewed directly from a GitHub repo. Not sure if this is an issue for the proceedings.
(The same alt-text wrap issue appears in the Rectilinear Interpolation and Curvilinear Interpolation sections.)


```{include} notebooks/Multivariate_Interpolation.ipynb
```


Multivariate_Interpolation.ipynb comment

def squared_coords(x, y):
    return x**2 + y**2

It will be helpful to provide a short reason for choosing the squared coordinates function. Example: A squared x and y coordinates function is used to draw a figure whose grid geometry looks like a curved sheet (bowl) in 3D projection.

This closed-form solution function, for which all points along the curved surface are known, is used as the baseline model. From this known model, we can draw sample points to approximate similarly shaped unknown functions.
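
As a rough illustration of why a closed-form baseline is useful, the sketch below samples `squared_coords` on a coarse grid, interpolates onto a finer grid, and measures the error against the exact values. It uses SciPy's `RegularGridInterpolator` as a stand-in; the grid sizes are assumptions and the notebook itself may do this differently.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator


def squared_coords(x, y):
    return x**2 + y**2


# Coarse grid of known values (spacing 0.25 on each axis).
coarse = np.linspace(-1.0, 1.0, 9)
cx, cy = np.meshgrid(coarse, coarse, indexing="ij")
interp = RegularGridInterpolator((coarse, coarse), squared_coords(cx, cy))

# Finer grid where we pretend the function is unknown.
fine = np.linspace(-1.0, 1.0, 101)
fx, fy = np.meshgrid(fine, fine, indexing="ij")
approx = interp(np.stack([fx, fy], axis=-1))

# Because the baseline is closed-form, the error can be measured exactly.
exact = squared_coords(fx, fy)
max_error = float(np.max(np.abs(approx - exact)))
```

For this function, piecewise-linear interpolation with spacing `h` is off by at most `h**2 / 4` per axis, so `max_error` here cannot exceed `2 * 0.25**2 / 4`.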


```{include} notebooks/Multivariate_Interpolation.ipynb
```


Fig. 2: interpolated figures
Missing title and axes labels (x, y, z).
Suggest finding a way to show that this figure is interpolated,
possibly by blowing up a small area with enhanced pixelation.
screenshot4


```{include} notebooks/Multivariate_Interpolation_with_Derivatives.ipynb
```


Multivariate_Interpolation_with_Derivatives.ipynb comment
fig1, fig2, fig3, fig4
Axes need labels on all 4 figures (x, y, z) or (dz/dx, dz/dy, f(z)).

Difficult to see the relationship between first group of 2 figures and second group of 2 figures (partial derivatives). Perhaps some lines can be drawn to show an example of the original function and its partial derivatives. I leave this decision up to the author.
screenshot5
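
One way to make the link between a surface and its partial derivatives explicit is to compare numerical derivatives with the analytic ones. The sketch below does this with `numpy.gradient` for the `squared_coords` function quoted earlier in this review; it is a finite-difference illustration under assumed grid sizes, not the method used in the notebook.

```python
import numpy as np


def squared_coords(x, y):
    return x**2 + y**2


x = np.linspace(-1.0, 1.0, 21)
y = np.linspace(-1.0, 1.0, 21)
xx, yy = np.meshgrid(x, y, indexing="ij")
z = squared_coords(xx, yy)

# Finite-difference partial derivatives along each axis.
# edge_order=2 keeps second-order accuracy at the boundaries as well.
dz_dx, dz_dy = np.gradient(z, x, y, edge_order=2)

# Analytic partials of x**2 + y**2 are 2x and 2y.
max_err_x = float(np.max(np.abs(dz_dx - 2.0 * xx)))
max_err_y = float(np.max(np.abs(dz_dy - 2.0 * yy)))
```

Second-order differences are exact for a quadratic up to rounding, so both errors should be negligible here; labelling the resulting panels dz/dx and dz/dy would address the axis-label comment above.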


```{include} notebooks/Unstructured_Interpolation.ipynb
```


Unstructured_Interpolation.ipynb comment
For the figures after the first one (group 1: nearest, linear, cubic, radial basis; group 2: original, Gaussian process regression), eyeballing the differences between the figures may be easier with grid lines drawn in white or black ink. This is also just a suggestion.

screenshot1
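
Alongside a visual comparison, the differences between methods can be summarized numerically. The sketch below computes a root-mean-square error for nearest, linear, cubic, and radial basis function interpolation using SciPy's `griddata` and `RBFInterpolator`. The `target` function, point count, and seed are assumptions made for illustration; the notebook's data and its Gaussian process example are not reproduced here.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator, griddata


def target(x, y):
    # Hypothetical smooth test surface; the notebook's function may differ.
    return np.sin(np.pi * x) * np.cos(np.pi * y)


# Scattered samples; corners are included so queries stay inside the hull.
rng = np.random.default_rng(0)
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
points = np.vstack([corners, rng.uniform(0.0, 1.0, size=(300, 2))])
values = target(points[:, 0], points[:, 1])

# Query locations on an interior lattice.
axis = np.linspace(0.1, 0.9, 41)
qx, qy = np.meshgrid(axis, axis, indexing="ij")
query = np.column_stack([qx.ravel(), qy.ravel()])
exact = target(query[:, 0], query[:, 1])

estimates = {
    method: griddata(points, values, query, method=method)
    for method in ("nearest", "linear", "cubic")
}
estimates["rbf"] = RBFInterpolator(points, values)(query)

# Root-mean-square error of each method against the known surface.
rmse = {
    name: float(np.sqrt(np.mean((est - exact) ** 2)))
    for name, est in estimates.items()
}
```

Printing `rmse` next to the figures gives readers a single number per method to go with the visual impression.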


```{include} notebooks/Unstructured_Interpolation.ipynb
```


screenshot2

- Hardware Adaptability: Seamless support for CPU (NumPy, SciPy), parallel (Numba), and GPU (CuPy, PyTorch, JAX) backends, empowering users to optimize performance based on their computational resources.
- Broad Functionality: Tools for regular/rectilinear interpolation, multivalued interpolation, and derivative calculations, addressing a wide range of scientific problems.

The multinterp package (<https://github.com/alanlujan91/multinterp>) is currently in its beta stage. It offers a strong foundation but welcomes community contributions to reach its full potential. We invite collaboration to improve documentation, expand the test suite, and ensure the codebase aligns with the highest standards of Python package development.


Curvenote build version comment
When this paper is viewed on screen, the bottom-right area shows a list of related files. If possible, the files should be listed in their order of appearance.

Supporting Documents
(shown in the order of appearance)

Multivariate_Interpolation.ipynb
Multivariate_Interpolation_with_Derivatives.ipynb
Multivalued_Interpolation.ipynb
Curvilinear_Interpoliation.ipynb
Unstructured_Interpolation.ipynb
manim_notebook.ipynb (figure animation on Curvenotes build)
figures.ipynb (add?)

@ameyxd
Contributor

ameyxd commented Aug 6, 2024

@JennEYoon, @aparoha - can you confirm the author's updates are satisfactory for your feedback and give me a thumbs up! 🙂

@JennEYoon

Looks good. All of my comments were up to the author's discretion. :-)


@JennEYoon JennEYoon left a comment


Added an edit I forgot to copy over.

@cbcunc cbcunc merged commit 98446a9 into scipy-conference:2024 Sep 25, 2024
4 checks passed
@stefanv
Member

stefanv commented Nov 5, 2024

Hi folks, it doesn't look like the performance comparison made it to the published paper?

https://github.com/scipy-conference/scipy_proceedings/pull/937/files#diff-c6654a0adee03d2551feb82286acfd2863716a72e411ad6435c1bc1753445071R503

@fwkoch
Collaborator

fwkoch commented Nov 5, 2024

Hey @stefanv - thanks for noticing this. As far as I can tell, labels referenced in the Performance Comparison section do not exist in any of the notebooks provided (fig:performance_comparison_2d, fig:performance_comparison_3d, fig:backend_comparison_2d, and fig:backend_comparison_3d).

If the notebooks with these figures are added, we can update the proceedings site and PDF! cc @alanlujan91

@alanlujan91
Contributor Author

thanks for catching that! I will make a PR against 2024

Labels
paper This indicates that the PR in question is a paper ready-for-review

8 participants