Skip to content
This repository has been archived by the owner on Feb 9, 2020. It is now read-only.

Performance issues #36

Open
timholy opened this issue Jul 19, 2014 · 12 comments
Open

Performance issues #36

timholy opened this issue Jul 19, 2014 · 12 comments

Comments

@timholy
Copy link
Owner

timholy commented Jul 19, 2014

The performance of Grid is not as good as it could be. I don't have time to throw in a demo right now, but one can pretty easily write a hand-written function that performs linear interpolation. I've tended to find that it is many times faster than Grid.

Grid got started ages ago (even before there were such things as packages), and now I know a lot more about how to do fast multidimensional operations in Julia. However, one major point of concern: a really fast Grid would currently require one to use Cartesian. I have no desire whatsoever to scare co-developers off (especially @tlycken), so I won't even consider this if there are concerns. (I'm also busy with other things right now.)

@tomasaschan
Copy link
Collaborator

Well, I wouldn't mind being part of a re-write (given the usual disclaimers about having enough time on my hands etc...). So you won't scare me off just by threatening with that! ;)

I don't think Cartesian has to be a problem either - even if there is obviously going to be some learning curve before I understand how it works, I must admit I don't find the current code very easy to understand either; although it's only a few months since I spent a week more or less pouring over this to implement InterpCubic, I still find myself stepping through the code line by line today to try to understand what it does. And besides, I've been curious about Cartesian for a while and wouldn't mind learning!

Since we both have other concerns too, maybe a good strategy here is to outline some changes we'd like to see, sketch up a structure for the new version, and then start working. Then we can work on it in a separate branch that we can plan to be quite long-lived, basically re-writing the library, and when we have feature-parity we merge (probably more-or-less overwrite) and tag a new version. Given the large number of people that use Grid, it's probably better to make any breaking changes all at once.

@timholy
Copy link
Owner Author

timholy commented Jul 19, 2014

Just saw this after posting on #37. OK, I'll put a demo up shortly.

@tomasaschan
Copy link
Collaborator

By the way, to iterate what could be your own mantra: have you profiled Grid.jl in the cases where it's slower? 😉

@timholy
Copy link
Owner Author

timholy commented Jul 19, 2014

Yes, I just don't have the data to show you right now. That led to some performance improvements a month or so back. But I think fixing the remaining issues may require some design changes.

@kbarbary
Copy link

I've been following Grid.jl with some interest, as I need fast interpolation in my some of my work (though I don't know that much about interpolation myself). I usually work in Python, where I use scipy.interpolate which is mostly a Fortran wrapper for a NETLIB library. It seems to have pretty good performance. Yesterday I started a Julia wrapper Dierckx.jl for the same Fortran library. It might be useful to you for performance comparisons, or as a temporary stand-in for users with interpolation bottlenecks. Of course, I wouldn't mind seeing Grid.jl do it all eventually :)

Right now Dierckx.jl only does one thing (just a proof of concept): 2-d grid interpolation with irregular spacing. And it's limited to Float64 arrays. But I can add wrappers for specific functionality upon request.

@timholy
Copy link
Owner Author

timholy commented Oct 16, 2014

@kbarbary, thanks for telling us about this. Sounds cool.

There's more info about what kinds of performance gains are possible in #38---roughly, 4-fold. Have you benchmarked Dierckx.jl against Grid.jl? It would be interesting to know if you get about 4-fold (in which case just sticking with Julia will have some advantages), or if Dierckx can do even better. Of course, it sounds like you haven't wrapped the comparable routines yet. If you do that, I'd be interested in seeing the numbers.

@tomasaschan
Copy link
Collaborator

@kbarbary Cool! In addition to what Tim has already said, I think it would also be interesting to read up more on the algorithms used by the NETLIB library, especially if we can show that they perform well. Interpolations on irregular grids in >1D is currently not supported at all in Grid.jl, but it would of course be very useful. Do you have any references to papers or similar with the algorithms? Or would it be feasible (and unproblematic license-wise) to port the FORTRAN library straight-off?

@kbarbary
Copy link

@tlycken Unfortunately, the only thing I've found is a reference to a book by Dierckx on the NETLIB page, which isn't very helpful. There's some documentation in the code comments, but I don't think it describes the methods adequately.

Interestingly, I found an R package wrapping Dierckx called DierckxSpline where the comments say

This package is broken. Currently, it seems NOT to have any users and is not worth fixing. These
FITPACK routines are over 20 years old and may be obsolete.

My first thought was actually to read the Fortran and port to Julia, but after looking into it, it seemed lower effort to just wrap it. :) Porting would be a good way to understand the algorithm though, so I may still do it.

Regarding licensing, the licensing of the NETLIB package is unclear to me. The copy of the library that I put in Dierckx.jl is actually from scipy.interpolate (it's a double-precision version) which is BSD licensed, so I assume that the Fortran library is at least BSD-compatible. Therefore, I think a port would not be a problem.

I'll try to mock up a performance comparison between Dierckx.jl and Grid.jl. I don't think Dierckx even has a regular 2-d Grid implementation, but even irregular vs regular might be informative.

@kbarbary
Copy link

I added evaluation on the grid spanned by two input arrays to Dierckx.jl and did a quick benchmark. Here's the benchmark script:

using Dierckx
using Grid

n = 1000

x = linspace(1, n, n)
z = ones(n, n)

xi = linspace(40., 50., n)

# Grid.jl
g = CoordInterpGrid((1:n, 1:n), z, BCnearest, InterpQuadratic)
g[xi, xi]
@time g[xi, xi]

# Dierckx.jl
s = GridSpline(x, x, z; kx=2, ky=2)
evalgrid(s, xi, xi)
@time evalgrid(s, xi, xi)

Results:

$ julia benchmark.jl
elapsed time: 0.3615498 seconds (183661928 bytes allocated, 17.83% gc time)
elapsed time: 0.011027167 seconds (12048288 bytes allocated)

Using my @timeit macro in place of @time to do more loops gives similar results:

$ julia benchmark.jl
1 loops, best of 3: 353.75 ms per loop
10 loops, best of 3: 12.64 ms per loop

This is Grid.jl v0.3.5. I haven't had a chance to try out the performance-enhanced branch yet.

timholy added a commit that referenced this issue Oct 18, 2014
@timholy
Copy link
Owner Author

timholy commented Oct 18, 2014

Thanks! That's a pretty huge gap. I trust that kx=2, ky=2 implements quadratic 3-point interpolation?

This revealed a couple of bottlenecks, including one type problem. Some fixes are in b73de68 and f2293a1, which you can get via Pkg.update(). For me these give a roughly 4x improvement, which is nice, but a far cry from where we clearly need to be. From what I can tell, the rest simply reflects the architecture of Grid, which was written long ago and needs a revamp. I'd certainly encourage you to stick with your Dierckx work if you're in a hurry to obtain good performance. I'm pretty excited about having that package available, for benchmarking a native Julia implementation as well as in practical usage.

The performance-enhanced work hasn't yet implemented CoordInterpGrid (needed for non-integer coordinates), so you couldn't easily test it.

@kbarbary
Copy link

Cool, glad to know this helped a bit already!

Regarding spline order, from the Fortran docstring (just to be sure I'm interpreting kx and ky correctly):

kx,ky : integer values. on entry kx and ky must specify the degrees
          of the spline. 1<=kx,ky<=5. it is recommended to use bicubic
         (kx=ky=3) splines.

I found that kx=3, ky=3 is very roughly 50% slower.

I'll keep going with Dierckx.jl as time allows. Maybe if I happen to learn enough about what its actually doing I can contribute to Grid.jl as well.

@kbarbary
Copy link

FYI, I added Dierckx.jl to METADATA. I'll probably announce it more broadly once I can test that it builds on OSX.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants