-
Notifications
You must be signed in to change notification settings - Fork 26
Performance issues #36
Comments
Well, I wouldn't mind being part of a re-write (given the usual disclaimers about having enough time on my hands etc...). So you won't scare me off just by threatening with that! ;) I don't think Cartesian has to be a problem either - even if there is obviously going to be some learning curve before I understand how it works, I must admit I don't find the current code very easy to understand either; although it's only a few months since I spent a week more or less pouring over this to implement Since we both have other concerns too, maybe a good strategy here is to outline some changes we'd like to see, sketch up a structure for the new version, and then start working. Then we can work on it in a separate branch that we can plan to be quite long-lived, basically re-writing the library, and when we have feature-parity we merge (probably more-or-less overwrite) and tag a new version. Given the large number of people that use Grid, it's probably better to make any breaking changes all at once. |
Just saw this after posting on #37. OK, I'll put a demo up shortly. |
By the way, to iterate what could be your own mantra: have you profiled Grid.jl in the cases where it's slower? 😉 |
Yes, I just don't have the data to show you right now. That led to some performance improvements a month or so back. But I think fixing the remaining issues may require some design changes. |
I've been following Grid.jl with some interest, as I need fast interpolation in my some of my work (though I don't know that much about interpolation myself). I usually work in Python, where I use Right now Dierckx.jl only does one thing (just a proof of concept): 2-d grid interpolation with irregular spacing. And it's limited to Float64 arrays. But I can add wrappers for specific functionality upon request. |
@kbarbary, thanks for telling us about this. Sounds cool. There's more info about what kinds of performance gains are possible in #38---roughly, 4-fold. Have you benchmarked Dierckx.jl against Grid.jl? It would be interesting to know if you get about 4-fold (in which case just sticking with Julia will have some advantages), or if Dierckx can do even better. Of course, it sounds like you haven't wrapped the comparable routines yet. If you do that, I'd be interested in seeing the numbers. |
@kbarbary Cool! In addition to what Tim has already said, I think it would also be interesting to read up more on the algorithms used by the NETLIB library, especially if we can show that they perform well. Interpolations on irregular grids in >1D is currently not supported at all in Grid.jl, but it would of course be very useful. Do you have any references to papers or similar with the algorithms? Or would it be feasible (and unproblematic license-wise) to port the FORTRAN library straight-off? |
@tlycken Unfortunately, the only thing I've found is a reference to a book by Dierckx on the NETLIB page, which isn't very helpful. There's some documentation in the code comments, but I don't think it describes the methods adequately. Interestingly, I found an R package wrapping Dierckx called
My first thought was actually to read the Fortran and port to Julia, but after looking into it, it seemed lower effort to just wrap it. :) Porting would be a good way to understand the algorithm though, so I may still do it. Regarding licensing, the licensing of the NETLIB package is unclear to me. The copy of the library that I put in Dierckx.jl is actually from scipy.interpolate (it's a double-precision version) which is BSD licensed, so I assume that the Fortran library is at least BSD-compatible. Therefore, I think a port would not be a problem. I'll try to mock up a performance comparison between Dierckx.jl and Grid.jl. I don't think Dierckx even has a regular 2-d Grid implementation, but even irregular vs regular might be informative. |
I added evaluation on the grid spanned by two input arrays to Dierckx.jl and did a quick benchmark. Here's the benchmark script: using Dierckx
using Grid
n = 1000
x = linspace(1, n, n)
z = ones(n, n)
xi = linspace(40., 50., n)
# Grid.jl
g = CoordInterpGrid((1:n, 1:n), z, BCnearest, InterpQuadratic)
g[xi, xi]
@time g[xi, xi]
# Dierckx.jl
s = GridSpline(x, x, z; kx=2, ky=2)
evalgrid(s, xi, xi)
@time evalgrid(s, xi, xi) Results:
Using my
This is Grid.jl v0.3.5. I haven't had a chance to try out the performance-enhanced branch yet. |
Thanks! That's a pretty huge gap. I trust that This revealed a couple of bottlenecks, including one type problem. Some fixes are in b73de68 and f2293a1, which you can get via The performance-enhanced work hasn't yet implemented |
Cool, glad to know this helped a bit already! Regarding spline order, from the Fortran docstring (just to be sure I'm interpreting kx and ky correctly):
I found that I'll keep going with Dierckx.jl as time allows. Maybe if I happen to learn enough about what its actually doing I can contribute to Grid.jl as well. |
FYI, I added Dierckx.jl to METADATA. I'll probably announce it more broadly once I can test that it builds on OSX. |
The performance of Grid is not as good as it could be. I don't have time to throw in a demo right now, but one can pretty easily write a hand-written function that performs linear interpolation. I've tended to find that it is many times faster than Grid.
Grid got started ages ago (even before there were such things as packages), and now I know a lot more about how to do fast multidimensional operations in Julia. However, one major point of concern: a really fast Grid would currently require one to use Cartesian. I have no desire whatsoever to scare co-developers off (especially @tlycken), so I won't even consider this if there are concerns. (I'm also busy with other things right now.)
The text was updated successfully, but these errors were encountered: