Plan for modularization #59
A few things to unravel here. One thing is, it's not my position to be able to call ODE.jl officially deprecated. That would have to be something the community decides on. There is work in ODE.jl in the form of SciML/ODE.jl#49 to essentially re-do ODE.jl as iterators. Who knows when that will actually "finish", but it is better than the current ODE.jl (which isn't saying much because the current ODE.jl isn't always type-stable...). So while it's not for me to unilaterally decide what is best for the community, you can see that all of the benchmarks I have are against this new iterator update, and they show that the DifferentialEquations.jl ODE solvers are both faster and more accurate than any of those of ODE.jl. I can objectively say that DifferentialEquations.jl has more algorithms, which are both faster and more accurate, and has more features (higher order dense output, plot recipes, multithreading, etc), and leave it to the community to make the decision of what's "official", whether there should be one, and what the criteria for choosing that one are. In the meantime, I'll just keep implementing/improving more algorithms. That said, I agree that this package should be split; the real question is how. I like the experience that DifferentialEquations.jl offers: you do Even this leads to issues without conditional dependencies, since one of the ways I am adding for solving an SDE is to solve the related Kolmogorov PDE, and the PDE solvers really need the ODE solvers, and then allowing you to build SDE problems by adding noise to an ODE problem means that they all require each other, and so we're at the same point where using one of them would use all of them (this is ONLY if they aren't conditional dependencies. Otherwise you could have them require each other just when needed...). This is the current dilemma, which is why I haven't made a move on splitting it up yet (the actual splitting would be pretty easy: it would take no more than a few hours).
To really cut down on the size of the package, I should probably split the documentation, examples, and benchmarks out. I have it set up so that GitHub doesn't count those in the line total, but if I didn't have them ignored, you'd see that the majority of DifferentialEquations.jl is actually Jupyter notebooks (when counting lines / size, since those things are so bloated). Would it be okay to have the user install another package for the examples? I guess I could just add that to the lines where I tell people how to open the example notebooks. |
Another thing that heavily contributes to the size of the package is the ODE tableaus. I have a crap ton of them. I think that most people should stick to using the optimized ODE algorithms and the tableaus themselves are more of a research tool (it's like 100 tableaus or so, and they plot stability diagrams and you can benchmark them all against each other). I use some of them in the tests, though they are only used as options along with the |
How large are all the files? If it's a few MB I'd say it's not worth separating into a new package, which will be less likely to be installed
It's unfortunate that conditional modules are lacking at the moment. However, it still seems to me that the backbone is the set of ODE solvers. Is it not possible to split just the ODE solvers out and have another package with all the other solvers? If I understand correctly, your concern is the following: ODE solvers are needed for the PDE solvers, and the SDE and ODE solvers are needed by the SPDE solvers. Would it not be possible to split the SDE/ODE solvers from the PDE-related solvers? Right now it's quite difficult to wrap my head around the code, given how it's currently structured. |
Benchmarks + Docs + Examples are 92 MB right now. Probably a lot of the .git history too (182 MB). Those two together are >95% of the size.
SDEs and PDEs are essentially the same via the Forward Kolmogorov equations. I am not using that quite yet because I need to make Grids.jl for efficient iterators in order to have the best possible FDM methods (going to make them right the first time, not the MATLAB-style with full matrices). However, that's where that's heading in the very near future. But then again, only one way to solve SDEs is through parabolic equations, and only one way to solve parabolic equations is through SDEs, but the domains for which these methods are best are large enough that I want to fully exploit this relationship (this was the reason for making all of this together in the first place). Right now the different solvers are all in different folders (src/ode, src/sde, etc.). The only things that are together are the types (the solution types are all together, the problem types are all together, etc) in the src/general directory. Maybe splitting the types by equations would make it easier, and the general folder should just have the functions on the abstract types? [The thing that's probably hard to understand is the special casing: each optimized algorithm has its own interpolation scheme, and there's different default settings for widely used algorithms, etc. In fact, I think the ODE algorithms are the most in depth in this respect (ode_solve is just an outer wrapper, but it's able to handle so many different cases that it ends up being a lot of code). What I should really do is extend the contributor documentation more: detailing what each file/folder is doing.] |
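For reference, the SDE/PDE correspondence mentioned above can be written out in the scalar case. The notation here (drift $f$, diffusion $g$, density $p$) is assumed for illustration and is not taken from the package. For an Itô SDE

$$dX_t = f(X_t, t)\,dt + g(X_t, t)\,dW_t,$$

the probability density $p(x, t)$ of $X_t$ satisfies the forward Kolmogorov (Fokker-Planck) equation

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}\bigl[f(x, t)\,p\bigr] + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\bigl[g(x, t)^2\,p\bigr],$$

which is a parabolic PDE. This is why an SDE solver can lean on a PDE solver (and, through a spatial discretization, on an ODE solver).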
Wow yeah....my .julia folder is ballooning out of hand in size, I suspect this isn't just a problem with your package. For the SDEs issue, I'm thinking of lightweight SDE solvers such as Euler-Maruyama and variants. What if the ODE package contained stiff/regular/geometric solvers plus the lightweight SDE solvers? |
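To make "lightweight" concrete, here is a minimal Euler-Maruyama sketch for a scalar SDE dX = f(X, t) dt + g(X, t) dW with a fixed step. The function name `euler_maruyama` and its signature are hypothetical, chosen only for this illustration; this is not the DifferentialEquations.jl implementation or API.

```julia
using Random

# Hypothetical helper for illustration only (not part of any package here).
# Integrates dX = f(X, t) dt + g(X, t) dW on [t0, t1] using n equal steps.
function euler_maruyama(f, g, x0, t0, t1, n; rng = Random.default_rng())
    h = (t1 - t0) / n                  # fixed step size
    xs = Vector{Float64}(undef, n + 1)
    xs[1] = x0
    for i in 1:n
        t = t0 + (i - 1) * h
        dW = sqrt(h) * randn(rng)      # Brownian increment ~ N(0, h)
        xs[i + 1] = xs[i] + f(xs[i], t) * h + g(xs[i], t) * dW
    end
    return range(t0, t1; length = n + 1), xs
end

# Usage: one path of geometric Brownian motion dX = 0.5 X dt + 0.2 X dW.
ts, path = euler_maruyama((x, t) -> 0.5x, (x, t) -> 0.2x, 1.0, 0.0, 1.0, 1000;
                          rng = MersenneTwister(1))
```

With `g` identically zero the loop reduces to forward Euler for the ODE, which is one way to see how closely the simplest SDE and ODE steppers are related.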
The .julia folder is ballooning since each package is a Git repository, and Git repos retain history. Tom was just talking with the METADATA maintainers about pruning Plots.jl for this reason. I pruned the repo once before actually (I had to remove sensitive information, i.e. unpublished methods for SDE adaptivity and higher order SDE methods, both waiting on publication). This is an unsafe operation since it means older releases will not work. Retaining history also means the repository doesn't get smaller when you remove things from it... it's set up for small packages, but this fact, plus the lack of conditional dependencies, is causing clear problems in larger packages. I agree that the FEM set should be its own package though. FEM usually will only be PDEs. The interaction between ODEs/SDEs -> PDEs is only for things like square domains, which are better suited for FDM methods. The FDM methods should be pretty lightweight: just building an ODE to solve. |
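As a rough illustration of "just building an ODE to solve", the sketch below semi-discretizes the 1D heat equation u_t = u_xx on (0, 1) with zero Dirichlet boundaries using central differences, which leaves a system of ODEs in time. The names `heat_rhs!` and `solve_heat` are hypothetical, and the forward Euler loop is only a stand-in for whichever ODE solver would actually be used. No full matrix is formed.

```julia
# Right-hand side of the semi-discretized heat equation u_t = u_xx on (0, 1)
# with u = 0 at both ends. `u` holds the interior grid values only.
function heat_rhs!(du, u, dx)
    n = length(u)
    for i in 1:n
        left  = i == 1 ? 0.0 : u[i - 1]   # Dirichlet boundary value
        right = i == n ? 0.0 : u[i + 1]
        du[i] = (left - 2 * u[i] + right) / dx^2
    end
    return du
end

# Stand-in time stepper (forward Euler); an ODE solver would take its place.
# Stability of this stepper requires dt <= dx^2 / 2.
function solve_heat(u0, dx, dt, nsteps)
    u = copy(u0)
    du = similar(u)
    for _ in 1:nsteps
        heat_rhs!(du, u, dx)
        u .+= dt .* du
    end
    return u
end

n = 49                       # interior grid points
dx = 1 / (n + 1)
x = [i * dx for i in 1:n]
u = solve_heat(sin.(π .* x), dx, 1e-4, 1000)   # integrate to t = 0.1
```

For this initial condition the exact solution is exp(-π² t) sin(πx), so the result can be compared against it at any grid point.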
Sounds good, this issue was inspired by the fact that 95% of the time myself and those I work with only use at best 3 different ode solvers, and 90% of the time it's just ode45 |
That's the problem: best is dependent on the problem. Dormand-Prince 4/5 solvers (ode45) are only "best" in cases where you need mild tolerances (1e-3 to 1e-5 accuracy?) and a quick interpolant (though this isn't really true, it's based on old wisdom. See how the newer As you probably know, the story gets even more complicated for PDEs. SDEs are oddly different: I am finding that some of the higher order Runge-Kutta methods I am using with the special adaptivity just tend to work on tough equations better than other methods (since there really isn't a "stiff solver" due to the non-finite moments of the inverse normal distribution). But that's for a later time (after publication). Even then, it's faster to solve a PDE to get the distribution for an SDE than it is to do a Monte Carlo experiment, but only for low-dimensional SDEs. Etc., so many different cases. That said, I think the discussion on separating pieces of a package like this should come down to two factors: size and how easily it breaks (i.e. does that section cause precompilation errors?). The code base is surprisingly small in size since the code itself is entirely Julia (it's about 30,000 lines of Julia, but since it's all text it's small). And since it's all Julia code (no C/Fortran dependencies), it doesn't cause build issues. That's why I haven't seen this as very pertinent. The crucial things to have as conditional dependencies are things like Sundials and ODEInterface since, because they have to build binary dependencies, they can cause breakage (rather easily). That's why those must be optional. But the FEM methods? It's a 76 KB folder that doesn't cause compilation issues. |
Thanks for this. It might be interesting to think about an interface that is agnostic to the user's expertise. For example:
I'm not sure if "size" should account for whether separation makes sense. One could argue that a package should contain only those features that are essential for its main function, and that the smaller this is the better. Whether this is relevant to DifferentialEquations.jl is another story, since we probably won't see any of this happening until Julia's conditional module features are more mature. |
Conditional imports look like they're coming sooner rather than later: JuliaLang/julia#6195 (comment). I'll hold out for that. |
What names should I use? I am thinking that instead I should just start doing the split, and for now just set it up with them requiring each other. So it's more of a semantic breakup until conditional modules come. |
Some additional thoughts: ODESolve Or: |
@tkelman, could you chime in? Is there a naming scheme you would suggest? Any you'd reject? I'm also sending the Julia Praxis squad here for opinions. |
Consider the appropriate ways to categorize your application with respect to the subfields it addresses. One way that I have found useful is to analyze the collective from the view that it is a whole, and use the elements that come of analysis as grist for resynthesis .. a remaking of the sense in which it is a whole; then use that as the basis for subselection. |
SolveODE, SolvePDE, SolveSDE is preferred over ODESolver etc ({what you are doing}{to/for/on what}) just as RedirectIO was chosen over IORedirect. |
Why? What's the precedent? |
I don't see any reason for SolveODE over ODESolvers; after all, the package is a suite of ODE solvers ... Ideally, we could have the following name hierarchy ODEs It's simple, readable, and no one could mistake their purpose. |
@ViralBShah, do you have any comments? |
I did not know this to be a packaging of solvers. I assumed there were solvers in it; I did not assume there were only solvers. The package is not named DifferentialEquationSolvers.jl. |
acronyms should be avoided |
Here's the full list of what's all going on right now, trying to break it apart into packages. These are the things which need good naming schemes.
And in the near future, following the roadmap #47:
Then, as discussed here, having Domain-Specific API packages. I have no idea for the naming, but something like
(or [This is just laying out what exactly is all in here. Clearly it's too much for one package, but I hope we can work out good names...] |
Do you have a suggestion then? Because I think |
@tkelman That's an arbitrary rule that isn't uniformly applied to packages (e.g. ODE, SDE, JuMP, PyPlot, PyCall, PkgDev, and many others), so I don't quite understand the thumbs down. In this case, I've never read a paper where they explicitly spell out PDE, ODE, SDE. These are such well-established acronyms in the field. |
abbreviations are usually okay, but ODE and SDE are bad names that probably would be spelled out more if registered today. people not in the field will want to know what the package is for and possibly use it as well. longer names are clearer. |
What about |
A thought on Financial_, Biological_: The helpfulness that spelling things out brings is particularly evident with packages that target people with different expertise and interests. OrdinaryDiffEq is much more accessible than ODE and reads closer to OrdinaryDifferentialEquations than ODE, but FinancialDiffEq (while better than FDE or FinancialDE) is not as accessible as FinancialDifferentialEquations. At the margin, what is the cost of naming a package OrdinaryDifferentialEquations rather than OrdinaryDiffEq? As a package user, the value of seeing at a glance what is going on without having to focus on specific associations is greater than the cost of typing the additional letters at the top of a file. The habit in programming of giving things shortened names arose from physical limitations that have since been overcome. The tendency to use shorter variable names as a way of saving typing and keeping source text from getting overly ragged is irrelevant when considering module names. |
So you think that those should be appended with |
No -- I prefer models, that is much better! |
Hey, major updates. DifferentialEquations.jl can be broken up without needing conditional dependencies. I've worked it out, and already have some prototypes locally. Given the way testing happens (it requires the registered versions), this will be done in stages. Here's what it looks like:
So thank you guys for all of the feedback. @tkelman we need to agree on names, and the rest should follow quite easily (except for what requires Pkg3). |
AdjectiveDiffEq seems sufficiently descriptive to me |
@tkelman the repositories are ready but there's a chicken and egg problem where DiffEqBase tests won't pass until OrdinaryDiffEq, StochasticDiffEq, etc. are registered, but of course those all depend on DiffEqBase, and there are other dependencies around. Can I register the whole batch together and submit patches if needed? |
The PR is in: JuliaLang/METADATA.jl#6833. This covers the main changes. Then registering / tagging the models packages will come, then DifferentialEquations.jl will get a major release. |
Completed by cb08e55. The common interface is being discussed in SciML/Roadmap#5 so that other packages can plug into the ecosystem more easily. But yes, now parameter estimation, sensitivity analysis, etc. are separate packages but utilize all ODE solvers, etc. This push will homogenize the diffeq solver interface while distributing the ability to contribute. Thanks for the inputs! |
There's a lot of great functionality here...but there is almost too much functionality in one package 😄
One thing I think would make sense as a first step is to separate out PDE solvers from ODE solvers. This could be done alongside deprecating ODE.jl. IMO there should be just "one" ODE package that is "the" package for ODEs in Julia. If there are two ODE packages the tradeoffs between the two should be made clear in the README. I.e. why should I use this ODE package vs the other ODE package.
A lot of people will likely want to use the ODE solvers without using the PDE solvers. From experience, most of the time I have to write a specialized PDE solver for my use case, which will take advantage of an ODE solver.
Things to think about:
What exactly to separate out?
Should the SDE solvers be in their own package too? (Currently I think this would also be a good idea, but there might be an argument made against that ..., I suppose.)
The ODE package should contain: (please fill in/update)