Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Code matching and Expr generation #183

Merged
merged 24 commits into from
Jan 30, 2021
Merged

WIP: Code matching and Expr generation #183

merged 24 commits into from
Jan 30, 2021

Conversation

shashi
Copy link
Member

@shashi shashi commented Jan 22, 2021

Starting the effort to release all the tension around build_function here.

Maybe the matchable thing is too fancy, but I like the macro as an experiment at least.
It makes

@matchable struct Hypot
   x
   y
end

matchable in the form of:

@rule Hypot(~x, ~y) => hypot(~x, ~y)

Some arguments are vectors, e.g. pairs in Let, we need some matchers for vectors, and tuples maybe.

We still need some types to represent function outputs of various forms, so that body fields in Func or Let can represent:

  • In-place array output
  • new array output (with specified default value)
  • tuple output
  • sparse array output with known sparsity
  • StaticArray output
  • any combination of these (tuple of vectors, sparse matrix of StaticArrays...)

Copy link
Member

@YingboMa YingboMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

lhs(a::Assignment) = a.lhs
rhs(a::Assignment) = a.rhs

const (←) = Assignment
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to force the user to use Unicode. Can we also alias it to :=?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a binary operator...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They can always use Assignment(a, b) :-p

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha

@shashi shashi mentioned this pull request Jan 28, 2021
@cscherrer
Copy link
Contributor

Hi @shashi , thanks for the pointer! I'm still missing some context.

  • What is "the tension around build_function"?
  • What do you mean by "body fields in Func or Let"? I don't see any code using those names.

Will this let the @matchable struct have a type that can be determined by its arguments? Mostly I need first-class typed lambdas as in #181

@shashi
Copy link
Member Author

shashi commented Jan 28, 2021

around build_function

https://github.com/SciML/ModelingToolkit.jl/blob/master/src/build_function.jl

^ this code is not pretty

What do you mean by "body fields in Func or Let"? I don't see any code using those names.

@matchable struct Func
    args
    kwargs
    body #   <-- this
end

have a type that can be determined by its arguments?

not unless you make them parameters.

@cscherrer
Copy link
Contributor

Thanks @shashi . MTK didn't seem set up to have the flexibility I needed, so I haven't really done much with it. Hopefully it can make sense at some point if it becomes more flexible or some functionality is factored out into other packages.

@ChrisRackauckas
Copy link
Member

The problem is that it's too flexible. Some of it needs to be refactored into other dispatches because the sparse array handling overlaps with all of the parallelism choices in a way creates a bit of a mess. And that's not to mention the fact that there's little code reuse between the portions for generating code for C, Stan, and MATLAB.

We instead need a way to specify "these are the blocks to keep together" (with cost functional information and locality information that connects to nonlinear tearing) which is separated from the specification of how to parallelize the blocks. The macroeconomics community donated a good set of tests over all of the sparsity stuff:

https://github.com/SciML/ModelingToolkit.jl/blob/master/test/build_function_arrayofarray.jl

The parallelism is a little undertested so it needs to be careful, but with a few parallelism tests it should be pretty easy to know when a superset has been hit with a nicer functional form.

@ChrisRackauckas
Copy link
Member

But the real issue with build_function -> Symbolics.jl is going to be about how it connects with tearing. Tearing does need to work, and that is probably going to add a few complications.

@cscherrer
Copy link
Contributor

The problem is that it's too flexible.

Can MTK let me specify

  • types for the symbolic values
  • how the simplification is done, and
  • how the code is generated?

It didn't seem to expose interfaces for these.

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Jan 28, 2021

Well of course it can always get more features. But it's already a big load of features which is the cause of the refactor.

types for the symbolic values

That's moreso because types are not fully supported in the symbolic system at this point, which is something that needs to get fixed. But it does do this: vectors of vectors and such are exactly what I just pasted tests of.

how the simplification is done

what do you mean by that? simply.(ops) on the input?

how the code is generated?

ParallelForms, targets like CTarget, etc. It can generate code for clusters and it can put equations inside of Stan. That sounds like quite a lot of "how the code is generated" zzzz.

@cscherrer
Copy link
Contributor

@ChrisRackauckas Here's the kind of thing I need to be able to do:
https://informativeprior.com/blog/2021/01-25-symbolic-simplification/

This uses Soss, MeasureTheory.jl, and SymbolicCodegen.jl

@ChrisRackauckas
Copy link
Member

Yes, the goal is to make it do all of that, plus work with sparsity patterns, automate the parallelism, and make it work with strongly connected components information for tearing. CSE falls into this because we immediately did sparsity -> parallelism, but you cannot easily then apply CSE without breaking parallel code because then you have to know the dependencies, i.e. you can only CSE things that live on the same node, and that needs to change the parallelism. So that's why it needs to take a step back and instead:

  1. First represent the core function it's trying to calculate (the Let) with known strongly connected components information and tearing information.
  2. Perform CSE to reduce the computation but increase the size of the graph, along with pasting in observed variables. Any other custom code changes should happen here.
  3. Split the blocks into groups for parallel computation
  4. Map that output onto the output vector. If the output is supposed to be sparse, then represent the sparse pattern by its linear .data vector and map it directly onto there (which is the fast way to do it anyways).

The big problem is that in the current design 3 and 4 are flips / intertwined, making 2 almost impossible, because we never did 1. But the goal is, knowing the issues now, solve it correctly by lowering in stages like that. (@shashi did I miss anything? I think that's all of it)

@cscherrer
Copy link
Contributor

Ok that's all great, my point was just about flexibility. I needed different functionality to be cleanly separated. Just having one giant build_function is extremely limiting.

@ChrisRackauckas
Copy link
Member

Just having one giant build_function is extremely limiting.

We noticed 🤣. It outgrew its use. It's Lennie's rabbit and it got too much love until it needs a rewrite, which @shashi is leading since now we know a good scope for it. When it was designed, mixing with Dagger.jl and tearing wasn't even unheard of, we didn't even know it could be a thing.

@shashi
Copy link
Member Author

shashi commented Jan 29, 2021

It is usually not super useful to have types as parameters, except for dispatch. You can always call symtype at run time -- this is probably better whenever it suffices.

@shashi shashi marked this pull request as draft January 29, 2021 16:07
@shashi shashi marked this pull request as ready for review January 30, 2021 12:21
@shashi shashi merged commit 8602860 into master Jan 30, 2021
@shashi shashi deleted the s/code branch January 30, 2021 13:02
@shashi
Copy link
Member Author

shashi commented Feb 4, 2021

MTK didn't seem set up to have the flexibility I needed

This is true, and that's why we are doing this. and the rewrite.

I am thinking more about how we can make it even better and flexible before I merge the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants