
DeepONet Multiple output #9

Closed
KirillZubov opened this issue Jun 26, 2024 · 10 comments · Fixed by #15

@KirillZubov
Member

using Lux, LuxNeuralOperators, Random, Test

u = ones(Float32, 10, 10, 5)
v = ones(Float32, 1, 10, 5)
deeponet = DeepONet(; branch = (10, 10, 10), trunk = (1, 10, 10))
ps, st = Lux.setup(Random.default_rng(), deeponet)

y, st_ = deeponet((u, v), ps, st)
@test size(y) == (10, 5)

Currently the output is only for a single-feature solution, dropping one dim from (1, 10, 5) to (10, 5). There is no dim for multiple outputs; it needs an additional dim, (output_dim, 10, 5).
My suggestion, and an easy way to do it, is to add a linear layer as an option, as was done in src/neural_operators.jl here: SciML/NeuralPDE.jl@30a5134#diff-3623c72624d6f36cc808e510588bfe6ed4872162fdce6e12b69408856c038c5d

deeponet = DeepONet(; branch = (10, 10, 10), trunk = (1, 10, 10), linear = (10, output_dim))
@avik-pal
Member

With linear, do you expect linear(y) where y is the output of the dot product, aka dropdims(sum(...)), or do you want it applied to sum(...) without collapsing the first dim?
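
For concreteness, a minimal sketch of the two readings (the shapes p, N, nb, output_dim and the Dense layer here are made up for illustration):

using Lux, Random

p, N, nb, output_dim = 10, 16, 5, 3
b = rand(Float32, p, 1, nb) # branch output, broadcastable against the trunk
t = rand(Float32, p, N, nb) # trunk output

# reading 1: linear(y) where y is the collapsed dot product
y = dropdims(sum(b .* t; dims = 1); dims = 1) # (N, nb): linear would act on this

# reading 2: apply it to sum(...) without collapsing the first dim
s = sum(b .* t; dims = 1) # (1, N, nb)
linear = Lux.Dense(1, output_dim)
ps_l, st_l = Lux.setup(Random.default_rng(), linear)
y2, _ = linear(s, ps_l, st_l) # (output_dim, N, nb)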

@KirillZubov
Member Author

We need the sum only over the embedding dim, the inner feature representation, which is the dot product of branch and trunk in DeepONet.
If we use a linear layer as the last layer, we don't need to drop the embedding dim, because the linear transformation does that.

@avik-pal

using Lux, Random

embedding_size = 10

branch = Lux.Chain(
    Lux.Dense(3, 20, Lux.tanh_fast),
    Lux.Dense(20, 20, Lux.tanh_fast),
    Lux.Dense(20, embedding_size))

a = rand(3, 50, 40, 1)
θ, st = Lux.setup(Random.default_rng(), branch)
b_out, st = branch(a, θ, st)

trunk = Lux.Chain(
    Lux.Dense(2, 20, Lux.tanh_fast),
    Lux.Dense(20, 20, Lux.tanh_fast),
    Lux.Dense(20, embedding_size, Lux.tanh_fast))

a = rand(2, 1, 1, 30)
θ, st = Lux.setup(Random.default_rng(), trunk)
t_out, st = trunk(a, θ, st)

out_ = b_out .* t_out;

# julia> size(out_)
# (embedding_size, 50, 40, 30)

# here we sum over the embedding (inner representation) dim
out = sum(out_, dims = 1)
# julia> size(out)
# (1, 50, 40, 30)


output_size = 3
linear = Lux.Dense(embedding_size, output_size)
θ, st = Lux.setup(Random.default_rng(), linear)
l_out, st = linear(out_, θ, st)

# julia> size(l_out)
# (output_size, 50, 40, 30)

P.S. An option for a layer at the end can be used for many other purposes (for example Fourier feature embeddings, among many others), not only for the multiple outputs that I describe in this issue. So in general I think it would be a good thing for the interface.
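
For instance, a sketch of a non-linear head in place of a single Dense layer (the names and sizes here are made up; out_ stands in for the branch-trunk product from the script above):

using Lux, Random

embedding_size = 10
head = Lux.Chain(Lux.Dense(embedding_size, 16, Lux.tanh_fast), Lux.Dense(16, 3))

out_ = rand(Float32, embedding_size, 50, 40, 30)
θ, st = Lux.setup(Random.default_rng(), head)
h_out, st = head(out_, θ, st)
# size(h_out) == (3, 50, 40, 30)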

@ayushinav
Contributor

Hi @KirillZubov,
I'm not sure the above implementation of DeepONet is completely correct. The branch net takes the input $u$ at different locations, let's say $\vec{x}$, so $u(\vec{x})$ will be defined at multiple location points. Let's assume for now that we have only a single output, i.e., $u(\vec{x})$ is a scalar.

In the case of multiple outputs, $U = u_1, u_2, u_3, \ldots, u_O$, we can learn the approximation for each of the functions independently of the others, much like what happens in NeuralPDE.jl, where multiple functions are learned independently. We pass a vector of chains instead of one.

The input of the branch net is then a vector. For a batch size of nb and m sampled points, the input would be m x nb. For $x \in \mathbb{R}^d$, the input to the trunk net would be d x N x nb if we have N points at which to evaluate the function.

The output of the branch net would then be embedding_size x nb and that of the trunk net would be embedding_size x N x nb. We'd then want to do a dot product along the first axis, or approach the same thing with a matmul operation. The DeepONet would then output the function evaluations at the N evaluation points, that is, the output would be of size N x nb.

This is how I understand DeepONets to work. If this is correct, the out or out_ in the above code is not correct.
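
A quick shape check of that scheme (sizes made up; batched_mul is from NNlib):

using NNlib: batched_mul

p, N, nb = 10, 16, 8
b = rand(Float32, p, nb) # branch output: embedding_size x nb
t = rand(Float32, p, N, nb) # trunk output: embedding_size x N x nb

# dot product along the first axis via broadcast + reduce...
y = dropdims(sum(reshape(b, p, 1, nb) .* t; dims = 1); dims = 1) # N x nb

# ...or the same thing as a batched matmul
y2 = dropdims(batched_mul(reshape(b, 1, p, nb), t); dims = 1) # N x nb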

@KirillZubov
Member Author

KirillZubov commented Jun 28, 2024

I wrote the script above quickly, just to show how to use a linear layer. My point here is not about how to compute the vector inside; it is about supporting multiple outputs for DeepONet. If you think that isn't needed here, ok.

Yes, we can use a batch of NNs for each output independently, but we can also use one NN for all outputs. Here is an example: https://github.com/SciML/NeuralPDE.jl/blob/a57b966df8e172f7e0eebf9ec0c2d94b6eaf596e/test/NNODE_tests.jl#L137-L153

DeepONet is not about the sizes of the vectors. It is about how to learn an operator from sensor and location information with the branch-trunk architecture; the rest is flexible. It would be better to strive for an implementation that works with the widest possible range of extensions and is not strictly limited.
https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=deeponet&btnG=

@KirillZubov
Member Author

I can implement this feature myself by the way.

@ayushinav
Contributor

ayushinav commented Jun 28, 2024

I see. The linear layer wasn't in the vanilla DeepONet, so I wasn't much aware of it. The vector sizes were just a quick way to check, but I got your point.

using Lux, Random
using NNlib: batched_mul

embedding_size = 10
batch_size = 8

branch = Lux.Chain(
    Lux.Dense(3, 20, Lux.tanh_fast),
    Lux.Dense(20, 20, Lux.tanh_fast),
    Lux.Dense(20, embedding_size))

trunk = Lux.Chain(
    Lux.Dense(6, 20, Lux.tanh_fast),
    Lux.Dense(20, 20, Lux.tanh_fast),
    Lux.Dense(20, embedding_size, Lux.tanh_fast))

u_dims = 4 # 4-dimensional function points
u = rand(3, u_dims, batch_size)
θ, st = Lux.setup(Random.default_rng(), branch)
b_out, st = branch(u, θ, st)

size(b_out)
# (10, 4, 8)
# embedding_size x input_dims x batch_size

N = 16 # sensor points
x = rand(6, N, batch_size) # 6-dimensional points
θ, st = Lux.setup(Random.default_rng(), trunk)
t_out, st = trunk(x, θ, st)

size(t_out)
# (10, 16, 8)
# embedding_size x sensor_dims x batch_size

out_ = batched_mul(permutedims(b_out, (2, 1, 3)), t_out)
size(out_)
# (4, 16, 8)
# input_dims x sensor_dims x batch_size

We can probably use the linear layer after this, but if it's not given by the user, we can just do dropdims along dimension 1 or 2 if that size is 1.
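
A minimal sketch of that fallback, with out_ as in the example above:

out = size(out_, 1) == 1 ? dropdims(out_; dims = 1) :
      size(out_, 2) == 1 ? dropdims(out_; dims = 2) : out_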

@KirillZubov
Member Author

Why do you want to use permutedims(b_out, [2,1,3])? It is limited to 3-dim output only. It is already implemented well by @avik-pal in https://github.com/LuxDL/LuxNeuralOperators.jl/blob/44c39bb6d39443fb4f1a80bcbd8a28e643b63397/src/deeponet.jl#L130
But everything else is fine: an additional last layer as an option (better to call it "additional", since it can be any layer, not only linear). I already implemented it in my tiny DeepONet implementation:
SciML/NeuralPDE.jl@30a5134#diff-3623c72624d6f36cc808e510588bfe6ed4872162fdce6e12b69408856c038c5d

@ayushinav
Contributor

Hi,
Maybe I'm missing something, but I don't completely agree with the current implementation (though my implementation above wasn't correct either).
As an example, the following errors:

using Lux, LuxNeuralOperators, Random

embedding_size = 10

u = rand(Float32, 64, 2, 5) # m x u_dims x nb
y = rand(Float32, 6, 40, 5) # ndims x N x nb

branch = Lux.Chain(
    Lux.Dense(64, 32, Lux.tanh_fast),
    Lux.Dense(32, 32, Lux.tanh_fast),
    Lux.Dense(32, embedding_size))

trunk = Lux.Chain(
    Lux.Dense(6, 20, Lux.tanh_fast),
    Lux.Dense(20, 20, Lux.tanh_fast),
    Lux.Dense(20, embedding_size, Lux.tanh_fast))

model = DeepONet(branch, trunk)
ps, st = Lux.setup(Random.default_rng(), model)

first(model((u, y), ps, st))

This works only when the broadcasting dimensions are the same, which, as we see here, won't always be true.

I think I finally got the high-dimensional case. We have a function u that we want to approximate, which outputs a tensor of shape u_size given an input tensor y of shape y_size.

Let's have m sensor points and N points where we want to evaluate it. For a batch size of nb, we have:

### branch
u                       =>  b
[m x u_size... x nb]    =>  [p x u_size... x nb]

### trunk
y                       =>  t
[N x y_size... x nb]    =>  [N x p x nb]

We leave it to the user to determine how to transform y to a vector of length p through some network. The final output does not depend on the size of y.

t_ = permutedims(t, (2, 1, 3)) # p x N x nb
t_dot = reshape(t_, size(t_, 1), ntuple(_ -> 1, length(u_size))..., size(t_)[2:end]...)
# p x (1, 1, 1, ...) x N x nb

b_dot = reshape(b, size(b, 1), u_size..., 1, size(b)[end])
# p x u_size... x 1 x nb

dropdims(sum(t_dot .* b_dot; dims = 1); dims = 1)
# u_size... x N x nb
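
For concreteness, a runnable version of that scheme with made-up sizes:

p, N, nb = 10, 16, 8
u_size = (2, 3)

b = rand(Float32, p, u_size..., nb) # branch output: p x u_size... x nb
t = rand(Float32, N, p, nb) # trunk output: N x p x nb

t_ = permutedims(t, (2, 1, 3))
t_dot = reshape(t_, size(t_, 1), ntuple(_ -> 1, length(u_size))..., size(t_)[2:end]...)
b_dot = reshape(b, size(b, 1), u_size..., 1, size(b)[end])

out = dropdims(sum(t_dot .* b_dot; dims = 1); dims = 1)
# size(out) == (2, 3, 16, 8), i.e. u_size... x N x nb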

There is also a restriction on the dimensions of the outputs from the trunk, unless the intent is that the user should take care of the dimensions of the trunk. That is reasonable, because we can have any representation of the latent state where we take the dot product, but then the reshape at
https://github.com/LuxDL/LuxNeuralOperators.jl/blob/main/src/deeponet.jl#L129
should not be required, or one of the argchecks at
https://github.com/LuxDL/LuxNeuralOperators.jl/blob/main/src/deeponet.jl#L124
should not be there. Also, maybe it's just me, but this approach did not seem very intuitive to me either.

@KirillZubov
Member Author

ok, got it

@avik-pal
Member

avik-pal commented Jul 3, 2024

u = rand(Float32, 64, 2, 5) # m x u_dims x nb
y = rand(Float32, 6, 40, 5) # ndims x N x nb

To solve this, handle it on a case-by-case basis:

  • If broadcasting is possible, use broadcast and reduce; that should be more performant than doing a permutedims (which forces a copy, an expensive operation that doubles the memory footprint).

  • If not, then use permutedims and batched_mul (see the sketch below).
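
A minimal sketch of that case-by-case choice (not the library's actual code; the helper name and the 3-dim shapes are assumptions):

using NNlib: batched_mul

# hypothetical helper: dot product over the embedding dim, picking the
# cheaper path depending on whether the shapes broadcast directly
function embedding_dot(b::AbstractArray{T, 3}, t::AbstractArray{T, 3}) where {T}
    if size(b, 2) == size(t, 2) || size(b, 2) == 1 || size(t, 2) == 1
        # broadcast + reduce: no copy of the inputs
        return dropdims(sum(b .* t; dims = 1); dims = 1)
    else
        # otherwise fall back to permutedims + batched matmul
        return batched_mul(permutedims(b, (2, 1, 3)), t)
    end
end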
