This is more of a proposal for now. The idea is as follows:
For e.g. `GatherLayer`, we automatically infer the shared axes between the positions and the params using the dim tags. For me, this logic worked very well in the past.
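A minimal config sketch of what this looks like in practice (the layer and tensor names here are made up for illustration):

```python
# Hypothetical network fragment: "params" is [B, T, F], "positions" is [B, T2]
# with indices into the T axis. Only the gathered axis is given explicitly;
# the shared B axis is matched between the two inputs via its dim tag.
network = {
    "gathered": {
        "class": "gather",
        "from": "params",         # [B, T, F]
        "position": "positions",  # [B, T2]
        "axis": "T",              # axis of "params" to gather over
    },  # output: [B, T2, F]
}
```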
In the same way, we could also infer the `var1`/`var2` axes for `DotLayer` as the axes which are not present in both inputs.
This would also work well when axes change due to automatic rec layer optimization, and it would make a `common` argument that could be specified instead of `var1`/`var2`, as discussed in #569, redundant.
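To make the proposed inference concrete, here is a small standalone sketch of the rule (plain Python over lists of dim tags, not the actual RETURNN implementation):

```python
# Proposed rule: after removing the reduced axes, axes present in both
# inputs are shared (e.g. batch/time), and the remaining axes unique to
# each input become var1 and var2, respectively.
def infer_var_axes(dims1, dims2, reduce_dims):
    shared = [d for d in dims1 if d in dims2 and d not in reduce_dims]
    var1 = [d for d in dims1 if d not in dims2 and d not in reduce_dims]
    var2 = [d for d in dims2 if d not in dims1 and d not in reduce_dims]
    return shared, var1, var2

# E.g. a is [B, T, H], b is [B, H, F], reducing over H:
shared, var1, var2 = infer_var_axes(["B", "T", "H"], ["B", "H", "F"], ["H"])
assert (shared, var1, var2) == (["B"], ["T"], ["F"])  # output: [B, T, F]
```

Note that this only works because dim tags make axes identifiable across tensors, which is exactly what breaks down in the dyadic-product case below.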
I'm not 100% sure whether this might be unexpected behavior, though. E.g., to compute the dyadic product of an `[F]`-shaped tensor (i.e. compute `v * v^T`), the user could previously use `DotLayer` and set `var1=var2=F` as well as `red1=red2=()`, and it would do exactly that, giving an `[F1,F2]` tensor.
Now with that change, you would first need to create a copy of the input, rename its axis to something different, and then use `DotLayer`.
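A sketch of how that workaround could look (the fresh dim tag and the exact layer options are assumptions, not a tested config):

```python
# Old style: both var axes named "F", output dim tags ambiguous.
network_old = {
    "outer": {"class": "dot", "from": ["v", "v"],
              "red1": (), "red2": (), "var1": "F", "var2": "F"},
}

# New style: first give the second operand a fresh feature dim tag
# (new_feat_dim stands for a newly created dim tag; the exact creation
# API depends on the RETURNN version), then DotLayer can infer the rest.
network_new = {
    "v2": {"class": "reinterpret_data", "from": "v",
           "set_dim_tags": {"F": new_feat_dim}},
    "outer": {"class": "dot", "from": ["v", "v2"]},
    # var1 = F of "v", var2 = new_feat_dim of "v2", both uniquely referable
}
```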
But the old solution had some issues anyway: it would not be clear what the output dim tags (here `F1` and `F2`) would be. Presumably they would be the same (I'm really not sure what our implementation currently does), but then you could not reference the axes uniquely. That would be pretty bad.
With the new solution, we are very explicit about what the output dim tags are.
Similarly, we can also get away with just a single `reduce` argument (#636).
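In that spirit, a `DotLayer` call under the proposed interface could shrink to something like this (sketch only; `hidden_dim` is an assumed dim tag for the reduced axis):

```python
# a: [B, T, H], b: [B, H, F]; only the reduced dim is stated.
network = {
    "dotted": {"class": "dot", "from": ["a", "b"], "reduce": hidden_dim},
    # shared B and var axes T / F are inferred from the dim tags -> [B, T, F]
}
```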
Originally posted by @Zettelkasten in #627 (comment)