No inlining let-bound global vars with clock types #2846
0a2d090 to b13b96f
I think we should backport this to 1.8?
Furthermore, it wouldn't surprise me if the reproducer in clash-testsuite that is added here no longer reproduces the issue once issue #2570 is truly fixed. This is just something I realised and wanted to point out.
Are you aware that bindConstantVar already checks for (vars with) local ids?
clash-compiler/clash-lib/src/Clash/Normalize/Transformations/Inline.hs
Lines 118 to 120 in 10f26ff
test _ (i,stripTicks -> e) = case isLocalVar e of
  -- Don't inline `let x = x in x`, it throws us in an infinite loop
  True -> return (i `notElemFreeVars` e)
I think that means that your (isLocalId v) will be False, except possibly when we recurse inside isWorkFreeIsh.
-- Only local variables with a clock type are work-free. When it is a global
-- variable, it is probably backed by a clock generator, which is definitely
-- not work-free.
The local variable distinction feels arbitrary to me.
Given:
clk = clockGen
clkGlobal = clk
f =
let
clkLocal = clk
clkA = clkLocal
clkB = clkGlobal
in [...]
As I understand it, you're saying here it is ok for bindConstantVar to replace clkA inside of [...] with clkLocal, but not clkB with clkGlobal.
Is that what you're saying?
And if so, why?
They seem equivalent to me.
Yes, that's what I'm saying. Inlining clkA everywhere will not make the circuit f any larger (that is, it duplicates no work). Inlining clkB everywhere will make the circuit f larger, because it will have duplicated calls to what will ultimately be clockGen.
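The size argument above can be made concrete with a toy cost model. The Haskell sketch below is not Clash's implementation: `Expr`, `LetExpr`, `work`, `totalWork`, and `inlineBinder` are hypothetical names, and it assumes (following the claim above) that every reference to a global is eventually expanded to its definition at the use site, whereas a reference to a local shares the existing let-binding. It also assumes the globals are not recursive.

```haskell
-- Toy expression language; hypothetical, far simpler than Clash's Core.
data Expr
  = Local String        -- reference to a let-bound variable
  | Global String       -- reference to a top-level binder
  | Prim String         -- a primitive such as clockGen
  | App String [Expr]   -- named function applied to arguments
  deriving (Eq, Show)

type Globals = [(String, Expr)]

-- A single let-block: bindings plus body.
data LetExpr = LetExpr [(String, Expr)] Expr
  deriving (Eq, Show)

-- Number of primitive instances an expression gives rise to.
-- Assumption: globals are expanded at every use site (and are non-recursive),
-- while locals refer to a binding that is counted once, where it is defined.
work :: Globals -> Expr -> Int
work _  (Local _)    = 0
work gs (Global g)   = maybe 0 (work gs) (lookup g gs)
work _  (Prim _)     = 1
work gs (App _ args) = sum (map (work gs) args)

-- Work of all remaining bindings plus the body.
totalWork :: Globals -> LetExpr -> Int
totalWork gs (LetExpr bs body) = sum (map (work gs . snd) bs) + work gs body

-- Replace every use of local x by rhs.
subst :: String -> Expr -> Expr -> Expr
subst x rhs (Local y) | x == y = rhs
subst x rhs (App f args)       = App f (map (subst x rhs) args)
subst _ _   e                  = e

-- Inline one let-binder everywhere and drop its binding.
inlineBinder :: String -> LetExpr -> LetExpr
inlineBinder x (LetExpr bs body) =
  case lookup x bs of
    Nothing  -> LetExpr bs body
    Just rhs -> LetExpr [ (y, subst x rhs e) | (y, e) <- bs, y /= x ]
                        (subst x rhs body)

-- The example from this thread, with a body that uses each clock twice.
globals :: Globals
globals = [("clk", Prim "clockGen"), ("clkGlobal", Global "clk")]

example :: LetExpr
example = LetExpr
  [ ("clkLocal", Global "clk")
  , ("clkA", Local "clkLocal")
  , ("clkB", Global "clkGlobal") ]
  (App "f" [Local "clkA", Local "clkA", Local "clkB", Local "clkB"])
```

Under these assumptions, `totalWork globals example` and `totalWork globals (inlineBinder "clkA" example)` are both 2, while `totalWork globals (inlineBinder "clkB" example)` is 3 and grows with every additional use of clkB in the body.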
--
-- Inlining let-bindings referencing a global variable with a clock type
-- can sometimes lead to the post-normalization flattening stage to generate
-- code that violates the invariants of the netlist generation stage.
Do you remember which invariants?
And are these invariants actually documented anywhere?
We cannot have let-expressions appear in the argument position of an application.
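As an illustration of the invariant just stated, here is a minimal Haskell sketch of a checker over a toy AST. The `Expr` type and the functions `noLetInArgPos` and `floatArgLet` are hypothetical and are not part of clash-lib; the sketch only encodes the rule "no let-expression directly in argument position" as I read it. `floatArgLet` shows ordinary let-floating as one way such a term could be repaired; whether Clash does this is not something this thread states.

```haskell
-- Toy AST; hypothetical and much smaller than Clash's Core.
data Expr
  = Var String
  | App Expr Expr               -- function applied to one argument
  | Let [(String, Expr)] Expr   -- let-bindings and body
  deriving (Eq, Show)

isLet :: Expr -> Bool
isLet (Let _ _) = True
isLet _         = False

-- True when no let-expression occurs directly as the argument of an
-- application, anywhere in the term.
noLetInArgPos :: Expr -> Bool
noLetInArgPos (Var _)       = True
noLetInArgPos (App f arg)   = not (isLet arg) && noLetInArgPos f && noLetInArgPos arg
noLetInArgPos (Let bs body) = all (noLetInArgPos . snd) bs && noLetInArgPos body

-- One step of let-floating at the root: f (let bs in e) ==> let bs in f e.
-- Only valid when no binder in bs occurs free in f; this sketch does not
-- check for that.
floatArgLet :: Expr -> Expr
floatArgLet (App f (Let bs e)) = Let bs (App f e)
floatArgLet e                  = e

-- A term that violates the rule, and its floated form.
bad :: Expr
bad = App (Var "f") (Let [("x", Var "y")] (Var "x"))

good :: Expr
good = floatArgLet bad
```

Here `noLetInArgPos bad` is False and `noLetInArgPos good` is True.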
I am aware. But prior to this commit/PR, the compiler would inline let-bindings binding global variables with a clock type, which is bad because it duplicates work/increases the size of the circuit.
Do not inline let-bound recursive calls
b13b96f to 6f9b730
I think @leonschoorl's comments correctly identify a missed CSE opportunity, in that the current implementation will, for:
clk = clockGen
clkGlobal = clk
f =
let
clkLocal = clk
clkA = clkLocal
clkB = clkGlobal
in [...]
not transform
That being said, once this PR is merged, we should probably open an "enhancement" issue noting that Clash is missing a desirable CSE opportunity.
I'd like to call your attention to the point that there can only ever be one clock primitive in one domain.¹ Two clock primitives might generate the same frequency, but they are probably not perfectly phase-aligned, making them separate clock domains. So with constructions like
¹ The other way around is fine: you could have one clock primitive which outputs multiple clocks.
Mergify was not opening a backport PR so I tried to delete and re-add the label. Didn't work; but I remembered you can look at Actions for Mergify stuff. Mergify reports:
and boy something has gone wrong with this PR. Just admire this nice shape:
I don't know whatever happened here, but yeah, I don't blame Mergify for not being able to backport this... D-:
I think PR #2846 was at one point based on PR #2844, and then rebased incorrectly somehow.
I have no idea what happened here. My regular way of working is:
I also don't know what went wrong, and we actually lost your commit message of this PR in the process. I'm currently handcrafting a backport PR.
The intended commit message of this PR is identical to the PR cover letter.
The global vars are usually backed by a clock generator, which is not work-free.
In addition, when these global vars are recursively defined, they can mess up the post-normalization flattening stage, which then violates certain invariants of the netlist generation stage. This then causes the netlist generation stage to generate bad Verilog names.
Fixes #2845
(The PR on master was malformed somehow, with the correct contents but missing the commit message above and an incorrect Git structure. This backport to 1.8 is reconstructed by hand.)
Co-authored-by: Christiaan Baaij
The global vars are usually backed by a clock generator, which is not work-free.
In addition, when these global vars are recursively defined, they can mess up the post-normalization flattening stage, which then violates certain invariants of the netlist generation stage. This then causes the netlist generation stage to generate bad Verilog names.
Fixes #2845
Still TODO: