Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clock primitive is duplicated #2570

Open
rowanG077 opened this issue Aug 27, 2023 · 7 comments
Open

Clock primitive is duplicated #2570

rowanG077 opened this issue Aug 27, 2023 · 7 comments
Labels

Comments

@rowanG077
Copy link
Member

rowanG077 commented Aug 27, 2023

Tested on current master branch with following GHC versions:

  • 9.0.2: Fails as described below
  • 9.4.5: Works as expected
  • 9.6.2: Works as expected

Consider the following TopEntity:

{-# LANGUAGE QuasiQuotes #-}

module TopEntity where

import qualified Clash.Prelude as P
import Clash.Explicit.Prelude
import Clash.Annotations.Primitive
import Data.String.Interpolate (i)

oscg :: Clock System
oscg = clockGen
{-# ANN oscg (InlineYamlPrimitive [Verilog] [i|
  BlackBox:
    name: TopEntity.oscg
    kind: Declaration
    template: |-
      // OSCG begin
      OSCG \#( .DIV(6) ) ~GENSYM[oscillator][0] ( .OSC(~RESULT) );
      // OSCG end
  |]) #-}
{-# NOINLINE oscg #-}

topEntity = s1
  where
    clk = oscg

    f = P.exposeClockResetEnable (P.register False) clk noReset enableGen
    s0 = f $ dflipflop clk (pure True)
    s1 = f s0

And running clash TopEntity.hs --verilog you can see that OSCG primitive is duplicated 3 times:

/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.7.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity
    ( // No inputs

      // Outputs
      output wire  result // reset
    );
  // TopEntity.hs:27:5
  wire  rst;
  reg  result_1 = 1'b0;
  wire  c$app_arg;
  // TopEntity.hs:27:5
  wire  rst_0;
  reg  result_2 = 1'b0;
  wire  c$app_arg_0;
  reg  c$app_arg_1 = ({1 {1'bx}});
  wire  c$app_arg_2;

  assign rst = 1'b0;

  // register begin
  always @(posedge c$app_arg or  posedge  rst) begin : result_1_register
    if ( rst) begin
      result_1 <= 1'b0;
    end else begin
      result_1 <= result_2;
    end
  end
  // register end

  // OSCG begin
  OSCG #( .DIV(6) ) oscillator ( .OSC(c$app_arg) );
  // OSCG end

  assign rst_0 = 1'b0;

  // register begin
  always @(posedge c$app_arg_0 or  posedge  rst_0) begin : result_2_register
    if ( rst_0) begin
      result_2 <= 1'b0;
    end else begin
      result_2 <= c$app_arg_1;
    end
  end
  // register end

  // OSCG begin
  OSCG #( .DIV(6) ) oscillator_0 ( .OSC(c$app_arg_0) );
  // OSCG end

  // delay begin
  always @(posedge c$app_arg_2) begin : c$app_arg_1_delay
    c$app_arg_1 <= 1'b1;
  end
  // delay end

  // OSCG begin
  OSCG #( .DIV(6) ) oscillator_1 ( .OSC(c$app_arg_2) );
  // OSCG end

  assign result = result_1;


endmodule

Per @christiaanb suggestion I added a GHC.Magic.lazy as a view pattern to the Clock argument for:

With no effect.

An interesting observation is that if you rewrite f in the failing code from

f = P.exposeClockResetEnable (P.register False) clk noReset enableGen

to

f = register clk noReset enableGen False

there is no longer any duplication going on.

Removing dflipflop removes one duplication but this is just because there is one less usage of the clock.

@rowanG077 rowanG077 added the bug label Aug 27, 2023
@rowanG077 rowanG077 changed the title Clock blackbox primitive is duplicated Clock primitive is duplicated Aug 27, 2023
@christiaanb
Copy link
Member

Does adding a NOINLINE to clk work around the problem?

topEntity = s1
  where
    clk = oscg
    {-# NOINLINE clk #-}

    f = P.exposeClockResetEnable (P.register False) clk noReset enableGen
    s0 = f $ dflipflop clk (pure True)
    s1 = f s0

@rowanG077
Copy link
Member Author

Well it no longer duplicates the primitive... The primitive is not instantiated at all and clash has made up a clock out of thin air.

/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.7.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity
    ( // No inputs

      // Outputs
      output wire  result
    );
  // TopEntity.hs:28:5
  wire  rst;
  reg  result_2 = 1'b0;
  reg  c$app_arg = ({1 {1'bx}});
  // TopEntity.hs:28:5
  wire  rst_0;
  reg  result_1 = 1'b0;

  assign rst = 1'b0;

  // register begin
  always @(posedge TopEntity.topEntity_clk or  posedge  rst) begin : result_2_register
    if ( rst) begin
      result_2 <= 1'b0;
    end else begin
      result_2 <= c$app_arg;
    end
  end
  // register end

  // delay begin
  always @(posedge TopEntity.topEntity_clk) begin : c$app_arg_delay
    c$app_arg <= 1'b1;
  end
  // delay end

  assign rst_0 = 1'b0;

  // register begin
  always @(posedge TopEntity.topEntity_clk or  posedge  rst_0) begin : result_1_register
    if ( rst_0) begin
      result_1 <= 1'b0;
    end else begin
      result_1 <= result_2;
    end
  end
  // register end

  assign result = result_1;


endmodule

@christiaanb
Copy link
Member

Ah, alright...

So a work-around on GHC 9.0.2 that does work is to eta-expand f:

topEntity = s1
  where
    clk = oscg

    f x = P.exposeClockResetEnable (P.register False) clk noReset enableGen x
    s0 = f $ dflipflop clk (pure True)
    s1 = f s0

produces:

/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.7.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity
    ( // No inputs

      // Outputs
      output wire  result
    );
  reg  topEntity_f_result = 1'b0;
  reg  topEntity_f_c$app_arg = 1'b0;
  reg  topEntity_f_c$app_arg_0 = ({1 {1'bx}});
  wire  c$app_arg;
  wire  c$app_arg_0;

  // register begin
  always @(posedge c$app_arg_0 or  posedge  c$app_arg) begin : topEntity_f_result_register
    if ( c$app_arg) begin
      topEntity_f_result <= 1'b0;
    end else begin
      topEntity_f_result <= topEntity_f_c$app_arg;
    end
  end
  // register end

  // register begin
  always @(posedge c$app_arg_0 or  posedge  c$app_arg) begin : topEntity_f_c$app_arg_register
    if ( c$app_arg) begin
      topEntity_f_c$app_arg <= 1'b0;
    end else begin
      topEntity_f_c$app_arg <= topEntity_f_c$app_arg_0;
    end
  end
  // register end

  // delay begin
  always @(posedge c$app_arg_0) begin : topEntity_f_c$app_arg_0_delay
    topEntity_f_c$app_arg_0 <= 1'b1;
  end
  // delay end

  assign c$app_arg = 1'b0;

  // OSCG begin
  OSCG #( .DIV(6) ) oscillator ( .OSC(c$app_arg_0) );
  // OSCG end

  assign result = topEntity_f_result;


endmodule

@leonschoorl
Copy link
Member

leonschoorl commented Aug 31, 2023

Well it no longer duplicates the primitive... The primitive is not instantiated at all and clash has made up a clock out of thin air.

That is in itself is worrying.
And seems to happen even on GHC 9.6.2

@DigitalBrains1
Copy link
Member

DigitalBrains1 commented Nov 1, 2024

I think I have two more reproducers that lead to heaps of duplication, with a clock primitive being involved. I'm using GHC 9.8.2. They generate funky names as well: names like Duplicate.topEntity_cntr as well as \Duplicate.topEntity_cntr. The latter might technically be valid Verilog, but it seems Vivado isn't too happy about it anyway.

The first variant is without NOINLINE cntr:

module Duplicate where

import Clash.Explicit.Prelude
import Clash.Explicit.Testbench

topEntity ::
  Signal System (Unsigned 8)
topEntity = cntr + x
 where
  cntr = register clk noReset enableGen 0 0
  x = register clk noReset enableGen 100 0
  done = (== 100) <$> cntr
  clk = tbClockGen $ not <$> done

This generates the following Verilog:

Duplicate.hs Verilog
/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.9.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity
    ( // No inputs

      // Outputs
      output wire [7:0] result
    );
  reg [7:0] c$app_arg = 8'd100;
  // Duplicate.hs:10:3-6
  reg [7:0] \Duplicate.topEntity_cntr  = 8'd0;
  // Duplicate.hs:13:3-5
  wire  \Duplicate.topEntity_clk ;
  // Duplicate.hs:10:3-6
  reg [7:0] Duplicate.topEntity_cntr = 8'd0;
  // Duplicate.hs:13:3-5
  wire  Duplicate.topEntity_clk;
  wire  c$rst;
  wire  c$rst_0;
  wire  c$rst_1;

  assign c$rst = (1'b0);

  // register begin
  always @(posedge Duplicate.topEntity_clk or  posedge  c$rst) begin : DuplicatetopEntity_cntr_register
    if ( c$rst) begin
      Duplicate.topEntity_cntr <= 8'd0;
    end else begin
      Duplicate.topEntity_cntr <= 8'd0;
    end
  end
  // register end

  // tbClockGen begin
  // pragma translate_off
  reg  clk;
  // 1 = 0.1ps
  localparam half_period = (100000 / 2);
  always begin
    // Delay of 1 mitigates race conditions (https://github.com/steveicarus/iverilog/issues/160)
    #1 clk =  0 ;
    #1000000 forever begin

      if (~ (~ (Duplicate.topEntity_cntr == 8'd100))) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

      clk = ~ clk;
      #half_period;
      clk = ~ clk;
      #half_period;
    end
  end

  assign Duplicate.topEntity_clk = clk;
  // pragma translate_on
  // tbClockGen end

  assign c$rst_0 = (1'b0);

  // register begin
  always @(posedge Duplicate.topEntity_clk or  posedge  c$rst_0) begin : c$app_arg_register
    if ( c$rst_0) begin
      c$app_arg <= 8'd100;
    end else begin
      c$app_arg <= 8'd0;
    end
  end
  // register end

  assign c$rst_1 = (1'b0);

  // register begin
  always @(posedge \Duplicate.topEntity_clk  or  posedge  c$rst_1) begin : DuplicatetopEntity_cntr_register_0
    if ( c$rst_1) begin
      \Duplicate.topEntity_cntr  <= 8'd0;
    end else begin
      \Duplicate.topEntity_cntr  <= 8'd0;
    end
  end
  // register end

  // tbClockGen begin
  // pragma translate_off
  reg  clk_0;
  // 1 = 0.1ps
  localparam half_period_0 = (100000 / 2);
  always begin
    // Delay of 1 mitigates race conditions (https://github.com/steveicarus/iverilog/issues/160)
    #1 clk_0 =  0 ;
    #1000000 forever begin

      if (~ (~ (\Duplicate.topEntity_cntr  == 8'd100))) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

      clk_0 = ~ clk_0;
      #half_period_0;
      clk_0 = ~ clk_0;
      #half_period_0;
    end
  end

  assign \Duplicate.topEntity_clk  = clk_0;
  // pragma translate_on
  // tbClockGen end

  assign result = \Duplicate.topEntity_cntr  + c$app_arg;


endmodule

Al least the following things are duplicated:

  reg [7:0] \Duplicate.topEntity_cntr  = 8'd0;
  wire  \Duplicate.topEntity_clk ;
  reg [7:0] Duplicate.topEntity_cntr = 8'd0;
  wire  Duplicate.topEntity_clk;

  assign Duplicate.topEntity_clk = clk;

  assign \Duplicate.topEntity_clk  = clk_0;

along with 2 instead of 1 clock primitive, and 2 register blocks for cntr instead of 1.

[edit]
I noticed I could get rid of the second clock and still have the duplication. The one with the second clock might still be interesting because of other weirdness, such as the done signal in HDL being not <$> done in Haskell?

  // tbClockGen begin
  [...]
      if (~ Duplicate.topEntity_done) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

  assign Duplicate.topEntity_done = ~ (Duplicate.topEntity_cntr == 8'd100);
This is output generated by the following code:
module Duplicate where

import Clash.Explicit.Prelude
import Clash.Explicit.Testbench

topEntity ::
  Signal System (Unsigned 8)
topEntity = cntr + posD
 where
  cntr = register clk1 noReset enableGen 0 0
  posD = register clk2 noReset enableGen 0 0
  done = (== 100) <$> cntr
  (clk1, clk2) = biTbClockGen $ not <$> done
[/edit]

@DigitalBrains1
Copy link
Member

DigitalBrains1 commented Nov 1, 2024

The second reproducer. If we NOINLINE cntr, we locally get rid of the escaped Verilog identifier, but we can create 4 copies of stuff by using the clock in 3 registers.

Duplicate2.hs:

module Duplicate2 where

import Clash.Explicit.Prelude
import Clash.Explicit.Testbench

topEntity ::
  Signal System (Unsigned 8)
topEntity = cntr + x0 + x1 + x2
 where
  cntr = register clk noReset enableGen 0 0
  {-# NOINLINE cntr #-}
  x0 = register clk noReset enableGen 1 0
  x1 = register clk noReset enableGen 2 0
  x2 = register clk noReset enableGen 3 0
  done = (== 100) <$> cntr
  clk = tbClockGen $ not <$> done
Duplicate2.hs Verilog

topEntity:

/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.9.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity
    ( // No inputs

      // Outputs
      output wire [7:0] result
    );
  reg [7:0] c$app_arg = 8'd3;
  reg [7:0] c$app_arg_0 = 8'd2;
  reg [7:0] c$app_arg_1 = 8'd1;
  wire  c$app_arg_2;
  wire [7:0] c$app_arg_3;
  // Duplicate2.hs:10:3-6
  reg [7:0] Duplicate2.topEntity_cntr = 8'd0;
  // Duplicate2.hs:16:3-5
  wire  Duplicate2.topEntity_clk;
  wire  c$rst;
  // Duplicate2.hs:10:3-6
  reg [7:0] Duplicate2.topEntity_cntr = 8'd0;
  // Duplicate2.hs:16:3-5
  wire  Duplicate2.topEntity_clk;
  wire  c$rst_0;
  // Duplicate2.hs:10:3-6
  reg [7:0] Duplicate2.topEntity_cntr = 8'd0;
  // Duplicate2.hs:16:3-5
  wire  Duplicate2.topEntity_clk;
  wire  c$rst_1;

  assign result = ((c$app_arg_3 + c$app_arg_1) + c$app_arg_0) + c$app_arg;

  assign c$rst = (1'b0);

  // register begin
  always @(posedge Duplicate2.topEntity_clk or  posedge  c$rst) begin : Duplicate2topEntity_cntr_register
    if ( c$rst) begin
      Duplicate2.topEntity_cntr <= 8'd0;
    end else begin
      Duplicate2.topEntity_cntr <= 8'd0;
    end
  end
  // register end

  // tbClockGen begin
  // pragma translate_off
  reg  clk;
  // 1 = 0.1ps
  localparam half_period = (100000 / 2);
  always begin
    // Delay of 1 mitigates race conditions (https://github.com/steveicarus/iverilog/issues/160)
    #1 clk =  0 ;
    #1000000 forever begin

      if (~ (~ (Duplicate2.topEntity_cntr == 8'd100))) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

      clk = ~ clk;
      #half_period;
      clk = ~ clk;
      #half_period;
    end
  end

  assign Duplicate2.topEntity_clk = clk;
  // pragma translate_on
  // tbClockGen end

  // register begin
  always @(posedge Duplicate2.topEntity_clk or  posedge  c$app_arg_2) begin : c$app_arg_register
    if ( c$app_arg_2) begin
      c$app_arg <= 8'd3;
    end else begin
      c$app_arg <= 8'd0;
    end
  end
  // register end

  assign c$rst_0 = (1'b0);

  // register begin
  always @(posedge Duplicate2.topEntity_clk or  posedge  c$rst_0) begin : Duplicate2topEntity_cntr_register_0
    if ( c$rst_0) begin
      Duplicate2.topEntity_cntr <= 8'd0;
    end else begin
      Duplicate2.topEntity_cntr <= 8'd0;
    end
  end
  // register end

  // tbClockGen begin
  // pragma translate_off
  reg  clk_0;
  // 1 = 0.1ps
  localparam half_period_0 = (100000 / 2);
  always begin
    // Delay of 1 mitigates race conditions (https://github.com/steveicarus/iverilog/issues/160)
    #1 clk_0 =  0 ;
    #1000000 forever begin

      if (~ (~ (Duplicate2.topEntity_cntr == 8'd100))) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

      clk_0 = ~ clk_0;
      #half_period_0;
      clk_0 = ~ clk_0;
      #half_period_0;
    end
  end

  assign Duplicate2.topEntity_clk = clk_0;
  // pragma translate_on
  // tbClockGen end

  // register begin
  always @(posedge Duplicate2.topEntity_clk or  posedge  c$app_arg_2) begin : c$app_arg_0_register
    if ( c$app_arg_2) begin
      c$app_arg_0 <= 8'd2;
    end else begin
      c$app_arg_0 <= 8'd0;
    end
  end
  // register end

  assign c$rst_1 = (1'b0);

  // register begin
  always @(posedge Duplicate2.topEntity_clk or  posedge  c$rst_1) begin : Duplicate2topEntity_cntr_register_1
    if ( c$rst_1) begin
      Duplicate2.topEntity_cntr <= 8'd0;
    end else begin
      Duplicate2.topEntity_cntr <= 8'd0;
    end
  end
  // register end

  // tbClockGen begin
  // pragma translate_off
  reg  clk_1;
  // 1 = 0.1ps
  localparam half_period_1 = (100000 / 2);
  always begin
    // Delay of 1 mitigates race conditions (https://github.com/steveicarus/iverilog/issues/160)
    #1 clk_1 =  0 ;
    #1000000 forever begin

      if (~ (~ (Duplicate2.topEntity_cntr == 8'd100))) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

      clk_1 = ~ clk_1;
      #half_period_1;
      clk_1 = ~ clk_1;
      #half_period_1;
    end
  end

  assign Duplicate2.topEntity_clk = clk_1;
  // pragma translate_on
  // tbClockGen end

  // register begin
  always @(posedge Duplicate2.topEntity_clk or  posedge  c$app_arg_2) begin : c$app_arg_1_register
    if ( c$app_arg_2) begin
      c$app_arg_1 <= 8'd1;
    end else begin
      c$app_arg_1 <= 8'd0;
    end
  end
  // register end

  assign c$app_arg_2 = 1'b0;

  topEntity_cntr topEntity_cntr_c$app_arg_3
    (.\Duplicate2.topEntity_cntr  (c$app_arg_3));


endmodule

topEntity_cntr:

/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.9.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity_cntr
    ( // No inputs

      // Outputs
      output wire [7:0] \Duplicate2.topEntity_cntr
    );
  // Duplicate2.hs:10:3-6
  reg [7:0] \Duplicate2.topEntity_cntr_0  = 8'd0;
  // Duplicate2.hs:16:3-5
  wire  \Duplicate2.topEntity_clk ;
  wire  c$rst;

  assign c$rst = (1'b0);

  // register begin
  always @(posedge \Duplicate2.topEntity_clk  or  posedge  c$rst) begin : Duplicate2topEntity_cntr_0_register
    if ( c$rst) begin
      \Duplicate2.topEntity_cntr_0  <= 8'd0;
    end else begin
      \Duplicate2.topEntity_cntr_0  <= 8'd0;
    end
  end
  // register end

  // tbClockGen begin
  // pragma translate_off
  reg  clk;
  // 1 = 0.1ps
  localparam half_period = (100000 / 2);
  always begin
    // Delay of 1 mitigates race conditions (https://github.com/steveicarus/iverilog/issues/160)
    #1 clk =  0 ;
    #1000000 forever begin

      if (~ (~ (\Duplicate2.topEntity_cntr_0  == 8'd100))) begin
        `ifdef VERILATOR
          $c("std::exit(0);");
        `endif
        $finish(0);
      end

      clk = ~ clk;
      #half_period;
      clk = ~ clk;
      #half_period;
    end
  end

  assign \Duplicate2.topEntity_clk  = clk;
  // pragma translate_on
  // tbClockGen end

  assign \Duplicate2.topEntity_cntr  = \Duplicate2.topEntity_cntr_0 ;


endmodule

Stuff looks like:

  reg [7:0] Duplicate2.topEntity_cntr = 8'd0;
  wire  Duplicate2.topEntity_clk;
  reg [7:0] Duplicate2.topEntity_cntr = 8'd0;
  wire  Duplicate2.topEntity_clk;
  reg [7:0] Duplicate2.topEntity_cntr = 8'd0;
  wire  Duplicate2.topEntity_clk;

  assign Duplicate2.topEntity_clk = clk;

  assign Duplicate2.topEntity_clk = clk_0;

  assign Duplicate2.topEntity_clk = clk_1;

and there are 3 register blocks for cntr and 3 clocks. But the additional source file for the counter has escaped Verilog identifiers with a dot again, and it includes its own clock, bringing the total number of register blocks and clocks to 4:

module topEntity_cntr
    ( // No inputs

      // Outputs
      output wire [7:0] \Duplicate2.topEntity_cntr
    );
  // Duplicate2.hs:10:3-6
  reg [7:0] \Duplicate2.topEntity_cntr_0  = 8'd0;
  // Duplicate2.hs:16:3-5
  wire  \Duplicate2.topEntity_clk ;

[...]

  // tbClockGen begin

[...]

  assign \Duplicate2.topEntity_clk  = clk;
  // pragma translate_on
  // tbClockGen end

[...]

[edit]
Like the first reproducer, I realised I didn't need two clocks (two intended clocks, not counting unintentional copies). Also, I had initially overlooked the second HDL file >_<.
[/edit]

@DigitalBrains1
Copy link
Member

Ah. I did everything I reported about in the previous two messages with GHC 9.8.2. I didn't try other GHC's.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants