[test-triage] chip_tl_errors #14200

Closed
msfschaffner opened this issue Aug 10, 2022 · 16 comments · Fixed by #15737

Comments

@msfschaffner
Contributor

Hierarchy of regression failure

Chip Level

Failure Description

UVM_ERROR @ * us: (cip_base_scoreboard.sv:433) [scoreboard] Check failed item.d_error == exp_d_error (* [] vs * []) On interface chip_reg_block, TL item: req: (cip_tl_seq_item@34189) { a_addr: * a_data: * a_mask: * a_size: * a_param: * a_source: * a_opcode: * a_user: * d_param: * d_source: * d_data: * d_size: * d_opcode: * d_error: * d_sink: * d_user: * a_source_is_overridden: * a_valid_delay: * d_valid_delay: * a_valid_len: * d_valid_len: * req_abort_after_a_valid_len: * rsp_abort_after_d_valid_len: * req_completed: * rsp_completed: * tl_intg_err_type: TlIntgErrNone max_ecc_errors: * } has 1 failures:

Steps to Reproduce

  • GH revision: 9c0b24ddb
  • util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_tl_errors --build-seed 2483310614 --fixed-seed 4154093656

Tests with similar or related failures

This is likely due to the missing alert connection in prim_flash and prim_otp.
Hopefully this will be resolved once the design updates are in (WIP by @msfschaffner).

@msfschaffner
Contributor Author

Note that there are multiple seeds that fail with this error mode.

@johngt

johngt commented Aug 19, 2022

  • tl_d_oob_addr_access
  • tl_d_illegal_access

@cindychip
Contributor

Re-triaging, since this test has a new issue.

@vogelpi
Contributor

vogelpi commented Sep 27, 2022

I am looking into this. It seems that only accesses to debug ROM cause the UVM_ERRORs.

@vogelpi
Contributor

vogelpi commented Sep 27, 2022

Failure Description:

UVM_ERROR @ 3901.302122 us: (cip_base_scoreboard.sv:431) [uvm_test_top.env.scoreboard] Check failed item.d_error == exp_d_error (0 [0x0] vs 1 [0x1]) On interface chip_reg_block, TL item: req: (cip_tl_seq_item@131626) { a_addr: 'h106d5 a_data: 'hef482b96 a_mask: 'h2 a_size: 'h0 a_param: 'h0 a_source: 'h2e a_opcode: 'h1 a_user: 'h264cd d_param: 'h0 d_source: 'h2e d_data: 'h0 d_size: 'h0 d_opcode: 'h0 d_error: 'h0 d_sink: 'h0 d_user: 'h152a a_source_is_overridden: 'h0 a_valid_delay: 'h0 d_valid_delay: 'h0 a_valid_len: 'h0 d_valid_len: 'h0 req_abort_after_a_valid_len: 'h0 rsp_abort_after_d_valid_len: 'h0 req_completed: 'h0 rsp_completed: 'h0 tl_intg_err_type: TlIntgErrNone max_ecc_errors: 'h3 }
, unmapped_err: 0, mem_access_err: 1, bus_intg_err: 0, byte_wr_err: 0, csr_size_err: 0, tl_item_err: 0, write_w_instr_type_err: 0, cfg.tl_mem_access_gated: 0 ecc_err: 0
UVM_INFO @ 3901.302122 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
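When triaging many seeds, it can help to pull the error-cause flags out of the scoreboard message to see which one made the scoreboard expect `d_error`. The following is a minimal Python sketch, not part of dvsim: `parse_flags` and `set_flags` are hypothetical helper names, and reading a non-zero flag as the reason for the expected error is an inference from the log above rather than something taken from the scoreboard source.

```python
import re

# Flag line copied from the scoreboard message above.
FLAG_LINE = (", unmapped_err: 0, mem_access_err: 1, bus_intg_err: 0, "
             "byte_wr_err: 0, csr_size_err: 0, tl_item_err: 0, "
             "write_w_instr_type_err: 0, cfg.tl_mem_access_gated: 0 ecc_err: 0")


def parse_flags(line):
    """Return {flag_name: int_value} for every 'name: <digits>' pair."""
    pairs = re.findall(r"([\w.]+):\s*(\d+)", line)
    return {name: int(value) for name, value in pairs}


def set_flags(line):
    """Names of the flags that are non-zero, in log order."""
    return [name for name, value in parse_flags(line).items() if value]


print(set_flags(FLAG_LINE))  # ['mem_access_err']
```

For this seed, only `mem_access_err` is set, which matches the expected `d_error` of 1 reported by the check.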

Steps to reproduce:

  • Git revision: f781bc2
  • util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_tl_errors --build-seed 739773536 --fixed-seed 2260535853 --waves

It seems like the scoreboard expects to see TL-UL errors but RV_DM doesn't produce errors:

[Screenshot from 2022-09-27 18-34-22]

I need to have a closer look at why DV expects an error here. It could be due to the access not being a full-word write. This seems to be the common denominator between all chip_tl_errors failures.
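To check the sub-word hypothesis across failing seeds, the access width can be read from the TL item fields printed in the log. The sketch below is a rough Python helper under two assumptions: the standard TileLink encoding, where `a_size` is log2 of the number of bytes, and a 32-bit data bus, so that a full-word access has `a_size` 2 and `a_mask` 0xf. The names `tl_field` and `is_full_word` are hypothetical.

```python
import re


def tl_field(item, name):
    """Extract a hex field such as "a_size: 'h0" from a printed TL item."""
    match = re.search(rf"\b{name}:\s*'h([0-9a-fA-F]+)", item)
    if match is None:
        raise ValueError(f"field {name} not found")
    return int(match.group(1), 16)


def is_full_word(item, bus_bytes=4):
    """True if the access covers the whole data word (assumed 32 bits wide)."""
    num_bytes = 1 << tl_field(item, "a_size")  # a_size is log2(bytes)
    mask = tl_field(item, "a_mask")
    return num_bytes == bus_bytes and mask == (1 << bus_bytes) - 1


# Excerpt of the failing item from the log above.
ITEM = "a_addr: 'h106d5 a_data: 'hef482b96 a_mask: 'h2 a_size: 'h0 a_opcode: 'h1"
print(is_full_word(ITEM))  # False
```

The failing item above is a single-byte access (`a_size` 0, `a_mask` 0x2), so it is not a full-word write under these assumptions.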

@msfschaffner msfschaffner self-assigned this Sep 27, 2022
@msfschaffner
Contributor Author

msfschaffner commented Sep 27, 2022

This may be related to #14653 and #14921.
Let me help take a look (I think I know why this may be happening).

msfschaffner added a commit to msfschaffner/opentitan that referenced this issue Sep 27, 2022
msfschaffner added a commit to msfschaffner/opentitan that referenced this issue Sep 27, 2022
@msfschaffner
Contributor Author

I cherry-picked #15169 onto f781bc2 and reran with

util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_tl_errors --build-seed 739773536 --fixed-seed 2260535853 --waves

and the test passed.

@msfschaffner
Contributor Author

msfschaffner commented Sep 29, 2022

Note that with #15192 merged, I do not get any more failures when rerunning with 200 seeds.

@engdoreis
Contributor

This test has improved significantly, but the pass rate is still not 100%.

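| Tests          | latest | 2022.09.28 | 2022.09.27 | 2022.09.26/1 | 2022.09.26/2 | 2022.09.24 | 2022.09.23 | 2022.09.22/1 | 2022.09.22/2 |
|:---------------|-------:|-----------:|-----------:|-------------:|-------------:|-----------:|-----------:|-------------:|-------------:|
| chip_tl_errors |  85.00 |      30.00 |       5.00 |        35.00 |        40.00 |      30.00 |      25.00 |        30.00 |        30.00 |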

@engdoreis engdoreis reopened this Sep 30, 2022
@engdoreis
Contributor

I looked again, and the test has been passing for 3 days:

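| Tests          | latest | 2022.10.01 | 2022.09.30 | 2022.09.29 | 2022.09.28 | 2022.09.27 | 2022.09.26/1 | 2022.09.26/2 | 2022.09.26 |
|:---------------|-------:|-----------:|-----------:|-----------:|-----------:|-----------:|-------------:|-------------:|-----------:|
| chip_tl_errors | 100.00 |     100.00 |     100.00 |      85.00 |      30.00 |       5.00 |        35.00 |        40.00 |      40.00 |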

So I'm closing this issue again.

@engdoreis
Contributor

This error has returned:

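| Tests          | latest | 2022.10.22 | 2022.10.21 | 2022.10.17/2 | 2022.10.17/1 | 2022.10.15 | 2022.10.14 | 2022.10.12/2 | 2022.10.12/1 | 2022.10.11 | Suite                                     |
|:---------------|-------:|-----------:|-----------:|-------------:|-------------:|-----------:|-----------:|-------------:|-------------:|-----------:|:------------------------------------------|
| chip_tl_errors |  10.00 |      16.67 |     100.00 |       100.00 |       100.00 |     100.00 |     100.00 |       100.00 |       100.00 |     100.00 | tl_d_oob_addr_access, tl_d_illegal_access |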

However, the log signature has changed slightly:

UVM_ERROR @ 2696.359716 us: (cip_base_scoreboard.sv:436) [uvm_test_top.env.scoreboard] Check failed item.d_error == exp_d_error (0 [0x0] vs 1 [0x1]) On interface chip_reg_block, TL item: req: (cip_tl_seq_item@32616) { a_addr: 'h1053c  a_data: 'h4a2615eb  a_mask: 'h1  a_size: 'h0  a_param: 'h0  a_source: 'h1f  a_opcode: 'h0  a_user: 'h249ba  d_param: 'h0  d_source: 'h1f  d_data: 'h0  d_size: 'h0  d_opcode: 'h0  d_error: 'h0  d_sink: 'h0  d_user: 'h152a  a_source_is_overridden: 'h0  a_valid_delay: 'h0  d_valid_delay: 'h0  a_valid_len: 'h0  d_valid_len: 'h0  req_abort_after_a_valid_len: 'h0  rsp_abort_after_d_valid_len: 'h0  req_completed: 'h0  rsp_completed: 'h0  tl_intg_err_type: TlIntgErrNone  max_ecc_errors: 'h3  }
  , unmapped_err: 0, mem_access_err: 0, bus_intg_err: 0, byte_wr_err: 0, csr_size_err: 1, tl_item_err: 0, write_w_instr_type_err: 0, cfg.tl_mem_access_gated: 0 ecc_err: 0
  UVM_INFO @ 2696.359716 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
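The change in signature is easiest to see by comparing the error-cause flags of the two logs in this thread. The Python sketch below does that; the two flag strings are abbreviated copies of the September 27 and October 22 messages, and `changed_flags` is a hypothetical helper, not part of dvsim.

```python
import re


def parse_flags(line):
    """Return {flag_name: int_value} for every 'name: <digits>' pair."""
    return {n: int(v) for n, v in re.findall(r"([\w.]+):\s*(\d+)", line)}


def changed_flags(old, new):
    """Map flag name -> (old_value, new_value) for flags whose value differs."""
    old_f, new_f = parse_flags(old), parse_flags(new)
    return {n: (old_f[n], new_f[n])
            for n in old_f if n in new_f and old_f[n] != new_f[n]}


# Abbreviated flag lines from the two failure logs in this thread.
SEP_27 = "unmapped_err: 0, mem_access_err: 1, bus_intg_err: 0, byte_wr_err: 0, csr_size_err: 0"
OCT_22 = "unmapped_err: 0, mem_access_err: 0, bus_intg_err: 0, byte_wr_err: 0, csr_size_err: 1"

print(changed_flags(SEP_27, OCT_22))
# {'mem_access_err': (1, 0), 'csr_size_err': (0, 1)}
```

The scoreboard still expects an error that the design does not return, but the predicted cause has moved from `mem_access_err` to `csr_size_err`.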

Steps to Reproduce

  • Commit hash where failure was observed bd9f6d019
  • dvsim invocation command to reproduce the failure, inclusive of build and run seeds:
    ./util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_tl_errors --build-seed 2411619278 --waves -v h

Tests with similar or related failures

  • chip_tl_errors

@engdoreis engdoreis reopened this Oct 24, 2022
@engdoreis engdoreis assigned vrozic and unassigned vogelpi and msfschaffner Oct 24, 2022
@vrozic
Contributor

vrozic commented Oct 25, 2022

I can reproduce this issue locally. It fails with almost every seed, always due to the csr_size_err.
The commit that caused the error seems to be 1201531.

@msfschaffner Do you perhaps have an idea how to fix it?

@msfschaffner
Contributor Author

Yes, I can take a look at this.

@msfschaffner
Contributor Author

msfschaffner commented Oct 28, 2022

Alright, I think I finally have a fix for this in #15737. A DV workaround would be possible as well, but since it did not seem straightforward I ended up fixing the design. I filed an issue for future DV improvements here: #15803.

@msfschaffner
Contributor Author

The fix has been merged - let's observe the nightlies and close the issue if the test passes again.

@msfschaffner
Contributor Author

This test has been passing with 100% success for a while - closing!

|     | Tests          |   2022-11-02 07:03:26 |   2022-11-01 07:10:18 |   2022-10-31 07:08:35 |   2022-10-30 07:09:58 |   2022-10-29 07:10:41 |   2022-10-28 07:09:55 |   2022-10-27 07:09:48 |   2022-10-26 07:06:46 |   2022-10-25 07:06:33 | Suite                                     |
|----:|:---------------|----------------------:|----------------------:|----------------------:|----------------------:|----------------------:|----------------------:|----------------------:|----------------------:|----------------------:|:------------------------------------------|
| 225 | chip_tl_errors |                   100 |                   100 |                   100 |                 96.67 |                   100 |                  3.33 |                  6.67 |                     0 |                  3.33 | tl_d_oob_addr_access, tl_d_illegal_access |

8 participants