feat: forge coverage #1576
Conversation
cli/src/cmd/forge/coverage.rs (outdated)

```rust
!artifact
    .get_abi()
    .map(|abi| abi.functions().any(|f| f.name.starts_with("test")))
    .unwrap_or_default()
```
I feel like we should add an extension trait for this
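The repeated ABI check above could live behind an extension trait, as the reviewer suggests. A minimal sketch, where `Artifact`, `Abi`, and `Function` are hypothetical stand-ins for the real ethers-solc artifact types (the real ones carry far more fields):

```rust
// Hypothetical stand-ins for the real ethers-solc artifact types.
struct Function {
    name: String,
}
struct Abi {
    functions: Vec<Function>,
}
struct Artifact {
    abi: Option<Abi>,
}

impl Artifact {
    fn get_abi(&self) -> Option<&Abi> {
        self.abi.as_ref()
    }
}

/// Extension trait wrapping the "does this artifact contain test
/// functions" check, so call sites don't repeat the map/unwrap chain.
trait TestFunctionExt {
    fn is_test_artifact(&self) -> bool;
}

impl TestFunctionExt for Artifact {
    fn is_test_artifact(&self) -> bool {
        self.get_abi()
            .map(|abi| abi.functions.iter().any(|f| f.name.starts_with("test")))
            .unwrap_or_default()
    }
}
```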
```rust
// TODO: Maybe support coverage for fuzz tests
coverage: None,
```
REF discussion here #99 (comment), + context from @mds1
I think "default no fuzz, flag to include it" is my preferred approach, because, contrary to my comment there, I often find myself writing concrete tests for edge cases I'm worried the fuzzer might not find, and for gas profiling purposes.
Random thought, likely shouldn't be in the current PR, but: it would be cool to get info on storage slot writes/checks, i.e.
The optimizer is always off, which can cause issues when using Solidity
I think if the optimizer is on, we get discrepancies between the bytecode and the source map?
Ye, we do (kind of: bytecode/source map is accurate for the most part, but the bytecode/source map/AST conversions can get mangled, iirc). I think the best solution is to try to turn it off, and if we get a "stack too deep" error and the user specified to use the optimizer, we should just compile with the optimizer on and report a warning that we had to turn it on, so coverage may not be accurate. Otherwise coverage is impossible to run (and a slightly incorrect coverage report is better than no coverage report, imo).
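The fallback this comment describes could be sketched roughly as follows. Everything here is a hypothetical stand-in: `compile`, `CompilerError`, and the stub behavior are illustrative, not real forge or ethers-solc APIs.

```rust
// Hypothetical stand-ins: not real forge/ethers-solc types.
#[derive(Debug, PartialEq)]
enum CompilerError {
    StackTooDeep,
}

// Stub compiler: pretend the project only compiles with the optimizer on,
// i.e. it hits "stack too deep" otherwise.
fn compile(optimizer: bool) -> Result<String, CompilerError> {
    if optimizer {
        Ok("optimized bytecode".to_string())
    } else {
        Err(CompilerError::StackTooDeep)
    }
}

/// Try compiling with the optimizer off so source maps stay accurate.
/// If that hits "stack too deep" and the user asked for the optimizer,
/// fall back to an optimized build and warn that coverage may be off.
fn compile_for_coverage(user_wants_optimizer: bool) -> Result<String, CompilerError> {
    match compile(false) {
        Err(CompilerError::StackTooDeep) if user_wants_optimizer => {
            eprintln!("warning: optimizer re-enabled; coverage may be inaccurate");
            compile(true)
        }
        other => other,
    }
}
```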
I do wonder why solc does not emit an "optimized" source map when the bytecode is also optimized.
To be clear, it tries its best to do so. I think it just mangles the bytecode to a point where so much of the bytecode doesn't actually map to anything in the source. So it technically does. The primary issue, I believe, is that a lot of the optimizations are "peephole" optimizations, done on the bytecode itself, which means they don't track back to the AST, so the AST is only partially optimized relative to the bytecode.
👀 it's happening
Hey guys, an update. First of all, to everyone following the issue: sorry for the radio silence and lack of activity. I've both had an attempt at something more complex (in order to address an issue I think might come up), which I've put on hold for now, and also had some family-related issues that impacted my mental health. On the latest commit before this comment I see that a lot of things in lil-web3 are covered. I still need to:
That seems like a shorter to-do list, but the current thinking is that I get this into a usable experimental state and create separate issues for the remaining items, such as the HTML reporter, questions around some of the statement nodes, and figuring out what we support, what we should support, and what we can support. I've noticed some things that I need to address before this can be merged:
Update again: I'm currently working on hooking up branch hit data and concurrently checking out issue #2 I wrote about (the
I think we need a rebase per the recent refactor by @mattsse. Should we make
I'm supportive of this, esp. the HTML reporter and edge cases around various nodes, e.g. the Receive case. If you think it's a lot of work, let's track it separately and we can document it as a known limitation.
Getting there. Some items are considered uncovered even though they are covered; this is related to how I pick anchors (see 3bb07ce and 5742f53), since I am only considering the runtime source map. Additionally, branch coverage isn't really true in this current iteration, since:
Every time I see a commit w/ progress on this PR I jump out of my seat lol. Do you have a 1-paragraph explanation of "anchors" and why they help here?
@gakonst Each AST node has an attached source range: a start byte and (optionally) a length. Previously I would take each opcode we executed, look it up in the source map, and see if the offset described in the source map matched the start byte of one of our coverage items. This doesn't always work: sometimes the opcode that should mark an item as covered is somewhere else in the range, and sometimes an opcode actually marks multiple items as covered because of the optimizer. Anchors are essentially opcodes that we pick to mark items as covered. Another benefit of the anchor approach is that we can change which anchor we pick for more complicated items such as branches. For example, we might want to pick a specific type of opcode within the source range (like a JUMPI or something). Hope what I wrote made sense, it's a little hard to explain lol
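The anchor idea could be sketched roughly like this: scan the source map for the first opcode whose mapped range falls inside the item's source range, and use its program counter as the anchor. All names here are simplified stand-ins, not the real forge structures.

```rust
// Simplified stand-ins for source-map entries and coverage items.
struct SourceElement {
    offset: usize, // start byte in the source file
    length: usize, // length of the mapped source range
}
struct ItemRange {
    start: usize,
    length: usize,
}

/// Pick an anchor for a coverage item: the first (pc, source element)
/// pair whose mapped range lies entirely within the item's range,
/// rather than requiring an exact match on the start byte.
fn find_anchor(source_map: &[(usize, SourceElement)], item: &ItemRange) -> Option<usize> {
    source_map.iter().find_map(|(pc, el)| {
        let in_range =
            el.offset >= item.start && el.offset + el.length <= item.start + item.length;
        in_range.then(|| *pc)
    })
}
```

During execution, hitting the anchor's program counter once is then enough to mark the item as covered.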
This makes sense (also copying @jpopesculian and @rkrasiuk, who seem to understand the AST stuff better than I do from the work on forge fmt, in case they have thoughts/comments). Separately:
This makes sense.
What makes them more complicated? Doesn't each require/assert(bool) translate to an
Ideally for branches we want to have three states: not executed, one path executed (either true or false), or both paths executed (i.e. covered). Currently for if statements with no else block there are two states: not executed, or true branch executed, which is not ideal. I think we probably want to signal to the developer if they only covered one possible path in their test. For require and assert, though, all of this is marked by specific opcodes, and I need to figure out which. The general layout seems to be:
So my assumption is that the false branch would be right after JUMPI. For the true branch, I'm not sure - maybe I need to record what value is on the stack every time we encounter a

Edit: I checked it out on Godbolt btw, here is my contract:

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity >=0.4.0;

contract Test {
    function test(uint256 a) public pure returns (uint256) {
        require(a > 1, "abc");
    }
}
```
} And here is the bytecode for PUSH 1
DUP3
GT
PUSH [tag] test_uint256_6
JUMPI
PUSH 40 # if we end up here, we will end up reverting
# load revert string
MLOAD
PUSH 8C379A000000000000000000000000000000000000000000000000000000000
DUP2
MSTORE
PUSH 4
ADD
# set return addr
PUSH [tag] test_uint256_5
SWAP1
# some abi encoding whatever
PUSH [tag] abi_encode_tuple_t_stringliteral_4e03657aea45a94fc7d47ba826c8d667c0d1e6e33a64a036ec44f58fa12d6c45__to_t_string_memory_ptr__fromStack_reversed_0
JUMP [in]
tag test_uint256_5
# after abi encoding we end up here
JUMPDEST
PUSH 40
MLOAD
DUP1
SWAP2
SUB
SWAP1
REVERT
tag test_uint256_6
JUMPDEST # if we end up here we didn't revert |
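The three branch states described in this thread could be modeled as a small enum, classified from per-path hit counters. The names here are illustrative only, not forge's actual coverage types.

```rust
/// Illustrative branch-coverage states: not forge's actual types.
#[derive(Debug, PartialEq)]
enum BranchCoverage {
    NotExecuted,      // neither path was taken
    PartiallyCovered, // only the true OR only the false path was taken
    Covered,          // both paths were taken
}

/// Classify a branch from hit counters for each of its two paths.
fn classify(true_hits: u64, false_hits: u64) -> BranchCoverage {
    match (true_hits > 0, false_hits > 0) {
        (false, false) => BranchCoverage::NotExecuted,
        (true, true) => BranchCoverage::Covered,
        _ => BranchCoverage::PartiallyCovered,
    }
}
```

Signaling `PartiallyCovered` to the developer is exactly the "only one possible path covered" case the comment wants to surface.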
Previously coverage was based on the start of AST nodes in the source code. With the anchoring system, opcodes in relevant source maps are searched, and the first one within the source range of the AST node is selected as an anchor. If the anchor is executed once, then the coverage item is considered to be covered. This works very well, but false positives and edge cases still need to be found.
Consider the full range instead of just the offset. Unfortunately this breaks detection of some items; suspected to be related to not using constructor source maps.
smol nit
lgtm!
how can we test this properly?
```rust
fn visit_contract(&mut self, node: Node) -> eyre::Result<()> {
    let is_contract =
        node.attribute("contractKind").map_or(false, |kind: String| kind == "contract");
```
I see,
sounds reasonable. We could however think of adding an enum `(Known)Attribute` that's `AsRef<str>`, so we can do `node.attribute(Attribute::Abstract)`, but no real benefit to that I'd assume
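A minimal sketch of the suggested `(Known)Attribute` idea: a typed key enum implementing `AsRef<str>`, so attribute lookups avoid raw string literals. `Node` here is a hypothetical stand-in for the real AST visitor type.

```rust
use std::collections::HashMap;

/// Typed attribute keys, per the review suggestion.
enum KnownAttribute {
    ContractKind,
    Abstract,
    Name,
}

impl AsRef<str> for KnownAttribute {
    fn as_ref(&self) -> &str {
        match self {
            KnownAttribute::ContractKind => "contractKind",
            KnownAttribute::Abstract => "abstract",
            KnownAttribute::Name => "name",
        }
    }
}

// Hypothetical stand-in for the real AST node type.
struct Node {
    attributes: HashMap<String, String>,
}

impl Node {
    /// Accepts either a `KnownAttribute` or a plain `&str`.
    fn attribute(&self, key: impl AsRef<str>) -> Option<String> {
        self.attributes.get(key.as_ref()).cloned()
    }
}
```

Because the parameter is `impl AsRef<str>`, existing string-literal call sites keep working while new code can use the typed keys.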
Not sure how best to do automated testing for it yet; my best guess is to do some sort of snapshot testing against external repos - so, we run

There's still some way to go before this is dependable (see above), but those will be in other PRs.
SGTM
Adds test coverage support to Forge.
Usage
Simply run `forge coverage` to get a coverage summary. An LCOV reporter is also included: `forge coverage --report lcov`. Currently the LCOV file is written to `./lcov.info`, but this is going to move.

Background
I've had to rewrite this feature many times: I wrote what I thought would be a good structure, only to find out later that something would not work and/or fit into the current structure, so I scrapped most of it and tried again.
Initially, the thought was to analyse `JUMPI`s and `JUMPDEST`s to find branches (i.e. `require`, `assert`, `if`, etc.), however this would not get us all the way in terms of the information we want: we want functions, branches, executable lines and statements.

The primary mode of operation for the coverage tool will be to collect the following pieces of information:
This information will be collected from the AST of the Solidity files. Then the tests are run with a special coverage collector that marks opcodes as hit as they are executed, which we later use (along with source maps) to map back to the coverage items.
Currently, I am working off of a combination of `solidity-coverage` and this Brownie blog post to figure out what to include in the reports and the relevant data structures.

Report types
The above information will be output either to stdout as a simple report, or as an LCOV file that you can upload to services like Codecov. A stretch goal would be to add an HTML report as well, but since tools already exist for converting LCOV to HTML, it isn't a deal breaker if it is not in the first version.
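For illustration, emitting one file's section of an LCOV report might look like the sketch below. The `SF`/`DA`/`LF`/`LH` record names are standard LCOV tracefile records; the input structs are hypothetical, not forge's actual coverage types.

```rust
// Hypothetical per-line hit data; not forge's actual coverage types.
struct LineHit {
    line: usize,
    hits: u64,
}

/// Build one LCOV section for a single source file:
/// SF: (source file), DA: (line, hit count), LF/LH (lines found/hit),
/// terminated by end_of_record.
fn lcov_section(path: &str, lines: &[LineHit]) -> String {
    let mut out = format!("SF:{}\n", path);
    for l in lines {
        out.push_str(&format!("DA:{},{}\n", l.line, l.hits));
    }
    let covered = lines.iter().filter(|l| l.hits > 0).count();
    out.push_str(&format!("LF:{}\nLH:{}\nend_of_record\n", lines.len(), covered));
    out
}
```

Concatenating such sections for every source file yields a tracefile that LCOV consumers (Codecov, `genhtml`) can read, provided the records are well-formed.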
Compared to other coverage tools

`require` and `assert` as branches.
as branches.Current status
Good
Missing
`elseif`'s with the same branch ID?). I also do not currently detect `assert` and `require` type branches.

Needs change

`out/coverage` (e.g. `out/coverage/lcov.info`, `out/coverage/html/index.html`), but I'm not sure yet.

Needs clarification
`Conditional` nodes count as branches (e.g. ternary statements).

`name` used to be `nodeType`, and function names etc. were in `node.attributes.name`)

`BRDA` item) have a branch ID and a block ID. I'm not sure what the block ID is supposed to mean.

Task list
Branches
`assert` as a branch
`require` as a branch
`Conditional` nodes are branches

Lines
Statements
Reporters
`solidity-coverage`
I've tried running the LCOV report through `genhtml`, but it complained that the file was malformed. I'm not sure if this is to be expected, or if it is just because we don't map the relevant items to source lines and that is why it is complaining.
Other
Companion PR: gakonst/ethers-rs#1201
Closes #99