feat[venom]: new `DFTPass` algorithm #4255
Merged
149 commits
919346b τεμπ (harkal)
f4c7ac5 revert `get_phi_depth()` changes (harkal)
cc50efe revert change that slipped in from another branch (harkal)
28eed40 `StackModel` refactor (harkal)
e871868 feat[venom]: add small heuristic for cleaning input stack (charles-cooper)
e97bc88 print correct liveness and fence (harkal)
12487fc improve ordering (harkal)
185ae23 r fences (harkal)
d8e6c74 remove old code (harkal)
4a87e93 adapt fensing (harkal)
24ad003 more (harkal)
5b7f60c improve ordering and grouping of instructions in DFTPass (harkal)
2f0b4af wip (harkal)
1f94e46 wip (harkal)
f12192c squash (harkal)
d89adb6 wip - ament (harkal)
a581a09 pass all tests (harkal)
948f7f1 preserve last child if terminator (harkal)
8c99eee cleanups (harkal)
58182ab wip (harkal)
6efbce8 wip (harkal)
57353b2 no merge node (harkal)
8bac9f5 fix (harkal)
a332610 fix (harkal)
56fed17 gda (harkal)
93c7dcf groups (harkal)
1958f48 wip (harkal)
ce584ac wip (harkal)
81143c6 wip (harkal)
44e6880 better dep traversal (harkal)
ebeb9ee wip (harkal)
fce4086 before circle breaking (harkal)
e7ce364 cycles detector (harkal)
58d7387 wip (harkal)
644e124 wip (harkal)
9f8511f fix (harkal)
26c129c wip (harkal)
ec29d76 grouping refactor (harkal)
f30b11c cleanup (harkal)
44ebf21 spelling (harkal)
a8ffb79 `DUP1 SWAP1` optimization (harkal)
192ee8f small cleanup and store expantion (not working) (harkal)
958974b cleanup and add comments (harkal)
9e64c6e lint (harkal)
3cd1e8b fixes (harkal)
a008fe8 refactor (harkal)
23f2b65 lint (harkal)
e3c3ba4 remove store expantion pass (harkal)
ba88b4e refactor (harkal)
63246e7 Merge branch 'master' into feat/stack2mem (harkal)
1f0ae6a Merge branch 'master' into feat/dft_upgrade (harkal)
3e08f83 properly hash (harkal)
3d3a4cf review dead code (harkal)
8ccb393 refactor (harkal)
1bb6da6 `get_uses_in_bb()` utility method (harkal)
b61f9b9 refactor (harkal)
f259618 refactor (harkal)
e04285f remove force (harkal)
588a818 refactor/cleanup (harkal)
d4f3cf0 combine children to allow for ordering heuristics later (harkal)
72b10b5 disable liveness print out for debuging purposes (harkal)
918cd5f Merge branch 'master' into feat/dft_upgrade (harkal)
3d26886 work (harkal)
6f77fa8 lint (harkal)
7367822 wip (harkal)
0a9240b wip (harkal)
7f89a8b w (harkal)
ceffef6 remove duplicate `assert`, `assert_unreachable` from `VOLATILE_INSTRU… (harkal)
3f45f5e new dep (harkal)
8cb64da no groups (harkal)
2987d36 refactor[venom]: add effects to instructions (charles-cooper)
b1425ca wip (harkal)
cfbb0eb wip (harkal)
2a9a3ce push iszero-assert together (harkal)
65b1351 cleanup (harkal)
dae696a Revert "remove store expantion pass" (harkal)
db87ff8 enable store expansion (harkal)
d025298 improve naming (harkal)
e9b5303 fix get_write_effects() (charles-cooper)
36ff613 effects (harkal)
283d93a effect deps (harkal)
f94e103 work (harkal)
8be0bef Update vyper/venom/effects.py (charles-cooper)
fcb688f update effects to be an enum.Flag (charles-cooper)
94aeeb0 effects magic (harkal)
b9046cc Merge remote-tracking branch 'origin-vyper/master' into feat/dft_upgrade (harkal)
bdc1b4d wip (harkal)
62616cd disable effects (harkal)
805b079 Merge remote-tracking branch 'origin-charles/refactor/effects-analysi… (harkal)
eea930b find roots and make terminator last to process (harkal)
826e821 remove sort (harkal)
2e628b6 fix (harkal)
cbf47ef cleanup (harkal)
9b791f3 Merge remote-tracking branch 'origin-vyper/master' into feat/stack2mem (harkal)
843a127 fix deps (harkal)
a99ccf6 remove store expanstion (harkal)
d8d747b Revert "remove store expanstion" (harkal)
ba0b912 lint (harkal)
da7c00a python 3.11 support (harkal)
d8a63c9 cleanup (harkal)
face008 cleanup (harkal)
3836b87 wip (harkal)
015278a Merge branch 'master' into feat/dft_upgrade (charles-cooper)
9191c6a Merge branch 'master' into feat/stack2mem (harkal)
1d6b762 Merge branch 'master' into feat/dft_upgrade (harkal)
ff9e43c Merge branch 'feat/dft_upgrade' of github.com:harkal/vyper into feat/… (harkal)
baadeed debug (charles-cooper)
d5cc045 wip - improve heuristic (charles-cooper)
0a1a001 wip barriers (charles-cooper)
6a72efe refactor offspring count (charles-cooper)
ddde336 an improvement (charles-cooper)
6a8d595 Merge pull request #8 from charles-cooper/dft-shenanigans (harkal)
defdf8d cleanup (harkal)
0b899e4 remove deadcode and debuging (harkal)
9e2c428 Merge remote-tracking branch 'origin-vyper/master' into feat/dft_upgrade (harkal)
747e54b Merge branch 'master' into feat/venom-pops (harkal)
919dc22 Merge branch 'master' into feat/venom-pops (harkal)
2b49132 Merge branch 'feat/venom-pops' of github.com:charles-cooper/vyper int… (harkal)
26592bc Merge branch 'devel' into feat/dft_upgrade (harkal)
7406e85 Merge branch 'master' into feat/dft_upgrade (harkal)
b4c6998 Merge branch 'master' into feat/dft_upgrade (harkal)
3192f9c Merge branch 'master' into feat/dft_upgrade (charles-cooper)
e46d946 Merge branch 'feat/dft_upgrade' of github.com:harkal/vyper into feat/… (harkal)
ef84d70 Merge branch 'master' into feat/dft_upgrade (harkal)
1c98c23 fix Effects.__iter__() for python3.10 (charles-cooper)
b450deb Merge branch 'master' into feat/dft_upgrade (harkal)
7523cb1 bring back sorting and offsprings (harkal)
d2ed247 merge barriers (harkal)
b4b22cf Merge branch 'master' into feat/stack2mem (harkal)
52efbaf refactor to the new pass importing (harkal)
9212c86 add back SCCP (harkal)
414ca33 fixes and cleanup (harkal)
f48ded6 Merge branch 'feat/stack2mem' into feat/dft_upgrade (harkal)
caaae55 heuristic update (harkal)
a29efac Revert "Merge branch 'feat/stack2mem' into feat/dft_upgrade" (harkal)
02b5082 cleanup (harkal)
aec32a0 refactor (harkal)
267732c fix test (harkal)
acd5d77 add `sha3_64` to effects (harkal)
40e2e94 lint (harkal)
5b4b9c1 Merge branch 'master' into feat/dft_upgrade (harkal)
9b7b01e refactor (harkal)
0925547 Merge branch 'master' into feat/dft_upgrade (harkal)
6891e7f lint (harkal)
f7c6e94 improve offspring cost - ignore store chains (charles-cooper)
0725890 Merge branch 'master' into feat/dft_upgrade (charles-cooper)
a80c7bd Merge branch 'master' into feat/dft_upgrade (charles-cooper)
39c27bb add a comment (charles-cooper)
1f53984 Merge branch 'master' into feat/dft_upgrade (charles-cooper)
@@ -1,81 +1,138 @@
+from collections import defaultdict
+
+import vyper.venom.effects as effects
 from vyper.utils import OrderedSet
-from vyper.venom.analysis import DFGAnalysis
-from vyper.venom.basicblock import IRBasicBlock, IRInstruction, IRVariable
+from vyper.venom.analysis import DFGAnalysis, IRAnalysesCache, LivenessAnalysis
+from vyper.venom.basicblock import IRBasicBlock, IRInstruction
 from vyper.venom.function import IRFunction
 from vyper.venom.passes.base_pass import IRPass


 class DFTPass(IRPass):
     function: IRFunction
-    inst_order: dict[IRInstruction, int]
-    inst_order_num: int
+    inst_offspring: dict[IRInstruction, OrderedSet[IRInstruction]]
+    visited_instructions: OrderedSet[IRInstruction]
+    ida: dict[IRInstruction, OrderedSet[IRInstruction]]

+    def __init__(self, analyses_cache: IRAnalysesCache, function: IRFunction):
+        super().__init__(analyses_cache, function)
+        self.inst_offspring = {}
+
+    def run_pass(self) -> None:
+        self.inst_offspring = {}
+        self.visited_instructions: OrderedSet[IRInstruction] = OrderedSet()
+
+        self.dfg = self.analyses_cache.request_analysis(DFGAnalysis)
+        basic_blocks = list(self.function.get_basic_blocks())
+
+        self.function.clear_basic_blocks()
+        for bb in basic_blocks:
+            self._process_basic_block(bb)
+
+        self.analyses_cache.invalidate_analysis(LivenessAnalysis)
+
+    def _process_basic_block(self, bb: IRBasicBlock) -> None:
+        self.function.append_basic_block(bb)
+
+        self._calculate_dependency_graphs(bb)
+        self.instructions = list(bb.pseudo_instructions)
+        non_phi_instructions = list(bb.non_phi_instructions)
+
+        self.visited_instructions = OrderedSet()
+        for inst in non_phi_instructions:
+            self._calculate_instruction_offspring(inst)
+
+        # Compute entry points in the graph of instruction dependencies
+        entry_instructions: OrderedSet[IRInstruction] = OrderedSet(non_phi_instructions)
+        for inst in non_phi_instructions:
+            to_remove = self.ida.get(inst, OrderedSet())
+            if len(to_remove) > 0:
+                entry_instructions.dropmany(to_remove)
+
+        entry_instructions_list = list(entry_instructions)
+
-    def _process_instruction_r(self, bb: IRBasicBlock, inst: IRInstruction, offset: int = 0):
-        for op in inst.get_outputs():
-            assert isinstance(op, IRVariable), f"expected variable, got {op}"
-            uses = self.dfg.get_uses(op)
+        # Move the terminator instruction to the end of the list
+        self._move_terminator_to_end(entry_instructions_list)

-            for uses_this in uses:
-                if uses_this.parent != inst.parent or uses_this.fence_id != inst.fence_id:
-                    # don't reorder across basic block or fence boundaries
-                    continue
+        self.visited_instructions = OrderedSet()
+        for inst in entry_instructions_list:
+            self._process_instruction_r(self.instructions, inst)

-                # if the instruction is a terminator, we need to place
-                # it at the end of the basic block
-                # along with all the instructions that "lead" to it
-                self._process_instruction_r(bb, uses_this, offset)
+        bb.instructions = self.instructions
+        assert bb.is_terminated, f"Basic block should be terminated {bb}"

+    def _move_terminator_to_end(self, instructions: list[IRInstruction]) -> None:
+        terminator = next((inst for inst in instructions if inst.is_bb_terminator), None)
+        if terminator is None:
+            raise ValueError(f"Basic block should have a terminator instruction {self.function}")
+        instructions.remove(terminator)
+        instructions.append(terminator)
+
+    def _process_instruction_r(self, instructions: list[IRInstruction], inst: IRInstruction):
         if inst in self.visited_instructions:
             return
         self.visited_instructions.add(inst)
-        self.inst_order_num += 1

-        if inst.is_bb_terminator:
-            offset = len(bb.instructions)
-
-        if inst.opcode == "phi":
-            # phi instructions stay at the beginning of the basic block
-            # and no input processing is needed
-            # bb.instructions.append(inst)
-            self.inst_order[inst] = 0
+        if inst.is_pseudo:
             return

-        for op in inst.get_input_variables():
-            target = self.dfg.get_producing_instruction(op)
-            assert target is not None, f"no producing instruction for {op}"
-            if target.parent != inst.parent or target.fence_id != inst.fence_id:
-                # don't reorder across basic block or fence boundaries
-                continue
-            self._process_instruction_r(bb, target, offset)
+        children = list(self.ida[inst])

-        self.inst_order[inst] = self.inst_order_num + offset
+        def key(x):
+            cost = inst.operands.index(x.output) if x.output in inst.operands else 0
+            return cost - len(self.inst_offspring[x]) * 0.5

-    def _process_basic_block(self, bb: IRBasicBlock) -> None:
-        self.function.append_basic_block(bb)
+        # heuristic: sort by size of child dependency graph
+        children.sort(key=key)

-        for inst in bb.instructions:
-            inst.fence_id = self.fence_id
-            if inst.is_volatile:
-                self.fence_id += 1
+        for dep_inst in children:
+            self._process_instruction_r(instructions, dep_inst)

-        # We go throught the instructions and calculate the order in which they should be executed
-        # based on the data flow graph. This order is stored in the inst_order dictionary.
-        # We then sort the instructions based on this order.
-        self.inst_order = {}
-        self.inst_order_num = 0
-        for inst in bb.instructions:
-            self._process_instruction_r(bb, inst)
+        instructions.append(inst)

-        bb.instructions.sort(key=lambda x: self.inst_order[x])
+    def _calculate_dependency_graphs(self, bb: IRBasicBlock) -> None:
+        # ida: instruction dependency analysis
+        self.ida = defaultdict(OrderedSet)

-    def run_pass(self) -> None:
-        self.dfg = self.analyses_cache.request_analysis(DFGAnalysis)
+        non_phis = list(bb.non_phi_instructions)

-        self.fence_id = 0
-        self.visited_instructions: OrderedSet[IRInstruction] = OrderedSet()
+        #
+        # Compute dependency graph
+        #
+        last_write_effects: dict[effects.Effects, IRInstruction] = {}
+        last_read_effects: dict[effects.Effects, IRInstruction] = {}

-        basic_blocks = list(self.function.get_basic_blocks())
+        for inst in non_phis:
+            for op in inst.operands:
+                dep = self.dfg.get_producing_instruction(op)
+                if dep is not None and dep.parent == bb:
+                    self.ida[inst].add(dep)

-        self.function.clear_basic_blocks()
-        for bb in basic_blocks:
-            self._process_basic_block(bb)
+            write_effects = inst.get_write_effects()
+            read_effects = inst.get_read_effects()
+
+            for write_effect in write_effects:
+                if write_effect in last_read_effects:
+                    self.ida[inst].add(last_read_effects[write_effect])
+                last_write_effects[write_effect] = inst
+
+            for read_effect in read_effects:
+                if read_effect in last_write_effects and last_write_effects[read_effect] != inst:
+                    self.ida[inst].add(last_write_effects[read_effect])
+                last_read_effects[read_effect] = inst
+
+    def _calculate_instruction_offspring(self, inst: IRInstruction):
+        if inst in self.inst_offspring:
+            return self.inst_offspring[inst]
+
+        self.inst_offspring[inst] = self.ida[inst].copy()
+
+        deps = self.ida[inst]
+        for dep_inst in deps:
+            assert inst.parent == dep_inst.parent
+            if dep_inst.opcode == "store":
+                continue
+            res = self._calculate_instruction_offspring(dep_inst)
+            self.inst_offspring[inst] |= res
+
+        return self.inst_offspring[inst]
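For readers skimming the diff, here is a minimal standalone sketch of the scheduling idea in the new pass: build a per-block dependency graph from data dependencies and read/write effects, find the entry instructions that nothing else depends on, and emit each instruction after its dependencies via a depth-first walk. The `Inst` class, the `build_deps` and `schedule` helpers, and the `"MEMORY"` effect label are hypothetical stand-ins rather than vyper's API, and the sketch omits the offspring-based child sorting and phi handling.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Inst:
    # Toy stand-in for an IR instruction (hypothetical; not vyper's IRInstruction).
    name: str
    output: Optional[str] = None
    operands: tuple = ()
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()
    is_terminator: bool = False


def build_deps(insts):
    """Map each instruction to the instructions that must be emitted before it."""
    producer = {i.output: i for i in insts if i.output is not None}
    deps = defaultdict(list)
    last_write, last_read = {}, {}

    def add(inst, dep):
        if dep is not inst and dep not in deps[inst]:
            deps[inst].append(dep)

    for inst in insts:
        for op in inst.operands:  # data dependencies
            if op in producer:
                add(inst, producer[op])
        for eff in sorted(inst.writes):  # a write waits for the last read
            if eff in last_read:
                add(inst, last_read[eff])
            last_write[eff] = inst
        for eff in sorted(inst.reads):  # a read waits for the last write
            if eff in last_write:
                add(inst, last_write[eff])
            last_read[eff] = inst
    return deps


def schedule(insts):
    deps = build_deps(insts)
    depended_on = {d for ds in deps.values() for d in ds}
    # Entry instructions: nothing else in the block depends on them.
    entries = [i for i in insts if i not in depended_on]
    entries.sort(key=lambda i: i.is_terminator)  # stable sort, terminator last

    visited, ordered = set(), []

    def visit(inst):
        if inst in visited:
            return
        visited.add(inst)
        for dep in deps[inst]:
            visit(dep)
        ordered.append(inst)  # emitted only after all of its dependencies

    for entry in entries:
        visit(entry)
    return ordered


block = [
    Inst("const_d", output="%d"),
    Inst("load_a", output="%a", reads=frozenset({"MEMORY"})),
    Inst("const_b", output="%b"),
    Inst("add", output="%c", operands=("%a", "%b")),
    Inst("mstore", operands=("%c",), writes=frozenset({"MEMORY"})),
    Inst("ret", operands=("%d",), is_terminator=True),
]
print([i.name for i in schedule(block)])
# ['load_a', 'const_b', 'add', 'mstore', 'const_d', 'ret']
```

In this toy block, `const_d` moves from the top of the block to just before `ret`, its only use, while `mstore` stays after `load_a` because of the write-after-read edge on the shared effect.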
I think there is a performance regression here. In the previous version, we would iterate into children with `get_uses` to bring instructions closer to their uses, but here we have no possibility to reorder the entry instructions.
We can rearrange the order in which the entry instructions are processed; we are just not doing it yet.
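As a purely illustrative sketch of what rearranging the entry processing order could look like (not something this PR implements), the entries could be sorted by a heuristic before the depth-first walk. The `order_entries` helper and the choice of "largest dependency subgraph first" are assumptions made for the example. Because each instruction is appended only after its dependencies, any permutation of the entries should still give an order consistent with the dependency graph.

```python
def order_entries(entries, offspring, is_terminator):
    """Hypothetical heuristic: larger dependency subgraphs first, terminator last."""
    return sorted(
        entries,
        key=lambda e: (is_terminator(e), -len(offspring.get(e, ()))),
    )


# Toy data: entry name -> set of instructions it transitively depends on.
offspring = {
    "mstore": {"add", "load_a", "const_b"},
    "log": {"const_x"},
    "ret": {"const_d"},
}
ordered = order_entries(["ret", "log", "mstore"], offspring, lambda e: e == "ret")
print(ordered)  # ['mstore', 'log', 'ret']
```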