TODO

- NEW SSA/arg handling:
  - run the first phase of SSA_GVN as part of IR construction:
    - Get rid of "variables" as instructions in block 0
    -  read/write from vars X nodes table while filling nodes

  - conditionally get rid of "args" as instruction in node 0:
    - depending on if we actually have calls or not, we might want to get args out of the way to other regs

  - always get rid of "phi" nodes from join nodes blocks
  - don't put "putphi" instructions in block as part of phi constructions
  - only schedule these as part of "resolve_moveblock" later

- calls:
  - vregs+temps now have for all clobbered regs, not just "overlaps a call" bit but we need to use it
   - for syscalls: much fewer clobbers
   - handle constrained instructions like "shr reg, ECX" properly (note: input-constraint is not a clobber, but it works as a zero-order approx)

- FIXME: general paralell move handling. paralell moves currently needs to be handled:
  - block of putphi instructions
  - block of callarg instructions
  - block of arg instructions (we shouldn't need to do swaps here, but why not support to)

- regalloc:
 - implement dummy "spill everything!" allocator just to test spill code in isolation

 -implement liveness interval spliting:
 - reference: "Linear Scan Register Allocation on SSA Form"; Wimmer, Franz; 2010
 - [DONE] allocation when no spilling is needed
 - decide how do do spills:
   - in advance as part of instruction selection (keeping pressure)
   - in the midle
 - consider "call near" test case as of 638022887f2f060ecf6eb0aacf211f93a05ed89a:
        \\func twokube
        \\  %x = arg
        \\  %y = arg
        \\  %xx = call kuben %x
        \\  %yy = call kuben %y
        \\  %summa = add %xx %yy
        \\  ret %summa
        \\end
   - this uses two saved regs (for $xx and $y overlapping), but in theory only one is needed
     (between the calls, $y could be moved from SAVED to .rdi and then %xx from .rax to SAVED)
   - make a test case where we actually hit the register limit
   - for this case: allow interval to be split into two distinct registers, whithout any memory spilling

 - follow up project: attempt to generalize reverse-mode linear scan to general (at least reducible) CFG:s.
  (only worth it if the struct/algo becomes simpler than

WIP: integrate codebase completely with EIRI.
Try zig package manager and see make Eiri depend on FLIR!

pub fn vmovdq_vg(self: *Self, quad: bool, dst: u4, src: IPReg) !void {
    try self.new_inst(@returnAddress());
    try self.vex3(wide, dst.ext(), src1.ext(), false, .h0F38, src2.id(), false, pp);
}


std.debug.print("\nnee: {}", .{std.os.system.getpid()});
std.time.sleep(1000 * 1e9);