-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(connect-four): add board evaluation #228
Conversation
src/games/connect_four.rs
Outdated
const POSITIONS: [usize; BOARD_HEIGHT] = [ | ||
0, // | ||
BOARD_WIDTH, | ||
BOARD_WIDTH * 2, | ||
BOARD_WIDTH * 3, | ||
BOARD_WIDTH * 4, | ||
BOARD_WIDTH * 5, | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since constants of this structure are created several times throughout the code, we should probably find a way to reduce the boilerplate here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, that's like my least favourite part to see, but they're also critical for the performance of this app.
What I can do, however, is define all the constants for all coordinates (used in the check_x
methods) and expand them, then I can do for this case:
const POSITIONS: [usize; BOARD_HEIGHT] = [0, B1, B2, B3, B4, B5];
It'll still have the boilerplate, but it'll be less visually noisy, and will also reduce the amount of lines of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a generic constructor for each entry would be
base + BOARD_WIDTH * idx + if additive { idx } else { 0 }
but i am not sure how you would implement this without inducing some kind of loss in performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: I can't, B1
, B2
, and B3
from check
are isize
(because negative ranges), and this uses strictly usize
.
It'd also declare a lot of unused code in the top-level with this change, which is not desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will work on a macro (or if i can think of something better, i'll do that instead) for this
i am, however, noticing that each of these functions has a lot of boilerplate, so if i'm going to go to those lengths i might consider making one that generates the entire thing
src/games/connect_four.rs
Outdated
const POSITIONS: [usize; BOARD_HEIGHT] = [ | ||
0, // | ||
BOARD_WIDTH, | ||
BOARD_WIDTH * 2, | ||
BOARD_WIDTH * 3, | ||
BOARD_WIDTH * 4, | ||
BOARD_WIDTH * 5, | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will work on a macro (or if i can think of something better, i'll do that instead) for this
i am, however, noticing that each of these functions has a lot of boilerplate, so if i'm going to go to those lengths i might consider making one that generates the entire thing
From 1 SSE2 and a jump, to 2 SSE instructions and no jump
Co-authored-by: Tyler J Russell <[email protected]>
The code prior to this commit was already quite fast, at 37 instructions. For this commit, I did two things: 1. I read all the indexes once per line instead of once per window, this allowed the code to read the values as a sequence of contiguous `u8` values instead of a sequence that jumps around in memory. This by itself allows the `evaluate_window` function to use a slice rather than having to create a new array per window. This change reduced the number of instructions in `evaluate_window` from 37 to 23. 2. I changed `ConnectFour` to add a `score_player_mask` field containing the `Simd<u8, 4>` value, rather than having to make a new one on every `evaluate_window` call. This change reduced the number of instructions in `evaluate_window` from 23 to 19. The final x86 assembly code for `evaluate_window` is as follows: ```asm ; fn evaluate_window(&self, player: Player, window: &[u8; 4]) -> i32 { movdqa xmm0, xmmword, ptr, [rdx] pxor xmm1, xmm1 movdqa xmm2, xmm0 pcmpeqb xmm2, xmm1 xorps xmm3, xmm3 movss xmm3, xmm2 pmovmskb eax, xmm3 ; match mask.to_bitmask() { movzx eax, al pcmpeqb xmm0, xmmword, ptr, [rcx] lea rcx, [rip, +, switch.table._ZN8skyra_ai5games12connect_four11ConnectFour15evaluate_window17hbe62da8767e17f32E] movss xmm1, xmm0 pmovmskb edx, xmm1 movzx edx, dl lea r8, [rip, +, switch.table._ZN8skyra_ai5games12connect_four11ConnectFour15evaluate_window17hbe62da8767e17f32E.300] movzx edx, byte, ptr, [rdx, +, r8] ; let mask: u8 = (player_pieces << 3) | empty_pieces; or dl, byte, ptr, [rax, +, rcx] movsx rax, dl ; match mask { lea rcx, [rip, +, switch.table._ZN8skyra_ai5games12connect_four11ConnectFour15evaluate_window17hbe62da8767e17f32E.299] mov eax, dword, ptr, [rcx, +, 4*rax, -, 4] ; } ret ```
That's the name of the token we use for publish
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have nothing left to add that wouldn't simply be nitpicking, like make_
and generate_
can both be gen_
by convention. looks good :)
Implemented the heuristics to evaluate the board using as reference:
However, I'm not quite happy with the performance, so I will try to see if I can improve it.A lot of work has been done to optimise the code greatly.