Page 2 of 14

Re: Simplifying code

Posted: Fri Apr 24, 2020 9:50 pm
by mvanthoor
Just create the board at program entry, and then pass a pointer to that board to each function such as search() or perft().

In that function you can create a local move list over which you can iterate. As those functions are recursive, each function call will automatically have its own move list on the stack. Because you're using only one board, you'll obviously have to use make/unmake. It's the fastest way. copy/make is too slow, because you'll have to copy the entire position, bitboards, lists and all. This is my board:

Code: Select all

pub struct Board {
    pub bb_side: [[Bitboard; NR_OF_PIECES as usize]; EACH_SIDE as usize],
    pub bb_pieces: [Bitboard; EACH_SIDE as usize],
    pub game_state: GameState,
    pub history: History,
    pub piece_list: [Piece; NR_OF_SQUARES as usize],
    pub material_count: [u16; EACH_SIDE as usize],
    zobrist_randoms: Arc<ZobristRandoms>,
    move_generator: Arc<MoveGenerator>,
}
This would take:

bb_side: 8 bytes x 6 pieces x 2 sides = 96 bytes
bb_pieces: 8 bytes x 2 sides = 16 bytes
game_state: This is also a struct with its own fields. It's 24 bytes.
history: this is the game history. It is an array that holds 2048 game states: 49.152 bytes.
piece_list: 8 bytes x 64 squares (my pieces are usize/u64, as Rust requires this for array indexing): 512 bytes
zobrist_randoms: Atomic Counting Reference Pointer to the Zobrist Randoms struct. 8 bytes.
move_generator: Atomic Counting Reference Pointer to the Move Generator struct. 8 bytes.

The last two exist so I don't need to pass the zobrist randoms and move generator to any functions. I can't just call them globally; Rust has no global variables. (Well; it does; if you prefer working AGAINST the language, you can hack it...)

The total is 49.816 bytes. I wouldn't want to copy that on each move, billions of times in a game.

In a different language such as C, you could store the history list, zobrist randoms and move generator globally, but the rest is essential to be copied because it changes on every move. Even so, you'd need to copy 648 bytes. A billion copies of 648 bytes comes down to 600 GB of copying.

I'd rather do make/unmake to be honest...

I first had my material count in the game state struct, thinking to save two instructions: not having to update it in reverse when doing unmake(). It was only 2x 8 bytes to copy... so, should work, right? But then I did some research. My game state struct was 26 bytes, and Rust pads it up to 32, which is the first multiple of 8. So, I was copying 6 bytes extra on each make/unmake. I took the material count out, which dropped the struct to 24 bytes; this is exactly a multiple of 8 so it isn't padded. Taking out 2 bytes therefore saved me 8 bytes of copying during make and unmake, and the engine sped up by a few MILLION moves per second. It gained me 10% speed. So yes, taking one or two bytes off of struct type that is copied back and forth billions of times can actually speed up your engine by 10% percent.

I managed to pull this optimization (and some others, cutting if-statements and assignments where possible) off in a few places in make and unmake, and it dropped my perft 7 speed from 120 seconds to 79 seconds. Speedup is 34%.

In a chess engine, speed is your currency. This currency buys you deeper search (if you just use the speed) or better evaluation accuracy (you can trade the speed for more/better evaluation parameters without losing search depth). The more speed you have (efficient move generation, make/unmake, search with good move ordering and pruning), the more other things you can "buy".

Making your evaluation better = more strength... but... it also loses speed, and thus search depth, which costs you strength. The gain through evaluation better be higher than the loss of strength due to searching less deep. So, if you add an evaluation parameter that costs you 3% speed and then can you manage to speed up the engine by 3%, your evaluation was basically free.

Re: Simplifying code

Posted: Sat Apr 25, 2020 9:32 am
by Michael Sherwin
I'm "out of the know" about this no stack idea. I understand copy and forget and making an array of modified boards (which is a stack, btw). But what I do not "see" is how effective move ordering is done. Move ordering is the absolute most important factor in the efficiency of an alpha/beta search. Ten times slower code with good move ordering "runs circles" around ten times faster code with poor move ordering. Please tell, how does one achieve good move ordering without a list of moves to order?

Re: Simplifying code

Posted: Sat Apr 25, 2020 10:28 am
by Henk
I create a move priority queue on each node when not leaf node.

Re: Simplifying code

Posted: Sat Apr 25, 2020 10:32 am
by Ras
Michael Sherwin wrote: Sat Apr 25, 2020 9:32 amPlease tell, how does one achieve good move ordering without a list of moves to order?
If you have a hash table hit with hash move, but insufficient depth draft, you don't need to generate moves because it will probably (90% - 95%) be a beta cut-off. I just verify the complete move legality of the hash move before I accept it. I call it "even later move generation" in my engine, though it doesn't give much of a performance boost.

Re: Simplifying code

Posted: Sat Apr 25, 2020 10:43 am
by Michael Sherwin
Henk wrote: Sat Apr 25, 2020 10:28 am I create a move priority queue on each node if it doesn't stand pat.
On every node of the search tree? Do you have really huge standpat margins lower in the tree? How can queen sacrifices ever be seen? A priority queue can mean many things. I assume that you generate pawns_capture_queens first and queens_capture_pawns last? I guess that would be quick using bitboards! Then generate the rest of the moves in no particular order? Am I on the right track? Thanks.

Re: Simplifying code

Posted: Sat Apr 25, 2020 10:49 am
by Henk
Michael Sherwin wrote: Sat Apr 25, 2020 10:43 am
Henk wrote: Sat Apr 25, 2020 10:28 am I create a move priority queue on each node if it doesn't stand pat.
On every node of the search tree? Do you have really huge standpat margins lower in the tree? How can queen sacrifices ever be seen? A priority queue can mean many things. I assume that you generate pawns_capture_queens first and queens_capture_pawns last? I guess that would be quick using bitboards! Then generate the rest of the moves in no particular order? Am I on the right track? Thanks.
Yes. And it doesn't see queen sacrifices. Maybe the internal iterative deepening move might sacrifice a queen at a node near the root.

Re: Simplifying code

Posted: Sat Apr 25, 2020 10:50 am
by Michael Sherwin
Ras wrote: Sat Apr 25, 2020 10:32 am
Michael Sherwin wrote: Sat Apr 25, 2020 9:32 amPlease tell, how does one achieve good move ordering without a list of moves to order?
If you have a hash table hit with hash move, but insufficient depth draft, you don't need to generate moves because it will probably (90% - 95%) be a beta cut-off. I just verify the complete move legality of the hash move before I accept it. I call it "even later move generation" in my engine, though it doesn't give much of a performance boost.
I am quite sure that hash move first only ordering does not make for a very good move ordering scheme. But, it is a good start.

Re: Simplifying code

Posted: Sat Apr 25, 2020 10:59 am
by Ras
Michael Sherwin wrote: Sat Apr 25, 2020 10:50 amI am quite sure that hash move first only ordering does not make for a very good move ordering scheme.
Correct because most of the time, at least in the middlegame, there won't be a hash hit. But I guess you could also implement MVV-LVA in some clever way so that the order of move generation would coincide with the order scheme of MVV-LVA.

Re: Simplifying code

Posted: Sat Apr 25, 2020 11:27 am
by Henk
Ras wrote: Sat Apr 25, 2020 10:59 am
Michael Sherwin wrote: Sat Apr 25, 2020 10:50 amI am quite sure that hash move first only ordering does not make for a very good move ordering scheme.
Correct because most of the time, at least in the middlegame, there won't be a hash hit. But I guess you could also implement MVV-LVA in some clever way so that the order of move generation would coincide with the order scheme of MVV-LVA.
The nodes near the root need not care much about efficiency of move ordering. For there are only a few. Bitboard operations not cheap. But computing move priorities not cheap too.

Re: Simplifying code

Posted: Sat Apr 25, 2020 11:36 am
by Ras
Henk wrote: Sat Apr 25, 2020 11:27 amThe nodes near the root need not care much about efficiency of move ordering.
I don't think so. True, there are only few nodes near root, but they have the largest search trees beneath them.