Simplifying code

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Simplifying code

Post by mvanthoor »

JohnWoe wrote: Tue Sep 29, 2020 8:24 pm Somehow in chess programming it's ok to have a search function that's 1000+ lines long.
Quiescence check -> Tablebase probe -> Razoring -> Futility -> Null Move ...

I have 9 small search functions. You could fit them into one single function.
I've noticed this tendency to have very long functions in chess engines (indeed, especially search and evaluation). I think this is caused by the fact that, after an engine first starts to play, people add functionality and don't take the time to refactor.

It is not particular to chess engines, however. At work, one of the programs I maintain, also has some functions I don't dare to touch because they're over 850 lines long and use some global variables for that matter. The fun of legacy code written almost 15 years ago by a different developer, at a different company, when the IT landscape was very different as well :P
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: Simplifying code

Post by Henk »

JohnWoe wrote: Tue Sep 29, 2020 8:24 pm
Henk wrote: Thu Sep 17, 2020 1:41 pm Functions no longer than twenty lines of code.
No more than three arguments.

Food for rewrite:

Exactly!
Functions that take 4 or less arguments fit into register and that is processed really fast.
Functions should do just one thing. And _ONE_ thing only.

Somehow in chess programming it's ok to have a search function that's 1000+ lines long.
Quiescence check -> Tablebase probe -> Razoring -> Futility -> Null Move ...

I have 9 small search functions. You could fit them into one single function.
I tried to reduce number of arguments but now I get this for example

Code: Select all

 search = idSearch.Search.Set(Level, depth)
                         .Set(Param, searchParam)
                         .Set(State, state);
 (result, state) = search.MainSearch(bestValue);
So if I forget one Set I'm in trouble.
Maybe that's why he advocates Test Driven development and spending half of your time at refactoring.

Other example:

Code: Select all

   var initMvPriorityCalculator =
                    mvPriorityCalculator
                    .SetAllowStandingPat(true)
                    .SetDepth(p.Depth)
                    .SetKillerMoves(state.KillerMoves)
                    .SetMoveRatings(state.MoveRatings);
Also splitting up long methods in shorter ones is not that easy. For you end up with methods with strange names and hard to tell where method is for if you don't know the context.
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: Simplifying code

Post by Henk »

He also said better have clean code that doesn't work than bad code that works.
For if you have clean code you can fix the bugs and make it work.

Hi, hi, hi. So write clean code. Fix bugs later.

Sorry this should be: make it work, refactor, make it work again.
JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Re: Simplifying code

Post by JohnWoe »

Henk wrote: Sat Oct 03, 2020 10:32 am He also said better have clean code that doesn't work than bad code that works.
For if you have clean code you can fix the bugs and make it work.

Hi, hi, hi. So write clean code. Fix bugs later.

Sorry this should be: make it work, refactor, make it work again.
When the code isn't modular refactoring takes time and energy.
I have always liked more simple functions doing just 1 thing and doing it damn well. Than some 1000 lines monster doing gazillion things.

I see this "C-hacker syndrome" going on. Where they write their buggy containers/string functions from scratch. Because they want to be "in control". I rather use std::vector/std::string and get the job done.

Linus Torvalds accepts new code to Kernel for 2 weeks then 8-10 weeks refactoring/bug fixing. The release cycle is about 10 weeks. The famous 80/20 ratio here.
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: Simplifying code

Post by Henk »

Now I get this. At least it is less than 20 lines of code.

Code: Select all

  	
        public (Result, SearchState) PrincipalVariationSearch()
        {
            var state = Get<SearchState>(Name.State);
            var param = Get<SearchParam>(Name.Param);
            var key = KeyFactory.BuildKey(param.Position.Board);
            var (qResult, qState) = QuiescenceSearch(key);
            if (qResult != null)
            {
                return (qResult, qState);
            }
            var entry = state.TTable[key];
            var (res, nextState) = DoTranspositionAndNullMovePruning(entry, key);           
            if (res != null)
            {
                return (res, nextState);
            }
            return Set(Name.State, nextState)
                   .SearchMoves(entry, key);
        }
  
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Simplifying code

Post by mvanthoor »

JohnWoe wrote: Sat Oct 03, 2020 11:53 pm I see this "C-hacker syndrome" going on. Where they write their buggy containers/string functions from scratch. Because they want to be "in control". I rather use std::vector/std::string and get the job done.

Linus Torvalds accepts new code to Kernel for 2 weeks then 8-10 weeks refactoring/bug fixing. The release cycle is about 10 weeks. The famous 80/20 ratio here.
Personally, in the past, I've also exhibited this behavior of wanting to write -everything- myself, and trying to keep code as short as possible.

For example, you can rewrite this:

Code: Select all

let x = a + b;
let y = d * r;
let z = x / y;
to this:

Code: Select all

let z = (a + b) / (d * r);
However, this begs the questions: what does (a + b) and (d * r) represent?" Obviously, imagine the single-letter variables to be more descriptive, or representing functions or something. It's only to demonstrate that you can put several lines into one line if you'd want to. Nowadays, the first example above, will be transformed to the second form by the compiler.

For example, my Square Attacked function is this:

Code: Select all

    pub fn square_attacked(&self, board: &Board, attacker: Side, square: Square) -> bool {
        let attackers = board.bb_pieces[attacker];
        let occupancy = board.occupancy();
        let bb_king = self.get_non_slider_attacks(Pieces::KING, square);
        let bb_rook = self.get_slider_attacks(Pieces::ROOK, square, occupancy);
        let bb_bishop = self.get_slider_attacks(Pieces::BISHOP, square, occupancy);
        let bb_knight = self.get_non_slider_attacks(Pieces::KNIGHT, square);
        let bb_pawns = self.get_pawn_attacks(attacker ^ 1, square);
        let bb_queen = bb_rook | bb_bishop;

        (bb_king & attackers[Pieces::KING] > 0)
            || (bb_rook & attackers[Pieces::ROOK] > 0)
            || (bb_queen & attackers[Pieces::QUEEN] > 0)
            || (bb_bishop & attackers[Pieces::BISHOP] > 0)
            || (bb_knight & attackers[Pieces::KNIGHT] > 0)
            || (bb_pawns & attackers[Pieces::PAWN] > 0)
    }
It looks as if it first generates all of the attacks for all possible pieces, and then starts to check if one of them is true in the OR statement. This seems to waste a lot of calculation, because you could be calculating square attacks you don't need. This is probably the reason why I often see this function written with many if-statements, one for each piece, trying to cut the function short. In Rust however (in C as well, probably), the above function is transformed by the compiler:

Code: Select all

pub fn square_attacked(&self, board: &Board, attacker: Side, square: Square) -> bool {
	(self.get_non_slider_attacks(Pieces::KING, square) & board.bb_pieces[attacker][Pieces::KING] > 0)
	|| (self.get_slider_attacks(Pieces::ROOK, square, board.occupancy()) & board.bb_pieces[attacker][Pieces::ROOK] > 0)
	|| ((self.get_slider_attacks(Pieces::ROOK, square, board.occupancy()) | bb_bishop = self.get_slider_attacks(Pieces::BISHOP, square, board.occupancy())) & board.bb_pieces[attacker][Pieces::QUEEN] > 0)
	|| (bb_bishop = self.get_slider_attacks(Pieces::BISHOP, square, board.occupancy()) & board.bb_pieces[attacker][Pieces::BISHOP] > 0)
	|| (self.get_non_slider_attacks(Pieces::KNIGHT, square) & board.bb_pieces[attacker][Pieces::KNIGHT] > 0)
	|| (self.get_pawn_attacks(attacker ^ 1, square) & board.bb_pieces[attacker][Pieces::PAWN] > 0)
}
I might have made a copy/paste mistake somewhere, and the compiler will probably be able to reduce the function even further. Point is that all the 'extra' variables at the top will be elided by the compiler (to elide => omit something by merging it into something else), resulting in one huge OR-statement, which at runtime will be cut as short as possible.

Because of optimizations such as these, I prefer to write more descriptive code, using more lines, and more in-between variables. For example, my function that generates pawn moves is a massive row of let statements, building one on top of the previous, and it basically has only one command at the end to add a move to the move list:

Code: Select all

    pub fn pawns(&self, board: &Board, list: &mut MoveList) {
        const UP: i8 = 8;
        const DOWN: i8 = -8;

        let us = board.us();
        let bb_opponent_pieces = board.bb_side[board.opponent()];
        let bb_empty = !board.occupancy();
        let bb_fourth = BB_RANKS[Board::fourth_rank(us)];
        let direction = if us == Sides::WHITE { UP } else { DOWN };
        let rotation_count = (NrOf::SQUARES as i8 + direction) as u32;
        let mut bb_pawns = board.get_pieces(Pieces::PAWN, us);

        // As long as there are pawns, generate moves for each of them.
        while bb_pawns > 0 {
            let from = bits::next(&mut bb_pawns);
            let to = (from as i8 + direction) as usize;
            let bb_push = BB_SQUARES[to];
            let bb_one_step = bb_push & bb_empty;
            let bb_two_step = bb_one_step.rotate_left(rotation_count) & bb_empty & bb_fourth;
            let bb_targets = self.get_pawn_attacks(us, from);
            let bb_captures = bb_targets & bb_opponent_pieces;
            let bb_ep_capture = match board.game_state.en_passant {
                Some(ep) => bb_targets & BB_SQUARES[ep as usize],
                None => 0,
            };

            // Gather all moves for the pawn into one bitboard.
            let bb_moves = bb_one_step | bb_two_step | bb_captures | bb_ep_capture;
            self.add_move(board, Pieces::PAWN, from, bb_moves, list);
        }
    }
I don't even want to try and guess how the compiler is going to elide all of those variables, but be sure that it does. This way of coding, in combination with waterfall if-statements is a good way of getting descriptive code and avoiding Rightward Drift, but id DOES use a lot more lines. (And thus requires more typing.)
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: Simplifying code

Post by Henk »

O wait I have to remove redundant curly braces too.
Curly braces are probably warning signals that refactoring is done badly.
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Simplifying code

Post by mvanthoor »

Henk wrote: Tue Oct 06, 2020 11:20 am O wait I have to remove redundant curly braces too.
Curly braces are probably warning signals that refactoring is done badly.
Why? I often see this:

Code: Select all

if (...)
	statement1
else
	statement2
And then you get weird errors about a dangling else if you put another statement below statement1, or a statement you put below statement2 suddenly doesn't belong to your if-statement. (Same stuff goes for for-loops.)

I prefer to use braces everywhere, even if there's only one statement. (If you don't, the default Rust formatter will fix it for you :P ) I also use parenthesis all the time:

Code: Select all

(a && b) || (c && d)
I was VERY bad at questions like "determine the result of this line" (with regard to operator priority) in university. I never took the time to study this in any language. If in doubt, use parenthesis and you KNOW it'll be correct.

Don't try to make the code as short as possible; try to make it as readable as possible for humans, with short functions and clear variable names, using extra variables for in-between results. The compiler will probably make it shorter than even you yourself could have made it with your best efforts.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: Simplifying code

Post by Henk »

No this is for only one statement in if and else clause which is best.

O wait you mean accidentally put an extra stratement below these clauses.
Maybe this is also about test driven development. Test cases should capture these errors.

Probably also functions are so short no extra statements required. Or probably you will detect these errors easily.
Braces should be ugly.

Here next result of rewrite. I think KeyFactory is unneccessary too. All too much.

Code: Select all

	 public (Result, SearchState) PrincipalVariationSearch()
        {
            var param = Get<SearchParam>(Name.Param);
            var key = KeyFactory.BuildKey(param.Position.Board);
            var (nextResult, nextState) = TryTerminateTheSearch(key);
            if (nextResult != null) return (nextResult, nextState);      
            else return Set(Name.State, nextState)
                        .SearchMoves(key);
        }
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Simplifying code

Post by Joost Buijs »

mvanthoor wrote: Tue Oct 06, 2020 12:35 pm
Henk wrote: Tue Oct 06, 2020 11:20 am O wait I have to remove redundant curly braces too.
Curly braces are probably warning signals that refactoring is done badly.
Why? I often see this:

Code: Select all

if (...)
	statement1
else
	statement2
And then you get weird errors about a dangling else if you put another statement below statement1, or a statement you put below statement2 suddenly doesn't belong to your if-statement. (Same stuff goes for for-loops.)

I prefer to use braces everywhere, even if there's only one statement. (If you don't, the default Rust formatter will fix it for you :P ) I also use parenthesis all the time:

Code: Select all

(a && b) || (c && d)
I was VERY bad at questions like "determine the result of this line" (with regard to operator priority) in university. I never took the time to study this in any language. If in doubt, use parenthesis and you KNOW it'll be correct.

Don't try to make the code as short as possible; try to make it as readable as possible for humans, with short functions and clear variable names, using extra variables for in-between results. The compiler will probably make it shorter than even you yourself could have made it with your best efforts.
You're right, I also have the habit of using curly braces, parenthesis and variables for in-between results as much as possible. It avoids problems with operator precedence and makes the code a lot easier to understand, the in-between variables will be removed by the optimizer anyway.

In the past when compilers were not that smart a different programming style was needed to get good performance, basically you had to optimize the whole code flow by hand, fortunately this no longer necessary.