Devlog of Leorik

lithander · Post by **lithander** » Sat Jan 29, 2022 1:17 pm

mvanthoor wrote: ↑Fri Jan 28, 2022 6:27 pm Does your a/b-function already implement staged move generation? If so, I'd remove it.

Yes it does. I believe I mentioned it above, too. I create captures and quiets separately anyway and if I play all the quiets first and captures after that of course the tree size explodes. I was merely surprised to find that the additional benefit of move-sorting the captures vs playing them in the order they were generated wasn't a clear win. (Except in qsearch)

mvanthoor wrote: ↑Fri Jan 28, 2022 6:27 pm If I were you, I'd implement the simplest a/b-function and a simple linear time control and then have the engine run a test against MMC 0.2. Leorik should perform horribly because it has no qsearch (so big problems with horizon effect) and it has no MVV-LVA (so it tries moves in random order). Then implement qsearch to remove the horizon effect. (This will lower the depths you can reach, because the entire qsearch is done at depth 0; but it will massively boost seldepth, obviously.) Test again; you should now see a clear improvement although Leorik will still be very weak.

I don't actually believe that qsearch without sorting the captures would make the engine stronger at all. Based on what I measured so far and described in my above post (provided there are no hidden bugts) my conclusion would be that without move sorting the horizon-effect is less of a problem for such a depth-constrained engine then getting caught up in calculating pointless exchanges in qsearch because the capture where a pawn captures a queen isn't prioritized. E.g. a search to depth 7 with the horizon effect causing some blunders is better than a search that only reaches depth 5 because that depth is just way too shallow.

mvanthoor wrote: ↑Fri Jan 28, 2022 6:27 pm If you implement only qsearch, MVV-LVA, linear time control and incremental non-tapered PST, you should have feature-parity with Rustic Alpha 1 and MMC 0.3, if I remember correctly. Therefore your rating should be somewhere around 1500-1700 depending on the speed of the implementation.

MMC 0.3 also had a feature where it used the triangular PV table to not only store the pv but also play the move from the pv table first when possible. That's a feature I didn't plan to recreate in Leorik because eventually it's getting replaced by a best move from the TT. So without tracing the evolution of MMC accurately I can not use it as a sparring partner because it would be comparing apples with oranges, not only about the speed difference.

I could however try to make a version that has feature parity with Rustic Alpha if I knew exactly what features you had. Did you take any knowledge from previous iterations to speed up the performance of iteration X? Or was the search time it took for a depth of X the same whether you searched in iterative search or outside?

I wanted to avoid engine-vs-engine testing for as long as possible because it's so time consuming but maybe it's the sensible thing to do just to make extra sure that each component I add to Leorik is actually working as it's supposed to.

mvanthoor · Post by **mvanthoor** » Sat Jan 29, 2022 4:18 pm

lithander wrote: ↑Sat Jan 29, 2022 1:17 pm I don't actually believe that qsearch without sorting the captures would make the engine stronger at all. Based on what I measured so far and described in my above post (provided there are no hidden bugts) my conclusion would be that without move sorting the horizon-effect is less of a problem for such a depth-constrained engine then getting caught up in calculating pointless exchanges in qsearch because the capture where a pawn captures a queen isn't prioritized. E.g. a search to depth 7 with the horizon effect causing some blunders is better than a search that only reaches depth 5 because that depth is just way too shallow.

You got a point there. Maybe it's better to implement move sorting first.

MMC 0.3 also had a feature where it used the triangular PV table to not only store the pv but also play the move from the pv table first when possible. That's a feature I didn't plan to recreate in Leorik because eventually it's getting replaced by a best move from the TT. So without tracing the evolution of MMC accurately I can not use it as a sparring partner because it would be comparing apples with oranges, not only about the speed difference.

Indeed; I forgot about that feature. If you're not going to take the same steps in the same order, you can't compare to MMC 1-on-1; but you can still use it as a sparring partner to estimate Leorik's strength.

I could however try to make a version that has feature parity with Rustic Alpha 1 if I knew exactly what features you had. Did you take any knowledge from previous iterations to speed up the performance of iteration X? Or was the search time it took for a depth of X the same whether you searched in iterative search or outside?

Rustic Alpha 1 has the following features:
- Fancy magic bitboard move generator and bitboard board representation
- alpha-beta (hard fail)
- QSearch
- MVV-LVA
- Check extension
- Evaluation with material count and one set of self-written PSQT's (non-tapered, but it has an endgame King table to make sure it can actually deliver checkmate against a bare king.)

I wanted to avoid engine-vs-engine testing for as long as possible because it's so time consuming but maybe it's the sensible thing to do just to make extra sure that each component I add to Leorik is actually working as it's supposed to.

That would be my main reason to test the engine; test every feature you add, to see if it adds the expected amount of playing strength. If it doesn't, you probably have a bug somewhere. If you add lots of features and then later have to go back, debugging is going to be harder.

With some luck I have a two week vacation coming up. I hope I'll be able to find some time to finally finish my own next version. I'll be very glad when that's out of the way, because the next verison 5 will be relatively small. (Probably only null move and static null move.) And I want to turn part of Rustic into "librustic", so I have my own chess library to use in future programs.

lithander · Post by **lithander** » Sat Jan 29, 2022 8:08 pm

mvanthoor wrote: ↑Sat Jan 29, 2022 4:18 pm Rustic Alpha 1 has the following features:
- Fancy magic bitboard move generator and bitboard board representation
- alpha-beta (hard fail)
- QSearch
- MVV-LVA
- Check extension
- Evaluation with material count and one set of self-written PSQT's (non-tapered, but it has an endgame King table to make sure it can actually deliver checkmate against a bare king.)

Because I had MVV-LVA and QSearch (which searches all moves when in check which comes down to check extensions I guess) already implemented it didn't take too long to create an engine by just taking MinimalChess's code for time control, UCI protocol etc.

Then I plaid a few hundred games at 3s + 0.2s increment.

Code: Select all

Score of Leorik 0.1 vs Rustic 1: 577 - 369 - 725  [0.562] 1671
...      Leorik 0.1 playing White: 314 - 161 - 361  [0.592] 836
...      Leorik 0.1 playing Black: 263 - 208 - 364  [0.533] 835
...      White vs Black: 522 - 424 - 725  [0.529] 1671
Elo difference: 43.5 +/- 12.5, LOS: 100.0 %, DrawRatio: 43.4 %

My evaluation is different but other than that I believe I have the exact same features as Rustic 1. My evaluation is based on the tables of MMC but without the mobility table. The values are tuned under the assumption that the mobility tables is also used. Not sure if it makes the evaluation better or weaker than yours.

The high draw rate is mostly due to 3-fold repetitions. Leorik doesn't detect those yet. Not sure about Rustic.

There has been one time forfeit for Leorik. But no illegal moves or crashes. Nice!

Just out of curiosity I have used 'go' on the start position and it seems like Leorik is way faster, somehow.

Code: Select all

Leorik 0.1
go
info string Search scheduled to take 715827862ms!
info depth 1 score cp 29 nodes 20 nps 606 time 33 pv d2d4
info depth 2 score cp 0 nodes 151 nps 3355 time 45 pv d2d4
info depth 3 score cp 27 nodes 1340 nps 26800 time 50 pv g1f3
info depth 4 score cp 0 nodes 6042 nps 118470 time 51 pv g1f3
info depth 5 score cp 27 nodes 38144 nps 706370 time 54 pv g1f3
info depth 6 score cp 0 nodes 351012 nps 3510120 time 100 pv b1c3
info depth 7 score cp 19 nodes 2316858 nps 7240181 time 320 pv b1c3
info depth 8 score cp 8 nodes 17755752 nps 11980939 time 1482 pv e2e4
info depth 9 score cp 30 nodes 133963134 nps 14255947 time 9397 pv e2e4
info depth 10 score cp 6 nodes 1061640178 nps 13162732 time 80655 pv e2e4

Code: Select all

Engine: Rustic Alpha 1
Author: Marcel Vanthoor
EMail: mail@marcelvanthoor.nl
Website: https://rustic-chess.org/
Protocol: uci
Threads: 1

go
info score cp 70 depth 1 time 0 nodes 7 nps 0 pv e2e4
info score cp 0 depth 2 seldepth 4 time 0 nodes 114 nps 0 pv d2d4 d7d5
info score cp 60 depth 3 seldepth 4 time 0 nodes 1005 nps 0 pv d2d4 d7d5 e2e3
info score cp 0 depth 4 seldepth 6 time 4 nodes 15893 nps 3973250 pv d2d4 d7d5 c1g5 c8g4
info score cp 20 depth 5 seldepth 6 time 46 nodes 169855 nps 3692500 pv d2d4 d7d5 c1f4 c8g4 f4e5
info score cp 5 depth 6 seldepth 7 time 343 nodes 1458763 nps 4252953 pv d2d4 d7d5 e2e3 c8d7 c2c4 e7e6 c4d5 e6d5
info score cp 20 depth 7 seldepth 8 time 2858 nodes 11665664 nps 4081758 pv d2d4 d7d5 c1d2 c8f5 e2e3 e7e6 f1e2
info score cp 0 depth 8 seldepth 9 time 24449 nodes 109722286 nps 4487803 pv d2d4 d7d5 c1d2 c8g4 f2f3 g4f5 e2e3 e7e6
info score cp 15 depth 9 seldepth 10 time 180631 nodes 771514086 nps 4271216 pv g1f3 d7d5 d2d4 c8g4 f3e5 g4f5 c1d2 e7e6 e2e3

Rustic takes 3 minutes to reach depth 9 and Leorik takes 9 seconds, only. I'm not even sure if we count nodes the same way (4M nps seems slow, 14M nps seems about right) but speed alone doesn't explain that huge difference!

But anyway... I'm happy, as it seems so far everything is working as it should!

mvanthoor · Post by **mvanthoor** » Sat Jan 29, 2022 8:49 pm

lithander wrote: ↑Sat Jan 29, 2022 8:08 pm But anyway... I'm happy, as it seems so far everything is working as it should!

Congratulations

I have no explanation for the speed difference. My engine runs at 4.5 million NPS on my i7-6700K.

It runs at about the same speed on your CPU, but it is much newer. Are you using the BMI2-version on a Zen2 CPU perhaps? If you have a Zen2, you'd better run the popcnt version.

I cannot explain why your engine runs at 14M nodes per second. That is ridiculously fast. I also find that to be strange because, according to dangi's test with regard to bitboard implementations, Magic Bitboards is faster than the implementation you have used.

But, do you still have staged move generation in the 0.1 version of your engine? Rustic Alpha 1 doesn't have that.

This speed difference could explain the 43 Elo difference. That could also be explained by the fact that you are using (partially) tuned tables, where my engine is using a table written by hand.

I'd be glad to know where that massive speed difference between the engines comes from. I can believe that your first version is a bit stronger than Rustic 1 due to better PST's, but I can hardly believe that it is over 3 times faster with regards to nodes per second. Either you are counting nodes differently, or I have a massive bottleneck somewhere.

Rustic 1 counts nodes when it is actually going to do work: so it counts nodes after generating moves. If it cuts nodes off, a node is not counted. (This is true for both a/b and qsearch.) Rustic 3 counts nodes as they are visited: it counts immediately after entering the a/b function and qsearch function. To make sure it does not count a node at depth 0 a second time, it compensates for this by substracting a node just before entering qsearch. Rustic 3 seems to be faster than 1 because it counts visited nodes instead of nodes generating moves, but in practice they're the same speed. (If no hash table is used.)

I'm wondering how your engine is able to reach a higher depth that fast.

The development version of Rustic (and basically every version since 3) reaches depth 10 in 9.4 seconds IF I give it a 1MB TT, so it uses hash table / PV move sorting:

Code: Select all

info score cp 29 depth 1 seldepth 1 time 0 nodes 24 nps 0 pv d2d4
info score cp 0 depth 2 seldepth 2 time 0 nodes 90 nps 0 pv d2d4 d7d5
info score cp 27 depth 3 seldepth 5 time 0 nodes 672 nps 0 hashfull 1 pv d2d4 d7d5 g1f3
info score cp 0 depth 4 seldepth 8 time 2 nodes 2292 nps 1146000 hashfull 6 pv d2d4 d7d5 g1f3 g8f6
info score cp 27 depth 5 seldepth 12 time 9 nodes 13957 nps 1550778 hashfull 22 pv d2d4 d7d5 g1f3 b8c6 b1c3
info score cp 0 depth 6 seldepth 18 time 18 nodes 66522 nps 3695667 hashfull 134 pv d2d4 d7d5 g1f3 g8f6 f3e5 f6e4
info score cp 19 depth 7 seldepth 18 time 46 nodes 302428 nps 6574522 hashfull 417 pv d2d4 d7d5 g1f3 g8f6 b1c3 b8c6 f3g5
info score cp 8 depth 8 seldepth 23 time 260 nodes 1809704 nps 6960400 hashfull 994 pv e2e4 e7e5 b1c3 b8c6 g1f3 g8f6 d2d4 e5d4 f3d4 c6d4 d1d4
info score cp 27 depth 9 seldepth 23 time 1502 nodes 10879505 nps 7243346 hashfull 1000 pv e2e4 b8c6 g1f3 g8f6 f1d3 d7d5 e4d5 f6d5 e1g1
info score cp 6 depth 10 seldepth 25 time 9411 nodes 66066608 nps 7020147 hashfull 1000 pv e2e4 e7e5 b1c3 g8f6 g1f3 f8d6 f1c4 b8c6 e1g1 e8g8

If Leorik has exact feature parity with Rustic 1 (which does NOT have PV-move ordering or a hash move ordering; or a TT for that matter), then I can't explain why it is so much faster than Rustic. If you have an explanation, I'd be glad to hear it. Rustic consistently needs less features for the same strength than other engines I test it against, but that doesn't mean there is not a bottleneck somewhere that is holding it back.

lithander · Post by **lithander** » Sat Jan 29, 2022 10:28 pm

mvanthoor wrote: ↑Sat Jan 29, 2022 8:49 pm It runs at about the same speed on your CPU, but it is much newer. Are you using the BMI2-version on a Zen2 CPU perhaps? If you have a Zen2, you'd better run the popcnt version.

I tried old, generic, bmi2 and popcnt version of rustic alpha 1 (had them all on my harddisk, still) and bmi2 and popcnt were a lot faster than old and generic but popcnt seems even faster than BMI2 on my Ryzen 3600 for some reason. Don't ask me why but I made sure I used the fastest of the available versions.

mvanthoor wrote: ↑Sat Jan 29, 2022 8:49 pm I cannot explain why your engine runs at 14M nodes per second. That is ridiculously fast. I also find that to be strange because, according to dangi's test with regard to bitboard implementations, Magic Bitboards is faster than the implementation you have used.

My (staged) move generator runs at 60M nps in perft and when computing the evaluation incrementally it's still 40M nps. So 14M nodes during search didn't seem so fast to me because the search isn't doing much else. Just generating moves, playing them and looking at the eval.

mvanthoor wrote: ↑Sat Jan 29, 2022 8:49 pm But, do you still have staged move generation in the 0.1 version of your engine? Rustic Alpha 1 doesn't have that.

I do generate captures and quiet moves separately! It's hard to "disable" it. Otherwise I wouldn't know in what range to apply the move sorting. It's really written with that kind of 2 phase generation from the get go because I think it makes just a lot of sense to not waste time on generating quiet moves when most of the time you don't need them. I'm willing to accept that as the root cause for the speed difference, though.

mvanthoor wrote: ↑Sat Jan 29, 2022 8:49 pm Rustic 1 counts nodes when it is actually going to do work: so it counts nodes after generating moves. If it cuts nodes off, a node is not counted. (This is true for both a/b and qsearch.) Rustic 3 counts nodes as they are visited: it counts immediately after entering the a/b function and qsearch function. To make sure it does not count a node at depth 0 a second time, it compensates for this by substracting a node just before entering qsearch. Rustic 3 seems to be faster than 1 because it counts visited nodes instead of nodes generating moves, but in practice they're the same speed.

I count visited nodes, too. Sometimes the visit is a short one: just looking whether side to move is in check (not cheap, actually) and then the standpat score of the position might already beat beta so no moves are generated. In all other cases I will also generate moves for each node that I count. So some degree of effort is spend for each "visited" node. Here's the source...

Code: Select all

 private int Evaluate(int depth, int remaining, int alpha, int beta, MoveGen moveGen)
        {
            if (remaining == 0)
                return EvaluateQuiet(depth, alpha, beta, moveGen);

            NodesVisited++;
            if (Aborted)
                return 0;

            BoardState current = Positions[depth];
            BoardState next = Positions[depth + 1];
            int score;
            bool movesPlayed = false;


            for (int i = moveGen.CollectCaptures(current); i < moveGen.Next; i++)
            {
                PickBestMove(i, moveGen.Next);

                if (next.PlayAndUpdate(current, ref Moves[i]))
                {
                    movesPlayed = true;
                    score = -Evaluate(depth + 1, remaining - 1, -beta, -alpha, moveGen);

                    if (score >= beta)
                        return beta;

                    if (score > alpha)
                        alpha = score;
                }
            }
            for (int i = moveGen.CollectQuiets(current); i < moveGen.Next; i++)
            {
                if (next.PlayAndUpdate(current, ref Moves[i]))
                {
                    movesPlayed = true;
                    score = -Evaluate(depth + 1, remaining - 1, -beta, -alpha, moveGen);

                    if (score >= beta)
                        return beta;

                    if (score > alpha)
                        alpha = score;
                }
            }

            //checkmate or draw?
            if (!movesPlayed)
                return current.IsChecked(current.SideToMove) ? Evaluation.Checkmate(depth) : 0;

            return alpha;
        }

        private int EvaluateQuiet(int depth, int alpha, int beta, MoveGen moveGen)
        {
            NodesVisited++;
            if (Aborted)
                return 0;

            BoardState current = Positions[depth];
            BoardState next = Positions[depth + 1];

            bool inCheck = current.IsChecked(current.SideToMove);

            //if inCheck we can't use standPat, need to escape check!
            if (!inCheck)
            {
                int standPatScore = (int)current.SideToMove * current.Eval.Score;

                if (standPatScore >= beta)
                    return beta;

                if (standPatScore > alpha)
                    alpha = standPatScore;
            }

            bool movesPlayed = false;
            for (int i = moveGen.CollectCaptures(current); i < moveGen.Next; i++)
            {
                PickBestMove(i, moveGen.Next);
                if (next.PlayAndUpdate(current, ref Moves[i]))
                {
                    movesPlayed = true;
                    int score = -EvaluateQuiet(depth + 1, -beta, -alpha, moveGen);

                    if (score >= beta)
                        return beta;

                    if (score > alpha)
                        alpha = score;
                }
            }

            if (inCheck)
            {
                for (int i = moveGen.CollectQuiets(current); i < moveGen.Next; i++)
                {
                    if (next.PlayAndUpdate(current, ref Moves[i]))
                    {
                        movesPlayed = true;
                        int score = -EvaluateQuiet(depth + 1, -beta, -alpha, moveGen);

                        if (score >= beta)
                            return beta;

                        if (score > alpha)
                            alpha = score;
                    }
                }

                if (!movesPlayed)
                    return Evaluation.Checkmate(depth);
            }


            //stalemate?
            //if (expandedNodes == 0 && !LegalMoves.HasMoves(position))
            //    return 0;

            return alpha;
        }

Please correct me if this is not the standard way of counting nodes!

The development version of Rustic (and basically every version since 3) reaches depth 10 in 9.4 seconds IF I give it a 1MB TT, so it uses hash table / PV move sorting.
If Leorik has exact feature parity with Rustic 1 (which does NOT have PV-move ordering or a hash move ordering; or a TT for that matter), then I can't explain why it is so much faster than Rustic. If you have an explanation, I'd be glad to hear it. Rustic consistently needs less features for the same strength than other engines I test it against, but that doesn't mean there is not a bottleneck somewhere that is holding it back.

I definitely don't use PV-move ordering or a TT. Actually there's no benefit at all from having an incremental search at the moment except that when I abort due to running out of time I can return the result of the previous search.

I'm conflicted about sharing my code at this point. I have no problem with disclosing the source code or executables for reference purposes to answer questions like "is it really that fast? and why?" but I'm not quite ready to declare everything "open source" in the sense that everyone can just copy it and do their own derivative work. I might decide to open source everything later (e.g. under the MIT license like MinimalChess or GPL or whatever...) but for now, if I make the repo public I'd like to clearly communicate "look, but don't touch!" to the viewer.

Anybody knows a license that does just that?

j.t. · Post by **j.t.** » Sun Jan 30, 2022 12:19 am

If I understand correctly, usually if you don't specify a license, people can basically only look at the code if you put it online. However, by using GitHub you might have to give up some rights. This stackexchange thread suggests that if you put your code publicly on GitHub, people are allowed to fork your code (but probably not allowed to edit it? Idk).

I mean, at the end you probably have to rely on people being honest and fair, because I guess that you won't start a legal battle because of this.

mvanthoor · Post by **mvanthoor** » Sun Jan 30, 2022 12:21 am

lithander wrote: ↑Sat Jan 29, 2022 10:28 pm I tried old, generic, bmi2 and popcnt version of rustic alpha 1 (had them all on my harddisk, still) and bmi2 and popcnt were a lot faster than old and generic but popcnt seems even faster than BMI2 on my Ryzen 3600 for some reason. Don't ask me why but I made sure I used the fastest of the available versions.

The reason is that Ryzen 3xxx and older emulate BMI2 instructions instead of implementing them. Therefore the popcnt version is faster. (It's one of the reasons why I never upgraded to a Ryzen 3xxx. Now it's too late to upgrade to Ryzen 5xxx because 7xxx is coming this year. So I'm probably upgrading somewhere next year.

My (staged) move generator runs at 60M nps in perft and when computing the evaluation incrementally it's still 40M nps. So 14M nodes during search didn't seem so fast to me because the search isn't doing much else. Just generating moves, playing them and looking at the eval.

Completely flabbergasted. How did you get this to run at 60M nps and 14M respectively? You must be NOT doing something in MMC that I am doing in Rustic. Rustic keeps zobrist hashing incrementally, next to two sets of PST's, material, and game phase. It does this in perft as well, because I don't have any provisions to make perft go faster (so I don't do anything special for perft).

I do generate captures and quiet moves separately! It's hard to "disable" it. Otherwise I wouldn't know in what range to apply the move sorting. It's really written with that kind of 2 phase generation from the get go because I think it makes just a lot of sense to not waste time on generating quiet moves when most of the time you don't need them. I'm willing to accept that as the root cause for the speed difference, though.

That may be a difference. You are already doing some staged move generation. Rustic _can_ generate captures and quiet moves seperately, but up until now, it is doing this only for QSearch. In version 6 it will have staged move generation: TT move first, then captures, then quiets. That should speed it up a bunch.

Please correct me if this is not the standard way of counting nodes!

It seems correct at first glance. You count nodes after you go into QSearch; so you either count the node in QSearch itself, or in alpha/beta. But, I see that you indeed do a two-step move generation. That gains a lot of speed. I'll be adding that in version 6. (With the addition of the step for trying the TT-move first if it exists.)

I'm conflicted about sharing my code at this point. I have no problem with disclosing the source code or executables for reference purposes to answer questions like "is it really that fast? and why?" but I'm not quite ready to declare everything "open source" in the sense that everyone can just copy it and do their own derivative work. I might decide to open source everything later (e.g. under the MIT license like MinimalChess or GPL or whatever...) but for now, if I make the repo public I'd like to clearly communicate "look, but don't touch!" to the viewer.

No problem; just share bits and pieces where you feel like it, and open-source all of it when you think it's ready.

Anybody knows a license that does just that?

I don't know a license like that, but you can just make up your own. Why not? You could just create a LICENSE file that says exactly that in a single clause. However, you can't prevent people from copying the source if you make it public. If you feel uncomfortable with that because you deem it to not be good enough yet, then just keep it private and open-source it when you feel it's good enough.

Mike Sherwin · Post by **Mike Sherwin** » Sun Jan 30, 2022 12:48 am

Using incremental update in make/unmake and stripping Bric's search all the way down to just a/b with the only move ordering from the move generator. Example, m->score = (value[board[ts]] << 4) - value[m->type];.

Code: Select all

s32 Search(Thread* t, Move* m, s32 alpha, s32 beta, s32 depth) {
  s32 mi, j, key, score, count, hfs, hts, moveFound;
  u64 n;
  Move* mov;

  // hash table lookup
  moveFound = false;
  key = hashKey & 0xfffff;
/*  if (posHash[key].key == hashKey && posHash[key].sig == hashSig) {
    if (posHash[key].score >= beta && depth <= posHash[key].depth) {
      return posHash[key].score;
    }
    hfs = posHash[key].fs;
    hts = posHash[key].ts;
    moveFound = true;
  }*/

  score = mat[wtm] - mat[1 - wtm] + pos[wtm] - pos[1 - wtm];

  if (depth < 1) return Qsearch(t, m, alpha, beta);

  pvl[ply] = ply;

  n = GenMoves(t, m);

  if (!n) return ILLEGAL;

  // null move
/*  if (score >= beta && depth > 1) {
    hashKey ^= 1ull;
    hashSig ^= 1ull;
    wtm = 1 - wtm;
    score = -Search(t, m + n, -beta - 1, -beta + 1, depth - 3);
    wtm = 1 - wtm;
    hashKey ^= 1ull;
    hashSig ^= 1ull;
    if (score >= beta) return beta;
  }*/

  // shallow search to order moves
/*  if (depth > 3) {
   for (mi = 0; mi < n; mi++) {
      mov = m + mi;
      MakeMove(t, mov);
      UpdateHash(t, mov);
      mov->score = -Search(t, m + n, -beta - 1, -alpha + 1, 1);
      UpdateHash(t, mov);
      TakeBack(t, mov);
    }
  }*/

/*  if (moveFound) {
    for (mi = 0; mi < n; mi++) {
      mov = m + mi;
      if (mov->fs == hfs && mov->ts == hts) {
        mov->score = 1000000;
        break;
      }
    }
  }*/

  count = 0;
  for (mi = 0; mi < n; mi++) {
    Sort(t, m, mi, n);
    mov = m + mi;
    MakeMove(t, mov);
//    UpdateHash(t, mov);
//    if (IncreaseRepCount(t) == 2) {
//      mov->score = 0;
//    }
//    else {
/*      if (depth > 3) {
        if (count && mov->score < alpha) {
          mov->score = -Search(t, m + n, -alpha - 1, -alpha, depth - 3);
          if (mov->score > alpha)
            mov->score = -Search(t, m + n, -beta, -alpha, depth - 1);
        }
        else {
          mov->score = -Search(t, m + n, -beta, -alpha, depth - 1);
        }
      }
      else {
        mov->score = -Search(t, m + n, -beta, -alpha, depth - 1);
      }*/
      mov->score = -Search(t, m + n, -beta, -alpha, depth - 1);
//    }
//    DecreaseRepCount(t);
//    UpdateHash(t, mov);
    TakeBack(t, mov);
    fpv = false;
    if (mov->score == -ILLEGAL) continue;
    count++;
    if (mov->score > alpha) {
      if (mov->score >= beta) {
/*        posHash[key].key = hashKey;
        posHash[key].sig = hashSig;
        posHash[key].score = mov->score;
        posHash[key].depth = depth;
        posHash[key].fs = mov->fs;
        posHash[key].ts = mov->ts;*/
        return beta;
      }
      alpha = mov->score;
      pvs[ply][ply] = *mov;
      for (j = ply + 1; j < pvl[ply + 1]; j++) pvs[ply][j] = pvs[ply + 1][j];
      pvl[ply] = pvl[ply + 1];
    }
    if (clock() > end) return alpha;
  }
  if (!count) {
    if (InCheck[wtm](t, king[wtm])) {
      return -(MATE - ply);
    }
    else {
      return STALEMATE;
    }
  }
  return alpha;
}

Begin Depth 8
8 16 351 37676018 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (3.51 seconds)

With shallow search to score moves enabled.
8 16 228 27740062 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (2.28 seconds)

With TT enabled. (only stores beta cutoffs as of now and only one million slots)
8 16 195 20172105 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (1.95 seconds)

With null move enabled.
8 16 61 5796957 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (0.61 seconds)

With move reduction.
8 16 8 659640 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (0.08 seconds)

lithander · Post by **lithander** » Sun Jan 30, 2022 1:17 am

Mike Sherwin wrote: ↑Sun Jan 30, 2022 12:48 am Using incremental update in make/unmake and stripping Bric's search all the way down to just a/b with the only move ordering from the move generator.

Begin Depth 8
8 16 351 37676018 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (3.51 seconds)

With shallow search to score moves enabled.
8 16 228 27740062 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (2.28 seconds)

With TT enabled. (only stores beta cutoffs as of now and only one million slots)
8 16 195 20172105 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (1.95 seconds)

With null move enabled.
8 16 61 5796957 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (0.61 seconds)

With move reduction.
8 16 8 659640 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (0.08 seconds)

I can't disable features I don't have yet but I can also strip Leorik down more.

With Alpha/Beta, MVV-LVA move sorting and Qsearch:
info depth 8 score cp 8 nodes 17755752 nps 13605940 time 1305 pv e2e4
13.6M NPS, 1.3s

With Alpha/Beta, MVV-LVA but no Qsearch:
info depth 8 score cp -80 nodes 18488281 nps 26003208 time 711 pv b1c3
26M NPS, 0.7s (but the evaluation will suffer from the horizon effect)

With Alpha/Beta played in the order of generation (captures first, then quiets)
info depth 8 score cp -80 nodes 19201618 nps 27275025 time 704 pv b1c3
27M NPS, 0.7s (move ordering doesn't do much but also doesn't cost much.. it's the starting position after all^^)

Without Alpha/Beta it is approaching perft speeds:
info depth 6 score cp -68 nodes 134709990 nps 42589310 time 3163 pv b1c3
42M NPS... and takes forever to reach depth 8.

Mike Sherwin · Post by **Mike Sherwin** » Sun Jan 30, 2022 1:23 am

lithander wrote: ↑Sun Jan 30, 2022 1:17 am
Mike Sherwin wrote: ↑Sun Jan 30, 2022 12:48 am Using incremental update in make/unmake and stripping Bric's search all the way down to just a/b with the only move ordering from the move generator.

Begin Depth 8
8 16 351 37676018 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (3.51 seconds)

With shallow search to score moves enabled.
8 16 228 27740062 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (2.28 seconds)

With TT enabled. (only stores beta cutoffs as of now and only one million slots)
8 16 195 20172105 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (1.95 seconds)

With null move enabled.
8 16 61 5796957 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (0.61 seconds)

With move reduction.
8 16 8 659640 e2e4 e7e5 g1f3 g8f6 f3e5 f6e4 d1h5 e4d6 (0.08 seconds)
I can't disable features I don't have yet but I can also strip Leorik down more.

With Alpha/Beta, MVV-LVA move sorting and Qsearch:
info depth 8 score cp 8 nodes 17755752 nps 13605940 time 1305 pv e2e4
13.6M NPS, 1.3s

With Alpha/Beta, MVV-LVA but no Qsearch:
info depth 8 score cp -80 nodes 18488281 nps 26003208 time 711 pv b1c3
26M NPS, 0.7s (but the evaluation will suffer from the horizon effect)

With Alpha/Beta played in the order of generation (captures first, then quiets)
info depth 8 score cp -80 nodes 19201618 nps 27275025 time 704 pv b1c3
27M NPS, 0.7s (move ordering doesn't do much but also doesn't cost much.. it's the starting position after all^^)

Without Alpha/Beta it is approaching perft speeds:
info depth 6 score cp -68 nodes 134709990 nps 42589310 time 3163 pv b1c3
42M NPS... and takes forever to reach depth 8.

I would say then that you have lowered the bar that the rest of us have to shimmy under!

Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik