New engine announcement: Blunder 1.0 (And some questions concerning improvement)

mvanthoor · Post by **mvanthoor** » Fri Jul 02, 2021 8:27 pm

algerbrex wrote: ↑Fri Jul 02, 2021 6:24 pm I agree with your estimate that I should at least be around ~1800 and be able to beat ~1600 rated engines, but of course my engine is playing more around ~1300.

If you have alpha/beta, MVV-LVA, Rustic's PST's, and a transposition table with TT-move ordering, then you should be at least 1700. (I'm taking the possibility of an engine that is about 3x as slow as Rustic into account.)

So an Elo gap of about ~500 makes me think there's a bug somewhere in my code.

First remove everything except MVV-LVA and Rustic's PST's, and try to equalize with MinimalChess 0.3. That should be possible. If not, you either have bugs, or your engine is really slow.

Which is a little frustrating, but just another setback. But a gap of 500-600 Elo seems like quite a lot, so I may just start from the beginning again with negamax, alpha-beta, and piece square tables, and go from there, testing my engine's Elo at each stage. Because even with all of the features I have now, my engine is very, very weak.

Indeed. It is worth it to test every feature and get as much Elo from it as you possibly can, because going back and doing that again in an engine that has a bazillion features is very annoying and very time consuming. Even more time-consuming than doing it now. You have to do it per feature, because some features stack on top of one another. For example: add X = +40 Elo. Add Y = +40 Elo. But, if you add BOTH X and Y, the combined result could be +100 instead of +80. This means that if feature X is not optimal, feature Y will also not be optimal.

If you have a bug in your TT, that can easily cost you 100 Elo, if you didn't know. Add a TT, gain +50 Elo. Nice... as long as you don't know that a TT can gain you +150 Elo. That's 100 Elo you're missing forever.
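To measure what a single feature is worth, the usual approach is to play a match between the version with the feature and the version without it, then convert the score into an Elo difference with the standard logistic formula. The sketch below is only an illustration; the function name and the example results are made up, not taken from any engine mentioned here.

```python
import math

def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match result, from the first player's view."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    # Fraction of points scored; a draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    # Inverse of the expected-score formula E = 1 / (1 + 10^(-diff/400)).
    return -400.0 * math.log10(1.0 / score - 1.0)

# Hypothetical match: new version scores 75 wins and 25 losses.
print(round(elo_diff(75, 0, 25), 1))
```

A small match gives a noisy estimate, so a gain of a few dozen Elo generally needs hundreds of games or more before it can be trusted.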

mvanthoor · Post by **mvanthoor** » Fri Jul 02, 2021 8:31 pm

lithander wrote: ↑Fri Jul 02, 2021 5:55 pm This is the pure PST based eval you're probably interested in: https://github.com/lithander/MinimalChe ... luation.cs

That is the set of PST's to build a tapered evaluation.

Should you be interested in Rustic's untapered PST's (which were used by MinimalChess for some time during development), they can be found here:

https://github.com/mvanthoor/rustic/blo ... on/psqt.rs

You will also need a material count (in Alpha 2, these were not yet worked into the PST's):

https://github.com/mvanthoor/rustic/blo ... on/defs.rs
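For anyone unsure about the difference: an untapered evaluation uses one PST per piece, while a tapered evaluation keeps a middlegame and an endgame score and blends them by how much material is left. A minimal sketch of that blending, assuming a common phase-weight scheme; the weights and names here are illustrative and are not taken from Rustic or MinimalChess:

```python
from typing import Dict

# Hypothetical phase weights per piece type (pawns and kings do not count).
PHASE_WEIGHT = {"N": 1, "B": 1, "R": 2, "Q": 4}
# Phase of the starting position: 4 knights + 4 bishops + 4 rooks + 2 queens.
MAX_PHASE = 4 * 1 + 4 * 1 + 4 * 2 + 2 * 4  # 24

def game_phase(piece_counts: Dict[str, int]) -> int:
    """Return MAX_PHASE with all pieces on the board, 0 with none left."""
    phase = sum(PHASE_WEIGHT.get(piece, 0) * n for piece, n in piece_counts.items())
    # Promotions can push the sum above the starting value, so clamp it.
    return min(phase, MAX_PHASE)

def tapered(mg_score: int, eg_score: int, phase: int) -> int:
    """Linear blend of the middlegame and endgame scores by game phase."""
    blended = mg_score * phase + eg_score * (MAX_PHASE - phase)
    return round(blended / MAX_PHASE)

# Both sides combined, halfway through the material: 2 rooks and 2 queens left.
phase = game_phase({"R": 2, "Q": 2, "P": 10})
print(phase, tapered(100, 20, phase))
```

With untapered PST's the same table value is used in every phase, which is simpler to get right first.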

algerbrex · Post by **algerbrex** » Fri Jul 02, 2021 8:40 pm

mvanthoor wrote: ↑Fri Jul 02, 2021 8:27 pm
algerbrex wrote: ↑Fri Jul 02, 2021 6:24 pm I agree with your estimate that I should at least be around ~1800 and be able to beat ~1600 rated engines, but of course my engine is playing more around ~1300.
If you have alpha/beta, MVV-LVA, Rustic's PST's, and a transposition table with TT-move ordering, then you should be at least 1700. (I'm taking the possibility of an engine that is about 3x as slow as Rustic into account.)

So an Elo gap of about ~500 makes me think there's a bug somewhere in my code.
First remove everything except MVV-LVA and Rustic's PST's, and try to equalize with MinimalChess 0.3. That should be possible. If not, you either have bugs, or your engine is really slow.

Which is a little frustrating, but just another setback. But a gap of 500-600 Elo seems like quite a lot, so I may just start from the beginning again with negamax, alpha-beta, and piece square tables, and go from there, testing my engine's Elo at each stage. Because even with all of the features I have now, my engine is very, very weak.
Indeed. It is worth it to test every feature and get as much Elo from it as you possibly can, because going back and doing that again in an engine that has a bazillion features is very annoying and very time consuming. Even more time-consuming than doing it now. You have to do it per feature, because some features stack on top of one another. For example: add X = +40 Elo. Add Y = +40 Elo. But, if you add BOTH X and Y, the combined result could be +100 instead of +80. This means that if feature X is not optimal, feature Y will also not be optimal.

If you have a bug in your TT, that can easily cost you 100 Elo, if you didn't know. Add a TT, gain +50 Elo. Nice... as long as you don't know that a TT can gain you +150 Elo. That's 100 Elo you're missing forever.

Thanks, I'm currently in the process of doing exactly what you're saying, and I already discovered my first bug. My MvvLva was screwed up, and so good, winning captures weren't being ordered first.
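For reference, the ordering MVV-LVA is supposed to produce can be sketched as follows. This is not Blunder's code; the piece ranks and function names are assumptions chosen for illustration. The victim dominates the score and the attacker only breaks ties, so a pawn taking a queen sorts first and a queen taking a pawn sorts last.

```python
from typing import List, Tuple

# Hypothetical piece ranks; only their relative order matters here.
PIECE_RANK = {"P": 1, "N": 2, "B": 3, "R": 4, "Q": 5, "K": 6}

def mvv_lva(attacker: str, victim: str) -> int:
    """Most Valuable Victim first, Least Valuable Attacker as the tie-break."""
    # The factor 10 exceeds the largest attacker rank, so the victim always dominates.
    return PIECE_RANK[victim] * 10 - PIECE_RANK[attacker]

def order_captures(captures: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (attacker, victim) pairs so the most promising capture is searched first."""
    return sorted(captures, key=lambda c: mvv_lva(c[0], c[1]), reverse=True)

captures = [("Q", "P"), ("P", "Q"), ("R", "N"), ("P", "P")]
print(order_captures(captures))
```

A quick check like this on a handful of hand-picked captures is a cheap way to catch a swapped attacker and victim index.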

As far as speed goes, I don't think that's the issue. I'm not sure how fast Rustic is exactly, but as a baseline, Blunder calculates perft(4) for kiwipete in 0.268 seconds. Here's the output:

total nodes: 4085603
ms: 268ms

Now, that is with bulk-node counting. Without it, Blunder takes just under a second, which of course is still quite slow, but, it seems to me, not at all so slow to explain ~500 missing points of Elo.
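For readers who haven't seen it, bulk counting means that at depth 1 perft returns the number of legal moves directly instead of making each move and recursing to depth 0. It only works with a fully legal move generator. The sketch below shows the idea on a toy game (taking 1 to 3 stones from a pile) standing in for a chess move generator; none of this is Blunder's code.

```python
from typing import List

def legal_moves(pile: int) -> List[int]:
    """Toy move generator: take 1, 2, or 3 stones, never more than remain."""
    return [take for take in (1, 2, 3) if take <= pile]

def perft(pile: int, depth: int, bulk: bool = True) -> int:
    """Count leaf nodes of the move tree down to the given depth."""
    if depth == 0:
        return 1
    moves = legal_moves(pile)
    if bulk and depth == 1:
        # Bulk counting: every legal move is exactly one leaf, so skip make/unmake.
        return len(moves)
    # Otherwise play each move and count the leaves below it.
    return sum(perft(pile - take, depth - 1, bulk) for take in moves)

# Both variants must agree; only the amount of work differs.
print(perft(10, 4, bulk=True), perft(10, 4, bulk=False))
```

For scale, 4,085,603 nodes in 268 ms works out to roughly 15 million nodes per second with bulk counting.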

So I imagine you're correct in that I have very nasty, Elo-depleting bugs scattered throughout my code somewhere. So I'm going to let equalizing with MinimalChess 0.3 be my goal right now, just with MvvLva and PST like you mentioned.

amanjpro · Post by **amanjpro** » Fri Jul 02, 2021 11:32 pm

Congrats on your new engine

If you think Blunder can play a whole game without disconnecting or segfaulting, you may want to register it to participate in ZaTour: https://zatour.amanj.me

Registration form:

https://docs.google.com/forms/u/1/d/1po ... PObTU/edit

Granted, it will probably be the weakest in the Promotion League

algerbrex · Post by **algerbrex** » Sat Jul 03, 2021 8:06 am

amanjpro wrote: ↑Fri Jul 02, 2021 11:32 pm Congrats on your new engine

If you think Blunder can play a whole game without disconnecting or segfaulting, you may want to register it to participate in ZaTour: https://zatour.amanj.me

Registration form:

https://docs.google.com/forms/u/1/d/1po ... PObTU/edit

Granted, it will probably be the weakest in the Promotion League

Thanks for the very kind offer, but as I realized today that Blunder has some very serious bugs causing huge Elo drops, I'll hold off from any competition for now and focus on getting the basics down. But hopefully within the next week or two I'll be registering something I'm happy with!

hgm · Post by **hgm** » Sat Jul 03, 2021 11:18 am

algerbrex wrote: ↑Fri Jul 02, 2021 6:24 pm So an Elo gap of about ~500 makes me think there's a bug somewhere in my code. Which is a little frustrating, but just another setback. But a gap of 500-600 Elo seems like quite a lot, so I may just start from the beginning again with negamax, alpha-beta, and piece square tables, and go from there, testing my engine's Elo at each stage. Because even with all of the features I have now, my engine is very, very weak.

With such a large under-performance it should be easy to spot positions where the engine blunders. Then you could simply figure out how it came to choose the move it did by having it print out info from the search tree (in particular the moves it searches and their score in some nodes of interest). That will tell you where it went wrong, and why.

Since TT is a non-essential feature that can easily be switched off, it would be best to first run without TT, and make sure everything else is bug-free. That will also simplify debugging. (Which otherwise will have to deal with the situation where you get an obviously wrong score from a TT hit.)
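A minimal sketch of the kind of root-level printout meant here, using plain negamax with alpha-beta and no TT. The game tree is a toy nested list rather than a chess position, and all names are made up for illustration. Each root move is searched with a full window so the printed scores are exact values rather than bounds.

```python
from typing import List, Tuple, Union

Tree = Union[int, List["Tree"]]  # a leaf is a static score, a list is a node
INF = 10**9

def negamax(node: Tree, alpha: int, beta: int) -> int:
    """Negamax with alpha-beta; leaf scores are from the side to move's view."""
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff: the opponent will avoid this line
    return best

def search_root(root: List[Tree], trace: bool = False) -> Tuple[int, int]:
    """Return (index of best root move, its score), optionally printing each score."""
    best_index, best_score = -1, -INF
    for i, child in enumerate(root):
        # Full window per root move, so the trace shows exact scores.
        score = -negamax(child, -INF, INF)
        if trace:
            print(f"root move {i}: score {score}")
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score

toy_tree = [[3, 5], [2, 9], [0, 7]]
print(search_root(toy_tree, trace=True))
```

In a real engine the same trace would print the move in algebraic notation, and the full-window root search would be a debug-only mode since it is slower than the normal search.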

lithander · Post by **lithander** » Sat Jul 03, 2021 1:10 pm

hgm wrote: ↑Sat Jul 03, 2021 11:18 am Since TT is a non-essential feature that can easily be switched off, it would be best to first run without TT, and make sure everything else is bug-free. That will also simplify debugging. (Which otherwise will have to deal with the situation where you get an obviously wrong score from a TT hit.)

Temporarily disabling the TT should also not prevent you from equalizing with MinimalChess 0.3 because it didn't use a TT before the current version 0.5.

algerbrex · Post by **algerbrex** » Sat Jul 03, 2021 5:18 pm

hgm wrote: ↑Sat Jul 03, 2021 11:18 am
algerbrex wrote: ↑Fri Jul 02, 2021 6:24 pm So an Elo gap of about ~500 makes me think there's a bug somewhere in my code. Which is a little frustrating, but just another setback. But a gap of 500-600 Elo seems like quite a lot, so I may just start from the beginning again with negamax, alpha-beta, and piece square tables, and go from there, testing my engine's Elo at each stage. Because even with all of the features I have now, my engine is very, very weak.
With such a large under-performance it should be easy to spot positions where the engine blunders. Then you could simply figure out how it came to choose the move it did by having it print out info from the search tree (in particular the moves it searches and their score in some nodes of interest). That will tell you where it went wrong, and why.

Since TT is a non-essential feature that can easily be switched off, it would be best to first run without TT, and make sure everything else is bug-free. That will also simplify debugging. (Which otherwise will have to deal with the situation where you get an obviously wrong score from a TT hit.)

Right, I'm going to run it back through some games to find some positions where it obviously blunders and see if I can hunt down where it's going wrong and why.

I'm sure the TT had some bugs with how much Elo my engine was missing, but even switching that off and only using alpha-beta, MvvLva, and good piece-square tables (the ones from Rustic that Minimal Chess 0.3 was using), my engine still has a hard time beating Minimal Chess 0.3, so I've just started from scratch with everything.

Since you're obviously very knowledgeable about this subject, could you give me any estimates of the Elo of my engine I should expect at the following points since as I said, I plan to start completely from the beginning and test each and every feature I'm adding to make sure the Elo is increasing:

Pure negamax and alpha-beta
MvvLva
Iterative deepening
Piece-square tables
Quiescence search

algerbrex · Post by **algerbrex** » Sat Jul 03, 2021 5:23 pm

lithander wrote: ↑Sat Jul 03, 2021 1:10 pm
hgm wrote: ↑Sat Jul 03, 2021 11:18 am Since TT is a non-essential feature that can easily be switched off, it would be best to first run without TT, and make sure everything else is bug-free. That will also simplify debugging. (Which otherwise will have to deal with the situation where you get an obviously wrong score from a TT hit.)
Temporarily disabling the TT should also not prevent you from equalizing with MinimalChess 0.3 because it didn't use a TT before the current version 0.5.

Right. I think the TT usage definitely had bugs, but it's not the smoking gun, since Blunder, when stripped to the same point as Minimal Chess 0.3, still doesn't do too well. Which has left me totally stumped, as I'm quite confident the core of the engine (move generator, board representation, etc.) works correctly, since I made sure to take the time to test it quite extensively.

The only idea I have right now is that I see Minimal Chess 0.3 had a couple of features I had taken away, most notably iterative deepening and quiescence search. How much do you think those affected the Elo of Minimal Chess at that stage?
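For context, iterative deepening is just a loop around the fixed-depth search: search depth 1, then 2, and so on, keeping the result of the last completed depth until time runs out. A rough sketch, assuming a hypothetical `search(depth)` callable that returns a (move, score) pair; a real engine would also abort in the middle of an iteration, which is left out here.

```python
import time
from typing import Callable, Optional, Tuple

Result = Tuple[str, int]  # (best move, score)

def iterative_deepening(search: Callable[[int], Result],
                        max_depth: int,
                        budget_s: float) -> Optional[Result]:
    """Run search(1), search(2), ... and return the last completed result."""
    deadline = time.monotonic() + budget_s
    best: Optional[Result] = None
    for depth in range(1, max_depth + 1):
        best = search(depth)  # depth 1 always completes, so a move is always available
        if time.monotonic() >= deadline:
            break  # out of time: keep the result of the depth just finished
    return best

# Stand-in search function that only records which depths were requested.
requested = []
def fake_search(depth: int) -> Result:
    requested.append(depth)
    return ("e2e4", depth)

print(iterative_deepening(fake_search, max_depth=3, budget_s=60.0), requested)
```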

emadsen · Post by **emadsen** » Sat Jul 03, 2021 5:33 pm

algerbrex wrote: ↑Sat Jul 03, 2021 5:18 pm Could you give me any estimates of the Elo of my engine I should expect at the following points since as I said, I plan to start completely from the beginning and test each and every feature I'm adding to make sure the Elo is increasing:

Pure negamax and alpha-beta

MvvLva

Iterative deepening

Piece-square tables

Quiescence search

These blog posts may be of interest to you. The table at the bottom of each post records the Elo gain I measured for each feature. The features in each table link to a blog post where I explain how I implemented the feature, usually with a mix of expository writing and code snippets.

Keep in mind, though, chess engine features combine in a non-linear manner. Sometimes the sum is worth more than the parts. Other times the sum is worth less than the parts. Also, if you implement features in a different order than I did, you may measure different gains.
