Crude, cruder, crudest

Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Crude, cruder, crudest

Post by Lyudmil Tsvetkov »

I used to be a chess fan in the past.

Now I am an STF (Stockfish Testing Framework) fan. Really, it is not my fault: everyone talks only about STF, people test different patches, and the mania is on the rampage. I probably would not have done this had it not been for Robert Tournevisse, who started a thread in the programming section, and Ralph Stoesser, who kindly posted a link to the Stockfish psqt. Any blame for this message of mine should be directed towards them.

Anyway, as I am already an STF maniac (I might visit the site some 10-20 times per day; well, maybe there is more green :) ), I took the effort to write my own values for the Stockfish piece tables. Actually, I filled in all piece and pawn tables apart from the king table: for one thing, Stockfish plays great in the endgame, and for another, tuning king safety in the middlegame is a very difficult task that depends on many other factors.

The filled tables can be downloaded as a doc file here: http://www.freeuploadsite.com/do.php?id=30498. I kept the Stockfish format and just replaced the old values with my own. Some remarks:

- the pawn table takes space advantage into consideration, something that was missing from the older table
- the different piece tables also take space advantage into consideration, which was also largely missing
- the values for knight and bishop are exactly the same. I thought about what to do here, but since, on the one hand, the bishop is a relatively stronger piece deserving bigger values, and, on the other hand, the knight psqt is usually more important, I decided not to overtune everything to a hair and to let both pieces have the same values. Basically, that should be right, as there are no major distinctions.


Perhaps some knowledgeable person could run a test with those values sometime, either locally or globally. However, I am almost certain that all possible tests will fail, as the values will clash with different specifically tuned Stockfish parameters. What comes to my mind is that the pawn table could clash with a space advantage feature, or with some pawns getting a rank bonus, like chain pawns. The piece tables for the minor pieces could clash with outposts and other similar stuff. I made the tables so that they include most relevant parameters in one feature, i.e. the table includes bonus points in terms of file and rank placement, centralisation, space advantage, etc.

As I told you, I accept no blame (but some libel and some outrageous words, as well as calling me names, might pass) :( I would gladly have done something else instead, but, as STF reigns supreme, I was simply caught in the impetuous stream.

I chose the name of the thread because when I looked at different tables, I was left with the impression they are either crude, or very crude. But now, the present table most probably provides the crudest tuning attempt.

Best, Lyudmil
jpqy
Posts: 554
Joined: Thu Apr 24, 2008 9:31 am
Location: Belgium

Re: Crude, cruder, crudest

Post by jpqy »

I'm going to try them out!

It would maybe have been easier just to post the psqtab.h file here..
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Crude, cruder, crudest

Post by Lyudmil Tsvetkov »

Now, if anyone ever tries those values, I would be very happy to hear the result of a local test, even with a small number of games.

The modified tables would have the advantage, but also the disadvantage, of taking too many things into consideration at the same time. For example, general piece placement on files and ranks and centralisation are already in; space advantage is already in; terms like rook on 7th rank, rook on 8th rank, etc., are already in. This means that one can do many things with a single table.

On the other hand, this creates problems with possible redundancies. For example, any patch with those values would not be very successful unless the existing parameters relating to space advantage, outposts, rook on 7th, etc., are tuned to that table. You might have to remove the rook on 7th bonus, adjust outposts, decrease or remove space, etc., in order for the values to work. But of course, the easier decision is to just scrap the table.
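The redundancy described above can be shown with a small Python sketch. All numbers and names here are hypothetical (they come neither from Stockfish nor from the posted tables): a rook table that already rewards the 7th rank is combined with a separate rook-on-7th term, so the same feature is counted twice.

```python
# Hypothetical values in centipawns; NOT taken from Stockfish or the posted tables.
ROOK_RANK_BONUS = [0, 0, 0, 0, 5, 10, 25, 10]  # index 0 = 1st rank, index 6 = 7th rank
ROOK_ON_7TH_TERM = 20  # a separate eval term of the kind discussed above


def rook_psqt(square: int) -> int:
    """Table lookup for a white rook; square 0 = a1, square 63 = h8."""
    rank = square // 8
    return ROOK_RANK_BONUS[rank]


def rook_eval(square: int, separate_term: bool) -> int:
    """Table value, plus the separate rook-on-7th term when it is switched on."""
    score = rook_psqt(square)
    if separate_term and square // 8 == 6:
        score += ROOK_ON_7TH_TERM
    return score


d7 = 6 * 8 + 3  # rank index 6 (7th rank), file index 3 (d-file)
print(rook_eval(d7, separate_term=False))  # table only
print(rook_eval(d7, separate_term=True))   # table plus separate term: counted twice
```

With the separate term left on, the 7th-rank rook gets both bonuses, which is why such a table would need the overlapping terms removed or retuned before a fair test.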

The table might also not be tuned to maybe some 30-40 other eval terms in Stockfish or the Stockfish search parameters. That is why it will probably fail. However, I would insist that the values are more appropriate than the earlier ones, but only if you start from scratch.

Whatever the results (I know they are dismal), I would be happy to hear about some local result, if any, to draw my conclusions.
jpqy
Posts: 554
Joined: Thu Apr 24, 2008 9:31 am
Location: Belgium

Re: Crude, cruder, crudest

Post by jpqy »

Core i7 2670QM, 4 cores, Blitz 5m+0

1  Houdini 4 Pro x64B C0 4c       ½011½11½11  7.5/10
2  Stockfish 050114 64 SSE4.2     ½100½00½00  2.5/10

Sorry, not good..

The default version beats Houdini more and more often..
I wish you good luck with finding better values!

JP.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Crude, cruder, crudest

Post by Lyudmil Tsvetkov »

Thanks Jean-Paul!

The values are optimal, I would say much better than the original ones.
It is just that they are not tuned at all to the whole...

It would be interesting to see how an engine started from scratch would perform with those values.
syzygy
Posts: 5697
Joined: Tue Feb 28, 2012 11:56 pm

Re: Crude, cruder, crudest

Post by syzygy »

Lyudmil Tsvetkov wrote:The values are optimal, would say much better than the original ones.
Of course they are!
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Crude, cruder, crudest

Post by Lyudmil Tsvetkov »

syzygy wrote:
Lyudmil Tsvetkov wrote:The values are optimal, would say much better than the original ones.
Of course they are!
I mean, those are simple things: when the original values score a pawn on the 5th rank higher than a pawn on the 6th, or a queen on the 7th higher than a queen on the 6th, or even score a knight on the 7th negatively, you quickly see that something is really wrong with the values and that they do not reflect general chess knowledge.

Miguel is right, and you understand that quite well, that in a pool of terms even a term whose sign is the opposite of what its real meaning would suggest could score well. It is the sum total that counts, and not the separate values.

However, on many occasions, if you do not follow general chess rules, I think this is more like guesswork. I would prefer to have things straight, with clear eval parameters based on general chess knowledge. This simplifies things a lot.

I would bet, for example, that Stockfish with those psqt values and with space removed, outposts removed, and the bonuses for rook and queen on the 7th and 8th ranks removed would perform much better than Jean-Paul's first sample, which had no other modifications, suggests. It would still be weaker than the original Stockfish, because mobility and other parameters would still be out of tune with the modified psqt, but it would perform better than the first sample.

I think it is good to have outposts, space, rook on 7th, etc., in one and the same table, don't you think so?
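A check of this kind can be sketched in a few lines of Python. This is only an illustration: the per-rank numbers below are made up, and the function names are hypothetical rather than anything from Stockfish. It averages a 64-entry table by rank and reports where the value drops as the piece advances.

```python
def rank_averages(table):
    """Average value per rank for a 64-entry table (index 0 = a1, index 63 = h8)."""
    return [sum(table[r * 8:(r + 1) * 8]) / 8 for r in range(8)]


def decreasing_ranks(table, first_rank=2, last_rank=7):
    """Return 1-based (rank, next_rank) pairs where the average drops on advancing."""
    avg = rank_averages(table)
    return [(r, r + 1) for r in range(first_rank, last_rank) if avg[r] < avg[r - 1]]


# Made-up pawn values per rank (1st to 8th), repeated across all eight files.
per_rank = [0, 0, 5, 10, 20, 15, 30, 0]
pawn_table = [value for value in per_rank for _ in range(8)]

print(decreasing_ranks(pawn_table))  # ranks where an advanced pawn scores lower
```

A flagged pair is not proof that a table is wrong, since other eval terms may compensate; it only points at entries worth a second look.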
mohzus
Posts: 106
Joined: Tue Sep 24, 2013 2:54 am

Re: Crude, cruder, crudest

Post by mohzus »

I've run a very short test (in terms of number of games). It does not look promising for these values, but the statistical noise is surely high.
The result is 7 wins, 22 losses and 18 draws for the modified SF against the original one.
I used exactly the same setup as fishtest, except for the time control, which I adjusted to 29+0.05 (this corresponds to roughly 17 seconds per game + 0.05 seconds per move in fishtest).
The command I used is

./cutechess-cli -repeat -rounds 100 -tournament gauntlet -pgnout results.pgn -resign movecount=3 score=400 -draw movenumber=34 movecount=8 score=20 -concurrency 1 -openings file=8moves_v3.pgn format=pgn order=random plies=16 -engine name=stockfishmaster cmd=./stockfishmaster option.Hash=128 option.OwnBook=false -engine name=sfmodified cmd=./sfmodified option.Hash=128 option.OwnBook=false -each proto=uci option.Threads=1 tc=29+0.05
The games can be found at http://speedy.sh/3YECu/results.pgn.
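To put a rough number on the noise in such a short match, here is a minimal Python sketch. It assumes the usual logistic Elo model and a normal approximation for the score, which is crude for so few games; the function names are my own and nothing here comes from fishtest or cutechess-cli.

```python
import math


def score_to_elo(score: float) -> float:
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_interval(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) from a normal approximation on the score.

    Assumes score - margin and score + margin both stay inside (0, 1).
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * score ** 2) / games
    margin = z * math.sqrt(variance / games)
    return (score_to_elo(score),
            score_to_elo(score - margin),
            score_to_elo(score + margin))


elo, low, high = elo_with_interval(7, 22, 18)
print(f"{elo:+.0f} Elo, 95% interval roughly {low:+.0f} to {high:+.0f}")
```

With only 47 games the interval is very wide, so the point estimate should be read with caution.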

Lyudmil, I do not want to discourage you with these results.
Here is some information that I think you've read, but I put it here because I think it is relevant. Someone basically set all the psqt to 0, one by one, and reported the difference in Elo based on very short but numerous games. The results are the following:
SF results wrote:
10000 games per test, +/-5 Elo, very fast TC

PST      Elo    W - L - D             Score    Games
Pawn      -1    2029 - 2047 - 5924    [0.499]  10000
Knight   -20    1875 - 2445 - 5680    [0.471]  10000
Bishop    -4    1945 - 2051 - 6004    [0.495]  10000
Rook      -2    1934 - 1981 - 6085    [0.498]  10000
Queen     -2    1899 - 1950 - 6151    [0.497]  10000
King     -32
With this in mind, we see that removing the King psqt costs SF 32 Elo. My guess is that we should not try to modify this table, or if we do, only very slightly, because it seems well tuned and any change to it would likely be harmful. The same goes for the knight psqt: I'd be afraid to try "random" numbers there. However, when it comes to the pawn, queen, bishop and rook psqt, maybe there are some Elo points to grab... but then again, it's not a certainty.
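For reference, the Elo column in the quoted results can be recomputed from the win, loss and draw counts. The sketch below assumes the standard logistic Elo formula and that the three numbers in each row are wins, losses and draws in that order (this reading matches the quoted scores); the function name is my own.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match result under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


# Rows copied from the quoted results (the King row has no game counts).
results = {
    "Pawn": (2029, 2047, 5924),
    "Knight": (1875, 2445, 5680),
    "Bishop": (1945, 2051, 6004),
    "Rook": (1934, 1981, 6085),
    "Queen": (1899, 1950, 6151),
}

for piece, (w, l, d) in results.items():
    print(f"{piece}: {elo_from_wld(w, l, d):+.1f}")
```

Rounded to whole numbers, these agree with the quoted Elo column.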

P.S.:Lyudmil, I think you... err I mean WE, suffer from SF syndrome :D
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Crude, cruder, crudest

Post by Lyudmil Tsvetkov »

Robert, thanks, you made me laugh for the first time today with the reference to the syndrome. :)

I like people who provide me with useful information. I do not read everything, and many things only between the lines (as I must write psqt) :( . How do I read the data about the Elo impact of the different piece and pawn tables? Very simple:

- king table written very well (and that was my general impression, confirmed by the fact that Stockfish attacks well and plays good endgames)
- knight table written well
- bishop table written poorly
- rook table written very badly (and that was my general impression)
- queen table written similarly very badly (also my impression from the values, but also from some dubious Stockfish queen moves)
- pawn table written beyond any level of critique: you cannot get just one Elo from the pawn table, and my impression really was that the pawn table contains some very dubious values, if not outright contradictory ones

Conclusion: we have 2 good tables and 4 bad tables, a fertile ground for maniacs like us.

Thanks again for the test, you are simply great. I cannot help because I cannot read code properly, but if you decide to run another test with the modified values, I would suggest that you simultaneously turn off space, outposts, and the bonuses for rook and queen on the 7th and 8th ranks. Then you will still have to adjust many other parameters, mobility in particular, to be able to reach the original Stockfish level.

I do not know if introducing all those changes is feasible or desirable at all. One thing I know for sure, however, is that the values contained in the modified tables represent significantly better fundamental chess knowledge principles.
arjuntemurnikar
Posts: 204
Joined: Tue Oct 15, 2013 10:22 pm
Location: Singapore

Re: Crude, cruder, crudest

Post by arjuntemurnikar »

Your patch tested against current stockfish master:

bench:7079249

TC=2"+0.02
234 - 557 - 409  [0.365] 1200
ELO difference: -96

TC=5"+0.05
33 - 72 - 95  [0.403] 200
ELO difference: -65
Quite frankly, nope.