AlphaZero Chess is not that strong ...

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: AlphaZero Chess is not that strong ...

Post by cdani »

hgm wrote:This is surely weird, and suggests that Stockfish gets weaker with more hash.
Is it possible that testing of Stockfish is mostly done with little hash? If so, the engine is tuned for it.
syzygy
Posts: 5728
Joined: Tue Feb 28, 2012 11:56 pm

Re: AlphaZero Chess is not that strong ...

Post by syzygy »

hgm wrote:This is surely weird, and suggests that Stockfish gets weaker with more hash.

Note that in my graph the load factor is not defined w.r.t. hashfull, but w.r.t. node count, which should be much higher, as many nodes are for the same position.

I am also not sure whether Stockfish counts hash cutoffs as nodes or not.
Stockfish counts calls to do_move(), so hash cutoffs are counted.

SF only tries singular extensions for TT moves, so if a position is not in the TT, it will not check if a TT move is a singular move and will not extend the search if it would have been. Perhaps this explains some of what is being observed.
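To make the TT dependence concrete, here is a rough sketch of a singular-extension check of the kind described above; probe_tt(), search(), the margin and the depth limits are illustrative stand-ins, not the actual Stockfish code or values.

Code: Select all

// Hedged sketch: how a singular-extension decision depends on a TT move.
// probe_tt(), search(), the margin and depth limits are stand-ins for
// illustration only, not the real Stockfish API.
#include <cstdint>

struct TTEntry { int move; int value; int depth; bool lower_bound; };

TTEntry* probe_tt(std::uint64_t key);                          // stand-in
int search(int alpha, int beta, int depth, int excluded_move); // stand-in

int singular_extension(std::uint64_t key, int depth)
{
    TTEntry* tte = probe_tt(key);
    if (!tte || !tte->lower_bound || tte->depth < depth - 3 || depth < 8)
        return 0;   // no usable TT move -> the extension is never even tried

    // Search all moves except the TT move at reduced depth, with a window
    // just below the TT score. If nothing comes close, the TT move is
    // "singular" and its search gets extended by one ply.
    int singular_beta = tte->value - 2 * depth;
    int v = search(singular_beta - 1, singular_beta, depth / 2, tte->move);
    return v < singular_beta ? 1 : 0;
}

With a tiny hash the entry for the current position is often already overwritten, so the first branch fires and the extension is silently skipped.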
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: AlphaZero Chess is not that strong ...

Post by Laskos »

syzygy wrote:
hgm wrote:This is surely weird, and suggests that Stockfish gets weaker with more hash.

Note that in my graph the load factor is not defined w.r.t. hashfull, but w.r.t. node count, which should be much higher, as many nodes are for the same position.

I am also not sure whether Stockfish counts hash cutoffs as nodes or not.
Stockfish counts calls to do_move(), so hash cutoffs are counted.

SF only tries singular extensions for TT moves, so if a position is not in the TT, it will not check if a TT move is a singular move and will not extend the search if it would have been. Perhaps this explains some of what is being observed.
Hmm.. that would mean that time-to-depth is not the exact measure to get the total effect of hash size for SF, only time-to-strength, right?

First, I think the Shredder GUI restarts the engine for each position tested, and it counts the time to initialize the hash tables in the time used, so my earlier result was probably misleading. I took another approach, playing actual games to depth=21 to see the difference between hash = 1 MB and a close-to-optimal hash = 128 MB. Two observations: time-to-depth in the optimal case is about 20% lower, and with LOS = 99.0% it seems the fixed depth=21 results are not equal for different hash sizes (meaning that time-to-depth is indeed not the exact measure in SF's case), although I would have liked a bit more confidence.

Code: Select all

Games Completed = 1000 of 1000 (Avg game length = 185.349 sec)
Settings = Gauntlet/0MB/100000ms per move/M 400cp for 3 moves, D 80 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 47002 sec elapsed, 0 sec remaining
 1.  SF 1 MB                  	483.5/1000	84-117-799  	(L: m=0 t=0 i=0 a=117)	(D: r=543 i=38 f=0 s=1 a=217)	(tpm=1885.0 d=21.00 nps=1804812)
 2.  SF 128 MB                	516.5/1000	117-84-799  	(L: m=0 t=0 i=0 a=84)	 (D: r=543 i=38 f=0 s=1 a=217)	(tpm=1523.1 d=21.00 nps=1820235)
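As a side check, the LOS = 99.0% figure can be reproduced from the win/loss counts alone, assuming the usual normal approximation LOS = Phi((W - L) / sqrt(W + L)):

Code: Select all

// Reproducing the LOS figure from the 117-84-799 result above.
#include <cmath>
#include <cstdio>

int main()
{
    double wins = 117.0, losses = 84.0;   // draws drop out of the LOS formula
    double z    = (wins - losses) / std::sqrt(wins + losses);
    double los  = 0.5 * (1.0 + std::erf(z / std::sqrt(2.0)));
    std::printf("z = %.2f, LOS = %.1f%%\n", z, 100.0 * los);   // ~2.33, ~99.0%
    return 0;
}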
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero Chess is not that strong ...

Post by hgm »

So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: AlphaZero Chess is not that strong ...

Post by Laskos »

hgm wrote:So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
No. That was the result at the same fixed depth=21. So, in total (strength at fixed time), it seems to lose about 20% in time-to-depth + 11 Elo points at fixed depth.
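The ~11 Elo follow directly from the 516.5/1000 fixed-depth score via the standard logistic model:

Code: Select all

// Converting the fixed-depth score into an Elo difference,
// elo = -400 * log10(1/score - 1).
#include <cmath>
#include <cstdio>

int main()
{
    double score = 516.5 / 1000.0;
    double elo   = -400.0 * std::log10(1.0 / score - 1.0);
    std::printf("Elo difference at fixed depth: %.1f\n", elo);   // ~11.5
    return 0;
}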
Leo
Posts: 1104
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: AlphaZero Chess is not that strong ...

Post by Leo »

I looked up the world's strongest chess playing entity and found this:

Komodo -- the brainchild of Don Dailey (who died in November of 2013), GM Larry Kaufman, and Mark Lefler -- is now universally recognized as the strongest chess-playing entity on the planet.

Not knocking Komodo, it's just what popped up first in the search.
Advanced Micro Devices fan.
Eelco de Groot
Posts: 4671
Joined: Sun Mar 12, 2006 2:40 am
Full name:   Eelco de Groot

Re: AlphaZero Chess is not that strong ...

Post by Eelco de Groot »

syzygy wrote:
hgm wrote:This is surely weird, and suggests that Stockfish gets weaker with more hash.

Note that in my graph the load factor is not defined w.r.t. hashfull, but w.r.t. node count, which should be much higher, as many nodes are for the same position.

I am also not sure whether Stockfish counts hash cutoffs as nodes or not.
Stockfish counts calls to do_move(), so hash cutoffs are counted.

SF only tries singular extensions for TT moves, so if a position is not in the TT, it will not check if a TT move is a singular move and will not extend the search if it would have been. Perhaps this explains some of what is being observed.
It is not so relevant for the question of strength, but the phenomenon of sometimes getting better tactical results with a small hash table is well known. It applies to engines without singular extensions as well, but you are probably correct, Ronald, that SE in PV nodes strengthens the effect. A possible explanation, for me, is that with the added overwriting the PV does not get the time to grow strong quickly; the score is lower, especially when you would apply a singular extension otherwise but cannot find the move, so other moves have a better chance. Or: with enough hash, the first twenty iterations are done very quickly and they all fit in the hash table, but any time an entry is not found the engine has to start searching again. This is a form of IID, and any new internal search might find transpositions from other places that can push the search over the horizon with what Robert Hyatt called "grafting". With more IID the tactical results would in theory improve, but there is less time for pushing the PV over the horizon, so the Elo drops...
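For readers who have not met the mechanism referred to here: internal iterative deepening amounts to something like the sketch below (probe_tt() and search() are stand-ins for illustration, not any particular engine's code). When the hash entry has been overwritten, a reduced-depth search of the same node is run first, and whatever it stores in the TT then guides the full-depth search.

Code: Select all

// Hedged sketch of internal iterative deepening (IID).
#include <cstdint>

struct TTEntry { int move; int value; int depth; };

TTEntry* probe_tt(std::uint64_t key);                          // stand-in
int search(std::uint64_t key, int alpha, int beta, int depth); // stand-in

// Returns the move to try first at this node, using IID when the TT is empty.
int move_to_try_first(std::uint64_t key, int alpha, int beta, int depth)
{
    TTEntry* tte = probe_tt(key);
    if (!tte && depth >= 6)
    {
        // No TT move (e.g. the entry was overwritten in a small table):
        // search this node at reduced depth first. The result lands in the
        // TT and may pick up transpositions stored elsewhere, "grafting"
        // deeper results into this subtree.
        search(key, alpha, beta, depth - 2);
        tte = probe_tt(key);
    }
    return tte ? tte->move : 0;   // 0 = no hint, fall back to static ordering
}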
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero Chess is not that strong ...

Post by hgm »

Laskos wrote:
hgm wrote:So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
No. That was the result at the same fixed depth=21. So, in total (strength at fixed time), it seems to lose about 20% in time-to-depth + 11 Elo points at fixed depth.
So tpm means 'time per move', and Stockfish was using 1.5 or 1.8 sec/move? This contradicts the results you posted before, where it took even longer to reach d=21 with 128MB than with 1MB.

At 1.8Mnps that would be 2.7M nodes/search, requiring 27MB to store the entire tree even if every node is different. So isn't 128MB far too large? One would not expect a significant difference between 128MB and 8MB hash under these conditions.

Testing with games at fixed depth could be a bit tricky, as the time per move could vary wildly depending on the game phase. So it is difficult to conclude how much the TT was overloaded in the decisive phase of the game from the average time per move of the entire game.
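The 27MB figure is straightforward to reproduce; a small calculation assuming roughly 10 bytes of TT storage per stored position (the figure the numbers above imply):

Code: Select all

// Back-of-the-envelope tree size per search, as in the estimate above.
#include <cstdio>

int main()
{
    double nps              = 1.8e6;   // from the nps column in the results
    double seconds_per_move = 1.5;     // roughly the tpm of the 128 MB run
    double bytes_per_entry  = 10.0;    // assumed TT storage per position

    double nodes = nps * seconds_per_move;         // ~2.7M nodes
    double mb    = nodes * bytes_per_entry / 1e6;  // ~27 MB (MB = 10^6 bytes)
    std::printf("%.1fM nodes/search, ~%.0f MB if every node is distinct\n",
                nodes / 1e6, mb);
    return 0;
}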
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: AlphaZero Chess is not that strong ...

Post by Laskos »

hgm wrote:
Laskos wrote:
hgm wrote:So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
No. That was the result at the same fixed depth=21. So, in total (strength at fixed time), it seems to lose about 20% in time-to-depth + 11 Elo points at fixed depth.
So tpm means 'time per move', and Stockfish was using 1.5 or 1.8 sec/move? This contradicts the results you posted before, where it took even longer to reach d=21 with 128MB than with 1MB.

At 1.8Mnps that would be 2.7M nodes/search, requiring 27MB to store the entire tree even if every node is different. So isn't 128MB far too large? One would not expect a significant difference between 128MB and 8MB hash under these conditions.

Testing with games at fixed depth could be a bit tricky, as the time per move could vary wildly depending on the game phase. So it is difficult to conclude how much the TT was overloaded in the decisive phase of the game from the average time per move of the entire game.
I already wrote that my previous result was probably wrong, as the Shredder GUI restarts the engine after each position, and the time to initialize the hash is counted in the time used.

During the first, say, 20-25 moves, the time per move is roughly double the average, so the loading is around 54 MB, and 40% hashfull was found to be close to optimal in an earlier thread (Mark Lefler also confirmed that). So a 128 MB hash table is close to optimal for the first 20-25 moves, and those moves are the most important ones, probably determining close to 70-80% of outcomes. Fixed depth is somewhat similar to playing games with a large base and a small increment. I wanted the fixed-depth result first to see the average time-to-depth, and second to see whether the fixed-depth strength differs with the size of the hash.
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero Chess is not that strong ...

Post by hgm »

54MB is the size of the tree, and would only be 40% hashfull if all nodes of the tree were different, which would imply the hash table is a write-only data sink, never used for anything other than burning a few memory cycles. More typically a tree size of 54MB would mean 10% hashfull.
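Putting rough numbers on that, with the same ~10 bytes per stored position as above and the fraction of distinct positions treated as a free parameter (the 25% below is an assumed figure, purely for illustration):

Code: Select all

// Rough hashfull estimate from tree size vs. table size.
#include <cstdio>

int main()
{
    double tree_mb     = 54.0;    // tree size per move, from the discussion
    double table_mb    = 128.0;   // hash size used in the test
    double unique_frac = 0.25;    // assumed share of nodes that are distinct

    std::printf("all nodes distinct: ~%.0f%% hashfull\n",
                100.0 * tree_mb / table_mb);                   // ~42%
    std::printf("25%% of nodes distinct: ~%.0f%% hashfull\n",
                100.0 * unique_frac * tree_mb / table_mb);     // ~11%
    return 0;
}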