AlphaZero Chess is not that strong ...
Moderator: Ras
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: AlphaZero Chess is not that strong ...

hgm wrote: This is surely weird, and suggests that Stockfish gets weaker with more hash.

Is it possible that testing of Stockfish is mostly done with little hash? If so, the engine is tuned to it.

Daniel José - http://www.andscacs.com

-
- Posts: 5728
- Joined: Tue Feb 28, 2012 11:56 pm
Re: AlphaZero Chess is not that strong ...
hgm wrote: This is surely weird, and suggests that Stockfish gets weaker with more hash. Note that in my graph the load factor is not defined w.r.t. hashfull, but w.r.t. node count. Which should be much higher, as many nodes are for the same position. I am also not sure whether Stockfish counts hash cutoffs as nodes or not.

Stockfish counts calls to do_move(), so hash cutoffs are counted.

SF only tries singular extensions for TT moves, so if a position is not in the TT, it will not check if a TT move is a singular move and will not extend the search if it would have been. Perhaps this explains some of what is being observed.
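To make the two mechanisms concrete, here is a small self-contained C++ toy (an illustrative sketch, not Stockfish's actual code; the table, key update and "is singular" test are all stand-ins): the node counter is incremented inside do_move(), so a position that later returns on a TT cutoff was already counted when its parent moved into it, and the singular-extension check is only reached when a TT move is available.

// Toy sketch only -- not Stockfish source. Illustrates (1) node counting in
// do_move(), so positions that later take a TT cutoff are still counted, and
// (2) singular extensions being gated on the presence of a TT move.
#include <cstdint>
#include <cstdio>
#include <unordered_map>

struct TTEntry { int depth; int value; int move; };      // toy entry
static std::unordered_map<uint64_t, TTEntry> tt;         // toy transposition table
static uint64_t nodes = 0;                               // the reported node count

// Toy key update; XOR with a per-move constant so transpositions can occur.
static uint64_t do_move(uint64_t key, int move)
{
    ++nodes;                                             // <-- counting happens here
    return key ^ (0x9E3779B97F4A7C15ULL * (uint64_t)move);
}

static int search(uint64_t key, int depth, int ply)
{
    if (depth <= 0 || ply >= 16)
        return (int)(key % 200) - 100;                   // toy "evaluation"

    int tt_depth = -1, tt_value = 0, tt_move = 0;
    if (auto it = tt.find(key); it != tt.end()) {
        tt_depth = it->second.depth;
        tt_value = it->second.value;
        tt_move  = it->second.move;
    }

    if (tt_depth >= depth)                               // TT cutoff: no do_move()
        return tt_value;                                 // happens below this node

    // Singular extension is only *considered* when a hash move exists. With a
    // tiny or heavily overwritten table, tt_move is usually 0 and this is skipped.
    bool try_singular = (tt_move != 0 && depth >= 6);

    int best = -32000, best_move = 0;
    for (int move = 1; move <= 3; ++move) {
        int ext = (try_singular && move == tt_move) ? 1 : 0;   // crude stand-in test
        int v = -search(do_move(key, move), depth - 1 + ext, ply + 1);
        if (v > best) { best = v; best_move = move; }
    }
    tt[key] = { depth, best, best_move };                // always-replace store
    return best;
}

int main()
{
    int v = search(0x123456789ULL, 8, 0);
    std::printf("score %d after %llu nodes\n", v, (unsigned long long)nodes);
}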
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: AlphaZero Chess is not that strong ...
syzygy wrote:
hgm wrote: This is surely weird, and suggests that Stockfish gets weaker with more hash. Note that in my graph the load factor is not defined w.r.t. hashfull, but w.r.t. node count. Which should be much higher, as many nodes are for the same position. I am also not sure whether Stockfish counts hash cutoffs as nodes or not.
Stockfish counts calls to do_move(), so hash cutoffs are counted.
SF only tries singular extensions for TT moves, so if a position is not in the TT, it will not check if a TT move is a singular move and will not extend the search if it would have been. Perhaps this explains some of what is being observed.

Hmm... that would mean that time-to-depth is not the exact measure of the total effect of hash size for SF, and only time-to-strength is, right?

I think the Shredder GUI restarts the engine for each position tested and counts the time to initialize the hash tables in the time used, so my earlier result was probably misleading. I took another approach, playing actual games to depth=21 to see the difference between hash = 1 MB and a close-to-optimal hash = 128 MB. First, time-to-depth in the optimal case is about 20% lower. Second, with LOS = 99.0%, it seems the results at the same depth=21 are not equal for the two hash sizes (meaning that time-to-depth is indeed not the exact measure in SF's case), although I would have liked a bit more confidence.
Code:
Games Completed = 1000 of 1000 (Avg game length = 185.349 sec)
Settings = Gauntlet/0MB/100000ms per move/M 400cp for 3 moves, D 80 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 47002 sec elapsed, 0 sec remaining
1. SF 1 MB 483.5/1000 84-117-799 (L: m=0 t=0 i=0 a=117) (D: r=543 i=38 f=0 s=1 a=217) (tpm=1885.0 d=21.00 nps=1804812)
2. SF 128 MB 516.5/1000 117-84-799 (L: m=0 t=0 i=0 a=84) (D: r=543 i=38 f=0 s=1 a=217) (tpm=1523.1 d=21.00 nps=1820235)
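For readers who want to check the quoted numbers: the LOS and Elo figures follow directly from the result line for SF 128 MB above (117 wins, 84 losses, 799 draws). A small self-contained snippet, using the standard draw-ignoring LOS approximation and the logistic Elo formula (standard formulas, nothing specific to LittleBlitzer):

// Re-derive the quoted LOS and Elo from the match result above,
// seen from SF 128 MB's side: 117 wins, 84 losses, 799 draws.
#include <cmath>
#include <cstdio>

int main()
{
    double W = 117, L = 84, D = 799, N = W + L + D;

    // LOS ~ Phi((W - L) / sqrt(W + L)), the usual draw-ignoring approximation.
    double los = 0.5 * (1.0 + std::erf((W - L) / std::sqrt(2.0 * (W + L))));

    // Elo from the score fraction via the logistic model.
    double score = (W + 0.5 * D) / N;
    double elo   = -400.0 * std::log10(1.0 / score - 1.0);

    std::printf("score %.4f  LOS %.3f  Elo %+.1f\n", score, los, elo);
    // Prints roughly: score 0.5165  LOS 0.990  Elo +11.5
}

This reproduces the LOS = 99.0% mentioned above and the ~11 Elo figure discussed in the following posts.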
-
- Posts: 28390
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: AlphaZero Chess is not that strong ...
So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: AlphaZero Chess is not that strong ...
hgm wrote: So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?

No. That was at the same fixed depth=21 for both. So, in total (strength at fixed time), it seems to lose about 20% in time-to-depth plus 11 Elo points at fixed depth.
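(A rough back-of-the-envelope, not something measured in this thread: a 20% lower time-to-depth means the 1 MB side needs about 1/0.8 ≈ 1.25× the time, i.e. log2(1.25) ≈ 0.32 of a doubling. At the commonly quoted ballpark of roughly 50-70 Elo per doubling of time at fast time controls, that is roughly another 15-20 Elo, for a combined fixed-time effect somewhere around 25-30 Elo. Treat the Elo-per-doubling figure as an assumption.)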
-
- Posts: 1104
- Joined: Fri Sep 16, 2016 6:55 pm
- Location: USA/Minnesota
- Full name: Leo Anger
Re: AlphaZero Chess is not that strong ...
I looked up the world's strongest chess-playing entity and found this:
Komodo -- the brainchild of Don Dailey (who died in November of 2013), GM Larry Kaufman, and Mark Lefler -- is now universally recognized as the strongest chess-playing entity on the planet.
Not knocking Komodo, it's just what popped up first in the search.
Advanced Micro Devices fan.
-
- Posts: 4671
- Joined: Sun Mar 12, 2006 2:40 am
- Full name: Eelco de Groot
Re: AlphaZero Chess is not that strong ...
syzygy wrote:
hgm wrote: This is surely weird, and suggests that Stockfish gets weaker with more hash. Note that in my graph the load factor is not defined w.r.t. hashfull, but w.r.t. node count. Which should be much higher, as many nodes are for the same position. I am also not sure whether Stockfish counts hash cutoffs as nodes or not.
Stockfish counts calls to do_move(), so hash cutoffs are counted.
SF only tries singular extensions for TT moves, so if a position is not in the TT, it will not check if a TT move is a singular move and will not extend the search if it would have been. Perhaps this explains some of what is being observed.

It is not so relevant for the question of strength, but the phenomenon of sometimes getting better tactical results with a small hash table is well known. It applies to engines without singular extensions as well, but you are probably correct, Ronald, that SE in PV nodes strengthens the effect. A possible explanation, for me, is that with the added overwriting the PV does not get the time to grow strong quickly; the score is lower, especially where you would have applied a singular extension but now cannot find the move, so other moves get a better chance. Or: with enough hash, the first twenty iterations are done very quickly and they all fit in the hash table, but anytime an entry is not found the engine has to start searching again. This is a form of IID, and any new internal search might find transpositions from other places that can push the search over the horizon, with what Robert Hyatt called "grafting". With more IID the tactical results would in theory improve, but there is less time for pushing the PV over the horizon, so the Elo drops...
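For readers unfamiliar with the term, here is a minimal illustration of the IID idea the post compares the re-searches to (an illustrative sketch, not any particular engine's code; search() is a trivial stub here so the example compiles and runs): when the hash entry for a node has been overwritten and no move is available to try first, the node is searched at reduced depth purely to repopulate the table, and the normal search then starts from the move that shallow search found.

// Illustrative only: classic internal iterative deepening (IID). The search()
// below is a stand-in stub so the snippet is self-contained; in an engine it
// would be the real recursive alpha-beta search.
#include <cstdint>
#include <cstdio>
#include <unordered_map>

struct TTEntry { int depth; int move; };
static std::unordered_map<uint64_t, TTEntry> tt;

// Stand-in for the engine's real search: pretend we searched and found a move.
static int search(uint64_t key, int depth)
{
    tt[key] = { depth, /*best move*/ 7 };
    return 0;
}

static int search_with_iid(uint64_t key, int depth)
{
    auto it = tt.find(key);
    int tt_move = (it != tt.end()) ? it->second.move : 0;

    if (tt_move == 0 && depth >= 5) {
        // No hash move (entry missing or overwritten): do a shallower internal
        // search of the same node just to get a move to try first.
        search(key, depth - 2);
        it = tt.find(key);
        tt_move = (it != tt.end()) ? it->second.move : 0;
    }

    std::printf("depth %d: searching move %d first\n", depth, tt_move);
    // ... the normal move loop would follow here, trying tt_move first ...
    return search(key, depth);
}

int main()
{
    tt.clear();                      // simulate a tiny / freshly overwritten table
    search_with_iid(0xABCDULL, 8);   // triggers the IID shallow search
}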
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
-
- Posts: 28390
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: AlphaZero Chess is not that strong ...
Laskos wrote:
hgm wrote: So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
No. That was at the same fixed depth=21 for both. So, in total (strength at fixed time), it seems to lose about 20% in time-to-depth plus 11 Elo points at fixed depth.

So tpm means 'time per move', and Stockfish was using 1.5 or 1.8 sec/move? This contradicts the results you posted before, where it took even longer to reach d=21 with 128MB than with 1MB.

At 1.8Mnps that would be about 2.7M nodes per search, requiring 27MB to store the entire tree even if every node is different. So isn't 128MB far too large? One would not expect a significant difference between 128MB and 8MB hash under these conditions.

Testing with games at fixed depth could be a bit tricky, as the time per move can vary wildly depending on the game phase. So it is difficult to conclude how much the TT was overloaded in the decisive phase of the game from the average time per move over the entire game.
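(Spelling out the arithmetic: 1.8M nodes/s × 1.5 s ≈ 2.7M nodes per search, and the 27MB figure implies roughly 10 bytes per stored position, which matches Stockfish's TT entry size, though that factor is an inference here rather than something stated in the post. On that basis an 8MB table holds about 0.8M entries, i.e. roughly 30% of the node count even before transpositions and repeated visits are taken into account.)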
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: AlphaZero Chess is not that strong ...
hgm wrote:
Laskos wrote:
hgm wrote: So Stockfish loses ~11 Elo in self-play by making the hash 128 times smaller than 'optimal'?
No. That was at the same fixed depth=21 for both. So, in total (strength at fixed time), it seems to lose about 20% in time-to-depth plus 11 Elo points at fixed depth.
So tpm means 'time per move', and Stockfish was using 1.5 or 1.8 sec/move? This contradicts the results you posted before, where it took even longer to reach d=21 with 128MB than with 1MB.
At 1.8Mnps that would be about 2.7M nodes per search, requiring 27MB to store the entire tree even if every node is different. So isn't 128MB far too large? One would not expect a significant difference between 128MB and 8MB hash under these conditions.
Testing with games at fixed depth could be a bit tricky, as the time per move can vary wildly depending on the game phase. So it is difficult to conclude how much the TT was overloaded in the decisive phase of the game from the average time per move over the entire game.

I already wrote that my previous result was probably wrong, as the Shredder GUI restarts the engine after each position and the time to initialize the hash is counted in the time used.

During the first, say, 20-25 moves, the time per move is roughly double the average, so the loading is about 54 MB, and 40% hashfull was found to be close to optimal in an earlier thread (Mark Lefler also confirmed that). So 128 MB hash is close to optimal for the first 20-25 moves, and these moves are the most important ones, probably determining close to 70-80% of outcomes. Fixed depth is somewhat similar to playing games with a large base and a small increment. I wanted fixed-depth results first to see the average time-to-depth, and second to see whether fixed-depth strength differs with the size of the hash.
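(The 54 MB figure follows the same arithmetic as above: roughly double the ~1.5 s average is ~3 s per move, times ~1.8M nodes/s gives ~5.4M nodes, and at ~10 bytes per stored position that is ~54 MB, about 42% of a 128 MB table. The bytes-per-position factor is an inference, not stated in the post, and whether that 42% can be equated with "hashfull" is exactly what the next post questions.)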
-
- Posts: 28390
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: AlphaZero Chess is not that strong ...
54MB is the size of the tree, and it would fill only 40% of the hash table if all nodes of the tree were different. That would imply the hash table is a write-only data sink, never used for anything other than burning a few memory cycles. More typically, a tree size of 54MB would mean about 10% hashfull.
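(In numbers: 40% of a 128 MB table is ~51 MB, so a 54 MB tree only produces ~40% hashfull if essentially every one of the ~5.4M node visits stores a new, distinct position. If, purely for illustration, only one visit in four corresponds to a distinct position, the fill rate drops to roughly 10%, which is the "more typically" figure above.)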