Branching Factor of top engines

lucasart · Post by **lucasart** » Mon Jun 17, 2013 5:33 am

Daniel Shawul wrote:Lucas you tend to offer opinions rather too quickly when you don't have enough background yourself. Also you seem to have utter disrespect for past reseach

I did not disrepect them. On the contrary, I said they knew what they were doing and had reasons for their decision:

Lucas Braesch wrote: Surely they know what they are doing, and they have their reasons for doing these decisions.

I even gave a tentative explanation as to why they might chose this definition:

Lucas Braesch wrote: they probably chose this formula because they are studying the asymptotic behaviour of algorithms anyway, and the formula happens to be simple, and the equation (b^d-1)/(b-1)=nodes cannot generally inverted analytically to find b=f(nodes).

As everyone (except you) can see from my previous posts:
* I have tried to explain my objections and demonstrate my points.
* You are basically saying: "EBF=nodes^(1/d), because ancient gods decided so, and anyone who disagrees is ignorant and disrespectful". If you think you will convince people like that, you must be kidding yourself...

All I said was that extending the definition, outside the case where the number of child per node is constant, is arbitrary. There is no such thing as a 'right' or 'wrong' arbitrary decision: it depends on the application. In the case where you study the asymptotic properties of an algorithm, then yes, I agree with you, you might as well take a simple definition to make formulas simpler.

Daniel Shawul · Post by **Daniel Shawul** » Mon Jun 17, 2013 5:41 am

Nice progress since you started with lets use 'qsearch depth' and why not exclude qsearch nodes. That was so stupid i had to correct you. At least now you are using the right terms , and even started using them against me

. Bye for now.

Michel · Post by **Michel** » Mon Jun 17, 2013 8:07 am

but if you assume you started out to search 5 plies but went on extending that one line to 10 then your EBF is

@Daniel

Could you please give an unambiguous definition of what "d" is in your formula?

Saying it is the depth you set out to search does not work since you can cheat by immediately applying a superficial reduction like d-->d/2.

IWB · Post by **IWB** » Mon Jun 17, 2013 8:22 am

Laskos wrote:
Hi Ingo, I have repeated the test for Houdini 3, this time with 2GB hash and an average of 3-5 minutes of thinking time per position. The hash fills in a half to one quarter of the allotted time, and the result is very close to the old, shallower depth BF.

Houdini 3
Ply:16 Positions: 30 Avg Nodes: 8255150 Branching = 1.62
Ply:17 Positions: 30 Avg Nodes:17638161 Branching = 2.14
Ply:18 Positions: 30 Avg Nodes:30932455 Branching = 1.75
Ply:19 Positions: 30 Avg Nodes:48854066 Branching = 1.58
Ply:20 Positions: 30 Avg Nodes:93472563 Branching = 1.91
Ply:21 Positions: 30 Avg Nodes:198689470 Branching = 2.13
Ply:22 Positions: 30 Avg Nodes:391621195 Branching = 1.97
Ply:23 Positions: 30 Avg Nodes:751912694 Branching = 1.92
Average BF: 1.87
1/Log(BF)=1.60

The old result was

Houdini 3
Ply:15 Positions: 30 Avg Nodes: 4474464 Branching = 1.54
Ply:16 Positions: 30 Avg Nodes: 9993292 Branching = 2.23
Ply:17 Positions: 30 Avg Nodes:16583899 Branching = 1.66
Ply:18 Positions: 30 Avg Nodes:29529725 Branching = 1.78
Ply:19 Positions: 30 Avg Nodes:59452946 Branching = 2.01
Ply:20 Positions: 30 Avg Nodes:100502642 Branching = 1.69
Ply:21 Positions: 30 Avg Nodes:226412605 Branching = 2.25
Average BF: 1.86
1/Log(BF)=1.61

So, I don't see much influence on reasonable hash size, the BF goes as a power (1/depth) of the speed ratio (which varies slowly with the hash size, if it is reasonable).

Obviously there is not much influence - I have to "ponder" about that

Thx for testing
Ingo

Mike S. · Post by **Mike S.** » Mon Jun 17, 2013 9:13 am

The differences of these B.F. numbers may not seem big in itself. But let's try an illustration of their effect, under the following assumptions:

- we compare depth X and depth X+10, IOW going 10 plies deeper
- the average B.F. remains during that, each (which doesn't necessarily have to be true)
- number of required nodes and time consumption are proportional (? like above)

B.F. 1.59^10 = 103

B.F. 2.05^10 = 1,311

That means, if the assumptions above hold, that Critter 1.6a requires ~13 times the time to go 10 plies deeper, than Stockfish 3.

The first conclusion is of course, that Critter depths are more valuable than Stockfish depths. Because at the major ratings lists, either Critter is still ahead (including medium or longer TCs), or only SF.3 surpassed it by a small margin. Nevertheless, I am now lesser surprised that Critter failed early, in the nTCEC Season 1 which was played with very long time control.

The more we hope for a Critter update.

(Of course, the total strength is a mixture of many elements, like eval quality and many others.)

Henk · Post by **Henk** » Mon Jun 17, 2013 9:29 am

If I multiply the search depth by 100 in my Chess program. The last 99 search levels will have a branching factor of 1.

Branching factors of all levels should be displayed not only the last ones.

Uri Blass · Post by **Uri Blass** » Mon Jun 17, 2013 9:43 am

Laskos wrote:
Don wrote:
Uri Blass wrote:
Don wrote:
Laskos wrote:
Don wrote:I don't think it means anything though. But it is interesting. We know that SF had a very low BF and there are some simple mods we could do to Komodo to make the BF lower while having very little impact on the ELO.

We have long suspected this could be related to scalability but we are far from proving that. However SF is one of the more scalable programs and it could be related to the low BF. I would take a huge amount of CPU power to do the kind of experiments we would like to do to really understand scalability.
Still, the BF of top engines went from 6 in 1980es to 3 in late 1990es and to 1.6-1.8 nowadays. It seems to be a rule of thumb that it goes square root every 15 years or so (so the depth is doubled every 15 years on same hardware). And the aggressive pruning doesn't miss much on the way, it's very smartly done. The progress in search is real, it's not only hardware related as some imply about engines' progress.
Yes, a couple of years ago Bob Hyatt and I had a big discussion on this, he was saying it was mostly hardware and I was saying that software improved as much as hardware if not more.

The pruning definitely takes its toll but it's amazing how little it misses. I think most of the progress I have made with Komodo in the last 2 versions had to do with making it prune and reduce safer without hurting the BF - and in fact I think it now prunes more aggressively at the same time.
I agree that software improved not less than hardware but
I do not think that the pruning miss so little(inspite of the fact that it is clearly productive to playing strength).

Let me be clear on this. The pruning definitely hurts the play of the program at fixed depth and it's a non-trivial amount. I evidently didn't make that part clear.

I just let play overnight Shredder 6PB against Houdini 3 at fixed depth 11. Houdini 3 on average searched 200,000 or so nodes, Shredder 6PB 10,000,000. The result was 66:34 for Houdini, at this fixed depth. These 11 plies of Houdini were as valuable as those of the ancient Shredder 6, although the pruning is much more aggressive in Houdini. The pruning seems only mildly to hurt the play, it is amazingly clever. Komodo 5 at fixed depth is as strong as Houdini 3, so the same goes to Komodo.

I think that houdini has clearly better evaluation than shredder6
and it is possible that houdini also extends more than shredder6.

In order to do correct test you need to compare the same program with pruning with the same program without pruning.

Edit:I see that you already did it for komodo with LMR against komodo without LMR.

Note also that LMR is not the only pruning that chess programs use and there is also null move pruning that reduce the strength at fixed depth.

lucasart · Post by **lucasart** » Mon Jun 17, 2013 11:01 am

Daniel Shawul wrote:Get a grip, there is no debate here because the one i gave you is the text book definition that was used during the decades to prove optimality of alpha-beta by many researchers. If one researcher had to use a different one, there would be no communication or agreement. There are so many wrong definitions spread around here in chess forums, such as comparing two nodes counts on two IID iterations ,that are completely wrong. I tried to clear this up some time ago by giving many references, but chess-wiki still had those (not that i am complaining). For alpha-beta analysis, N is actually defined as the number of leaf nodes, which is close but not quite the same as the total nodes. But it is asymptotically the same as the total nodes so it is ok to use it there. But recently I had problems using that conventional definition when trying to analyse LMR! The reason is that at the leaves exactly one node is searched. So I had to revert back to the original text book definition even though all the alpha-beta literature uses number of leaf nodes.

Suggesting to use the qsearch depth is ridiculous because your branching factor will be so law to be non-existent with that definition. The 'd' in sqrt(d) is a typo as you already know but you should know many literature use 'd' for width (degree) too which is why i was inclined to make the mistake.

I have read through your reference:
* first of all the author places himself in the simplistic case where: (i) every interior node has exactly b children (ii) every root -> leaf path makes exactly d movs
* and he then writes N = 1 + b + b^2 + ... + b^d = (b^d - 1) / (b-1), and he pursues by saying that b ~= N^(1/d) is an asymptotic approximation

=> This is EXACTLY what I explained to you before you called me ignorant and disrespectful.

Besides when you say that all experts have agreed b=N^(1/d) definition, that's obviously not the case. At least the good old Professor Hyatt disagrees with you:

Robert Hyatt wrote: The definition of EBF we have been using for _many_ years is simply one of the following: (both are equivalent)

EBF = time(i) / time(i-1) or
EBF = nodes(i) / nodes(i-1)

Also, the reference you mention does not mention say anything about how to account for search extensions, reductions, and quiescent search in the EBF (*). In the general case (where the paths do not all have the same length and/or the nodes do not all have the same number of children) they simply define the EBF as the solution of the equation:

Code: Select all

&#40;b^d - 1&#41; / &#40;b - 1&#41; = N

regardless of how the tree looks like, so long as it has a depth equal to d and N nodes. It does not answer or refute any of the perfectly valid points I made about QS nodes, LMR and search extensions:

* define the depth of the tree? In the strictest mathematical sense, it's the longest path root->leaf that gives the depth. That means the longest variation, which is going to be much bigger than the nominal depth of the search:
- some quiescent searches run very deep as we all know
- plus search extensions (like checks or forced replies)

* no mention whatsoever of LMR in thie paper and how it fools the definition of the branching factor. I maintain that comparing the EBF of an engine with and without LMR is comparing apples and pears. It is not a valid common measure, because the large majority of paths in the LMR case are not even d/2 plies long. It's exactly the same reasoning at the QS point I made, except with reductions instead of extensions (reductions make non reductions a form of extension).

(*)That's perhaps because in the ancient time when this book was written, nobody used QS and LMR, and computers were so slow, you would barely do 3 plies with no qsearch (perhaps a basic SEE instead of the QS at the leaves). So as much as I respect the author, he did not have the hindsight that we have today.

Daniel Shawul · Post by **Daniel Shawul** » Mon Jun 17, 2013 2:45 pm

Michel,
My first post has a mixup because i used d for width. You may think this is stupid but it is almost what every one uses for width a.k.a "degree" so d, and height is represented by height h. But to avoid this confusion lets use the regular b->width , height->d. The following definiton is found in many references

Definition: Effective branching factor is the width a uniform tree would have to contain all the nodes.
With that sum(b^d)=1+b+b^2...b^d = Total nodes

I solve this iteratively in nebiyu

Code: Select all

	/*calculate EBF by bisection method*/
	double mynodes = &#40;double&#41;nodes; 
	double EBFleaf = pow&#40;mynodes,1./search_depth&#41;,
		EBFmax = EBFleaf,EBFmin = 1,EBF,N;
	EBFmax = pow&#40;mynodes,1./search_depth&#41;;
	EBFmin = 1;
	for&#40;i = 0;i < 100;i++) &#123;
		EBF = &#40;EBFmin + EBFmax&#41; / 2;
		N = &#40;pow&#40;EBF,search_depth + 1&#41; - 1&#41; / &#40;EBF - 1&#41;;
		if&#40;mynodes > N&#41; EBFmin = EBF;
		else EBFmax = EBF;
	&#125;
	print_fmt&#40;" EBF=%4.2f",EBF&#41;;
	print_fmt&#40;"\n");

Now at larger depth b^d approaches (Total nodes) so alpha-beta literature uses the defintion

Code: Select all

   b^d = Leafnodes

which is ok to use and helps to simplify analysis. Some use (d-1) there but it is for uniformly branching tree, best to look at the references i posted above. Now for LMR i can't use simplification like that because the largest happens somewhere in the middle which i don't know its values. So i had to use total nodes in the above formula like the text book defintion.

Daniel Shawul · Post by **Daniel Shawul** » Mon Jun 17, 2013 2:56 pm

I have read through your reference:
* first of all the author places himself in the simplistic case where: (i) every interior node has exactly b children (ii) every root -> leaf path makes exactly d movs
* and he then writes N = 1 + b + b^2 + ... + b^d = (b^d - 1) / (b-1), and he pursues by saying that b ~= N^(1/d) is an asymptotic approximation

=> This is EXACTLY what I explained to you before you called me ignorant and disrespectful.

Explained to me?? Look at the post you replied to

For alpha-beta analysis, N is actually defined as the number of leaf nodes, which is close but not quite the same as the total nodes. But it is asymptotically the same as the total nodes so it is ok to use it there. But recently I had problems using that conventional definition when trying to analyse LMR! The reason is that at the leaves exactly one node is searched. So I had to revert back to the original text book definition even though all the alpha-beta literature uses number of leaf nodes.

What do you think that is? Like i said before at least you learned not to use the qsearch depth so be happy about it and move on before you become a complete jerk.

Branching Factor of top engines

Re: Branching Factor of top engines

Re: Branching Factor of top engines

Re: Branching Factor of top engines

Re: Branching factor of top engines.

Re: Branching Factor of top engines

Re: Branching Factor of top engines

Re: Branching factor of top engines.

Re: Branching Factor of top engines

Re: Branching Factor of top engines

Re: Branching Factor of top engines