Unanswered question from Engine Origins forum

Discussion of chess software programming and technical issues.

Moderator: Ras

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Unanswered question from Engine Origins forum

Post by Don »

mcostalba wrote:
Don wrote: So what are these new ideas?
I am not able to answer HGM's question because I think it is misleading. Ideas, ideas... but what are these ideas?

Taking for granted the chess engine knowledge freely available today from sites like CPW or from open source code, the key point is not the ideas.
I think you could build a world class chess engine from CPW which is quite impressive at this point. Of course you still need the talent - but everything you need is there.

I am really convinced that writing a top engine today takes 10% of the effort to build up an effective laundry list of publicly available techniques one wants to implement and 90% of the effort to implement them. Of this implementation effort a good 90% is due to testing coverage and 10% is coding skills.

So to build a top engine today you don't need "ideas"; you need a powerful testing framework, a lot of time, a good programming background, and a good knowledge of what is already publicly available.

Sorry but there are no shortcuts here. This is what I believe.
And that is exactly how I feel too.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Unanswered question from Engine Origins forum

Post by michiguel »

[MODERATION]
In moving this thread from the main forum, fourteen messages were removed: our pleas, which have now become irrelevant, petitions to move the thread, counter-replies, and everything that followed from them. One post was moved to EO.

This thread is interesting and technical; that is the intention of the OPs, and it shall remain that way.

The show must go on.
Miguel
chrisw
Posts: 4598
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: Unanswered question from Engine Origins forum

Post by chrisw »

I'm reading a lot of posts claiming 3000 Elo and +50 Elo for this and +200 Elo for that. Are you all sure these figures make sense?

I had a good look at Fruit sources and Rybka RE sources. I read comments, many from Larry, about how evaluation progress is made nowadays. I also read comments about "massive automated testing" and counter arguments suggesting this doesn't really work without huge hardware resources, mainly from Vincent, and even with massive hardware there are problems. I read of how search progress has been made.

All in all I intuit:

a) evaluations are better than the very basic bean counters of old, but there are still serious problems relating, say, the value of an attack to the value of a pawn. Or take how the bonus for a doubled pawn relates to the creation of a half-open or open file next to it (in some games the pawn structure damage will lead to a loss, and in other games the open file will lead to a win): is just adding the two together anything more than an attempt to average the statistical winning/losing chances that the combination gives? I.e. there is NO correct value, it all depends. Evaluations are still pretty primitive imo.

b) search is being done with massive pruning. Thus the search tree, instead of being the full game tree, which we can imagine as a bush, as wide as it is deep, is now like a cypress tree, very tall but very narrow.

Masses of stuff are being thrown away/pruned, in effect pruned by the evaluation function, which is itself inaccurate and non-optimal, as if an optimal evaluator could ever exist. This poor quality pruning is made worse by the effects of quiescence search (QS), which, as far as I have looked, is still using the sorry old bean-counter relationship 1-3-3-5-9 (pawn, knight, bishop, rook, queen) to make decisions.
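
For readers who have not met the shorthand: the "bean counter" is plain material counting with the classic piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9). A minimal sketch in Python, assuming the position is given as the piece-placement field of a FEN string. The name `material_balance` is made up for this illustration and is not taken from any engine mentioned in the thread.

```python
# Classic material values; kings are deliberately left out.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_balance(fen_placement):
    """Sum piece values from a FEN piece-placement field.

    Uppercase letters are White, lowercase are Black.
    A positive result favours White.
    """
    score = 0
    for ch in fen_placement:
        value = PIECE_VALUES.get(ch.upper())
        if value is None:
            continue  # digits, slashes and kings carry no material value here
        score += value if ch.isupper() else -value
    return score

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(material_balance(START))
```

Nothing in such a count knows about attacks, files or pawn structure, which is the complaint being made here.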


So, what do we have now? A bunch of programs, of supposedly very high Elo, all searching down a similar cypress tree.

A little thought experiment .... suppose we made an impossible, fantasy engine, that was full-width but also had the depth equivalence of our current top engines. ie it has the depth, but it has the width too. You can perhaps envisage the cypress tree search as a small wedge-shaped part of the full width engine tree search. How often will the cypress tree contain and overlap the main line of the full width tree? Impossible to answer, but unless it does so often enough, our cypress tree is going to lose, regularly.

Ok, so I postulate this: current engines are playing AGAINST EACH OTHER with their searches broadly contained within this cypress tree, which, by thought experiment, we know is a non-optimal search space. But by tuning the width/depth parameters, making even finer, deeper searches, and then TESTING AGAINST EACH OTHER ONLY, you are not really testing and Elo-evaluating chess engines, you are incestuously testing cypress tree searchers against each other, and, of course, building your own little world of Elo relativities.

Computer chess or just bean-counter incest in a cul-de-sac? You have no humans any more to relate to, you ran away with yourselves.
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: Unanswered question from Engine Origins forum

Post by Desperado »

A:
===

Well, imo the first question should be what "innovation" means today, or
maybe what it means in general. Introducing new techniques like null move,
LMR, razoring, futility and so on is only one step of innovation. We only
accept these techniques as innovation when one of us gets them
to work. Then, suddenly, everyone knows that it must work somehow and
finds a way to adapt it to their own framework.

But innovation is not only the introduction of new techniques; refinements
can be (must be) considered innovative work too, imo.
As all you chess programmers out there know: the devil is in the details.

Of course there will never be a discussion about the alphabeta vs minimax
algorithm. From an implementation point of view it is a tiny step, but the
idea is enormous. But 300 Elo is the sum of dozens of little improvements,
which can also be the result of innovative work.

Where exactly is the line between improvement and innovation?
Isn't the one thing also the other at _some_ points?

E.g. having staged move generation is one thing, but finding stages
that distinguish more than just captures and quiet moves _can_ be
innovative work. Putting the puzzle together by introducing a move-gain
statistic into an existing framework (a staged move generation scheme) is
innovation too, especially if such a statistic wasn't meant to be used for
this purpose.

So combining already existing ideas is innovation too, of course.

B:
===

Nonetheless, besides dozens of details there is one eye-catcher imo.
(I cannot say whether it is from ippo or rybka, I never saw the source, but
I know it from robbo.)

"The search classification"

Yes, we all know about root, cut, all, pv, quies and horizon nodes,
but designing separate search functions for all these node types is a great
idea. I speak of an idea because it opens my mind to creating a complex
search that handles special types of nodes in their own special way
(there is no limit to building any kind of groups, like capture nodes,
known wins, maybe in some years nodes that are handled with a Monte Carlo
technique or whatever) and mixes them all together to get a better
search result.
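
To make the idea of one search function per node type concrete, here is a minimal sketch in Python. It is not taken from robbo or any other engine: it splits a principal-variation search into just two routines, `search_pv` for PV nodes and `search_zero_window` for expected cut/all nodes. The toy tree (uniform `BRANCHING`, leaf scores from `leaf_score`) and all names are assumptions made up for the illustration.

```python
import zlib

BRANCHING = 4   # assumed uniform branching factor of the toy tree
INF = 10**9

def leaf_score(path):
    """Deterministic pseudo-random leaf score in [-100, 100], side to move's view."""
    return zlib.crc32(bytes(path)) % 201 - 100

def plain_negamax(path, depth):
    """Reference full-width search without any pruning."""
    if depth == 0:
        return leaf_score(path)
    return max(-plain_negamax(path + (i,), depth - 1) for i in range(BRANCHING))

def search_zero_window(path, depth, beta):
    """Cut/all nodes: only answers whether the score reaches beta (window beta-1..beta)."""
    if depth == 0:
        return leaf_score(path)
    best = -INF
    for i in range(BRANCHING):
        score = -search_zero_window(path + (i,), depth - 1, 1 - beta)
        best = max(best, score)
        if best >= beta:
            break  # fail high, the remaining moves need not be searched
    return best

def search_pv(path, depth, alpha, beta):
    """PV nodes: first move gets the full window, later moves a zero-window probe."""
    if depth == 0:
        return leaf_score(path)
    best = -INF
    for i in range(BRANCHING):
        child = path + (i,)
        if i == 0:
            score = -search_pv(child, depth - 1, -beta, -alpha)
        else:
            score = -search_zero_window(child, depth - 1, -alpha)
            if alpha < score < beta:
                # The probe says this move is better: re-search it as a PV node.
                score = -search_pv(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best

print(search_pv((), 5, -INF, INF), plain_negamax((), 5))
```

Because each node type has its own routine, type-specific rules (different pruning at cut nodes, say) have an obvious place to live, which is the room for new ideas described above.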

I only know that it is very hard to figure out what is typical for
a particular node type and what is a general node feature, even in the most
common node classification mentioned above. I am pretty sure that I can
have a compact search in my engine that is as good as a subdivided
search, but there is not the same room for new ideas.

So innovation is what you make out of an idea, and in the end you need
a successful implementation.

There are small, intermediate and big innovations. The big ones were
already explored by the pioneers of chess programming. So our task
is to sum up the small ones with hard work. Fortunately, the small
ones make up the big picture, LOL

Michael
dchoman
Posts: 171
Joined: Wed Dec 28, 2011 8:44 pm
Location: United States

Re: Unanswered question from Engine Origins forum

Post by dchoman »

chrisw wrote:
A little thought experiment .... suppose we made an impossible, fantasy engine, that was full-width but also had the depth equivalence of our current top engines. ie it has the depth, but it has the width too. You can perhaps envisage the cypress tree search as a small wedge-shaped part of the full width engine tree search. How often will the cypress tree contain and overlap the main line of the full width tree? Impossible to answer, but unless it does so often enough, our cypress tree is going to lose, regularly.
I thought this was an interesting idea, so I ran a simple test.... I can't create a massive depth full searcher, but it was simple enough to disable all forms of move reductions and pruning (including no null move) in my engine and then run it against the regular version at a fixed depth of 7 plies... 40 games was enough to show a clear superiority for the full width search...

Code:

Rank Name                                 Elo    +    - games score oppo. draws 
   1 EXchess v6.50_no_reduce_prune_null     0   63   52    40   89%  -293   18% 
   2 EXchess v6.50b_MacOSX_10.6.8        -293   52   63    40   11%     0   18% 
Of course, if we instead limit search time to 12 sec + 0.2 per move, the version with the pruning and reductions is the superior one...

Code:

Rank Name                                 Elo    +    - games score oppo. draws 
   1 EXchess v6.50b_MacOSX_10.6.8         320   39   34   123   87%     0    9% 
   2 EXchess v6.50_no_reduce_prune_null     0   34   39   123   13%   320    9% 
I ran more games here just because I was watching TV and didn't notice it went through 40 games a bit faster than I expected... so perhaps I should have used a bit longer time control, but I don't think that would change the result.
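
As a rough cross-check of the two tables, a score percentage can be turned into an Elo gap with the standard logistic formula. This is only a sketch: the ratings above come from whatever rating tool produced the tables, and such tools usually model draws and apply priors, so the plain formula is expected to give somewhat different (here larger) numbers. The function name `elo_difference` is made up for the illustration.

```python
import math

def elo_difference(score):
    """Elo gap implied by an expected score (0 < score < 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# Score percentages copied from the two tables above.
for label, score in (("fixed depth 7, full width", 0.89),
                     ("12 sec + 0.2, pruned version", 0.87)):
    print(f"{label}: about {elo_difference(score):.0f} Elo")
```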

What was interesting to me is *why* they are different... So I made a third version as well, which included null move but no other pruning or reductions. Then I ran all three versions to depth=12 in analysis mode from the starting position...

Code:

homanss-imac:run homand$ ./EXchess_v6.50_no_reduce_prune_null 

....

White-To-Move[1]: sd 12
White-To-Move[1]: analyze

....

  8.   0.13     1   731098   1. Nc3 d5 2. d4 Nf6 3. Nf3 Nc6 4. e3 e6
  8.   0.27     2  1119920   1. e4 d5 2. exd5 e6 3. d4 Nf6 4. Bb5+ c6 5. dxc6 Nxc6
  8.   0.27     2  1346043   1. e4 d5 2. exd5 e6 3. d4 Nf6 4. Bb5+ c6 5. dxc6 Nxc6
  9.   0.65     6  3869303   1. e4 e6 2. d4 Nc6 3. Nc3 Nf6 4. e5 Nd5 5. Nge2  
  9.   0.65    11  6505032   1. e4 e6 2. d4 Nc6 3. Nc3 Nf6 4. e5 Nd5 5. Nge2
 10.   0.37    27 14983303   1. e4 d5 2. exd5 e6 3. d4 exd5 4. Nc3 Nc6 5. Qe2+ Nge7 6. Nf3
 10.   0.37    40 20685808   1. e4 d5 2. exd5 e6 3. d4 exd5 4. Nc3 Nc6 5. Qe2+ Nge7 6. Nf3
 11.   0.57    69 37141267   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. Nf3 Nc6 5. d4 e6
 11.   0.57   148 84926666   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. Nf3 Nc6 5. d4 e6
 12.   0.33   262 145951653   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. Nf3 e6 5. Nxd5 Qxd5 6. d4 Nc6
 12.   0.33   564 267507589   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. Nf3 e6 5. Nxd5 Qxd5 6. d4 Nc6

nodes = 267507589 hash moves = 3432118 qnodes = 180478794 evals = 144247439
hash hits = 53921530 pawn hash hits = 129445793 score hash hits = 2255180
node_rate = 474304 null cuts = 0 exten = 4356336 qchecks = 3586
int_iter = 181028 egtb_probes = 0 egtb_hits = 0 fail_high(%) = 97
White-To-Move[1]: quit

homanss-imac:run homand$ ./EXchess_v6.50_no_reduce_prune

....

White-To-Move[1]: sd 12
White-To-Move[1]: analyze

....

  8.   0.13     0   139907   1. Nc3 d5 2. d4 Nf6 3. Nf3 Nc6 4. e3 e6
  8.   0.27     0   201423   1. e4 d5 2. exd5 e6 3. d4 Nf6 4. Bb5+ c6 5. dxc6 Nxc6
  8.   0.27     0   250318   1. e4 d5 2. exd5 e6 3. d4 Nf6 4. Bb5+ c6 5. dxc6 Nxc6
  9.   0.65     0   419773   1. e4 e6 2. Nc3 Nc6 3. d4 Nf6 4. e5 Nd5 5. Nge2  
  9.   0.65     0   444158   1. e4 e6 2. Nc3 Nc6 3. d4 Nf6 4. e5 Nd5 5. Nge2
 10.   0.37     2  1321843   1. e4 d5 2. exd5 e6 3. d4 exd5 4. Nc3 Nc6 5. Qe2+ Nge7 6. Nf3
 10.   0.37     2  1793445   1. e4 d5 2. exd5 e6 3. d4 exd5 4. Nc3 Nc6 5. Qe2+ Nge7 6. Nf3
 11.   0.57     3  2357118   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. Nf3 Nc6 5. d4 e6 6. Bg5 Nxc3 7. bxc3
 11.   0.57     4  2445478   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. Nf3 Nc6 5. d4 e6 6. Bg5 Nxc3 7. bxc3
 12.   0.38     9  5540472   1. e4 e5 2. d4 exd4 3. Nf3 Nc6 4. Nxd4 Nf6 5. Nc3 Bb4 6. Bg5 O-O
 12.   0.38    12  7564182   1. e4 e5 2. d4 exd4 3. Nf3 Nc6 4. Nxd4 Nf6 5. Nc3 Bb4 6. Bg5 O-O

nodes = 7564182 hash moves = 108088 qnodes = 5279552 evals = 4847575
hash hits = 1128829 pawn hash hits = 3452207 score hash hits = 1074942
node_rate = 601286 null cuts = 1696557 exten = 103113 qchecks = 1076
int_iter = 21346 egtb_probes = 0 egtb_hits = 0 fail_high(%) = 91
White-To-Move[1]: quit

homanss-imac:run homand$ ./EXchess_v6.50b_MacOSX_10.6.8 

....

White-To-Move[1]: sd 12
White-To-Move[1]: analyze

....

  8.   0.51     0    14808   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. d4 Nc6       
  8.   0.51     0    16882   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. d4 Nc6
  9.   0.55     0    24928   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. d4 e5 5. dxe5 Nxc3 6. bxc3 Qxd1+ 7. Kxd1
  9.   0.55     0    27860   1. e4 d5 2. exd5 Nf6 3. Nc3 Nxd5 4. d4 e5 5. dxe5 Nxc3 6. bxc3 Qxd1+ 7. Kxd1
 10.   0.61     0    44444   1. e4 e6 2. d4 Nf6 3. e5 Nd5 4. Nf3 Nc6 5. Nc3 Nxc3 6. bxc3
 10.   0.61     0    48265   1. e4 e6 2. d4 Nf6 3. e5 Nd5 4. Nf3 Nc6 5. Nc3 Nxc3 6. bxc3
 11.   0.60     0    74680   1. e4 e5 2. Nc3 Nf6 3. Nf3 Nc6 4. d4 exd4 5. Nxd4 Bb4 6. Bg5
 11.   0.60     0    82272   1. e4 e5 2. Nc3 Nf6 3. Nf3 Nc6 4. d4 exd4 5. Nxd4 Bb4 6. Bg5
 12.   0.33     0   179613   1. e4 e5 2. Nf3 Nc6 3. Nc3 Nf6 4. d4 exd4 5. Nxd4 Bb4 6. Bg5 d5 7. Nxc6 bxc6
 12.   0.33     0   197279   1. e4 e5 2. Nf3 Nc6 3. Nc3 Nf6 4. d4 exd4 5. Nxd4 Bb4 6. Bg5 d5 7. Nxc6 bxc6

nodes = 197279 hash moves = 8721 qnodes = 117615 evals = 128822
hash hits = 23419 pawn hash hits = 71539 score hash hits = 47451
node_rate = 290116 null cuts = 25318 exten = 4360 qchecks = 23
int_iter = 2460 egtb_probes = 0 egtb_hits = 0 fail_high(%) = 86
White-To-Move[1]: quit
It is interesting to me that the fail high percentage drops so dramatically with the introduction of more reductions and pruning (first with null move, then with everything). I guess this means that without pruning and reductions, many nodes that are easy to refute with an obvious reply are left in the tree.... pruning and reductions selectively remove these nodes, so the evidence is that they are doing their job, although imperfectly as the first test clearly shows.
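
One way to summarise the three logs is the effective branching factor, taken here simply as the depth-th root of the final node count. A minimal sketch in Python using only the depth-12 node totals printed above; the dictionary keys are shortened labels, not the real binary names, and treating the nominal depth of 12 as the exponent is a simplification, since extensions make some lines longer.

```python
# Final node counts at depth 12, copied from the three analysis logs above.
NODES_AT_DEPTH_12 = {
    "no_reduce_prune_null": 267_507_589,  # full width, no null move
    "no_reduce_prune": 7_564_182,         # null move only
    "v6.50b": 197_279,                    # all pruning and reductions
}
DEPTH = 12

def effective_branching_factor(nodes, depth):
    """Branching factor b such that b**depth equals the node count."""
    return nodes ** (1.0 / depth)

for name, nodes in NODES_AT_DEPTH_12.items():
    ebf = effective_branching_factor(nodes, DEPTH)
    print(f"{name:22s} nodes={nodes:>11,d}  ebf~{ebf:.2f}")
```

On these numbers the regular version searches well over a thousand times fewer nodes than the full-width version to reach the same nominal depth.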

- Dan
chrisw
Posts: 4598
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: Unanswered question from Engine Origins forum

Post by chrisw »

dchoman wrote:
I thought this was an interesting idea, so I ran a simple test.... [...]

It is interesting to me that the fail high percentage drops so dramatically with the introduction of more reductions and pruning (first with null move, then with everything). I guess this means that without pruning and reductions, many nodes that are easy to refute with an obvious reply are left in the tree.... pruning and reductions selectively remove these nodes, so the evidence is that they are doing their job, although imperfectly as the first test clearly shows.

- Dan
I was mostly interested in whether the typical +100 Elo claims have any relation to reality, or whether they are just the result of incestuous testing of beasties operating in their own fantasy world, aka cul-de-sac (e.g. all pruning away the same valid stuff).

Anyway, your results indicate there's preferable main lines contained within the pruned part of the tree (ie thrown away by current paradigm programs) for a sufficiently substantial part of the time, enough to give the win rates you showed.

Not a criticism of your program in particular, but it seems computer chess has not left the bean counter paradigm.
dchoman
Posts: 171
Joined: Wed Dec 28, 2011 8:44 pm
Location: United States

Re: Unanswered question from Engine Origins forum

Post by dchoman »

chrisw wrote:
Anyway, your results indicate there's preferable main lines contained within the pruned part of the tree (ie thrown away by current paradigm programs) for a sufficiently substantial part of the time, enough to give the win rates you showed.
Yes, and for my program at depth=7, these preferable main lines that are missed with my pruning/reductions are worth about 300 elo. I wonder if the margin is as large for the top engines. I also wonder how it scales with depth.

If the gap is as large for top engines and/or scales up with depth, then there is indeed a significant path to improvement here by pruning/reducing more intelligently. I think we knew that (I am always trying to find ways to make better pruning/reducing decisions in my engine, and I am sure others are as well), but the margin was larger than I expected.

I was also surprised at just how well pruning/reducing does considering the mistakes it makes. When computation resources are finite (the restricted search time test), making these mistakes is much more than an even trade-off for pruning/reducing the many more truly bad moves in the tree.

As to your other question: is the strength of the top engines an illusion based on tests against engines of a similar design? I've read they do very well against human players, so that is one independent measure. But I guess we won't know for sure unless someone tries another design and develops it as fully.

- Dan
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: Unanswered question from Engine Origins forum

Post by wgarvin »

dchoman wrote:
chrisw wrote:
Anyway, your results indicate there's preferable main lines contained within the pruned part of the tree (ie thrown away by current paradigm programs) for a sufficiently substantial part of the time, enough to give the win rates you showed.
Yes, and for my program at depth=7, these preferable main lines that are missed with my pruning/reductions are worth about 300 elo. I wonder if the margin is as large for the top engines. I also wonder how it scales with depth.
Does it mean anything to say this is "worth about 300 elo"? It's a self-play result where the full-width searcher is looking at a lot more nodes than the selective one. And with fixed depth, either one might not be looking at the nodes it wants to look at because they're over the horizon. (I guess I'm saying, it seems to me that the search behaviour might not be the same in fixed-depth and non-fixed-depth searches. Fixed-depth introduces artifacts which the normal extensions, etc. are supposed to mitigate. But I'm not any kind of authority on search, I might just be wrong.)

Anyway, it seems to me that "worth about N elo" doesn't mean anything unless it's measured by playing a lot of non-fixed-depth games against a set of other engines, using ordinary time controls.
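
How many games count as "a lot" can be estimated with a normal approximation to the match score. A minimal sketch in Python; the function name `score_margin` and the 95% factor `z=1.96` are choices made for this illustration. The example numbers are the 40-game fixed-depth result quoted earlier in the thread (89% score with 18% draws, which implies roughly 80% wins).

```python
import math

def score_margin(games, win_frac, draw_frac, z=1.96):
    """Return (score, half-width of an approximate confidence interval).

    Each game is treated as an independent draw from win/draw/loss with
    values 1, 0.5 and 0, which is a simplifying assumption.
    """
    loss_frac = 1.0 - win_frac - draw_frac
    score = win_frac + 0.5 * draw_frac
    variance = (win_frac * (1.0 - score) ** 2
                + draw_frac * (0.5 - score) ** 2
                + loss_frac * (0.0 - score) ** 2)
    return score, z * math.sqrt(variance / games)

for games in (40, 400, 4000):
    score, margin = score_margin(games, win_frac=0.80, draw_frac=0.18)
    print(f"{games:5d} games: score {score:.2f} +/- {margin:.3f}")
```

The margin shrinks only with the square root of the number of games, so small Elo differences between close versions need far more games than a lopsided result like this one.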
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Unanswered question from Engine Origins forum

Post by bob »

wgarvin wrote:
dchoman wrote:
chrisw wrote:
Anyway, your results indicate there's preferable main lines contained within the pruned part of the tree (ie thrown away by current paradigm programs) for a sufficiently substantial part of the time, enough to give the win rates you showed.
Yes, and for my program at depth=7, these preferable main lines that are missed with my pruning/reductions are worth about 300 elo. I wonder if the margin is as large for the top engines. I also wonder how it scales with depth.
Does it mean anything to say this is "worth about 300 elo"? It's a self-play result where the full-width searcher is looking at a lot more nodes than the selective one. And with fixed depth, either one might not be looking at the nodes it wants to look at because they're over the horizon. (I guess I'm saying, it seems to me that the search behaviour might not be the same in fixed-depth and non-fixed-depth searches. Fixed-depth introduces artifacts which the normal extensions, etc. are supposed to mitigate. But I'm not any kind of authority on search, I might just be wrong.)

Anyway, it seems to me that "worth about N elo" doesn't mean anything unless it's measured by playing a lot of non-fixed-depth games against a set of other engines, using ordinary time controls.
Correct. Otherwise LMR will "cost" you about 300 Elo or so, if both versions search to the same depth, one with LMR and one without. But the one with LMR really will search much deeper in a normal time-constrained game, and you won't see that 300+ Elo loss.
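
For anyone unfamiliar with the mechanism, late move reductions (LMR) search moves that come late in the move ordering to a reduced depth first and only re-search them at full depth if they unexpectedly raise alpha. A minimal sketch in Python on a toy tree; the tree, `BRANCHING`, `LATE_MOVE_INDEX`, the one-ply reduction and all names are assumptions made up for the illustration, not any engine's actual settings. The toy tree has no real move ordering, so the node counts only show the mechanics and say nothing about Elo.

```python
import zlib

BRANCHING = 6        # assumed uniform branching factor of the toy tree
LATE_MOVE_INDEX = 3  # moves from this index on are treated as "late" (assumption)
INF = 10**9

def leaf_score(path):
    """Deterministic pseudo-random leaf score in [-100, 100], side to move's view."""
    return zlib.crc32(bytes(path)) % 201 - 100

def search(path, depth, alpha, beta, use_lmr, stats):
    """Negamax alpha-beta; with use_lmr, late moves are first tried one ply shallower."""
    stats["nodes"] += 1
    if depth <= 0:
        return leaf_score(path)
    best = -INF
    for i in range(BRANCHING):
        child = path + (i,)
        reduce_by = 1 if (use_lmr and depth >= 3 and i >= LATE_MOVE_INDEX) else 0
        score = -search(child, depth - 1 - reduce_by, -beta, -alpha, use_lmr, stats)
        if reduce_by and score > alpha:
            # The reduced search looked promising: verify at full depth.
            score = -search(child, depth - 1, -beta, -alpha, use_lmr, stats)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best

def count_nodes(depth, use_lmr):
    """Search the toy tree from the root and return (score, nodes visited)."""
    stats = {"nodes": 0}
    score = search((), depth, -INF, INF, use_lmr, stats)
    return score, stats["nodes"]

for use_lmr in (False, True):
    print("LMR" if use_lmr else "full", count_nodes(6, use_lmr))
```

At equal nominal depth the reduced search sees less of the tree, which is the fixed-depth "cost"; in a timed game the saved nodes are spent on extra depth instead.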
dchoman
Posts: 171
Joined: Wed Dec 28, 2011 8:44 pm
Location: United States

Re: Unanswered question from Engine Origins forum

Post by dchoman »

wgarvin wrote:
dchoman wrote:
chrisw wrote:
Anyway, your results indicate there's preferable main lines contained within the pruned part of the tree (ie thrown away by current paradigm programs) for a sufficiently substantial part of the time, enough to give the win rates you showed.
Yes, and for my program at depth=7, these preferable main lines that are missed with my pruning/reductions are worth about 300 elo. I wonder if the margin is as large for the top engines. I also wonder how it scales with depth.
Does it mean anything to say this is "worth about 300 elo"? It's a self-play result where the full-width searcher is looking at a lot more nodes than the selective one. And with fixed depth, either one might not be looking at the nodes it wants to look at because they're over the horizon. (I guess I'm saying, it seems to me that the search behaviour might not be the same in fixed-depth and non-fixed-depth searches. Fixed-depth introduces artifacts which the normal extensions, etc. are supposed to mitigate. But I'm not any kind of authority on search, I might just be wrong.)

Anyway, it seems to me that "worth about N elo" doesn't mean anything unless it's measured by playing a lot of non-fixed-depth games against a set of other engines, using ordinary time controls.
Actually, I didn't touch the extensions, so the only difference between the versions was that the modified version didn't prune or reduce any of the moves in the tree. Both versions were stopped after they completed the seventh iteration for each move in the game, so the nominal depth was 7, but individual lines could well be longer due to allowed extensions. The depth=7 iteration limit was so that the full-width searcher would not be limited by computational resources in this test and would be able to search all lines as deeply as the real version.

The ~300 elo difference implies to me that it might be possible to get up to this much back by pruning/reducing more intelligently in the real version... i.e. what would happen if we could make perfect pruning/reduction decisions so that only the truly bad moves were affected? The good moves would stay in the tree, so the real version would play better. So this is kind of a theoretical limit, at least for depth = 7 and my engine. Other depths may well be different, and other engines may be more intelligent about how they prune and reduce.

- Dan