Question for Bob Hyatt

diep · Post by **diep** » Tue Jan 19, 2010 11:36 pm

bob wrote:
lkaufman wrote:Programs with the Rybka search are readily identifiable; they don't change their mind very often, and they are "too fast" in getting thru the plies, unless the count is non-standard as in Rybka herself. Surely if any of the existing programs had the Rybka-like search (excluding the recent R3 derivatives) someone would have commented on this by now. I'm reasonably certain that none of the leading programs that I have (Stockfish, Naum, Deep Shredder 12, Fritz 12, Hiarcs 12, Zappa Mexico, and of course Doch) do anything like the Rybka search.

It seems that very few who have looked at the Strelka or Ippolit code have understood the search, and those who have have not commented much about it. Nowhere have I read "The idea of Rybka's search is XXX" and apparently it's not because those who know the idea are using it in their own engines. All I've read is the general observation that Rybka devotes more resources to the main line and less to finding new moves.
Some have looked pretty carefully, as they discussed the TT table search extension idea among others. But I do not know who has tried that idea or how it worked.

There are two angles to a search. (1) you can try to replace the PV by doing lots of work on non-PV branches, in an effort to find something that fails high. (2) you can try to search the PV more carefully, and force it to fail low so that you will automatically switch to a different move. Which is better is a guess. But there is something about (2) that leaves me feeling it is a bad idea, as you keep an OK pv and miss a better move, which is (IMHO) worse than finding a better move without over-searching the PV space. It just doesn't "feel right" to me. May be a great idea. And things like LMR and forward pruning tend to behave like that since later moves get reduced/pruned more. But still, there are probably limits to this approach.

Well Bob, you can prove quite easily that if your evaluation E is slightly inferior to evaluation E', then a program equipped with E that is just doing mainline checking is going to lose big time to E'.

Rybka *starts* losing 4.5 ply or more there, reductions not counted.

That is why Shredder so suddenly dropped out of the world top for a while (it is back now): it followed the same strategy.

Note that another effect of mainline checking is a kind of tunnel vision: you have a clear mind, see just one truth, and everything else gives a cutoff.

It is kind of similar to the Peter Gillgasch lemma that explains why Darkthought reached quite deep at the time.

Vincent

Albert Silver · Post by **Albert Silver** » Tue Jan 19, 2010 11:38 pm

lkaufman wrote:Programs with the Rybka search are readily identifiable; they don't change their mind very often, and they are "too fast" in getting thru the plies, unless the count is non-standard as in Rybka herself. Surely if any of the existing programs had the Rybka-like search (excluding the recent R3 derivatives) someone would have commented on this by now. I'm reasonably certain that none of the leading programs that I have (Stockfish, Naum, Deep Shredder 12, Fritz 12, Hiarcs 12, Zappa Mexico, and of course Doch) do anything like the Rybka search.

It seems that very few who have looked at the Strelka or Ippolit code have understood the search, and those who have have not commented much about it. Nowhere have I read "The idea of Rybka's search is XXX" and apparently it's not because those who know the idea are using it in their own engines. All I've read is the general observation that Rybka devotes more resources to the main line and less to finding new moves.

Check out Shredder 7.04 (non-Rybka like as per your description) and Shredder 8 (very Rybka-like with more plies, and rarely changing its mind). Note that I think Rybka 3 does this far less than 232a.

diep · Post by **diep** » Tue Jan 19, 2010 11:42 pm

Albert Silver wrote:
lkaufman wrote:Programs with the Rybka search are readily identifiable; they don't change their mind very often, and they are "too fast" in getting thru the plies, unless the count is non-standard as in Rybka herself. Surely if any of the existing programs had the Rybka-like search (excluding the recent R3 derivatives) someone would have commented on this by now. I'm reasonably certain that none of the leading programs that I have (Stockfish, Naum, Deep Shredder 12, Fritz 12, Hiarcs 12, Zappa Mexico, and of course Doch) do anything like the Rybka search.

It seems that very few who have looked at the Strelka or Ippolit code have understood the search, and those who have have not commented much about it. Nowhere have I read "The idea of Rybka's search is XXX" and apparently it's not because those who know the idea are using it in their own engines. All I've read is the general observation that Rybka devotes more resources to the main line and less to finding new moves.
Check out Shredder 7.04 (non-Rybka like as per your description) and Shredder 8 (more plies, and rarely changing its mind). Note that I think Rybka 3 does this far less than 232a.

I don't really feel 7 and 8 are comparable. 7.04 was released in February 2003, and version 8 was, if I remember well, somewhere around February 2004?

Shredder has a far bigger evaluation function, so the aggressive hard forward pruning in the last 7 plies, which you can basically only do with a tiny evaluation function, is far more complicated for Stefan to do.

Instead he seems to be more into the multicut world. It took me 6 months or so to conclude that multicut at bigger search depths loses Elo.

Not sure what he's doing nowadays in versions 11 and 12, the first two that seem automatically optimized to me, but I could be wrong. It's obvious Stefan did do it himself.

Vincent

bob · Post by **bob** » Tue Jan 19, 2010 11:43 pm

Albert Silver wrote:
lkaufman wrote:Programs with the Rybka search are readily identifiable; they don't change their mind very often, and they are "too fast" in getting thru the plies, unless the count is non-standard as in Rybka herself. Surely if any of the existing programs had the Rybka-like search (excluding the recent R3 derivatives) someone would have commented on this by now. I'm reasonably certain that none of the leading programs that I have (Stockfish, Naum, Deep Shredder 12, Fritz 12, Hiarcs 12, Zappa Mexico, and of course Doch) do anything like the Rybka search.

It seems that very few who have looked at the Strelka or Ippolit code have understood the search, and those who have have not commented much about it. Nowhere have I read "The idea of Rybka's search is XXX" and apparently it's not because those who know the idea are using it in their own engines. All I've read is the general observation that Rybka devotes more resources to the main line and less to finding new moves.
Check out Shredder 7.04 (non-Rybka like as per your description) and Shredder 8 (very Rybka-like with more plies, and rarely changing its mind). Note that I think Rybka 3 does this far less than 232a.

Some have even tried an offset A-B window. After getting the PV score, slightly raise the window to make it harder to change to a different move. This dates from the days when a program would search forever to find a move just 0.01 better than the previous best, and the fact that it was further down in the move list suggested it might have been a little worse from some perspective. Whether this is a good idea or not is unknown. I even experimented with it, but intentionally breaking alpha/beta seemed wrong at the time and I never kept such an idea.
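To make the offset-window idea concrete, here is a minimal sketch in Python on a toy game tree. It is not taken from any engine discussed in this thread: the nested-list tree format, the names `negamax` and `pick_root_move`, and the `margin` parameter are all assumptions made for illustration. After the first root move sets the PV score, every later root move is searched against a lower bound raised by `margin`, so it only replaces the PV move if it is better by more than that margin.

```python
from typing import List, Tuple, Union

# Toy tree (an assumption for illustration): a leaf is an int giving the static
# score from the point of view of the side to move at that leaf; an inner node
# is a list of child trees.
Tree = Union[int, List["Tree"]]

INF = 10**9


def negamax(node: Tree, alpha: int, beta: int) -> int:
    """Plain fail-soft alpha-beta in negamax form."""
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent will not allow this line
    return best


def pick_root_move(root_children: List[Tree], margin: int) -> Tuple[int, int]:
    """Return (index, score) of the chosen root move.

    The first move is searched with a full window. Each later move must beat
    the current best score by more than `margin` (hypothetical parameter) to
    replace it; margin=0 gives the ordinary behaviour.
    """
    best_index = 0
    best_score = -negamax(root_children[0], -INF, INF)
    for i, child in enumerate(root_children[1:], start=1):
        alpha = best_score + margin  # offset lower bound
        score = -negamax(child, -INF, -alpha)
        if score > alpha:
            best_index, best_score = i, score
    return best_index, best_score


if __name__ == "__main__":
    tree = [[20, 30], [25, 40]]  # move 0 is worth 20, move 1 is worth 25
    print(pick_root_move(tree, margin=0))
    print(pick_root_move(tree, margin=10))
```

With `margin=0` the slightly better second move is selected; with `margin=10` the search keeps the first move even though the second is better by 5, which is exactly the deliberate inexactness described above.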

Albert Silver · Post by **Albert Silver** » Tue Jan 19, 2010 11:46 pm

diep wrote:
Albert Silver wrote:
lkaufman wrote:Programs with the Rybka search are readily identifiable; they don't change their mind very often, and they are "too fast" in getting thru the plies, unless the count is non-standard as in Rybka herself. Surely if any of the existing programs had the Rybka-like search (excluding the recent R3 derivatives) someone would have commented on this by now. I'm reasonably certain that none of the leading programs that I have (Stockfish, Naum, Deep Shredder 12, Fritz 12, Hiarcs 12, Zappa Mexico, and of course Doch) do anything like the Rybka search.

It seems that very few who have looked at the Strelka or Ippolit code have understood the search, and those who have have not commented much about it. Nowhere have I read "The idea of Rybka's search is XXX" and apparently it's not because those who know the idea are using it in their own engines. All I've read is the general observation that Rybka devotes more resources to the main line and less to finding new moves.
Check out Shredder 7.04 (non-Rybka like as per your description) and Shredder 8 (more plies, and rarely changing its mind). Note that I think Rybka 3 does this far less than 232a.
I don't really feel 7 and 8 are comparable. 7.04 was released in February 2003, and version 8 was, if I remember well, somewhere around February 2004?

Shredder has a far bigger evaluation function, so the aggressive hard forward pruning in the last 7 plies, which you can basically only do with a tiny evaluation function, is far more complicated for Stefan to do.

Instead he seems to be more into the multicut world. It took me 6 months or so to conclude that multicut at bigger search depths loses Elo.

Not sure what he's doing nowadays in versions 11 and 12, the first two that seem automatically optimized to me, but I could be wrong. It's obvious Stefan did do it himself.

Vincent

I understand, but watch how their PVs look. 7.04 was similar to previous builds, at least in how it looked, and then came S8 which showed a significant increase in plies, though not in strength, and seemed to never ever change its mind.

jwes · Post by **jwes** » Tue Jan 19, 2010 11:58 pm

bob wrote:
Uri Blass wrote:
diep wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:One note. I believe the inflated piece values were a direct response to programs trading knight for 3 pawns and ending up in hopeless positions, and such. I did the "bad trade" idea in Crafty to avoid this, since the bad trade idea directly addresses the issue rather than indirectly thru modifying piece values.
I wonder to what extent it is that programs do not understand how to play with material differences, e.g. with 3 pawns vs. a piece, you need to use the pawns aggressively.
That is one thing that makes this tuning stuff so difficult. I remember many years ago that we simply could not come up with a scheme to handle some of the openings where the program would play g3/g6 and then Bg2/Bg7. The bishop is often critical, and trading it for a knight is generally not a good idea unless the knight is causing lots of problems where it stands. So we simply tuned the opening book to avoid such lines and did just fine (this was a Cray Blitz issue, by the way). Very early Crafty versions used the old CB book, but as I worked on king safety, slowly this problem went away. Yet the book avoided the Bg2 type positions and would instead go into something that became even more problematic.

Bottom line is that as the evaluation is modified, all terms suddenly become suspect. Sort of like optimizing for speed. As one peak gets driven down by optimizations you apply, others rise to take its place, and the process is actually never completed, just continually improved/refined...
It would be an interesting (and tedious) experiment to collect a few thousand relatively even positions with unbalanced material, e.g. N v PPP, and tune a version of Crafty specifically for those positions to see how much better it would play in those positions than regular Crafty.
I feel you've missed what has happened to Crafty over the past dozens of months.

With just 'a few positions' you aren't going to be able to approximate the millions of 'monte carlo type' datapoints Crafty has already been tuned to by means of millions of games.

Assuming you don't fix the chess knowledge but just tune parameters, you can already estimate that in the first few months of your experiment you will most likely manage to lose 200 Elo or so, not win anything.

Vincent
1) If you tune the parameters in your program only for a specific small set and are stupid enough to use these values everywhere, then you may lose Elo, but I think that you are unable to lose 200 Elo, or even only 80 Elo, by tuning your program to play well in many N v PPP positions.

2) If you tune your program for N vs PPP positions, then you do not need to use the values that you get everywhere.

You can have a preprocessor and use a different evaluation when the position on the board is of the N vs PPP type.

Uri
That idea absolutely is fraught with problems. You do not want to turn things on and off at the drop of a hat. Around that discontinuity, the search will create strange positions that take advantage of crossing that "boundary" when it is convenient, or introducing horizon-effect type moves to avoid crossing the boundary when it is bad.

That's the very issue that led to the various ways of interpolating the scores between MG and EG, currently in vogue.

My original idea was to try to determine whether some correction would be useful, not necessarily to add such code. If a significantly differently tuned evaluation could play significantly better in these situations, then something should be done. What exactly is another question.
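The MG/EG interpolation mentioned in the quoted text can be sketched as follows, in Python. This is an illustrative sketch, not code from Crafty or any other engine in this thread: the phase weights (1 per minor piece, 2 per rook, 4 per queen, 24 in total) are a commonly used convention that is assumed here, and the names `game_phase`, `tapered_score` and `switched_score` are hypothetical. The point is to contrast a hard switch between two evaluations with a blend that changes only a little when one piece comes off the board.

```python
# Assumed phase weights: a common convention, not taken from the thread.
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24  # 4 knights + 4 bishops + 4 rooks * 2 + 2 queens * 4


def game_phase(piece_counts: dict) -> int:
    """Map remaining non-pawn material (both sides) to 0 (endgame) .. 24 (opening)."""
    phase = sum(PHASE_WEIGHTS[p] * n for p, n in piece_counts.items()
                if p in PHASE_WEIGHTS)
    return min(phase, MAX_PHASE)


def tapered_score(mg: int, eg: int, phase: int) -> int:
    """Blend middlegame and endgame scores linearly by phase."""
    return (mg * phase + eg * (MAX_PHASE - phase)) // MAX_PHASE


def switched_score(mg: int, eg: int, phase: int, threshold: int = 12) -> int:
    """Hard on/off switch, shown only to illustrate the discontinuity."""
    return mg if phase >= threshold else eg


if __name__ == "__main__":
    mg, eg = 100, 20  # hypothetical scores for one evaluation term, in centipawns
    for phase in (13, 12, 11, 10):
        print(phase, switched_score(mg, eg, phase), tapered_score(mg, eg, phase))
```

Around the threshold the switched score jumps by the full difference between the two values after a single minor-piece trade, which is the kind of boundary a search can exploit; the tapered score moves by only a few points per phase step.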

lkaufman · Post by **lkaufman** » Wed Jan 20, 2010 1:55 am

The TT move extension is known to be in Stockfish 1.6, but that's a refinement in Rybka, not what really makes it unique. I agree with you that the extreme focus on the PV is counter-intuitive, and clearly the programming world agrees with you as no one else appears to be doing it aside from the Rybka-derivatives. But the fact is that Rybka 3 was nearly a class above the competition for a year or so, and it is surprising to me that she has found no followers.

lkaufman · Post by **lkaufman** » Wed Jan 20, 2010 2:11 am

According to what you are saying, a program with more knowledge and a conventional search should beat Rybka 3. Both Deep Shredder 12 and Stockfish 1.6 probably have more knowledge than R3 (I'm not sure, it's just my impression) but still lose nearly 2-1 to R3. In any case, can you propose some simple changes that could be made to a conventional program, let's say one with a small eval, that would make it behave like Rybka in its concentration on the PV? We could perhaps find out by actual testing whether such ideas help or hurt.

lkaufman · Post by **lkaufman** » Wed Jan 20, 2010 2:16 am

I think that those who understand the ideas in the Rybka code would be inclined either to use them or in some cases to talk about them. After all if you don't think an idea is worth using yourself, there's no harm in telling others about it. Of course some people are secretive, but others like to give away such secrets; why else would the clones be open-source?

lkaufman · Post by **lkaufman** » Wed Jan 20, 2010 2:24 am

"The parameter tuning of material seems a neural network to me. the parameter tuning of the other parameters seems more thoroughly tested". Wow, I've never been called a "neural network" before, but I guess it's a compliment. All evaluation parameters (material and positional) in Rybka 3 were assigned by my intuition originally and then tuned by me for optimum results at roughly game in one second levels (not exactly the long time controls you thought)! It is rather miraculous that they work as well as they do in Rybka 3 at long time controls. It is impossible to tune evaluation features at long time controls, as most features are worth just one or two Elo points and it takes something on the order of 50,000 games or so to prove such small improvements.
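A rough back-of-the-envelope check of that order of magnitude, as a Python sketch. The assumptions are mine, not from the post: the usual logistic Elo formula, two nearly equal opponents, independent games, a fixed draw ratio, and a requirement that the 95% confidence half-width on the match score be no larger than the edge being measured. The function names are hypothetical.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score per game under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff: float, draw_ratio: float = 0.0, z: float = 1.96) -> int:
    """Games needed so that z standard errors of the mean score equal the edge.

    For near-equal opponents, wins and losses each occur with probability
    (1 - draw_ratio) / 2, so the per-game score variance is
    0.25 * (1 - draw_ratio).
    """
    edge = expected_score(elo_diff) - 0.5
    sigma = math.sqrt(0.25 * (1.0 - draw_ratio))
    return math.ceil((z * sigma / edge) ** 2)


if __name__ == "__main__":
    for draws in (0.0, 0.5):
        print(f"2 Elo, draw ratio {draws}: about {games_needed(2, draws)} games")
```

Under these assumptions a 2 Elo improvement needs somewhere between several tens of thousands and roughly a hundred thousand games depending on the draw ratio, which is consistent with the figure of about 50,000 and with why such tuning is only practical at very fast time controls.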
