Questions for the Stockfish team

bob · Post by **bob** » Wed Jul 21, 2010 6:01 pm

Chan Rasjid wrote:
Chan Rasjid wrote:
bob wrote:
Chan Rasjid wrote:The Beal Effect :-
I am not going to continuously argue a point that has already been explained in detail and verified by others, dating back as far as 20 years.
...
First disclaimer: I have not gone into the detailed explanations of the Beal Effect given here. My gut intuition just cannot accept a random move generator engine having any Elo from randomness alone.

My question is: if everything search() returns is replaced with random(), is there any Beal Effect?
Code:
// for all search()
x = search();
x = random();
I withdraw my question if search() is allowed to retain draws, mate scores etc that are real and part of search().

Rasjid
Search is unmodified, so yes, it recognizes draws and mates. The random eval is a simple approximation to mobility. You let the search choose among the random scores, and at any point in the tree, the more moves you have, the greater the probability you will get a good random score to back up, and vice-versa...
I think it is the "Beal Effect" - at any point in the tree, the more moves you have, the greater the probability you will get a good random score to back up.

I don't think deeply, but I do seem to accept it will be a crude mobility evaluator - probably even when all draws, mates are also replaced by random numbers.

I think Daniel Shawul was not given this concise explanation (here) earlier.

Rasjid
Unfair disadvantage to Daniel, as he was not given this explicit explanation earlier. But still I cannot accept Elo 1800, which is near a sane human player. Say dumb at Elo 800.

Rasjid

I don't remember the thread, but someone has a huge rating list that included Crafty skill=1. And it had slowly moved up to the 1750+ range on that list. They posted a complaint here. After I looked into it, I discovered that it was playing far stronger than intended. Unfortunately, when I tested the skill command to see how it worked, I only went down to skill 50, because that put Crafty 400 points below the weakest program I had to test against, and measuring below that is way inaccurate. It turned out that below 50, things flattened out quickly, when I expected them to plummet to near zero at 1. Didn't happen, and this was many games against a big variety of opponents, to boot, when the original complaint was posted in CCC.
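
For readers who want the mobility argument quoted above in concrete terms: if each of n legal moves gets an independent score drawn uniformly from [0, 1), the expected best score is n/(n+1), so a node with more moves tends to back up a better number. The sketch below checks that with a small simulation. The uniform distribution, the function names, and the trial count are assumptions for illustration only; none of this is taken from Crafty.

```python
import random


def expected_best(n):
    """Closed-form mean of the maximum of n independent U(0, 1) scores."""
    return n / (n + 1)


def simulated_best(n, trials=20000, seed=1):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Back up the best of n random "evaluations", as a 1-ply search would.
        total += max(rng.random() for _ in range(n))
    return total / trials


if __name__ == "__main__":
    for n in (1, 5, 20, 40):
        print(n, round(expected_best(n), 3), round(simulated_best(n), 3))
```

This only covers a single node. It says nothing by itself about playing strength or Elo.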

Gerd Isenberg · Post by **Gerd Isenberg** » Wed Jul 21, 2010 6:02 pm

Daniel Shawul wrote:
The Beal Effect :-
I am just hoping for some decent explanation other than its name

I think in negamax, the random function for "heuristic" leaf nodes had better be symmetric around zero, e.g. 200*rand() - 100, with respect to game-theoretic scores and draw scores from terminal nodes, considering that trees may not be uniform due to extensions, reductions, etc., so that heuristic leaves may appear at different distances from the root, not to mention iterative deepening (ID).

Don F. Beal and Michael C. Smith (1994) Random Evaluations in Chess

This paper reports experiments using random numbers as "evaluations" in chess. Although this results in random play with a depth-1 search, it is found that strength of play rises rapidly when the depth of lookahead is increased. This counter-intuitive result is discussed and its implications for game-playing are given.
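
A minimal sketch of the idea, assuming a toy tree rather than a chess engine: leaves get a random score symmetric around zero (the 200*rand() - 100 suggested above), and plain negamax backs the scores up. The tree encoding (a node is a list of children, an empty list is a heuristic leaf) and all function names are hypothetical. There are no terminal nodes, extensions, or reductions here.

```python
import random


def random_eval(rng):
    """Heuristic leaf score, symmetric around zero: 200*rand() - 100."""
    return 200.0 * rng.random() - 100.0


def negamax(node, rng):
    """Score `node` for the side to move.

    A node is a list of child nodes; an empty list is a heuristic leaf.
    """
    if not node:
        return random_eval(rng)
    return max(-negamax(child, rng) for child in node)


def best_move(root, rng):
    """Index of the root move with the highest backed-up score."""
    scores = [-negamax(child, rng) for child in root]
    return scores.index(max(scores))


def restricting_move_rate(replies=20, trials=2000, seed=1):
    """How often a 2-ply search prefers the move that leaves the opponent
    one reply over the move that leaves the opponent `replies` replies."""
    rng = random.Random(seed)
    root = [[[]], [[] for _ in range(replies)]]
    picks = sum(best_move(root, rng) == 0 for _ in range(trials))
    return picks / trials
```

With 20 replies, the single-reply move should be picked about 20/21 of the time, because it loses only when its one leaf is the lowest of 21 independent draws. In this toy the search is steered toward restricting the opponent's choices even though every leaf score is noise.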

Chan Rasjid · Post by **Chan Rasjid** » Wed Jul 21, 2010 6:08 pm

Aha !

Very good - Crafty will soon overtake Stockfish because the results show a huge bug in Crafty somewhere - just locate where and what it is

Rasjid

bob · Post by **bob** » Wed Jul 21, 2010 6:11 pm

Chan Rasjid wrote:Aha !

Very good - Crafty will soon overtake Stockfish because the results show a huge bug in Crafty somewhere - just locate where and what it is

Rasjid

This isn't a bug.

But it is a pain in the a** for the skill command.

bob · Post by **bob** » Wed Jul 21, 2010 6:16 pm

Gerd Isenberg wrote:
Daniel Shawul wrote:
The Beal Effect :-
I am just hoping for some decent explanation other than its name
I think in negamax, the random function for "heuristic" leaf nodes had better be symmetric around zero, e.g. 200*rand() - 100, with respect to game-theoretic scores and draw scores from terminal nodes, considering that trees may not be uniform due to extensions, reductions, etc., so that heuristic leaves may appear at different distances from the root, not to mention iterative deepening (ID).

Don F. Beal and Michael C. Smith (1994) Random Evaluations in Chess

This paper reports experiments using random numbers as "evaluations" in chess. Although this results in random play with a depth-1 search, it is found that strength of play rises rapidly when the depth of lookahead is increased. This counter-intuitive result is discussed and its implications for game-playing are given.

That's it. The thing that slipped by me is that our depth has steadily been increasing, to a point far beyond what was doable in 1994. His "rises rapidly" was prophetic beyond anything he might have imagined.

The other issue is certainly reductions: because they so drastically reduce the size of the subtrees, they make it hard to produce good scores from random numbers. For that reason I turned everything off at skill 1 and yet the thing still plays way better than I want...

Daniel Shawul · Post by **Daniel Shawul** » Wed Jul 21, 2010 6:23 pm

You said random evaluation at first, and then you started bringing in
order, first with 0.01 * real eval, which I strenuously objected to;
then you said eval of white = -eval of black, which further breaks the random nature of the eval, period.
I will not try to convince anyone further. Anyone interested in my position can read all the issues
I raised here http://talkchess.com/forum/viewtopic.ph ... 66&t=35455 from the perspective of random eval and draw their own conclusions.
It really doesn't help if you post voluminous game results with a different setup than what was discussed.
This is basically a strawman argument from you, which neglects the completely random evaluation criterion
you originally proposed.

They say insanity is doing the same thing over and over again and expecting different results.
I say it is expecting a consistent miracle from a random event.

Ralph Stoesser · Post by **Ralph Stoesser** » Wed Jul 21, 2010 6:33 pm

bob wrote: I find it irritating that it is just as complicated to make the program play poorly as it is to make it play well...

Is it really that hard?

0.01 * (-1) * real eval + 0.99 * random()
0.02 * (-1) * real eval + 0.98 * random()
0.03 * (-1) * real eval + 0.97 * random()
... etc.

does not work? Why?

If the ELO difference between

pure random() and 0.01 * real eval + 0.99 * random()

is small, then the ELO difference between

pure random() and 0.01 * (-1) * real eval + 0.99 * random()

should also be small.
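
The blend above can be sketched as a single function. This is only an illustration of the formula as posted: the scale of random() is not given in the thread, so `noise_range` (the half-width of the random term, in the same units as the real evaluation) is an assumption, as are the function and parameter names.

```python
import random


def blended_eval(real_eval, weight, rng, negate=False, noise_range=100.0):
    """weight * (+/-)real_eval + (1 - weight) * random().

    `noise_range` is an assumed scale for the random term; the post does
    not specify one. `negate=True` gives the (-1) * real eval variant.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    sign = -1.0 if negate else 1.0
    noise = rng.uniform(-noise_range, noise_range)
    return weight * sign * real_eval + (1.0 - weight) * noise


if __name__ == "__main__":
    # Same seed for both variants, so only the sign of the real term differs.
    plain = blended_eval(300.0, 0.01, random.Random(7))
    inverted = blended_eval(300.0, 0.01, random.Random(7), negate=True)
    print(plain - inverted)
```

For the same random draw, the two variants differ by 2 * weight * real_eval. At weight 0.01 and a real evaluation of 300, that is a gap of about 6 against a noise term spanning roughly plus or minus 99 under the assumed scale.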

Daniel Shawul · Post by **Daniel Shawul** » Wed Jul 21, 2010 6:35 pm

Gerd,
My objection was to a completely random evaluation, which I tried to outline as well as I could here http://talkchess.com/forum/viewtopic.ph ... 66&t=35455. Now we have come to an apparent consensus on how the score of one side should be negated for the other side for minimax to work... This was originally absent from his reply to me, but he somehow expected me to understand it, even after giving me pseudocode for how to do it.

I conceded the point that it does some weird kind of mobility evaluation the minute Marco posted it. But I pointed out how bad that eval is and how one-sided it is, completely disregarding the perfect-information game assumption. The supposed engine evaluates like 'poker', as if it can't see what the opponent has to offer. It just evaluates its own mobility and goes on... See points c & d of my post in the link above.

Did they (Don Beal) say an 1800 Elo engine can be constructed this way? Even he (Bob) himself didn't believe it when people first told him it plays like 1800. He thought it played like 800 (he said so in this thread, of course). I can't say what Crafty does or does not do; that is why I am sticking to what he says about the effect with random eval, and I am definitely not getting an 1800 Elo engine.

Daniel

bob · Post by **bob** » Wed Jul 21, 2010 7:09 pm

Ralph Stoesser wrote:
bob wrote: I find it irritating that it is just as complicated to make the program play poorly as it is to make it play well...
Is it really that hard?

0.01 * (-1) * real eval + 0.99 * random()
0.02 * (-1) * real eval + 0.98 * random()
0.03 * (-1) * real eval + 0.97 * random()
... etc.

does not work? Why?

If the ELO difference between

pure random() and 0.01 * real eval + 0.99 * random()

is small, then the ELO difference between

pure random() and 0.01 * (-1) * real eval + 0.99 * random()

should also be small.

In Crafty, skill 70 drops elo by almost exactly 200. skill 50 drops it by another 200. But it won't go below about 1800 even at skill 1. That is the problem...

The "Beal effect" is causing a strength issue that prevents going further. I have one solution. I am now half-way thru a set of tests that simply shrink the range of the random numbers, which so far looks reasonable, although I have yet to get to the skill 1 test, which is the key.
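
One possible reading of "shrink the range", sketched on a toy and not taken from Crafty: if scores are integers, a narrower random range produces more ties, so the search has less to discriminate with. The two-move setup, the strict-inequality tie handling, and every name below are assumptions for illustration. The numbers it produces say nothing about Elo.

```python
import random


def prefers_restricting_move(half_range, replies, rng):
    """Depth-2 toy with integer scores in [-half_range, +half_range].

    Move A leaves the opponent one reply, move B leaves `replies` replies.
    Returns True only if A's backed-up score is strictly higher than B's.
    """
    score_a = rng.randint(-half_range, half_range)
    # The opponent picks the reply that is worst for us, hence min().
    score_b = min(rng.randint(-half_range, half_range) for _ in range(replies))
    return score_a > score_b


def preference_rate(half_range, replies=20, trials=5000, seed=1):
    """Fraction of trials in which the restricting move is strictly preferred."""
    rng = random.Random(seed)
    wins = sum(prefers_restricting_move(half_range, replies, rng)
               for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    for half_range in (100, 10, 1, 0):
        print(half_range, preference_rate(half_range))
```

At half_range 0 every score is identical, so the toy never strictly prefers either move and the choice falls to move ordering.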

bob · Post by **bob** » Wed Jul 21, 2010 7:11 pm

Daniel Shawul wrote:Gerd,
My objection was to a completely random evaluation, which I tried to outline as well as I could here http://talkchess.com/forum/viewtopic.ph ... 66&t=35455. Now we have come to an apparent consensus on how the score of one side should be negated for the other side for minimax to work... This was originally absent from his reply to me, but he somehow expected me to understand it, even after giving me pseudocode for how to do it.

No, we have _not_ come to a consensus about it needing to be negated. As I have already clearly explained, just a pure random number has the same characteristic.

I conceded the point that it does some weird kind of mobility evaluation the minute Marco posted it. But I pointed out how bad that eval is and how one-sided it is, completely disregarding the perfect-information game assumption. The supposed engine evaluates like 'poker', as if it can't see what the opponent has to offer. It just evaluates its own mobility and goes on... See points c & d of my post in the link above.

Did they (Don Beal) say an 1800 Elo engine can be constructed this way? Even he (Bob) himself didn't believe it when people first told him it plays like 1800. He thought it played like 800 (he said so in this thread, of course). I can't say what Crafty does or does not do; that is why I am sticking to what he says about the effect with random eval, and I am definitely not getting an 1800 Elo engine.

Daniel

You say you are not getting an 1800 Elo engine. How, exactly, did you test something to come to this conclusion? I just completed almost 1/2 million games to test what _I_ am getting...
