Questions for the Stockfish team

BubbaTough · Post by **BubbaTough** » Wed Jul 21, 2010 4:59 pm

bob wrote: As I said, the scores my eval will produce, using pure random numbers, range from -100 to +100. -100 is good for black, +100 is good for white.

Only up to 100? Assuming integers, that is not much variety for millions of nodes to choose from. The whole thing would make sense to me if you were using a much larger range. I find it peculiar that you are seeing much effect here without one, but if you are, you are. I wonder how using smaller or larger ranges affects things (I thought I knew, but apparently my instinct is off).

-Sam

bob · Post by **bob** » Wed Jul 21, 2010 5:04 pm

Daniel Shawul wrote: Tic-tac-toe, really? You don't need random evaluation for that, as you can
get many WDL results just by searching. This has already been mentioned in this thread, in fact.
That effect is not related to the random eval at all.

Answer me these questions.

a) How is minimax going to work with just an eval() that returns a
positive number for everything? It is really a simple question.
This breaks the essence of minimax, as is clearly outlined in the
wiki page I linked to.

So if we start from a position where white is up a pawn, minimax is broken, since all evaluations will be positive? "Zero sum" does not require that "equal" is exactly zero; it only requires that the score indicate the difference between what is good for white and what is good for black.

This is not about the "score". It is about the "randomness of the score."

Simple example, for the last time. Let's do only a 2-ply search, since search is recursive anyway. I make a move. You have 20 possible replies. You therefore produce 20 random numbers and, since you are the min side, you choose the smallest and back it up. That is the score for my first move. My next move checks you, leaving you just 2 legal moves. You produce two random scores, but with just two, the probability of getting a small one is not very good. So you choose the smaller of the two and return that. That is better for me and I make that my best move and score. The more moves you have in reply to one of my moves, the smaller the "score" you will return because you get more chances to get a small one. The fewer moves you have, the greater the probability you won't get a small number, which makes you return a larger score. This happens at _every_ node in the tree. At any node M, the more moves black has (assuming white to move at the root) the greater the probability he will get to choose a small score. The fewer moves he has, the greater the probability he will be forced to choose a large score. The same is true for any node where it is white to move, except that white will want to choose larger scores, which is easier if he has more moves to choose from.
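
The two-reply versus twenty-reply argument can be checked numerically. What follows is a minimal sketch, not code from any engine: `expected_min` is a hypothetical helper that assumes every reply gets an independent integer score drawn uniformly from 0..range, and it computes the exact expected value of the smallest of those scores.

```c
#include <assert.h>

/* Exact expected value of the smallest of n independent scores, each drawn
   uniformly from the integers 0..range. Hypothetical helper for illustration;
   it is not taken from Crafty or any other engine. */
double expected_min(int n, int range)
{
    assert(n > 0 && range > 0);

    double total = 0.0;
    for (int k = 1; k <= range; k++) {
        /* Probability that a single draw is >= k. */
        double p_one = (double)(range + 1 - k) / (double)(range + 1);

        /* Probability that all n draws are >= k, i.e. that the minimum is >= k. */
        double p_all = 1.0;
        for (int i = 0; i < n; i++)
            p_all *= p_one;

        /* For a non-negative integer X, E[X] = sum over k >= 1 of P(X >= k). */
        total += p_all;
    }
    return total;
}
```

Under these assumptions `expected_min(2, 100)` comes out at roughly 33, while `expected_min(20, 100)` is below 5, so the side left with only two replies backs up a noticeably larger score on average.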

You are hung up on the score itself. That is not the issue here. As far as material goes, if you hang your queen and I take it, your mobility goes down and my chances for a larger score go up. If I hang my queen, my mobility goes down and that hurts my chances to extract one of those large numbers.

It really is that simple, and it really does work. And all your hand-waving, "I don't believe", "I can't see how" and such is not helping this discussion one bit. As I said, this is _not_ conjecture.

b) The mobility eval is done for one side only. But chess is a game of perfect information:
http://en.wikipedia.org/wiki/Perfect_information
And yet we are doing a gross mobility evaluation for one side only, which disregards the
mobility of its opponent.

c) The supposed mobility evaluation that it brings is very rough, to say the least.
Again, for reasons I have already mentioned in this thread.

These are two violations of chess game tree search theory in a row, plus one so-so positional evaluation.
You must understand why I have difficulty accepting that one can get an 1800 Elo engine out of this mess.

Once again, that doesn't mean a thing. Whether you accept this or not is completely irrelevant. It is a simple fact that anyone can measure. But playing one game is not going to do the trick. Feel free to try Crafty; 23.2 is publicly available.

bob · Post by **bob** » Wed Jul 21, 2010 5:05 pm

michiguel wrote:
bob wrote:
Joost Buijs wrote: I do understand that with an infinite depth you don't need eval at all. With a perfect evaluation function a 1-ply search will be sufficient as well. This is just theoretical.

It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level its Elo will go up.
OK, some background. It turns out that if you replace Crafty's evaluation with a pure random number, it plays well above 2,000 Elo. If you disable all the search extensions, reductions, null-move and such, you still can't get it below 1800. There has been a long discussion about this, something I call "The Beal Effect" since Don Beal first reported on this particular phenomenon many years ago. So a basic search + random eval gives an 1800 player. Full search + full eval adds 1,000 to that. How much from each? Unknown. But I have watched many, many Stockfish vs Crafty games and the deciding issue does not seem to be evaluation. We seem to get hurt by endgame search depth more than anything...
And that is where most (all?) engines had the biggest holes in evaluation... endgame!

Miguel

I have never heard _anyone_ say that Crafty's endgame evaluation is poor. In fact, several GM players have said exactly the opposite. Most ignore candidate passed pawns and such. We don't.

My comment was based on _watching_ games, where we get out-searched and then end up losing something tactically. Not positionally.

Daniel Shawul · Post by **Daniel Shawul** » Wed Jul 21, 2010 5:09 pm

Exactly what test did you run and what were the results for how many games? Just playing one game does not make this "light years different." Dann is playing games using the skill version of Crafty and is seeing just as much unusual strength as I am.

Well, he is playing with something else. You said random eval and in fact wrote pseudocode for it.

No. I mean this:

int Evaluate() {
  return (random());
}

Another random quote from you

I don't know what your "completely random" comment means, but I have tested (and just did it again) with pure random scores. Just take Crafty, and right at the top of evaluate.c return 100 * random_generator() (assuming random_generator() returns a float 0.0 <= N < 1.00). Then you won't be guessing.

Et cetera.

Why did you suddenly decide to differentiate by side to move? I mean, it is really difficult to
argue when your words change with every reply.

Chan Rasjid · Post by **Chan Rasjid** » Wed Jul 21, 2010 5:09 pm

The Beal Effect :-

I am not going to continuously argue a point that has already been explained in detail and verified by others, dating back as far as 20 years.
...

First disclaimer, I have not gone into the detailed explanations here about the Beal Effect. My gut intuition just cannot accept a random move generator engine having any Elo from only randomness.

My question is: if every value search() returns is replaced with random(), is there any Beal Effect?

// for all search()
x = search();
x = random();

I withdraw my question if search() is allowed to retain draws, mate scores etc. that are real and part of search().

Rasjid

bob · Post by **bob** » Wed Jul 21, 2010 5:13 pm

Daniel Shawul wrote: Your words, not mine. Re-read your first reply to me:
No. I mean this:

int Evaluate() {
  return (random());
}

Is there anything like (side == white) ? rand() : -rand() in there? NO.
That is what I tried in my engine when you said random eval... and probably
what Tord did too. Always giving a positive score to one side and a negative score to the other is, of course, NOT random.
You just inserted order out of nowhere. Not surprised there at all.

Please continue reading. 23.2 does as above. It takes the normal eval, reduces it by multiplying by .01, then adds 0.99 * random(), where random returns a number between 0 and 100. This replaces the normal score, which is then negated.
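
The blend described above can be sketched as follows. This is only an illustration of the arithmetic, not Crafty's actual source: `blended_score` and `side_to_move_score` are hypothetical names, the random value is passed in as a parameter so the result is reproducible, and the 1% / 99% mix is done in integer arithmetic to avoid floating-point rounding.

```c
#include <assert.h>

/* 1% of the real evaluation plus 99% of a random value in 0..100,
   i.e. 0.01 * normal_eval + 0.99 * rand_0_100, truncated toward zero.
   Hypothetical sketch; the caller supplies the random value. */
int blended_score(int normal_eval, int rand_0_100)
{
    assert(rand_0_100 >= 0 && rand_0_100 <= 100);
    return (normal_eval + 99 * rand_0_100) / 100;
}

/* The score is computed as if white is to move, then negated when it is
   black to move, so the search sees it from the side to move's view. */
int side_to_move_score(int white_score, int white_to_move)
{
    return white_to_move ? white_score : -white_score;
}
```

With this weighting a three-pawn advantage (300 centipawns) contributes only 3 points, so the random term dominates the result almost completely.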

But if you just return "random()" this still works _perfectly_. Just because you won't take the time to understand why doesn't mean it won't work. It just means you don't understand it. You somehow miss the idea of going down through the tree: at any node where you have a large number of potential moves to choose from, you have a greater chance of finding a score good for you, and when you have a small number of moves to choose from, you have a smaller chance of finding a score good for you. And this works at _every_ node in the tree. Min is trying to find the smallest and hates big scores; max is trying to find the largest and hates small scores. It is mobility for _both_ sides. Positions with equal mobility will have a greater chance of returning a score right in the middle of the range; an imbalance in mobility shifts this.

So, one more time, you can use what I do in 23.2, which will negate the random numbers if it is BTM (since score is always computed as if WTM). But if you just return a pure random number it doesn't change a thing. BTM always wants small numbers and the more moves he has, the greater the chance of returning one of those, WTM always wants big numbers and the more moves he has... etc...

Daniel Shawul · Post by **Daniel Shawul** » Wed Jul 21, 2010 5:13 pm

The Beal Effect :-

I am just hoping for some decent explanation other than its name

bob · Post by **bob** » Wed Jul 21, 2010 5:16 pm

BubbaTough wrote:
bob wrote: As I said, the scores my eval will produce, using pure random numbers, range from -100 to +100. -100 is good for black, +100 is good for white.
Only up to 100? Assuming integers, that is not much variety for millions of nodes to choose from. The whole thing would make sense to me if you were using a much larger range. I find it peculiar that you are seeing much effect here without one, but if you are, you are. I wonder how using smaller or larger ranges affects things (I thought I knew, but apparently my instinct is off).

-Sam

What would be the point of a larger range? The random numbers are uniformly distributed. The range is pretty much irrelevant; you just want enough choices to differentiate between typical mobility extremes. 0 to 100 moves covers almost all cases well enough. I tried other ranges when developing this and found absolutely no benefit. A small range also tends to make the tree a bit smaller, since there are fewer distinct scores and more "equals", which allow pruning.
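
One way to see why the range barely matters, as a sketch that assumes the scores are continuous and uniform on [0, R] rather than the integers actually used: with n moves to choose from, the expected scores backed up by the minimizing and maximizing sides are

```latex
\begin{align*}
P(\min > x) &= \left(1 - \frac{x}{R}\right)^{n}, \qquad 0 \le x \le R \\
E[\min] &= \int_0^R \left(1 - \frac{x}{R}\right)^{n} \, dx = \frac{R}{n+1} \\
E[\max] &= R - \frac{R}{n+1} = \frac{nR}{n+1}
\end{align*}
```

R only scales both results, so the way the backed-up score depends on the number of moves n is the same for any range. With integer scores, a small range mainly produces more ties.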

bob · Post by **bob** » Wed Jul 21, 2010 5:18 pm

Chan Rasjid wrote: The Beal Effect :-
I am not going to continuously argue a point that has already been explained in detail and verified by others, dating back as far as 20 years.
...
First disclaimer, I have not gone into the detailed explanations here about the Beal Effect. My gut intuition just cannot accept a random move generator engine having any Elo from only randomness.

My question is: if every value search() returns is replaced with random(), is there any Beal Effect?
// for all search()
x = search();
x = random();
I withdraw my question if search() is allowed to retain draws, mate scores etc. that are real and part of search().

Rasjid

Search is unmodified, so yes, it recognizes draws and mates. The random eval is a simple approximation to mobility. You let the search choose among the random scores, and at any point in the tree, the more moves you have, the greater the probability you will get a good random score to back up, and vice-versa...
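
A minimal sketch of that arrangement, assuming a plain negamax over a hand-built tree: `Node`, `negamax`, `random_eval` and `MATE` are hypothetical names for illustration, not Crafty's code. Game-ending positions keep their real scores, and only the leaves at the search horizon get a random value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MATE 32000   /* assumed mate score, far outside the 0..100 noise range */

typedef struct Node {
    const struct Node *children;  /* positions reachable by the side to move */
    int n_children;               /* 0 means the side to move has no legal move */
    int in_check;                 /* only consulted when n_children == 0 */
} Node;

typedef int (*EvalFn)(const Node *);

/* Leaf evaluation that ignores the position and returns noise in 0..100. */
int random_eval(const Node *node)
{
    (void)node;
    return rand() % 101;
}

/* Plain negamax: scores are always from the side to move's point of view. */
int negamax(const Node *node, int depth, int ply, EvalFn eval)
{
    assert(node != NULL && depth >= 0);

    /* Real results come from the search itself, never from the eval:
       no legal moves is checkmate (if in check) or stalemate (a draw). */
    if (node->n_children == 0)
        return node->in_check ? -(MATE - ply) : 0;

    if (depth == 0)
        return eval(node);        /* horizon leaf: the only place noise enters */

    int best = -MATE;
    for (int i = 0; i < node->n_children; i++) {
        int score = -negamax(&node->children[i], depth - 1, ply + 1, eval);
        if (score > best)
            best = score;         /* more children means more chances at a good score */
    }
    return best;
}
```

Because a mate score lies far outside the random range, a move that delivers mate is always preferred over any randomly scored alternative, whatever the noise happens to be.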

bob · Post by **bob** » Wed Jul 21, 2010 5:19 pm

Daniel Shawul wrote:
The Beal Effect :-
I am just hoping for some decent explanation other than its name

It has been given repeatedly, by more than one person. Or you could just look up the paper and read it. You can find the citation on the ICGA web site if you look for Beal.
