Questions for the Stockfish team

Discussion of chess software programming and technical issues.

Moderator: Ras

Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Result 10000 - 0

Post by Daniel Shawul »

Gerd,
you are absolutely right. That is what I have been saying from the start.
I guess some people just don't listen. I think it is now time for me to leave this discussion as it fast going down the flame road..
cheers.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Result 10000 - 0

Post by AlvaroBegue »

Gerd Isenberg wrote:
AlvaroBegue wrote: Another thing I wanted to comment (although it's in response to someone else in the thread) is that thinking of alpha-beta windows when trying to understand the Beal effect is probably not very helpful. The result of the search is guaranteed to be the same as what you would get with minimax.
I guess it was me, with alpha-beta broken versus pure minimax, both with same random eval. This thread has become too huge ;-)
It was mostly in response to Chan Rasjid's attempt to analyze what makes the Beal effect happen.
I am not quite sure, but assuming random() in eval is independent from the position, i.e. not based on hashkey (otherwise you are right), but deterministic (after a same seed) pseudo random number generator with a huge cycle.
Yes, I am thinking of this differently. I imagine the scores at the leaves are probability distributions. Minimax will propagate these distributions back to the root. One tree search will result in a score at the root that is one sampling in the minimaxed distribution. In this I think of the pseudo-random number generator as a method to get samples, and I assume it's as good as true random numbers. If your pseudo-random number generator is good enough (and by default rand() in gcc is good enough), the analysis is valid.
Since the number of leaves in alpha-beta is much lower than in minimax, their sequential eval calls on a uniform depth tree will provide only a small subset of the pure minimax leaf numbers. Same leaf-positions except the most left will have different scores, and hence will likely produce another backup score at the root. Not?
Yes, you will backup a different score, but its picked from the same random distribution, so it will behave comparably.
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Result 10000 - 0

Post by Chan Rasjid »

Daniel Shawul wrote:Maybe a little bit of Minimax will do you good. It is appropriate only for zero sum games where the scores are negated from ply to ply. Also chess is a perfect information game so the evaluation should be done for both at every ply. If you evaluate white's mobility at ply 4, black's at ply 5 etc etc alpha-beta is not going to work.
Since are completely ignoring the other's sides eval, there is no way eval could agree at ply n and n+1. Either you have to do an even or odd ply search so that the tips fail always at white to move or black to move. Can't mix.
I suggest you dig in a bit here
http://en.wikipedia.org/wiki/Minimax
http://en.wikipedia.org/wiki/Perfect_information

Alpha-beta has an inherent odd-even effect. And this evaluation of one side only at a ply thing could add more of this IMHO.
http://chessprogramming.wikispaces.com/Odd-Even+Effect

You don't believe me, that is fine. Look at the way it plays in the games I provided. I really don't want to waste time here.
Of course I believe if you are right and I can understand you are right!

OK, now I understand the Beal effect probably has something more than what I understand or misconstrued. You could be right that there is a one-sided Beal mobility - only for the side on move in usual alpha-beta implementation. So it is possible the Beal mobility effect is garbage!

I have to think a little.

Rasjid
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Result 10000 - 0

Post by bob »

Chan Rasjid wrote:Daniel,

Your test might be invalid. If you use random numbers between 0 and 1000, there could be no Beal Effect!

The Beal Effect is equivalent to this:
For N random numbers and any random number alpha, the probability that one of the N numbers be greater than alpha increases with N.


In your test, at every node with N moves and bounds (alpha, beta), beta has a probabilty > 0.5 that beta < 0. So your alpha and beta are NOT RANDOM NUMBERS. This might be the reason your test result shows a true random engine without any Beal intelligence.

Rasjid
I am using 0-100, except that if it is BTM I multiply by -1.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Result 10000 - 0

Post by bob »

Daniel Shawul wrote:Bear in mind that I am not exactly set out to disprove the "Beal effect" as that is
something alien to me. Don't have the paper but the abstract Gerd provided.
So what I am trying to disprove is the original proposal by Bob that {return rand()} would give
this effect. I gave 3 or 4 possible theroretical reasons why it shouldn't work and results seem to prove me right.
The minimax is really screwed with this set up. Sensible setup IMO that will not violate
standard search methods could be
- using (hash_key % 1000) to avoid differently evaluating a position every time it is visited
- Use exact opposite score when side to move changes.
- By just searching, only one side's mobility eval (if any) is done, nothing whatsover
about the other side's mobility. Don't know how to fix this. At even/odd plies only the
sides to move mobility is conducted. So this violates perfect information game nature of chess
and even adds an artifical "even-odd behaviour" like the one inherent to alpha-beta.

Anyway if someone has a clear understanding of what the Don Beal effect is and how it can be achieved
I will test with that setup.


Other Observaions:

This random engine is like Chest. Its only hope of finding a win is through
long combinations of moves which lead to mate that happen to fail in its horizon.
We might as well return 0 instead of rand() for eval. And I think that
returning 0 might be better as it allows deeper search. The rand() just slows down alpha-beta.
You see in all of its games it throws material quite quickly...

Get elo of Chest if anyone conducted it, and then take away some elo from it and we may have
an approximation for this thing. No elo coming from rand() if anything it should come from
its mate searching capabilities.. And this is not the Beal effect btw. It comes from seeing the tips
of the tree by search as in Tic-Tac-Toe and not from rand(). Infact replacing rand() with 0 should
give a better mate searcher.

I am gonna try this with weak engines and see how significant this chest-like behaviour could be...
Your explanation about mobility is simply wrong. The score does _not_ evaluate mobility. At any node you care to choose, if you have more possible moves, you have a greater probability of getting a larger number backed up to that node. You need to mentally disconnect the score from meaning _anything_. It is a random number. The probability of getting a number on either end of the scale is what this is about, not the number itself. The number just allows minimax/alpha-beta to work correctly.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Result 10000 - 0

Post by bob »

Chan Rasjid wrote:
Daniel Shawul wrote:Bear in mind that I am not exactly set out to disprove the "Beal effect" as that is something alien to me. Don't have the paper but the abstract Gerd provided. So what I am trying to disprove is the original proposal by Bob that {return rand()} would give this effect.
...
Anyway if someone has a clear understanding of what the Don Beal effect is and how it can be achieved I will test with that setup.
...
I am not sure if we need the original Don Beal's paper. Bob Hyatt gave me a concise explanation that is satisfactory to me and expanded slightly in my earlier post. Anyone should first be able to refute that before going for more details.

Bob Hyatt's experiment is invalid if it was meant to test how an engine performs with the Beal Effect. From what he explained, his engine still have the real mates and draws. This will ensure that the engine will have some "elo intelligence". When combined with the Beal Effect, it could be "lethal" against weak human players - the engine goes for check, avoids checkmate and goes for draws if the score is < 0 and it goes for a PV with the greatest mobility advantage.

A pure "Beal engine" would have to replace all scores return by search() and not just at the eval() calls :

Code: Select all

x = search();
x = (hash % 1000) * (1 - 2 * (hash & 1));
My guess is a pure Beal engine would be very weak.

Rasjid
I do not think Don "broke" his search. I believe he did exactly what I am doing, that is simply replace the static evaluation with a random number. But letting the search handle mates and repetitions (do not know if he dealt with 50 move rule back then). If you ignore mates and draws, you might win a single game every 10,000,000 or so, because it would be pure luck since even around mate positions, there are good checks that force mate and bad checks that just waste time (but still restrict mobility).

I have made some progress, according to cluster testing. My "busy-loop" in the evaluation will, as skill is lowered, execute more and more iterations of "do nothing" (it actually does some computation that can't be optimized away to make this work). On this hardware (cluster) Crafty searches 2M nps or so on the slower one that I mostly use, with skill 1, the nps drops off to a couple of K nodes per second, which kills depth and shrinks "the Beal effect" enough that it plays ugly chess. I am trying to somehow tune this "nps reducer" so that the skill N command has a pretty constant level of reduction, right now it is somewhat "jumpy".
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Result 10000 - 0

Post by bob »

Daniel Shawul wrote:See my result against TSCP 650 elo difference already out of 300 games. Already went down to 3 digits. More will follow as soon as I find many opening positions in PGN format.
It is not my problem if people want to hang on to dear life with whatever you think is available. I have explicitly told that in my very very first post that this is possible.
When you say random eval , do you mean tactics(material eval) + random score for positional terms OR just random score for all... In the later case I can't see how it can reach 1800 elo.. All the engine will do is make random moves throwing away material. It may try to avoid mates when it sees them through search but how can it possibly think intelligently when it is fed garbage at the end.
The beal effect as it was explained to me was due to the mobility evalution it brings,and how it smartly save material bla bla (which should be clear to everyone by now that it is not so).
Sure. You have convinced me this is all phony. My skill 1 command is drawing complaints by people that don't know what they are dong. I, as a 2000+ Elo chess player find beating the thing requires attention and effort, not just casual play, so I suppose my chess skills have slipped so badly that I am now a 200 player or something. And all these other idiots that are actually measuring skill 1 crafty against a pool of 1600-1800 programs are just screwing the tests up. And my cluster is obviously broken since it supports their findings quite well.

I guess that all of us are incompetent and you are the only person here that can test these conclusions that have already been _proven_ correct, but which now stand refuted.

Yeah, that's it... silly me...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Result 10000 - 0

Post by bob »

Chan Rasjid wrote:
Daniel Shawul wrote: ...
I gave 3 or 4 possible theroretical reasons why it shouldn't work and results seem to prove me right.
The minimax is really screwed with this set up. Sensible setup IMO that will not violate
standard search methods could be
...
- By just searching, only one side's mobility eval (if any) is done, nothing whatsover about the other side's mobility. Don't know how to fix this.
...
I don't follow. It is a perfectly valid alpha-beta implementation.The Beal's mobility will be what the PV follows. Both sides maximize mobilty and it is the only criterion the search follows in getting the "principle variation". The side that deviates from the PV will finally get itself "immobilized".

Rasjid
He does not understand the basic concept here, so trying to discuss it is really way beyond pointless...
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Result 10000 - 0

Post by AlvaroBegue »

A question for Bob Hyatt: If I understand correctly, your scores in this mode always indicate white is ahead. If a draw is 0, wouldn't this mean that white will always avoid draws and black will always seek them? In that sense, this is not equivalent to using random numbers centered around 0.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Result 10000 - 0

Post by bob »

Daniel Shawul wrote:Maybe a little bit of Minimax will do you good. It is appropriate only for zero sum games where the scores are negated from ply to ply. Also chess is a perfect information game so the evaluation should be done for both at every ply. If you evaluate white's mobility at ply 4, black's at ply 5 etc etc alpha-beta is not going to work.
Since are completely ignoring the other's sides eval, there is no way eval could agree at ply n and n+1. Either you have to do an even or odd ply search so that the tips fail always at white to move or black to move. Can't mix.
I suggest you dig in a bit here
http://en.wikipedia.org/wiki/Minimax
http://en.wikipedia.org/wiki/Perfect_information

Alpha-beta has an inherent odd-even effect. And this evaluation of one side only at a ply thing could add more of this IMHO.
http://chessprogramming.wikispaces.com/Odd-Even+Effect

You don't believe me, that is fine. Look at the way it plays in the games I provided. I really don't want to waste time here.
Did you ever think that perhaps _you_ screwed up the test. Others of us are not having that problem. For me, when others report X, and I run a test and conclude ~X, I go back and look at the test carefully to make sure it is doing what it is supposed to do, rather than shouting that everyone _else_ is wrong.