Questions for the Stockfish team

Discussion of chess software programming and technical issues.

Moderator: Ras

Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Result 10000 - 0

Post by Chan Rasjid »

Those who expect a random evaluation engine to be dumb which offers pieces free are in for a surprise. My Elo is probably 1400 and I had a hard time playing against my random engine, mate scores and draws intact :-

Code: Select all

// within QS if not incheck 
        x = eval(side, alpha, beta, ply);
        if (x ^ 0) {
        // scores randomized to  x >= -4095,  x <= 4095; mate scores about +-8000 
            x = (int)(pstack->hash & 0xfff) * (1 - 2 * ((int)pstack->hash & 1));  
        } else {
        //draws remain
            pv[ply][0] = draw_move;
            pv[ply][1] = 0;
            //UPDATE_HASH
            // true draw x = 0 ???
            updatehash(1, MAX_DEPTH, EX, draw_move);
            return 1;
        }
I could introduce bugs with the simplest changes. Hope the above w/o bugs. The engine probably have an elo of 1600 and NOT elo 0.

Rasjid
Mangar
Posts: 65
Joined: Thu Jul 08, 2010 9:16 am

Re: Questions for the Stockfish team

Post by Mangar »

Hi,

sorry, if I have some questions that are explained in Beal´s paper (I haven´t found it yet):

1. Shouldn´t you have a large range of random numbers? If you have 1000 different numbers a tree with 10.000 and a tree with 100.000 nodes would perhaps produce the same result if only using 1000 different random numbers?

2. What about pruning methods. Maybe lots of them are contraproductive with a random eval.
LMR switched off?
Nullmove switched off (as eval is random, nullmove is just a hudge decrease of search depth)?
Any value based pruning methods switched of (Razoring, Futility, ...)?

3. What about the opponents, maybe it will be lot smarter to test against 1800 Elo engines? WBEC should give a hint what engines to select.

Greetings Volker
Mangar Spike Chess
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Result 10000 - 0

Post by Daniel Shawul »

Bear in mind that I am not exactly set out to disprove the "Beal effect" as that is
something alien to me. Don't have the paper but the abstract Gerd provided.
So what I am trying to disprove is the original proposal by Bob that {return rand()} would give
this effect. I gave 3 or 4 possible theroretical reasons why it shouldn't work and results seem to prove me right.
The minimax is really screwed with this set up. Sensible setup IMO that will not violate
standard search methods could be
- using (hash_key % 1000) to avoid differently evaluating a position every time it is visited
- Use exact opposite score when side to move changes.
- By just searching, only one side's mobility eval (if any) is done, nothing whatsover
about the other side's mobility. Don't know how to fix this. At even/odd plies only the
sides to move mobility is conducted. So this violates perfect information game nature of chess
and even adds an artifical "even-odd behaviour" like the one inherent to alpha-beta.

Anyway if someone has a clear understanding of what the Don Beal effect is and how it can be achieved
I will test with that setup.


Other Observaions:

This random engine is like Chest. Its only hope of finding a win is through
long combinations of moves which lead to mate that happen to fail in its horizon.
We might as well return 0 instead of rand() for eval. And I think that
returning 0 might be better as it allows deeper search. The rand() just slows down alpha-beta.
You see in all of its games it throws material quite quickly...

Get elo of Chest if anyone conducted it, and then take away some elo from it and we may have
an approximation for this thing. No elo coming from rand() if anything it should come from
its mate searching capabilities.. And this is not the Beal effect btw. It comes from seeing the tips
of the tree by search as in Tic-Tac-Toe and not from rand(). Infact replacing rand() with 0 should
give a better mate searcher.

I am gonna try this with weak engines and see how significant this chest-like behaviour could be...
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Result 10000 - 0

Post by Chan Rasjid »

Daniel Shawul wrote:Bear in mind that I am not exactly set out to disprove the "Beal effect" as that is something alien to me. Don't have the paper but the abstract Gerd provided. So what I am trying to disprove is the original proposal by Bob that {return rand()} would give this effect.
...
Anyway if someone has a clear understanding of what the Don Beal effect is and how it can be achieved I will test with that setup.
...
I am not sure if we need the original Don Beal's paper. Bob Hyatt gave me a concise explanation that is satisfactory to me and expanded slightly in my earlier post. Anyone should first be able to refute that before going for more details.

Bob Hyatt's experiment is invalid if it was meant to test how an engine performs with the Beal Effect. From what he explained, his engine still have the real mates and draws. This will ensure that the engine will have some "elo intelligence". When combined with the Beal Effect, it could be "lethal" against weak human players - the engine goes for check, avoids checkmate and goes for draws if the score is < 0 and it goes for a PV with the greatest mobility advantage.

A pure "Beal engine" would have to replace all scores return by search() and not just at the eval() calls :

Code: Select all

x = search();
x = (hash % 1000) * (1 - 2 * (hash & 1));
My guess is a pure Beal engine would be very weak.

Rasjid
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Result 10000 - 0

Post by Chan Rasjid »

Daniel Shawul wrote: ...
I gave 3 or 4 possible theroretical reasons why it shouldn't work and results seem to prove me right.
The minimax is really screwed with this set up. Sensible setup IMO that will not violate
standard search methods could be
...
- By just searching, only one side's mobility eval (if any) is done, nothing whatsover about the other side's mobility. Don't know how to fix this.
...
I don't follow. It is a perfectly valid alpha-beta implementation.The Beal's mobility will be what the PV follows. Both sides maximize mobilty and it is the only criterion the search follows in getting the "principle variation". The side that deviates from the PV will finally get itself "immobilized".

Rasjid
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Against TSCP

Post by Daniel Shawul »

Indeed it got a couple of wins by its mate searching capabilities.
6 wins out of 300. That is a 650 elo difference but it shows it can get
wins this way if the opponent can's see short mates..

Code: Select all

Num. Name            games   score 
   0 Scorpio_random    300       7 
   1 XboardEngine      300     293 
Rank Name             Elo    +    - games score oppo. draws 
   1 XboardEngine     325   64   46   300   98%  -325    0% 
   2 Scorpio_random  -325   46   64   300    2%   325    0% 
TSCP does not support setboard so all I have is 150 positions in PGN format.
I can not use EPD files.I will try to find more opening positions and see how this goes.

To make my point clearer look at one of those won games. Scorpio has thrown away
material before it suddely sensed a win.

Code: Select all

[Event "?"]
[Site "?"]
[Date "2010.07.22"]
[Round "21"]
[White "Scorpio_random"]
[Black "XboardEngine"]
[Result "1-0"]
[PlyCount "59"]
[TimeControl "40/30"]

1. d4 {book} d5 {book} 2. c4 {book} c6 {book} 3. Nf3 {book} Nf6 {book}
4. Nc3 {book} e6 {book} 5. Bg5 {book} h6 {book} 6. Bxf6 {book} Qxf6 {book}
7. e3 {book} Nd7 {book} 8. Rc1 {book} Bb4 {-0.22/5 1.0s} 9. Qa4 {-0.46/8 1.3s}
Bxc3+ {+0.13/4 0.97s} 10. Rxc3 {-0.34/9 0.87s} e5 {-0.01/4 0.94s}
11. Qb4 {+0.53/8 0.67s} e4 {+0.14/4 0.90s} 12. cxd5 {+1.21/9 0.72s}
exf3 {+1.11/5 0.88s} 13. g4 {+0.90/9 0.77s} cxd5 {+2.65/4 0.84s}
14. Qa3 {+2.07/9 0.81s} Qg6 {+3.18/4 0.82s} 15. Bd3 {+2.51/10 0.98s}
Qxg4 {+3.74/4 0.80s} 16. Bb5 {+3.26/9 0.76s} Qg2 {+5.06/5 0.77s}
17. Rf1 {+2.99/10 1.5s} Qxh2 {+4.86/4 0.74s} 18. e4 {+3.10/10 1.4s}
dxe4 {+5.65/4 0.71s} 19. Kd1 {+3.22/9 0.78s} h5 {+6.02/4 0.69s}
20. Re1 {+4.22/9 0.98s} Kd8 {+5.33/4 0.67s} 21. Bxd7 {+3.22/9 0.72s}
Bxd7 {+4.85/5 0.65s} 22. Qa5+ {+3.47/10 0.81s} Ke8 {+4.08/4 0.62s}
23. Qd5 {+3.78/9 0.85s} Ba4+ {+5.62/4 0.60s} 24. b3 {+2.78/11 1.1s}
Bc6 {+5.15/4 0.58s} 25. Rxc6 {+3.60/10 0.89s} Qxf2 {+2.89/4 0.56s}
26. Rc7 {+M10/16 0.80s} Qxe1+ {-0.73/4 0.54s} 27. Kxe1 {+M8/26 0.68s}
f2+ {-2.35/4 0.52s} 28. Kf1 {+M6/24 0.16s} e3 {-M4/4 0.11s}

Games http://sites.google.com/site/dshawul/ts ... ects=0&d=1
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Result 10000 - 0

Post by Daniel Shawul »

See my result against TSCP 650 elo difference already out of 300 games. Already went down to 3 digits. More will follow as soon as I find many opening positions in PGN format.
It is not my problem if people want to hang on to dear life with whatever you think is available. I have explicitly told that in my very very first post that this is possible.
When you say random eval , do you mean tactics(material eval) + random score for positional terms OR just random score for all... In the later case I can't see how it can reach 1800 elo.. All the engine will do is make random moves throwing away material. It may try to avoid mates when it sees them through search but how can it possibly think intelligently when it is fed garbage at the end.
The beal effect as it was explained to me was due to the mobility evalution it brings,and how it smartly save material bla bla (which should be clear to everyone by now that it is not so).
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Result 10000 - 0

Post by AlvaroBegue »

Daniel Shawul wrote:[...] So what I am trying to disprove is the original proposal by Bob that {return rand()} would give
this effect. I gave 3 or 4 possible theroretical reasons why it shouldn't work and results seem to prove me right.
The minimax is really screwed with this set up. Sensible setup IMO that will not violate standard search methods could be
- using (hash_key % 1000) to avoid differently evaluating a position every time it is visited
I don't think it matters either way.
- Use exact opposite score when side to move changes.
If anyone here proposed having an evaluation function that is essentially (0 + a random bonus for the side to move), I am sure it's a problem of communication, not really what was intended. For instance, in Ruy-López the evaluation function returns scores from the point of view of white, and it's the caller who flips the sign if it's black's turn. So `return rand();' is perfectly OK. Or we could just make the random symmetric around 0 and forget about these kinds of details in different engines.
- By just searching, only one side's mobility eval (if any) is done, nothing whatsover
about the other side's mobility. Don't know how to fix this. At even/odd plies only the
sides to move mobility is conducted. So this violates perfect information game nature of chess
and even adds an artifical "even-odd behaviour" like the one inherent to alpha-beta.
You keep saying this about only evaluating one side's mobility, but it's not right. Imagine a depth-2 search. The leaves get scores picked from a uniform random distribution. The nodes where depth-1 is left get as score the scores that are the maximum of several uniform distributions. The more uniform distributions they get to pick from, the higher the expected score. The root will get a score that is the maximum of minus the distributions for depth-1 nodes. Again, the more moves this has available, the higher the expected score.

It is true that the resulting probability distributions are not symmetric at all, and that "even-odd behavior" will almost certainly be present.

It is very possible that many search modifications that typically result in improvements when using a normal evaluation function will be detrimental when using a random one. For instance, quiescence search may introduce strange biases that aren't helpful at all, and it will certainly waste a lot of time.

Another thing I wanted to comment (although it's in response to someone else in the thread) is that thinking of alpha-beta windows when trying to understand the Beal effect is probably not very helpful. The result of the search is guaranteed to be the same as what you would get with minimax.
Anyway if someone has a clear understanding of what the Don Beal effect is and how it can be achieved
I will test with that setup.
I think `return 0;' -vs- `return -10000+(rand()%20001);' is a good test. Sort the moves at the root randomly. This way you are effectively comparing "move randomly" with "evaluate randomly", but both engines have the ability to detect mate.
Other Observaions:

This random engine is like Chest. Its only hope of finding a win is through
long combinations of moves which lead to mate that happen to fail in its horizon.
We might as well return 0 instead of rand() for eval. And I think that
returning 0 might be better as it allows deeper search. The rand() just slows down alpha-beta.
You see in all of its games it throws material quite quickly...

Get elo of Chest if anyone conducted it, and then take away some elo from it and we may have
an approximation for this thing. No elo coming from rand() if anything it should come from
its mate searching capabilities.. And this is not the Beal effect btw. It comes from seeing the tips
of the tree by search as in Tic-Tac-Toe and not from rand().
I think you didn't understand what I said about tic-tac-toe. Assigning a random value to draws makes the perfect-knowledge engine prefer moves that constrain the opponent during the game and leave more options for itself. The resulting program makes moves that feel very natural, like taking the middle spot most of the time when going first, even though all moves are draws.
Infact replacing rand() with 0 should
give a better mate searcher.
I think now we have a good test to try. You think `return 0;' will perform better than `return -10000+(rand()%20001);', and I think worse.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Result 10000 - 0

Post by Daniel Shawul »

Maybe a little bit of Minimax will do you good. It is appropriate only for zero sum games where the scores are negated from ply to ply. Also chess is a perfect information game so the evaluation should be done for both at every ply. If you evaluate white's mobility at ply 4, black's at ply 5 etc etc alpha-beta is not going to work.
Since are completely ignoring the other's sides eval, there is no way eval could agree at ply n and n+1. Either you have to do an even or odd ply search so that the tips fail always at white to move or black to move. Can't mix.
I suggest you dig in a bit here
http://en.wikipedia.org/wiki/Minimax
http://en.wikipedia.org/wiki/Perfect_information

Alpha-beta has an inherent odd-even effect. And this evaluation of one side only at a ply thing could add more of this IMHO.
http://chessprogramming.wikispaces.com/Odd-Even+Effect

You don't believe me, that is fine. Look at the way it plays in the games I provided. I really don't want to waste time here.
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Result 10000 - 0

Post by Gerd Isenberg »

AlvaroBegue wrote: Another thing I wanted to comment (although it's in response to someone else in the thread) is that thinking of alpha-beta windows when trying to understand the Beal effect is probably not very helpful. The result of the search is guaranteed to be the same as what you would get with minimax.
I guess it was me, with alpha-beta broken versus pure minimax, both with same random eval. This thread has become too huge ;-)

I am not quite sure, but assuming random() in eval is independent from the position, i.e. not based on hashkey (otherwise you are right), but deterministic (after a same seed) pseudo random number generator with a huge cycle.

Since the number of leaves in alpha-beta is much lower than in minimax, their sequential eval calls on a uniform depth tree will provide only a small subset of the pure minimax leaf numbers. Same leaf-positions except the most left will have different scores, and hence will likely produce another backup score at the root. Not?