Questions for the Stockfish team

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Questions for the Stockfish team

Post by michiguel »

bob wrote:
Joost Buijs wrote:I do understand that with an infinite depth you don't need eval at all. With a perfect evaluation function a 1 ply search will be sufficient as well. This is just theoretical.

It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level its Elo will go up.
OK, some background. It turns out that if you replace Crafty's evaluation with a pure random number, it plays well above 2,000 Elo. If you disable all the search extensions, reductions, null-move and such, you still can't get it below 1800. There has been a long discussion about this, something I call "The Beal Effect" since Don Beal first reported on this particular phenomenon many years ago. So a basic search + random eval gives an 1800 player. Full search + full eval adds 1,000 to that. How much from each? Unknown. But I have watched many, many Stockfish vs Crafty games and the deciding issue does not seem to be evaluation. We seem to get hurt by endgame search depth more than anything...
And that is where most (all?) engines had the biggest holes in evaluation... endgame!

Miguel
Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

Here is a shock (a statistical abnormality, I am sure). Crafty with skill=100 loses to Crafty with skill set to 0:

Code:

[Event "Crskill"]
[Site "DCORBIT2008"]
[Date "2010.07.20"]
[Round "1"]
[White "Crafty-23.2a-skill-mod"]
[Black "Crafty-232ap00"]
[Result "0-1"]
[BlackElo "2700"]
[ECO "A40"]
[Opening "Englund (Charlick) Gambit"]
[Time "21:26:40"]
[Variation "Soller Deferred"]
[WhiteElo "2800"]
[TimeControl "60+1"]
[Termination "normal"]
[PlyCount "154"]
[WhiteType "program"]
[BlackType "program"]

1. Nf3 Nc6 2. d4 d5 3. c4 e5 4. Nxe5 {(4. Nxe5 Bb4+ 5. Nd2 Nge7 6. a3 Bxd2+
7. Bxd2 dxc4 8. Nxc6 Nxc6 9. e3 Qg5 10. Rc1 Bg4 11. Qc2 Be6) +0.79/14 3}
Bb4+ {(4. ... Bb4+ 5. Bd2 Bxd2+ 6. Qxd2 Nge7 7. Nc3 dxc4 8. Nxc4 Qxd4 9.
Qxd4 Nxd4 10. O-O-O Ne6 11. e4 O-O 12. Be2 Nc6) +0.86/15 3} 5. Bd2 {(5. Bd2
Bxd2+ 6. Nxd2 Nge7 7. Nxc6 Nxc6 8. e3 O-O 9. Be2 Re8 10. O-O Bf5 11. Nf3
Qd6 12. Qb3 Na5 13. Qb5) +0.92/16 2} Bxd2+ {(5. ... Bxd2+ 6. Nxd2 Nge7 7.
Nxc6 Nxc6 8. e3 O-O 9. Be2 Re8 10. O-O Bf5 11. Nf3 Nb4 12. cxd5 Qxd5 13.
Rc1 <HT>) +0.98/15 3} 6. Nxd2 Nxe5 {(6. ... Nxe5 7. dxe5 Ne7 8. Qb3 dxc4 9.
Nxc4 Nc6 10. Rd1 Qe7 11. g3 Qb4+ 12. Qxb4 Nxb4 13. Bg2 O-O 14. a4 <HT>)
+1.06/15 2} 7. dxe5 {(7. dxe5 Ne7 8. Qb3 dxc4 9. Nxc4 Nc6 10. Rd1 Qe7 11.
g3 Qb4+ 12. Qxb4 Nxb4 13. Bg2 O-O 14. a4 <HT>) +1.06/14 1} Ne7 {(7. ... Ne7
8. e3 O-O 9. Qb3 dxc4 10. Bxc4 Nc6 11. O-O-O Qg5 12. f4 Qxg2 13. Bd5 Qe2
14. Bxc6 bxc6 15. Ne4) +1.07/16 3} 8. Qc2 {(8. Qc2 Nc6 9. O-O-O Nxe5 10.
cxd5 O-O 11. e4 Bg4 12. f3 Bh5 13. Nc4 Re8 14. Be2 Qg5+ 15. Kb1 Rad8 <HT>)
+1.09/15 3} Nc6 {(8. ... Nc6 9. O-O-O Nxe5 10. cxd5 O-O 11. e4 Qf6 12. Qc5
Bd7 13. Be2 Rfc8 14. Nf3 Ba4 15. Rd2) +1.06/14 2} 9. O-O-O {(9. O-O-O Nxe5
10. cxd5 O-O 11. Nf3 Bg4 12. Nxe5 Qg5+ 13. Rd2 Qxe5 14. Qb3 Rab8 15. Qa4
Bf5 16. Qxa7 Qf4) +1.03/15 3} Nxe5 {(9. ... Nxe5 10. cxd5 O-O 11. Nc4 Nxc4
12. Qxc4 Re8 13. e3 Qf6 14. Qc2 Bf5 15. Bd3 Bxd3 16. Rxd3 Qf5) +1.08/15 2}
10. cxd5 {(10. cxd5 Qxd5 11. Qxc7 Qc6+ 12. Qxc6+ Nxc6 13. e4 Be6 14. Nc4
O-O 15. Be2 Rad8 16. Rxd8 Rxd8 17. Rd1 Nd4 18. Kb1 <HT>) +0.92/15 4} Qxd5
{(10. ... Qxd5 11. Qxc7 Be6 12. Nf3 Qc6+ 13. Qxc6+ Nxc6 14. e4 Bxa2 15. Bb5
O-O 16. Bxc6 bxc6 17. Ne5 f5 18. exf5 Rxf5) +0.83/14 2} 11. Qxc7 {(11. Qxc7
Be6 12. Nf3 Qc6+ 13. Qxc6+ Nxc6 14. e4 Bxa2 15. Bb5 Rc8 16. Ne5 Be6 17.
Bxc6+ Ke7 18. f4 bxc6 19. f5 f6 <HT>) +0.90/15 3} Be6 {(11. ... Be6 12. Nf3
Qc6+ 13. Qxc6+ Nxc6 14. e4 Bxa2 15. Bb5 Rc8 16. Ne5 Ke7 17. Nxc6+ bxc6 18.
Ba4 Be6 19. Kc2 Rhd8 20. Rxd8 Rxd8 21. Bxc6) +0.71/15 2} 12. Nf3 {(12. Nf3
Qxa2 13. Qxe5 Rc8+ 14. Kd2 O-O 15. e4 Rfd8+ 16. Bd3 f6 17. Qg3 Bb3 18. Ke3
Bxd1 19. Rxd1 Qb3 20. Rd2) +0.77/15 2} Qxa2 {(12. ... Qxa2!) +0.41/15 3}
13. Qxe5 Rc8+ {(13. ... Rc8+ 14. Kd2 O-O 15. e4 Rfd8+ 16. Bd3 f6 17. Qh5
Bb3 18. Ke2 Bxd1+ 19. Rxd1 Qxb2+ 20. Kf1 Qb3 21. Ne1 Rc3 22. Qf3) +0.15/14
3} 14. Kd2 {(14. Kd2 O-O 15. e4 f6 16. Qh5 Bb3 17. Bd3 Qxb2+ 18. Ke3 Bxd1
19. Qd5+ Kh8 20. Rxd1 Rc3 21. Rd2 Qb4) +0.25/14 2} O-O {(14. ... O-O 15. e4
Rfd8+ 16. Bd3 f6 17. Qh5 Bb3 18. Ke2 Bxd1+ 19. Rxd1 Qxb2+ 20. Kf1 Qb3 21.
Ne1 g6 22. Qg4 Rc5) +0.35/14 2} 15. e4 {(15. e4 f6 16. Qh5 Bb3 17. Bd3
Qxb2+ 18. Ke3 Bxd1 19. Rxd1 Qb6+ 20. Ke2 Rfd8 21. Kf1 Qb3 22. Ne1 Rd4)
+0.25/14 2} Rfd8+ {(15. ... Rfd8+ 16. Bd3 f6 17. Qh5 Bf7 18. Qh3 Be6 19.
Qh4 Rxd3+ 20. Kxd3 Rd8+ 21. Ke3 Qb3+ 22. Kf4 Rxd1 23. Rxd1 Qxd1) +0.10/13
2} 16. Bd3 {(16. Bd3 f6 17. Qh5 Bf7 18. Qg4 Be6 19. Qh4 g5 20. Nxg5 Qxb2+
21. Ke1 Qc3+ 22. Ke2 fxg5 23. Qxg5+ Kh8 24. e5) +0.27/14 2} f6 {(16. ... f6
17. Qh5 Bf7 18. Qh3 Be6 19. Qg3 Bb3 20. Ke3 Bxd1 21. Rxd1 Qb3 22. Ke2 Rc5
23. h4) +0.34/14 4} 17. Qh5 {(17. Qh5 g6 18. Qh6 Bc4 19. Ne1 Bb3 20. Nf3
Qxb2+ 21. Ke3 Bxd1 22. Rxd1 Qb6+ 23. Ke2 Rc3 24. Qe3 Qa5 <HT>) -0.81/14 7}
g6 {(17. ... g6 18. Qh6 Bc4 19. Ne1 Bb3 20. Nf3 Qxb2+ 21. Ke3 Bxd1 22. Rxd1
Rc3 <HT>) -0.75/13 2} 18. Qh4 {(18. Qh4 Bc4 19. Ne1 Bb3 20. Qxf6 Bxd1 21.
Kxd1 Rxd3+ 22. Nxd3 Qb1+ 23. Ke2 Rc2+ 24. Kf3 Qxh1 25. Qd8+ Kg7 26. Qd7+
Kh6 27. Qxb7 Qxh2 28. Qxa7) -0.70/13 2} Bc4 {(18. ... Bc4 19. Ne1 Qxb2+ 20.
Ke3 Bxd3 21. Nxd3 Qd4+ 22. Ke2 Rc2+ 23. Kf1 Qc3 24. Qh3 Rxd3 25. Qxd3 Qxd3+
26. Rxd3 Rc1+ 27. Ke2 Rxh1) -0.55/13 3} 19. Ne1 {(19. Ne1 Qxb2+ 20. Ke3
Qb6+ 21. Ke2 Bxd3+ 22. Rxd3 Qb2+ 23. Kf3 Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+
26. Re3 Rd1 27. Qg4 Rd2 28. Qc8+ Kg7 29. Qc7+ Kf8 30. Qb8+ Kf7 <HT>)
-1.52/14 3} Qxb2+ {(19. ... Qxb2+ 20. Ke3 Qb6+ 21. Ke2 Bxd3+ 22. Rxd3 Qb2+
23. Kf3 Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+ 26. Re3 Rd2 27. Qg3 Kg7 28. Qc7+
Kh8 29. Qb8+ Kg7 30. Qc7+ <HT>) -1.52/14 2} 20. Ke3 {(20. Ke3 Qb6+ 21. Ke2
Bxd3+ 22. Rxd3 Qb2+ 23. Kf3 Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+ 26. Re3 Rd2
27. Qg3 Kg7 28. Qc7+ Kh6 29. Kg4 f5+ 30. exf5 gxf5+ 31. Kg3 Rxf2) -1.58/15
1} Qb6+ {(20. ... Qb6+ 21. Ke2 Bxd3+ 22. Rxd3 Qb2+ 23. Kf3 Rxd3+ 24. Nxd3
Rc3 25. Re1 Rxd3+ 26. Re3 Rxe3+ 27. Kxe3 b5 28. Qg4 Qc3+ 29. Ke2 Qc4+ 30.
Ke3 b4 31. Qd7 Qc1+ 32. Kf3) -1.71/15 2} 21. Ke2 {(21. Ke2 Bxd3+ 22. Rxd3
Qb2+ 23. Kf3 Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+ 26. Re3 Rxe3+ 27. Kxe3 b5 28.
Qg4 Qc3+ <HT>) -1.74/15 2} Bxd3+ {(21. ... Bxd3+ 22. Rxd3 Qb2+ 23. Kf3
Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+ 26. Re3 Rxe3+ 27. Kxe3 a5 28. Kd3 b5 29.
Qf4 a4 30. Qb8+ Kg7 31. Qc7+ Kh6 32. Qf4+ Kh5 <HT>) -1.80/15 2} 22. Rxd3
{(22. Rxd3 Qb2+ 23. Kf3 Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+ 26. Re3 Rxe3+ 27.
Kxe3 b5 28. Qg4 Qc3+ 29. Ke2 Qc4+ 30. Ke3 b4 31. Qd1 Kg7 32. Qd7+ Qf7 <HT>)
-1.78/15 1} Qb2+ {(22. ... Qb2+ 23. Kf3 Rxd3+ 24. Nxd3 Rc3 25. Re1 Rxd3+
26. Re3 Rxe3+ 27. fxe3 Qa1 28. Qh3 Qd1+ 29. Kf4 Qd6+ 30. Kf3 Qc6 31. Qh6
Qc1 32. Qh3 Qf1+ 33. Kg3 Qe1+ 34. Kf3 Qd1+ <HT>) -1.85/16 2} 23. Kf3 {(23.
Kf3 Rxd3+ 24. Nxd3 Rc3 25. g3 Rxd3+ 26. Kg2 Rd2 27. Qf4 a5 28. Re1 b5 29.
e5 fxe5 30. Rxe5 Re2 31. Rxe2 Qxe2) -1.85/16 1} Rxd3+ {(23. ... Rxd3+ 24.
Nxd3 Rc3 25. Re1 Rxd3+ 26. Re3 Rxe3+ 27. fxe3 Qa1 28. Qh3 Qd1+ 29. Kf4 Qd6+
30. Kf3 Qc6 31. Qh6 Qc1 32. Qh3 Qf1+ 33. Kg3 Qe1+ 34. Kf3 Qd1+ <HT>)
-1.85/15 1} 24. Nxd3 {(24. Nxd3 Rc3 25. g3 Rxd3+ 26. Kg2 Qe5 27. Rc1 Rd4
28. Qg4 Qxe4+ 29. Qxe4 Rxe4 30. Rc7 Rb4 31. Kf3 Rb2 32. h3 a5) -1.87/16 2}
Rc3 {(24. ... Rc3 25. g3 Rxd3+ 26. Kg2 Rd2 27. Qf4 a5 28. Qb8+ Kg7 29. Qc7+
Kh6 30. Qf4+ g5 31. Qf5 Qe5 32. Qxe5 fxe5 33. Rb1 Rd7 34. Rc1) -1.66/16 3}
25. g3 {(25. g3 Rxd3+ 26. Kg2 Rd2 27. Qf4 a5 28. Qb8+ Kg7 29. Qc7+ Kh6 30.
Qf4+ g5 31. Qf5 Qe5 32. Qxe5 fxe5 33. Rb1 Rd7 34. Rc1 <HT>) -1.66/15 1}
Rxd3+ 26. Kg2 {(26. Kg2 Qe5 27. Qg4 Kg7 28. Rb1 b6 29. Qe2 Qd6 30. Rc1 a5
31. Qc2 Kh6 32. Qc6 Qe5 33. Qxb6 Qxe4+ 34. Kg1) -1.69/16 2} Qe5 {(26. ...
Qe5 27. Qg4 Kg7 28. Rb1 b6 29. Qe2 Rd7 30. f3 Qd4 31. Qf2 Qxf2+ 32. Kxf2
Rd2+ 33. Ke3 Rxh2 34. Rd1 a5 35. Rd7+ Kf8) -1.63/16 5} 27. Qg4 {(27. Qg4
Kg7 28. Rb1 b6 29. Qe2 Rd7 30. f3 Qd4 31. Qf2 Qxf2+ 32. Kxf2 Rd2+ 33. Ke3
Rxh2 34. Rd1 a5 35. Rd7+ Kf8 36. Kf4) -1.63/16 2} Kg7 {(27. ... Kg7 28. Rb1
b6 29. Qe2 Rd7 30. f3 Qd4 31. Qf2 Qxf2+ 32. Kxf2 Rd2+ 33. Ke3 Rxh2 34. Rd1
a5 35. Rd7+ Kf8 36. Kf4) -1.63/15 2} 28. Rb1 {(28. Rb1 b6 29. Qe2 Rd7 30.
f3 Qd4 31. Qf2 Qxf2+ 32. Kxf2 Rd2+ 33. Ke3 Rxh2 34. Rd1 a5 35. Rd7+ Kf8 36.
Rb7 a4 37. Rxb6) -1.62/16 1} b6 {(28. ... b6 29. Qe2 Qd6 30. Rc1 a5 31. Qc2
Kh6 32. Qc6 Rd1 33. Rc4 Qd3 34. Rc3 Qe2 35. Qxf6 Qxe4+ 36. Kh3) -1.64/15 2}
29. Qe2 Qd6 {(29. ... Qd6 30. Rc1 a5 31. Qc2 Rd4 32. Qc7+ Kh6 33. Qb7 Qe6
34. Rb1 Rd6 35. Rc1 Rd7 36. Qc6 Qxc6 37. Rxc6) -1.67/15 1} 30. Rc1 {(30.
Rc1 Rd4 31. Rc8 Qe7 32. Rc4 Qd7 33. Rxd4 Qxd4 34. Qc2 a5 35. f3 a4 36. Qc7+
Kh6 37. Qf4+ g5) -1.63/14 1} a5 {(30. ... a5 31. Qc2 Rd4 32. Qc7+ Kh6 33.
Qb7 Qe6 34. Rb1 Rd6 35. Rc1 a4 36. Rc7 Qg8 37. Qa6 Qb3) -1.64/14 1} 31. Rc8
{(31. Rc8 Qe6 32. Qxd3 Qxc8 33. Qb5 Qd8 34. Qc6 Qd4 35. Qc7+ Kg8 36. Qc8+
Kf7 37. Qc7+ Ke6 38. Qc6+ Qd6 39. Qc4+ Kd7 40. h3 Qe5) -1.53/15 1} Qe6
{(31. ... Qe6 32. Rb8 Rd6 33. Qc2 Kh6 34. Qe2 a4 35. Rb7 g5 36. Qe3 Kg6 37.
h3 Qb3 38. Kf3 Qd1+ 39. Kg2) -1.48/14 1} 32. Rb8 {(32. Rb8 Rd6 33. Qc2 Rd4
34. Qb1 Qxe4+ 35. Qxe4 Rxe4 36. Rxb6 a4 37. Rb7+ Kg8 38. Kf3 f5 39. Ra7 g5
40. h3) -1.46/15 1} Rd6 {(32. ... Rd6 33. Qc2 Kh6 34. h4 Kg7 35. Rb7+ Rd7
36. Rb8 Re7 37. Qb1 Qxe4+ 38. Qxe4 Rxe4 39. Rb7+ Kf8 40. Rxb6 a4 41. Rxf6+
Ke7) -1.43/15 3} 33. Qc2 {(33. Qc2 Kh6 34. h4 Rd4 35. Qc1+ Kg7 36. Qc7+ Rd7
37. Qc2 Re7 38. Qb1 Qxe4+ 39. Qxe4 Rxe4 40. Rb7+ Kf8 41. Rxb6 Kf7 42. Rb7+
Re7 43. Rxe7+ Kxe7 <HT>) -1.43/14 1} Rd4 {(33. ... Rd4 34. Qc7+ Rd7 35. Qc2
Re7 36. h4 Qxe4+ 37. Qxe4 Rxe4 38. Rxb6 a4 39. Rb7+ Kg8 40. Kf3 f5 41. Ra7
h6 42. Rd7) -1.52/15 1} 34. Qc7+ {(34. Qc7+ Rd7 35. Qc2 Re7 36. h4 Qxe4+
37. Qxe4 Rxe4 38. Rxb6 a4 39. Rb7+ Kg8 40. Kf3 f5 41. Rd7 h6 42. h5 gxh5)
-1.46/15 1} Rd7 {(34. ... Rd7 35. Qc2 Re7 36. h4 Qxe4+ 37. Qxe4 Rxe4 38.
Rb7+ Kg8 39. Rxb6 f5 40. Rb8+ Kf7 41. Ra8 a4 42. f3 Re2+ 43. Kf1 Ra2 44.
Ra7+ Kg8) -1.50/16 1} 35. Qc2 {(35. Qc2 Re7 36. h4 Qxe4+ 37. Qxe4 Rxe4 38.
Rb7+ Kg8 39. Rxb6 f5 40. Ra6 a4 41. Kf3 Kf7 42. Ra7+ Ke6 43. Rxh7 a3 44.
Ra7) -1.37/16 1} Re7 {(35. ... Re7 36. h4 Qxe4+ 37. Qxe4 Rxe4 38. Rb7+ Kg8
39. Rxb6 f5 40. Rb7 a4 41. Kf3 h6 42. Rc7 Rb4 43. h5 g5 44. Ke3 Re4+ 45.
Kd3) -1.48/16 1} 36. h4 {(36. h4 Qxe4+ 37. Qxe4 Rxe4 38. Rb7+ Kg8 39. Rxb6
f5 40. h5 gxh5 41. Rb5 a4 42. Rxf5 h4 43. Kf3 Rb4 44. gxh4 Rxh4) -1.49/16
4} Qxe4+ {(36. ... Qxe4+ 37. Qxe4 Rxe4 38. Rb7+ Kf8 39. Rxb6 f5 40. Ra6 a4
41. Kf3 Ke7 42. Ra7+ Kd6 43. Rxh7 Re7 44. Rh6 Re6 45. Rh8 Re4 46. Rd8+ Kc5)
-1.29/16 1} 37. Qxe4 Rxe4 38. Rb7+ {(38. Rb7+ Kg8 39. Rxb6 Kf7 40. Ra6 a4
41. Ra7+ Kg8 42. Ra8+ Kg7 43. Ra7+ Kh6 44. Kf3 Rb4 45. Ke3 Rb3+ 46. Ke4 a3
47. f3 f5+ 48. Kf4) -1.24/15 1} Kf8 {(38. ... Kf8 39. Rxb6 Kf7 40. Kf3 Rb4
41. Ra6 a4 42. Ra7+ Kg8 43. Ra8+ Kg7 44. Ra7+ Kh6 45. Ke3 Rb3+ 46. Ke4 a3
47. f3 Rc3 48. Rd7) -1.27/16 1} 39. Rxb6 {(39. Rxb6 f5 40. Kf3 a4 41. Ra6
Rb4 42. Ra8+ Ke7 43. Ra7+ Ke6 44. Rxh7 Rb3+ 45. Kf4 Rb2 46. Kg5 a3 47. Kxg6
Rxf2) -1.25/15 1} Kf7 {(39. ... Kf7 40. Kf3 Rb4 41. Ra6 a4 42. Ra7+ Kg8 43.
Ra8+ Kg7 44. Ra7+ Kh6 45. Ke3 Rb3+ 46. Ke4 f5+ 47. Ke5 a3 48. Kf6 Rd3 49.
Ke5) -1.26/15 2} 40. Ra6 {(40. Ra6 a4 41. Ra7+ Kg8 42. Ra8+ Kg7 43. Ra7+
Kh6 44. Rf7 f5 45. Kf3 Rc4 46. Ra7 Rc3+ 47. Kf4 a3 48. Ke5 Rd3 49. Ke6)
-1.26/15 3} a4 {(40. ... a4 41. Kf3 Rb4 42. Ra7+ Kg8 43. Ra8+ Kg7 44. Ra7+
Kh6 45. Ke3 Rb3+ 46. Ke4 a3 47. g4 g5 48. hxg5+ fxg5 49. Kd5 Kg6) -1.23/14
1} 41. Ra7+ {(41. Ra7+ Kg8 42. Ra8+ Kg7 43. Ra7+ Kh6 44. Rf7 f5 45. Kf3 Rb4
46. Ra7 Rb3+ 47. Kf4 a3 48. Ke5 Rc3 49. Kf6 Rd3 50. Ke5 Rb3 <HT>) -1.26/14
1} Kg8 {(41. ... Kg8 42. Kf3 Rc4 43. Ra8+ Kg7 44. Ra7+ Kh6 45. Ke3 Rb4 46.
Ra6 Rb3+ 47. Ke4 a3 48. g4 Kg7 49. Ra7+ Kg8 50. Kd5 Rd3+ 51. Ke4 Rc3 <HT>)
-1.18/15 1} 42. Ra8+ {(42. Ra8+ Kg7 43. Ra7+ Kh6 44. Ra6 f5 45. Ra7 Rb4 46.
Kf3 Rb3+ 47. Kf4 a3 48. Ke5 Rc3 49. Kf6 Rf3 50. Ke5 Rd3 51. Ke6 Rc3 52. Kf6
<HT>) -1.11/15 1} Kg7 {(42. ... Kg7 43. Ra7+ Kh6 44. Ra6 f5 45. Ra7 Rb4 46.
Kf3 Re4 47. Rd7 Rc4 48. Ra7 Rc3+ 49. Kf4 a3 50. Ke5 Rc5+ 51. Ke6 Kh5 <HT>)
-1.17/15 1} 43. Ra7+ {(43. Ra7+ Kh6 44. Ra6 f5 45. Ra7 Rb4 46. Kh2 g5 47.
hxg5+ Kxg5 48. Rxh7 Rb2 49. Kg2 a3 50. Rd7 Kg4 51. Rg7+ Kh5) -1.26/15 1}
Kh6 {(43. ... Kh6 44. Ra6 f5 45. Ra7 Rb4 46. Kh2 Rc4 47. Kg2 Re4 48. Rb7
Rd4 49. Ra7 <HT>) -1.26/15 1} 44. Ra6 {(44. Ra6 f5 45. Ra7 Rb4 46. f3 Rc4
47. Kf1 Rc1+ 48. Ke2 Rc2+ 49. Kd3 Rg2 50. Ke3 Rxg3 51. Rxa4 Rg2 52. Ra7)
-1.28/15 2} f5 {(44. ... f5 45. Ra7 Rb4 46. f3 Rc4 47. Kf1 Rc1+ 48. Kg2
Rc2+ 49. Kh3 Ra2 50. g4 fxg4+ 51. fxg4 g5 52. hxg5+ Kxg5 53. Rxh7 Ra3+ 54.
Kg2 Kxg4) -1.30/15 1} 45. Ra7 {(45. Ra7 Rb4 46. f3 Rc4 47. Kf1 Rc1+ 48. Kg2
Rc2+ 49. Kh3 Ra2 50. g4 fxg4+ 51. fxg4 g5 52. hxg5+ Kxg5 53. Rxh7 Ra3+ 54.
Kg2 Kxg4) -1.30/14 1} Rb4 46. f3 {(46. f3 Rc4 47. Kh3 Rd4 48. Kg2 g5 49.
hxg5+ Kxg5 50. Kf2 h5 51. Ke3 Rb4 52. Rg7+ Kf6 53. Ra7 Ke5 <HT>) -1.28/13
1} Rc4 {(46. ... Rc4 47. Kh3 Rd4 48. Kg2 g5 49. hxg5+ Kxg5 50. Kf2 h5 51.
Ke3 Rb4 52. Rg7+ Kf6 53. Ra7 Ke5) -1.41/13 1} 47. Kh3 {(47. Kh3 Rd4 48. Kg2
g5 49. hxg5+ Kxg5 50. Rxh7 a3 51. Kh3 Rd2 52. Rg7+ Kf6 53. Ra7 Ra2 54. Rd7)
-1.35/13 1} Rd4 {(47. ... Rd4 48. Kg2 g5 49. hxg5+ Kxg5 50. f4+ Kf6 51.
Rxh7 a3 52. Rh6+ Ke7 53. Rh7+ Ke6 54. Ra7 Rd3 55. Ra6+ Kd5 56. Ra5+ Kc4 57.
Rxf5) -1.46/14 1} 48. Kg2 {(48. Kg2 g5 49. hxg5+ Kxg5 50. Kf2 h5 51. Ke3
Rb4 52. Rg7+ Kf6 53. Rh7 a3 54. Rxh5 Ke5 55. Rh7 Rb3+ 56. Ke2) -1.48/13 0}
g5 49. hxg5+ Kxg5 50. Kf2 h5 {(50. ... h5 51. f4+ Kf6 52. Ra6+ Ke7 53. Ra7+
Kd6 54. Ra6+ Kd5 55. Ra5+ Ke6 56. Ra6+ Kd7 57. Ke3 Rb4 58. Kd3 Rb3+ 59. Kd4
Rxg3 60. Rxa4) -1.57/11 1} 51. f4+ {(51. f4+ Kf6 52. Ra6+ Ke7 53. Ra5 Ke6
54. Ra6+ Kd7 55. Ra5 h4 56. Rxf5 hxg3+ 57. Kxg3 Rd3+ 58. Kf2 a3 59. Ra5)
-1.34/12 1} Kf6 {(51. ... Kf6 52. Kg2 Ke6 53. Ra6+ Kd5 54. Ra5+ Kd6 55.
Ra6+ Kc5 56. Ra5+ Kc6 57. Rxf5 Rd5 58. Rf7 a3 59. f5) -1.46/12 1} 52. Ra6+
{(52. Ra6+ Ke7 53. Ra5 Ke6 54. Ra6+ Kd7 55. Ra5 Kd6 56. Rxf5 Rd5 57. Rf7 a3
58. Ke3 Ke6) -1.27/12 0} Ke7 53. Ra5 {(53. Ra5 Ke6 54. Ra6+ Kd7 55. Ra5 Rb4
56. Rxf5 a3 57. Rxh5 a2 58. Rh7+ Ke6 59. Ra7 Rb2+ 60. Ke3) -1.08/12 1} Kd6
{(53. ... Kd6 54. Rxf5 Rd5 55. Rf6+ Kc5 56. Ra6 Kb4 57. Ke3 a3 58. Ra7 Kb3)
-1.00/10 1} 54. Rxf5 Rd5 55. Rf6+ Kc5 56. Ra6 Kb4 57. Ke3 {(57. Ke3 a3 58.
Ra8 Ra5 59. Rb8+ Ka4 60. Rb1 a2 61. Ra1 Kb3 62. Ke4 Ra4+ 63. Ke5 Ra5+ 64.
Kd4) -1.83/11 1} a3 {(57. ... a3 58. Ke4 Ra5 59. Rb6+ Kc5 60. Rb1 a2 61.
Ra1 Ra4+ 62. Kd3 Kd5 63. f5 Ra3+ 64. Kc2) -1.69/11 1} 58. Ke4 {(58. Ke4 Ra5
59. Rb6+ Ka4 60. Rb1 a2 61. Ra1 Kb3 62. f5 Kb2 63. Rxa2+ Rxa2 64. f6)
-2.61/11 1} Ra5 {(58. ... Ra5 59. Rb6+ Kc3 60. Rb1 a2 61. Ra1 Kb2 62. Rxa2+
Rxa2 63. f5 Kc3 64. f6 Re2+ 65. Kf5 Kd4) -3.34/12 1} 59. Rb6+ {(59. Rb6+
Kc3 60. Rb1 a2 61. Ra1 Kb2 62. Rxa2+ Rxa2 63. f5 Ra3 64. Kf4 Ra4+ 65. Kg5
Rg4+ 66. Kxh5 <HT>) -3.25/11 0} Kc3 60. Rb1 a2 {(60. ... a2 61. Rc1+ Kb2
62. Rf1 a1=Q 63. Rxa1 Rxa1 64. f5 Rf1 65. Ke5 Rf3 66. f6 Rxg3) -3.70/12 1}
61. Ra1 {(61. Ra1 Kb2 62. Rg1 a1=Q 63. Rxa1 Rxa1 64. Kf3 Ra3+ 65. Kf2 h4
66. g4 h3 67. g5 h2 68. Kg2 <HT>) -4.50/11 1} Kb2 62. Rh1 a1=Q {(62. ...
a1=Q 63. Rxa1 Rxa1 64. Kf3 h4 65. g4 Rf1+ 66. Ke3 h3 67. Ke2 Rxf4 68. g5)
-11.38/11 0} 63. Rxa1 Rxa1 64. Kf3 {(64. Kf3 Ra4 65. Kf2 h4 66. gxh4
<EGTB>) -M21/12 1} Ra3+ {(64. ... Ra3+ 65. Kg2 Rd3 66. Kh3 h4 67. Kxh4
<EGTB>) -M22/13 1} 65. Kg2 Kc3 66. Kh2 Kd4 {(66. ... Kd4 67. Kg2 Ke4 68.
Kh3 Ra2 69. g4 hxg4+ <EGTB>) -M12/15 1} 67. Kg2 {(67. Kg2 Ke4 68. Kf2 Rb3
69. Kg2 Rb2+ 70. Kf1 Kf3 71. Ke1 Kxg3 <EGTB>) -M11/15 1} Ke4 {(67. ... Ke4
68. Kf2 Rb3 69. f5 Kxf5 <EGTB>) -M11/15 1} 68. Kf2 {(68. Kf2 Rb3 69. Ke2
Rxg3 <EGTB>) -M10/15 1} Rb3 {(68. ... Rb3 69. f5 Kxf5 <EGTB>) -M10/15 0}
69. Ke2 {(69. Ke2 Rxg3 <EGTB>) -M9/16 1} Rxg3 {(69. ... Rxg3 <EGTB>) -M9/16
1} 70. f5 Kxf5 71. Kf2 h4 72. Ke2 h3 73. Kf2 Kf4 74. Ke2 h2 75. Kd2 h1=Q
76. Ke2 Qh2+ 77. Ke1 Rg1# 0-1

Even with skill at 0, the engine still sees checkmate somehow.
Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

Truly a bizarre beginning (though we would expect positive on top and negative below, the sterling start of zero evaluation has me scratching my head.):

Code:

    Program                  Elo    +   -   Games   Score   Av.Op.  Draws
  1 Crafty-232ap00         : 3490    0   0     4   100.0 %   2890    0.0 %
  2 Crafty-232ap50         : 3343    0   0     4   100.0 %   2743    0.0 %
  3 Crafty-23.2a-skill-mod : 3082  497 415     4    62.5 %   2993   25.0 %
  4 Crafty-232ap10         : 3035  497 415     4    62.5 %   2946   25.0 %
  5 Crafty-232ap01         : 2981  415 497     4    37.5 %   3070   25.0 %
  6 Crafty-232am10         : 2776  675 409     4    25.0 %   2967    0.0 %
  7 Crafty-232am01         : 2752  318 262     4    12.5 %   3090   25.0 %
  8 Crafty-232am50         : 2541    0   0     4     0.0 %   3141    0.0 %
Joost Buijs
Posts: 1635
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Questions for the Stockfish team

Post by Joost Buijs »

bob wrote: OK, some background. It turns out that if you replace Crafty's evaluation with a pure random number, it plays well above 2,000 Elo. If you disable all the search extensions, reductions, null-move and such, you still can't get it below 1800. There has been a long discussion about this, something I call "The Beal Effect" since Don Beal first reported on this particular phenomenon many years ago. So a basic search + random eval gives an 1800 player. Full search + full eval adds 1,000 to that. How much from each? Unknown. But I have watched many, many Stockfish vs Crafty games and the deciding issue does not seem to be evaluation. We seem to get hurt by endgame search depth more than anything...
Of course both search depth and evaluation are important. The issue is that value-based pruning depends upon the evaluation. When your evaluation is more consistent, the pruning just works better. At least that's what I found.
Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

New chapter in the theatre of the bizarre:

Code:

   Program                  Elo    +   -   Games   Score   Av.Op.  Draws
 1 Crafty-232ap00         : 3525    0   0     7   100.0 %   2925    0.0 %
 2 Crafty-23.2a-skill-mod : 3197  374 317     7    78.6 %   2972   14.3 %
 3 Crafty-232ap50         : 3139  441 334     7    71.4 %   2980    0.0 %
 4 Crafty-232ap10         : 3058  342 316     6    58.3 %   3000   16.7 %
 5 Crafty-232ap01         : 2942  316 342     6    41.7 %   3000   16.7 %
 6 Crafty-232am01         : 2861  255 290     7    28.6 %   3020   28.6 %
 7 Crafty-232am10         : 2803  317 374     7    21.4 %   3028   14.3 %
 8 Crafty-232am50         : 2475    0   0     7     0.0 %   3075    0.0 %
Everything makes easy sense except the top entry. 100% skill better than 50% which is better than 10% which is better than 1% which is better than -1% which is better than -10% which is better than -50%.

However, the mighty clout of EVAL_ZERO is giving me pause. Surely, it's just a statistical abnormality. Or perhaps in the code somewhere there is a test for (skill == 0) and it is exercising a different branch.
Mangar
Posts: 65
Joined: Thu Jul 08, 2010 9:16 am

Re: Questions for the Stockfish team

Post by Mangar »

Hi Bob,

I think the answer is not that simple any more.

In principle I agree for evaluation terms that change very quickly. I recently implemented an Elo reduction in Spike. Adding a term that randomly adds one pawn or even a rook does not reduce the playing strength that much. Here it is the statistics that help. On the other hand, adding an extra rook every time you put a white bishop on a8 will make your engine weak. A "random" weakness, like misjudging some positional effects that change quickly, doesn't have much effect on playing strength. A "stable" weakness, like ignoring passed pawns, has a large effect.

But this is not the point I wanted to reply to you about :-)

I found that with a very selective search, done with large LMR and similar terms, eval has the new job of guiding the search through the tree. That's quite a different job from evaluating a final position as well as it can.
Additionally, I found that evaluation terms are able to change the search depth very much.
A simple example is a position with some pawns and exactly one bishop for each side. Both bishops are on squares of the same color, and only one side has a passed pawn. Sure, the position may not be trivial and the passed pawn not far advanced. If your evaluation has a huge term for attacking the square in front of the passed pawn (which is quite good in many positions), the search depth will drop. The reason is the alternating nature of the evaluation in this position, which causes many re-searches with LMR.

Another point is that the additional search depth of large LMR will bring many more evaluations of endgame positions into middlegame play. IMHO endgame position evaluation is much harder than middlegame evaluation, as the risk is high of having an evaluation that is "stably" wrong even if pieces change place. A correct evaluation of an outside passer is a good example here.

Greetings Volker
Mangar Spike Chess
Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

We are now approaching the cliffs of insanity.

Code:

   Program                  Elo    +   -   Games   Score   Av.Op.  Draws

 1 Crafty-232ap00         : 3526    0   0    15   100.0 %   2926    0.0 %
 2 Crafty-23.2a-skill-mod : 3207  234 202    15    80.0 %   2966   13.3 %
 3 Crafty-232ap50         : 3142  211 188    15    73.3 %   2966   13.3 %
 4 Crafty-232ap10         : 3077  195 181    15    66.7 %   2956   13.3 %
 5 Crafty-232ap01         : 2926  181 195    15    33.3 %   3046   13.3 %
 6 Crafty-232am01         : 2883  175 189    15    30.0 %   3031   20.0 %
 7 Crafty-232am10         : 2763  227 288    15    16.7 %   3042    6.7 %
 8 Crafty-232am50         : 2476    0   0    15     0.0 %   3076    0.0 %
I suggest that it may be worthwhile for others to perform the simple test with the patch I posted up above. +300 Elo for removal of eval seems a bit odd at best.
Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

You said I would be surprised, but I guess even astonished would not begin to cover it. At least it turns out that ZERO_EVAL is not indestructible:

Code:

   Program                  Elo    +   -   Games   Score   Av.Op.  Draws

 1 Crafty-232ap00         : 3523    0 301    18    97.2 %   2923    5.6 %
 2 Crafty-23.2a-skill-mod : 3183  184 168    18    75.0 %   2992   16.7 %
 3 Crafty-232ap50         : 3146  206 181    18    77.8 %   2928   11.1 %
 4 Crafty-232ap10         : 3109  178 166    18    66.7 %   2989   11.1 %
 5 Crafty-232ap01         : 2924  158 166    18    36.1 %   3023   16.7 %
 6 Crafty-232am01         : 2867  157 168    18    27.8 %   3033   22.2 %
 7 Crafty-232am10         : 2763  197 236    18    19.4 %   3010    5.6 %
 8 Crafty-232am50         : 2485    0   0    18     0.0 %   3085    0.0 %
 
p00 is skill=0
skill-mod is skill=100
p50 is skill=+50
p10 is skill=+10
p01 is skill=+1
m01 is skill=-1
m10 is skill=-10
m50 is skill=-50
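
For anyone checking the tables: the Elo column is consistent with the usual logistic relation between score fraction and rating difference, added to the average opponent rating. The sketch below only illustrates that relation. The function name is made up, and the 600-point cap for 0 % and 100 % scores is inferred from the numbers above (3490 - 2890 = 600), not taken from the rating tool itself.

```python
import math


def elo_diff_from_score(score, cap=600.0):
    """Rating difference implied by a score fraction in [0, 1].

    Uses the logistic Elo relation D = -400 * log10(1/score - 1).
    The cap is an assumption inferred from the tables, where 0 % and
    100 % scores sit exactly 600 points from the average opponent.
    """
    if score <= 0.0:
        return -cap
    if score >= 1.0:
        return cap
    diff = -400.0 * math.log10(1.0 / score - 1.0)
    # Clamp very lopsided scores to the same cap.
    return max(-cap, min(cap, diff))


if __name__ == "__main__":
    for pct in (100.0, 62.5, 37.5, 25.0, 12.5, 0.0):
        print(pct, round(elo_diff_from_score(pct / 100.0)))
```

A 62.5 % score gives about +89, which matches 3082 against an average opponent of 2993 in the first table. With only 4 to 18 games per engine, the wide +/- columns say more than the ordering does.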
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions for the Stockfish team

Post by Daniel Shawul »

do the math first...

.01 * real eval + .99 * random() where random is between 0 and 100 (one pawn value).

don't know what your "completely random" comment means, but I have tested (and just did it again) with pure random scores.
Just take Crafty, and right at the top of evaluate.c return 100 * random_generator() (assuming random_generator() returns a float 0.0 <= N < 1.00). Then you won't be guessing.
Actually I did do the math but it seems you don't comprehend. What was talked about
was complete randomness, but you suddenly decided to mix order into it, no matter
how insignificant you think it is! The skill-1 mixes 1% of 'order' with 99% of
'chaos', which roughly translates into a maximum of 3-sigma (99.7% or so) result if we
are to say it is completely based on randomness. So when I say completely random,
I mean 0% order. I don't know why you thought otherwise.
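
To make the disagreement concrete, here is a minimal sketch of the blend quoted above (.01 * real eval + .99 * random(), with the random part between 0 and 100 centipawns). It is not Crafty's code; the function names and the idea of passing the skill as a percentage are assumptions made for illustration only.

```python
import random


def blended_eval_cp(real_eval_cp, skill_percent, rng):
    """Mix a real evaluation with uniform noise, both in centipawns.

    skill_percent=100 returns the real eval unchanged, skill_percent=0
    returns pure noise in [0, 100). Hypothetical helper, not Crafty's API.
    """
    weight = skill_percent / 100.0
    noise_cp = 100.0 * rng.random()
    return weight * real_eval_cp + (1.0 - weight) * noise_cp


def mean_blended_eval_cp(real_eval_cp, skill_percent, samples, rng):
    """Average of many blended evaluations of the same real score."""
    total = 0.0
    for _ in range(samples):
        total += blended_eval_cp(real_eval_cp, skill_percent, rng)
    return total / samples


if __name__ == "__main__":
    rng = random.Random(2010)
    queen_up = mean_blended_eval_cp(900.0, 1, 20000, rng)
    level = mean_blended_eval_cp(0.0, 1, 20000, rng)
    print(queen_up - level)
```

At skill=1 a 900 cp edge (roughly a queen) shifts the average score by 9 cp, while the noise spans 99 cp. So the signal is small but not zero, which is the distinction between "1% order" and "0% order" being argued here.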

There was recently news on the discovery of the Higgs boson based on a 3 sigma result.
Despite doubts about the source of this news, this statement was by itself enough to convince
scientists it is not a _discovery_, as that is something attributed to a 5 sigma experimental
result. More about the odds of discovery here: http://www.fnal.gov/pub/ferminews/fermi ... 16/p1.html

I did not guess anything. Tord and I just did this to our engines and got something light years away
from 1800 Elo.
I have no "eval cache". There is a pawn score cache (pawn hash) but the random trick is applied to the score after that is used,
so this has absolutely no effect on anything. Yes, it causes TT issues. Again, "so what"? We want worse play, not an optimal search. Let it fail low and then high on the same move, it just wastes more time.
You have to pay attention to details. The same evaluation should be assigned to the same position
when it is visited twice. What is the sense of giving two different scores to the same position?
In fact I have one more detail that I think needs to be addressed. Just using random() gives out positive values
(winning) for either side to move. They both get the same values, which completely breaks the
_zero sum_ game notion. I am going to change this so that the score for white is the negative of the score for black
for a given position.
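
The two details above (same score for the same position, and zero-sum scores) can be sketched as follows. This is only one way to do it, assuming the position is available as some string key that does not include the side to move; the function names and the +/-100 cp range are hypothetical and do not come from any of the engines discussed.

```python
import hashlib


def white_score_cp(position_key):
    """Pseudo-random but repeatable score from White's point of view.

    Hashing the position key means the same position always gets the
    same score, so transposition-table entries stay consistent.
    """
    digest = hashlib.sha256(position_key.encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")  # 0 .. 2**32 - 1
    return value % 201 - 100  # map to -100 .. +100 centipawns


def random_eval_cp(position_key, white_to_move):
    """Score from the side to move's point of view (negamax convention)."""
    score = white_score_cp(position_key)
    return score if white_to_move else -score


if __name__ == "__main__":
    key = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
    print(random_eval_cp(key, True), random_eval_cp(key, False))
```

With this construction, what is good for White in a position is bad for Black by exactly the same amount, and revisiting a position never changes its score.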
We are using uniform PRNGs. The larger the sample, the greater the probability of getting a large PRN. That is pretty simple to understand.
Duh. Like I said, you roughly get some kind of extreme value distribution from taking the maximum of
random numbers. This is a little bit skewed to the left compared to normal. But if it were normal,
we could use the 1/sqrt(n) rule to make the comparison. You would need to multiply the sample size by 4
to double the certainty of getting a larger number from the sample. So if you compare 15 and 20, you see
it doesn't differ much. Maybe when the mobility difference is like 15 and 60 you start talking of
something, and that is a maybe. You can also take the exact mobility score (however you calculate it) if you like.
I do not believe mobility only brings 1800 Elo, all it does will be to properly place his queen on the highest 'mobile' square only
to be captured by the opponent... epic fail!
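
The 15 versus 20 comparison can be put in numbers. For n independent uniform draws on [0, 1), the expected maximum is n / (n + 1); that is a standard result, not something taken from either engine. The sketch below computes it and checks it by simulation. The function names are invented for this example.

```python
import random


def expected_max_uniform(n):
    """Exact expected maximum of n independent uniform [0, 1) draws."""
    return n / (n + 1)


def simulated_max_uniform(n, trials, rng):
    """Monte Carlo estimate of the same quantity."""
    total = 0.0
    for _ in range(trials):
        total += max(rng.random() for _ in range(n))
    return total / trials


if __name__ == "__main__":
    rng = random.Random(7)
    for moves in (15, 20, 60):
        exact = 100.0 * expected_max_uniform(moves)  # scaled to 0..100 cp
        approx = 100.0 * simulated_max_uniform(moves, 20000, rng)
        print(moves, round(exact, 2), round(approx, 2))
```

On a 0 to 100 scale the expected maxima are 93.75 for 15 moves, about 95.24 for 20 and about 98.36 for 60, so a single node sees only a small bias from a few extra moves. Whether that bias compounds through a deep search is the open question in this exchange.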
Care to rephrase that? Who is talking about "amplifying Elo" anywhere? Just a simple way to introduce mobility into the eval, which does lead to decent play. Not GM play, but also not 1200-level play either. I want to get the Elo down to 800 or less. Right now, with 23.2, the best one can get is down to 1800, which is much too high. With a purely random eval, at that
Are you saying approximate mobility eval is the only ingredient added by the search?
I just want to make sure.
QED
Posts: 60
Joined: Thu Nov 05, 2009 9:53 pm

Re: Questions for the Stockfish team

Post by QED »

Daniel Shawul wrote:I do not believe mobility only brings 1800 Elo, all it does will be to properly place his queen on the highest 'mobile' square only
to be captured by the opponent... epic fail!
Mobility itself does not give 1800 Elo. Mobility AND search gives it.

With a 3-ply search, the program is happy not to recapture a knight but to centralize its queen instead. With a 7-ply search, however, it sees that the queen can be driven back (under threat of capture, which would reduce mobility), and then the missing knight will decide the mobility advantage.

So, with deep search, the engine does not go for short-term mobility, but for a lasting mobility advantage (ideally with the opponent making forced moves only). So this engine will not make errors that can be translated into a stable mobility disadvantage within the search horizon. And since most of the tactical errors of the opponent can be punished in a way that leads to a stable mobility advantage, the engine will play reasonably well. The opponent would need to accumulate 'strategic' advantages first, to prevail comfortably.
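
A toy model of how search turns random leaf scores into a mobility preference: at two plies, the value of our move is the minimum over the opponent's replies, so a move that leaves the opponent fewer replies tends to score higher. The sketch below is not taken from any engine; the reply counts (5 and 30) and the function names are assumptions chosen for illustration.

```python
import random


def move_value(num_replies, rng):
    """Value of our move at 2 ply with random leaf scores.

    Each reply leads to a leaf scored uniformly in [0, 1) from our
    point of view, and the opponent picks the reply worst for us.
    """
    return min(rng.random() for _ in range(num_replies))


def fraction_preferring_fewer_replies(few, many, trials, rng):
    """How often the search prefers the move that restricts the opponent."""
    wins = 0
    for _ in range(trials):
        if move_value(few, rng) > move_value(many, rng):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    rng = random.Random(1)
    print(fraction_preferring_fewer_replies(5, 30, 20000, rng))
```

By symmetry the smallest of all 35 draws falls among the 30 replies with probability 30/35, so the restricting move is preferred about 86 % of the time in this model. Real positions have tactics and unequal branching at every ply, so this only shows the direction of the effect, not its size.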
Testing conditions:
tc=/0:40+.1 option.Threads=1 option.Hash=32 option.Ponder=false -pgnin gaviota-starters.pgn -concurrency 1 -repeat -games 1000
hash clear between games
make build ARCH=x86-64 COMP=gcc
around 680kps on 1 thread at startposition.