Questions for the Stockfish team

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions for the Stockfish team

Post by bob »

Dann Corbit wrote:
lkaufman wrote:
bob wrote:
Tord Romstad wrote:
No. With skill 1, what you get for a score from evaluate is this:

score = 0.01 * real_evaluation + .99 * random();

where random() returns a value between 0 and 100 (0 to 1 pawn).

With skill 1, the material/positional score is almost nothing, the remainder of the score is a pure random number.

1 0 is not the best test. That restricts the depth enough that random eval fails, but use something like 1+1 or 2+2 and watch what happens. Suddenly it won't hang material, and plays decent chess...
Two points: The game Tord cited was 2'+1", so he already followed your advice in advance. So there is some discrepancy between his findings and yours. The only obvious culprit is the 0.01 weight on real eval; it's not much, but maybe it biases things enough in favor of good moves to make the difference between 800 and 1800. Hard to believe, but you should play a version with zero weight for real eval against some weak program with a rating of maybe 1600 to see what happens. Do you have any other explanation for the huge difference between Crafty random and Stockfish random? With eval not an issue and with LMR and such turned off in Crafty, there can hardly be an explanation in the difference between the programs.
Something that is probably very relevant is whether the program is still able to recognize draws and wins, despite the awful eval. If (for instance) a program can see that chasing the opposing king around the mulberry bush with its queen will cause a repeated position resulting in a draw, this will collect quite a lot of draws. And if it can perfectly recognize a checkmate in 8 full moves, it will gain a lot of wins that way (even having checkmate recognized at all will result in a lot of wins, because 0.01 * 30000 = 300, so the path will be seen as a good one). If (on the other hand) the eval function collects no information at all, then I expect the performance to be far, far worse.

So, we have these two questions:
Can the crippled program see a draw?
Can the crippled program see a win?
Yes to both of those. But remember, at skill 1, all the clever stuff is disabled. No reductions. No null-move. No check extensions. The search depth is therefore not very deep. I don't have an easy way to avoid recognizing mates since those are done in the search itself, same for repetitions. But if a program will hang pawns or pieces, those won't be issues anyway.
bob

Re: Questions for the Stockfish team

Post by bob »

Daniel Shawul wrote:I tested it and as I said it played terribly ... not even close to 1000 Elo.
Before we argue further, I noticed two things in your implementation.

One is that you don't have a skill value of 0. Why? Even though it is very
small, one cannot completely disregard the effect of the small real
evaluation introduced. For example, with skill=1 you cannot get more than
3 sigma accuracy. Completely random means _completely_ random, not 0.01 of real evaluation,
blah blah ...

do the math first...

.01 * real eval + .99 * random() where random is between 0 and 100 (one pawn value).

If the program drops a pawn and evaluates that position, you get .01 * 100 + .99 *random().

How much does that .01 * 100 contribute? 1 point, against up to 99 points for the random part (average about 50). So what difference would removing that last 1% make? I didn't do "skill 0" because that makes no sense. It implies zero skill, which would never win or draw a single game at all, and that seemed like an unreasonable expectation.

I don't know what your "completely random" comment means, but I have tested (and just did it again) with pure random scores. Just take Crafty, and right at the top of evaluate.c return 100 * random_generator() (assuming random_generator() returns a float 0.0 <= N < 1.00). Then you won't be guessing.
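
The arithmetic above can be sketched in C as follows. This is only an illustration, not Crafty's actual SKILL code: blend_eval and random_centipawns are hypothetical names, and the random term is passed in as a parameter so that the contribution of the real evaluation can be inspected in isolation.

```c
#include <stdlib.h>

/* Hypothetical helper: a uniform value in [0, 100), i.e. 0 to 1 pawn
   expressed in centipawns. */
double random_centipawns(void) {
  return 100.0 * ((double) rand() / ((double) RAND_MAX + 1.0));
}

/* Hypothetical blend: skill (0..100) is the percentage weight given to the
   real evaluation; the remainder goes to the random term rnd. */
double blend_eval(double real_eval, int skill, double rnd) {
  double w = skill / 100.0;

  return w * real_eval + (1.0 - w) * rnd;
}
```

With skill = 1 and the same random draw, a position that is a pawn down (real eval -100) scores only 1 point lower than a level position, while the random term alone spans 0 to 99 points.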

Another: when the same position is evaluated twice, it gets a different evaluation
because you call random() again. This breaks the TT and eval cache. Is this part of
the trick I (and probably some others) don't understand? For my test I used
(hash_key % 1000) and then changed to (hash_key % 100) after you suggested
that it should be like that, for reasons that are not obvious to me. Unless, of course,
you want to add it to the real evaluation somehow, which you just did. Scaling shouldn't matter
at all for the original statement you made, that is, I repeat, a completely random eval.

I have no "eval cache". There is a pawn score cache (pawn hash), but the random trick is applied to the score after that is used, so this has absolutely no effect on anything. Yes, it causes TT issues. Again, "so what"? We want worse play, not an optimal search. Let it fail low and then high on the same move; it just wastes more time.

I can understand that there is a bit of sense in what Marco tried to explain vis-à-vis
approximate mobility evaluation. I accept that it is some weird way of approximating mobility.
There is nothing to accept. This has been _proven_ already. Don Beal was the first to report the issue, somewhere in the late 80s or early 90s, in the JICCA. This really happens.
Even at that I am not completely sure. If you take random numbers and get the maximum, you get one of
the extreme value distributions (say Gumbel). Now with that, how sure are you that
you get a larger random value if you take a sample of 20 or 15...
We are using uniform PRNGs. The larger the sample, the greater the probability of getting a large PRN. That is pretty simple to understand.
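
For the one-ply version of this sampling argument, the numbers can be worked out exactly. A rough sketch, assuming independent draws from a continuous uniform distribution on [0, hi); the function names are made up for illustration. A real search stacks minimax over several plies on top of this, so the sketch only shows the direction of the bias, not its size in an actual game.

```c
/* Probability that the maximum of n independent uniform draws exceeds the
   maximum of m further independent draws from the same distribution.
   The overall maximum is equally likely to be any of the n + m draws,
   so the answer is n / (n + m). */
double prob_larger_max(int n, int m) {
  return (double) n / (double) (n + m);
}

/* Expected maximum of n independent draws, uniform on [0, hi). */
double expected_max(int n, double hi) {
  return hi * (double) n / (double) (n + 1);
}
```

For the 20-versus-15 case asked about above, this gives 20/35, about 57%, that the side with 20 samples holds the larger maximum, with expected maxima of roughly 95.2 versus 93.8 on a 0 to 100 scale.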

There must be some sensible input to get a sensible result. I accept
(0.01 eval and a bit of mobility eval added by the search) are
possible improvements. The search doesn't amplify Elo if it doesn't add
something good, as in the mobility case (disguised at first). Otherwise it is
garbage in, garbage out.
Care to rephrase that? Who is talking about "amplifying Elo" anywhere? This is just a simple way to introduce mobility into the eval, which does lead to decent play. Not GM play, but not 1200-level play either. I want to get the Elo down to 800 or less. Right now, with 23.2, the best one can get is down to 1800, which is much too high. With a purely random eval, at that.
Dann Corbit
Posts: 12797
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

I guess that the tiny eval contribution is important. On average, the random value will be 0.5 or half of a pawn, and it will be high as often as it is low. I guess that the noise, going in both directions, will tend to channel the program in the right direction over a long search.

IOW, I suspect that 1.0 * random() will play far worse than having the 1% correct part because every correct eval is a tiny nudge in the right direction that may average out over a large branch of a tree to have some proper weight (500 million high nodes + 500 million low nodes will give an answer that I suspect is pretty good).
bob

Re: Questions for the Stockfish team

Post by bob »

Daniel Shawul wrote:Ok, maybe it does approximate mobility as you say, which is one positional term, but I really don't see how it can reach 1800 Elo without basic material eval, unless I overvalued the Elo number. Someone told me Strelka can be a 2600 Elo engine with material-only eval. So my thought was that it is possible to reach more than 1800 Elo with material only, but the sense I got from the discussion is that this comes from no eval.
To be blunt, it doesn't matter what you "see how can happen." This is not speculation about an issue. It is an actual observed fact from some that apparently test weaker engines. And we had the discussion about this here a month back or so.

Feel free to take 23.2, which is public, compile it with -DSKILL, crank it up using skill=1, turn the book off, and play a game. Unless you are a 2000+ player, it is going to play far better than you expect, and if you are not careful, it will pick you apart tactically, because maximizing mobility means winning as much material as you can while losing as little as possible. The random eval begins to fail as search depth drops, because the tree size shrinks and you get fewer chances to pick up those big random eval scores. At present, 23.3 has a variable-sized loop in Evaluate() that can drop the NPS all the way down to 1000 for skill=1, and that does knock the Elo flat, because it takes depth and nodes to get "the Beal effect" to work.
bob

Re: Questions for the Stockfish team

Post by bob »

As I told Daniel, you suspect _wrong_. You can certainly make that simple change in evaluate.c, down at the bottom where the SKILL code is. Just make it pure random evaluation, play it, and report back. Actually, report back "amazed." It won't change a thing. That .01 really is irrelevant compared to the big random component.

BTW, you are missing the key point. "The Beal effect" is based on sampling theory here. So the .5 average random number is irrelevant. We steer into the parts of the tree where we have lots of moves to take us to positions where we get those big random values, that is the whole point of the Beal effect, in fact.
Dann Corbit

Re: Questions for the Stockfish team

Post by Dann Corbit »

This is *really* interesting to me. I made the following change, so that I can even make negative eval hints, and I will test with various settings:

#if defined(SKILL)
  else if (OptionMatch("skill", *args)) {
    if (nargs < 2) {
      printf("usage:  skill <-100..100>\n");
      return (1);
    }
    if (skill != 100) {
      printf("ERROR:  skill can only be changed one time in a game\n");
    } else {
      skill = atoi(args[1]);
      if (skill < -100 || skill > 100) {
        printf("ERROR: skill range is -100 to 100 only\n");
        skill = 100;
      }
      Print(128, "skill level set to %d%%\n", skill);
      null_depth = null_depth * skill / 100;
      check_depth = check_depth * skill / 100;
      LMR_depth = LMR_depth * skill / 100;
    }
  }
#endif
bob

Re: Questions for the Stockfish team

Post by bob »

That appears to break things. Adjusting things like null-depth with a - skill will blow things up because it will do a null-move search with a _deeper_ depth, not a reduced one. Ditto for LMR which will suddenly be extending, not reducing.
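
A small sketch of the integer arithmetic in that scaling step. The function names and the starting depth of 3 used in the note below are assumptions chosen for illustration, not values taken from Crafty's source, and the clamped variant is just one possible guard.

```c
/* Same arithmetic as the posted change: depth * skill / 100 in integer math.
   A negative skill flips the sign of the result. */
int scale_depth(int depth, int skill) {
  return depth * skill / 100;
}

/* Hypothetical guard: clamp skill to 0..100 before scaling, so a reduction
   can shrink to zero but never turn into an extension. */
int scale_depth_clamped(int depth, int skill) {
  if (skill < 0)
    skill = 0;
  if (skill > 100)
    skill = 100;
  return depth * skill / 100;
}
```

Starting from a depth setting of 3, scale_depth(3, -100) gives -3, which is the sign flip described above. Integer division in C99 and later truncates toward zero, though, so with skill = -1 any depth setting smaller than 100 scales to 0 rather than to a negative value.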
Dann Corbit

Re: Questions for the Stockfish team

Post by Dann Corbit »

It might make it fun for raw beginners.

Here is a game between normal Crafty (skill = 100) and Crafty with skill = -1:

[Event "Crskill"]
[Site "DCORBIT2008"]
[Date "2010.07.20"]
[Round "1"]
[White "Crafty-232am01"]
[Black "Crafty-23.2a-skill-mod"]
[Result "0-1"]
[BlackElo "2800"]
[ECO "A28"]
[Opening "English"]
[Time "20:50:44"]
[Variation "Four Knights, Nenarokov Variation"]
[WhiteElo "2700"]
[TimeControl "60+1"]
[Termination "normal"]
[PlyCount "54"]
[WhiteType "program"]
[BlackType "program"]

1. d4 Nc6 2. Nf3 d5 3. c4 e5 4. Bg5 {(4. Bg5 f6 5. Bh4 dxc4 6. Qa4 exd4 7.
Qxc4 Ke7 8. Nfd2) 0.00/9 2} f6 {(4. ... f6 5. cxd5 Nxd4 6. Be3 Nf5 7. Bc1
Bc5 8. e4 Nd4 9. Nc3 Ne7 10. Nxd4 exd4 11. Bb5+ Bd7 12. Qh5+ Kf8) -0.16/15
2} 5. Bh4 {(5. Bh4 dxc4 6. Qa4 Bd7 7. dxe5 Nb8 8. Qc2 Ke7 9. Ng1) 0.00/9 2}
g5 {(5. ... g5 6. Bg3 h5 7. cxd5 Bb4+ 8. Nc3 Qxd5 9. Qd3 h4 10. Qg6+ Qf7
11. Qxf7+ Kxf7 12. d5 Nce7 13. Nxg5+ fxg5 14. Bxe5) -0.65/13 3} 6. Bxg5
{(6. Bxg5 fxg5 7. Nxe5 Nxe5 8. dxe5 Bb4+ 9. Nc3 Ke7 10. Rg1) 0.00/9 2} fxg5
{(6. ... fxg5 7. Nc3 Bg7 8. dxe5 d4 9. Nd5 Nxe5 10. Nxg5 h6 11. Ne4 Nxc4
12. Qa4+ c6 13. Qxc4 Qxd5 14. Qxd5 cxd5) -2.51/12 1} 7. e3 {(7. e3 Bb4+ 8.
Nc3 Nxd4 9. Nxd4 Bxc3+ 10. bxc3 dxc4 11. Qa4+) 0.00/9 2} Bb4+ {(7. ... Bb4+
8. Nc3 exd4 9. Nxd4 Nf6 10. Nxc6 bxc6 11. h4 Bxc3+ 12. bxc3 gxh4 13. Rxh4
Rg8 14. cxd5 cxd5 15. Bb5+ Bd7) -2.82/13 3} 8. Nc3 {(8. Nc3 Nxd4 9. Nxd4
Bxc3+ 10. bxc3 dxc4 11. Bxc4 b5 12. Bf1) 0.00/9 3} exd4 {(8. ... exd4 9.
Nxd4 Nf6 10. h4 Bxc3+ 11. bxc3 g4 12. Nxc6 bxc6 13. Qd4 O-O 14. Bd3 Be6 15.
cxd5 Nxd5 16. Rb1) -2.86/15 2} 9. Nxd4 {(9. Nxd4 Ne5 10. cxd5 Bxc3+ 11.
bxc3 c5 12. dxc6 Nf3+ 13. Ke2 Nxd4+ 14. exd4) 0.00/9 2} Nf6 {(9. ... Nf6
10. h4 Bxc3+ 11. bxc3 g4 12. Nxc6 bxc6 13. cxd5 cxd5 14. Qd4 O-O 15. Bd3
Nh5 16. Qe5 Ng7) -2.85/14 2} 10. a3 {(10. a3 Nxd4 11. exd4 Bxa3 12. bxa3
Ng4 13. cxd5 Nf6 14. Be2) 0.00/9 7} Bxc3+ {(10. ... Bxc3+ 11. bxc3 O-O 12.
Rc1 Ne5 13. cxd5 Qxd5 14. f3 Qf7 15. Qc2 Rd8 16. Rd1 Nc4 17. Bxc4 Qxc4 18.
e4) -3.59/15 2} 11. bxc3 {(11. bxc3 dxc4 12. Nxc6 Qxd1+ 13. Rxd1 bxc6 14.
Bxc4 Ba6 15. Bxa6) -0.01/9 2} O-O {(11. ... O-O 12. Rc1 Ne5 13. cxd5 Qxd5
14. f3 Qf7 15. h4 c5 16. hxg5 Nd5 17. e4 Ne3 18. g6 Qxg6 <HT>) -3.57/15 2}
12. Nb5 {(12. Nb5 Na5 13. cxd5 Ne4 14. Ra2 c6 15. dxc6 Qxd1+ 16. Kxd1 Nxf2+
17. Rxf2) -0.01/9 3} a6 {(12. ... a6 13. Nd4 Ne5 14. cxd5 Ne4 15. f3 Nxc3
16. Qc2 Nxd5 17. e4 Nf4 18. Rd1 Qe7 19. g3 Ne6) -4.11/15 2} 13. Nd4 {(13.
Nd4 Na5 14. cxd5 Qxd5 15. Ne2 Qxd1+ 16. Rxd1 Rd8 17. Rxd8+) -0.02/9 2} Ne5
{(13. ... Ne5 14. cxd5 Qxd5 15. f3 c5 16. Nb3 c4 17. Nd4 Qa5 18. Qc2 Nd5
19. Kd2 Nd3 20. Bxd3 cxd3 21. Qxd3) -3.74/14 4} 14. Nf3 {(14. Nf3 Ne4 15.
Bd3 Nxd3+ 16. Qxd3 Rxf3 17. gxf3 Nxf2 18. Kxf2) -0.01/9 2} Bg4 {(14. ...
Bg4 15. Be2 Bxf3 16. Bxf3 Nxf3+ 17. gxf3 dxc4 18. e4 Nh5 19. Qxd8 Raxd8 20.
Rg1 Rxf3 21. Rb1 h6 22. Rxb7 Rxc3 23. Rxc7 Rxa3 24. Rxc4) -4.54/14 2} 15.
Ke2 {(15. Ke2 Ne4 16. Qxd5+ Qxd5 17. cxd5 Rxf3 18. gxf3 Ng3+ 19. hxg3 Nxf3)
-0.03/9 3} Nxf3 {(15. ... Nxf3 16. gxf3 Ne4 17. Qxd5+ Qxd5 18. cxd5 Bxf3+
19. Kd3 Nxf2+ 20. Kd4 Bxh1 21. Bc4 Rad8 22. Rg1 Rf5 23. h4 Bxd5 24. Rxg5+
Rxg5 25. hxg5) -10.39/15 2} 16. gxf3 {(16. gxf3 Ne4 17. Qxd5+ Qxd5 18. cxd5
Bxf3+ 19. Kd3 Rf4 20. exf4 gxf4 <HT>) -0.02/10 2} Ne4 {(16. ... Ne4 17.
Qxd5+ Qxd5 18. cxd5 Bxf3+ 19. Ke1 Bxh1 20. Ra2 Nxc3 21. Rb2 Bxd5 22. Rc2
Nb1 23. Rxc7 Nxa3 24. Rc5 Bh1 25. Rxg5+ Kh8) -10.54/15 12} 17. Qxd5+ {(17.
Qxd5+ Qxd5 18. cxd5 Bxf3+ 19. Kd3 Rf4 20. exf4 gxf4 21. Rg1+ Kf7) -0.02/10
2} Qxd5 {(17. ... Qxd5 18. cxd5 Bxf3+ 19. Ke1 Bxh1 20. Ra2 Nxc3 21. Rb2 b5
22. Bh3 Bxd5 23. Rd2 Be4 24. Rd7 Kh8 25. Rxc7) -10.88/15 1} 18. cxd5 {(18.
cxd5 Bxf3+ 19. Kd3 Nxf2+ 20. Kd4 c5+ 21. dxc6 Rad8+ 22. Ke5 Bd5 23. cxb7
Rde8+ 24. Kxd5) -0.04/11 3} Bxf3+ 19. Kd3 {(19. Kd3 Nxf2+ 20. Kd4 Ng4 21.
Bxa6 Rfd8 22. Bxb7 Rxa3 23. Rxa3 c5+) -0.04/10 2} Bxh1 {(19. ... Bxh1 20.
f4 gxf4 21. Kd4 c5+ 22. Kd3 fxe3 23. c4 Nf2+ 24. Kc3 Rad8 25. Be2 Rd6 26.
Rg1+ Kh8 27. a4 Ne4+ 28. Kc2) -11.18/14 5} 20. Ke2 {(20. Ke2 Rxf2+ 21. Kd3
Rf4 22. exf4 Rf8 23. fxg5 Rf6 24. gxf6 Nf2+) -0.04/10 2} Rxf2+ {(20. ...
Rxf2+ 21. Ke1 Raf8 22. Bd3 Bg2 23. c4 Rf1+ 24. Bxf1 Rxf1+ 25. Ke2 Rxa1 26.
Kd3 Rxa3+ 27. Kd4 g4 28. c5 Ra4+ 29. Ke5 Nxc5) -15.42/13 1} 21. Kd3 {(21.
Kd3 Ng3 22. hxg3 c5 23. dxc6 Rb2 24. cxb7 Rd2+ 25. Kxd2 Rd8+) -0.04/10 3}
Rd8 {(21. ... Rd8 22. Rd1 Rxd5+ 23. Kc4 Rxd1 24. Bd3 Nd6+ 25. Kd4 Be4 26.
Ke5 Rxd3 27. Ke6 Rxh2 28. Kf6 Rxc3 29. Kxg5 Rxa3) -23.21/12 1} 22. Bg2
{(22. Bg2 Nd6 23. Bxh1 c5 24. dxc6 Rb2 25. cxb7 Rd2+ 26. Kxd2 Nf7+)
-0.04/10 3} Rxd5+ {(22. ... Rxd5+ 23. Kc4 Rc5+ 24. Kb3 Rb5+ 25. Kc4 Rd2 26.
Bxe4 Bxe4 27. Ra2 Bd5#) -M6/11 0} 23. Kc4 {(23. Kc4 Re2 24. Kxd5 Nxc3+ 25.
Ke6 Na4 26. Bxh1 Re1 27. Rxe1 Nc5+) -0.03/10 1} Rc5+ 24. Kb3 {(24. Kb3
Rxc3+ 25. Kb4 Rb2+ 26. Ka5 Rb5+ 27. Ka4 Nc5#) -M4/12 2} Rb5+ 25. Kc4 {(25.
Kc4 Rd2 26. Bxe4 Bxe4 27. Ra2 Bd5#) -M3/12 2} Rd2 26. Bxe4 {(26. Bxe4 Bxe4
27. Ra2 Bd5#) -M2/13 1} Bxe4 27. Rg1 {(27. Rg1 Bd5#) -M1/12 2} Bd5# 0-1
bob

Re: Questions for the Stockfish team

Post by bob »

Which is which??? I notice white is searching less deeply so I'd guess that is -1???
Dann Corbit
Posts: 12797
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Questions for the Stockfish team

Post by Dann Corbit »

bob wrote:
Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:
bob wrote:
Daniel Shawul wrote:I tested it and, as I said, it played terribly ... not even close to 1000 Elo.
Before we argue further, I noticed two things in your implementation.

One is that you don't have a skill value of 0. Why? Even though it is very
small, one cannot completely disregard the effect of the small real
evaluation introduced. For example, with skill=1, you cannot get more than
3 sigma accuracy. Completely random means _completely_ random, not 0.01 of real evaluation
bla bla ...

do the math first...

.01 * real eval + .99 * random() where random is between 0 and 100 (one pawn value).

If the program drops a pawn and evaluates that position, you get .01 * 100 + .99 *random().
[snip]
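The blend discussed above can be written out as a small C sketch. This is only an illustration of the formula quoted in the thread, not the actual code in evaluate.c: the names `blend_eval`, `random_noise` and `PAWN_VALUE` are made up here, `rand()` is a stand-in for whatever generator the engine uses, and skill is taken as a percentage.

```c
#include <stdlib.h>

#define PAWN_VALUE 100 /* centipawns; assumed from "0 to 100 (one pawn value)" */

/* Hypothetical helper: noise in 0..PAWN_VALUE. rand() is only a stand-in. */
int random_noise(void) {
  return rand() % (PAWN_VALUE + 1);
}

/* Mix the real evaluation with a noise term, skill given in percent.
   skill = 100 returns the real evaluation, skill = 0 returns pure noise.
   C integer division truncates toward zero. */
int blend_eval(int real_eval, int noise, int skill) {
  return (skill * real_eval + (100 - skill) * noise) / 100;
}
```

With skill = 1, a position that is a full pawn down (real_eval = -100) and draws noise 50 scores (-100 + 4950) / 100 = 48, while the same noise with level material scores 49. The pawn moves the result by about one centipawn while the noise term spans roughly 0 to 99. A typical call would be `blend_eval(real_eval, random_noise(), skill)`.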
I guess that the tiny eval contribution is important. On average, the random value will be 0.5 or half of a pawn, and it will be high as often as it is low. I guess that the noise, going in both directions, will tend to channel the program in the right direction over a long search.

IOW, I suspect that 1.0 * random() will play far worse than having the 1% correct part because every correct eval is a tiny nudge in the right direction that may average out over a large branch of a tree to have some proper weight (500 million high nodes + 500 million low nodes will give an answer that I suspect is pretty good).
As I told Daniel, you suspect _wrong_. You can certainly make that simple change in evaluate.c, down at the bottom where the SKILL code is. Just make it pure random evaluation, play it, and report back. Actually, report back "amazed." It won't change a thing. That .01 really is irrelevant compared to the big random component.

BTW, you are missing the key point. "The Beal effect" is based on sampling theory here. So the .5 average random number is irrelevant. We steer into the parts of the tree where we have lots of moves to take us to positions where we get those big random values, that is the whole point of the Beal effect, in fact.
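The sampling argument can be checked with a one-ply toy experiment. The sketch below is not from Crafty and is not a full minimax search: it only measures how the best of n random scores grows with n, which is the reason positions with more available moves tend to look better under a random evaluation. The function names and the fixed-seed generator are assumptions made for the illustration.

```c
#include <stdint.h>

/* Small fixed-seed LCG so runs are repeatable; a stand-in, not an engine's generator. */
static uint32_t lcg_next(uint32_t *state) {
  *state = *state * 1664525u + 1013904223u;
  return *state >> 16; /* keep the better-mixed high bits */
}

/* Average, over `trials` toy positions, of the best of `moves` random
   scores drawn from 0..100. */
double mean_best_of(int moves, int trials, uint32_t seed) {
  uint32_t state = seed;
  double total = 0.0;
  for (int t = 0; t < trials; t++) {
    int best = 0;
    for (int m = 0; m < moves; m++) {
      int score = (int)(lcg_next(&state) % 101u);
      if (score > best)
        best = score;
    }
    total += best;
  }
  return total / trials;
}
```

For a continuous uniform score on 0 to 100, the expected best of n draws is 100 * n / (n + 1), so `mean_best_of(3, 20000, 1u)` should land near 75 and `mean_best_of(30, 20000, 1u)` near 97. The side with more moves gets the higher backed-up value even though every individual score averages 50.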
This is *really* interesting to me. I made the following change, so that I can even make negative eval hints, and I will test with various settings:


#if defined(SKILL)
  else if (OptionMatch("skill", *args)) {
    if (nargs < 2) {
      printf("usage:  skill <-100..100>\n");
      return (1);
    }
    if (skill != 100) {
      printf("ERROR:  skill can only be changed one time in a game\n");
    } else {
      skill = atoi(args[1]);
      if (skill < -100 || skill > 100) {
        printf("ERROR: skill range is -100 to 100 only\n");
        skill = 100;
      }
      Print(128, "skill level set to %d%%\n", skill);
      null_depth = null_depth * skill / 100;
      check_depth = check_depth * skill / 100;
      LMR_depth = LMR_depth * skill / 100;
    }
  }
#endif
That appears to break things. Adjusting things like null_depth with a negative skill will blow things up, because it will do a null-move search with a _deeper_ depth, not a reduced one. Ditto for LMR, which will suddenly be extending, not reducing.
It might make it fun for raw beginners.
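On the depth scaling in the patch above, the arithmetic is easy to try in isolation. The sketch below reuses the expression `depth * skill / 100` from the patch; the starting value of 3 used in the examples and the `depth - 1 - reduction` form of a reduced search are assumptions for illustration, not values taken from Crafty.

```c
/* Same expression as the patch. C integer division truncates toward zero
   (C99 and later), so small products collapse to 0. */
int scale_depth(int depth, int skill) {
  return depth * skill / 100;
}

/* Assumed shape of a reduced search: one ply for the move itself, minus
   the reduction. A negative reduction makes the result deeper than depth. */
int reduced_depth(int depth, int reduction) {
  return depth - 1 - reduction;
}
```

Starting from an assumed depth of 3, `scale_depth(3, -50)` gives -1 and `scale_depth(3, -100)` gives -3, so `reduced_depth(10, -3)` comes out at 12, deeper than the 10 plies it started from. Under the same assumption `scale_depth(3, -1)` truncates to 0, so the sign flip only shows up at larger magnitudes.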

Here is a game between normal Crafty (skill = 100) and Crafty with skill = -1:


[Event "Crskill"]
[Site "DCORBIT2008"]
[Date "2010.07.20"]
[Round "1"]
[White "Crafty-232am01"]
[Black "Crafty-23.2a-skill-mod"]
[Result "0-1"]
[BlackElo "2800"]
[ECO "A28"]
[Opening "English"]
[Time "20:50:44"]
[Variation "Four Knights, Nenarokov Variation"]
[WhiteElo "2700"]
[TimeControl "60+1"]
[Termination "normal"]
[PlyCount "54"]
[WhiteType "program"]
[BlackType "program"]

1. d4 Nc6 2. Nf3 d5 3. c4 e5 4. Bg5 {(4. Bg5 f6 5. Bh4 dxc4 6. Qa4 exd4 7.
Qxc4 Ke7 8. Nfd2) 0.00/9 2} f6 {(4. ... f6 5. cxd5 Nxd4 6. Be3 Nf5 7. Bc1
Bc5 8. e4 Nd4 9. Nc3 Ne7 10. Nxd4 exd4 11. Bb5+ Bd7 12. Qh5+ Kf8) -0.16/15
2} 5. Bh4 {(5. Bh4 dxc4 6. Qa4 Bd7 7. dxe5 Nb8 8. Qc2 Ke7 9. Ng1) 0.00/9 2}
g5 {(5. ... g5 6. Bg3 h5 7. cxd5 Bb4+ 8. Nc3 Qxd5 9. Qd3 h4 10. Qg6+ Qf7
11. Qxf7+ Kxf7 12. d5 Nce7 13. Nxg5+ fxg5 14. Bxe5) -0.65/13 3} 6. Bxg5
{(6. Bxg5 fxg5 7. Nxe5 Nxe5 8. dxe5 Bb4+ 9. Nc3 Ke7 10. Rg1) 0.00/9 2} fxg5
{(6. ... fxg5 7. Nc3 Bg7 8. dxe5 d4 9. Nd5 Nxe5 10. Nxg5 h6 11. Ne4 Nxc4
12. Qa4+ c6 13. Qxc4 Qxd5 14. Qxd5 cxd5) -2.51/12 1} 7. e3 {(7. e3 Bb4+ 8.
Nc3 Nxd4 9. Nxd4 Bxc3+ 10. bxc3 dxc4 11. Qa4+) 0.00/9 2} Bb4+ {(7. ... Bb4+
8. Nc3 exd4 9. Nxd4 Nf6 10. Nxc6 bxc6 11. h4 Bxc3+ 12. bxc3 gxh4 13. Rxh4
Rg8 14. cxd5 cxd5 15. Bb5+ Bd7) -2.82/13 3} 8. Nc3 {(8. Nc3 Nxd4 9. Nxd4
Bxc3+ 10. bxc3 dxc4 11. Bxc4 b5 12. Bf1) 0.00/9 3} exd4 {(8. ... exd4 9.
Nxd4 Nf6 10. h4 Bxc3+ 11. bxc3 g4 12. Nxc6 bxc6 13. Qd4 O-O 14. Bd3 Be6 15.
cxd5 Nxd5 16. Rb1) -2.86/15 2} 9. Nxd4 {(9. Nxd4 Ne5 10. cxd5 Bxc3+ 11.
bxc3 c5 12. dxc6 Nf3+ 13. Ke2 Nxd4+ 14. exd4) 0.00/9 2} Nf6 {(9. ... Nf6
10. h4 Bxc3+ 11. bxc3 g4 12. Nxc6 bxc6 13. cxd5 cxd5 14. Qd4 O-O 15. Bd3
Nh5 16. Qe5 Ng7) -2.85/14 2} 10. a3 {(10. a3 Nxd4 11. exd4 Bxa3 12. bxa3
Ng4 13. cxd5 Nf6 14. Be2) 0.00/9 7} Bxc3+ {(10. ... Bxc3+ 11. bxc3 O-O 12.
Rc1 Ne5 13. cxd5 Qxd5 14. f3 Qf7 15. Qc2 Rd8 16. Rd1 Nc4 17. Bxc4 Qxc4 18.
e4) -3.59/15 2} 11. bxc3 {(11. bxc3 dxc4 12. Nxc6 Qxd1+ 13. Rxd1 bxc6 14.
Bxc4 Ba6 15. Bxa6) -0.01/9 2} O-O {(11. ... O-O 12. Rc1 Ne5 13. cxd5 Qxd5
14. f3 Qf7 15. h4 c5 16. hxg5 Nd5 17. e4 Ne3 18. g6 Qxg6 <HT>) -3.57/15 2}
12. Nb5 {(12. Nb5 Na5 13. cxd5 Ne4 14. Ra2 c6 15. dxc6 Qxd1+ 16. Kxd1 Nxf2+
17. Rxf2) -0.01/9 3} a6 {(12. ... a6 13. Nd4 Ne5 14. cxd5 Ne4 15. f3 Nxc3
16. Qc2 Nxd5 17. e4 Nf4 18. Rd1 Qe7 19. g3 Ne6) -4.11/15 2} 13. Nd4 {(13.
Nd4 Na5 14. cxd5 Qxd5 15. Ne2 Qxd1+ 16. Rxd1 Rd8 17. Rxd8+) -0.02/9 2} Ne5
{(13. ... Ne5 14. cxd5 Qxd5 15. f3 c5 16. Nb3 c4 17. Nd4 Qa5 18. Qc2 Nd5
19. Kd2 Nd3 20. Bxd3 cxd3 21. Qxd3) -3.74/14 4} 14. Nf3 {(14. Nf3 Ne4 15.
Bd3 Nxd3+ 16. Qxd3 Rxf3 17. gxf3 Nxf2 18. Kxf2) -0.01/9 2} Bg4 {(14. ...
Bg4 15. Be2 Bxf3 16. Bxf3 Nxf3+ 17. gxf3 dxc4 18. e4 Nh5 19. Qxd8 Raxd8 20.
Rg1 Rxf3 21. Rb1 h6 22. Rxb7 Rxc3 23. Rxc7 Rxa3 24. Rxc4) -4.54/14 2} 15.
Ke2 {(15. Ke2 Ne4 16. Qxd5+ Qxd5 17. cxd5 Rxf3 18. gxf3 Ng3+ 19. hxg3 Nxf3)
-0.03/9 3} Nxf3 {(15. ... Nxf3 16. gxf3 Ne4 17. Qxd5+ Qxd5 18. cxd5 Bxf3+
19. Kd3 Nxf2+ 20. Kd4 Bxh1 21. Bc4 Rad8 22. Rg1 Rf5 23. h4 Bxd5 24. Rxg5+
Rxg5 25. hxg5) -10.39/15 2} 16. gxf3 {(16. gxf3 Ne4 17. Qxd5+ Qxd5 18. cxd5
Bxf3+ 19. Kd3 Rf4 20. exf4 gxf4 <HT>) -0.02/10 2} Ne4 {(16. ... Ne4 17.
Qxd5+ Qxd5 18. cxd5 Bxf3+ 19. Ke1 Bxh1 20. Ra2 Nxc3 21. Rb2 Bxd5 22. Rc2
Nb1 23. Rxc7 Nxa3 24. Rc5 Bh1 25. Rxg5+ Kh8) -10.54/15 12} 17. Qxd5+ {(17.
Qxd5+ Qxd5 18. cxd5 Bxf3+ 19. Kd3 Rf4 20. exf4 gxf4 21. Rg1+ Kf7) -0.02/10
2} Qxd5 {(17. ... Qxd5 18. cxd5 Bxf3+ 19. Ke1 Bxh1 20. Ra2 Nxc3 21. Rb2 b5
22. Bh3 Bxd5 23. Rd2 Be4 24. Rd7 Kh8 25. Rxc7) -10.88/15 1} 18. cxd5 {(18.
cxd5 Bxf3+ 19. Kd3 Nxf2+ 20. Kd4 c5+ 21. dxc6 Rad8+ 22. Ke5 Bd5 23. cxb7
Rde8+ 24. Kxd5) -0.04/11 3} Bxf3+ 19. Kd3 {(19. Kd3 Nxf2+ 20. Kd4 Ng4 21.
Bxa6 Rfd8 22. Bxb7 Rxa3 23. Rxa3 c5+) -0.04/10 2} Bxh1 {(19. ... Bxh1 20.
f4 gxf4 21. Kd4 c5+ 22. Kd3 fxe3 23. c4 Nf2+ 24. Kc3 Rad8 25. Be2 Rd6 26.
Rg1+ Kh8 27. a4 Ne4+ 28. Kc2) -11.18/14 5} 20. Ke2 {(20. Ke2 Rxf2+ 21. Kd3
Rf4 22. exf4 Rf8 23. fxg5 Rf6 24. gxf6 Nf2+) -0.04/10 2} Rxf2+ {(20. ...
Rxf2+ 21. Ke1 Raf8 22. Bd3 Bg2 23. c4 Rf1+ 24. Bxf1 Rxf1+ 25. Ke2 Rxa1 26.
Kd3 Rxa3+ 27. Kd4 g4 28. c5 Ra4+ 29. Ke5 Nxc5) -15.42/13 1} 21. Kd3 {(21.
Kd3 Ng3 22. hxg3 c5 23. dxc6 Rb2 24. cxb7 Rd2+ 25. Kxd2 Rd8+) -0.04/10 3}
Rd8 {(21. ... Rd8 22. Rd1 Rxd5+ 23. Kc4 Rxd1 24. Bd3 Nd6+ 25. Kd4 Be4 26.
Ke5 Rxd3 27. Ke6 Rxh2 28. Kf6 Rxc3 29. Kxg5 Rxa3) -23.21/12 1} 22. Bg2
{(22. Bg2 Nd6 23. Bxh1 c5 24. dxc6 Rb2 25. cxb7 Rd2+ 26. Kxd2 Nf7+)
-0.04/10 3} Rxd5+ {(22. ... Rxd5+ 23. Kc4 Rc5+ 24. Kb3 Rb5+ 25. Kc4 Rd2 26.
Bxe4 Bxe4 27. Ra2 Bd5#) -M6/11 0} 23. Kc4 {(23. Kc4 Re2 24. Kxd5 Nxc3+ 25.
Ke6 Na4 26. Bxh1 Re1 27. Rxe1 Nc5+) -0.03/10 1} Rc5+ 24. Kb3 {(24. Kb3
Rxc3+ 25. Kb4 Rb2+ 26. Ka5 Rb5+ 27. Ka4 Nc5#) -M4/12 2} Rb5+ 25. Kc4 {(25.
Kc4 Rd2 26. Bxe4 Bxe4 27. Ra2 Bd5#) -M3/12 2} Rd2 26. Bxe4 {(26. Bxe4 Bxe4
27. Ra2 Bd5#) -M2/13 1} Bxe4 27. Rg1 {(27. Rg1 Bd5#) -M1/12 2} Bd5# 0-1
Which is which??? I notice white is searching less deeply so I'd guess that is -1???
Yes, you can tell by the names.

When skill is negative, it does everything backwards. It extends when it should reduce and reduces where it should extend. It prefers losing a pawn to keeping it.

In the name [White "Crafty-232am01"], 'm01' means minus 1 for skill.

I have the following values in the contest:

Crafty skill = 100 (Normal crafty)
Crafty skill = 50
Crafty skill = 10
Crafty skill = 1
Crafty skill = 0
Crafty skill = -10
Crafty skill = -50