Current skill command (Crafty) results

bob · Post by **bob** » Tue Jul 27, 2010 6:45 pm

jhaglund wrote (Mon Jul 26, 2010 4:21 pm):
Quote:
if you find a way to wait 1/16 second on every machine

Sleep(62); //62.5/1000

Not guaranteed. In fact, sleep(1) is supposed to sleep for _one_ second, according to POSIX. nanosleep() is supposed to sleep for either (a) the indicated number of nanoseconds, or (b) the indicated number of nanoseconds rounded up to the operating system clock resolution, which for most Linux kernels is 1/100th of a second, but can vary from that.
This was for "windoze"... works for me....

Not everyone uses windoze however. Here is an excerpt from the man page for sleep on linux:

DESCRIPTION
sleep() makes the calling process sleep until seconds seconds have
elapsed or a signal arrives which is not ignored.

Sleep(1000); // = 1 sec.
Sleep(62); // about 1/16th
Sleep(125); // = 1/8th
etc...

so?

#include <iostream>
#include <time.h>   // nanosleep() and struct timespec (POSIX)
using namespace std;

// nanosleep() takes a timespec rather than a bare number; ns must be below 1e9.
static void sleep_ns(long ns) {
    struct timespec req = {0, ns};
    nanosleep(&req, NULL);
}

int main() {
    int skill;
    cout << " Enter skill (1-100): ";
    cin >> skill;
    int x = skill;              // work on a copy of the entered skill
    cout << " Level: " << x << endl;
    if (x >= 1 && x <= 100) {   // only act on skills in the valid range
        if (x == 1)             // weakest level gets the longest delay
            sleep_ns(100);
        else if (x % 10 == 0)   // 100 -> 0 (no sleep), 90 -> 10, ..., 10 -> 90
            sleep_ns(100 - x);
        else
            sleep_ns(x);
    }
    // ...
    return 0;
}

Ditto for nanosleep():

NOTES
If the interval specified in req is not an exact multiple of the granularity
underlying clock (see time(7)), then the interval will be rounded up to the
next multiple. Furthermore, after the sleep completes, there may still be a
delay before the CPU becomes free to once again execute the calling thread.

In Linux the clock resolution is typically 100 ticks per second, but it can vary depending on kernel build or boot options.
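
One way to see how far a requested 1/16-second sleep drifts on a particular machine is simply to time it. The following is a minimal sketch, assuming C++11's std::this_thread::sleep_for and std::chrono::steady_clock are available; neither appears in this thread, and the helper name measure_sleep is made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: request a sleep of `requested` and return how long
// the call actually took, measured with a monotonic clock.
std::chrono::microseconds measure_sleep(std::chrono::microseconds requested) {
    assert(requested.count() >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);   // blocks for at least `requested`
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
}
```

Calling measure_sleep(std::chrono::microseconds(62500)) a few times and comparing the results with 62500 gives a rough picture of the timer granularity and scheduling delay on that system.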

Jimbo I · Post by **Jimbo I** » Tue Aug 17, 2010 1:23 pm

Here are some results for Crafty 23.3 Skill Level 1. It's not a huge number of games, but it should give a rough idea of the strength.

    Engine                     Score    
01: Piranha 0.5                384.5/405
02: Crafty 23.1 x64ja SL001    339.5/405
03: Pupsi2 0.08 SL003          304.5/405
04: Pupsi2 0.08 SL002          234.0/405
05: Tornado 3.6.2 x64 elo 1000 200.0/405
06: Ufim 8.02 elo 1000         189.0/405
07: Crafty 23.3 x64ja SL001    140.0/405
08: Crafty 22.10 x64ja SL001    84.0/405
09: Crafty 23.0 x64ja SL001     84.0/405
10: Ufim 8.02 elo 800           65.5/405

2025 of 18000 games played
Name of the tournament: Crafty Skill Test 5
Site/ Country: PC, United States
Level: Blitz 5/2
Hardware: Intel(R) Core(TM)2 Duo CPU     T6600  @ 2.20GHz  with 4,024 MB Memory
Operating system: Windows 7 Home Premium Home Edition (Build 7600)

Bayeselo results (based on Piranha 0.5 offset at 1598 elo, CCRL 40/4)

Rank Name                         Elo    +    - games score oppo. draws 
   1 Piranha 0.5                 1598   61   54   405   95%   958    4% 
   2 Crafty 23.1 x64ja SL001     1376   42   40   405   84%   982   12% 
   3 Pupsi2 0.08 SL003           1286   39   38   405   75%   992    9% 
   4 Pupsi2 0.08 SL002           1083   34   34   405   58%  1015    7% 
   5 Tornado 3.6.2 x64 elo 1000   998   34   34   405   49%  1024   13% 
   6 Ufim 8.02 elo 1000           973   34   34   405   47%  1027   12% 
   7 Crafty 23.3 x64ja SL001      852   35   35   405   35%  1041    7% 
   8 Crafty 23.0 x64ja SL001      703   37   38   405   21%  1057    8% 
   9 Crafty 22.10 x64ja SL001     699   37   38   405   21%  1058    8% 
  10 Ufim 8.02 elo 800            649   38   40   405   16%  1063    8%

bob · Post by **bob** » Tue Aug 17, 2010 6:51 pm

That looks better. If you ever have time, other skill levels would be interesting to measure as well, since I have no real idea how skill and Elo compare as skill varies from 1 on up... Obviously SL001 plays pretty badly, as desired. But what happens between 1 (which is bad) and 100 (which is way strong)? It would be nice to have some sort of graph. My problem is that I have no programs weak enough to test against on my cluster, so I can't get useful Elo estimates at the lower end of the rating scale...

At least this verifies that skill 1 is back to playing very weakly...

Jimbo I · Post by **Jimbo I** » Tue Aug 17, 2010 7:05 pm

I'll see what I can do. It may take some time, though.

Jimbo I · Post by **Jimbo I** » Mon Nov 22, 2010 3:47 am

Hi Bob, a bit of an update, but I'm not sure how useful this information will be to you.

A while back, I tried to run an Arena tournament with a number of Crafty 23.3 engines with various skill levels from 1 to 100, along with some non-Crafty engines, with the idea of generating a rating graph for skills 1-100. However, as I got into the tournament, I started noticing quite a few games with the lower skill levels being lost on time. The Bayeselo rating results of the weaker engines were higher than expected, and I started to wonder if the time losses were biasing the results. I wasn't able to eliminate the time losses of the weaker Crafty skill levels, and not knowing the cause of the time forfeits, I got frustrated and shelved the effort.

When Crafty 23.4 was released, I decided to give it another try. Another Arena tournament (40/4 time control), and more time forfeits at low skill levels. I tried adding a time increment, using game in 5 min with a 2 second per move increment. The time forfeits continued, although possibly at a slightly lower rate. At this point, I was still suspecting that Arena was the problem, so I switched over to ChessGUI. Still I got time forfeits, but there were also some reports of time forfeit problems with ChessGUI, so I still wasn't sure of the cause of the problem.

Finally, I've switched over to Winboard 4.4.4 with the PSWBTM tournament manager. I'm currently running a tournament with four Crafty 23.4 Skill Level 1 engines: a 32-bit 1-CPU version, a 32-bit 2-CPU version, a 64-bit 1-CPU version, and a 64-bit 2-CPU version. I'm still getting many time forfeits. I haven't counted the forfeits, but it seems that the 32-bit engines have more time losses than the 64-bit engines.

Since I haven't heard of anyone having time forfeit problems in Winboard, I'm beginning to suspect that the problem lies somewhere in the Crafty engine at low skill levels.

bob · Post by **bob** » Mon Nov 22, 2010 5:43 pm

The time losses are pretty predictable. I'll take a look at the code. I intentionally slow things down because a deep search + random eval is way too strong for what I am trying to do. By artificially slowing things down, the depth is reduced and the randomness in the eval leads to weaker play. But the "spin loop" can be a problem. I'll look at alternatives. The problem is finding a portable solution, which is not so easy...
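
As a rough illustration of one possible alternative (this is not Crafty's actual code), the slowdown could sleep between batches of work instead of spinning, and clamp each delay to the time left for the move so the throttle itself cannot run past the deadline. The sketch assumes C++11 <chrono> and <thread>; the names delay_for_skill and throttled_steps, the linear skill-to-delay mapping, and the 2000-microsecond maximum delay are all invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;
using Micros = std::chrono::microseconds;

// Hypothetical mapping: skill 100 -> no delay, lower skill -> longer delay.
Micros delay_for_skill(int skill, Micros max_delay) {
    assert(skill >= 1 && skill <= 100);
    return max_delay * (100 - skill) / 100;
}

// Run up to max_steps units of work, pausing after each one, but never
// sleeping past the time budget. Returns the number of steps completed.
int throttled_steps(int skill, int max_steps, std::chrono::milliseconds budget) {
    const auto deadline = Clock::now() + budget;
    const Micros delay = delay_for_skill(skill, Micros(2000));
    int done = 0;
    while (done < max_steps && Clock::now() < deadline) {
        ++done;  // stand-in for searching one batch of nodes
        const Micros remaining =
            std::chrono::duration_cast<Micros>(deadline - Clock::now());
        const Micros nap = std::min(delay, remaining);  // clamp to time left
        if (nap > Micros::zero())
            std::this_thread::sleep_for(nap);
    }
    return done;
}
```

With a fixed budget, a lower skill completes fewer steps, which mirrors the intent of reducing search depth. The sleep can still overshoot by the timer granularity discussed earlier in the thread, so a real engine would presumably want to leave some safety margin before the deadline.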