bob wrote:
Jimbo I wrote:
Here are some results for Crafty 23.3 Skill Level 1. It's not a huge number of games, but it should give a rough idea of the strength.
    Engine                          Score
01: Piranha 0.5                 384.5/405
02: Crafty 23.1 x64ja SL001     339.5/405
03: Pupsi2 0.08 SL003           304.5/405
04: Pupsi2 0.08 SL002           234.0/405
05: Tornado 3.6.2 x64 elo 1000  200.0/405
06: Ufim 8.02 elo 1000          189.0/405
07: Crafty 23.3 x64ja SL001     140.0/405
08: Crafty 22.10 x64ja SL001     84.0/405
09: Crafty 23.0 x64ja SL001      84.0/405
10: Ufim 8.02 elo 800            65.5/405
2025 of 18000 games played
Name of the tournament: Crafty Skill Test 5
Site/ Country: PC, United States
Level: Blitz 5/2
Hardware: Intel(R) Core(TM)2 Duo CPU T6600 @ 2.20GHz with 4,024 MB Memory
Operating system: Windows 7 Home Premium Home Edition (Build 7600)
Bayeselo results (based on Piranha 0.5 offset at 1598 elo, CCRL 40/4)
Rank  Name                         Elo    +    -  games  score  oppo.  draws
   1  Piranha 0.5                 1598   61   54    405    95%    958     4%
   2  Crafty 23.1 x64ja SL001     1376   42   40    405    84%    982    12%
   3  Pupsi2 0.08 SL003           1286   39   38    405    75%    992     9%
   4  Pupsi2 0.08 SL002           1083   34   34    405    58%   1015     7%
   5  Tornado 3.6.2 x64 elo 1000   998   34   34    405    49%   1024    13%
   6  Ufim 8.02 elo 1000           973   34   34    405    47%   1027    12%
   7  Crafty 23.3 x64ja SL001      852   35   35    405    35%   1041     7%
   8  Crafty 23.0 x64ja SL001      703   37   38    405    21%   1057     8%
   9  Crafty 22.10 x64ja SL001     699   37   38    405    21%   1058     8%
  10  Ufim 8.02 elo 800            649   38   40    405    16%   1063     8%
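For anyone wanting to sanity-check the score and oppo. columns by hand, here is a rough Python sketch of the standard logistic Elo expectancy formula. It is not what Bayeselo does internally (Bayeselo fits all games jointly and models draws), so the number it prints will not match the Elo column exactly. The function names are made up for this illustration.

```python
import math

def elo_difference(score_fraction: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

def performance_estimate(points: float, games: int, opponent_avg: float) -> float:
    """Crude performance rating: average opposition plus the implied difference."""
    return opponent_avg + elo_difference(points / games)

# Crafty 23.3 x64ja SL001 from the tables above: 140.0/405, oppo. 1041
estimate = performance_estimate(140.0, 405, 1041.0)
print(f"score = {140.0 / 405:.1%}, crude performance estimate = {estimate:.0f}")
```

Because this treats the whole field as one opponent at the average rating, it should only be read as a ballpark figure next to the Bayeselo numbers.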
That looks better. If you ever have time, other skill levels would be interesting to measure as well, since I have no real idea how skill and Elo compare as skill varies from 1 on up. Obviously SL001 plays pretty badly, as desired. But what happens between 1 (which is bad) and 100 (which is very strong)? It would be nice to have some sort of graph. My problem is that I have no programs weak enough to test against on my cluster, so I can't get useful Elo estimates at the lower end of the rating scale...
At least this verifies that skill 1 is back to playing very weakly...
Hi Bob, a bit of an update, but I'm not sure how useful this information will be to you.
A while back, I tried to run an Arena tournament with a number of Crafty 23.3 engines with various skill levels from 1 to 100, along with some non-Crafty engines, with the idea of generating a rating graph for skills 1-100. However, as I got into the tournament, I started noticing quite a few games with the lower skill levels being lost on time. The Bayeselo rating results of the weaker engines were higher than expected, and I started to wonder if the time losses were biasing the results. I wasn't able to eliminate the time losses of the weaker Crafty skill levels, and not knowing the cause of the time forfeits, I got frustrated and shelved the effort.
When Crafty 23.4 was released, I decided to give it another try. I ran another Arena tournament (40/4 time control) and again saw time forfeits at low skill levels. I tried adding a time increment, using game in 5 minutes with a 2-second-per-move increment. The time forfeits continued, although possibly at a slightly lower rate. At this point, I still suspected that Arena was the problem, so I switched over to ChessGUI. I still got time forfeits, but there were also some reports of time forfeit problems with ChessGUI, so I still wasn't sure of the cause.
Finally, I've switched over to Winboard 4.4.4 with the PSWBTM tournament manager. I'm currently running a tournament with four Crafty 23.4 Skill Level 1 engines: a 32-bit 1-CPU version, a 32-bit 2-CPU version, a 64-bit 1-CPU version, and a 64-bit 2-CPU version. I'm still getting many time forfeits. I haven't counted the forfeits, but it seems that the 32-bit engines have more time losses than the 64-bit engines.
Since I haven't heard of anyone having time forfeit problems in Winboard, I'm beginning to suspect that the problem lies somewhere in the Crafty engine at low skill levels.
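To put actual numbers on the 32-bit versus 64-bit impression, the forfeits could be counted straight from the tournament PGN. Below is a minimal Python sketch. It assumes the GUI marks a time loss either with a comment containing "forfeits on time" in the movetext (an assumption about the GUI's wording, so adjust the marker to whatever your PGN actually contains) or with the PGN standard tag [Termination "time forfeit"]. The engine names and games in the sample are invented purely to show the mechanics; they are not real results.

```python
import re
from collections import Counter

TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def split_games(pgn_text):
    """Yield (tags, movetext) for each game in a PGN string."""
    tags, moves, in_moves = {}, [], False
    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if match:
            if in_moves:  # a tag line after movetext starts the next game
                yield tags, "\n".join(moves)
                tags, moves, in_moves = {}, [], False
            tags[match.group(1)] = match.group(2)
        elif line.strip():
            in_moves = True
            moves.append(line)
    if tags or moves:
        yield tags, "\n".join(moves)

def count_time_forfeits(pgn_text, marker="forfeits on time"):
    """Return (forfeits per engine, games per engine)."""
    forfeits, games = Counter(), Counter()
    for tags, movetext in split_games(pgn_text):
        white, black = tags.get("White", "?"), tags.get("Black", "?")
        games[white] += 1
        games[black] += 1
        timed_out = (marker.lower() in movetext.lower()
                     or tags.get("Termination", "").lower() == "time forfeit")
        if not timed_out:
            continue
        # The side that lost the game is taken to be the one that ran out of time.
        if tags.get("Result") == "1-0":
            forfeits[black] += 1
        elif tags.get("Result") == "0-1":
            forfeits[white] += 1
    return forfeits, games

# Invented two-game sample, for illustration only.
sample = '''[White "Crafty 23.4 SL001 32-bit"]
[Black "Crafty 23.4 SL001 64-bit"]
[Result "0-1"]

1. e4 e5 2. Nf3 Nc6 {White forfeits on time} 0-1

[White "Crafty 23.4 SL001 64-bit"]
[Black "Crafty 23.4 SL001 32-bit"]
[Result "1-0"]

1. d4 d5 2. c4 e6 {White mates} 1-0
'''

forfeits, games = count_time_forfeits(sample)
for engine in sorted(games):
    print(f"{engine}: {forfeits[engine]} time forfeits in {games[engine]} games")
```

For a real run, read the tournament file with open(...).read() and pass the text to count_time_forfeits. Games drawn on time (insufficient mating material) are not attributed to either side in this sketch.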