CM11 testings - Default sel=16 a disaster.

Graham Banks
Posts: 44921
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

Summary of CM11 Ratings so far
------------------------------

2815 CM11 Default *2CPU*
2782 CM11 Glorfindel
2778 CM11 Sauron
2767 CM11 Default
2767 CM11 Tomahawk
2763 CM11 Default Sel 21
2761 CM11 Silver Fern
2740 CM11 Default Sel 16


Performances
------------

Chessmaster 11 2CPU       2815  600 (+209,=209,-182), 52.2 %

Chess Tiger 2007.1            :  50 (+ 18,= 14,- 18), 50.0 %
Delfi 5.2                     :  50 (+ 27,= 10,- 13), 64.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 16,= 17,- 17), 49.0 %
Hiarcs 11.1                   :  50 (+ 10,= 19,- 21), 39.0 %
Loop 13.6 32-bit              :  50 (+ 13,= 22,- 15), 48.0 %
Movei 00.8.438                :  50 (+ 18,= 19,- 13), 55.0 %
Scorpio 1.91                  :  50 (+ 18,= 21,- 11), 57.0 %
Slow Chess Blitz WV2.1        :  50 (+ 29,= 12,-  9), 70.0 %
Spike 1.2 Turin               :  50 (+ 10,= 20,- 20), 40.0 %
WildCat 7                     :  50 (+ 23,= 14,- 13), 60.0 %
Ktulu 8.0                     :  50 (+ 16,= 19,- 15), 51.0 %
Naum 2.2 32-bit               :  50 (+ 11,= 22,- 17), 44.0 %



CM 11 Glorfindel          2782  600 (+183,=201,-216), 47.2 %

Chess Tiger 2007.1            :  50 (+ 13,= 21,- 16), 47.0 %
Delfi 5.2                     :  50 (+ 27,= 13,- 10), 67.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 11,= 15,- 24), 37.0 %
Hiarcs 11.1                   :  50 (+  8,= 14,- 28), 30.0 %
Loop 13.6 32-bit              :  50 (+ 11,= 11,- 28), 33.0 %
Movei 00.8.438                :  50 (+ 15,= 22,- 13), 52.0 %
Scorpio 1.91                  :  50 (+ 19,= 16,- 15), 54.0 %
Slow Chess Blitz WV2.1        :  50 (+ 23,= 17,- 10), 63.0 %
Spike 1.2 Turin               :  50 (+ 11,= 24,- 15), 46.0 %
WildCat 7                     :  50 (+ 26,= 15,-  9), 67.0 %
Ktulu 8.0                     :  50 (+ 12,= 11,- 27), 35.0 %
Naum 2.2 32-bit               :  50 (+  7,= 22,- 21), 36.0 %



CM 11 Sauron              2778  600 (+183,=193,-224), 46.6 %

Chess Tiger 2007.1            :  50 (+ 10,= 16,- 24), 36.0 %
Delfi 5.2                     :  50 (+ 26,= 15,-  9), 67.0 %
Glaurung 2.0.1 64-bit         :  50 (+  9,= 18,- 23), 36.0 %
Hiarcs 11.1                   :  50 (+  5,= 14,- 31), 24.0 %
Loop 13.6 32-bit              :  50 (+  8,= 18,- 24), 34.0 %
Movei 00.8.438                :  50 (+ 25,= 15,- 10), 65.0 %
Scorpio 1.91                  :  50 (+ 23,= 16,- 11), 62.0 %
Slow Chess Blitz WV2.1        :  50 (+ 19,= 20,- 11), 58.0 %
Spike 1.2 Turin               :  50 (+ 12,= 11,- 27), 35.0 %
WildCat 7                     :  50 (+ 21,= 18,- 11), 60.0 %
Ktulu 8.0                     :  50 (+ 15,= 15,- 20), 45.0 %
Naum 2.2 32-bit               :  50 (+ 10,= 17,- 23), 37.0 %



CM 11 Default             2767  600 (+170,=198,-232), 44.8 %

Chess Tiger 2007.1            :  50 (+  7,= 21,- 22), 35.0 %
Delfi 5.2                     :  50 (+ 22,= 12,- 16), 56.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 10,= 13,- 27), 33.0 %
Hiarcs 11.1                   :  50 (+  9,= 12,- 29), 30.0 %
Loop 13.6 32-bit              :  50 (+ 12,= 11,- 27), 35.0 %
Movei 00.8.438                :  50 (+ 22,= 14,- 14), 58.0 %
Scorpio 1.91                  :  50 (+ 19,= 20,- 11), 58.0 %
Slow Chess Blitz WV2.1        :  50 (+ 21,= 21,-  8), 63.0 %
Spike 1.2 Turin               :  50 (+  7,= 20,- 23), 34.0 %
WildCat 7                     :  50 (+ 18,= 20,- 12), 56.0 %
Ktulu 8.0                     :  50 (+ 12,= 17,- 21), 41.0 %
Naum 2.2 32-bit               :  50 (+ 11,= 17,- 22), 39.0 %



Chessmaster 11 Tomahawk   2767  600 (+177,=184,-239), 44.8 %

Chess Tiger 2007.1            :  50 (+ 10,= 23,- 17), 43.0 %
Delfi 5.2                     :  50 (+ 20,= 15,- 15), 55.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 11,= 17,- 22), 39.0 %
Hiarcs 11.1                   :  50 (+  7,= 13,- 30), 27.0 %
Loop 13.6 32-bit              :  50 (+  8,= 11,- 31), 27.0 %
Movei 00.8.438                :  50 (+ 21,= 14,- 15), 56.0 %
Scorpio 1.91                  :  50 (+ 20,= 11,- 19), 51.0 %
Slow Chess Blitz WV2.1        :  50 (+ 24,= 16,- 10), 64.0 %
Spike 1.2 Turin               :  50 (+ 15,= 16,- 19), 46.0 %
WildCat 7                     :  50 (+ 21,= 16,- 13), 58.0 %
Ktulu 8.0                     :  50 (+ 14,= 16,- 20), 44.0 %
Naum 2.2 32-bit               :  50 (+  6,= 16,- 28), 28.0 %



CM11 Default Sel 21       2763  600 (+152,=228,-220), 44.3 %

Chess Tiger 2007.1            :  50 (+ 14,= 16,- 20), 44.0 %
Delfi 5.2                     :  50 (+ 19,= 18,- 13), 56.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 13,= 16,- 21), 42.0 %
Hiarcs 11.1                   :  50 (+  3,= 18,- 29), 24.0 %
Loop 13.6 32-bit              :  50 (+  5,= 17,- 28), 27.0 %
Movei 00.8.438                :  50 (+ 18,= 19,- 13), 55.0 %
Scorpio 1.91                  :  50 (+ 14,= 24,- 12), 52.0 %
Slow Chess Blitz WV2.1        :  50 (+ 18,= 24,-  8), 60.0 %
Spike 1.2 Turin               :  50 (+ 13,= 19,- 18), 45.0 %
WildCat 7                     :  50 (+ 22,= 15,- 13), 59.0 %
Ktulu 8.0                     :  50 (+  8,= 18,- 24), 34.0 %
Naum 2.2 32-bit               :  50 (+  5,= 24,- 21), 34.0 %



CM 11 Silver Fern         2761  600 (+169,=189,-242), 43.9 %

Chess Tiger 2007.1            :  50 (+ 12,= 15,- 23), 39.0 %
Delfi 5.2                     :  50 (+ 18,= 24,-  8), 60.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 10,= 16,- 24), 36.0 %
Hiarcs 11.1                   :  50 (+  6,= 17,- 27), 29.0 %
Loop 13.6 32-bit              :  50 (+  8,= 16,- 26), 32.0 %
Movei 00.8.438                :  50 (+ 16,= 14,- 20), 46.0 %
Scorpio 1.91                  :  50 (+ 22,= 12,- 16), 56.0 %
Slow Chess Blitz WV2.1        :  50 (+ 27,= 13,- 10), 67.0 %
Spike 1.2 Turin               :  50 (+ 11,= 15,- 24), 37.0 %
WildCat 7                     :  50 (+ 15,= 20,- 15), 50.0 %
Ktulu 8.0                     :  50 (+ 12,= 11,- 27), 35.0 %
Naum 2.2 32-bit               :  50 (+ 12,= 16,- 22), 40.0 %



CM 11 Default Sel 16      2740  600 (+149,=191,-260), 40.8 %

Chess Tiger 2007.1            :  50 (+  7,= 26,- 17), 40.0 %
Delfi 5.2                     :  50 (+ 16,= 18,- 16), 50.0 %
Glaurung 2.0.1 64-bit         :  50 (+ 13,= 16,- 21), 42.0 %
Hiarcs 11.1                   :  50 (+  6,= 17,- 27), 29.0 %
Loop 13.6 32-bit              :  50 (+  6,= 11,- 33), 23.0 %
Movei 00.8.438                :  50 (+ 20,= 12,- 18), 52.0 %
Scorpio 1.91                  :  50 (+ 20,= 17,- 13), 57.0 %
Slow Chess Blitz WV2.1        :  50 (+ 13,= 15,- 22), 41.0 %
Spike 1.2 Turin               :  50 (+  9,= 13,- 28), 31.0 %
WildCat 7                     :  50 (+ 20,= 12,- 18), 52.0 %
Ktulu 8.0                     :  50 (+ 13,= 17,- 20), 43.0 %
Naum 2.2 32-bit               :  50 (+  6,= 17,- 27), 29.0 %
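
The percentages in the listing follow the usual convention of one point per win and half a point per draw. A minimal Python sketch of that arithmetic is below, using totals copied from the listing. The function names are made up for illustration, and the Elo conversion is the standard logistic approximation, not necessarily what the rating tool behind the list uses (that tool is not named in this thread).

```python
import math

def score_fraction(wins, draws, losses):
    """Score as a fraction of the points available (win = 1, draw = 0.5)."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_difference(score):
    """Rating difference implied by a score fraction under the logistic Elo model.

    Assumption: plain logistic curve; a real rating list may use a different model.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)

# (wins, draws, losses) totals copied from the listing above
totals = {
    "CM11 Default *2CPU*": (209, 209, 182),
    "CM11 Default": (170, 198, 232),
    "CM11 Default Sel 21": (152, 228, 220),
    "CM11 Default Sel 16": (149, 191, 260),
}

for name, (wins, draws, losses) in totals.items():
    s = score_fraction(wins, draws, losses)
    print(f"{name:22s} {100 * s:5.1f} %  {elo_difference(s):+7.1f} Elo vs. the opponent pool")
```

The Elo figure here is relative to the average of the twelve opponents, so it is only a rough cross-check against the rating list, not a replacement for it.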


gbanksnz at gmail.com
Werner
Posts: 3004
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: CM11 testings - Default sel=16 a disaster.

Post by Werner »

Hi Graham,
after the bad results I had testing the Sauron setting at 40/20, I stopped testing at this time control. I hope other testers will find a better setting, perhaps at blitz time controls where you can play more games.

As I see now from your results, a lot of games are necessary to find such a setting. So it is hard to understand the positions of sel def, sel 19 and sel 16 in your results :?
Werner
Graham Banks
Posts: 44921
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CM11 testings - Default sel=16 a disaster.

Post by Graham Banks »

Werner wrote: Hi Graham,
after the bad results I had testing the Sauron setting at 40/20, I stopped testing at this time control. I hope other testers will find a better setting, perhaps at blitz time controls where you can play more games.

As I see now from your results, a lot of games are necessary to find such a setting. So it is hard to understand the positions of sel def, sel 19 and sel 16 in your results :?

Hi Werner,

Each setting plays 600 games at the CCRL 40/4 time control.
Finding a much improved setting is going to be a difficult task with CM11.

Regards, Graham.
gbanksnz at gmail.com
Spock

Re: CM11 testings - Default sel=16 a disaster.

Post by Spock »

Yes, a very difficult task. There may be a sweet spot for the Sel setting, and it may also differ depending on the time control. I really do wonder whether the setting of 21 actually takes effect and is obeyed by the engine. We know that it is not supported in the CM GUI. Supposedly the engine does accept it, but I'm not convinced.

There is no difficulty in understanding the result. The assumption that higher settings always push results one way and lower settings the other is not necessarily valid; in fact, the effect of the Sel setting appears to be unpredictable. My view is that the default may well be best, at least at this time control.

600 games is a decent amount, but I think we will look at extending the testing up to 700 at least.
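
To put a number on how much 600 games can resolve, here is a rough Python sketch of the margin of error on a match score. It assumes every game is an independent sample from the observed win/draw/loss frequencies, which ignores opening and opponent effects, and it uses the plain logistic Elo curve to convert to rating points. The function names are hypothetical and this is not the method used by any particular rating tool.

```python
import math

def score_margin(wins, draws, losses, z=1.96):
    """Approximate 95% margin of error on the score fraction of one match.

    Assumption: games are independent and identically distributed.
    """
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score s
    var = (wins * (1 - s) ** 2 + draws * (0.5 - s) ** 2 + losses * s ** 2) / n
    return z * math.sqrt(var / n)

def elo_per_score_point(score):
    """Slope of the logistic Elo curve: Elo change per unit of score fraction."""
    return 400.0 / (math.log(10) * score * (1 - score))

# CM 11 Default totals from the first post: +170 =198 -232
wins, draws, losses = 170, 198, 232
s = (wins + 0.5 * draws) / (wins + draws + losses)
margin = score_margin(wins, draws, losses)
print(f"score {100 * s:.1f} % +/- {100 * margin:.1f} points, "
      f"about +/- {margin * elo_per_score_point(s):.0f} Elo")
```

Under these assumptions the margin for a 600-game run comes out at roughly three percentage points, or a little over 20 Elo, so a gap of a few Elo between two settings in the list is well inside the noise.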
ozziejoe
Posts: 811
Joined: Wed Mar 08, 2006 10:07 pm

Re: CM11 testings - Default sel=16 a disaster.

Post by ozziejoe »

It looks like these settings are being tested on a single processor. Is it possible that the selectivity parameter works differently on single versus dual processors? It seems very unlikely, but I thought I'd ask.

best
J
Spock

Re: CM11 testings - Default sel=16 a disaster.

Post by Spock »

ozziejoe wrote: It looks like these settings are being tested on a single processor. Is it possible that the selectivity parameter works differently on single versus dual processors? It seems very unlikely, but I thought I'd ask.

best
J

Yes, these are single-CPU tests, apart from the 2CPU test itself of course.

And yes, I agree: for dual or even quad processors, a different Sel setting might very well be optimal compared to single.