A test idea without Elo, I think I start middle of Jan.26!

Frank Quisinsky
Posts: 7185
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

Exactly ... this solved the problem of the day.

A third test is necessary ... after Fritz 16.
Dog 4.10.2 can play the third test.

I will add the new rank system later today.

:-)

Best
Frank

Post by Frank Quisinsky »

Standard Engines:

01. Uralochka 3.42a JA             3550
02. Revenge 4.0                    3500
03. CSTal 2.0                      3475
04. Velvet 8.1.1                   3475
05. Igel 3.6.0 JA                  3475
06. SlowChess Blitz 2.9            3400
07. Texel 1.12                     3400
08. Stockfish 200731 HCE           3375
09. Wasp 7.00                      3350
10. Patricia 5.0 JA                3350
11. Monty 251209 MCTS dev          3300
12. Leorik 3.1.3 JA                3275
13. Tcheran 9.0                    3275
14. Nemorino 6.11 JA               3250
15. Booot 6.50 HCE                 3200
16. Xiphos 0.6.1 HCE JA            3175
17. Laser 1.7 HCE JA               3175
18. Senpai 3.0.1 HCE               3125
19. DanaSah 9.1 JA                 3075
20. Fizbo 2.0 JA                   3075
21. Petrel 3.1 JA                  3050
22. Vajolet2 2.8.0 HCE             2975
23. Critter 1.6a HCE               2975
24. Deep iCE 4.0.853 HCE           2950
25. Hakkapelitta TCEC v2 HCE       2875
26. Spark 1.0 HCE                  2775
                                  -----
                                  83875 : 26 = 3225,96 Elo

Rank = 20, 3-4-3 system

1.300,0 (100,00%) - 1.144,0 ( 88,00%) points = 01. ***** General Field Marshal
1.143,5 ( 87,97%) - 1.105,0 ( 85,00%) points = 02. ****  General
1.104,5 ( 84,97%) - 1.066,0 ( 82,00%) points = 03. ***   Lieutenant General
1.065,5 ( 81,97%) - 1.027,0 ( 79,00%) points = 04. **    Major General
1.026,5 ( 78,97%) -   988,0 ( 76,00%) points = 05. *     Brigadier General
--
  987,5 ( 75,97%) -   936,0 ( 72,00%) points = 06. Colonel
  935,5 ( 71,97%) -   884,0 ( 68,00%) points = 07. Lieutenant Colonel
  883,5 ( 67,97%) -   832,0 ( 64,00%) points = 08. Major
  831,5 ( 63,97%) -   780,0 ( 60,00%) points = 09. Captain
  779,5 ( 59,97%) -   728,0 ( 56,00%) points = 10. First Lieutenant
  727,5 ( 55,97%) -   676,0 ( 52,00%) points = 11. Second Lieutenant
--
  675,5 ( 51,97%) -   637,0 ( 49,00%) points = 12. Sergeant Major
  636,5 ( 48,97%) -   598,0 ( 46,00%) points = 13. Master Sergeant
  597,5 ( 45,97%) -   559,0 ( 43,00%) points = 14. Sergeant First Class
  558,5 ( 42,97%) -   520,0 ( 40,00%) points = 15. Staff Sergeant
  519,5 ( 39,97%) -   481,0 ( 37,00%) points = 16. Sergeant
  480,5 ( 36,97%) -   442,0 ( 34,00%) points = 17. Corporal
  441,5 ( 33,97%) -   403,0 ( 31,00%) points = 18. Specialist
  402,5 ( 30,97%) -   364,0 ( 28,00%) points = 19. Private First Class
  363,5 ( 27,97%) -     0,0 ( 25,00%) points = 20. Private
Done ...
"Staff Sergeant" and "Sergeant First Class" are new.
Problem solved!

Well, they're not new.
They've always been around, but they're new to my rank system.

For rank 19, Private First Class, Fritz 16 (Rybka) should score 28,00% - 30,97%.
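
The rank table above can be read as a simple lookup: each rank starts at a fixed percentage of the maximum score (1.300 points from 1.300 games). Here is a minimal Python sketch of that lookup. The names `RANKS` and `rank_for` are made up for illustration and are not part of any tool used in this thread; rank 20 is assumed to catch every score below 28%.

```python
# Lower bound of each rank as a percentage of the maximum score,
# copied from the 3-4-3 table (rank 20 is assumed to catch everything below 28%).
RANKS = [
    (88, "General Field Marshal"),
    (85, "General"),
    (82, "Lieutenant General"),
    (79, "Major General"),
    (76, "Brigadier General"),
    (72, "Colonel"),
    (68, "Lieutenant Colonel"),
    (64, "Major"),
    (60, "Captain"),
    (56, "First Lieutenant"),
    (52, "Second Lieutenant"),
    (49, "Sergeant Major"),
    (46, "Master Sergeant"),
    (43, "Sergeant First Class"),
    (40, "Staff Sergeant"),
    (37, "Sergeant"),
    (34, "Corporal"),
    (31, "Specialist"),
    (28, "Private First Class"),
    (0, "Private"),
]


def rank_for(points, games=1300):
    """Return (rank number, rank name) for a score out of `games` games."""
    if points < 0 or points > games:
        raise ValueError("points must be between 0 and the number of games")
    for number, (min_pct, name) in enumerate(RANKS, start=1):
        # Compare without dividing, so half points stay exact.
        if points * 100 >= min_pct * games:
            return number, name


print(rank_for(745.5))
```

With 745,5 points out of 1.300 the sketch lands between the 728,0 and 779,5 bounds of the table, which is the First Lieutenant row.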

Live mode:
Shredder *.sto file (tournament configuration, game plan)
https://www.amateurschach.de/fling/etoc-g_test-02.sto

Shredder *.html file (current results)
https://www.amateurschach.de/fling/etoc-g_test-02.html

Shredder *.pgn file (the games)
https://www.amateurschach.de/fling/etoc-g_test-02.pgn

Updated every 2 minutes with the FTP software Fling Plus 5.04.

Best
Frank

Post by Frank Quisinsky »

Hi there,

Now, what should I display later on the detail page?
The group of "Standard Engines" contains many "king attackers", engines that can win many games very quickly.

Consider the following situation:
Engine A plays 1.300 games vs. the 26 Standard Engines, result = 745,5 points, rank: First Lieutenant
Engine B plays 1.300 games vs. the 26 Standard Engines, result = 745,5 points, rank: First Lieutenant

This is quite possible if I test around 250 or more engines next year.

I am interested in displaying only 4 stats on a detail page.

- draw rate
- move average
- short wins (games won in under 50 moves)
- short losses (games lost in under 50 moves)
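
The four stats in the list above could be computed from a list of finished games roughly like this. This is only a sketch: the `Game` record, the `detail_stats` function and the result labels are hypothetical, and "short" is taken to mean fewer than 50 moves.

```python
from dataclasses import dataclass


@dataclass
class Game:
    result: str  # "win", "draw" or "loss", seen from the tested engine
    moves: int   # number of moves played in the game


SHORT_LIMIT = 50  # games with fewer moves than this count as "short"


def detail_stats(games):
    """Return draw rate (%), move average, short wins and short losses."""
    if not games:
        raise ValueError("need at least one game")
    n = len(games)
    draws = sum(g.result == "draw" for g in games)
    short_wins = sum(g.result == "win" and g.moves < SHORT_LIMIT for g in games)
    short_losses = sum(g.result == "loss" and g.moves < SHORT_LIMIT for g in games)
    return {
        "draw_rate": 100 * draws / n,
        "move_avg": sum(g.moves for g in games) / n,
        "short_wins": short_wins,
        "short_losses": short_losses,
    }


sample = [Game("win", 30), Game("draw", 80), Game("loss", 45), Game("win", 70)]
print(detail_stats(sample))
```

In practice the `Game` records would be filled from the PGN file of a test run; that parsing step is left out here.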

After all these years of computer chess ...
Nothing is more boring than Elo!
... for engines as strong as those available today!

Engines with a higher draw rate: several possible reasons!
Engines with a higher move average fight more in endgames, or fight for the 0,5-1,0% chance of gaining an extra half point while waiting for a mistake by the opponent.
Engines with more short wins are very aggressive in the middlegame.
Engines with more short losses are usually stronger in the endgame (in most cases).
Engines with more short wins and short losses play somewhat speculatively.

Draw rate = difficult to evaluate!
Very fast draws are always a topic, but they do not matter for my idea.
With a lower move average, more games can be played, and there are fewer boring endgames.
On the other hand, a lower move average because an engine agrees to a draw too quickly isn’t very nice.

Back to Engine A and Engine B, both with 745,5 points.

Engine A: 12 won games below 50 moves, 4 lost games below 50 moves = 12-4 = +8
Engine B: 3 won games below 50 moves, 1 lost game below 50 moves = 3-1 = +2

A battle can be won through surprise attacks.
It is all very difficult to weigh, because the endgame artist also plays nice chess.

I have no solution for a second system to evaluate all this; each game phase is interesting.

What I can do is to display:
1. Engine A = 745,5 P., First Lieutenant, 52,5% draws, 84,4 m-avg, 12-w / 04-l = +008
1. Engine B = 745,5 P., First Lieutenant, 56,8% draws, 89,4 m-avg, 03-w / 01-l = +002

Such information should be enough.
As little information as possible on an overview page with over 250 engines.
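
One overview line in the format shown above could be produced like this. It is a sketch under assumptions: the function `overview_line` is hypothetical, numbers use the decimal comma from this thread, and the move average is printed as a plain number of moves.

```python
def overview_line(name, points, rank, draw_pct, move_avg, short_wins, short_losses):
    """Build one overview line for an engine (hypothetical layout helper)."""

    def de(value):
        # One decimal place with a decimal comma, as used in this thread.
        return f"{value:.1f}".replace(".", ",")

    balance = short_wins - short_losses  # short wins minus short losses
    return (
        f"{name} = {de(points)} P., {rank}, {de(draw_pct)}% draws, "
        f"{de(move_avg)} m-avg, {short_wins:02d}-w / {short_losses:02d}-l = {balance:+04d}"
    )


print(overview_line("Engine A", 745.5, "First Lieutenant", 52.5, 84.4, 12, 4))
print(overview_line("Engine B", 745.5, "First Lieutenant", 56.8, 89.4, 3, 1))
```

The signed, zero-padded balance (`+008`, `+002`) keeps the column aligned when the lines for many engines are stacked on one page.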

Over the next few days I will think about how to display all that.
In addition, each test gets a URL to the test-run result, a URL to the PGN database, and a URL to the tournament configuration.

Best
Frank

Post by Frank Quisinsky »

Again, what I have seen from Laser is just great. It is pure chance that I have it in the group of "Standard Engines". Of course, I know the program from all my tourneys; the compiles from Jim give Laser more power.

Monty games = Fun, fun, fun ... what a fine development.

Still no idea how strong Monty is, but clearly much stronger than I had assumed. We'll find out exactly how strong later, once we've done a lot of test runs.

So far, I am very happy with the group of 26 Standard Engines.
This group of 26 engines can test all 250 top engines very well with the rank system.
No changes are necessary here. Playing against the 26 Standard Engines must be hard for every engine I test.

:-)

Also worth mentioning is the work of the Russian programmer, the engine Petrel ...
It is very aggressive in the early middlegame. I often think the program will win the game after 10 moves.