UCI_Elo

Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI_Elo

Post by Ferdy »

DanaSah in human mode is around 1300.

   # PLAYER                              :  RATING  ERROR  POINTS  PLAYED   (%)
   1 Amyan 1.72 ucielo 1500              :  2351.6  137.5   112.0     132    85
   2 Cheese 2.1 ucielo 1500              :  2340.2  132.6   111.0     132    84
   3 Cheng 4.39 ucielo 1500              :  2329.1  132.5   110.0     132    83
   4 Fruit reloaded v3.21 ucielo 1500    :  2311.6  130.9   106.5     130    82
   5 Ufim v8.02 ucielo 1500              :  2146.3  118.6    99.5     146    68
   6 Rhetoric 1.4.3 ucielo 1500          :  2112.5  120.9    86.0     130    66
   7 DanaSah 7.9 ucielo 1500             :  2101.7  116.2    79.5     132    60
   8 MadChess 2.2 ucielo 1500            :  2088.8  115.2    92.0     146    63
   9 Houdini 3 ucielo 1500               :  2063.8  128.7    81.5     112    73
  10 D2019.2.37.53 ucielo 1500           :  2019.8  114.7    77.5     132    59
  11 Discocheck 5.2 ucielo 1500          :  1848.6  111.2    59.5     132    45
  12 Iota 1.0 ccrl 1019                  :  1821.1  158.6    15.5      46    34
  13 CT800 V1.34 ucielo 1500             :  1758.5  110.0    53.0     148    36
  14 Arasan 21.3 ucielo 1500             :  1662.1  112.8    41.5     132    31
  15 Hiarcs 14 ucielo 1500               :  1510.6  113.7    28.5     146    20
  16 NSVChess v0.14 ccrl 946             :  1500.0   ----    21.0     212    10
  17 DanaSah 7.9 human ucielo 1500       :  1278.1  156.1     7.5     208     4

pedrox
Posts: 1056
Joined: Fri Mar 10, 2006 6:07 am
Location: Basque Country (Spain)

Re: UCI_Elo

Post by pedrox »

Thanks for the results.

I had not seen anywhere that UCI_Elo refers to FIDE Elo. But it makes sense: when a user enables limit strength, it is to play against the engine, and in that case the engine should offer a FIDE Elo (although I have also used my engine to play against dedicated machines of the 80s and 90s). I will make the "human" version the default and I will study how to do the other options.

I think the "engine" version played more or less at the level I expected, but the "human" version came out much lower than I expected. I will try to increase the strength of this mode by 200 Elo points.

In my engine, I can adjust the strength by changing values in the configuration options. For example:

Diff engine = 50
Diff computer-engine = 350
Diff human-computer = 70

With these values it is possible that the engine in "human" mode would play at something like 1500. But I will have to check whether this is so, and whether the adjustment then works for other Elo values.
Ferdy

Re: UCI_Elo

Post by Ferdy »

According to the UCI protocol, UCI_Elo refers to Elo. Since there is no other popular chess Elo than FIDE Elo, I believe this is FIDE Elo. Mark, the author of Hiarcs, is probably aware of this; his engine at UCI_Elo 1500 is close.
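
For reference, a GUI or test script asks for limited strength through two standard UCI options, UCI_LimitStrength and UCI_Elo. Here is a minimal Python sketch that only builds the command strings. The helper name is hypothetical, and how an engine maps the value to actual playing strength is entirely up to its author, which is what the rating lists in this thread are measuring.

```python
def limit_strength_commands(elo):
    """Return the UCI commands that ask an engine to play at `elo`.

    Only the option names UCI_LimitStrength and UCI_Elo come from the UCI
    protocol; this helper itself is hypothetical.
    """
    return [
        "setoption name UCI_LimitStrength value true",
        "setoption name UCI_Elo value {}".format(int(elo)),
    ]
```

Calling `limit_strength_commands(1500)` gives the two lines a GUI would send before starting a game at UCI_Elo 1500.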
Ferdy

Re: UCI_Elo

Post by Ferdy »

A sample game against DanaSah using the chess GUI that I have been developing, which supports two TCs, one of them with time delay. DanaSah played at TC 5min+10s (Fischer increment), and I played at TC 5min-10s (10s delay).

The PGN source has clk (clock) comments showing the time remaining after each move. Press C8 on the board.

[pgn][Event "Human vs computer"] [Site "?"] [Date "2019.07.14"] [Round "?"] [White "Ferdy"] [Black "DanaSah 7.9 Human UCI_Elo 1500"] [Result "1-0"] [BlackTimeControl "300+10"] [Termination "Adjudication"] [WhiteTimeControl "300-10"] 1. d4 Nf6 { book } 2. c4 { [%clk 0:05:00] } 2... b6 { book } 3. Nc3 { [%clk 0:05:00] } 3... Bb7 { book } 4. d5 { [%clk 0:05:00] } 4... e6 { book } 5. e4 { [%clk 0:05:00] } 5... exd5 { [%clk 0:05:28] } 6. cxd5 { [%clk 0:05:00] } 6... Nxe4 { [%clk 0:05:17] } 7. Nxe4 { [%clk 0:05:00] } 7... Bb4+ { [%clk 0:05:10] } 8. Nc3 { [%clk 0:04:56] } 8... Qe7+ { [%clk 0:05:04] } 9. Be2 { [%clk 0:04:50] } 9... Qd6 { [%clk 0:04:32] } 10. Bf3 { [%clk 0:04:49] } 10... Qe7+ { [%clk 0:03:45] } 11. Ne2 { [%clk 0:04:49] } 11... Bxc3+ { [%clk 0:03:39] } 12. bxc3 { [%clk 0:04:49] } 12... Na6 { [%clk 0:03:35] } 13. d6 { [%clk 0:03:56] There is tactics here with d6, attacking the queen and Bishop at B7. } 13... Qxd6 { [%clk 0:03:11] } 14. Qxd6 { [%clk 0:03:53] } 14... cxd6 { [%clk 0:02:57] } 15. Bxb7 { [%clk 0:03:47] This is winning. } 15... Rb8 { [%clk 0:02:43] } 16. Bxa6 { [%clk 0:03:47] } 16... O-O { [%clk 0:02:38] } 17. Ba3 { [%clk 0:03:27] My time does not increase because I am playing with time delay of 10s with base time of 5 minutes. } 17... b5 { [%clk 0:02:34] } 18. Bxd6 { [%clk 0:03:27] } 18... Rfd8 { [%clk 0:02:23] } 19. Bxb8 { [%clk 0:03:27] } 19... Rxb8 { [%clk 0:02:11] } 20. Rb1 { [%clk 0:03:27] } 20... b4 { [%clk 0:02:06] } 21. Rxb4 { [%clk 0:03:27] } 21... Rxb4 { [%clk 0:01:50] } 22. cxb4 { [%clk 0:03:27] } 22... Kf8 { [%clk 0:01:33] } 23. Kd2 { [%clk 0:03:27] } 23... g5 { [%clk 0:01:29] } 24. Kd3 { [%clk 0:03:27] } 24... Kg7 { [%clk 0:01:27] } 25. Kd4 { [%clk 0:03:25] I will adjudicate this game as win for me. } 25... Kg8 { [%clk 0:01:11] } 1-0[/pgn]
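
The difference between the two time controls in the game above can be sketched as follows. This is an illustration only, not the GUI's code: the function names are made up, and it assumes a simple delay (with a Bronstein delay the time remaining after the move comes out the same).

```python
def fischer_clock(remaining, elapsed, increment):
    """Fischer increment: the increment is added after every move,
    so the remaining time can grow beyond the base time."""
    return remaining - elapsed + increment


def delay_clock(remaining, elapsed, delay):
    """Delay: only the time spent beyond `delay` seconds is deducted,
    so the remaining time never increases."""
    return remaining - max(0.0, elapsed - delay)
```

With 300 s on the clock and a move played in 4 s, the Fischer side (10 s increment) goes to 306 s, while the delay side (10 s delay) stays at 300 s. This is why the human clock in the PGN never rises above its starting value.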


And a game with Arasan
[pgn][Event "Human vs computer"] [Site "?"] [Date "2019.07.14"] [Round "?"] [White "Ferdy"] [Black "Arasan 21.3 UCI_Elo 1500"] [Result "1-0"] [BlackTimeControl "300+10"] [WhiteTimeControl "300-10"] 1. d4 d5 { book } 2. c4 { [%clk 0:05:00] } 2... c6 { book } 3. Nf3 { [%clk 0:05:00] } 3... Nf6 { book } 4. e3 { [%clk 0:05:00] } 4... Bg4 { book } 5. Nbd2 { [%clk 0:05:00] } 5... Qa5 { [%clk 0:05:43] } 6. a3 { [%clk 0:04:41] } 6... Nbd7 { [%clk 0:05:46] } 7. b4 { [%clk 0:04:41] } 7... Qc7 { [%clk 0:05:49] } 8. Bb2 { [%clk 0:04:41] } 8... Bf5 { [%clk 0:05:52] } 9. Rc1 { [%clk 0:04:39] } 9... g6 { [%clk 0:05:55] } 10. cxd5 { [%clk 0:04:36] } 10... Nxd5 { [%clk 0:05:58] } 11. e4 { [%clk 0:04:31] Got a fork here. } 11... Be6 { [%clk 0:06:00] } 12. exd5 { [%clk 0:04:31] } 12... Bxd5 { [%clk 0:06:03] } 13. Bd3 { [%clk 0:04:31] } 13... Bh6 { [%clk 0:06:06] } 14. O-O { [%clk 0:04:31] } 14... Qd6 { [%clk 0:06:08] } 15. Ra1 { [%clk 0:04:22] } 15... Bg7 { [%clk 0:06:10] } 16. Qe2 { [%clk 0:04:22] } 16... O-O { [%clk 0:06:13] } 17. Rfe1 { [%clk 0:04:22] } 17... Rae8 { [%clk 0:06:15] } 18. Rad1 { [%clk 0:04:22] } 18... f5 { [%clk 0:06:18] } 19. Nc4 { [%clk 0:04:22] } 19... Qf4 { [%clk 0:06:21] } 20. Bc1 { [%clk 0:04:21] } 20... Qg4 { [%clk 0:06:21] } 21. Ne3 { [%clk 0:04:15] } 21... Bxf3 { [%clk 0:06:24] } 22. Qxf3 { [%clk 0:04:15] } 22... Qxf3 { [%clk 0:06:27] } 23. gxf3 { [%clk 0:04:15] } 23... b5 { [%clk 0:06:31] } 24. Bb2 { [%clk 0:04:15] } 24... f4 { [%clk 0:06:35] } 25. Nf1 { [%clk 0:04:12] } 25... e5 { [%clk 0:06:39] } 26. dxe5 { [%clk 0:04:12] } 26... Nb8 { [%clk 0:06:43] } 27. Nd2 { [%clk 0:04:12] } 27... Rd8 { [%clk 0:06:48] } 28. Ne4 { [%clk 0:04:12] } 28... Rfe8 { [%clk 0:06:52] } 29. Bb1 { [%clk 0:04:04] } 29... Rxd1 { [%clk 0:06:56] } 30. Rxd1 { [%clk 0:04:04] } 30... Bxe5 { [%clk 0:07:01] } 31. Bxe5 { [%clk 0:04:04] } 31... Rxe5 { [%clk 0:07:06] } 32. Rd8+ { [%clk 0:04:04] } 32... Kg7 { [%clk 0:07:11] } 33. Rxb8 { [%clk 0:04:04] } 33... Re7 { [%clk 0:07:16] } 34. 
Rc8 { [%clk 0:04:04] } 34... Rd7 { [%clk 0:07:21] } 35. Ba2 { [%clk 0:04:04] } 35... Rd3 { [%clk 0:07:26] } 36. Rc7+ { [%clk 0:03:54] } 36... Kh6 { [%clk 0:07:31] } 37. Bg8 { [%clk 0:03:54] } 37... g5 { [%clk 0:07:36] } 38. Rxh7+ { [%clk 0:03:54] } 38... Kg6 { [%clk 0:07:41] } 39. Rxa7 { [%clk 0:03:54] } 39... Rxf3 { [%clk 0:07:46] } 40. Bh7+ { [%clk 0:03:54] } 40... Kh5 { [%clk 0:07:51] } 41. Nf6+ { [%clk 0:03:54] } 41... Kh4 { [%clk 0:07:56] } 42. Bf5 { [%clk 0:03:54] } 42... g4 { [%clk 0:08:01] } 43. Bxg4 { [%clk 0:03:54] } 43... Rd3 { [%clk 0:08:06] } 44. Rg7 { [%clk 0:03:47] } 44... Rxa3 { [%clk 0:08:11] } 45. h3 { [%clk 0:03:47] } 45... Rb3 { [%clk 0:08:16] } 46. Kg2 { [%clk 0:03:47] } 46... Rxb4 { [%clk 0:08:20] } 47. Bf3 { [%clk 0:03:47] } 47... Rb1 { [%clk 0:08:25] } 48. Rg4# { [%clk 0:03:47] } 1-0 [/pgn]
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: UCI_Elo

Post by MikeB »

pedrox wrote: Fri Jul 12, 2019 9:54 pm
Thanks for the results.

I had not seen anywhere that UCI_Elo refers to FIDE Elo. But it makes sense: when a user enables limit strength, it is to play against the engine, and in that case the engine should offer a FIDE Elo (although I have also used my engine to play against dedicated machines of the 80s and 90s). I will make the "human" version the default and I will study how to do the other options.

I think the "engine" version played more or less at the level I expected, but the "human" version came out much lower than I expected. I will try to increase the strength of this mode by 200 Elo points.

In my engine, I can adjust the strength by changing values in the configuration options. For example:

Diff engine = 50
Diff computer-engine = 350
Diff human-computer = 70

With these values it is possible that the engine in "human" mode would play at something like 1500. But I will have to check whether this is so, and whether the adjustment then works for other Elo values.

It is commonly accepted that Elo means something in the ballpark of FIDE Elo. Of course, many national federations have their own rating systems, and even the engine-vs-engine testers try to have something that is supposed to align with FIDE Elo. But with no interaction between the human pool of players and the engine pool of players, it is impossible to say CCRL equals FIDE, etc.

We do know that top players are rated around 2800 FIDE, and it does appear, from a distance, that an engine rated near 2800 CCRL is probably close to 2800 FIDE, but who really knows for sure? The answer is that we do not know. We do know that it is not exact, but it is probably in range if you use very large error bars: say 2800 CCRL is probably within plus or minus 100 Elo of 2800 FIDE. And that of course will also be true at 1500 Elo, plus or minus 100 Elo. And even to my off-the-cuff comment here, somebody else will say, "no, it's xyz", and they could be right, or they could be wrong. I would be shocked to find that the error was more than 200 Elo, but who knows.

The very best players in the world no longer like to play the best engines in the world in public, and I don't blame them one iota, as the difference is now in the multiple hundreds of Elo and they have almost no shot at winning even one game. Drawing a game now and then is probably the best they can do now.
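
To put a rating gap into numbers, the standard Elo expected-score formula is E = 1 / (1 + 10^(-d/400)), where d is the rating difference and a draw counts as half a point. A small Python sketch (the function name is illustrative):

```python
def expected_score(rating_diff):
    """Expected score (0..1) of a player rated `rating_diff` Elo above
    the opponent, according to the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))
```

For example, a player rated 300 Elo below the opponent is expected to score roughly 15%, and at a gap of 400 Elo the stronger side is expected to score about 91%.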
Ferdy

Re: UCI_Elo

Post by Ferdy »

I reduced Deuterium's node limit again, to 200 nodes; it is now much closer to CT800. No other tricks have been added to reduce strength so far, just the node reduction. Deuterium played at TC 3m+2s while I used TC 10m with 10s delay.

   # PLAYER                                   :  RATING  ERROR  POINTS  PLAYED   (%)
   1 Amyan 1.72 ucielo 1500                   :  2298.3  118.2   120.0     140    86
   2 Cheng 4.39 ucielo 1500                   :  2281.8  117.3   118.5     140    85
   3 Cheese 2.1 ucielo 1500                   :  2245.1  118.0   115.0     140    82
   4 Fruit reloaded v3.21 ucielo 1500         :  2238.3  104.1   112.5     138    82
   5 Ufim v8.02 ucielo 1500                   :  2128.0   98.1   110.5     154    72
   6 Rhetoric 1.4.3 ucielo 1500               :  2062.9  108.5    93.5     138    68
   7 DanaSah 7.9 ucielo 1500                  :  2061.9  108.7    74.0     124    60
   8 MadChess 2.2 ucielo 1500                 :  2061.7   95.5   101.5     154    66
   9 Houdini 3 ucielo 1500                    :  2055.3  101.6    95.0     128    74
  10 Discocheck 5.2 ucielo 1500               :  1822.5  105.6    66.5     140    48
  11 Iota 1.0 ccrl 1019                       :  1822.1  121.7    24.5      58    42
  12 Deuterium v2019.2.37.53 ucielo 1500      :  1784.4   86.1    91.0     240    38
  13 CT800 V1.34 ucielo 1500                  :  1731.5   90.1    58.0     156    37
  14 Arasan 21.3 ucielo 1500                  :  1619.6   90.4    44.5     156    29
  15 Hiarcs 14 ucielo 1500                    :  1510.7  104.2    32.5     154    21
  16 NSVChess v0.14 ccrl 946                  :  1500.0   ----    25.0     212    12
  17 DanaSah 7.9 human ucielo 1500            :  1243.9  129.1     7.5     208     4

A sample game showing how it played. It does not blunder material directly, but you have to play combinations to outplay it.

[pgn][Event "Human vs computer"] [Site "?"] [Date "2019.07.14"] [Round "?"] [White "Ferdy"] [Black "Deuterium v2019.2.37.53 UCI_Elo 1500"] [Result "1-0"] [BlackTimeControl "180+2"] [WhiteTimeControl "600-10"] 1. d4 d6 { book } 2. c4 { [%clk 0:10:00] } 2... e5 { book } 3. d5 { [%clk 0:10:00] } 3... g6 { book } 4. e4 { [%clk 0:10:00] } 4... Bg7 { book } 5. Bd3 { [%clk 0:10:00] } 5... Nf6 { [%clk 0:03:06] } 6. f3 { [%clk 0:10:00] } 6... O-O { [%clk 0:03:05] } 7. Be3 { [%clk 0:10:00] } 7... Nbd7 { [%clk 0:03:04] } 8. Ne2 { [%clk 0:10:00] } 8... Nc5 { [%clk 0:03:03] } 9. Bc2 { [%clk 0:10:00] } 9... Nfd7 { [%clk 0:03:02] } 10. Nd2 { [%clk 0:09:36] } 10... f5 { [%clk 0:03:01] } 11. O-O { [%clk 0:09:36] } 11... f4 { [%clk 0:03:00] } 12. Bf2 { [%clk 0:09:36] } 12... a5 { [%clk 0:02:59] } 13. Nc3 { [%clk 0:09:34] } 13... c6 { [%clk 0:02:58] } 14. a3 { [%clk 0:09:34] } 14... g5 { [%clk 0:02:57] } 15. b4 { [%clk 0:09:34] } 15... Na6 { [%clk 0:02:56] } 16. Rb1 { [%clk 0:09:34] } 16... axb4 { [%clk 0:02:55] } 17. axb4 { [%clk 0:09:34] } 17... Nc7 { [%clk 0:02:54] } 18. c5 { [%clk 0:09:31] } 18... Ra3 { [%clk 0:02:53] } 19. Rb3 { [%clk 0:09:31] } 19... Rxb3 { [%clk 0:02:52] } 20. Bxb3 { [%clk 0:09:31] } 20... cxd5 { [%clk 0:02:51] } 21. Nxd5 { [%clk 0:09:31] } 21... Ne8 { [%clk 0:02:50] } 22. Nc4 { [%clk 0:09:31] } 22... dxc5 { [%clk 0:02:49] } 23. bxc5 { [%clk 0:09:31] } 23... Nc7 { [%clk 0:02:48] } 24. Nd6 { [%clk 0:08:28] } 24... Ne6 { [%clk 0:02:47] } 25. Nxc8 { [%clk 0:08:25] } 25... Ndxc5 { [%clk 0:02:46] } 26. Nde7+ { [%clk 0:06:32] } 26... Kh8 { [%clk 0:02:44] } 27. Qxd8 { [%clk 0:06:32] } 27... Rxd8 { [%clk 0:02:43] } 28. Bd5 { [%clk 0:06:23] } 28... Nd4 { [%clk 0:02:42] } 29. Nb6 { [%clk 0:06:12] } 29... Rd6 { [%clk 0:02:41] } 30. Nc4 { [%clk 0:06:11] } 30... Rd7 { [%clk 0:02:40] } 31. Nc8 { [%clk 0:06:00] } 31... Rc7 { [%clk 0:02:39] } 32. N8b6 { [%clk 0:05:58] } 32... Nd3 { [%clk 0:02:38] } 33. Ra1 { [%clk 0:05:53] } 33... h5 { [%clk 0:02:37] } 34. 
Ra8+ { [%clk 0:05:43] } 34... Kh7 { [%clk 0:02:36] } 35. Rb8 { [%clk 0:05:38] } 35... Nxf2 { [%clk 0:02:35] } 36. Kxf2 { [%clk 0:05:38] } 36... Kg6 { [%clk 0:02:34] } 37. Rxb7 { [%clk 0:05:38] } 37... Rxb7 { [%clk 0:02:33] } 38. Bxb7 { [%clk 0:05:38] } 38... g4 { [%clk 0:02:32] } 39. Nd7 { [%clk 0:05:38] } 39... Nc2 { [%clk 0:02:31] } 40. Ndxe5+ { [%clk 0:05:38] } 40... Bxe5 { [%clk 0:02:30] } 41. Nxe5+ { [%clk 0:05:38] } 41... Kg5 { [%clk 0:02:29] } 42. Nc4 { [%clk 0:05:35] } 42... Nd4 { [%clk 0:02:28] } 43. Bc8 { [%clk 0:05:28] } 43... gxf3 { [%clk 0:02:27] } 44. gxf3 { [%clk 0:05:28] } 44... Kf6 { [%clk 0:02:26] } 45. Nb2 { [%clk 0:05:12] } 45... Kg5 { [%clk 0:02:25] } 46. Nd3 { [%clk 0:05:12] } 46... h4 { [%clk 0:02:24] } 47. h3 { [%clk 0:05:12] } 47... Nc6 { [%clk 0:02:23] } 48. Bg4 { [%clk 0:05:11] } 48... Nd4 { [%clk 0:02:22] } 49. Ke1 { [%clk 0:05:11] } 49... Nc2+ { [%clk 0:02:21] } 50. Kd2 { [%clk 0:05:11] } 50... Nd4 { [%clk 0:02:20] } 51. Kc3 { [%clk 0:05:11] } 51... Nc6 { [%clk 0:02:19] } 52. Kc4 { [%clk 0:05:11] } 52... Na5+ { [%clk 0:02:18] } 53. Kd5 { [%clk 0:05:11] } 53... Nb3 { [%clk 0:02:17] } 54. e5 { [%clk 0:05:11] } 54... Nd2 { [%clk 0:02:16] } 55. e6 { [%clk 0:05:11] } 55... Kf6 { [%clk 0:02:15] } 56. Nxf4 { [%clk 0:05:11] } 56... Nb3 { [%clk 0:02:14] } 57. Kd6 { [%clk 0:05:11] } 57... Nd4 { [%clk 0:02:13] } 58. Nd5+ { [%clk 0:05:11] } 58... Kg6 { [%clk 0:02:12] } 59. e7 { [%clk 0:05:11] } 59... Nb5+ { [%clk 0:02:11] } 60. Kd7 { [%clk 0:05:11] } 60... Nd6 { [%clk 0:02:10] } 61. Kxd6 { [%clk 0:05:11] } 61... Kf7 { [%clk 0:02:09] } 62. Kd7 { [%clk 0:05:11] } 62... Kg6 { [%clk 0:02:08] } 63. e8=Q+ { [%clk 0:05:11] } 63... Kh6 { [%clk 0:02:06] } 64. Qe6+ { [%clk 0:05:11] } 64... Kg7 { [%clk 0:02:05] } 65. Qe7+ { [%clk 0:05:11] } 65... Kh6 { [%clk 0:02:04] } 66. Qf6+ { [%clk 0:05:11] } 66... Kh7 { [%clk 0:02:03] } 67. Bf5+ { [%clk 0:05:11] } 67... Kg8 { [%clk 0:02:02] } 68. Ne7# { [%clk 0:05:11] } 1-0 [/pgn]
Ferdy

Re: UCI_Elo

Post by Ferdy »

I reduced strength further by randomizing the piece values of the queen and rook. Before each search, the queen value is randomized between 400 and 700 cp, and the rook value between 300 and 700 cp. This is in Deuterium v2019.2.37.54 ucielo 1500. Note that the basic weakening at this 1500 Elo level is a limit of 200 nodes per move, plus this material randomizer.

The base version is Deuterium v2019.2.37.53 ucielo 1500, which runs at 200 nodes.
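
The material randomizer described above could look roughly like this in Python. This is a sketch, not Deuterium's code: only the two ranges come from the post, while the function name, the dictionary layout and the base values in the usage note are assumptions.

```python
import random

# Ranges in centipawns, as given in the post.
RANDOM_RANGES_CP = {"Q": (400, 700), "R": (300, 700)}


def randomized_piece_values(base_values, rng=random):
    """Return a copy of `base_values` with the queen and rook values
    re-drawn uniformly from their ranges; other pieces are unchanged."""
    values = dict(base_values)
    for piece, (low, high) in RANDOM_RANGES_CP.items():
        values[piece] = rng.randint(low, high)
    return values
```

It would be called once before each search, for example with hypothetical base values `{"P": 100, "N": 325, "B": 325, "R": 500, "Q": 975}`. A queen that can be valued as low as 400 cp is cheaper than a rook in some searches, which would explain an engine giving up its queen easily.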

Result at TC 40/2min. It is now below CT800.

   # PLAYER                                 :  RATING  ERROR  POINTS  PLAYED   (%)
   1 Amyan 1.72 ucielo 1500                 :  2290.3  118.7   135.5     156    87
   2 Cheng 4.39 ucielo 1500                 :  2279.5  131.5   134.5     156    86
   3 Cheese 2.1 ucielo 1500                 :  2234.1  118.4   130.0     156    83
   4 Fruit reloaded v3.21 ucielo 1500       :  2208.5  119.5   125.5     154    81
   5 Ufim v8.02 ucielo 1500                 :  2121.1  102.3   125.5     170    74
   6 MadChess 2.2 ucielo 1500               :  2065.4  105.2   117.5     170    69
   7 Rhetoric 1.4.3 ucielo 1500             :  2054.5  106.5   107.5     154    70
   8 DanaSah 7.9 ucielo 1500                :  2053.2  113.2    74.0     124    60
   9 Houdini 3 ucielo 1500                  :  2037.3  113.3   107.5     144    75
  10 Iota 1.0 ccrl 1019                     :  1821.9  131.7    24.5      58    42
  11 Discocheck 5.2 ucielo 1500             :  1819.8   96.7    77.5     156    50
  12 Deuterium v2019.2.37.53 ucielo 1500    :  1788.9   92.1   103.0     256    40
  13 CT800 V1.34 ucielo 1500                :  1727.9  108.0    67.0     172    39
  14 Deuterium v2019.2.37.54 ucielo 1500    :  1672.7   96.6    60.5     224    27
  15 Arasan 21.3 ucielo 1500                :  1628.3  102.4    52.5     172    31
  16 Hiarcs 14 ucielo 1500                  :  1511.7  108.3    37.0     170    22
  17 NSVChess v0.14 ccrl 946                :  1500.0   ----    25.0     212    12
  18 DanaSah 7.9 human ucielo 1500          :  1259.7  124.7     9.5     224     4

A sample game with Deuterium v2019.2.37.54 ucielo 1500. There is a bit of a struggle in the opening, and it keeps weakening its squares. It gives up its queen without much of a fight; perhaps this is because of the queen value being randomized in [400, 700] cp.
I played at TC 10m with 10s delay; Deuterium was at TC 3m+2s.

[pgn][Event "Human vs computer"] [Site "?"] [Date "2019.07.15"] [Round "?"] [White "Deuterium v2019.2.37.54 UCI_Elo 1500"] [Black "Ferdy"] [Result "0-1"] [BlackTimeControl "600-10"] [WhiteTimeControl "180+2"] 1. d4 { book } 1... d5 { [%clk 0:10:00] } 2. c4 { book } 2... e6 { [%clk 0:10:00] } 3. Nc3 { book } 3... Nf6 { [%clk 0:10:00] } 4. Bg5 { book } 4... Be7 { [%clk 0:10:00] } 5. Nf3 { book } 5... O-O { [%clk 0:10:00] } 6. e3 { [%clk 0:03:08] } 6... b6 { [%clk 0:10:00] } 7. Bd3 { [%clk 0:03:07] } 7... Bb7 { [%clk 0:10:00] } 8. O-O { [%clk 0:03:06] } 8... Nbd7 { [%clk 0:09:59] } 9. cxd5 { [%clk 0:03:05] } 9... exd5 { [%clk 0:09:59] } 10. h3 { [%clk 0:03:04] } 10... Ne4 { [%clk 0:09:48] } 11. Bxe7 { [%clk 0:03:03] } 11... Qxe7 { [%clk 0:09:48] } 12. a4 { [%clk 0:03:02] } 12... a6 { [%clk 0:09:48] } 13. h4 { [%clk 0:03:01] } 13... h6 { [%clk 0:09:42] } 14. Rc1 { [%clk 0:03:00] } 14... c5 { [%clk 0:09:42] } 15. Kh1 { [%clk 0:02:59] } 15... Rac8 { [%clk 0:09:14] } 16. Ne2 { [%clk 0:02:58] } 16... c4 { [%clk 0:08:54] } 17. Bb1 { [%clk 0:02:56] } 17... b5 { [%clk 0:08:54] } 18. Nf4 { [%clk 0:02:55] } 18... Rc7 { [%clk 0:08:32] } 19. a5 { [%clk 0:02:54] } 19... Rd8 { [%clk 0:07:18] } 20. Bc2 { [%clk 0:02:53] } 20... Nf8 { [%clk 0:07:06] } 21. g3 { [%clk 0:02:52] } 21... Bc8 { [%clk 0:06:43] } 22. Qe2 { [%clk 0:02:51] } 22... Bg4 { [%clk 0:06:43] } 23. Kh2 { [%clk 0:02:50] } 23... g5 { [%clk 0:06:43] } 24. hxg5 { [%clk 0:02:49] } 24... hxg5 { [%clk 0:06:43] } 25. Nh3 { [%clk 0:02:48] } 25... Rd6 { [%clk 0:06:30] } 26. Bxe4 { [%clk 0:02:47] } 26... dxe4 { [%clk 0:06:30] } 27. Nfg1 { [%clk 0:02:46] } 27... Bxe2 { [%clk 0:06:30] } 28. Nxe2 { [%clk 0:02:45] } 28... Rh6 { [%clk 0:06:30] } 29. f3 { [%clk 0:02:44] } 29... exf3 { [%clk 0:06:30] } 30. Rxf3 { [%clk 0:02:43] } 30... g4 { [%clk 0:06:30] } 31. Rf5 { [%clk 0:02:42] } 31... Rxh3+ { [%clk 0:06:30] } 32. Kg1 { [%clk 0:02:41] } 32... Qxe3+ { [%clk 0:06:30] } 33. Rf2 { [%clk 0:02:40] } 33... Ne6 { [%clk 0:06:11] } 34. 
d5 { [%clk 0:02:39] } 34... Ng5 { [%clk 0:06:11] } 35. d6 { [%clk 0:02:38] } 35... Rd7 { [%clk 0:06:11] } 36. Rd1 { [%clk 0:02:37] } 36... Ne4 { [%clk 0:06:11] } 37. Rf1 { [%clk 0:02:36] } 37... Rxd6 { [%clk 0:06:11] } 38. b4 { [%clk 0:02:35] } 38... Rd2 { [%clk 0:06:10] } 39. Kg2 { [%clk 0:02:34] } 39... Rxe2 { [%clk 0:06:10] } 40. Rxe2 { [%clk 0:02:33] } 40... Qxg3# { [%clk 0:06:10] } 0-1 [/pgn]
Ferdy

Re: UCI_Elo

Post by Ferdy »

I collected some games played by players rated Elo 1400 to 1600, taken from TWIC issues from 2018 to July 2019. These could be useful for approximating UCI_Elo 1500, for engine authors who may wish to implement UCI_Elo in their engine or improve their current implementation.

The source PGN file was cleaned and duplicate games were removed with pgn-extract.

7000 plus games, white has a rating from 1400 to 1600.
https://drive.google.com/file/d/1eD0a9z ... sp=sharing

7000 plus games, black has a rating from 1400 to 1600.
https://drive.google.com/file/d/1eoXGxQ ... sp=sharing
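
The files above were prepared with pgn-extract. For anyone who wants to apply a similar rating filter to their own PGN collection, here is a rough standard-library Python sketch. It is not the method used for these files; it assumes one tag pair per line and that every game starts with an [Event ...] tag, and all function names are made up.

```python
import re

# Matches a PGN tag pair such as [WhiteElo "1500"].
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def split_games(pgn_text):
    """Split PGN text into games; each game starts at an [Event ...] tag."""
    games, current = [], []
    for line in pgn_text.strip().splitlines():
        if line.startswith("[Event ") and current:
            games.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        games.append("\n".join(current))
    return games


def tag_rating(game, tag):
    """Return a rating tag as an int, or None when missing or not a number."""
    for line in game.splitlines():
        found = TAG_RE.match(line.strip())
        if found and found.group(1) == tag and found.group(2).isdigit():
            return int(found.group(2))
    return None


def games_in_band(pgn_text, tag="WhiteElo", low=1400, high=1600):
    """Keep the games whose `tag` rating lies in [low, high]."""
    kept = []
    for game in split_games(pgn_text):
        rating = tag_rating(game, tag)
        if rating is not None and low <= rating <= high:
            kept.append(game)
    return kept
```

Calling `games_in_band(text, "WhiteElo")` and `games_in_band(text, "BlackElo")` gives the two kinds of collections listed above, one per colour.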
Ferdy

Re: UCI_Elo

Post by Ferdy »

The latest Stockfish has the UCI_Elo feature. It is now included in the test at 1500.
I also made some revisions to Deuterium: it is now limited to 300 nodes, and it randomizes the piece values of Q, R, B and N at most 50% of the time over all moves in a game.

TC 40/2m

   # PLAYER                                 :  RATING  ERROR  POINTS  PLAYED   (%)
   1 Cheng 4.39 ucielo 1500                 :  2282.8  121.0   134.5     156    86
   2 Cheese 2.1 ucielo 1500                 :  2258.0  114.7   132.0     156    85
   3 Fruit reloaded v3.21 ucielo 1500       :  2233.2  113.5   127.5     154    83
   4 Amyan 1.72 ucielo 1500                 :  2220.8  113.6   128.0     156    82
   5 Ufim v8.02 ucielo 1500                 :  2109.9  102.7   123.0     170    72
   6 Rhetoric 1.4.3 ucielo 1500             :  2070.8   99.8   107.5     154    70
   7 DanaSah 7.9 ucielo 1500                :  2039.2   89.6    74.0     124    60
   8 Houdini 3 ucielo 1500                  :  2006.3  104.8   101.5     144    70
   9 MadChess 2.2 ucielo 1500               :  1994.0  112.5   104.5     170    61
  10 Deuterium v2019.2.37.59 ucielo 1500    :  1862.7   94.7   117.5     256    46
  11 Stockfish 2019.07.14 ucielo 1500       :  1842.8   96.9   112.5     256    44
  12 Discocheck 5.2 ucielo 1500             :  1795.4  104.3    69.0     156    44
  13 Iota 1.0 ccrl 1019                     :  1766.3  117.5    26.0      74    35
  14 CT800 V1.34 ucielo 1500                :  1762.9   86.2    67.5     172    39
  15 Arasan 21.3 ucielo 1500                :  1648.5  104.5    50.5     172    29
  16 Hiarcs 14 ucielo 1500                  :  1534.2   96.4    35.5     170    21
  17 NSVChess v0.14 ccrl 946                :  1500.0   ----    25.5     228    11
  18 DanaSah 7.9 human ucielo 1500          :  1295.1  132.4     9.5     224     4


Meanwhile, I created a test set for these UCI_Elo 1500 engines. The test positions are from games of human players with Elo 1450 to 1550. The main goal is to find which UCI engine has the greatest number of matches in the test. A test EPD record looks something like this:

3r2k1/p5p1/1pR4p/4R3/3r4/8/PP4PP/6K1 b - - bm Rd2; ce 0; c0 "Rd1+"; c1 "154";

bm Rd2 is the move played by a human player rated 1450 to 1550 in an actual game.
ce 0 is the centipawn evaluation of the move Rd2, from Stockfish dev at 1 s of analysis on an i7 3.4 GHz PC.
c0 "Rd1+" is the move preferred by Stockfish dev, and c1 "154" is its score in cp. I have collected around 60k test positions.
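
A record in this layout can be read with a few lines of Python. This is a minimal sketch for this specific layout (four position fields followed by `opcode operand;` pairs), not a full EPD parser, and the function name is made up.

```python
import re

# One operation: an opcode, then either a quoted or a bare operand, then ';'.
OP_RE = re.compile(r'(\w+)\s+(?:"([^"]*)"|([^;]*));')


def parse_epd(line):
    """Return (position, operations) for one EPD record.

    `position` is the four leading fields; `operations` maps each opcode
    (bm, ce, c0, c1, ...) to its operand as a string.
    """
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    operations = {}
    if len(fields) == 5:
        for opcode, quoted, bare in OP_RE.findall(fields[4]):
            operations[opcode] = quoted if quoted else bare.strip()
    return position, operations
```

For the record shown above, `operations["bm"]` is the human move `Rd2`, and `operations["c0"]` and `operations["c1"]` hold the Stockfish move and its score.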

To test these UCI_Elo 1500 engines, each position is given to the engine, which is allowed to search for 1 s per position. Whenever the engine's bestmove is the same as bm, a match counter is incremented. Apart from the match counter, I also record the positions where the human's bm differs from the engine's bestmove. The engine's move can be stronger or weaker than the human move. If the engine move is stronger, I count it in the High counter; if it is weaker, I count it in the Low counter. Other items are also recorded, such as the average difference between the engine's move score and the human's move score when the two moves differ.
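
The bookkeeping described above can be sketched in Python as follows. This is an illustration under assumptions, not the actual test script: each record is taken to be (engine move, human move, engine move score, human move score) with scores in cp, and differing moves with equal scores are counted separately because the post does not say how they are handled.

```python
def tally(records):
    """Count Match/High/Low and compute the average cp differences.

    records: iterable of (engine_move, human_move, engine_cp, human_cp).
    """
    match = high = low = equal = 0
    high_sum = low_sum = 0
    for engine_move, human_move, engine_cp, human_cp in records:
        if engine_move == human_move:
            match += 1
        elif engine_cp > human_cp:      # engine move is stronger
            high += 1
            high_sum += engine_cp - human_cp
        elif engine_cp < human_cp:      # engine move is weaker
            low += 1
            low_sum += human_cp - engine_cp
        else:                           # different move, same score (assumption)
            equal += 1
    return {
        "Match": match,
        "High": high,
        "Low": low,
        "Equal": equal,
        "HACD": high_sum / high if high else 0.0,
        "LACD": low_sum / low if low else 0.0,
    }
```

The match percentage is then `100 * Match / Total`, and HACD and LACD are the averages over the High and Low positions respectively.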

Results on 1000 test positions.

UCI_Elo 1500 engine test results on FIDE Elo 1500
Test positions are taken from players with FIDE Elo 1450 to 1550

                               Engine  Total  Match  High  Low  HACD  LACD
 Deuterium v2019.2.37.59 UCI_Elo 1500   1000    362   291  347   357   335
             Arasan 21.3 UCI_Elo 1500   1000    305   266  429   494   313
              Ufim v8.02 UCI_Elo 1500   1000    428   280  292   475   332
             CT800 V1.34 UCI_Elo 1500   1000    333   244  423   268   739
       DanaSah 7.9 Human UCI_Elo 1500   1000    332   250  418   392   577
    Stockfish 2019.07.14 UCI_Elo 1500   1000    254   263  483   493   356
              Cheng 4.39 UCI_Elo 1500   1000    408   325  267   442   329
          Discocheck 5.2 UCI_Elo 1500   1000    368   276  356   250   422
               Houdini 3 UCI_Elo 1500   1000    360   263  377   393   217
              Amyan 1.72 UCI_Elo 1500   1000    348   239  413   399   738
          Rhetoric 1.4.3 UCI_Elo 1500   1000    359   286  355   240   664
               Hiarcs 14 UCI_Elo 1500   1000    342   239  419   286   440
              Cheese 2.1 UCI_Elo 1500   1000    432   312  256   441   432

::Legend::
Total: Number of test positions from human games.
Match: Count of pos, where engine and human move are the same.
High : Count of pos, where engine move is stronger than human move.
Low  : Count of pos, where engine move is weaker than human move.
HACD : High Average Centipawn Difference, engine move is stronger 
       than human move by Centipawn amount, according to Stockfish 2019.04.16.
LACD : Low Average Centipawn Difference, engine move is weaker 
       than human move by Centipawn amount, according to Stockfish 2019.04.16.

Table interpretation:
Deuterium matched the human move in 362 positions, i.e. 100*362/1000 = 36.2%. By comparison, the engine with the most matches is Cheese 2.1 at 43.2%.

The HACD of Deuterium is 357 cp, or around three and a half pawns. HACD means that when Deuterium's move is stronger than the human move, it scores on average 357 cp above the human move. To simulate human play, HACD should be as small as possible; of the engines tested, Rhetoric has the best value at 240 cp. This means that when you play against Rhetoric, its stronger moves are on average 240 cp above the human move.

LACD is the opposite of HACD: here the engine move is weaker than the human move. For Deuterium, LACD is 335 cp, which means that when Deuterium plays a weaker move, on average it gives a 335 cp advantage to its opponent. Looking at the table, the engines that give away the most advantage to their opponents are CT800 at 739 cp and Amyan at 738 cp. For humans in the lower rating range these engines are good to play against, but be aware of their HACD values too.

So how do we rank engines that play like humans, based on human test positions?
I can list the following criteria:
1. Match (max is better)
2. High (min is better)
3. Low (max is better)
4. HACD (min is better)
5. LACD (max is better)

It seems like this is an MCDA (Multi-Criteria Decision Analysis) issue, where alternatives are ranked based on criteria. One technique to rank alternatives is by using TOPSIS.
TOPSIS ref.
https://en.wikipedia.org/wiki/TOPSIS
https://www.slideshare.net/pranavmishra ... g-approach

With that table I tried to rank those engines with TOPSIS, using the skcriteria Python module.
Here are the results, with a weight applied to each criterion:
Match, weight=0.6
High, weight=0.05
Low, weight=0.05
HACD, weight=0.2
LACD, weight=0.1
The total weight is 1.0. The column headers in the table also indicate whether min or max is preferable; that is my input too.

TOPSIS (mnorm=vector, wnorm=sum) - Solution:
             ALT./CRIT.                Match (max) W.0.6    High (min) W.0.05    Low (max) W.0.05    HACD (min) W.0.2    LACD (max) W.0.1    Rank
------------------------------------  -------------------  -------------------  ------------------  ------------------  ------------------  ------
Deuterium v2019.2.37.59 UCI_Elo 1500          362                  291                 347                 357                 335            7
      Arasan 21.3 UCI_Elo 1500                305                  266                 429                 494                 313            12
      Ufim v8.02 UCI_Elo 1500                 428                  280                 292                 475                 332            2
      CT800 V1.34 UCI_Elo 1500                333                  244                 423                 268                 739            6
   DanaSah 7.9 Human UCI_Elo 1500             332                  250                 418                 392                 577            11
 Stockfish 2019.07.14 UCI_Elo 1500            254                  263                 483                 493                 356            13
      Cheng 4.39 UCI_Elo 1500                 408                  325                 267                 442                 329            5
    Discocheck 5.2 UCI_Elo 1500               368                  276                 356                 250                 422            4
       Houdini 3 UCI_Elo 1500                 360                  263                 377                 393                 217            10
      Amyan 1.72 UCI_Elo 1500                 348                  239                 413                 399                 738            8
    Rhetoric 1.4.3 UCI_Elo 1500               359                  286                 355                 240                 664            3
       Hiarcs 14 UCI_Elo 1500                 342                  239                 419                 286                 440            9
      Cheese 2.1 UCI_Elo 1500                 432                  312                 256                 441                 432            1

According to the weights assigned, Cheese 2.1 is the best at rank #1, followed by Ufim and Rhetoric.
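
For readers who want to try their own weights, here is a plain-Python sketch of textbook TOPSIS with vector normalization and sum-normalized weights. It is not the skcriteria code used for the table above and may not reproduce its ranks exactly; the function name and the data layout are assumptions, and the numbers are copied from the 1000-position results.

```python
import math

# (engine, Match, High, Low, HACD, LACD) from the 1000-position test.
RESULTS = [
    ("Deuterium v2019.2.37.59", 362, 291, 347, 357, 335),
    ("Arasan 21.3", 305, 266, 429, 494, 313),
    ("Ufim v8.02", 428, 280, 292, 475, 332),
    ("CT800 V1.34", 333, 244, 423, 268, 739),
    ("DanaSah 7.9 Human", 332, 250, 418, 392, 577),
    ("Stockfish 2019.07.14", 254, 263, 483, 493, 356),
    ("Cheng 4.39", 408, 325, 267, 442, 329),
    ("Discocheck 5.2", 368, 276, 356, 250, 422),
    ("Houdini 3", 360, 263, 377, 393, 217),
    ("Amyan 1.72", 348, 239, 413, 399, 738),
    ("Rhetoric 1.4.3", 359, 286, 355, 240, 664),
    ("Hiarcs 14", 342, 239, 419, 286, 440),
    ("Cheese 2.1", 432, 312, 256, 441, 432),
]
WEIGHTS = [0.6, 0.05, 0.05, 0.2, 0.1]           # Match, High, Low, HACD, LACD
MAXIMIZE = [True, False, True, False, True]     # max or min is better


def topsis(matrix, weights, maximize):
    """Return one closeness score per row (0..1, higher is better)."""
    n = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]            # sum-normalize the weights
    # Vector-normalize each column; `or 1.0` guards an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n)]
    v = [[w[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if maximize[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if maximize[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_ideal = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))
        d_worst = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, worst)))
        denom = d_ideal + d_worst
        scores.append(d_worst / denom if denom else 0.5)
    return scores


def rank_engines(results=RESULTS, weights=WEIGHTS, maximize=MAXIMIZE):
    """Return engine names sorted from best to worst closeness score."""
    scores = topsis([row[1:] for row in results], weights, maximize)
    order = sorted(zip(scores, (row[0] for row in results)), reverse=True)
    return [name for _, name in order]
```

Calling `rank_engines(weights=[...])` with a different weight list shows how sensitive the ranking is to the weights, which are a judgment call.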

If you have weight values that you would like to try, post them and I will run them.

Next I will be testing these engines at 5000 positions.
Ferdy

Re: UCI_Elo

Post by Ferdy »

Ferdy wrote: Tue Jul 23, 2019 3:53 pm
Next I will be testing these engines at 5000 positions.

The tests at 5000 positions are complete.

Test results for engines set to UCI_Elo 1500, compared against FIDE Elo 1500 play.
Test positions are taken from games of players rated FIDE Elo 1450 to 1550.

Code: Select all

                               Engine  Total  Match  High   Low  HACD  LACD
 Deuterium v2019.2.37.59 UCI_Elo 1500   5000   1891  1360  1749   426   284
              Ufim v8.02 UCI_Elo 1500   5000   2164  1360  1476   447   423
             CT800 V1.34 UCI_Elo 1500   5000   1627  1164  2209   329   914
             Arasan 21.3 UCI_Elo 1500   5000   1634  1178  2188   491   440
       DanaSah 7.9 Human UCI_Elo 1500   5000   1710  1195  2095   434   324
    Stockfish 2019.07.14 UCI_Elo 1500   5000   1304  1231  2465   443   390
              Cheng 4.39 UCI_Elo 1500   5000   2141  1527  1332   427   144
          Discocheck 5.2 UCI_Elo 1500   5000   1947  1308  1745   380   445
               Houdini 3 UCI_Elo 1500   5000   1704  1269  2027   427   174
          Rhetoric 1.4.3 UCI_Elo 1500   5000   1875  1330  1795   360   448
               Hiarcs 14 UCI_Elo 1500   5000   1798  1112  2090   375   685
              Cheese 2.1 UCI_Elo 1500   5000   2138  1532  1330   421   165
              Amyan 1.72 UCI_Elo 1500   5000   1803  1186  2011   460   551

Code: Select all

::Legend::
Total: Number of test positions from human games.
Match: Count of pos, where engine and human move are the same.
High : Count of pos, where engine move is stronger than human move.
Low  : Count of pos, where engine move is weaker than human move.
HACD : High Average Centipawn Difference, engine move is stronger 
       than human move by Centipawn amount, according to Stockfish 2019.04.16.
LACD : Low Average Centipawn Difference, engine move is weaker 
       than human move by Centipawn amount, according to Stockfish 2019.04.16.
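For anyone who wants to compute the same columns on their own data, here is a minimal sketch of how the legend could be turned into code. It is not the script used for the table above. The `Position` record, its field names and the sample values are made up for illustration, and it assumes each position already has a centipawn score for both the engine move and the human move from a reference engine. Counting a different move with an equal score as High is also just an assumption, made so that Match + High + Low adds up to Total.

```python
from dataclasses import dataclass

@dataclass
class Position:
    # Hypothetical record layout, not taken from the original test script.
    engine_move: str
    human_move: str
    engine_cp: int   # reference-engine score of the engine move, in centipawns
    human_cp: int    # reference-engine score of the human move, in centipawns

def summarize(positions):
    """Compute the Total/Match/High/Low/HACD/LACD columns for one engine."""
    match = high = low = 0
    high_diff = low_diff = 0
    for p in positions:
        if p.engine_move == p.human_move:
            match += 1
        elif p.engine_cp >= p.human_cp:
            # Assumption: a different move with an equal score counts as High.
            high += 1
            high_diff += p.engine_cp - p.human_cp
        else:
            low += 1
            low_diff += p.human_cp - p.engine_cp
    return {
        "Total": len(positions),
        "Match": match,
        "High": high,
        "Low": low,
        # Averages are taken only over the positions in each bucket.
        "HACD": round(high_diff / high) if high else 0,
        "LACD": round(low_diff / low) if low else 0,
    }

# Made-up sample data, only to show the call.
sample = [
    Position("e2e4", "e2e4", 25, 25),
    Position("g1f3", "h2h4", 30, -50),
    Position("a2a3", "d2d4", -120, 20),
    Position("d2d4", "c2c3", 100, 60),
]
result = summarize(sample)
print(result)
```

Note that the averages are per bucket, so an engine with few Low positions can still show a large LACD if those few moves are bad blunders.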
Ranking result using TOPSIS. The rank 1 engine is the one that performed best in this test under the given criteria and weights. This is Ufim, followed by DiscoCheck at rank 2 and Cheese at rank 3.

Code: Select all

TOPSIS (mnorm=vector, wnorm=sum) - Solution:
             ALT./CRIT.                Match (max) W.0.6    High (min) W.0.05    Low (max) W.0.05    HACD (min) W.0.2    LACD (max) W.0.1    Rank
------------------------------------  -------------------  -------------------  ------------------  ------------------  ------------------  ------
Deuterium v2019.2.37.59 UCI_Elo 1500         1891                 1360                 1749                426                 284            9
      Ufim v8.02 UCI_Elo 1500                2164                 1360                 1476                447                 423            1
      CT800 V1.34 UCI_Elo 1500               1627                 1164                 2209                329                 914            7
      Arasan 21.3 UCI_Elo 1500               1634                 1178                 2188                491                 440            12
   DanaSah 7.9 Human UCI_Elo 1500            1710                 1195                 2095                434                 324            10
 Stockfish 2019.07.14 UCI_Elo 1500           1304                 1231                 2465                443                 390            13
      Cheng 4.39 UCI_Elo 1500                2141                 1527                 1332                427                 144            5
    Discocheck 5.2 UCI_Elo 1500              1947                 1308                 1745                380                 445            2
       Houdini 3 UCI_Elo 1500                1704                 1269                 2027                427                 174            11
    Rhetoric 1.4.3 UCI_Elo 1500              1875                 1330                 1795                360                 448            6
       Hiarcs 14 UCI_Elo 1500                1798                 1112                 2090                375                 685            4
      Cheese 2.1 UCI_Elo 1500                2138                 1532                 1330                421                 165            3
      Amyan 1.72 UCI_Elo 1500                1803                 1186                 2011                460                 551            8
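If you want to experiment with your own weights, below is a rough pure-Python sketch of TOPSIS. It is not the tool that produced the table above. It assumes that "mnorm=vector" means each criterion column is divided by its Euclidean norm and that "wnorm=sum" means the weights are divided by their sum. The function name `topsis` and the variable names are my own. The numbers, weights and min/max directions are copied from the 5000-position table, and the ranks it prints may not match the posted ones exactly if the original tool differs in some detail.

```python
import math

def topsis(matrix, weights, maximize):
    """Return one closeness score per row (0..1, higher is better)."""
    n = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]                      # sum-normalize weights
    # Vector normalization: divide each column by its Euclidean norm.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[w[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = list(zip(*v))
    # Ideal point takes the best value per criterion, anti-ideal the worst.
    ideal = [max(c) if mx else min(c) for c, mx in zip(cols, maximize)]
    anti = [min(c) if mx else max(c) for c, mx in zip(cols, maximize)]
    scores = []
    for row in v:
        d_best = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))
        d_worst = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, anti)))
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom else 0.0)
    return scores

# Columns: Match, High, Low, HACD, LACD (values from the 5000-position test).
engines = {
    "Deuterium v2019.2.37.59": [1891, 1360, 1749, 426, 284],
    "Ufim v8.02":              [2164, 1360, 1476, 447, 423],
    "CT800 V1.34":             [1627, 1164, 2209, 329, 914],
    "Arasan 21.3":             [1634, 1178, 2188, 491, 440],
    "DanaSah 7.9 Human":       [1710, 1195, 2095, 434, 324],
    "Stockfish 2019.07.14":    [1304, 1231, 2465, 443, 390],
    "Cheng 4.39":              [2141, 1527, 1332, 427, 144],
    "Discocheck 5.2":          [1947, 1308, 1745, 380, 445],
    "Houdini 3":               [1704, 1269, 2027, 427, 174],
    "Rhetoric 1.4.3":          [1875, 1330, 1795, 360, 448],
    "Hiarcs 14":               [1798, 1112, 2090, 375, 685],
    "Cheese 2.1":              [2138, 1532, 1330, 421, 165],
    "Amyan 1.72":              [1803, 1186, 2011, 460, 551],
}
weights = [0.6, 0.05, 0.05, 0.2, 0.1]
maximize = [True, False, True, False, True]   # directions as in the table header

scores = topsis(list(engines.values()), weights, maximize)
ranking = sorted(zip(engines, scores), key=lambda item: item[1], reverse=True)
for rank, (name, score) in enumerate(ranking, start=1):
    print(f"{rank:2d} {name:26s} {score:.3f}")
```

To try other weights, change the `weights` list. They do not need to add up to 1 because the function normalizes them.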