How to rate my engine in CCRL?

No4b
Posts: 105
Joined: Thu Jun 18, 2020 3:21 pm
Location: Moscow
Full name: Alexander Litov

Re: How to rate my engine in CCRL?

Post by No4b »

mvanthoor wrote: Sun Sep 27, 2020 3:14 pm
maksimKorzh wrote: Fri Sep 25, 2020 1:16 pm ...
Hi Maksim,

I don't know if you've seen this already, but Wukong has been rated on CCRL's Blitz list.

Wukong on CCRL

Rating: 1474

Congratulations :)

It is not an advanced chess engine (I know it wasn't ever supposed to be), but its rating gives me good hopes for my own engine :)

Seeing that such a basic engine as Wukong already scores 1474 CCRL, I wonder what one needs to do (or omit) to build an engine that scores in the 1200's...
Well, when I was getting started on my chess variant AI, it was VERY weak. At that time I had yet to watch the VICE videos and had never heard of this forum.
So my engine didn't have quiescence search, made/unmade moves directly on the UI board (one can only imagine how slow it was), and the only move ordering technique it used was my pretty strange implementation of killer moves.
I never measured its strength, but I bet it was pretty low.

Also, I must note that bugs in the code can contribute very significantly to strength loss (I have often fixed bugs that cost around 60 Elo).

Another significant Elo-eater might be an inefficient make/unmake or move generator. For example, my current AI for a Unity game has very convoluted make/unmake functions, because the pieces follow very different movement rules that all have to be handled (e.g. some pieces do not move when they capture something, some do not disappear when promoting but can promote only once per game, some do not disappear when they are captured, etc.). This brings many additional IFs into the code, and I bet it slows the engine down a lot.
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: How to rate my engine in CCRL?

Post by mvanthoor »

No4b wrote: Sun Sep 27, 2020 6:48 pm ...
That seems to be quite a complicated chess variant. Can I play this somewhere/somehow?

(I did try some chess variants, but some just have too many different pieces with too many different capabilities; it becomes hard to remember what piece can actually perform which moves, and in what situations.)
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Gabor Szots
Posts: 1364
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: How to rate my engine in CCRL?

Post by Gabor Szots »

I am just testing BBC. First results do not show +600 compared to TSCP, though.
Gabor Szots
CCRL testing group
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: How to rate my engine in CCRL?

Post by maksimKorzh »

Gabor Szots wrote: Mon Sep 28, 2020 10:53 am I am just testing BBC. First results do not show +600 compared to TSCP, though.
Thank you so much, Gabor.
It shouldn't be +600)))

BBC 1.0 should be about 100-150 Elo stronger than TSCP.
The current development version is already as strong as VICE (after tuning the evaluation), but I need to do lots of tests before releasing the next version.

So version 1.0 should be just somewhat stronger than TSCP.
Also, I played too few games.

How many games did BBC already play?
Did it crash?
Gabor Szots
Posts: 1364
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: How to rate my engine in CCRL?

Post by Gabor Szots »

maksimKorzh wrote: Mon Sep 28, 2020 10:58 am How many games did BBC already play?
Did it crash?
66 games, no crash.

In another thread you wrote it beat TSCP 15,5-0,5. Based upon that and remaining on the cautious side I selected opponents around 2200. I'm going to change that a bit. I assess in the end its rating will be somewhere near 2000.
Gabor Szots
CCRL testing group
No4b
Posts: 105
Joined: Thu Jun 18, 2020 3:21 pm
Location: Moscow
Full name: Alexander Litov

Re: How to rate my engine in CCRL?

Post by No4b »

mvanthoor wrote: Sun Sep 27, 2020 8:52 pm
No4b wrote: Sun Sep 27, 2020 6:48 pm ...
That seems to be quite a complicated chess variant. Can I play this somewhere/somehow?

(I did try some chess variants, but some just have too many different pieces with too many different capabilities; it becomes hard to remember what piece can actually perform which moves, and in what situations.)
Well, it's a Unity game and it's currently a work in progress.
I can PM you a link to a previous test version I made for my friends back in June (there has been some progress since then, but I haven't done everything I wanted yet). The only problem I can see is that all the text describing the pieces' movesets is currently only in Russian, although I suppose I could briefly describe each one.
No4b
Posts: 105
Joined: Thu Jun 18, 2020 3:21 pm
Location: Moscow
Full name: Alexander Litov

Re: How to rate my engine in CCRL?

Post by No4b »

maksimKorzh wrote: Mon Sep 28, 2020 10:58 am
Gabor Szots wrote: Mon Sep 28, 2020 10:53 am I am just testing BBC. First results do not show +600 compared to TSCP, though.
Thank you so much, Gabor.
It shouldn't be +600)))

BBC 1.0 should be about 100-150 Elo stronger than TSCP.
The current development version is already as strong as VICE (after tuning the evaluation), but I need to do lots of tests before releasing the next version.

So version 1.0 should be just somewhat stronger than TSCP.
Also, I played too few games.

How many games did BBC already play?
Did it crash?
I decided to run a quick match of BBC 1.0 against Drofa 1.0 (Linux compile vs Linux compile).

Score of Drofa_v.1.0 vs bbc_1.0_64bit_linux: 13 - 3 - 4 [0.750]
Elo difference: 190.85 +/- 172.61

20 of 20 games finished.
It somewhat confirms your ~100-150 suggestion, although many more games are needed for an accurate result.
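For reference, the Elo difference reported above can be reproduced from the raw score with the usual logistic Elo formula. Here is a minimal Python sketch; the function name `elo_difference` is only for illustration, and it does not reproduce the +/- error margin, which depends on how the match tool estimates the variance.

```python
import math

def elo_difference(wins, losses, draws):
    """Elo difference implied by a match score, seen from the first engine's side."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # a draw counts as half a point
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Drofa's 13 - 3 - 4 result from the match above (score 0.750).
print(f"{elo_difference(13, 3, 4):.2f}")  # prints 190.85
```

With only 20 games the point estimate says little on its own, which is why the error margin is almost as large as the difference itself.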
As I watched some games unfold, it came to my attention that BBC 1.0 has some sort of bug where it prints a 0.00 score even in completely lost positions (see the game below). I suppose it is either a repetition or a TT issue, but it could be excessive pruning as well. If this is not fixed yet, I have a feeling that such a bug may have a really big negative impact on overall strength. If you want, I can PM you an archive with all the games played.

[pgn][Event "bbc_test"]
[Site "?"]
[Date "2020.09.28"]
[Round "1"]
[White "Drofa_v.1.0"]
[Black "bbc_1.0_64bit_linux"]
[Result "1-0"]
[ECO "A40"]
[Opening "Queen's pawn"]
[PlyCount "65"]
[TimeControl "60+1"]

1. d4 {book} e6 {book} 2. c4 {book} d5 {book} 3. Nc3 {book} Bb4 {book}
4. e3 {book} Nf6 {book} 5. Qb3 {book} Bd6 {book} 6. Bd2 {book} c6 {book}
7. Nf3 {book} O-O {book} 8. Bd3 {book} Nbd7 {book} 9. O-O-O {book} a5 {book}
10. c5 {book} Bc7 {book} 11. e4 {+0.53/8 2.5s} e5 {0.00/10 3.0s}
12. Qa4 {+0.36/8 3.9s} Re8 {0.00/9 2.9s} 13. Bg5 {+0.37/8 2.7s} h6 {0.00/9 2.8s}
14. Bh4 {+0.46/8 3.4s} g5 {0.00/9 2.8s} 15. Bg3 {+0.45/8 2.1s}
exd4 {0.00/10 2.7s} 16. Qxd4 {+0.25/9 3.3s} g4 {0.00/10 2.6s}
17. Nd2 {+0.16/8 3.4s} Bxg3 {+0.52/9 2.6s} 18. hxg3 {+0.45/9 2.7s}
Qe7 {+0.50/9 2.5s} 19. exd5 {+0.49/8 3.2s} Qxc5 {+0.62/9 2.5s}
20. Qf4 {+1.39/9 2.3s} Ne5 {0.00/9 2.5s} 21. Qxh6 {+4.54/8 3.0s}
Nxd3+ {-4.21/9 2.4s} 22. Kb1 {+6.33/9 2.9s} Nh7 {-4.55/9 2.4s}
23. Qxh7+ {+7.25/9 1.9s} Kf8 {0.00/9 2.3s} 24. Qh6+ {+7.25/8 2.8s}
Ke7 {-6.85/9 2.3s} 25. Nde4 {+7.79/8 2.7s} Qb4 {0.00/9 2.3s}
26. Rxd3 {+12.32/8 2.0s} Rg8 {-13.49/9 2.2s} 27. d6+ {+20.71/8 1.6s}
Kd7 {-21.83/9 2.2s} 28. Qf6 {+999.89/8 1.4s} Qxe4 {0.00/10 2.1s}
29. Nxe4 {+999.91/8 2.5s} c5 {-M10/11 2.1s} 30. Qxf7+ {+999.93/8 1.9s}
Kc6 {-M8/11 2.1s} 31. Qc7+ {+999.95/8 2.4s} Kb5 {-M6/11 2.0s}
32. Rb3+ {+999.91/8 2.3s} Ka6 {-M4/12 2.0s}
33. Qb6# {+999.99/9 2.2s, White mates} 1-0[/pgn]
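One rough way to find such moves across many games is to scan the move comments for a 0.00 score that directly follows a large score from the opponent. The Python sketch below assumes comments in the `{score/depth time}` form shown in the game above. The function names and the 3.0 pawn threshold are arbitrary choices for illustration. A genuine draw by repetition would also be flagged, so the output is only a hint about where to look.

```python
import re

# One (move, comment) pair per commented move, e.g. ("Kf8", "0.00/9 2.3s").
MOVE_WITH_COMMENT = re.compile(r"(\S+)\s+\{([^}]*)\}")
# Score part of a comment: "+7.25/9", "0.00/10", "-M10/11", ...
SCORE = re.compile(r"([+-]?)(M?)(\d+(?:\.\d+)?)/")

def parse_score(comment):
    """Score in pawns from the mover's point of view, or None (e.g. for "book")."""
    m = SCORE.match(comment)
    if m is None:
        return None
    sign, mate, number = m.groups()
    value = 999.0 if mate else float(number)  # treat mate scores as a huge value
    return -value if sign == "-" else value

def suspicious_zero_scores(movetext, threshold=3.0):
    """Moves scored exactly 0.00 right after the opponent reported a big score."""
    flagged = []
    previous = None  # opponent's score on the preceding ply
    for move, comment in MOVE_WITH_COMMENT.findall(movetext):
        score = parse_score(comment)
        if score == 0.0 and previous is not None and abs(previous) >= threshold:
            flagged.append(move)
        previous = score
    return flagged

sample = ("22. Kb1 {+6.33/9 2.9s} Nh7 {-4.55/9 2.4s} "
          "23. Qxh7+ {+7.25/9 1.9s} Kf8 {0.00/9 2.3s}")
print(suspicious_zero_scores(sample))  # prints ['Kf8']
```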
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: How to rate my engine in CCRL?

Post by maksimKorzh »

No4b wrote: Mon Sep 28, 2020 1:59 pm ...
Yeah, send me PGNs please.
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: How to rate my engine in CCRL?

Post by maksimKorzh »

Gabor Szots wrote: Mon Sep 28, 2020 11:15 am
maksimKorzh wrote: Mon Sep 28, 2020 10:58 am How many games did BBC already play?
Did it crash?
66 games, no crash.
In another thread you wrote it beat TSCP 15,5-0,5. Based upon that and remaining on the cautious side I selected opponents around 2200. I'm going to change that a bit. I assess in the end its rating will be somewhere near 2000.
Thank you, Gabor. Even 2000 seems a bit too high. I think it should be around 1950, because version 1.0 is weaker than VICE, which is around 2000. I'm now fixing bugs and have also improved the evaluation, so the next version should be much stronger.

Re: result vs TSCP
- that was at 30 sec + 0; at 2 min + 1 sec the result should be worse for BBC. This happens due to the difference in search depth, which is more critical at ultra-short time controls.
Gabor Szots
Posts: 1364
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: How to rate my engine in CCRL?

Post by Gabor Szots »

maksimKorzh wrote: Mon Sep 28, 2020 6:03 pm Re: result vs TSCP
- that was at 30 sec + 0; at 2 min + 1 sec the result should be worse for BBC. This happens due to the difference in search depth, which is more critical at ultra-short time controls.
Christophe Théron, author of Chess Tiger, once said: if an engine is sensitive to the time control, then it is badly written. ;)
Gabor Szots
CCRL testing group