Nice! So the estimated FIDE rating of 2846 is almost exactly the middle of the 2800 to 2900 estimated CCRL rating, although of course with only one draw given up the margin of error is huge. Similarly the estimated FIDE rating for MiniHuman based on your formula is close to the CCRL blitz rating we measured for it (now 2233). It seems that CCRL blitz ratings are not so far off from FIDE ratings for blitz games between humans and engines. Do you know at what time control these blitz games were played on LiChess (could be anywhere from 3-0 to 5-3 I think)? Also, are there any engines on your list with Lichess blitz ratings that are standard A/B engines not using nets which have ratings based solely (or almost solely) on playing humans and many opponents within a hundred elo or so?
Ferdy wrote: ↑Fri Dec 24, 2021 10:09 am
Here is the data of its opponents.
lkaufman wrote: ↑Fri Dec 24, 2021 1:33 am
OK, now we're getting somewhere. Let's call it 2850 CCRL based on your statement. Is there a way to determine its score only against titled players (excluding TOS violators if any) and their mean LiChess blitz rating? Then we can compute an approximate performance rating for it and see the difference from the CCRL estimate. Perhaps it won't be much more than the 206 elo calculated for MiniHuman.
carldaman wrote: ↑Fri Dec 24, 2021 1:01 am
Nezh is a private engine, and I can't add much more than what's been stated, since I'm only its tester. It predates NN/NNUE and only uses HCE.
MonteCarlo wrote: ↑Thu Dec 23, 2021 11:52 pm
Per https://talkchess.com/forum3/viewtopic.php?f=2&t=73564, a maybe-conceptually-inspired-by-Rodent private engine.
Not likely to be a derivative of SF, but not a lot of detail to go on.
I wouldn't read too much into its lichess rating. Just an artefact of its opponents. It's +1839 =11 -11 in blitz, with at least one of the losses coming against a TOS-violating opponent.
If it played the same people Carlsen played, it would be much higher rated.
Cheers!
As I mentioned before, its rating would continue to climb if it kept playing more and more decent strength humans.
That has been the undeniable trend and 2900 Elo on lichess does not reflect its ultimate potential, which is probably to be on par or slightly better than the human world champ. That's what its strength projects to be based on private tests against other engines, which point to somewhere around 2800-2900 CCRL.
pgnrating column is the lichess blitz rating of the opp of Nezh-BOT at the time of encounter.
    site                          oppname               pgntitle  pgnrating  TOS  blitzrating  score  Nezh-BOT_score
 0  https://lichess.org/oIYFQfAs  dmitrijiiiGM          CM        2657       0    2727         0.0    1.0
 1  https://lichess.org/HMPDu49k  dmitrijiiiGM          CM        2656       0    2727         0.5    0.5
 2  https://lichess.org/Bw0A7t8l  CXCX83                GM        2653       0    2708         0.0    1.0
 3  https://lichess.org/phaor9Ki  CXCX83                GM        2654       0    2708         0.0    1.0
 4  https://lichess.org/NgzHhi7C  CXCX83                GM        2655       0    2708         0.0    1.0
 5  https://lichess.org/JTH4qCHh  CXCX83                GM        2656       0    2708         0.0    1.0
 6  https://lichess.org/J1vRvjKi  DanPach               FM        2488       0    2592         0.0    1.0
 7  https://lichess.org/jLOpgANG  DanPach               FM        2488       0    2592         0.0    1.0
 8  https://lichess.org/cPEEwdDR  DanPach               FM        2489       0    2592         0.0    1.0
 9  https://lichess.org/3ILuIVsH  DanPach               FM        2489       0    2592         0.0    1.0
10  https://lichess.org/6pNUtbRz  DanPach               FM        2490       0    2592         0.0    1.0
11  https://lichess.org/CDZ1vD3w  JuleVerne             FM        2610       0    2620         0.0    1.0
12  https://lichess.org/VKSXZcFl  HorrendousBrilliancy  GM        2635       0    2929         0.0    1.0
13  https://lichess.org/FDZhQKDZ  justantan             GM        2642       0    2850         0.0    1.0
14  https://lichess.org/KuWdI0F9  juancruzariasTDF      CM        2530       0    2707         0.0    1.0
15  https://lichess.org/OwWgh9ie  Tahaned2015           IM        2386       0    2607         0.0    1.0
16  https://lichess.org/WUewJm1l  jeffforever           FM        2474       0    2617         0.0    1.0
17  https://lichess.org/KvBwiY2T  gmluke                GM        2677       0    2737         0.0    1.0
18  https://lichess.org/CLzKn17H  rickyrich             NM        2339       0    2313         0.0    1.0
19  https://lichess.org/eSiPhCGg  rickyrich             NM        2295       0    2313         0.0    1.0
blitzrating column is the current lichess blitz rating of the opponent of Nezh-BOT.
I tried to calculate its perf using the pgnrating column, since those are the ratings at the time of the actual encounters. Take the mean of the opponents' ratings and convert it to a FIDE rating. Then take Nezh-BOT's score rate and look up the rating difference for that score rate in the FIDE table. The FIDE perf is opp_rating + rating diff; convert that back to a lichess blitz rating to get its blitz rating based on those titled opponents.
Nezh-BOT opp mean lichess blitz rating: 2548 (using pgnrating column)
Nezh-BOT score rate: 0.975
FIDE mean rating conversion: 2308 from 2548 (using fide = 181 + 0.83458 x lichess)
FIDE rating table 8.1a, 0.975 is around rating diff = 538
Nezh-BOT FIDE perf rating = 2308 + 538 = 2846
Nezh-BOT Lichess Blitz rating conversion: 3193 (using lichess = (fide - 181) / 0.83458)
Nezh-BOT Lichess blitz rating is around 3193, based on titled, non-bot and non-TOS-violator opponents.
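The steps of the quoted calculation can be sketched in Python roughly as follows. This is only an illustration: the conversion formula and the rating difference of 538 are taken from the post as given (the FIDE table lookup itself is not reproduced), and all function and variable names are made up for this sketch.

```python
# Sketch of the performance-rating steps quoted above.
# Assumptions: names are illustrative; RATING_DIFF is the value the post
# read off FIDE table 8.1a for a 0.975 score rate, it is not computed here.

# pgnrating and Nezh-BOT score per game, copied from the opponent table
PGN_RATINGS = [2657, 2656, 2653, 2654, 2655, 2656, 2488, 2488, 2489, 2489,
               2490, 2610, 2635, 2642, 2530, 2386, 2474, 2677, 2339, 2295]
NEZH_SCORES = [1.0, 0.5] + [1.0] * 18  # one draw, nineteen wins

RATING_DIFF = 538


def lichess_blitz_to_fide(lichess):
    """Conversion formula given in the thread."""
    return 181 + 0.83458 * lichess


def fide_to_lichess_blitz(fide):
    """Inverse of the formula above."""
    return (fide - 181) / 0.83458


opp_mean = round(sum(PGN_RATINGS) / len(PGN_RATINGS))
score_rate = sum(NEZH_SCORES) / len(NEZH_SCORES)
opp_mean_fide = round(lichess_blitz_to_fide(opp_mean))
fide_perf = opp_mean_fide + RATING_DIFF
lichess_perf = round(fide_to_lichess_blitz(fide_perf))

print(opp_mean, score_rate, opp_mean_fide, fide_perf, lichess_perf)
```

Run as is, this should reproduce the figures in the post (2548, 0.975, 2308, 2846, 3193). With only 20 games and a single draw, the result is very sensitive to the table lookup.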
Lichess Blitz rating to FIDE rating
Moderator: Ras
-
lkaufman
- Posts: 6287
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Lichess Blitz rating to FIDE rating
Komodo rules!
-
lkaufman
- Posts: 6287
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Lichess Blitz rating to FIDE rating
MiniHuman ended up at 2233 under CCRL Blitz conditions, exactly 200 below its lichess blitz rating. However I overlooked one point; looking at the LiChess games, it seems that most of the rated games against strong human opponents have been "slow blitz", with 5' + 3" being very common and perhaps close to average (some games 3' + 3", but some 5' + 4" or 3' + 7" and such). But a bot with fixed number of nodes won't play any better at 5' + 3" than it would at 2' + 1". I suppose the CCRL Blitz rating at 5' + 3" would be about a hundred elo lower. So perhaps we should say that CCRL + 300 is a fair estimate for the lichess blitz rating needed by a human to have an even chance at 5' + 3" blitz, at least in the 2100 CCRL ballpark. Even that seems a bit hard for me to believe, I would think ccrl + 400 seems more realistic, but this is just one datapoint. I think it would be better for this purpose to use a normal engine (not one with a net) that uses the allotted time; any candidates with enough human data?
lkaufman wrote: ↑Fri Dec 24, 2021 12:36 am
So far MiniHuman is at 2227 under CCRL Blitz conditions, just 206 elo below its lichess blitz rating. This is rather surprising to me, since lichess blitz ratings are a class or so above FIDE ratings at this level; in fact 2433 lichess converts to 2212 FIDE by the formula given in this thread. Pretty much a perfect match between FIDE and CCRL blitz for this one engine, allowing for MOE. So assuming that there aren't a lot of cheaters who avoided detection in this sample, the engines may be weaker in blitz than I thought they were around this level; perhaps I overestimated the "contraction" factor going from engine ratings to human ratings? Maybe the relationship between ccrl and lichess is not so far from linear. I think we need some data around the 1500 level or so.
lkaufman wrote: ↑Wed Dec 22, 2021 11:45 pm
We're testing it now vs ccrl engines at ccrl blitz tc. dkappe says the correct number of nodes is 3000, not 2500.
Ferdy wrote: ↑Wed Dec 22, 2021 7:48 am
Profile of MiniHuman BOT. It has only played a single BOT in blitz rated game with 7 games. There are 24 opp players who violates Lichess TOS. Lichess has a mechanism to refund rating lost if opp violates TOS. So rating is not affected in this case.
Ferdy wrote: ↑Wed Dec 22, 2021 2:57 am
There are maia bots in lichess like maia1 has around 1478 blitz. It is just lc0 + maia nn called maia-1100.pb.gz. This can be run with nodes=1 limit, you may use the older versions like Lc0 v0.26. Better to use CPU version to generate more games in shorter time if cpu has more threads.
lkaufman wrote: ↑Tue Dec 21, 2021 5:32 pm
This is interesting and quite reasonable; the roughly 6 to 5 spread in ratings from FIDE to Lichess blitz is about what I would expect due to the smaller percentage of draws in blitz. Is there any data that would allow you to compare CCRL blitz ratings with Lichess blitz ratings? Due to the compression of CCRL ratings by BayesElo and the expansion of LiChess blitz ratings, I would expect that the relationship would be fairly close to linear.
maia9 has around 1770 blitz using maia-1900.pb.gz nn. This can be easily compared with CCRL engines around this strength levels.
There are other lichess bots but reproducing it for CCRL on our PC can be a challenge. maia is easier. I think the authors of those bots can run those under CCRL settings.
There is minihuman by dkappe with a higher blitz rating. Reading the description I think this can be easily adapted for CCRL. It has a limit of around 2500 nodes.
"Mean Girl 8 (32x4) -- the most fun leela-style network -- looking at ~2500 nodes on a Raspberry Pi 3. Will play casual and rated Blitz, Rapid and Classical with at least 3 sec increment in both standard and chess960. Will move almost instantly. Designed to be a reasonable sparring partner. Uses a gambit book for extra entertainment value."
TOS = Terms of Service.
profile of MiniHuman:
userid     bgames  blitzrating  tos  title
MiniHuman  1297    2433         0    BOT
number of unique opp names: 141
unique opp names who violates TOS: 24
unique opp names who do not violates TOS: 117
name of bot opp:
userid      bgames  blitzrating  tos  title
Leela1Node  459     2143         0    BOT
number of games played when opp is a bot: 7
opp with rating 2000 and above that are not TOS violators:
The ratings in the list are the current ratings, the ratings in the actual game may not be the same.
    userid               bgames  blitzrating  tos  title
 0  Kudritsky_Maksim_04  9516    2523         0    no_title
 1  itay121              2692    2467         0    no_title
 2  Maharlikan           582     2344         0    no_title
 3  DOLPHIN_2012         3251    2332         0    no_title
 4  m_kastriot           7733    2294         0    no_title
 5  Thxultra2            10244   2291         0    no_title
 6  schpringer           8563    2280         0    no_title
 7  Karagialis           4374    2261         0    no_title
 8  Michalsos            5501    2259         0    no_title
 9  Wildcard1659         5152    2249         0    no_title
10  the_providence       202     2244         0    no_title
11  skoyen99             805     2243         0    FM
12  Dingdongking         1858    2231         0    no_title
13  completemagnet       301     2226         0    no_title
14  Mai-San_skida        21      2207         0    no_title
15  DeltaZero_99         3201    2204         0    no_title
16  EarlyLight           64      2204         0    no_title
17  kapibarr             7075    2176         0    no_title
18  rutvik3              1187    2176         0    no_title
19  Leela1Node           459     2143         0    BOT
20  never__quit          9388    2120         0    no_title
21  Soni_Atharv          3375    2119         0    no_title
22  MichaelLambert       4017    2109         0    no_title
23  petermac             5861    2107         0    no_title
24  PiecePeace           1451    2103         0    no_title
25  MasterofUnknown      3       2072         0    no_title
26  checkmatetrix        1513    2060         0    no_title
27  Nahia12              1521    2054         0    no_title
28  DanTheMan82          1216    2053         0    no_title
29  Plaskad              112     2035         0    no_title
30  Speed1               28916   2028         0    no_title
31  Vkiller              2401    2026         0    no_title
32  JanHudak             1631    2025         0    no_title
33  Quick_chess65        380     2024         0    no_title
34  sirprimal11          6300    2021         0    no_title
35  Noyar                827     2016         0    no_title
36  Creignor             367     2004         0    no_title
So I guess this is a good candidate to connect to CCRL Blitz, close to this level of play (Lichess Blitz 2400).
Mean Girl net is at https://github.com/dkappe/leela-chess-w ... -style-net. Then get Lc0 to run it at around 2500 nodes per move according to its profile in Lichess. I am not sure which Lc0 version to use, maybe from year 2019 or 2020.
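The selection behind the quoted opponent list (rating 2000 and above, not a TOS violator) could be sketched along these lines. The first four rows are copied from the list; the row named hypothetical_violator is invented purely to exercise the TOS filter, and the function name and its include_bots switch are assumptions of this sketch, not something Ferdy described.

```python
# Sketch: pick opponents rated 2000+ that are not flagged for TOS violations.
# Rows are (userid, bgames, blitzrating, tos, title).
OPPONENTS = [
    ("Kudritsky_Maksim_04", 9516, 2523, 0, "no_title"),
    ("skoyen99", 805, 2243, 0, "FM"),
    ("Leela1Node", 459, 2143, 0, "BOT"),
    ("Creignor", 367, 2004, 0, "no_title"),
    ("hypothetical_violator", 100, 2300, 1, "no_title"),  # made-up example row
]


def eligible(rows, min_rating=2000, include_bots=True):
    """Return userids with rating >= min_rating and tos == 0."""
    selected = []
    for userid, _bgames, rating, tos, title in rows:
        if tos != 0 or rating < min_rating:
            continue
        if title == "BOT" and not include_bots:
            continue
        selected.append(userid)
    return selected


print(eligible(OPPONENTS))
print(eligible(OPPONENTS, include_bots=False))
```

Excluding bots is the variant that matters for a humans-only performance estimate; the quoted list itself still contains Leela1Node.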
Komodo rules!
-
Raphexon
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: Lichess Blitz rating to FIDE rating
simpleEval probably has the most amount of human data, even if it's primarily at bullet. (And a few cheaters here and there)
lkaufman wrote: ↑Fri Dec 24, 2021 8:14 pm
MiniHuman ended up at 2233 under CCRL Blitz conditions, exactly 200 below its lichess blitz rating. However I overlooked one point; looking at the LiChess games, it seems that most of the rated games against strong human opponents have been "slow blitz", with 5' + 3" being very common and perhaps close to average (some games 3' + 3", but some 5' + 4" or 3' + 7" and such). But a bot with fixed number of nodes won't play any better at 5' + 3" than it would at 2' + 1". I suppose the CCRL Blitz rating at 5' + 3" would be about a hundred elo lower. So perhaps we should say that CCRL + 300 is a fair estimate for the lichess blitz rating needed by a human to have an even chance at 5' + 3" blitz, at least in the 2100 CCRL ballpark. Even that seems a bit hard for me to believe, I would think ccrl + 400 seems more realistic, but this is just one datapoint. I think it would be better for this purpose to use a normal engine (not one with a net) that uses the allotted time; any candidates with enough human data?
My 2nd post also has a nice string of titled players, large sample sizes and games longer than bullet.
https://lichess.org/V9jNeEMG
27-1 for Simple vs a GM, though some games are hyperbullet (30 seconds).
https://lichess.org/VDu6zbRB/black
143-5 vs an IM. But a lot / most at hyperbullet.
https://lichess.org/N4Rikz1R
Another strong player, but a lot at hyperbullet again.
https://lichess.org/@/simpleEval
I know it uses a very shallow (and weak) opening book to add variety, but no idea which one specifically. Runs on a Pi4B.
Very "fun" opponent. You're going to get an advantage which you'll likely blunder.
Its positional understanding is horrible, its tactics are bonkers.
Last edited by Raphexon on Fri Dec 24, 2021 8:57 pm, edited 1 time in total.
-
Raphexon
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: Lichess Blitz rating to FIDE rating
Some more results:
https://lichess.org/EbBJ0Iux/black
https://lichess.org/tc50EqKV/black (results suspicious imo)
https://lichess.org/0vxkgPmi/black
A NM who fought a lot of rapid games vs simple. 32.5-2.5 for Simple.
https://lichess.org/S8I7N5qX/black
Another huge sample size of bullet games vs a strongish player
https://lichess.org/mNKL56St/black
2 games at 3 min vs an IM
https://lichess.org/uH5t06ee
18-0 at rapid (10m+5s) vs a strongish player.
https://lichess.org/4rCsfcV2/black
Nice sample size at longer TC vs an ok player.
https://lichess.org/xvPszRqH/black
Another good sample size vs a strong player.
https://lichess.org/oWM0NHDj
Strong blitz player. 9.5-0.5 for Simple.
https://lichess.org/GcsrPB8X/black
Another GM getting crushed by Simple. All games at bullet (1m+0) though.
https://lichess.org/PxtepTYR/black
A few games at 3m+2s vs an FM
https://lichess.org/GAI2Vwkb/black
2 games at 3m+2s from a NM
https://lichess.org/tN2sQnuK/black
Strong player at blitz, large sample at 3m+0.
https://lichess.org/A5hcQopF
Simple going 14-0 at 3m+0 vs a NM
So a lot of games. Includes titled players. From Bullet to Rapid.
-
Cornfed
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Lichess Blitz rating to FIDE rating
I am guessing Larry is trying to pull in a lot of data points to try to back into reasonable approximations for human elo with Dragon?
amanjpro wrote: ↑Wed Dec 22, 2021 4:11 am
Lichess bots are highly underrated. Thanks to all the SF bots that are running on the site and hammer the self-made bots.
But looking at Zahak for example even when it was around CCRL 2100 and running on a raspberry pi 3 (32 bit) it still could beat all human players that it has played.
This would seem perhaps a bit time consuming, but I wonder if he could find a good approximation of his current strength at various time controls, play x number of games against Dragon under a few time controls, and use that for approximation...then attempt to extrapolate by lowering Dragon a couple of hundred points, playing the same number of games and seeing if the results fit somewhat.
Or simply find one weaker engine most agree has a pretty accurate rating under a couple of TC's and how it already corresponds to human elo, let Dragon play against it and adjust based on the result?
At least those are 'controlled experiments' as opposed to all the issues with trying to pull data from websites where...frankly, people don't always play close to their optimal (lack of sleep, playing against 'students', games to pass the time while waiting for dinner to cook, etc)...differences often exacerbated by playing against engines. Engines don't normally miss simple tactics while human vs engine (or human) play would be littered with them so I would think that would be where most of the volatility would come into play...seems like that would be hard to program for.
-
lkaufman
- Posts: 6287
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Lichess Blitz rating to FIDE rating
Since I'm unfamiliar with "simple" and also unfamiliar with the "Pi4b", can you point to any data that would estimate its rating on that hardware when playing against engines with known CCRL or CEGT ratings? I'm looking for an engine with both a reliable engine vs engine rating and a reliable engine vs. humans Lichess rating.
Raphexon wrote: ↑Fri Dec 24, 2021 8:40 pm
https://lichess.org/@/simpleEval
I know it uses a very shallow (and weak) opening book to add variety, but no idea which one specifically. Runs on a Pi4B.
Very "fun" opponent. You're going to get an advantage which you'll likely blunder.
Its positional understanding is horrible, its tactics are bonkers.
Komodo rules!
-
lkaufman
- Posts: 6287
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Lichess Blitz rating to FIDE rating
Yes, you understand my goal. Yes, I do try to estimate dragon level ratings by playing them myself, but I don't want to rely too much on just my personal experience. I have a 2300 blitz rating on Lichess after only 25 games, perhaps it will rise with more experience there. I totally like your quote "Or simply find one weaker engine most agree has a pretty accurate rating under a couple of TC's and how it already corresponds to human elo, let Dragon play against it and adjust based on the result?". But so far no one has named such an engine. Do you have any suggestions?
Cornfed wrote: ↑Fri Dec 24, 2021 9:23 pm
I am guessing Larry is trying to pull in a lot of data points to try to back into reasonable approximations for human elo with Dragon?
amanjpro wrote: ↑Wed Dec 22, 2021 4:11 am
Lichess bots are highly underrated. Thanks to all the SF bots that are running on the site and hammer the self-made bots.
But looking at Zahak for example even when it was around CCRL 2100 and running on a raspberry pi 3 (32 bit) it still could beat all human players that it has played.
This would seem perhaps a bit time consuming, but I wonder he could find a good approximation of his current strength in various time controls and play x amounts of games against Dragon under a few time controls and use that for approximation...then attempt to extrapolate by lowering Dragon a couple of hundred points, playing the same number of games and seeing if the results fit somewhat.
Or simply find one weaker engine most agree has a pretty accurate rating under a couple of TC's and how it already corresponds to human elo, let Dragon play against it and adjust based on the result?
At least those are 'controlled experiments' as opposed to all the issues with trying to pull data from websites where...frankly, people don't always play close to their optimal (lack of sleep, playing against 'students', games to pass the time while waiting for dinner to cook, etc)...differences often exacerbated by playing against engines. Engines don't normally miss simple tactics while human vs engine (or human) play would be littered with them so I would think that would be where most of the volatility would come into play...seems like that would be hard to program for.
Komodo rules!
-
Raphexon
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: Lichess Blitz rating to FIDE rating
simpleEval only uses material evaluation. (P=1, N=3, B=3, R=5, Q=9)
lkaufman wrote: ↑Fri Dec 24, 2021 9:58 pm
Since I'm unfamiliar with "simple" and also unfamiliar with the "Pi4b", can you point to any data that would estimate its rating on that hardware when playing against engines with known CCRL or CEGT ratings? I'm looking for an engine with both a reliable engine vs engine rating and a reliable engine vs. humans LIchess rating.
Raphexon wrote: ↑Fri Dec 24, 2021 8:40 pm
https://lichess.org/@/simpleEval
I know it uses a very shallow (and weak) opening book to add variety, but no idea which one specifically. Runs on a Pi4B.
Very "fun" opponent. You're going to get an advantage which you'll likely blunder.
Its positional understanding is horrible, its tactics are bonkers.
And a small random component (which is very important)
Besides that it's mostly Stockfish. You can play it yourself, you will get an advantage. You'll likely not convert.
https://github.com/vondele/Stockfish/tree/simpleEval
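To make the idea concrete, here is a small Python sketch of a material-only evaluation with a random term. It is not the simpleEval code (that is a Stockfish branch in C++); the piece values come from the post, while the function names, the use of a FEN piece-placement string, and the size of the noise are assumptions made for illustration.

```python
import random

# Piece values from the post; the king is not counted.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_balance(placement):
    """Material from White's point of view for a FEN piece-placement field."""
    total = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits, '/', and kings contribute nothing
        total += value if ch.isupper() else -value
    return total


def noisy_eval(placement, rng, noise=0.1):
    """Material plus a small random term; the noise size is an assumption."""
    return material_balance(placement) + rng.uniform(-noise, noise)


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
NO_BLACK_QUEEN = "rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"

rng = random.Random(0)
print(material_balance(START), material_balance(NO_BLACK_QUEEN))
print(noisy_eval(START, rng))
```

The random term is what lets the search prefer different moves among positions with equal material, which is presumably why it is described as very important.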
Pi4B is the hardware it's running on on Lichess. A Raspberry Pi4B.
I think per core it's about 4x as slow as the i7-4770K CCRL uses as a baseline.
From my testing it's about 50-100 elo weaker than Glaurung 2.2. Also tested it vs Cheese with similar results.
I'd say in the ballpark of 2700 CCRL 40/4.
Taking into account the slower hardware it's using on Lichess I'd expect it to be roughly 2500 CCRL 40/4.
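The hardware adjustment above works out to roughly 100 elo per halving of speed (2700 to 2500 for a 4x slowdown). A minimal Python sketch of that arithmetic follows; the 100-per-doubling figure is only what the two estimates in the post imply, not a measured value, and the function name is made up.

```python
import math


def adjust_for_hardware(rating, slowdown, elo_per_doubling=100.0):
    """Lower a rating estimate for hardware that is `slowdown` times slower.

    elo_per_doubling is an assumption implied by the post's 2700 -> 2500
    estimate for a 4x slower machine.
    """
    return rating - elo_per_doubling * math.log2(slowdown)


print(adjust_for_hardware(2700, 4))
```

The real gain per speed doubling varies with engine and time control, so this is a rough guide only.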
-
lkaufman
- Posts: 6287
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Lichess Blitz rating to FIDE rating
The results on lichess suggest a blitz rating there over 3000 and a rapid rating not far below 3000. These are higher numbers than I would expect for a ccrl 2500 engine. It seems that chess knowledge helps more vs engines than vs humans, which seems counterintuitive.
Raphexon wrote: ↑Fri Dec 24, 2021 10:15 pm
simpleEval only uses material evaluation. (P=1, N=3, B=3, R=5, Q=9)
lkaufman wrote: ↑Fri Dec 24, 2021 9:58 pm
Since I'm unfamiliar with "simple" and also unfamiliar with the "Pi4b", can you point to any data that would estimate its rating on that hardware when playing against engines with known CCRL or CEGT ratings? I'm looking for an engine with both a reliable engine vs engine rating and a reliable engine vs. humans LIchess rating.
Raphexon wrote: ↑Fri Dec 24, 2021 8:40 pm
https://lichess.org/@/simpleEval
I know it uses a very shallow (and weak) opening book to add variety, but no idea which one specifically. Runs on a Pi4B.
Very "fun" opponent. You're going to get an advantage which you'll likely blunder.
Its positional understanding is horrible, its tactics are bonkers.
And a small random component (which is very important)
Besides that it's mostly Stockfish. You can play it yourself, you will get an advantage. You'll likely not convert.
https://github.com/vondele/Stockfish/tree/simpleEval
Pi4B is the hardware it's running on on Lichess. A Raspberry Pi4B.
I think per core it's about 4x as slow as the i7-4770K CCRL uses as a baseline.
From my testing it's about 50-100 elo weaker than Glaurung 2.2. Also tested it vs Cheese with similar results.
I'd say In the ballpark of 2700 CCRL 40/4.
Taking into account the slower hardware it's using on Lichess I'd expect it to be roughly 2500 CCRL 40/4.
Komodo rules!
-
Cornfed
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Lichess Blitz rating to FIDE rating
No, but...
lkaufman wrote: ↑Fri Dec 24, 2021 10:04 pm
Yes, you understand my goal. Yes, I do try to estimate dragon level ratings by playing them myself, but I don't want to rely too much on just my personal experience. I have a 2300 blitz rating on Lichess after only 25 games, perhaps it will rise with more experience there. I totally like your quote "Or simply find one weaker engine most agree has a pretty accurate rating under a couple of TC's and how it already corresponds to human elo, let Dragon play against it and adjust based on the result?". But so far no one has named such an engine. Do you have any suggestions?
Cornfed wrote: ↑Fri Dec 24, 2021 9:23 pm
I am guessing Larry is trying to pull in a lot of data points to try to back into reasonable approximations for human elo with Dragon?
amanjpro wrote: ↑Wed Dec 22, 2021 4:11 am
Lichess bots are highly underrated. Thanks to all the SF bots that are running on the site and hammer the self-made bots.
But looking at Zahak for example even when it was around CCRL 2100 and running on a raspberry pi 3 (32 bit) it still could beat all human players that it has played.
This would seem perhaps a bit time consuming, but I wonder he could find a good approximation of his current strength in various time controls and play x amounts of games against Dragon under a few time controls and use that for approximation...then attempt to extrapolate by lowering Dragon a couple of hundred points, playing the same number of games and seeing if the results fit somewhat.
Or simply find one weaker engine most agree has a pretty accurate rating under a couple of TC's and how it already corresponds to human elo, let Dragon play against it and adjust based on the result?
At least those are 'controlled experiments' as opposed to all the issues with trying to pull data from websites where...frankly, people don't always play close to their optimal (lack of sleep, playing against 'students', games to pass the time while waiting for dinner to cook, etc)...differences often exacerbated by playing against engines. Engines don't normally miss simple tactics while human vs engine (or human) play would be littered with them so I would think that would be where most of the volatility would come into play...seems like that would be hard to program for.
It's almost Christmas and my mind is elsewhere, but don't your answers pretty much depend on 2 things:
1. A single established correlation between some version of Dragon and a good human player and
2. Your ability to adjust Dragon's playing strength with suitable tweaks?
Example: establish a series of tweaks so that you have a version of Dragon which scores roughly 50% against a 2500 elo human. A baseline. The data is already out there for the expected percentage your average 2500 human is likely to score in standard OTB chess against your average human rated 2700, 2400, 2200, etc.
You then run a series of matches between this ‘2500’ elo version of Dragon and various other tweaked versions of Dragon. Then you comb through the results to find versions which score at roughly the same rate against your 2500 Dragon as various human elo would against your average 2500 human.
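As a rough sketch of the matching step, the target score rates against a 2500 baseline can be computed from the usual logistic Elo expectation and compared with match results. Note the assumptions: the logistic formula is an approximation (FIDE publishes its own conversion table), the version names and match scores below are invented placeholders, and the helper names are made up for this sketch.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectation for `rating` playing `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def closest_version(match_scores, target):
    """Name of the version whose score rate is nearest to the target."""
    return min(match_scores, key=lambda name: abs(match_scores[name] - target))


BASELINE = 2500  # the Dragon version tuned to score ~50% against a 2500 human

# Score rate a human of each rating would be expected to make against 2500
targets = {elo: expected_score(elo, BASELINE) for elo in (2700, 2400, 2200)}

# Hypothetical score rates of tweaked versions against the baseline version
match_scores = {"dragon_a": 0.77, "dragon_b": 0.35, "dragon_c": 0.14}

for elo, target in targets.items():
    print(elo, round(target, 3), closest_version(match_scores, target))
```

Whether engine-vs-engine score rates translate to human-vs-human ones this directly is exactly the open question of the thread, so the mapping is only a starting point.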