ccrl hardware adjustment question

Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: ccrl hardware adjustment question

Post by Dann Corbit »

I guess that the main difficulty in calibration is lack of data.
I think a lot of really great chess players don't understand the hardware and software very well either.
I guess that if they knew how to prepare against machines, they could perform much better.
But if we want a realistic estimate of how engines compare to humans, we need 1000 games between a human with a well-established rating and an engine with a well-established rating.
There is no way that will ever be achieved.
Unless Larry spends a year playing chess against a computer. I guess he has better things to do, like live a life.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: ccrl hardware adjustment question

Post by lkaufman »

Dann Corbit wrote: Tue May 12, 2020 3:46 am I guess that the main difficulty in calibration is lack of data.
I think a lot of really great chess players don't understand the hardware and software very well either.
I guess that if they knew how to prepare against machines, they could perform much better.
But if we want a realistic estimate of how engines compare to humans, we need 1000 games between a human with a well-established rating and an engine with a well-established rating.
There is no way that will ever be achieved.
Unless Larry spends a year playing chess against a computer. I guess he has better things to do, like live a life.
I'm not particularly interested in the rating a particular engine would have if its opponents got to play it constantly and find weaknesses. I'm interested in what engine rating would make an even match with a top human of a given level if he knows nothing specific about the engine, just that it is a typical A/B engine of about his level. So as few games between specific opponents as possible suits me. Of course I know all about margin of error, so I know that 18 games won't give you a very precise rating, but it's considered enough for a World Championship, so it should tell if one of the two players is significantly stronger than the other. As for playing them myself, I'm clearly not the player I was a decade ago, so I'm hardly a good reference point. I do have some solid reasons for wanting to know if my estimates are accurate/realistic, but I don't want to get into that topic now. As for living a life, I can still remember back in the old days (meaning over two months ago), when that was possible!
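
A rough sketch of the margin-of-error point, assuming games are independent and using a normal approximation on the average score. The function names and the example results (4 wins, 10 draws, 4 losses, and the same proportions over 1000 games) are illustrative assumptions, not data from any actual match.

```python
import math

def elo_from_score(score):
    """Convert an average score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Approximate 95% interval (low, high) for the Elo difference of a match."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score around the observed mean.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    # Clamp so the logistic conversion stays defined for lopsided results.
    low = max(mean - z * se, 1e-6)
    high = min(mean + z * se, 1.0 - 1e-6)
    return elo_from_score(low), elo_from_score(high)

# Hypothetical drawn 18-game match versus the same proportions over 1000 games.
print(elo_interval(4, 10, 4))
print(elo_interval(222, 556, 222))
```

Under these assumptions the 18-game interval is roughly ±110 Elo, against roughly ±15 Elo for 1000 games, so a short match can only separate players who differ by a large margin.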
Komodo rules!
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: ccrl hardware adjustment question

Post by MikeB »

Modern Times wrote: Tue May 12, 2020 12:20 am OK so I was about right with my estimate of twice the speed.
Yes, a very good estimate.
MMarco
Posts: 195
Joined: Sun Apr 12, 2020 1:09 am
Full name: Marc-O Moisan-Plante

Re: ccrl hardware adjustment question

Post by MMarco »

I don't think we should infer Elo from the last Kramnik match, because of the "blunder of the century" (https://en.m.wikipedia.org/wiki/Blunder ... ir_Kramnik ) and because Kramnik tried to win with black to even the match (but lost again).

For a human, it is easier to draw twice than win once against Fritz 10. Maybe Carlsen can draw twice (out of two games, say) against a 3200 CCRL-rated engine if he can reach the kind of position he needs from the opening at classical time control. In my opinion Kramnik did outplay Fritz 10 in some games, but lost trying to win.

As for a future match of Komodo NN, I would be more interested in a non-odds match on very slow hardware (a Raspberry Pi or the equivalent of an old Pentium) than in an odds match. Or maybe a depth-limited match, d=5 or d=6. I think Lc0 (with 591226) with d=6 would beat Carlsen at classical time control, but I'd like to see that (or Komodo NN of course!). d=5 would be even less clear.
Gabor Szots
Posts: 1362
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: ccrl hardware adjustment question

Post by Gabor Szots »

I had a Q6600 overclocked to 3 GHz and my Crafty benchmark was 31, so I used a 40/3 TC, but it could have been 40/2:35 if GUIs had allowed that.
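
The adjustment described here is a proportional rule: the base time shrinks in step with the benchmark time. A minimal sketch follows, where the function names and the reference values (240 seconds of base time at a reference benchmark of 48 seconds) are assumptions chosen only because they reproduce the 2:35 figure from a benchmark of 31. They are not taken from CCRL documentation.

```python
def scaled_base_time(reference_seconds, reference_benchmark, my_benchmark):
    """Scale a base time in proportion to how fast this machine runs the benchmark."""
    return reference_seconds * my_benchmark / reference_benchmark

def format_mmss(seconds):
    """Render a duration in seconds as m:ss."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"

# Assumed reference: 40 moves in 240 s on a machine whose benchmark takes 48 s.
adjusted = scaled_base_time(240, 48, 31)
print(format_mmss(adjusted))  # 2:35
```

A GUI that only accepts whole minutes forces rounding to 40/3, which gives the faster machine slightly more time than the proportional rule would.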
Gabor Szots
CCRL testing group
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: ccrl hardware adjustment question

Post by lkaufman »

MMarco wrote: Tue May 12, 2020 8:58 am I don't think we should infer Elo from the last Kramnik match, because of the "blunder of the century" (https://en.m.wikipedia.org/wiki/Blunder ... ir_Kramnik ) and because Kramnik tried to win with black to even the match (but lost again).

For a human, it is easier to draw twice than win once against Fritz 10. Maybe Carlsen can draw twice (out of two games, say) against a 3200 CCRL-rated engine if he can reach the kind of position he needs from the opening at classical time control. In my opinion Kramnik did outplay Fritz 10 in some games, but lost trying to win.

As for a future match of Komodo NN, I would be more interested in a non-odds match on very slow hardware (a Raspberry Pi or the equivalent of an old Pentium) than in an odds match. Or maybe a depth-limited match, d=5 or d=6. I think Lc0 (with 591226) with d=6 would beat Carlsen at classical time control, but I'd like to see that (or Komodo NN of course!). d=5 would be even less clear.
Kramnik was a bit unlucky in that final match, but on the other hand he got unprecedented advantages, notably the right to see the opponent's opening book DURING THE GAME! I agree the earlier three matches were a better indication, especially since the tied scores meant that draws favored neither side. Yes, much easier for the weaker player (human or engine) to draw twice than to win once; this is why only the ratings of engines that faced equally strong human opponents are fully reliable; beyond that they have to be extrapolated with assumptions.
I'm sure that for any strong NN, there is some setting for hardware, time limit, or depth that would make for an even match with Carlsen at any specified time limit, and I agree that it would be an interesting match to watch. But the result wouldn't be very interesting. How significant is it to know whether a specific network needs d=5 or d=6 (or exactly how many minutes) to beat the champ? Depth in particular has no real meaning; it is an arbitrary number for NNs. There is some interest in whether limited hardware, such as a cellphone, can defeat the champ, but I'm pretty sure that the answer is already Yes for Stockfish and Komodo, so if and when an NN can also do so is not so critical.
Komodo rules!
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: ccrl hardware adjustment question

Post by carldaman »

A couple of points to Larry:

The time controls were not exactly what we'd now call 'rapid' on the early CCRL hardware.

And, weren't the CCRL slow ratings 'adjusted' by 100 Elo points downward a few years ago?
That means the CCRL ratings were higher across the board by that much for a few years after those SuperGM vs engine matches happened.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: ccrl hardware adjustment question

Post by lkaufman »

carldaman wrote: Tue May 12, 2020 8:45 pm A couple of points to Larry:

The time controls were not exactly what we'd now call 'rapid' on the early CCRL hardware.

And, weren't the CCRL slow ratings 'adjusted' by 100 Elo points downward a few years ago?
That means the CCRL ratings were higher across the board by that much for a few years after those SuperGM vs engine matches happened.
I don't see how either of these points matters. I'm talking about how engines would perform on CURRENT hardware in Rapid play (or in slow games where I say so, but I accept that the relative engine ratings would be a bit different at such time controls). The level of CCRL ratings in the past is quite irrelevant. I'm comparing CURRENT CCRL ratings with estimated FIDE ratings; it has nothing to do with what CCRL ratings might have been in the past. The fact that they were lowered by 100 Elo in the past might explain why we need to increase them for FIDE comparison purposes now, but I'm not looking for an explanation, just for how much adjustment is needed.
Komodo rules!
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: ccrl hardware adjustment question

Post by Alayan »

lkaufman wrote: Tue May 12, 2020 5:45 pm There is some interest in whether limited hardware, such as a cellphone, can defeat the champ, but I'm pretty sure that the answer is already Yes for Stockfish and Komodo, so if and when an NN can also do so is not so critical.
Pocket Fritz already crushed GMs in tournament play 10 years ago running on a mobile phone. It was much weaker software and much weaker hardware.

Our estimate at the SF11 release was that with a modern quad-core, SF could give 1:1000 time odds to Carlsen at classical TC, which drops SF down into the range where it can't properly exploit its scaling potential.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: ccrl hardware adjustment question

Post by lkaufman »

Alayan wrote: Tue May 12, 2020 9:29 pm
lkaufman wrote: Tue May 12, 2020 5:45 pm There is some interest in whether limited hardware, such as a cellphone, can defeat the champ, but I'm pretty sure that the answer is already Yes for Stockfish and Komodo, so if and when an NN can also do so is not so critical.
Pocket Fritz already crushed GMs in tournament play 10 years ago running on a mobile phone. It was much weaker software and much weaker hardware.

Our estimate at the SF11 release was that with a modern quad-core, SF could give 1:1000 time odds to Carlsen at classical TC, which drops SF down into the range where it can't properly exploit its scaling potential.
Well, crushing GMs is not the same as playing Carlsen; after all, even I am a GM. But it did have a performance rating above Carlsen's rating, so even if that was partly luck (i.e. small sample), I agree that it would not be a close contest today with either Stockfish or Komodo on a good cellphone vs. Carlsen. Based on my limited experience and data, I think that your 1000 to 1 estimate for time odds is a bit on the high side, but even if the right number is half that, 500, it doesn't change the conclusion. Matches between engines and humans need some objective handicap beyond just time/hardware etc. to be interesting, in my opinion. Other than material handicaps, I think that Draw odds plus White odds plus Castling odds is the most promising option for a 2800 level human opponent, perhaps with some time odds as well.
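
One way to see why 500 versus 1000 changes little is to count time odds in doublings rather than as a raw ratio. The sketch below does that. The Elo-per-doubling figure is a purely hypothetical parameter, and treating it as constant is a simplification, since the gain per doubling is not the same at very short and very long thinking times.

```python
import math

def doublings(time_ratio):
    """Number of times the thinking time must double to cover the ratio."""
    return math.log2(time_ratio)

def elo_handicap(time_ratio, elo_per_doubling):
    """Rough Elo cost of a time handicap, assuming a constant gain per doubling."""
    return doublings(time_ratio) * elo_per_doubling

for ratio in (500, 1000):
    # 50 Elo per doubling is an illustrative assumption, not a measured value.
    print(ratio, round(doublings(ratio), 2), round(elo_handicap(ratio, 50)))
```

Halving the ratio from 1000 to 500 removes exactly one doubling out of about ten, so the two estimates differ by a single step on this scale.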
Komodo rules!