Computer vs. human rating

kasinp
Posts: 264
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Computer vs. human rating

Post by kasinp »

I thought this was a cute example of just how difficult it is to correlate Elo between computers and humans.
On the black side we have Fidelity Chess Challenger 7 (haha, I know :) ), and its next move is the strongest in this position.

Your favorite engine won't have any problems with it.
The question is: can you find it?


[fen]r4rk1/ppp1q1pp/1b1pP3/3n2B1/2B5/5P1P/PPnQ1P2/R4RK1 b - - 0 20[/fen]


If my 1164 rated human opponent (Spacious Mind suggested USCF rating for CC7) played this move I might be getting suspicious :wink:

Cheers,
Peter
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Computer vs. human rating

Post by lkaufman »

kasinp wrote: Tue Apr 19, 2022 5:50 pm I thought this was a cute example of just how difficult it is to correlate Elo between computers and humans.
On the black side we have Fidelity Chess Challenger 7 (haha, I know :) ), and its next move is the strongest in this position.

Your favorite engine won't have any problems with it.
The question is: can you find it?


[fen]r4rk1/ppp1q1pp/1b1pP3/3n2B1/2B5/5P1P/PPnQ1P2/R4RK1 b - - 0 20[/fen]


If my 1164 rated human opponent (Spacious Mind suggested USCF rating for CC7) played this move I might be getting suspicious :wink:

Cheers,
Peter
It's a nice example of a tactic that, for a human who sees it, would indicate a rating much higher than 1164, but that doesn't mean it is hard to correlate engine and human ratings. They have different strengths and weaknesses, to be sure, just as different humans do (more so in the engine case, I admit). But Elo is measured by results at some specified time control; it doesn't matter whether the games are won by simple tactics or by superior positional play. In general, a given engine will perform at a fairly predictable level against a range of human opposition at a given time limit, and that is its (human-scale) Elo. This only breaks down when engines are above 3000 human Elo in strength. Then we have to define Elo either by engine vs. engine games, by engine vs. human handicap games (with specified Elo values for the handicaps), or by some new method.
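To make "Elo is measured by results" concrete, here is a minimal sketch using the common logistic Elo approximation. The function names, the 400-point scale constant, and the bisection bounds are assumptions chosen for illustration; they are not taken from any rating body's exact tables.

```python
def expected_score(rating_a, rating_b):
    """Expected score (0..1) of player A against player B, logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def performance_rating(opponent_ratings, score, lo=0.0, hi=4000.0):
    """Rating at which the expected total score equals the actual score.

    Solved by bisection. `score` is the points scored (win=1, draw=0.5)
    and must lie strictly between 0 and the number of games, because a
    perfect or zero score has no finite solution in this model.
    """
    games = len(opponent_ratings)
    if games == 0 or not 0 < score < games:
        raise ValueError("score must be strictly between 0 and number of games")
    for _ in range(60):
        mid = (lo + hi) / 2
        expected = sum(expected_score(mid, r) for r in opponent_ratings)
        if expected < score:
            lo = mid  # candidate rating too low to explain the result
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Hypothetical series: 3/4 against four 1600-rated opponents.
    print(round(performance_rating([1600] * 4, 3.0)))
```

Note that nothing in the calculation looks at how the points were won, which is the point being made: one brilliant move and one dull win count the same.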
Komodo rules!
kasinp
Posts: 264
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Re: Computer vs. human rating

Post by kasinp »

Larry, your points are well taken of course, and I am sure a statistical analysis based on a longer series of games would bear them out.
What I find striking is just how easy it would be to misjudge CC7's strength on the basis of a "collection of brilliancies" that one could publish from its games :)

Cheers,
Peter
AdminX
Posts: 6363
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Computer vs. human rating

Post by AdminX »

kasinp wrote: Tue Apr 19, 2022 5:50 pm I thought this was a cute example of just how difficult it is to correlate Elo between computers and humans.
On the black side we have Fidelity Chess Challenger 7 (haha, I know :) ), and its next move is the strongest in this position.

Your favorite engine won't have any problems with it.
The question is: can you find it?


[fen]r4rk1/ppp1q1pp/1b1pP3/3n2B1/2B5/5P1P/PPnQ1P2/R4RK1 b - - 0 20[/fen]


If my 1164 rated human opponent (Spacious Mind suggested USCF rating for CC7) played this move I might be getting suspicious :wink:

Cheers,
Peter
That is a nice little tactic for sure. I had to copy the FEN and paste it into my program in order to flip the board to Black's viewing perspective to see it. Otherwise I had a hard time spotting it from the default diagram view.

Image
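For readers without a GUI at hand, flipping the board can be sketched in a few lines of standard-library Python. This is only an illustrative text renderer; the function names are made up for this example and it reads just the piece-placement field of the FEN.

```python
FEN = "r4rk1/ppp1q1pp/1b1pP3/3n2B1/2B5/5P1P/PPnQ1P2/R4RK1 b - - 0 20"


def fen_to_grid(fen):
    """Expand the piece-placement field into 8 rows of 8 squares (rank 8 first)."""
    placement = fen.split()[0]
    grid = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # a digit means that many empty squares
            else:
                row.append(ch)
        if len(row) != 8:
            raise ValueError(f"bad rank in FEN: {rank!r}")
        grid.append(row)
    if len(grid) != 8:
        raise ValueError("FEN must describe exactly 8 ranks")
    return grid


def render(fen, black_view=False):
    """Return a text diagram; with black_view=True, rank 1 is on top and file h on the left."""
    grid = fen_to_grid(fen)
    if black_view:
        grid = [row[::-1] for row in grid[::-1]]
    return "\n".join(" ".join(row) for row in grid)


if __name__ == "__main__":
    print(render(FEN, black_view=True))
```

Uppercase letters are White pieces and lowercase are Black, as in the FEN itself.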
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
kasinp
Posts: 264
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Re: Computer vs. human rating

Post by kasinp »

Thanks Ted! This does look much nicer too.
P.
Fritz 0
Posts: 151
Joined: Fri Mar 11, 2022 12:10 pm
Full name: Branislav Đošić

Re: Computer vs. human rating

Post by Fritz 0 »

It took me 5 minutes to find it, and I am some 850 points higher rated than Fidelity Chess Challenger 7. Engines are tactical monsters.
Uri Blass
Posts: 11126
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Computer vs. human rating

Post by Uri Blass »

Fritz 0 wrote: Tue Apr 19, 2022 8:44 pm It took me 5 minutes to find it, and I am some 850 points higher rated than Fidelity Chess Challenger 7. Engines are tactical monsters.
I found the right move in less than a minute.
I think it is not a hard tactic if you know how to think correctly.
Ajedrecista
Posts: 2165
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Computer vs. human rating.

Post by Ajedrecista »

Hello Peter:
kasinp wrote: Tue Apr 19, 2022 5:50 pm[...]
The question is: can you find it?
[...]
I could not find the move by myself. It was one of the few that I considered, but the tactic was too much for me.

Knowing that Elo values are relative and not absolute, 1164 looks like a valid guess. I found 1110 here; and 1126 and 1236 here, at the same schach-computer.info wiki.

Just for the record, I found a PDF here with some reviews, photos of the circuits... There are many more reviews of other dedicated computers here.

Regards from Spain.

Ajedrecista.
Fritz 0
Posts: 151
Joined: Fri Mar 11, 2022 12:10 pm
Full name: Branislav Đošić

Re: Computer vs. human rating

Post by Fritz 0 »

Uri Blass wrote: Tue Apr 19, 2022 8:57 pm
Fritz 0 wrote: Tue Apr 19, 2022 8:44 pm It took me 5 minutes to find it, and I am some 850 points higher rated than Fidelity Chess Challenger 7. Engines are tactical monsters.
I found the right move in less than a minute.
I think it is not a hard tactic if you know how to think correctly.
If I remember correctly, you are a stronger player than me. Regardless, I realized fairly quickly that the theme is interposing, but it took me forever to find the right piece for that. I am just tactically blind sometimes :( That's why I suck at blitz and play disproportionately badly at rapid compared to classical.
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Computer vs. human rating

Post by lkaufman »

kasinp wrote: Tue Apr 19, 2022 5:50 pm I thought this was a cute example of just how difficult it is to correlate Elo between computers and humans.
On the black side we have Fidelity Chess Challenger 7 (haha, I know :) ), and its next move is the strongest in this position.

Your favorite engine won't have any problems with it.
The question is: can you find it?


[fen]r4rk1/ppp1q1pp/1b1pP3/3n2B1/2B5/5P1P/PPnQ1P2/R4RK1 b - - 0 20[/fen]


If my 1164 rated human opponent (Spacious Mind suggested USCF rating for CC7) played this move I might be getting suspicious :wink:

Cheers,
Peter
This is not such a clean example as we all thought. I checked the position with the latest Dev. Dragon, gradually raising the Elo rating from 1200. I kept raising it, but it kept insisting on playing ...Nf6 (the first move I thought of myself, until I noticed ...Be3). I had to raise it all the way to 2900 (!) before it switched to ...Be3. It seems (with full-strength analysis) that both moves, along with ...Rf6, are favorable to Black, but it takes a lot of detailed analysis to prove that ...Be3 is the better move; it comes down to minor details. So it is probably more or less random luck that a 1200-rated engine chose ...Be3. It's rather easy to pick it if you pose it as a problem, since ...Nf6 is too obvious and boring to be a problem solution, but in an actual Rapid game perhaps even grandmasters would choose ...Nf6, as it is clearly not bad whereas ...Be3 is very complicated. Anyway, this is not a suitable test position due to the alternative(s). It's too easy to pick or miss the right move for the wrong reasons.
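The kind of sweep described above could be organized roughly as follows. This is a sketch under loud assumptions: `choose_move` is a hypothetical callable that you would have to implement yourself (for instance by driving a UCI engine and setting its strength-limit options, which many engines expose as UCI_LimitStrength and UCI_Elo), and `fake_engine` is only a stand-in that mimics the behavior reported in this post, not real engine output.

```python
from collections import Counter


def sweep(choose_move, elos, samples=1):
    """Ask for a move `samples` times at each Elo setting and tally the answers.

    Strength-limited engines are often randomized, so several samples per
    setting give a fairer picture than a single try.
    """
    return {elo: Counter(choose_move(elo) for _ in range(samples)) for elo in elos}


def first_elo_preferring(results, target):
    """Lowest Elo setting at which `target` is the most frequently chosen move."""
    for elo in sorted(results):
        most_common_move, _count = results[elo].most_common(1)[0]
        if most_common_move == target:
            return elo
    return None


def fake_engine(elo):
    """Stand-in only: plays Nf6 below 2900 and Be3 from 2900 up, as reported above."""
    return "Be3" if elo >= 2900 else "Nf6"


if __name__ == "__main__":
    results = sweep(fake_engine, range(1200, 3001, 100), samples=3)
    print(first_elo_preferring(results, "Be3"))
```

With a real engine behind `choose_move`, the tallies per setting would also show whether a low-rated pick of ...Be3 is a stable preference or just noise.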
Komodo rules!