What engine breaks even with GMs in blitz?

lkaufman
Posts: 4415
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: What engine breaks even with GMs in blitz?

Post by lkaufman » Tue Apr 09, 2019 10:04 pm

Laskos wrote:
Tue Apr 09, 2019 8:48 pm
lkaufman wrote:
Tue Apr 09, 2019 6:28 pm
lkaufman wrote:
Tue Apr 09, 2019 4:48 pm

I ran the same test as you did overnight, except that I ran at the actual 45' + 15" level under discussion instead of your 15' + 5" level. So far I have five wins for Arasan 14.3, no wins for Lc0 11248, one draw, and the current game pretty clearly a draw so call it two draws. My 2080 is about 20% faster than your 2070 we determined, but perhaps my 4.9 GHz i7 is also a bit faster than yours? Anyway, it seems that tripling the time limit made a big difference, 6-1 instead of 6-4.
Much to my surprise, Lc0 won that "drawn" game, making the score 1.5-5.5, not 1-6. Lc0 had a lone queen against bishop, knight, and three pawns, and so I assumed (and the evals indicated) that Lc0 would seek perpetual check. But somehow it picked up all three of the pawns, one by one over many moves, and won the queen vs. two minors endgame (no TBs used).
Might start looking similar to my result, although I expect Arasan to perform worse at 45' + 15'' than at 15' + 5'' (and your result will probably show that). That endgame you describe seems a bit funny.

I am getting quite interesting results with ThothFish, an SF derivative that can be adjusted to like or dislike swapping pieces to a desired degree. I am playing with some parameters at fast TC, and got a "weak" (small number of nodes) ThothFish that very much likes swapping pieces and heavily outperforms the regular "weak" (small number of nodes) SF, both being a knight up against the "strong" (and handicapped) Lc0 11248. Adjusted in this way, SF can probably model a human to some extent too.

So the Arasan results are surely not the final word.
Well, Lc0 won the last (8th) game vs. Arasan, so the final score was 2.5 to 5.5 for Lc0 giving knight odds at 45' + 15", pretty much what we would expect based on your result at 15' + 5". The ThothFish test sounds interesting; I wonder how strong the incentive to exchange (especially queens) should be for optimum results at knight odds. I suppose though that it's not perfect in that it might still try to trade even if it loses back the piece. For example if it will pay a pawn to trade queens, that might be okay when up a full piece but is certainly not ok when up a piece for two pawns.
Komodo rules!

lkaufman
Posts: 4415
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: What engine breaks even with GMs in blitz?

Post by lkaufman » Wed Apr 10, 2019 4:41 am

mwyoung wrote:
Tue Apr 09, 2019 12:59 pm
mwyoung wrote:
Tue Apr 09, 2019 6:05 am
lkaufman wrote:
Tue Apr 09, 2019 5:21 am
mwyoung wrote:
Tue Apr 09, 2019 4:59 am
mwyoung wrote:
Tue Apr 09, 2019 4:38 am
lkaufman wrote:
Sun Apr 07, 2019 4:32 pm
In blitz (let's say 3' + 2" or as close to this as possible), the top engines today are far beyond human level. But how far down the list do we have to go to find engines (and specified hardware) that score evenly against GMs, preferably ones with known identities and ratings? I'm sure there is plenty of data to answer this question as countless games have been played online over the years, but does anyone actually have some data, such as "Engine xyz on one thread scored 50% against GMs averaging 2600 FIDE" for example? The question I'd like to answer is: How much would we have to add to CCRL blitz ratings to estimate the FIDE blitz rating of a human GM who would score 50% against it at 3' + 2"?
Hello Larry,

It looks like you are going to have to go down to the bottom of the list.

Here is a news report from 1994 about how Fritz 2 beat all the world's best in 5-minute blitz games.

https://www.independent.co.uk/arts-ente ... 38085.html

In 1994, at the time of the news report, the best processor was the Pentium.

And the report says: "When Intel sponsored the World Chess Express Challenge in Munich last Friday, they could never have hoped for such a good advertisement for their high-speed Pentium processor. It turned a good computer - Fritz 2 - into a world beater."

March 1994:
Intel introduces and ships faster Pentium chips, based on 0.6 micron BiCMOS manufacturing. The processor now includes clock-doubling of 1.5 or 2 times the external clock rate, allowing processor speeds of up to 100 MHz on a 50-66 MHz system bus. The processor also includes power management capabilities to allow stopping and restarting the processor. Code-name during development was P54C. The 60/90 MHz Pentium 735 processor is rated at 149.8 MIPS, and is priced at US$849 in 1000 unit quantities. The 66/100 MHz Pentium 815 processor is rated at 166.3 MIPS, and is priced at US$995 in 1000 unit quantities. [205.98] [265] [62] [550.29] [551.168,259] [557.134] [584.43] [689.115] [276]
I found 2 games of Fritz 2 from 1992, in which it scored 1-1 against GM Kasparov. Fritz could have been playing on a 386 or 486 processor in 1992. My guess would be the 486.



I didn't remember these two events, but I suppose it makes sense that Fritz 2 would perform at maybe 2750 or so at blitz overall, mostly on a Pentium, because Rexchess performed in the 2500s around 1990 on a 486, and Fritz 2 was later and stronger. Considering the hardware advancement since the Pentium, I suppose that the estimates of raising CCRL blitz ratings by 500 for FIDE blitz rating equivalence should be revised upward quite a bit. I don't know if I even have any engine weak enough to play at the same level on my 5 GHz i7 as Fritz 2 did on a Pentium! Well, there are always the handicapped levels of Komodo; one of them must be suitable. But I'd need a weak enough engine to run it against to determine which level that would be! Any suggestions of engines of that level that are easy to download and problem-free?
I have tested all of these in the past, and they have worked. And they are the right vintage...
Put your laptop on power saving mode, and/or use less time.

http://rebel13.nl/windows/rebel's%20with%20uci.html


MGP 1993.jpg
I looked up the ratings for Fritz 2 and Gideon pro in the 1993 computer chess reports. Gideon pro was rated about 100 Elo better than Fritz 2, tested on a 486 with 4 MB of HT.
I picked Pawny 0.2 x64 as my current-decade substitute for Fritz 2. It is 2385 CCRL blitz; the list doesn't go back as far as Fritz 2, but by looking at the other Fritz versions and extrapolating backwards I would guess that this would be a pretty fair match on a modern computer. So presumably that would mean that even running at 0.1 GHz (instead of my nearly 5 GHz) it would be about the level of the Pentium that performed around 2750 in blitz with Fritz 2, without even considering that an i7 should be much better than a Pentium at the same speed (can anyone estimate that?). Komodo level 19 is losing to Pawny at 3' + 2" at full speed but only by 104 Elo after 24 games, which means it should be something like the level that Fritz 2 would have achieved with a 25 to 1 speedup from what it had in 1994! That would mean it would crush even Carlsen at blitz, but that doesn't seem to be right, as it has just mixed results vs. Naka and MVL at "slow blitz". Something seems wrong here, not sure what. Anyway, Lc0 11248 totally crushes Komodo level 19 giving it knight odds, even though Komodo does know to exchange major pieces when up a knight.
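
As an illustration of where a figure like "104 Elo after 24 games" comes from, here is a minimal sketch of the usual logistic Elo conversion between a score fraction and a rating gap. The 8.5/24 score used in the example is an assumption (it is simply a result that maps to roughly that gap), not a score reported in this thread, and the function names are made up for the sketch.

```python
import math


def elo_diff(score_fraction: float) -> float:
    """Elo gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


def expected_score(diff: float) -> float:
    """Expected score fraction for a player rated `diff` Elo above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


if __name__ == "__main__":
    # Assumed example: 8.5 points out of 24 games (hypothetical match score).
    print(round(elo_diff(8.5 / 24), 1))
```

With only 24 games the uncertainty on such an estimate is large, so it is better read as a rough indication than as a measurement.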
Komodo rules!

jp
Posts: 1411
Joined: Mon Apr 23, 2018 5:54 am

Re: What engine breaks even with GMs in blitz?

Post by jp » Wed Apr 10, 2019 4:50 am

Laskos wrote:
Tue Apr 09, 2019 1:43 pm
jp wrote:
Tue Apr 09, 2019 12:45 pm
Uri Blass wrote:
Tue Apr 09, 2019 9:42 am
I do not see a reason to assume that LC0 with adjusting the time control can mimic a 2800 fide level human opponent better than A-B engines.
I guess that LC0 knows too many things in its evaluation that 2800 GMs do not know.

I guess that if A-B engines fail to mimic 2800 humans because of a relatively stupid evaluation then lc0 nets fail to mimic 2800 humans because of a relatively stupid search.
People have suggested running tests to see whether we can tell the difference between Lc0 & AB engines' games just looking at them. It's not clear you can, at least if you don't look at their worst sequences of moves.
In general, for play away from the borders of the game, one probably has to be above 2000-2100 FIDE to see the differences clearly. But if you put a specified Lc0 net in a pool of regular engines, playing from the standard opening position to the outcome by chess rules (no adjudications), there are some markers even I can see after several hours of studying the play of that Lc0 net. Opening choices and very late endgames are probably among the best markers.

Lc0 and regular engines DO differ A LOT, especially seen in test-suites.
I think their bad play might give them away, but if I'm deliberately selecting good play then the test might be much harder. e.g. We know how Lc0 goes in endgames and its opening choices, but I'm not sure people can tell its attacking middlegame wins from AB ones.

Raphexon
Posts: 340
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: What engine breaks even with GMs in blitz?

Post by Raphexon » Wed Apr 10, 2019 6:03 am

lkaufman wrote:
Wed Apr 10, 2019 4:41 am
I picked Pawny 0.2 x64 as my current-decade substitute for Fritz 2. It is 2385 CCRL blitz; the list doesn't go back as far as Fritz 2, but by looking at the other Fritz versions and extrapolating backwards I would guess that this would be a pretty fair match on a modern computer. So presumably that would mean that even running at 0.1 GHz (instead of my nearly 5 GHz) it would be about the level of the Pentium that performed around 2750 in blitz with Fritz 2, without even considering that an i7 should be much better than a Pentium at the same speed (can anyone estimate that?). Komodo level 19 is losing to Pawny at 3' + 2" at full speed but only by 104 Elo after 24 games, which means it should be something like the level that Fritz 2 would have achieved with a 25 to 1 speedup from what it had in 1994! That would mean it would crush even Carlsen at blitz, but that doesn't seem to be right, as it has just mixed results vs. Naka and MVL at "slow blitz". Something seems wrong here, not sure what. Anyway, Lc0 11248 totally crushes Komodo level 19 giving it knight odds, even though Komodo does know to exchange major pieces when up a knight.
Somebody benched Stockfish 6 on old hardware. (Newer versions don't work anymore)
So if you have SF6, you can compare results.

http://www.talkchess.com/forum3/viewtopic.php?t=63857

"Intel Pentium I 75Mhz 6200nps/7400nps External cache 256kb COMP=i586 (Command bench=7465nps)"

A modern i7 should be at least 8 times as fast at the same clock speed. (conservative estimate)
A more liberal estimate of mine is that a modern i7 would be around 15 times as fast at 0.1 GHz. (single core)

lkaufman
Posts: 4415
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: What engine breaks even with GMs in blitz?

Post by lkaufman » Wed Apr 10, 2019 3:42 pm

Raphexon wrote:
Wed Apr 10, 2019 6:03 am
Somebody benched Stockfish 6 on old hardware. (Newer versions don't work anymore)
So if you have SF6, you can compare results.

http://www.talkchess.com/forum3/viewtopic.php?t=63857

"Intel Pentium I 75Mhz 6200nps/7400nps External cache 256kb COMP=i586 (Command bench=7465nps)"

A modern i7 should be at least 8 times as fast at the same clock speed. (conservative estimate)
A more liberal estimate of mine is that a modern i7 would be around 15 times as fast at 0.1 GHz. (single core)
Thanks. So if we use 10x as a compromise estimate, that means that my laptop is about 500 times faster than the hardware that Fritz 2 used to place ahead of everyone but Kasparov in blitz? But this is crazy; that would imply that Fritz 2 or a similarly rated engine like Pawny 0.2 x64 on my laptop would easily win 100% of its blitz games against Magnus Carlsen. Does anyone believe that? What is wrong here?
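
The arithmetic behind the "about 500 times" figure can be sketched as follows. The clock speeds (roughly 5 GHz against 0.1 GHz) and the per-clock factors (8, 10 and 15) are the ones quoted in this thread; the function names are made up for the illustration, and nothing here says how many Elo a doubling is worth, since that depends on the engine and is not given above.

```python
import math


def hardware_speedup(clock_ghz: float, reference_clock_ghz: float,
                     per_clock_factor: float) -> float:
    """Overall speed ratio: clock ratio times the assumed per-clock advantage."""
    return (clock_ghz / reference_clock_ghz) * per_clock_factor


def doublings(speedup: float) -> float:
    """Number of speed doublings that the ratio corresponds to."""
    return math.log2(speedup)


if __name__ == "__main__":
    # Per-clock factors: conservative, compromise and liberal estimates.
    for factor in (8, 10, 15):
        ratio = hardware_speedup(5.0, 0.1, factor)
        print(f"per-clock x{factor}: ~{ratio:.0f}x overall, "
              f"{doublings(ratio):.2f} doublings")
```

Turning the doublings into a rating gain would need an Elo-per-doubling figure for the specific engine, which tends to shrink as search depth grows.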
Komodo rules!

Raphexon
Posts: 340
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: What engine breaks even with GMs in blitz?

Post by Raphexon » Wed Apr 10, 2019 6:11 pm

lkaufman wrote:
Wed Apr 10, 2019 3:42 pm
Thanks. So if we use 10x as a compromise estimate, that means that my laptop is about 500 times faster than the hardware that Fritz 2 used to place ahead of everyone but Kasparov in blitz? But this is crazy; that would imply that Fritz 2 or a similarly rated engine like Pawny 0.2 x64 on my laptop would easily win 100% of its blitz games against Magnus Carlsen. Does anyone believe that? What is wrong here?
Maybe Pawny scales really badly with extra nodes/time.
And I assume Fritz used an opening book, which should help somewhat.

Either way, I tested Igel vs the Play Magnus app* (age 28) with Igel searching to depth 15.
Igel trivially beat Magnus as black. But I also realized that depth 15 for Igel isn't very blitz-like.
Maybe I'll try again tomorrow and just limit nodes per move for Igel instead of using a set depth.
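
The difference between a fixed depth and a node limit comes down to which UCI `go` command is sent to the engine. A minimal sketch, assuming the engine honours the standard `go depth`, `go nodes` and `go movetime` limits; the helper names are made up, and the nodes-per-second and seconds-per-move values are placeholders rather than measurements of Igel.

```python
def go_command(depth=None, nodes=None, movetime_ms=None) -> str:
    """Build a UCI 'go' command with exactly one search limit."""
    limits = {"depth": depth, "nodes": nodes, "movetime": movetime_ms}
    given = [(name, value) for name, value in limits.items() if value is not None]
    if len(given) != 1:
        raise ValueError("specify exactly one of depth, nodes, movetime_ms")
    name, value = given[0]
    return f"go {name} {int(value)}"


def node_budget(nodes_per_second: float, seconds_per_move: float) -> int:
    """Nodes the engine would search in the given time at the given speed."""
    return int(nodes_per_second * seconds_per_move)


if __name__ == "__main__":
    print(go_command(depth=15))
    # Placeholder numbers: 7465 nps for 5 seconds per move.
    print(go_command(nodes=node_budget(7465, 5)))
```

A node limit keeps the effort per move constant regardless of position, whereas a depth limit can take wildly different amounts of time in the opening and in the endgame.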

*Uses a modified Glaurung set at a specific strength + opening book to mimic Magnus.

But I can believe it.
Engines have a massive advantage in Blitz.

jdart
Posts: 4035
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: What engine breaks even with GMs in blitz?

Post by jdart » Wed Apr 10, 2019 6:30 pm

I think the important point here is just that you want the engine mimicking the GM to fully appreciate the importance of simplifying when up a piece.
Arasan for a long time had an eval term like this. 14.3 has trade down code but it is not like the more recent versions and may be less effective.

However, soon after I started auto-tuning the parameters, the "trade down" bonus got tuned down to zero. I am not sure what the reason for this is, but one thing that comes to mind is that Arasan has quite a bit of special-case knowledge of specific material imbalances, as well as tablebase support, and if you have that code, a general trade down bonus becomes less useful.

--Jon

Ferdy
Posts: 4308
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: What engine breaks even with GMs in blitz?

Post by Ferdy » Thu Apr 11, 2019 2:20 am

lkaufman wrote:
Tue Apr 09, 2019 3:47 am
Ferdy wrote:
Tue Apr 09, 2019 1:44 am
lkaufman wrote:
Sun Apr 07, 2019 4:32 pm
In blitz (let's say 3' + 2" or as close to this as possible), the top engines today are far beyond human level. But how far down the list do we have to go to find engines (and specified hardware) that score evenly against GMs, preferably ones with known identities and ratings? I'm sure there is plenty of data to answer this question as countless games have been played online over the years, but does anyone actually have some data, such as "Engine xyz on one thread scored 50% against GMs averaging 2600 FIDE" for example? The question I'd like to answer is: How much would we have to add to CCRL blitz ratings to estimate the FIDE blitz rating of a human GM who would score 50% against it at 3' + 2"?
I am currently working on this via a similarity approach.

1. Collect games from TWIC 2017 to 2019

2. Save games played by players with rating 2500 to 2700 both white and black.

3. Read these games and save position for similarity testing with following criteria
a. Evaluate the move in the game by running stockfish at multipv 2 at 1s per pos, single thread, save bs1 and bs2 (bestscore1 from multipv1, and bestscore2 from multipv2 respectively).
b. bs1 should not be too bad and not too good either, bs1 >= -100cp and bs1 <= 100cp
c. Don't save position if there is a clear best move, bs1 - bs2 > 100cp
d. Don't save position if side to move is in check
e. Save position in a game randomly at a maximum of 5 pos in every game. It can be 0 depending on the generated move numbers.

skip this pos, move num 12 is not in pre-generated move random numbers for this game [97, 31, 26, 81, 84]
4. Once saved position reaches 6000 stop the position generation.

5. Get some UCI engines, preferring those that support the go movetime command and otherwise those that support go infinite.

6. Let the engines analyze those saved gm positions and use sim3 to generate similarity info. Those engines that have a high similarity will be the candidates for selection.
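
Criteria 3b to 3e above can be sketched as a simple filter. This is only an illustration under assumptions: the scores are taken to be centipawns from the side to move, the upper bound of 100 for the random move numbers is a guess based on the logged sample, and the function names are made up for the sketch.

```python
import random


def pick_move_numbers(max_move: int = 100, count: int = 5, rng=None) -> list:
    """Pre-generate the move numbers at which a position may be saved (3e).

    max_move=100 is an assumption; the post only shows a sample of 5 numbers.
    """
    rng = rng or random.Random()
    return rng.sample(range(1, max_move + 1), count)


def keep_position(bs1_cp: int, bs2_cp: int, in_check: bool,
                  move_number: int, chosen_moves) -> bool:
    """Return True if the position passes criteria 3b to 3e."""
    if move_number not in chosen_moves:   # 3e: only pre-selected move numbers
        return False
    if in_check:                          # 3d: side to move is in check
        return False
    if not -100 <= bs1_cp <= 100:         # 3b: best score not too good or too bad
        return False
    if bs1_cp - bs2_cp > 100:             # 3c: there is a clear best move
        return False
    return True


if __name__ == "__main__":
    chosen = [97, 31, 26, 81, 84]
    print(keep_position(20, -10, False, 12, chosen))  # move 12 was not selected
    print(keep_position(20, -10, False, 31, chosen))
```

Because many games end before move 26, the random move numbers alone already cause a game to contribute fewer than 5 positions, sometimes none.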
Although this does sound interesting, I don't think that similarity in move choice between human and computer is a reasonable way to predict which computer would be of the same strength as the human. The difference between engine and human play is too large for this to work. It's not particularly obvious whether the engine that played most like a 2600 would win or lose to him, only that there is no reason to expect a close match. My guess is that the occasional human blunder would mean that the 2600 human would lose badly to the similar engine, but there could be factors working the other way too.
This is just another method to compare against the other methods of estimation. It has the advantage of comparing 6000 root positions, rather than the couple of games we perhaps have against those GMs.

If there is a position where the GM blunders, then those engines that also prefer the blunder move would increase their similarity to the GM. Hence such an engine could be a good candidate to have the same strength as the GM. Note that the degree of blunder is also controlled: when the Stockfish best score is way too good compared to the GM move score, that position will not be saved in the sim test positions.

if engine_score - gm_score > 300cp don't save the position.
So there are sub-optimal GM moves in the sim test, up to 3 pawns worse (I have not actually counted how many there are, but this can be verified since the sim epd is available). This is designed to separate weak and strong engines under test. Weaker engines have a higher probability of choosing these moves.

There is actually a better way of collecting positions for the sim test: collect GM positions where the GM moves are sub-optimal within a given threshold for around 20% (or some other percentage) of the set, then use another threshold for another percentage.

If the GM move is 3 pawns worse, save it to the sim test, but only up to, say, 2% of the total positions to be collected.
Else if the GM move is 2.5 pawns worse, save it to the sim test, but only up to, say, 4%.
And so on. The idea is to give the engine under sim test more opportunities to vary from the GM moves.
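The quota idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the two tiers (3 pawns at 2%, 2.5 pawns at 4%) and the 6000 target come from this post, while the reading of a tier as "loss of at least that many centipawns", the uncapped handling of smaller losses, and all names are hypothetical.

```python
from collections import Counter

TARGET_TOTAL = 6000  # number of positions to collect

# (minimum loss in centipawns, cap as a percentage of TARGET_TOTAL).
# Further tiers ("and so on") are not specified, so none are invented here.
TIERS = [(300, 2), (250, 4)]


def tier_for(loss_cp):
    """Return (tier key, absolute cap), or (None, None) if no tier applies."""
    for min_loss, percent in TIERS:
        if loss_cp >= min_loss:
            return min_loss, percent * TARGET_TOTAL // 100
    return None, None


def collect(candidates):
    """candidates: iterable of (position_id, loss_cp) pairs."""
    saved = []
    used = Counter()
    for position_id, loss_cp in candidates:
        if len(saved) >= TARGET_TOTAL:
            break
        tier, cap = tier_for(loss_cp)
        if tier is not None:
            if used[tier] >= cap:
                continue  # this tier's quota is already full
            used[tier] += 1
        saved.append(position_id)  # losses below every tier are not capped
    return saved, used


demo = [("p1", 310), ("p2", 260), ("p3", 20)]
saved, used = collect(demo)
print(saved, dict(used))
```

Using integer percentages keeps the caps exact (120 and 240 positions for a 6000-position set).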

It is also possible to give different reward weights to similarity. For example, if the engine chooses the blunder move that is 2 pawns worse, it can get a similarity of 1.1 instead of 1. Most engines will probably not blunder that much, but if one does, a higher similarity will be awarded. This takes a lot of work, though it can be automated.
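A minimal sketch of such a weighted score, assuming similarity is the weighted share of positions where the engine plays the GM move. This is not how sim3 computes its numbers; the normalization, the 200 cp cutoff for the bonus, and the function names are assumptions made for illustration. Only the 1.1 versus 1 weighting comes from the paragraph above.

```python
def match_weight(loss_cp):
    """Weight of a match: 1.1 if the GM move loses 2 pawns or more, else 1."""
    return 1.1 if loss_cp >= 200 else 1.0


def weighted_similarity(gm_moves, engine_moves, losses_cp):
    """Percentage-style score; it can exceed 100 because of the bonus weight."""
    total = 0.0
    for gm_move, engine_move, loss_cp in zip(gm_moves, engine_moves, losses_cp):
        if gm_move == engine_move:
            total += match_weight(loss_cp)
    return 100.0 * total / len(gm_moves)


gm = ["b8a6", "b6d4", "f3d2", "a2a4"]
engine = ["b8a6", "b6d4", "e2e4", "a2a4"]
losses = [0, 200, 0, 50]  # centipawns lost by each GM move
print(weighted_similarity(gm, engine, losses))
```

Here three of the four moves match and one of them carries the bonus, so the weighted score comes out slightly above the plain 75% match rate.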

The similarity.data file would start with an entry from the GM.

Code: Select all

{Human GM 2600 (time: 5000 ms scale: 1.0)} b8a6 b6d4 f3d2 a2a4 a3b2 g8f8 a6a5 c8c4 c3a4 ...
...
I have revised the collection of positions: a position saved in the sim test must now come from blitz games of those 2500 to 2700 GMs.

I am testing the engines (single thread) at 5 s/pos (more realistic for a TC of 3m+2s than the 0.1 s or 0.2 s/pos used to detect engine similarity) on the 6000 positions in gm2600_sim.epd, using an i7-2600K 3.4 GHz PC.

The rating after the engine version is taken from the CEGT 40/4 rating list. I have been unable to access the CCRL rating list for the past 2 days.

Not much but this is the result so far.

Code: Select all

sim version 3
------ Human GM 2600 (time: 5000 ms scale: 1.0) ------
 56.52  Monolith 0.4 2427 (time: 5000 ms scale: 1.0)
 55.67  Hermann 2.8 2349 (time: 5000 ms scale: 1.0)
On schedule:

Giraffe 20150908, 2243
Cheese 1.5, 2346
Shield 2.1, 2560
Floyd 0.9, 2410
Marvin 2.0.0, 2324
Ethereal 8.16, 2459
Rhetoric 1.4, 2541
Nebula 2.0, 2475
Zurichess Geneva, 2317
Fridolin 3.00, 2482

Re: What engine breaks even with GMs in blitz?

Post by lkaufman » Thu Apr 11, 2019 3:23 am

Raphexon wrote:
Wed Apr 10, 2019 6:11 pm
lkaufman wrote:
Wed Apr 10, 2019 3:42 pm
Raphexon wrote:
Wed Apr 10, 2019 6:03 am
lkaufman wrote:
Wed Apr 10, 2019 4:41 am
mwyoung wrote:
Tue Apr 09, 2019 12:59 pm
mwyoung wrote:
Tue Apr 09, 2019 6:05 am
lkaufman wrote:
Tue Apr 09, 2019 5:21 am
mwyoung wrote:
Tue Apr 09, 2019 4:59 am
mwyoung wrote:
Tue Apr 09, 2019 4:38 am
lkaufman wrote:
Sun Apr 07, 2019 4:32 pm
In blitz (let's say 3' + 2" or as close to this as possible), the top engines today are far beyond human level. But how far down the list do we have to go to find engines (and specified hardware) that score evenly against GMs, preferably ones with known identities and ratings? I'm sure there is plenty of data to answer this question as countless games have been played online over the years, but does anyone actually have some data, such as "Engine xyz on one thread scored 50% against GMs averaging 2600 FIDE" for example? The question I'd like to answer is: How much would we have to add to CCRL blitz ratings to estimate the FIDE blitz rating of a human GM who would score 50% against it at 3' + 2"?
Hello Larry,

It looks like you are going to have to go down to the bottom of the list.

Here is a news report from 1994 about how Fritz 2 won against all the world's best in 5-minute blitz games.

https://www.independent.co.uk/arts-ente ... 38085.html

In 1994, at the time of the news report, the best processor was the Pentium.

And the report says: "When Intel sponsored the World Chess Express Challenge in Munich last Friday, they could never have hoped for such a good advertisement for their high-speed Pentium processor. It turned a good computer - Fritz 2 - into a world beater."

March 1994:
Intel introduces and ships faster Pentium chips, based on 0.6 micron BiCMOS manufacturing. The processor now includes clock-doubling of 1.5 or 2 times the external clock rate, allowing processor speeds of up to 100 MHz on a 50-66 MHz system bus. The processor also includes power management capabilities to allow stopping and restarting the processor. Code-name during development was P54C. The 60/90 MHz Pentium 735 processor is rated at 149.8 MIPS, and is priced at US$849 in 1000 unit quantities. The 66/100 MHz Pentium 815 processor is rated at 166.3 MIPS, and is priced at US$995 in 1000 unit quantities. [205.98] [265] [62] [550.29] [551.168,259] [557.134] [584.43] [689.115] [276]
I found 2 games of Fritz 2 from 1992, scoring 1-1 against GM Kasparov. Fritz could have been running on a 386 or 486 processor in 1992; my guess would be the 486.



I didn't remember these two events, but I suppose it makes sense that Fritz 2 would perform maybe 2750 or so at blitz overall, mostly on a Pentium, because Rexchess performed in the 2500s around 1990 on a 486, and Fritz 2 was later and stronger. Considering the hardware advancement since the Pentium, I suppose that the estimates of raising CCRL blitz ratings by 500 for FIDE blitz rating equivalence should be revised upward quite a bit. I don't know if I even have any engine weak enough to play the same level on my 5 GHz i7 as Fritz 2 did on a Pentium! Well, there are always the handicapped levels of Komodo; one of them must be suitable. But I'd have to have a weak enough engine to run it against to determine which level that would be! Any suggestions of engines of that level that are easy to download and problem-free?
I have tested all of these in the past, and they have worked. And they are the right vintage...
Put your laptop in power-saving mode and/or use less time.

http://rebel13.nl/windows/rebel's%20with%20uci.html


MGP 1993.jpg
I looked up the ratings for Fritz 2 and Gideon Pro in the 1993 computer chess reports. Gideon Pro was rated about 100 Elo better than Fritz 2, tested on a 486 with 4 MB of HT.
I picked Pawny 0.2 x64 as my current-decade substitute for Fritz 2. It is 2385 CCRL blitz; the list doesn't go back as far as Fritz 2, but by looking at the other Fritz versions and extrapolating backwards I would guess that this would be a pretty fair match on a modern computer. So presumably that would mean that even running at 0.1 GHz (instead of my nearly 5 GHz) it would be about the level of the Pentium that performed around 2750 in blitz with Fritz 2, without even considering that an i7 should be much better than a Pentium at the same speed (can anyone estimate that?). Komodo level 19 is losing to Pawny at 3' + 2" at full speed but only by 104 Elo after 24 games, which means it should be something like the level that Fritz 2 would have achieved with a 25 to 1 speedup from what it had in 1994! That would mean it would crush even Carlsen at blitz, but that doesn't seem to be right, as it has just mixed results vs. Naka and MVL at "slow blitz". Something seems wrong here, not sure what. Anyway, Lc0 11248 totally crushes Komodo level 19 giving it knight odds, even though Komodo does know to exchange major pieces when up a knight.
Somebody benched Stockfish 6 on old hardware. (Newer versions don't work anymore)
So if you have SF6, you can compare results.

http://www.talkchess.com/forum3/viewtopic.php?t=63857

"Intel Pentium I 75Mhz 6200nps/7400nps External cache 256kb COMP=i586 (Command bench=7465nps)"

A modern i7 should be like 8+ times as fast at the same clock speed. (conservative estimate)
A more liberal estimate of mine is that a modern i7 would be around 15 times as fast at 0.1 GHz. (single core)
Thanks. So if we use 10x as a compromise estimate, that means that my laptop is about 500 times faster than the hardware that Fritz 2 used to place ahead of everyone but Kasparov in Blitz? But this is crazy, that would imply that Fritz 2 or a similar rated engine like Pawny 0.2 x64 on my laptop would easily win 100% of the blitz games from Magnus Carlsen. Does anyone believe that? What is wrong here?
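The arithmetic behind the "about 500 times" figure can be checked with a short sketch. The 4.9 GHz laptop clock and the 100 MHz Pentium clock are assumptions taken from figures mentioned elsewhere in this thread, and the per-clock factor of 10 is the compromise estimate above; the function name is hypothetical.

```python
import math

PENTIUM_MHZ = 100      # assumed: the 0.1 GHz figure used in this thread
LAPTOP_MHZ = 4900      # assumed: "nearly 5 GHz"
PER_CLOCK_FACTOR = 10  # compromise between the 8x and 15x estimates


def speedup(new_mhz, old_mhz, per_clock_factor):
    """Clock ratio multiplied by the per-clock efficiency gain."""
    return (new_mhz / old_mhz) * per_clock_factor


ratio = speedup(LAPTOP_MHZ, PENTIUM_MHZ, PER_CLOCK_FACTOR)
doublings = math.log2(ratio)
print(f"speedup ~ {ratio:.0f}x, about {doublings:.1f} doublings")
```

That is 49 x 10 = 490, so close to 500, or a little under 9 doublings of speed. How many Elo points each doubling is worth for a given engine is a separate question that this sketch does not try to answer.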
Maybe Pawny scales really badly with extra nodes/time.
And I assume Fritz used an opening book, which should help somewhat.

Either way, I tested Igel vs the Play Magnus app* (age 28) with Igel searching to depth 15.
Igel trivially beat Magnus as black. But I also realized depth 15 for Igel isn't very blitz.
Maybe I'll try again tomorrow and just limit nodes per move for Igel instead of a set depth.

*Uses a modified Glaurung set at a specific strength + opening book to mimic Magnus.

But I can believe it.
Engines have a massive advantage in Blitz.
Pawny is rated close to what I estimate Fritz 2 would be on CCRL reference hardware, not on an ancient Pentium. As for opening book I was assuming the modern engine would also use one, which would presumably be much better than any book from 1994.
I just assumed that something called "Play Magnus" was just a gimmick; is there solid reason to believe that it has actually proven itself equal to Magnus at some level or levels and if so what time limit was he playing under? Of course it also matters whether he actually had incentive to win if such games were played. Presumably it's best to set an opposing engine to time plus inc rather than nodes or depth, since that is the normal standard for chess these days.
The score for Lc0 11248 (on 2080) vs. Pawny 0.2 (on 5 GHz i7) at 3' + 2" giving knight odds was 15.5 to 4.5 for Lc0. So Lc0 totally crushes an engine at knight odds which it would seem from the discussion should overwhelm even Magnus at 3' + 2".
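For reference, match scores like these can be converted to Elo differences with the standard logistic Elo formula. The sketch below is only a point estimate under that model; it ignores the large error margins of 20 or 24 games, and the function names are hypothetical.

```python
import math


def elo_diff_from_score(points, games):
    """Elo difference implied by a match score under the logistic model."""
    p = points / games
    return 400 * math.log10(p / (1 - p))


def expected_score(elo_diff):
    """Expected score fraction for a player who is elo_diff points stronger."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


# Lc0 (giving knight odds) vs. Pawny: 15.5 points out of 20 games.
print(round(elo_diff_from_score(15.5, 20)))
# Score fraction corresponding to the 104 Elo deficit mentioned earlier.
print(round(expected_score(-104), 3))
```

A 15.5 to 4.5 result works out to roughly 215 Elo in favor of Lc0 in that knight-odds pairing, and a 104 Elo deficit corresponds to scoring a little over 35%.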
Komodo rules!

Re: What engine breaks even with GMs in blitz?

Post by xr_a_y » Thu Apr 11, 2019 5:48 am

If some of you want to challenge minic online ... : https://lichess.org/@/Minic-chess_engine
