Next Komodo vs. GM handicap match - "Standard chess"

Laskos · Post by **Laskos** » Thu Jun 16, 2016 6:49 pm

lkaufman wrote:
I agree with your analysis except for the three-move book part, which I believe is worth quite a bit more than 60 Elo. If we make it 100 Elo, this would make the conditions fair for World number 2 or 3 (Kramnik and Caruana) based on your analysis. So I'll tell Sergey that I think his chances are similar to what his chances would be against Caruana.
Another way to approach the problem is to start with the single-core ratings of Komodo 10 on the three best-known blitz lists (CCRL, CEGT, IPON), which are supposed to be rough estimates of what top engines would get in FIDE competition. Then there is no need to estimate the 24-core to 1-core dropoff. But I think the answer will be similar.

The book issue is very tricky: if the human knows how to exploit it, then a 100 Elo point handicap is realistic. The issue is hard to resolve even in engine-engine matches. For example, having no book at all, although it loses by, say, 60 Elo points against the same engine with a good book, might still be better than having a bad book. A bad book may lose against a good book by more than 100 Elo points, because the better book might be tuned against the bad book. With this issue taken into account, we would agree that this set-up of Komodo should be around 2800 Elo points, and Erenburg is expected to score 2 draws if he exploits the book issue.
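To put numbers like "60 vs. 100 Elo" and "2 draws in 4 games" on the same scale, here is a minimal sketch using the standard logistic Elo expectation formula. It is not necessarily the calculation used in this thread; the function names are made up for illustration, and no player ratings are assumed beyond what is stated above.

```python
import math

def expected_score(elo_diff: float) -> float:
    """Expected per-game score for a side rated elo_diff points above its opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_diff_from_score(score: float) -> float:
    """Inverse of expected_score: rating gap implied by an average per-game score."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Two draws in four games is 1 point out of 4, i.e. 25% for the human.
human_score = (2 * 0.5) / 4
gap = elo_diff_from_score(human_score)
print(f"rating gap implied by 2 draws in 4 games: {gap:.0f} Elo")

# How much a 60 or 100 Elo book handicap shifts the expected per-game score.
for handicap in (60, 100):
    print(f"{handicap} Elo edge -> expected score {expected_score(handicap):.3f}")
```

The implied gap is negative because it is measured from the human's side: a 25% score corresponds to the human being rated well below the handicapped engine set-up.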

About smartphones: a mainstream iPhone 6S is faster than 1 core of your laptop, and the NPS of that 1 core should be close to that of most modern mid-range smartphones. Top smartphones are twice as fast. Komodo's NPS is roughly equal to that of Stockfish, and the "benchmark smartphones" thread here gives the following Stockfish NPS for the top smartphones (on all smartphone cores):

Samsung S7 Edge (Exynos, 8 cores 64 bits, MM)           3.036.000   NEW 
Huawei P9 Plus (Kirin 955 8 Cores 64 bits, MM)          3.022.000   NEW 
Huawei P9 (Kirin 955 8 cores 64 bits, MM)               2.945.000   NEW 
Samsung S6 Edge + (8 cores 64 Bit Exynos)               2.505.000 
Samsung Galaxy Note 5                                   2.500.000 
iPad Pro (A9X, 2 cores @2.26 Ghz 64 bits)               2.480.000 
Samsung S6 (8 cores, 64bit Exynos)                      2.042.000 
Iphone 6S                                               2.024.000
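As a rough illustration of how far apart these devices are, the sketch below parses the NPS figures from the table (dots are thousands separators) and expresses each as a multiple of the iPhone 6S figure and as a number of speed doublings. The choice of the iPhone 6S as baseline is an assumption made only for this example, and the helper name `parse_nps` is hypothetical.

```python
import math

# NPS strings copied from the table above; dots are thousands separators.
nps_text = {
    "Samsung S7 Edge": "3.036.000",
    "Huawei P9 Plus": "3.022.000",
    "Huawei P9": "2.945.000",
    "Samsung S6 Edge +": "2.505.000",
    "Samsung Galaxy Note 5": "2.500.000",
    "iPad Pro": "2.480.000",
    "Samsung S6": "2.042.000",
    "Iphone 6S": "2.024.000",
}

def parse_nps(text: str) -> int:
    """Convert a figure such as '3.036.000' into an integer node count."""
    return int(text.replace(".", ""))

baseline = parse_nps(nps_text["Iphone 6S"])  # assumed reference device

for device, text in nps_text.items():
    ratio = parse_nps(text) / baseline
    # log2(ratio) is the number of speed doublings relative to the baseline
    print(f"{device:22s} {ratio:4.2f}x  ({math.log2(ratio):+.2f} doublings)")
```

Converting doublings into Elo would need an Elo-per-doubling figure, which is deliberately left out here because it depends on the engine and time control.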

Uri Blass · Post by **Uri Blass** » Fri Jun 17, 2016 10:23 am

The question is what you mean by a good book.

A good book can have killer lines against the same engine with no book,
so I think that it can be worth clearly more than 60 Elo against no book in a direct match.

Laskos · Post by **Laskos** » Fri Jun 17, 2016 10:34 am

Maybe. Usually book tourneys or engine rooms like the Playchess Engine Room are book against book, and a newer book1 tuned against an older book2 can have more than a 100 Elo point advantage. Either I didn't see tuning against an engine with no book, or tuning against no book is harder. IIRC, I saw only about 60 Elo points against no book in book tourneys and in tests.

nimh · Post by **nimh** » Fri Jun 17, 2016 2:09 pm

In addition to the calculations below, one also must take into account the role of anti-computer strategy. How successful a human is at employing anti-computer strategy depends on the depth of the engine's search. In earlier times computers were especially vulnerable, because slow hardware was further aggravated by a weak search function. The evaluation function has virtually no significance, as it is relatively crude even in modern top programs. So, I think that the human would score better than Laskos' calculations would indicate.

Pawn-odds games, in contrast, have a completely different outlook. Removing a pawn allows the engine to develop more quickly, making the subsequent positions more tactical and less suitable for humans; this is the inverse of the result of anti-computer strategy. For that reason the results of pawn-odds games favor engines more than one would infer from theory.

Nordlandia · Post by **Nordlandia** » Fri Jun 17, 2016 4:08 pm

I thought anti-computer strategy doesn't work against present-day engines.

But considering this handicap it may work somewhat.

lkaufman · Post by **lkaufman** » Fri Jun 17, 2016 4:13 pm

nimh wrote:
Pawn-odds games, in contrast, have a completely different outlook. Removing a pawn allows the engine to develop more quickly, making the subsequent positions more tactical and less suitable for humans; this is the inverse of the result of anti-computer strategy. For that reason the results of pawn-odds games favor engines more than one would infer from theory.

All of the single-pawn handicap games Komodo has played with GMs have had Komodo remove the "f" pawn, which does not aid development; it hinders it because of worries about Qh5 check or h4-h5. So this argument is only relevant for the two-pawn handicaps we have tried.

nimh · Post by **nimh** » Fri Jun 17, 2016 4:32 pm

Nordlandia wrote:
I thought anti-computer strategy doesn't work against present-day engines.

But considering this handicap it may work somewhat.

It doesn't work because of more efficient search and fast hardware. If we eliminate them, there shouldn't be much difference. At least that's what I think.

JJJ · Post by **JJJ** » Fri Jun 17, 2016 4:34 pm

To me, the no-book handicap isn't worth more than 40 Elo, because the human won't be as well prepared as a computer with a book.

nimh · Post by **nimh** » Fri Jun 17, 2016 4:35 pm

lkaufman wrote:
All of the single-pawn handicap games Komodo has played with GMs have had Komodo remove the "f" pawn, which does not aid development; it hinders it because of worries about Qh5 check or h4-h5. So this argument is only relevant for the two-pawn handicaps we have tried.

I'm not fully convinced by this argument; after castling the rook will stand on the open file, which means there's nevertheless a slight tendency towards open play.

mbabigian · Post by **mbabigian** » Fri Jun 17, 2016 5:24 pm

Considering all of the calculations below in this thread and the seeming agreement between those involved, I would find it much more interesting if Komodo's time were set to 1'+0.3" or 45"+0.25". Otherwise this is yet another match where the engine is relatively overpowered compared to the human.

If we see a more balanced match it is also somewhat easier to estimate the true strength of the computer at this particular speed versus a match that ends 4 to zero.

Think of it like IQ tests. If you make the test so weak that most people can answer all of the questions correctly, you can't determine anything. Same with chess problem sets. I'd like to see a setup where the most likely outcome is 2-2.

My two cents.
I do enjoy these matches.

Mike
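The point that a balanced match tells you more than a 4-0 sweep can be sketched numerically. The example below assumes the logistic Elo model and, as a simplification, treats each game as a win/loss trial (draws are ignored). Under those assumptions the Fisher information about the rating gap is proportional to p(1-p), which peaks when the expected score p is 50%. The function names are hypothetical, and with only 4 games the standard error is a crude asymptotic approximation.

```python
import math

K = math.log(10) / 400.0  # slope constant of the logistic Elo curve

def expected_score(elo_diff: float) -> float:
    """Expected per-game score for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_standard_error(elo_diff: float, games: int) -> float:
    """Approximate standard error (in Elo) of the estimated gap after `games` games.

    Assumes independent win/loss games (no draws), so the Fisher information
    per game is K^2 * p * (1 - p).
    """
    p = expected_score(elo_diff)
    fisher_info = games * K * K * p * (1.0 - p)
    return 1.0 / math.sqrt(fisher_info)

# The more lopsided the pairing, the less a 4-game match pins down the gap.
for diff in (0, 200, 400, 600):
    p = expected_score(diff)
    se = elo_standard_error(diff, games=4)
    print(f"true gap {diff:3d} Elo: expected score {p:.3f}, std. error ~{se:.0f} Elo")
```

The uncertainty is smallest at a true gap of 0 and grows as the match becomes more one-sided, which matches the IQ-test analogy above.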
