Human versus Machine

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Harvey Williamson, bob

whereagles
Posts: 544
Joined: Thu Nov 13, 2014 11:03 am

Re: Human versus Machine

Post by whereagles » Mon Aug 13, 2018 11:58 am

tpoppins wrote:
Mon Aug 13, 2018 10:39 am
Custom CSS sheet with Stylish for Firefox:

Code:

h3 { display: none; }
thx.. but how do I do that in chrome? same code?

tpoppins
Posts: 773
Joined: Tue Nov 24, 2015 8:11 pm
Location: upstate

Re: Human versus Machine

Post by tpoppins » Mon Aug 13, 2018 7:17 pm

I don't use Chrome, but it looks like its equivalent of Stylish is Stylebot.

BTW, the code I posted for eliminating nested quotes is outdated. For the new forum software it would have to be this instead:

Code:

blockquote blockquote { display: none; }
Tirsa Poppins
CCRL

Nay Lin Tun
Posts: 363
Joined: Mon Jan 16, 2012 5:34 am

Re: Human versus Machine

Post by Nay Lin Tun » Wed Aug 15, 2018 1:49 am

15 Years of Chess Engine Development: What Would the Score Be Between a Super GM and Engines Now?


Fifteen years ago, in October of 2002, Vladimir Kramnik and Deep Fritz were locked in battle in the Brains in Bahrain match. If Kasparov vs. Deep Blue was the beginning of the end for humans in chess, then the Brains in Bahrain match was the middle of the end. It marked the first match between a world champion and a chess engine running on consumer-grade hardware, although its eight-processor machine was fairly exotic at the time.

Ultimately, Kramnik and Fritz played to a 4-4 tie in the eight-game match. Of course, we know that today the world champion would be crushed in a similar match against a modern computer. But how much of that is superior algorithms, and how much is due to hardware advances? How far have chess engines progressed from a purely software perspective in the last fifteen years? I dusted off an old computer and some old chess engines and held a tournament between them to try to find out.

I started with an old laptop and the version of Fritz that played in Bahrain. Playing against Fritz were the strongest engines at each successive five-year anniversary of the Brains in Bahrain match: Rybka 2.3.2a (2007), Houdini 3 (2012), and Houdini 6 (2017). The tournament details, cross-table, and results are below.

Tournament Details

Format: Round Robin of 100-game matches (each engine played 100 games against each other engine).

Time Control: Five minutes per game with a five-second increment (5+5).

Hardware: Dell laptop from 2006, with a 32-bit Pentium M processor underclocked to 800 MHz to simulate 2002-era performance (roughly equivalent to a 1.4 GHz Pentium 4, which would have been a common processor in 2002).

Openings: Each 100-game match was played using the Silver Opening Suite, a set of 50 opening positions that are designed to be varied, balanced, and based on common opening lines. Each engine played each position with both white and black.

Settings: Each engine played with default settings, no tablebases, no pondering, and 32 MB hash tables, except that Houdini 6 played with a 300 ms move overhead. This is because in test games modern engines were frequently losing on time, possibly due to the slower hardware and interface.

Results

Engine           1          2          3          4          Total
Houdini 6        **         83.5-16.5  95.5-4.5   99.5-0.5   278.5/300
Houdini 3        16.5-83.5  **         91.5-8.5   95.5-4.5   203.5/300
Rybka 2.3.2a     4.5-95.5   8.5-91.5   **         79.5-20.5  92.5/300
Fritz Bahrain    0.5-99.5   4.5-95.5   20.5-79.5  **         25.5/300

I generated an Elo rating list using the results above. Anchoring Fritz's rating to Kramnik's 2809 at the time of the match, the result is:

Engine           Rating
Houdini 6        3451
Houdini 3        3215
Rybka 2.3.2a     3013
Fritz Bahrain    2809

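For a rough sense of how the match scores in the cross-table map to rating gaps, here is a minimal sketch using the standard logistic Elo formula on a single head-to-head score. This is an assumption for illustration, not the method behind the list above: that list was fitted over all results at once (the post does not say with which tool), so the pairwise figures will not match it exactly. The function name is made up for this sketch.

```python
import math

def elo_diff_from_score(points: float, games: int) -> float:
    """Elo gap implied by a match score under the logistic Elo model."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("a 0% or 100% score implies an unbounded Elo gap")
    return 400.0 * math.log10(p / (1.0 - p))

# Head-to-head scores taken from the cross-table (out of 100 games each).
matches = {
    "Rybka 2.3.2a vs Fritz Bahrain": 79.5,
    "Houdini 3 vs Rybka 2.3.2a": 91.5,
    "Houdini 6 vs Houdini 3": 83.5,
    "Houdini 6 vs Fritz Bahrain": 99.5,
}

for pairing, points in matches.items():
    print(f"{pairing}: {elo_diff_from_score(points, 100):+.0f} Elo")
```

For example, the 79.5-20.5 result of Rybka against Fritz implies a gap of roughly 235 Elo on its own, while the fitted list puts the two 204 points apart (3013 vs 2809).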
Conclusions

The progress of chess engines in the last 15 years has been remarkable. Playing on the same machine, Houdini 6 scored an absolutely ridiculous 99.5 to 0.5 against Fritz Bahrain, only conceding a single draw in a 100-game match. Perhaps equally impressive, it trounced Rybka 2.3.2a, an engine that I consider to have begun the modern era of chess engines, by a score of 95.5-4.5 (+91 =9 -0). This tournament indicates that there was clear and continuous progress in the strength of chess engines during the last 15 years, gaining on average about 43 Elo per year (642 Elo over 15 years). Much of the focus of reporting on man vs. machine matches was on the calculating speed of the computer hardware, but it is clear from this experiment that one huge factor in computers overtaking humans in the past couple of decades was an increase in the strength of engines from a purely software perspective. If Fritz was roughly the same strength as Kramnik in Bahrain, it is clear that Houdini 6 on the same machine would have completely crushed Kramnik in the match.





Possible conclusion: even on 15-year-old hardware, today's SF or Houdini would crush a 2800 GM 99 to 1 in a 100-game match.

With current hardware that is about 10 times more powerful, plus stronger software, Stockfish on a common desktop would crush a 2800 GM 399 to 1 in a 400-game match!



(credit from reddit)

todd
Posts: 8
Joined: Thu Apr 19, 2018 7:09 pm

Re: Human versus Machine

Post by todd » Wed Aug 15, 2018 2:30 pm

While we know that a human player will get crushed (and likely not win any games) against the strongest engines now, we should be very careful about inferring from Elo differences that the score would be 399 to 1.

Unlike Fritz, a strong human would know who his opponent is and play accordingly. That means that with the white pieces, they would deliberately aim for drawing lines - forced perpetuals right out of the opening, and positions that are simple enough to play and understand that the chances of blundering are much lower (see, e.g. the 5. Re1 Berlin).

So are we forcing the human players to play with the Silver Opening Suite, too? This would be highly unusual for a man-machine match - generally the humans get to play as they like.

Are we giving the engines contempt and a contempt-ish opening book that avoids forced drawing lines and accepts some worse positions as black in hopes of outplaying the opponent later? If so, this would make the engine's score much better.

If you're just letting the human play against the engine with no book, or, worse yet, with a book aimed at equalizing with black (like Cerebellum), then the human will make draws with white sometimes. The score won't be anything close to 399-1 or even 99-1. I'm only a USCF NM and even I score better than that (because I know a lot of forced draw lines for white thanks to practicing against strong engines).

Also, the Elo formula tends to break down for large differences - the underdog scores better than expected (at least in human datasets).
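To make the link between a rating gap and a predicted score concrete, here is a small sketch of the standard logistic Elo expectation. It only illustrates the formula under discussion; as noted, real results at large gaps tend to deviate from it. The 3451 and 2809 figures are the ratings from the list quoted earlier in the thread, and the function name is hypothetical.

```python
import math

def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0..1) of player A against player B in the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Ratings from the list quoted earlier: Houdini 6 (3451) vs a 2809-rated player.
per_game = expected_score(3451, 2809)
print(f"Predicted score over 100 games: {100 * per_game:.1f}")

# A 399-1 result is a 99.75% score. Solving the formula for the gap d that
# predicts it gives d = 400 * log10(399).
gap_for_399_to_1 = 400.0 * math.log10(399.0)
print(f"Gap that predicts 399-1: {gap_for_399_to_1:.0f} Elo")
```

By this formula a 399-1 score needs a gap of roughly 1040 Elo, so the prediction is only as good as both the rating estimate and the formula's behaviour at that distance.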

Furthermore, we should consider that chess is a fundamentally drawish game with a fairly large draw margin. Sometimes a good player simply plays well enough to make a draw, and it doesn't matter if their opponent is 2800, 3000, or 3600. Usually not, but once again, I'm just saying that it's not realistic to hope for 99-1 against a strong opponent determined to make draws.

Milos
Posts: 3243
Joined: Wed Nov 25, 2009 12:47 am

Re: Human versus Machine

Post by Milos » Wed Aug 15, 2018 2:57 pm

todd wrote:
Wed Aug 15, 2018 2:30 pm
If you're just letting the human play against the engine with no book, or, worse yet, with a book aimed at equalizing with black (like Cerebellum), then the human will make draws with white sometimes. The score won't be anything close to 399-1 or even 99-1. I'm only a USCF NM and even I score better than that (because I know a lot of forced draw lines for white thanks to practicing against strong engines).
Take the latest SFdev on some average config like an 8-core Ryzen 7 1700 and set the number of threads to 16 (that should give enough variability). Any TC you like, up to FIDE. Set SF contempt to 80 or even 100. Give SF a 2-move opening book just for randomization.
If you have more than 1 point after 400 games you are simply lying.
Ofc assuming no takebacks, no peeking into SF's eval and PV, etc.
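The setup described above could be written down as UCI commands roughly like this. A minimal sketch: `setoption name ... value ...` is the standard UCI syntax, and `Threads` and `Contempt` are option names exposed by Stockfish builds of that period, but the helper name and the value picked here (contempt 100) are just this sketch's assumptions. The opening book would be handled by the GUI rather than by these commands.

```python
def uci_setoptions(options: dict) -> list:
    """Format engine settings as UCI 'setoption' command lines."""
    return [f"setoption name {name} value {value}" for name, value in options.items()]

# Settings from the post above: 16 threads, contempt 80 to 100 (100 used here).
commands = uci_setoptions({"Threads": 16, "Contempt": 100})
for line in commands:
    print(line)
```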

todd
Posts: 8
Joined: Thu Apr 19, 2018 7:09 pm

Re: Human versus Machine

Post by todd » Wed Aug 15, 2018 4:08 pm

Milos, I agree with you that I would just lose every game at contempt 80-100. I have already been trying this ever since SF got its smarter contempt (combined with a small book to make sure it avoids some select opening lines where it ends up in a simple position despite high contempt and yes, for randomization as you mention), and I haven't managed a draw yet.

Since you mentioned hardware I'll say I'm usually playing on a 1950x, but it doesn't really matter that much - the whole point for the human is to reach simple positions where extra computational power is of minimal value. If the human fails at that mission, then a smartphone will finish the job well enough.

I'm only saying the human results would differ a lot depending on the match conditions, but most discussion around this topic is a lot more naïve and simply states the engine will win every game (with no mention of match conditions).

If we optimize the engine for its opponent (a strong human, in this case), then the 99/100 result is definitely reasonable to expect.

If we use defaults, it's not.

And if we use something like Brainfish, results will be even worse, because Cerebellum-style books love to draw with black.

Milos
Posts: 3243
Joined: Wed Nov 25, 2009 12:47 am

Re: Human versus Machine

Post by Milos » Wed Aug 15, 2018 4:43 pm

todd wrote:
Wed Aug 15, 2018 4:08 pm
I'm only saying the human results would differ a lot depending on the match conditions, but most discussion around this topic is a lot more naïve and simply states the engine will win every game (with no mention of match conditions).

If we optimize the engine for its opponent (a strong human, in this case), then the 99/100 result is definitely reasonable to expect.

If we use defaults, it's not.

And if we use something like Brainfish, results will be even worse, because Cerebellum-style books love to draw with black.
That is all true, but the reason is quite simple. We optimize our engines (at least the strongest ones) to beat engines that are within a range of +/-300 Elo.
Developers never care about performance vs. significantly weaker engines (or even weaker humans). The main reason is that it would be impossible (or at least we don't yet know how) to optimize an engine for every type of opponent it plays. So basically, without contempt, it is true that Elo values are not additive and that humans can perform better against engines than their real rating would suggest.
Maybe there will be a way to get smart automated contempt one day but I seriously doubt it.
A simple thought experiment. Imagine we have access to 32-men tablebases. We are in some early opening position and there are no winning moves (which is expected, considering that chess is probably a drawn game), and there are 10 different drawing moves, but only a few of those moves would induce your opponent to make a mistake, while the others would lead to a drawn continuation. If you don't know anything about your opponent, there is no way to know which move to select. So, at least theoretically, an engine with perfect knowledge of the position (32-men TBs) would only draw against such an opponent, while another engine with imperfect knowledge of the position but with knowledge of the opponent would still win. In their mutual games, meanwhile, the engine with perfect knowledge would win a couple of times and never lose.
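The thought experiment can be put into numbers with a toy model. Everything here is invented for illustration: ten moves that are all theoretical draws, a hypothetical probability that a given opponent later goes wrong after each one, and two selection policies (uniform choice without an opponent model, best choice with one). It is a sketch of the argument, not a model of any real engine.

```python
# Hypothetical chance that this particular opponent goes wrong after each of
# ten theoretically drawn moves; only a few moves set real problems.
blunder_chance = [0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.4]

def expected_points(win_chance: float) -> float:
    """A win scores 1, anything else is a draw worth 0.5 (no losses here)."""
    return win_chance * 1.0 + (1.0 - win_chance) * 0.5

# Perfect knowledge of the position only: all ten moves look identical (draw),
# so the choice is assumed to be uniform at random.
uninformed = sum(expected_points(p) for p in blunder_chance) / len(blunder_chance)

# Knowledge of the opponent: pick the move most likely to induce a mistake.
informed = expected_points(max(blunder_chance))

print(f"no opponent model:   {uninformed:.3f} points per game")
print(f"with opponent model: {informed:.3f} points per game")
```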
OTOH, it is also not fair to use only one value of contempt for all opponents. Humans almost always know their opponents (or at least they know they are unknown :)). It would also be fair for the engine to know its opponent, or at least to know whether it is playing a human. One could implement that in the UCI protocol and have the engine automatically adjust its contempt based on the type of opponent.
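For what it's worth, the UCI specification already defines an optional `UCI_Opponent` string option with the format `[title] [elo] [computer|human] <name>`, which a GUI can send before a game. Below is a minimal sketch of how an engine might use it to pick a contempt value. The parsing helper, the `own_elo` default, and the contempt numbers are all hypothetical and not taken from any real engine.

```python
from typing import NamedTuple, Optional

class Opponent(NamedTuple):
    title: Optional[str]
    elo: Optional[int]
    is_human: bool
    name: str

def parse_uci_opponent(value: str) -> Opponent:
    """Parse a UCI_Opponent value: '<title|none> <elo|none> <computer|human> <name>'."""
    title, elo, kind, name = value.split(" ", 3)
    return Opponent(
        title=None if title == "none" else title,
        elo=None if elo == "none" else int(elo),
        is_human=(kind == "human"),
        name=name,
    )

def choose_contempt(opponent: Opponent, own_elo: int = 3400) -> int:
    """Hypothetical policy: high contempt against humans and much weaker players."""
    if opponent.is_human:
        return 100
    if opponent.elo is not None and own_elo - opponent.elo > 300:
        return 60
    return 20

opp = parse_uci_opponent("GM 2800 human Gary Kasparov")
print(opp, choose_contempt(opp))
```

Whether a GUI actually sends this option, and whether an engine reads it, varies; the sketch only shows that the information channel exists in the protocol.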

Lyudmil Tsvetkov
Posts: 6037
Joined: Tue Jun 12, 2012 10:41 am

Re: Human versus Machine

Post by Lyudmil Tsvetkov » Thu Aug 16, 2018 7:26 am

Nice to see some discussion going on here.
I even kind of managed to read one or 2 words, so parts of the forum seem readable.
Just a reminder: 2 more days to go in the free promotion, don't forget to download: https://www.amazon.com/Human-Versus-Mac ... 8&qid=&sr=

Top engines are weak, everyone knows that, you just need the right kind of approach and...having read "Human versus Machine". :)
Later

Lyudmil Tsvetkov
Posts: 6037
Joined: Tue Jun 12, 2012 10:41 am

Re: Human versus Machine

Post by Lyudmil Tsvetkov » Thu Dec 06, 2018 1:02 pm

For people who kind of like my books, here is my latest creation: https://www.amazon.com/Neverending-Tact ... 897&sr=8-2
Maybe someone will want to give themselves a present for the New Year.

Chessqueen
Posts: 155
Joined: Wed Sep 05, 2018 12:16 am
Full name: Nancy M Pichardo

Re: Human versus Machine

Post by Chessqueen » Thu Dec 06, 2018 1:40 pm

Lyudmil Tsvetkov wrote:
Thu Aug 16, 2018 7:26 am
Nice to see some discussion going on here.
I even kind of managed to read one or 2 words, so parts of the forum seem readable.
Just a reminder: 2 more days to go in the free promotion, don't forget to download: https://www.amazon.com/Human-Versus-Mac ... 8&qid=&sr=

Top engines are weak, everyone knows that, you just need the right kind of approach and...having read "Human versus Machine". :)
Later
They are NOT weak. Have you tried playing against the latest version of Komodo? It can give odds to a top GM like Nakamura, and he is one of the best players, if not the best, at playing against engines.
https://www.youtube.com/watch?v=hotuuH-_jjw
