You can't prove anything with your graph. Chess could still be a theoretical win (maybe even for black!). That would make your graphs pretty much meaningless.
I'd bet on you being right, only we'll never really know.
Sure, it's not proof of anything. It only allows further and more grounded speculation. If something appears that is more convincing to me and shows all this to be meaningless, I will learn something new.
""Let's say that there is some handicap that would produce an even score in a serious match between Komodo and Magnus Carlsen. I would estimate that this handicap would be in the 1 to 1.5 pawn range based on Komodo's eval after a long think. Let's say 1.25. Now both Komodo and Carlsen make errors of some average magnitude. We'll call Carlsen's error rate C, and Komodo's K. I think it's pretty obvious that K is much less than C, let's say K = .4xC. If a future engine drops the error rate to zero, then C - K increaases to 5/3 of it's former value, so the proper handicap should also increase in that ratio. ""
Who has any idea what a perfect player will be able to do? That's the fallacy: what happens today means no more than what happened 10 or 20 years ago.
(2) depends on the opponent. Against a beginner? 5 moves? Against a master? 15 moves? Against another GM? 30 moves? But all irrelevant when asking about "against a perfect opponent"...
I meant against Komodo.
(3) The "9 moves" is pure speculation. That's the problem. How would a USCF 2000 player fare against a top program today? And that is nowhere near the difference between a GM and the perfect player, which is almost certainly thousands of Elo yet to be seen.
Let's say a GM can convert a knight handicap against Komodo from 3 to 4 in x moves. Would a future 60,000 Elo computer still have to be able to get the score from 3 to 2 in fewer than x moves to win?
I don't see why. If it can get it from 3 to 2 in 50 moves it might still be winning.
mhull wrote:
I wonder if there is a variable unaccounted for in this analysis (correct me if I'm wrong) which I would call "hardware limit optimization" or "limit optimization effect".
For instance, in the 10s versus 5s case, consider that the depths achieved on current hardware at this limit were, at some point in the past, achieved in 2 minutes (however long ago that was). The software of that day (search and evaluation) was tuned and optimized for that depth, which was then the outer reach of what was possible.
If current software were constrained to hardware that could only reach those depths (5s on current hardware) in 2 minutes, its search and evaluation would likely be tuned and adjusted differently to maximize its performance to that constraint (not to 10 times that practical limit).
So your curve function is fitting software results to limits for which the software has not been optimized except where it intersects a narrow range of currently optimized championship time controls.
So how can we accurately predict how future doublings in computing capacity will affect Elo if limit-optimization effects are unknown? And how is this variable factored in to measurements of historical progress? If it's not factored in, what could be the effect on our analysis?
I've tried to make this point multiple times. But when you have a perfect player (32-piece EGTBs) things will be even more different than today. Every 10 years someone has extrapolated what the computer Elo will be in ten years. And every 10 years they have been wrong. Nobody was projecting 3300 Elo 20 years ago, given past data. It only happens with today's programs, which are far different from those of 20 years ago and beyond.
Curve fitting as done here is an attempt at predicting future performance based on past performance. Look at predictions prior to LMR vs post-LMR. Prior to null-move vs post-null-move. Prior to parallel search vs post-parallel-search. There will always be steady improvement, and there will be discontinuous jumps along the way when something new and different is discovered/tried. Trying to project the Elo of the "ultimate computer" based on today's numbers is senseless. Today's numbers suggest there will NEVER be an ultimate computer. Which I happen to agree with.
But this thread was about a hypothetical case, and extended arguments about something that will almost certainly never happen is a waste of time...
First, to Matthew: in fact both Komodo and Stockfish are optimized at ultra-fast controls like those of my three tests, and during development are almost never tested at the longer time controls of CEGT or CCRL, never mind TCEC. Their developers only hope that the engines optimized this way will scale well to rating lists and TCEC. The shape of the curve is hardly intentionally skewed by the developers, as they see the gain at ultra-fast controls and hope for reasonable scaling of that gain. I did not include factors unknown to me, which nobody can measure.
Ah, of course. I should have thought of that. Thank you for that insight. But wouldn't the problem then be the same, only inverted? The optimization is done under a short time constraint, which may be sub-optimal at championship time control, so limit-optimization factors may still be in play.
I don't know if Bob has any insight from cluster testing where longer TC optimization is more practical vis-a-vis short TC optimization.
Laskos wrote:
To Bob: there is the Mephisto Gideon 1993 engine (about 2200 FIDE Elo), adapted by Ed to work as a UCI engine. I let it play 40/1' on a core about 150 times faster than the 486 of 1993, which translates to the tournament time control of 1993. IIRC I even tried to measure the gain from doubling with this Mephisto; it seemed like 70-90 Elo points at the tournament-like time control of 1993. Maybe the myth about "70 Elo points" comes from the 1990s. The first difference: in 1993 an engine on the top hardware of the time at tournament time control gained at least 70 Elo points per doubling; nowadays at TCEC only 30 Elo points.
The draw rate in self-play is about 10-15% in these conditions (40/1' on an i7 core, which translates to 40/150' on a 486 of 1993). At TCEC the draw rate is above 85%. Isn't it clear that we are approaching a different state of computer chess? One is hard pressed to show that the distance to the perfect player is very large.
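For the time-control translation: 40 moves in 1 minute on hardware roughly 150 times faster is, node for node, equivalent to about 40 moves in 150 minutes (40/150') on the 1993 hardware, and a speed factor of 150 corresponds to about log2(150) ≈ 7.2 doublings.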
""Let's say that there is some handicap that would produce an even score in a serious match between Komodo and Magnus Carlsen. I would estimate that this handicap would be in the 1 to 1.5 pawn range based on Komodo's eval after a long think. Let's say 1.25. Now both Komodo and Carlsen make errors of some average magnitude. We'll call Carlsen's error rate C, and Komodo's K. I think it's pretty obvious that K is much less than C, let's say K = .4xC. If a future engine drops the error rate to zero, then C - K increaases to 5/3 of it's former value, so the proper handicap should also increase in that ratio. ""
OK, a bit of science is needed. We can certainly measure today what the handicap should be to equalize against the Komodo of today. But the .4 is just a magic number with zero evidence to support it. What if you assume .75 instead of .4? I don't know of any way to derive this magic number today, because there is no perfect player that can be queried after each GM move to correctly categorize the move as perfect or not (where we can be generous and call "perfect" any move that preserves the game's expected result from move 1). Or, probably more accurately, maybe that number is .99. That's all it takes, just 1% fewer errors than a GM, to beat the GM. What does that 5/3 turn into then?
So picking a number out of thin air and then doing unsound math with that to prove something leaves the result a little "thin" on fact and a lot "thick" on imagination.
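To make that sensitivity concrete, here is a minimal sketch (my own illustration, not anyone's actual analysis): if the fair handicap is assumed proportional to the error gap C - K, then driving K to zero multiplies the handicap by 1/(1 - r), where r = K/C is the assumed ratio of engine to human errors.

```python
# Minimal sketch: how the "proper handicap" scales if the handicap is
# assumed proportional to the error gap C - K and the engine's error
# rate K drops to zero.  r = K/C is the assumed ratio of engine errors
# to human errors; 1.25 pawns is the baseline figure from the quote above.
baseline_handicap = 1.25  # pawns

for r in (0.4, 0.75, 0.99):
    multiplier = 1.0 / (1.0 - r)      # (C - 0) / (C - r*C)
    print(f"r = {r:4.2f}: multiplier = {multiplier:7.2f}, "
          f"handicap -> {baseline_handicap * multiplier:7.2f} pawns")
```

With r = .4 the multiplier is the quoted 5/3; with r = .75 it is 4; with r = .99 it is 100, which is exactly how fragile the conclusion is to the assumed ratio.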
bob wrote:
As a simplistic point, it might be interesting to go back and re-read all the Chinook material to see if any human player managed to draw drawn games with any reliability. And of course, checkers is a far simpler game than chess... But we do have a perfect checkers player, so it would be interesting to see if Schaeffer ever ran such an experiment. I did a quick online search and did not find any results produced after the game's solution was announced...
The great Tinsley drew Chinook reliably. He had a plus score against the machine.
Not AFTER they finished the final endgame tables that let them play perfectly. He himself had branded it "unbeatable".
bob wrote:
700-1500 Elo is certainly meaningless. Where does that come from? 20 years ago the assumption was that 2800 was the upper bound on Elo. That seems to have bitten the big banana. The only thing that bounds Elo is that the best player will be hard-pressed to get more than 800 above the second-best player. But then the second-best can get to 800 above the third-best, and so on.
What is the fallacy in Kai's argument that the maximum Elo is 4877?
I fitted the gain from doubling the nodes with a/(b*x^c + 1), where x is the number of doublings, getting a correlation of 0.99.
The plot is here:
The 40/4', 40/40' and 40/120' CCRL and CEGT levels are shown, and the resulting gain from doubling in this extrapolation is ~70 points at 40/4', ~55 points at 40/40' and ~45 points at 40/120'. The limiting value, which I get by summing over all doublings up to infinity (infinite time control), is 1707 points above the Houdini 3 40/40' CCRL level. So I get 4877 Elo points on the CCRL scale as the rating of the perfect engine, similar to what I remember Don got some time ago.
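For anyone who wants to reproduce this kind of extrapolation, here is a minimal sketch of the procedure with placeholder numbers (not the actual measured gains, and not Laskos' script); the printed "headroom" plays the role of the 1707 points above the current level.

```python
# Minimal sketch of the fit described above (placeholder data, not the
# real measurements): gain per doubling modelled as a/(b*x^c + 1),
# then summed over all remaining doublings to estimate the Elo headroom.
import numpy as np
from scipy.optimize import curve_fit

def gain(x, a, b, c):
    # Elo gained by the x-th doubling of time/nodes
    return a / (b * x**c + 1)

doublings = np.arange(1, 9)                               # hypothetical x values
measured = np.array([95, 89, 83, 77, 71, 66, 61, 56.0])   # hypothetical Elo gains

(a, b, c), _ = curve_fit(gain, doublings, measured, p0=(100, 0.05, 1.5))

# The series sum over x of gain(x) is finite only if c > 1; approximate the
# infinite sum with a long partial sum (good enough for a sketch).
if c > 1:
    xs = np.arange(doublings[-1] + 1, 10**6)
    headroom = gain(xs, a, b, c).sum()
    print(f"a={a:.1f} b={b:.4f} c={c:.2f}  Elo headroom above last point ~ {headroom:.0f}")
else:
    print("Fitted c <= 1: the sum diverges, i.e. no finite Elo ceiling.")
```

Note that the sum over doublings is finite only if the fitted exponent c is greater than 1; otherwise the model itself implies no Elo ceiling at all.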
I fitted the draw ratio with a shifted logistic, getting a correlation of 0.999. In self-play we can expect a very high percentage of draws at very long time controls.
The plot is here:
The hardest thing for me to quantify was the win/loss ratio, which I had somehow assumed to be constant at longer TC. That seems not to be the case; the win/loss ratio seems to decrease with time control (or nodes). I fitted it with 1 + 1/(a*x + b), getting a correlation of 0.96.
The plot is here:
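Along the same lines, a sketch of the other two fits with placeholder data (again, not the measured values); the "shifted logistic" here is one plausible parameterization, not necessarily the exact form Laskos used.

```python
# Minimal sketch of the other two fits: draw rate as a shifted logistic in
# the number of doublings, and win/loss ratio as 1 + 1/(a*x + b), which
# decays toward 1 (equal wins and losses) as the time control grows.
import numpy as np
from scipy.optimize import curve_fit

def draw_rate(x, L, k, x0, d):
    # one plausible shifted logistic: rises from ~d toward d + L as x grows
    return d + L / (1 + np.exp(-k * (x - x0)))

def win_loss_ratio(x, a, b):
    return 1 + 1 / (a * x + b)

x = np.arange(1, 9, dtype=float)                                    # hypothetical doublings
draws = np.array([0.12, 0.18, 0.27, 0.38, 0.50, 0.61, 0.70, 0.77])  # placeholders
ratio = np.array([3.0, 2.6, 2.3, 2.1, 1.9, 1.8, 1.7, 1.6])          # placeholders

p_draw, _ = curve_fit(draw_rate, x, draws, p0=(0.9, 0.6, 4.5, 0.05))
p_ratio, _ = curve_fit(win_loss_ratio, x, ratio, p0=(0.2, 0.3))

# Extrapolate both curves to a much longer (hypothetical) time control.
print("draw rate at 20 doublings     :", round(float(draw_rate(20.0, *p_draw)), 3))
print("win/loss ratio at 20 doublings:", round(float(win_loss_ratio(20.0, *p_ratio)), 3))
```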
Where to start? You believe the ONLY improvements will now come from just doubling the speed? Never thought about the major jump we saw with any of a dozen techniques, starting with iterated search, through null-move, LMR, and then forward pruning? Do you assume "this is it" for software? I've heard that for 50 years now. And every time I've heard it cited as some sort of justification why a computer can never do "x" (x = {beat an expert, beat a master, beat an IM, beat a GM, beat the world champion, beat the world champion in a match, to the present: beat the world champion when giving him some sort of handicap from a pawn to a knight or more}). Each time an ultimate barrier was raised, it was brushed aside. Logic says the same will happen here. That there must be an ultimate handicap beyond which nobody can win is obviously true. But what that ultimate handicap might end up at is today completely unknown, other than via wild guesses.
And this doubling nonsense is irrelevant. Perfect play corresponds to an infinite number of doublings; the perfect computer will play perfectly, and instantly. So that is not a part of the equation.
Too many numbers are bandied about as fact, when they are simply guesswork.
None of this seems reasonable to me. The percentage of GM errors, for example. A GM appears to make far FEWER errors against a weak program than against a strong program, not because he actually makes fewer errors, but because the weak opponent doesn't notice them and doesn't punish them.
I disagree here.
If the opponent does not play well, it is easier not to make mistakes.
I clearly have games against humans in which I made no significant mistakes based on computer analysis (no move reduced the evaluation by more than 0.2 pawns).
It is not because I am so strong, but because it is easier not to make mistakes when the opponent does not play well.
If the opponent plays well, I expect myself to make more mistakes.
Uri
"no significant mistakes based on computer analysis" is meaningless when we are talking about PERFECT computer play. ANY mistake will be significant there. This extrapolation about what happens today is meaningless when we talk about a perfect chess opponent.
It is not meaningless, because the computer is clearly stronger than me and finds many mistakes.
If the computer finds that I make more mistakes in games against stronger opponents, it means that it is easier to make mistakes against stronger opponents.
I do not have the 32-piece tablebases, but my speculation is that there are many games in which the winner did not make a mistake that changed the theoretical result, and also draws with no mistakes.
Of course, the side who made no mistake might have made mistakes had he been playing against a stronger opponent.
It is not easier to make mistakes against stronger opponents. It is simply more likely that they will understand your mistake and how they can exploit it. Your propensity for making mistakes is independent of the opponent, your brain cells don't suddenly change when you play a 2500 player vs a 1500 player.
As far as the "many games with no discernible mistakes" there are only "many games" because there are millions and millions of games that have been played. If your opponent spots the mistake and beats you only 1% of the time, that does NOT mean you played 99% of your games with no mistakes.
It is easier to make mistakes when the opponent helps you to make mistakes.
If the opponent plays weak moves, I simply have good chances of not reaching positions in which I make mistakes.
It is not that my brain changes when I play against a stronger player; I simply need to solve harder problems that I cannot solve, whereas against the weak player I do not need to solve hard problems at all.
The opening moves I guess I know how to play perfectly, because there are many drawing moves, and after the opponent has made a mistake it is easy not to make a mistake.
As an extreme example, suppose that I play against a player who makes random moves.
I guess that part of the games are going to be 1.f3 e5 2.g4 Qh4 mate, where I played perfectly because the opponent did not help me to make mistakes.
You can claim that maybe 1...e5 is a mistake (and I am not 100% sure, because maybe 1...e5 only draws while 1...d5 wins), but my guess is that it is not a mistake.
Against a non-random player who is still a very weak chess player, something like the following can happen, where again I cannot prove a mistake by White with today's software:
1.e4 e5 2.Nf3 Nf6 3.Nxe5 Nxe4 4.Qe2 Nf6 5.Nc6+ wins the queen, and after 5.Nc6+ I believe it is easy to play perfectly (maybe I do not mate in the fastest way, but that is not important, since I always play winning moves).
Your opponent can't help YOU "make mistakes". Not possible. He doesn't get to make moves for you. Both of you will still make mistakes at the same rate as always, but a stronger player will make fewer, and in particular fewer that the weaker player can grasp and punish. But he still makes mistakes just the same.
You are mixing playing skill with this mistake stuff. A player at a given level makes mistakes at a predictable rate. It just takes a stronger player to spot many of them. But just because they are not spotted, doesn't mean they are not there.
The stronger player makes moves that lead to positions in which it is easier for the weaker player to make mistakes.
If you take a solved game like checkers and two imperfect human players, one of whom is significantly stronger but would still usually lose against the perfect player, then I believe that analysis can show the winner often making no mistakes according to computer analysis, because the opponent made a mistake first, and after that mistake it was an easy win.
bob wrote: And every time I've heard it cited as some sort of justification why a computer can never do "x" (x = {beat an expert, beat a master, beat an IM, beat a GM, beat the world champion, beat the world champion in a match, to the present: beat the world champion when giving him some sort of handicap from a pawn to a knight or more}). Each time an ultimate barrier was raised, it was brushed aside. Logic says the same will happen here. That there must be an ultimate handicap beyond which nobody can win is obviously true. But what that ultimate handicap might end up at is today completely unknown, other than via wild guesses.
Logic dictates no such thing. The barriers up to the very last one had to do with an in-hindsight arrogant belief that the human brain had to be better than any computer at "intelligence" tasks. Obviously that's long gone by now.
The barriers being discussed now have absolutely nothing to do with 'how good can computers become'; we can, for the sake of this argument, say that given enough effort they could be both oracle-like in accuracy and able to steer toward land mines for humans. Given enough technological advancement and effort in engine design, something at least close to that is surely achievable.
Rather, the barrier discussed in this thread is entirely about 'how bad are human players at chess?'. Experience leads a player to the conclusion that if they can go a clean piece ahead while still having pawns, it's a 'matter of technique' easy win to finish. It's very difficult to imagine that a top human player could possibly be so bad as not to win a 6-game match at classical time control with an extra knight from the start, even against an omnipotent deity able to read their thoughts and steer toward where the human will make mistakes. A clean knight advantage is just so durable...