Albert Silver wrote: I tend to agree in all honesty. I do think that there is a difference between hyperspeed games and slower ones, but I believe this difference disappears quite quickly unless one of the engines has serious time management issues.
Vincent used to claim that Diep would be the top program given enough time, due to its sheer superiority in knowledge over the others. Ultimately it did not work that way. He expected diminishing returns to kick in at greater ply depth, so that his engine's knowledge would outweigh the search deficit, but they consistently failed to kick in.
In other words, if an engine was killing Diep because it was outsearching it by X plies, it continued to kill Diep due to those plies even with more time spent.
I do not believe there is an engine that will be weaker than Houdini at 5-minute games but stronger at 2-hour games, unless they are already neck and neck (within 10 Elo either way) from the start.
While I agree that differences between blitz results and tournament-level results will not be enormous (as in 100 Elo or more), I think your estimate of a limit of 10 is way too low. Without even comparing unrelated programs, just compare Houdini 1.5a and Houdini 2.0. All blitz tests on a single core (CCRL, CEGT, IPON, and others) agree that Houdini 2.0 is stronger (CCRL by 12 Elo, CEGT by 4 Elo, others a bit higher, so the average is a bit over 10 Elo). This is based on thousands of games. At 40/20 or 40/40 we find the opposite: H1.5 is stronger by 12 Elo on CCRL and by 13 on CEGT, all with decent-size samples. There is no slow data on 4 cores for 2.0 on either CCRL or CEGT that meets the minimum number of games to avoid being greyed out. So we have a net swing of about 25 Elo just going from blitz to an average of 40/30, and that is for two successive versions of the same program! I know that some of this 25 could be sample error, but even if half of it is bogus, it would indicate that going from blitz to 40/2 could swing relative ratings by 25 Elo just in this one case. Surely with unrelated programs the swing could be much greater. I think that roughly 50 Elo is the maximum likely swing in relative ratings going from blitz to 40/2.
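For anyone who wants to check the arithmetic, here is a rough sketch of the swing calculation using only the CCRL and CEGT figures quoted above. The "others a bit higher" blitz lists are not given as numbers, so the blitz average of 10 used in the last step is an assumed round figure standing in for "a bit over 10", and the names (`blitz`, `slow`, `swing`) are just illustrative.

```python
# Elo lead of Houdini 2.0 over Houdini 1.5a on a single core, as quoted in the post.
# Positive = 2.0 is ahead, negative = 1.5a is ahead.
blitz = {"CCRL": 12, "CEGT": 4}
slow = {"CCRL": -12, "CEGT": -13}  # 40/20 or 40/40


def swing(blitz_lead, slow_lead):
    """Change in relative rating when moving from blitz to the slower control."""
    return blitz_lead - slow_lead


# Swing on each list taken separately.
per_list = {name: swing(blitz[name], slow[name]) for name in blitz}

# Swing using the post's averaged figures.
# ASSUMPTION: blitz average taken as 10 ("a bit over 10" in the post).
slow_avg = sum(slow.values()) / len(slow)
post_estimate = swing(10, slow_avg)

print(per_list)
print(post_estimate)
```

The two lists alone give swings of 24 and 17 Elo; with the averaged blitz figure the result is 22.5, which lands close to the "about 25" estimate once the blitz average is taken as slightly above 10.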
In the present instance the trend is really marked. Someone has a 1-minute rating list that shows Houdini to be something like a hundred Elo above everyone else. It's been very clear to us for a long time that all of the Ippolit-related programs, even including Critter, which is related but does not copy code directly from Ippolit, are incredibly strong at bullet relative to all other programs (basically only Rybka, Komodo and Stockfish are strong enough to even compare to the Ippos). They are much less so but still superior in blitz, and they start to be weaker somewhere around 40/20 or so. Houdini is stronger than the other Ippos, but clearly follows the same pattern of a marked decrease in relative strength with more time. This will only become clear, though, once a non-Ippo program catches Houdini at 40/20 or 40/40.
I would be interested in any thoughtful comments as to why Stockfish gains on the Ippos as we go from bullet to blitz to 40/20 to 40/40. I say Stockfish rather than Komodo because Stockfish is open source, so people need not guess as to what it is doing; they can actually compare SF code to Ippo code. I strongly suspect that whatever the answer is, it will also apply to Komodo, as our search has much more in common with SF than with the Ippos.