How many engines are not able to handle a big initiative?

MikeG

Re: How many engines are not able to handle a big initiative

Post by MikeG »

From what I understand, Bob runs hundreds of thousands of test matches to optimize Crafty. I would argue it's likely to be well tuned. Is it "perfectly" tuned? I doubt it.

Second, please explain exactly what you mean by "outsearched". Do you mean simply reaching a deeper ply? If so, would you argue that this *always* results in a net gain in playing strength? If it were that easy one could simply do shallow sorts, increase selectivity to the max and pick only the top move (maximizing late move reductions in a way), and extend this way to the end of the game and play perfect chess.

Also, what cold hard data do you have that Robbo wins by simply "outsearching" its opponents?

Lastly, Crafty was searching 2.5x as many nodes in the match. Of course there are always trade-offs, but one could argue that Crafty is doing the "outsearching" in this case: if search depth is equal, then it has more coverage (less selectivity).
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: How many engines are not able to handle a big initiative

Post by Milos »

Gian-Carlo Pascutto wrote:Whatever. I could have said a 1900 vs a 2400 player, the point stays the same.
It's not whatever. It's only 300 Elo, minus 60 Elo for 2 cores vs. 1, which gives a 240 Elo difference. The point is completely different, but you obviously can't see it since you're fantasizing numbers without any understanding of their meaning.
Uri Blass wrote: I doubt if GM's with rating above 2700 are going to perform so poorly at the same time control against engines in these conditions.
I think they're going to do even worse. Converting the advantage likely means the need to keep playing actively.

Suicide against an opponent that outsearches you.
Again you have no clue. For example, Rybka-Milov match clearly demonstrates that.
And what does that "outsearch" really mean? On average Robbo searches 2 plies deeper than Crafty, which approximately accounts for those 240 Elo points of difference. Now, explain to me how that accounts for a handicap of 5 tempi or more?
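For reference, a 240 Elo gap can be turned into an expected score with the standard logistic Elo formula. This is only a minimal sketch: the function name `expected_score` is made up for illustration, and the 300 and 60 Elo inputs are the figures quoted in this post, not measurements.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (between 0 and 1) for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Figures taken from the post: 300 Elo gap, minus 60 Elo for 2 cores vs. 1.
gap = 300 - 60
print(gap, round(expected_score(gap), 3))
```

Under this formula the stronger side is expected to score roughly 80% against an equal-footing opponent, which says nothing by itself about how many tempi that margin can absorb.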
Robert Flesher
Posts: 1287
Joined: Tue Aug 18, 2009 3:06 am

Re: How many engines are not able to handle a big initiative

Post by Robert Flesher »

MikeG wrote:From what I understand, Bob runs hundreds of thousands of test matches to optimize Crafty. I would argue it's likely to be well tuned. Is it "perfectly" tuned? I doubt it.

Second, please explain exactly what you mean by "outsearched". Do you mean simply reaching a deeper ply? If so, would you argue that this *always* results in a net gain in playing strength? If it were that easy one could simply do shallow sorts, increase selectivity to the max and pick only the top move (maximizing late move reductions in a way), and extend this way to the end of the game and play perfect chess.

Also, what cold hard data do you have that Robbo wins by simply "outsearching" its opponents?

Lastly, Crafty was searching 2.5x as many nodes in the match. Of course there are always trade-offs, but one could argue that Crafty is doing the "outsearching" in this case: if search depth is equal, then it has more coverage (less selectivity).



How much you are searching means very little if you are not pruning out the garbage. This was proven when Fruit first arrived on the scene. A great search goes a LONG way.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How many engines are not able to handle a big initiative

Post by bob »

MikeG wrote:After reading the other thread with the bad black openings I thought it would be interesting to run some games with a very strong engine (always plays black, with loss of tempos) against a less strong engine. The goal was to see if the "lesser" engine can take advantage of its initiative (or conversely, how well the stronger engine can counterpunch, aka rope-a-dope).

"Strong" engine was Robbolito 0085g3 \ W32, playing black with 128m hash
"Weak" engine was Crafty 23.2 with 128m + 64m hash

The matches were 2 hrs + 15 seconds, run on a laptop with a Core 2 Duo getting around 1M nodes/second with Robbo and 2.4M nodes/second with Crafty.

The results surprised me, as I had always thought that with an advantage of 3 or more tempi, things would be hopeless for black against a 2800+ Elo engine such as Crafty. However, as it turns out, the only win it could muster was when it was given a ridiculous 7 tempo advantage.

I know the number of games is limited but I still think the results are significant (i.e., lack of proper strategy for handling an opening advantage?), given the outcome.

Opening lines and scores (Crafty white all games):
e4Nc6 d4Nb8 0-1
e4Nf6 d4Ng8 Nc3h6 Nf3a5 1/2-1/2
e4Nf6 e5Ng8 Nf3Nc6 d4Nb8 0-1
e4Nf6 d4Ng8 c4Nc6 d5Nb8 1/2-1/2
e4Nf6 e5Ng8 d4Nc6 Nf3Nb8 Bd3Nc6 O-ONb8 Nc3 1-0
e4Nc6 d4Nb8 c4Nc6 Nf3Nb8 Nc3 0-1
c4Nf6 Nc3Ng8 d4 0-1
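
Tallying the seven results listed above gives the overall match score. A small sketch, assuming only the result strings from the list (Crafty is white in every game); the variable names are illustrative.

```python
from fractions import Fraction

# Results copied from the opening list above, in order (white-black).
results = ["0-1", "1/2-1/2", "0-1", "1/2-1/2", "1-0", "0-1", "0-1"]

# Fraction parses "0", "1" and "1/2" exactly, so no float rounding is involved.
crafty_points = sum(Fraction(r.split("-")[0]) for r in results)  # white
robbo_points = sum(Fraction(r.split("-")[1]) for r in results)   # black

print(f"Crafty (white): {crafty_points}/{len(results)}")
print(f"Robbolito (black): {robbo_points}/{len(results)}")
```

With only seven games, each played from a different handicap line, the total is a summary of this sample rather than a statistically meaningful estimate.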

Games next post.
Your test is flawed in a basic way. First thing you need is two programs of identical strength. I'd be hesitant to play two versions of the same program in such an experiment, but even that would be better than two mismatched programs.

Once you have two that play equally, then you can try your test. But giving a program that is significantly stronger some sort of intangible "loss of tempo" doesn't say much about what the loss of tempo is worth...
MikeG

Re: How many engines are not able to handle a big initiative

Post by MikeG »

Thanks for the reply, Bob. I agree that if the goal is to measure the worth of tempi, then having engines of similar strength in the experiment is preferred.

But for me the goal was a little different. One of the fundamental dogmas of chess (I think) is that black must strive to overcome his opening disadvantage somehow in the opening. For this experiment we have multiplied this disadvantage several times over, and see that it is still possible for a very strong engine to be beaten (or held to a draw).

For me it is just proof of how complex chess is. 10 years ago I would have guessed (incorrectly) that by now with top engines, many games would tend toward draws because mistakes would become rather rare. But it seems the opposite has happened - many wins/losses occur, so errors are still frequent and there are still things to be learned.

At some theoretical Elo level I would think white might always win (or at least draw) with openings such as the ones I posted, even against perfect play. I would have guessed ahead of time that Crafty would have pulled it off. Before the 2 hr matches I had also run some blitz games with similar openings using older versions of Toga, which had similar outcomes.
MikeG

Re: How many engines are not able to handle a big initiative

Post by MikeG »

"How much you are searching means very little if you are not pruning out the garbage. This was proven when Fruit first arrived on the scene. A great search goes a LONG way."

I never disagreed that pruning is necessary. But how do you determine what is "garbage"? With a good evaluation.

Humans can play GM level at 2 or 3 nps, only because of their positional evaluation.

My disagreement is that the previous poster is making a hand-waving argument when he says Crafty is "outsearched". The standard tricks for selective search are well known, but you can't squeeze water from a rock. If the eval is generating relatively large errors and you increase the pruning (and search depth), at some point playing strength will drop as you start pruning away more critical lines. I *believe* (my opinion based on what I have read) that Bob has done a good enough job tuning his search that it is not the critical difference between Crafty and the very top engines. But I would be interested in Bob's opinion on this issue.
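
To make the point about eval errors and pruning concrete, here is a toy sketch, not any engine's actual search. It assumes a hand-built three-ply tree in which the static eval misjudges one move, and compares a full negamax with one that keeps only the top `keep` moves by static eval. `Node`, `negamax`, `keep` and the tree values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    static_eval: float  # score from the point of view of the side to move here
    children: list = field(default_factory=list)


def negamax(node, keep=None):
    """Return (score, line). If keep is set, search only the `keep` moves
    that look best by static eval (a crude stand-in for heavy selectivity)."""
    if not node.children:
        return node.static_eval, [node.name]
    # A low eval for the opponent after our move looks good for us.
    ordered = sorted(node.children, key=lambda c: c.static_eval)
    if keep is not None:
        ordered = ordered[:keep]
    best_score, best_line = float("-inf"), []
    for child in ordered:
        score, line = negamax(child, keep)
        score = -score  # flip to our point of view
        if score > best_score:
            best_score, best_line = score, line
    return best_score, [node.name] + best_line


# Move A looks best statically but loses to its reply; move B looks modest but is fine.
root = Node("root", 0.0, [
    Node("A", -1.0, [Node("A1", -3.0)]),
    Node("B", -0.2, [Node("B1", 0.5)]),
])

print("full search :", negamax(root))
print("top-1 pruned:", negamax(root, keep=1))
```

In this tree the full search picks B, while the top-1 version trusts the misleading static eval, follows A, and never looks at B. Real engines prune far more carefully than this, so the sketch only illustrates the direction of the argument.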
MikeG

Re: How many engines are not able to handle a big initiative

Post by MikeG »

.... Just to clarify: the opinion I am interested in is whether Bob would agree that Crafty is missing some pruning tricks, such as Fruit might have, and that this is simply causing its search performance to suffer.