Baffling test position

Vinvin
Posts: 5298
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Baffling test position

Post by Vinvin »

Without pawns, 6 knights is probably not enough ...
[d]1q1qk1q1/8/8/8/8/8/3NNN2/2NNKN2 w - - 0 1

What about 7 ?
[d]1q1qk1q1/8/8/8/8/8/3NNN2/2NNKNN1 w - - 0 1

hgm wrote:
David Dahlem wrote:Three queens against 6 knights? That's easy, black wins. 8-)
Actually, this is not so obvious. It depends a lot on the engine; some are better at it than others. As you can see from the Daydreamer result, there is a significant number of wins by white. And this despite the fact that the engines will have a bad conception of the piece values, thinking that it is still a good deal to trade three Knights for one Queen, while in fact this is the only way to bungle the win.

My suspicion is that a properly programmed engine, using piece values Q=9 and N=5, would always win for white. In any case this would be true for 7 Knights against 3 Queens:

1q1qk1q1/pppppppp/8/8/8/8/PPPPPPPP/NNNNKNNN w
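To make the piece-value argument concrete, here is a minimal sketch that totals material from the board field of a FEN string under two value sets. The N=5 set follows the suggestion above; the "classical" set (N=3, B=3, R=5) is an assumption added for comparison, and the function and constant names are hypothetical.

```python
def material(fen, values):
    """Return (white, black) material totals for the board field of a FEN string."""
    board = fen.split()[0]
    # Upper-case letters are white pieces, lower-case are black; kings count as 0.
    white = sum(values.get(c, 0) for c in board if c.isupper())
    black = sum(values.get(c.upper(), 0) for c in board if c.islower())
    return white, black

# Assumed conventional values, for comparison only.
CLASSICAL = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}
# Same values, but with N=5 as proposed for these positions.
KNIGHT_HEAVY = dict(CLASSICAL, N=5)

# Positions quoted in this thread.
SIX_KNIGHTS_NO_PAWNS = "1q1qk1q1/8/8/8/8/8/3NNN2/2NNKN2 w - - 0 1"
SEVEN_KNIGHTS_PAWNS = "1q1qk1q1/pppppppp/8/8/8/8/PPPPPPPP/NNNNKNNN w"

for name, fen in [("6N, no pawns", SIX_KNIGHTS_NO_PAWNS),
                  ("7N, pawns", SEVEN_KNIGHTS_PAWNS)]:
    for label, values in [("N=3", CLASSICAL), ("N=5", KNIGHT_HEAVY)]:
        white, black = material(fen, values)
        print(f"{name:13s} {label}: white {white:2d}  black {black:2d}")
```

With N=3 the three queens are ahead in both positions, while with N=5 the knights are ahead in both, which is the disagreement the posts above are about.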
metax
Posts: 344
Joined: Wed Sep 23, 2009 5:56 pm
Location: Germany

Re: Baffling test position

Post by metax »

Interesting. These positions really help debugging the evaluation. I especially like the knights vs bishops positions, where you can see the material considerations of the different engines very well.
Some just give white a small opening advantage, but most prefer to play black. However, the only engine I have found that prefers the knights by a significant amount is - my engine. Its score is clearly exaggerated, as White really hasn't got any advantage at the end of the PV, but it is decreasing with depth:

FEN: bbbbkbbb/pppppppp/8/8/8/8/PPPPPPPP/NNNNKNNN w - - 0 1 

ChessMind:
   1	00:00	           0	0	+0,38	Sb1c3 a7a6
   1	00:00	           0	0	+2,53	b2b4 a7a6
   2	00:00	       2.919	18.000	+2,75	b2b4 a7a6 d2d3
   2	00:00	       2.919	18.000	+2,93	Sf1e3 a7a6 d2d3
   3	00:00	       7.400	39.000	+2,75	Sf1e3 c7c6 Sh1g3 d7d5
   3	00:00	       7.400	39.000	+2,77	Sb1c3 c7c6 Sd1e3 d7d5
   3	00:00	       7.400	39.000	+2,84	Sa1b3 c7c6 Sc1d3 d7d5
   4	00:00	      47.212	120.000	+2,71	Sa1b3 c7c6 Sb1c3 g7g5 Sc3e4
   4	00:00	      47.212	120.000	+2,80	Sc1d3 c7c6 f2f4 g7g5 f4xg5
   4	00:00	      47.212	120.000	+2,88	c2c4 c7c6 Sa1b3 d7d5 c4c5
   5	00:01	     130.841	174.000	+2,44	c2c4 c7c6 Sa1b3 d7d5 c4c5 b7b6
   5	00:01	     130.841	174.000	+2,64	Sc1d3 c7c6 Sg1f3 d7d5 e2e3 b7b6
   5	00:01	     130.841	174.000	+2,83	Sa1b3 c7c6 d2d4 d7d5 Sb1c3 b7b6
   6	00:02	     319.137	206.000	+2,34	Sa1b3 c7c6 Sd1c3 f7f5 Sb3d4 b7b5 a2a3
   6	00:02	     319.137	206.000	+2,58	Sc1d3 c7c6 e2e3 e7e6 f2f4 g7g5 f4xg5
   7	00:07	   1.660.995	224.000	+2,52	Sc1d3 c7c6 f2f4 e7e6 Sg1h3 g7g5 f4xg5 c6c5
   7	00:10	   2.331.886	224.000	+2,54	d2d4 c7c6 Sc1d3 e7e6 c2c3 g7g5 Ke1d2 e6e5 Kd2e3
   8	00:17	   3.774.991	224.000	+2,39	d2d4 b7b6 Sd1e3 e7e6 Sc1d3 Lc8a6 Ke1d2 La6xd3 Kd2xd3
   8	00:23	   5.132.348	224.000	+2,43	Sd1e3 b7b6 c2c4 e7e6 Sc1d3 c7c6 h2h3 c6c5 Sa1b3
   8	00:30	   6.834.410	225.000	+2,46	e2e4 b7b6 f2f3 e7e6 Sg1h3 c7c6 Sb1c3 Lc8a6 Sa1b3
   9	00:44	  10.021.051	226.000	+2,30	e2e4 b7b5 f2f3 e7e6 Sb1c3 Lf8c5 Sd1f2 g7g6 Sc1d3 Lc5xf2+ Sh1xf2
  10	01:18	  17.845.462	227.000	+2,14	e2e4 b7b5 f2f3 e7e6 Sh1f2 c7c6 c2c3 Ld8c7 g2g3 f7f5 Sa1c2
  11	03:15	  44.393.519	227.000	+1,90	e2e4 b7b5 d2d3 e7e6 Sa1b3 c7c6 c2c3 c6c5 f2f3 d7d5 Sb3a1 d5d4
What is particularly interesting about this is that the actual position evaluation is 'only' +0.93, but the engine seems to find some kind of advantage in the lines searched. I'll have a look at this, but I suspect that the different minimum mobility penalties for bishops and knights may cause problems.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Baffling test position

Post by bob »

This might be a case where time really becomes a factor. With that many knights against 3 queens, tactics become all-important. I can't imagine that 3 Q's vs 6 N's can't win, unless the 3Q side can't search deep enough to avoid the ridiculous number of fork attempts white can create.
metax
Posts: 344
Joined: Wed Sep 23, 2009 5:56 pm
Location: Germany

Re: Baffling test position

Post by metax »

bob wrote:This might be a case where time really becomes a factor. With that many knights against 3 queens, tactics become all-important. I can't imagine that 3 Q's vs 6 N's can't win, unless the 3Q side can't search deep enough to avoid the ridiculous number of fork attempts white can create.
But if Stockfish loses to a completely new engine with 5x time odds in a position which it evaluates at +14 at the beginning? I couldn't believe it at first, either.
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Baffling test position

Post by zamar »

hgm wrote: Yet Stockfish is totally crushed:
Stockfish has a second-order material evaluation, which gives bonuses/penalties for each queen pair, knight pair, knight-pawn combination and queen-pawn combination. Needless to say, these statistical corrections were obtained from usual chess games/positions and can hurt a lot in irrational positions like this.
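As a rough illustration of why pairwise corrections can misbehave with unusual material, here is a minimal sketch of a second-order term. It is not Stockfish's actual formula: the weights in `PAIR_WEIGHTS` are made-up placeholders and all names are hypothetical. The only point is that the number of pairs grows quadratically with the piece count, so corrections tuned on one or two knights get multiplied many times over with six.

```python
# Hypothetical pairwise weights in centipawns; NOT taken from any real engine.
PAIR_WEIGHTS = {
    ("N", "N"): -8,   # per knight pair
    ("Q", "Q"): -10,  # per queen pair
    ("N", "P"): 3,    # per knight-pawn combination
    ("Q", "P"): 1,    # per queen-pawn combination
}

def pair_count(counts, a, b):
    """Number of (a, b) combinations among one side's pieces."""
    if a == b:
        n = counts.get(a, 0)
        return n * (n - 1) // 2          # unordered pairs of the same piece type
    return counts.get(a, 0) * counts.get(b, 0)

def second_order(counts):
    """Sum of all pairwise corrections for one side's piece counts."""
    return sum(w * pair_count(counts, a, b) for (a, b), w in PAIR_WEIGHTS.items())

normal_side = {"N": 2, "Q": 1, "P": 8}   # ordinary chess material
six_knights = {"N": 6, "P": 8}           # material from this thread

print(pair_count(normal_side, "N", "N"), pair_count(six_knights, "N", "N"))
print(second_order(normal_side), second_order(six_knights))
```

Two knights form a single pair, six knights form fifteen, so whatever per-pair value was fitted on normal games is applied fifteen times in a regime it was never measured in.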
Joona Kiiski
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Baffling test position

Post by hgm »

It is definitely true that some of Stockfish's exquisitely tuned algorithms must badly backfire in this unusual position. I tried a few other engines now, and most handle it much better than Stockfish does. Especially Daydreamer seems to be quite strong with the Queens. Problem is that it loses about half the games on time, at 40/2, and I don't like to go to sudden-death TC to avoid that, because there play tends to speed up at the end. And these games are then in a phase that is very sensitive to tactical blunders from either side (there is always very deep tactics), and it becomes more a lottery than anything else.

When I play my new engine with 6 Knights against Daydreamer with 3 Queens, it seems the Queens (slightly) have the upper hand. Not as much as in the Daydreamer self-play reported earlier, but still a clear win. This could be due, of course, to Daydreamer generally being stronger than my new engine, and so it does not prove very much.

I am trying now with my new engine tuned for this (i.e. N=5) in self play. Then the Queens seem to have an easy win too: because it is less hesitant than normal engines to trade Q vs 2N, it can find easy paths to a win, which normal engines would reject. Do that trade two times, and the remaining 2N cannot defend all Pawns against the remaining Q, and quickly lose. Normal engines just make life difficult for themselves by their unwillingness to trade Q vs 2N.

This strategy would not work in the case of 7N vs 3Q: the remaining Q vs 3N in the presence of many Pawns is usually badly lost. So with 7 Knights it is usually an easy win for the Knights.
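The trade-down arithmetic in the last two paragraphs can be sketched as follows. The helper names are hypothetical, and the N=3 value used for a "normal" engine is an assumption; Q=9 and N=5 are the values mentioned earlier in the thread.

```python
def trade_gain(queen, knight, knights_per_queen=2):
    """Material the queen side gains (by its own values) giving one queen for some knights."""
    return knights_per_queen * knight - queen

def after_trades(queens, knights, trades, knights_per_queen=2):
    """Material left after `trades` exchanges of one queen for `knights_per_queen` knights."""
    if trades > queens or trades * knights_per_queen > knights:
        raise ValueError("not enough material for that many trades")
    return queens - trades, knights - trades * knights_per_queen

# How the queen side scores the Q-for-2N trade under each knight value.
print("N=3:", trade_gain(9, 3))   # assumed conventional knight value
print("N=5:", trade_gain(9, 5))   # knight value tuned for these positions

# What is left on the board after trading Q for 2N twice.
print("3Q vs 6N ->", after_trades(3, 6, 2))
print("3Q vs 7N ->", after_trades(3, 7, 2))
```

An engine that scores the trade as a loss avoids it, while one using N=5 sees a small gain and takes it. Starting from 6 Knights this leaves Q vs 2N; starting from 7 it leaves Q vs 3N, the ending described above as usually lost for the Queen.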
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Baffling test position

Post by bob »

metax wrote:
bob wrote:This might be a case where time really becomes a factor. With that many knights against 3 queens, tactics become all-important. I can't imagine that 3 Q's vs 6 N's can't win, unless the 3Q side can't search deep enough to avoid the ridiculous number of fork attempts white can create.
But if Stockfish loses to a completely new engine with 5x time odds in a position which it evaluates at +14 at the beginning? I couldn't believe it at first, either.
What time odds? It looked like 40 moves in 5 minutes to me for both sides...
Aaron Becker
Posts: 292
Joined: Tue Jul 07, 2009 4:56 am

Re: Baffling test position

Post by Aaron Becker »

hgm wrote:Problem is that it loses about half the games on time, at 40/2, and I don't like to go to sudden-death TC to avoid that, because there play tends to speed-up at the end.
Martin Thoresen has also reported losses on time with Daydreamer 1.7, so this is a common problem that I'd like to diagnose. Are you using Gaviota TBs? I've been able to reproduce the problem with Gaviota TB support turned on and endgame thread pool size > 1, but not under other conditions. If you can give me the full details of your setup, I'd appreciate the help it will give me in debugging.
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Baffling test position

Post by hgm »

I have no tablebases installed on my system. I just downloaded and unzipped the Daydreamer 1.7 package, moved the 32-bit executable and pthread DLL to \cygwin\home\engines\daydreamer (to avoid having to type very long filenames), and ran it under WinBoard + Polyglot. Hash size = 64MB, EGTB cache = 4MB. I see the Nalimov Path is set to c:\egtb, which does not exist. I don't know if that could be a problem.

Virtually all time losses occur on the 39th, 40th or 41st move, when I am playing 40 moves per session, so I suppose it is more likely a mistuning of the time management than a genuine engine crash. I am playing one non-ponder game on a dual, so one CPU is normally idle, and I cannot imagine there would be any scheduling delays.
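For readers unfamiliar with why the moves around the time control are the fragile ones, here is a minimal sketch of a moves-per-session time allocator. It is not Daydreamer's code: the function names and the 50 ms per-move reserve are assumptions chosen for illustration.

```python
def moves_to_go(move_number, moves_per_session=40):
    """Moves left until the next time control, counting the move about to be played."""
    return moves_per_session - (move_number - 1) % moves_per_session

def allocate(remaining_ms, to_go, overhead_ms=50):
    """Budget for the next move, holding back `overhead_ms` for every move still to play."""
    to_go = max(to_go, 1)                      # never divide by zero at the control
    usable = remaining_ms - overhead_ms * to_go
    return max(usable // to_go, 0)

# At 40 moves in 2 minutes the budget is comfortable early in the session...
print(allocate(120_000, moves_to_go(1)))
# ...but on move 40 the whole remaining clock rides on one move.
print(allocate(1_000, moves_to_go(40)))
```

In a scheme like this, any miscount of the moves left, or a reserve smaller than the real GUI and adapter latency, only shows up on the last move or two of a session, when almost no clock is left to absorb it. That would be consistent with losses clustering on moves 39 to 41, though it is only one possible explanation.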
Aaron Becker
Posts: 292
Joined: Tue Jul 07, 2009 4:56 am

Re: Baffling test position

Post by Aaron Becker »

hgm wrote:I have no tablebases installed on my system. I just downloaded and unzipped the Daydreamer 1.7 package, moved the 32-bit executable and pthread DLL to \cygwin\home\engines\daydreamer (to avoid having to type very long filenames), and ran it under WinBoard + Polyglot. Hash size = 64MB, EGTB cache = 4MB. I see the Nalimov Path is set to c:\egtb, which does not exist. I don't know if that could be a problem.

Virtually all time losses occur on the 39th, 40th or 41st move, when I am playing 40 moves per session, so I suppose it is more likely a mistuning of the time management than a genuine engine crash. I am playing one non-ponder game on a dual, so one CPU is normally idle, and I cannot imagine there would be any scheduling delays.
Thanks for the info. I'll try to reproduce this problem myself and release a fix in the near future.