The problem with adjudication based on score

Graham Banks
Posts: 41416
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: The problem with adjudication based on score

Post by Graham Banks »

chrisw wrote: Wed Apr 17, 2019 12:14 pm
Graham Banks wrote: Wed Apr 17, 2019 8:26 am 1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rh1; 8.Rxh1, Rxh1+; 9.Ke2, Rh2; 10.Kf1, Kf6 seems to win?
nope. after 4 Rh7, white exchanges rooks and the impenetrable by stalemate fortress is on, as in the main line
Yes - I realised this too late and couldn't edit. Try this line:

1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rxg2
gbanksnz at gmail.com
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: The problem with adjudication based on score

Post by Joerg Oster »

konsolas wrote: Wed Apr 17, 2019 12:03 pm Nice! Have you considered PR'ing that change to the main version?
No.
It only fixes a corner-case without any measurable benefit.
Jörg Oster
MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: The problem with adjudication based on score

Post by MikeGL »

Dann Corbit wrote: Wed Apr 17, 2019 2:26 am Try this one in your favorite chess engine and you will see the problem immediately (unless some engine scores it zero):
[d]7r/6p1/7P/p1r1p1k1/Pp1pPp2/1PpP1Pp1/2P3P1/K6R w - -
Nice study. This is a draw where the engine thinks it's a win.
I have seen even worse, where engines think one side is winning when in fact it is the opposing side that wins.
I told my wife that a husband is like a fine wine; he gets better with age. The next day, she locked me in the cellar.
konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: The problem with adjudication based on score

Post by konsolas »

Graham Banks wrote: Wed Apr 17, 2019 12:22 pm
chrisw wrote: Wed Apr 17, 2019 12:14 pm
Graham Banks wrote: Wed Apr 17, 2019 8:26 am 1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rh1; 8.Rxh1, Rxh1+; 9.Ke2, Rh2; 10.Kf1, Kf6 seems to win?
nope. after 4 Rh7, white exchanges rooks and the impenetrable by stalemate fortress is on, as in the main line
Yes - I realised this too late and couldn't edit. Try this line:

1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rxg2
What about 5. Rxh7 Rxh7, 6. Ke2

With the original fortress.
Graham Banks
Posts: 41416
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: The problem with adjudication based on score

Post by Graham Banks »

konsolas wrote: Wed Apr 17, 2019 12:32 pm
Graham Banks wrote: Wed Apr 17, 2019 12:22 pm
chrisw wrote: Wed Apr 17, 2019 12:14 pm
Graham Banks wrote: Wed Apr 17, 2019 8:26 am 1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rh1; 8.Rxh1, Rxh1+; 9.Ke2, Rh2; 10.Kf1, Kf6 seems to win?
nope. after 4 Rh7, white exchanges rooks and the impenetrable by stalemate fortress is on, as in the main line
Yes - I realised this too late and couldn't edit. Try this line:

1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rxg2
What about 5. Rxh7 Rxh7, 6. Ke2

With the original fortress.
Yeah - it's a draw. :P
Last edited by Graham Banks on Wed Apr 17, 2019 1:04 pm, edited 1 time in total.
gbanksnz at gmail.com
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: The problem with adjudication based on score

Post by Raphexon »

While I agree with you, these situations are very rare and work both ways.
Adjudication massively speeds up the process of testing, and if only 1% of the games produce wrong results it's still worth adjudicating.

Speed vs accuracy.
gordonr
Posts: 194
Joined: Thu Aug 06, 2009 8:04 pm
Location: UK

Re: The problem with adjudication based on score

Post by gordonr »

Raphexon wrote: Wed Apr 17, 2019 1:02 pm While I agree with you, these situations are very rare and work both ways.
Adjudication massively speeds up the process of testing, and if only 1% of the games produce wrong results it's still worth adjudicating.

Speed vs accuracy.
I agree with your point that these situations are so rare that it's not worth slowing down the testing. However, I'm not even sure it is a wrong result. If the White player wrongly evaluates the position as "clearly lost", enough to effectively resign, then I think a win for Black is the correct result. An engine's rating should factor in positions where it thinks it is clearly lost even when the position is a draw. Then if the engine improves its evaluation of such positions, its rating will increase.
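To make the trade-off concrete, here is a minimal sketch of what a score-based adjudication rule typically looks like. It is not taken from any particular tournament manager: the names `AdjudicationRule`, `resignScoreCp`, `resignMoves` and `adjudicate` are hypothetical, and the thresholds are placeholder values. It assumes the scores from both engines have already been converted to White's point of view.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Result { None, WhiteWins, BlackWins };

// Hypothetical settings; real tournament managers expose their own options.
struct AdjudicationRule {
    int resignScoreCp = 900;  // margin in centipawns both engines must report
    int resignMoves   = 5;    // ...for this many consecutive full moves
};

// whitePovScores holds one entry per ply: the score reported by the engine
// that just moved, already converted to White's point of view.
Result adjudicate(const std::vector<int>& whitePovScores,
                  const AdjudicationRule& rule) {
    assert(rule.resignMoves > 0 && rule.resignScoreCp > 0);

    // Look at the last N full moves, i.e. 2 * N plies, so both engines agree.
    const std::size_t needed = static_cast<std::size_t>(rule.resignMoves) * 2;
    if (whitePovScores.size() < needed)
        return Result::None;

    bool whiteWinning = true;
    bool blackWinning = true;
    for (std::size_t i = whitePovScores.size() - needed;
         i < whitePovScores.size(); ++i) {
        whiteWinning = whiteWinning && whitePovScores[i] >=  rule.resignScoreCp;
        blackWinning = blackWinning && whitePovScores[i] <= -rule.resignScoreCp;
    }

    if (whiteWinning) return Result::WhiteWins;
    if (blackWinning) return Result::BlackWins;
    return Result::None;
}
```

A rule like this never looks at the board, only at the reported scores. If both engines keep reporting a large score in a fortress position, the game is recorded as decisive, which is exactly the case being debated in this thread.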
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: The problem with adjudication based on score

Post by chrisw »

Graham Banks wrote: Wed Apr 17, 2019 12:22 pm
chrisw wrote: Wed Apr 17, 2019 12:14 pm
Graham Banks wrote: Wed Apr 17, 2019 8:26 am 1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rh1; 8.Rxh1, Rxh1+; 9.Ke2, Rh2; 10.Kf1, Kf6 seems to win?
nope. after 4 Rh7, white exchanges rooks and the impenetrable by stalemate fortress is on, as in the main line
Yes - I realised this too late and couldn't edit. Try this line:

1.hxg7, Rg8; 2.Kb1, Rc7; 3.Kc1, R8xg7; 4. Kd1, Rh7; 5.Rg1, Rh2; 6.Ke2, Rgh7; 7.Kf1, Rxg2
nothing helps. white always exchanges rooks, and it’s a fortress. saccing the remaining rook for the g-pawn also doesn’t work because of the opposition, nor does saccing the g-pawn after that: still the opposition.
Funny how humans can reduce the whole problem down to a few logical concepts, but nobody has found a way to do it algorithmically. Interestingly enough, an algorithmic part-solution to many of these fortress positions (without sacrificing speed) is staring programmers in the face, but I don’t think anyone has ever realised it.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: The problem with adjudication based on score

Post by jp »

Dann Corbit wrote: Wed Apr 17, 2019 2:26 am 7r/6p1/7P/p1r1p1k1/Pp1pPp2/1PpP1Pp1/2P3P1/K6R w - -
Dann, who first came up with this position?
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: The problem with adjudication based on score

Post by MikeB »

Joerg Oster wrote: Wed Apr 17, 2019 12:01 pm
konsolas wrote: Wed Apr 17, 2019 11:48 am Do any engines correctly score

[d]4k3/rr6/2p1p1p1/1pPpPpPp/pP1P1P1P/P7/8/3K4 w - - 0 1

to be a draw?
Yes. :D
Just recently added a 1st blockade detection rule to my SF fork.
Patch is here: https://github.com/joergoster/Stockfish ... 020293f1ef

Code:

position fen 4k3/rr6/2p1p1p1/1pPpPpPp/pP1P1P1P/P7/8/3K4 w - - 0 1
d

 +---+---+---+---+---+---+---+---+
 |   |   |   |   | k |   |   |   |
 +---+---+---+---+---+---+---+---+
 | r | r |   |   |   |   |   |   |
 +---+---+---+---+---+---+---+---+
 |   |   | p |   | p |   | p |   |
 +---+---+---+---+---+---+---+---+
 |   | p | P | p | P | p | P | p |
 +---+---+---+---+---+---+---+---+
 | p | P |   | P |   | P |   | P |
 +---+---+---+---+---+---+---+---+
 | P |   |   |   |   |   |   |   |
 +---+---+---+---+---+---+---+---+
 |   |   |   |   |   |   |   |   |
 +---+---+---+---+---+---+---+---+
 |   |   |   | K |   |   |   |   |
 +---+---+---+---+---+---+---+---+

Fen: 4k3/rr6/2p1p1p1/1pPpPpPp/pP1P1P1P/P7/8/3K4 w - - 0 1
PositionKey: 00FEC3309217ABF3
MaterialKey: FB26979F6BE2A3DB
PawnKey:     A90DE04509EB6C76
Checkers: 
eval
     Term    |     White     |     Black     |     Total    
             |   MG     EG   |   MG     EG   |   MG     EG  
 ------------+---------------+---------------+--------------
    Material |   4.69   6.52 |  14.57  16.83 |  -9.88 -10.31
   Imbalance |   0.57   0.57 |   0.52   0.52 |   0.05   0.05
  Initiative |   ----   ---- |   ----   ---- |   0.00  -0.07
       Pawns |   0.79   0.03 |   0.69  -0.06 |   0.10   0.09
     Knights |   0.00   0.00 |   0.00   0.00 |   0.00   0.00
     Bishops |   0.00   0.00 |   0.00   0.00 |   0.00   0.00
       Rooks |   0.00   0.00 |   0.00   0.00 |   0.00   0.00
      Queens |   0.00   0.00 |   0.00   0.00 |   0.00   0.00
    Mobility |   0.00   0.00 |   0.22   1.03 |  -0.22  -1.03
 King safety |  -0.36  -0.18 |  -0.41  -0.12 |   0.06  -0.06
     Threats |   0.05   0.05 |   0.00   0.00 |   0.05   0.05
      Passed |   0.00   0.00 |   0.00   0.00 |   0.00   0.00
       Space |   0.00   0.00 |   0.00   0.00 |   0.00   0.00
 ------------+---------------+---------------+--------------
       Total |   ----   ---- |   ----   ---- |  -9.83 -11.26

Total evaluation: 0.10 (white side)

info depth 31 seldepth 32 multipv 1 score cp -10 nodes 26051077 nps 2604847 hashfull 152 tbhits 0 time 10001 pv d1e2 b7g7 e2f1 a7f7 f1g1 f7b7 g1f2 e8f8 f2g2 b7f7 g2h2 f7c7 h2g2 f8g8 g2f1 c7f7 f1g1 g8h8 g1g2 f7b7 g2f1 h8h7 f1g1 g7d7 g1h2 d7c7 h2g2 c7f7 g2h1 h7g8 h1g1
bestmove d1e2 ponder b7g7

Nice - I'm trying to merge it into the current SF code and it's not working as expected. Any suggestions on what I have below?

Code:

template<Tracing T>
ScaleFactor Evaluation<T>::scale_factor(Value eg) const {

    Color strongSide = eg > VALUE_DRAW ? WHITE : BLACK;
    int sf = me->scale_factor(pos, strongSide);

    // Try to handle a fully blocked position with all pawns still
    // on the board and directly blocked by their counterpart,
    // and all remaining pieces on their respective side.
    // Test position r7/1b1r4/k1p1p1p1/1p1pPpPp/p1PP1P1P/PP1K4/8/4Q3 w - - bm Qa5+
    if (   pos.count<PAWN>() == 16
        && popcount(shift<NORTH>(pos.pieces(WHITE, PAWN)) & pos.pieces(BLACK, PAWN)) == 8)
    {
        Bitboard b, Camp[COLOR_NB];

        // Camp[c] collects every square behind a pawn of color c,
        // i.e. the squares in front of it from the opponent's point of view.
        for (Color c : { WHITE, BLACK })
        {
            b = pos.pieces(c, PAWN);
            Camp[c] = 0;

            while (b)
            {
                Square s = pop_lsb(&b);
                Camp[c] |= forward_file_bb(~c, s);
            }
        }

        // Draw only if no piece of either side stands behind the enemy pawns.
        if (   !(pos.pieces(WHITE) & Camp[BLACK])
            && !(pos.pieces(BLACK) & Camp[WHITE]))
            return SCALE_FACTOR_DRAW;
    }

    // If scale is not already specific, scale down the endgame via general heuristics
    if (sf == SCALE_FACTOR_NORMAL)
    {
        if (   pos.opposite_bishops()
            && pos.non_pawn_material(WHITE) == BishopValueMg
            && pos.non_pawn_material(BLACK) == BishopValueMg)
            sf = 16 + 4 * pe->passed_count();
        else
            sf = std::min(40 + (pos.opposite_bishops() ? 2 : 7) * pos.count<PAWN>(strongSide), sf);
    }

    return ScaleFactor(sf);
}
Thanks in advance!
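For anyone who wants to test the blockade condition outside the engine, here is a rough standalone sketch of the same check. It assumes plain 64-bit bitboards with bit 0 = a1 and bit 63 = h8 instead of Stockfish's `Position` class. The helpers `sq`, `count_bits`, `fill_north`, `fill_south` and `is_full_blockade` are hypothetical names made up for this example, not Stockfish functions.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumption: bit 0 = a1, bit 7 = h1, bit 63 = h8

// Single-square bitboard from algebraic coordinates, e.g. sq('d', 1).
inline Bitboard sq(char file, int rank) {
    assert(file >= 'a' && file <= 'h' && rank >= 1 && rank <= 8);
    return Bitboard(1) << ((rank - 1) * 8 + (file - 'a'));
}

inline int count_bits(Bitboard b) {
    int n = 0;
    for (; b; b &= b - 1) ++n;  // clear the lowest set bit each round
    return n;
}

// All squares strictly north (toward rank 8) of any set bit, on the same file.
inline Bitboard fill_north(Bitboard b) {
    b <<= 8;
    b |= b << 8; b |= b << 16; b |= b << 32;
    return b;
}

// All squares strictly south (toward rank 1) of any set bit, on the same file.
inline Bitboard fill_south(Bitboard b) {
    b >>= 8;
    b |= b >> 8; b |= b >> 16; b |= b >> 32;
    return b;
}

// whitePieces / blackPieces are ALL pieces of that color, pawns and king included.
bool is_full_blockade(Bitboard whitePawns, Bitboard blackPawns,
                      Bitboard whitePieces, Bitboard blackPieces) {
    // All 16 pawns still on the board...
    if (count_bits(whitePawns | blackPawns) != 16) return false;
    // ...and every white pawn has a black pawn directly in front of it.
    if (count_bits((whitePawns << 8) & blackPawns) != 8) return false;

    Bitboard whiteCamp = fill_south(whitePawns);  // squares behind White's pawns
    Bitboard blackCamp = fill_north(blackPawns);  // squares behind Black's pawns

    // No piece may stand behind the enemy pawns.
    return !(whitePieces & blackCamp) && !(blackPieces & whiteCamp);
}
```

Fed with konsolas' position (4k3/rr6/2p1p1p1/1pPpPpPp/pP1P1P1P/P7/8/3K4), this returns true. Moving one of the black rooks behind the white pawns, for example to a1, makes it return false. Note that this is only the static condition from the quoted code; it says nothing about how the result is used in the evaluation or search.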