Linux port of newer versions of TogaII

Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: Linux port of newer versions of TogaII

Post by Zach Wegner »

Michel wrote: It seems gdb is thoroughly confused. Presumably the segfault does not occur at the actual memory access error (this very often happens in my experience).
Really? I can't say I have ever seen it. This is just from BSD experience though, not Linux, and usually when I debug I have optimizations off.

To me it doesn't look like that is happening, but rather the board is corrupted and it's causing segfaults all over the place. Since it happens on the second move, and only on child threads, I'd imagine it is related to copying the board state to the children before starting the search. You can see in the last backtrace a bunch of gibberish being passed around (from=102920, to=153, colour=-2, etc.).
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Linux port of newer versions of TogaII

Post by Michel »

Zach Wegner wrote: Really? I can't say I have ever seen it. This is just from BSD experience though, not Linux, and usually when I debug I have optimizations off.
Well it is quite possible that an incorrect memory access still falls within the memory owned by a process. Hence it does not cause a segfault. But the incorrect memory access may still cause corruption leading to a segfault later. A malloc debugger (like efence) is supposed to catch the illegal memory access right when it happens.
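To make that concrete, here is a minimal sketch of a stray write that stays inside memory the process owns. This is not Toga code: `Arena`, `table_write`, `board_read` and the sizes are made up for illustration, and the two "regions" live in one array so the example itself stays well defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical layout: a 4-entry table followed directly by a 4-entry board.
const std::size_t TableSize = 4;
const std::size_t BoardSize = 4;

struct Arena {
    // One allocation standing in for "memory owned by the process".
    std::array<int, TableSize + BoardSize> cells{};

    // Deliberately unchecked against TableSize, like a raw array access.
    void table_write(std::size_t i, int value) { cells[i] = value; }

    int board_read(std::size_t i) const { return cells[TableSize + i]; }
};

// Writes past the end of the table and returns the board slot that got hit.
inline int demo_stray_write() {
    Arena arena;
    arena.table_write(5, -2);   // index 5 is outside the 4-entry table
    return arena.board_read(1); // no fault occurred, but the board changed
}
```

Nothing faults at the bad write; the damage only shows up when the board is read later, which is why the crash site and the bug site can differ.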
Zach Wegner wrote: To me it doesn't look like that is happening, but rather the board is corrupted and it's causing segfaults all over the place. Since it happens on the second move, and only on child threads, I'd imagine it is related to copying the board state to the children before starting the search. You can see in the last backtrace a bunch of gibberish being passed around (from=102920, to=153, colour=-2, etc.).
Possibly. But why does it not occur on Linux? With some difficulty I turned the ASSERTs back on in the source (they don't seem to have been used for a long time) and there are plenty of "board_is_ok" checks and they all pass.

As you seem to be running BSD, perhaps you could do a quick test to see if the problem occurs on your system as well?

PS. I posted some observations on Toga's multi threading at the Toga developer forum.
http://www.computerchess.info/tdbb/phpBB3/
Perhaps you care to comment?
krazyken

Re: Linux port of newer versions of TogaII

Post by krazyken »

Efence gives me the same as above. Isn't this the part that is interesting?

Code: Select all

0x00019188 in alists_hidden (alists=0xb0080a88, board=0x38b48, from=102920, to=153) at see.cpp:308 
308      inc = DELTA_INC_LINE(to-from); 
If I read this right, to - from = 153 - 102920 = -102767,
and from attack.h:

Code: Select all

#define DELTA_INC_LINE(delta)             (DeltaIncLine[DeltaOffset+(delta)])
vector.h:

Code: Select all

const int DeltaOffset = 119;
I bet we are out of bounds on this array.
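A quick sketch of the arithmetic, using the values from the gdb frame above. The table size `DeltaNb` is an assumption (2 * 119 + 1 entries, covering deltas -119..+119); the real declaration of `DeltaIncLine` is not shown in this thread. `delta_index`, `delta_is_ok` and `crash_delta` are helper names invented here.

```cpp
#include <cassert>

// Quoted from vector.h in this thread.
const int DeltaOffset = 119;

// Assumption: the table holds one entry per delta in -119..+119.
const int DeltaNb = 2 * DeltaOffset + 1;

// The index that DELTA_INC_LINE(delta) would use.
inline int delta_index(int delta) {
    return DeltaOffset + delta;
}

// True if that index stays inside the (assumed) table.
inline bool delta_is_ok(int delta) {
    const int idx = delta_index(delta);
    return idx >= 0 && idx < DeltaNb;
}

// to - from with the arguments gdb reported: to=153, from=102920.
inline int crash_delta() {
    return 153 - 102920;
}
```

With these numbers the index comes out at 119 - 102767 = -102648, far below the start of the table, so the read at see.cpp:308 lands well outside the array.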
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Linux port of newer versions of TogaII

Post by Michel »

You are right that this is an illegal access.

However, as Zach pointed out, it seems that the argument colour=-2 in the function invocation

Code: Select all

see_rec (alists=0xb0080a88, board=0x38b48, colour=-2, to=153, piece_value=0)
can only come from board corruption (colour is the colour of a piece, which is supposed to be 0 or 1; the relevant macros are in colour.h).

I fear that the only solution is to put in printfs with board_is_ok at the places which occur in the stack trace, to see precisely when and where the board corruption occurs.
WARNING: the default board_is_ok does nothing. You have to set UseSlowDebug = true in board.cpp.
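Something along these lines is what I have in mind. It is only a sketch: `board_t` and `board_is_ok` here are small stand-ins so the snippet compiles on its own, and `check_board_at` and `CHECK_BOARD` are hypothetical names, not anything in the Toga source.

```cpp
#include <cassert>
#include <cstdio>

// Stand-in for Toga's board type; the real struct is much larger.
struct board_t { int turn; };

// Stand-in check. The real board_is_ok() should be used instead,
// with UseSlowDebug enabled so that it actually tests something.
inline bool board_is_ok(const board_t* board) {
    return board != nullptr && (board->turn == 0 || board->turn == 1);
}

// Hypothetical helper: print the checkpoint location when the board is bad.
inline bool check_board_at(const board_t* board, const char* file, int line) {
    const bool ok = board_is_ok(board);
    if (!ok) std::fprintf(stderr, "board corrupt at %s:%d\n", file, line);
    return ok;
}

// Drop this at each function that shows up in the stack trace.
#define CHECK_BOARD(board) check_board_at((board), __FILE__, __LINE__)
```

The first checkpoint that prints narrows down where the corruption is introduced, rather than where it finally crashes.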

Regards,
Michel
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Linux port of newer versions of TogaII

Post by Michel »

A quick observation: if you apply the macros, then -2 seems to be the opposite colour of a None piece (presumably encoded by 0). The None piece itself has colour -1.

Reading the source of see_move, this would seem to imply that see_move is called with a move from a square that is empty. So it is not necessary for the board to be corrupted to explain this behaviour.
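For reference, a small sketch of that calculation. The macros are copied from colour.h as quoted elsewhere in this thread; `EmptyPiece`, `attacker_colour` and `defender_colour` are names invented here, and the value 0 for an empty square is the same assumption as above.

```cpp
#include <cassert>

// As quoted from colour.h in this thread.
#define PIECE_COLOUR(piece)      (((piece)&3)-1)
const int White = 0;
const int Black = 1;
#define COLOUR_OPP(colour)      ((colour)^(White^Black))

// Assumption: an empty square is encoded as piece value 0.
const int EmptyPiece = 0;

// Colour of the moving piece, as see_move derives "att".
inline int attacker_colour(int piece) {
    return PIECE_COLOUR(piece);
}

// Colour of the other side, as see_move derives "def".
inline int defender_colour(int piece) {
    return COLOUR_OPP(PIECE_COLOUR(piece));
}
```

For piece 0 this gives att = (0 & 3) - 1 = -1 and def = -1 ^ 1 = -2 on a two's complement machine, which matches the colour=-2 seen in the see_rec frame.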
krazyken

Re: Linux port of newer versions of TogaII

Post by krazyken »

Here is the fastest crash I can reproduce: colour=33554431

Code: Select all

Mulert:src Kenny$ gdb ./toga2
GNU gdb 6.3.50-20050815 (Apple version gdb-960) (Sun May 18 18:38:33 UTC 2008)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries ... done

(gdb) run
Starting program: /Users/Kenny/Downloads/src/toga2 
Reading symbols for shared libraries ++. done
Toga II 1.4.1SE UCI based on Fruit 2.1 by Thomas Gaksch and Fabien Letouzey. Settings by Dieter Eberle
Experimental engine by Chris Formula. Code was based on Toga II 1.4 Beta5c by Thomas Gaksch
EgbbProbe not Loaded!
go depth 1
info depth 1
info multipv 1 depth 1 seldepth 1 score cp 6 time 1 nodes 2 pv b1a3
info multipv 1 depth 1 seldepth 1 score cp 34 time 1 nodes 3 pv b1c3
info multipv 1 depth 1 seldepth 1 score cp 40 time 1 nodes 13 pv d2d4
info depth 1 seldepth 1 time 1 nodes 42 nps 0
info multipv 1 depth 2 seldepth 2 score cp 20 time 1 nodes 45 pv d2d4 d7d5
info multipv 1 depth 3 seldepth 8 score cp 34 time 2 nodes 215 pv d2d4 d7d5 g1f3
info multipv 1 depth 4 seldepth 8 score cp 20 time 3 nodes 677 pv d2d4 d7d5 g1f3 g8f6
info multipv 1 depth 5 seldepth 12 score cp 30 time 8 nodes 2541 pv d2d4 d7d5 g1f3 g8f6 b1c3
info multipv 1 depth 6 seldepth 12 score cp 20 time 14 nodes 5110 pv d2d4 d7d5 g1f3 g8f6 b1c3 b8c6
info time 28 nodes 10021 nps 0 cpuload 1000
info hashfull 1
bestmove d2d4 ponder d7d5
info multipv 1 depth 7 seldepth 13 score cp 20 time 9 nodes 13755 pv d2d4 d7d5 g1f3 g8f6 b1c3 b8c6 c1f4
info multipv 1 depth 8 seldepth 16 score cp 20 time 23 nodes 29753 pv d2d4 d7d5 g1f3 g8f6 b1c3 b8c6 c1f4 c8f5
info multipv 1 depth 9 seldepth 23 score cp 15 time 17 nodes 187796 pv d2d4 g8f6 g1f3 d7d5 c2c4 c8f5 d1b3 b7b6 c4d5 f6d5
position statpos moves d2d4

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x7028332c
[Switching to process 84358 thread 0x313]
0x00019161 in alist_add (alist=0xb0080a08, square=71, board=0x38b48) at see.cpp:357
357	   alist->square[pos] = square;
(gdb) bt
#0  0x00019161 in alist_add (alist=0xb0080a08, square=71, board=0x38b48) at see.cpp:357
#1  0x00019293 in alist_build (alist=0x48b0080a, board=0x7700038b, to=-33554432, colour=33554431) at see.cpp:277
#2  0x08000197 in ?? ()
Cannot access memory at address 0x55b0080f

Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Linux port of newer versions of TogaII

Post by Michel »

krazyken wrote: colour=33554431
I think this cannot possibly happen. I think the stack has become corrupted and gdb is confused. Did you use libefence?

Here are the macros that define the colour of a piece. I don't think they can yield colour=33554431, whatever weird value piece might have.

Code: Select all

#define PIECE_COLOUR(piece)      (((piece)&3)-1)
const int White = 0;
const int Black = 1;
#define COLOUR_OPP(colour)      ((colour)^(White^Black))
alist_build is called 4 times in see_move: twice with colour argument "def" (defender) and twice with colour argument "att" (attacker). att and def are defined by

Code: Select all

att = PIECE_COLOUR(piece);
def = COLOUR_OPP(att);
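To back that up, here is a sketch that enumerates every value att and def can take. It assumes only the macros quoted above; `possible_colours` is a helper name made up for this check. Only the two low bits of piece survive the `& 3`, so four cases cover every possible input.

```cpp
#include <cassert>
#include <set>

// As quoted from colour.h above.
#define PIECE_COLOUR(piece)      (((piece)&3)-1)
const int White = 0;
const int Black = 1;
#define COLOUR_OPP(colour)      ((colour)^(White^Black))

// Collects every att/def value the two macros can produce.
inline std::set<int> possible_colours() {
    std::set<int> seen;
    for (int low_bits = 0; low_bits < 4; ++low_bits) {
        const int att = PIECE_COLOUR(low_bits);
        const int def = COLOUR_OPP(att);
        seen.insert(att);
        seen.insert(def);
    }
    return seen;
}
```

The result is a handful of small values between -2 and 3, so a colour of 33554431 cannot come out of these macros and has to come from somewhere else, such as a damaged stack frame.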
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Linux port of newer versions of TogaII

Post by Sven »

Michel wrote:
colour=33554431
I think this cannot possibly happen. I think the stack has become corrupted and gdb is confused.
I agree, and there is another hint that seems to prove stack corruption. In the given stack trace, alist_build() calls alist_add() with a different 'alist' pointer value which is definitely wrong since the alist_add() function does not change the 'alist' pointer.

Sven
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: Linux port of newer versions of TogaII

Post by Zach Wegner »

Michel wrote: Possibly. But why does it not occur on Linux? With some difficulty I turned the ASSERTs back on in the source (they don't seem to have been used for a long time) and there are plenty of "board_is_ok" checks and they all pass.

As you seem to be running BSD, perhaps you could do a quick test to see if the problem occurs on your system as well?
I'm at work now (and running Windows, blegh), but I should be able to later today.

As for Linux/BSD differences, there are a few I can think of. Peculiarities of a pthread implementation, different stack sizes, and possibly different compiler versions (I assume gcc in both cases). It's hard to say, but based on what's been posted it looks like the stack is getting corrupted, which probably means an out-of-bounds access in an array on the stack. I'm not familiar with efence, but I rather doubt a simple library can detect that.
Michel wrote: PS. I posted some observations on Toga's multi threading at the Toga developer forum.
http://www.computerchess.info/tdbb/phpBB3/
Perhaps you care to comment?
I saw those. I've been reading there, but I've been strapped for time so I haven't gotten a chance to reply yet.
Guetti

Re: Linux port of newer versions of TogaII

Post by Guetti »

Very strange. I gave it a try on my MacBook, and although it didn't crash on the second move, it seems to alternate between colours and eventually moves for White?

Maybe a hash error that leads to board corruption?

Another thing: one thread is always running at 100% of one core, even if it's the opponent's move (this already starts when isready is entered).

Code: Select all

Toga II 1.4.1SE UCI based on Fruit 2.1 by Thomas Gaksch and Fabien Letouzey. Settings by Dieter Eberle
Experimental engine by Chris Formula. Code was based on Toga II 1.4 Beta5c by Thomas Gaksch
EgbbProbe not Loaded!
uci
id name Toga II 1.4.1SE
id author Thomas Gaksch and Fabien Letouzey
option name Hash type spin default 16 min 4 max 1024
option name Search Time type spin default 0 min 0 max 3600
option name Search Depth type spin default 0 min 0 max 20
option name Ponder type check default false
option name OwnBook type check default true
option name BookFile type string default performance.bin
option name Bitbases Path type string default c:/egbb/
option name Bitbases Cache Size type spin default 16 min 16 max 1024
option name MultiPV type spin default 1 min 1 max 10
option name NullMove Pruning type combo default Always var Always var Fail High var Never
option name NullMove Reduction type spin default 3 min 1 max 4
option name Verification Search type combo default Always var Always var Endgame var Never
option name Verification Reduction type spin default 5 min 1 max 6
option name History Pruning type check default true
option name History Threshold type spin default 70 min 0 max 100
option name Futility Pruning type check default true
option name Futility Margin type spin default 100 min 0 max 500
option name Extended Futility Margin type spin default 200 min 0 max 900
option name Delta Pruning type check default true
option name Delta Margin type spin default 50 min 0 max 500
option name Quiescence Check Plies type spin default 1 min 0 max 2
option name Material type spin default 100 min 0 max 400
option name Piece Activity type spin default 100 min 0 max 400
option name Piece Square Activity type spin default 100 min 0 max 400
option name King Safety type spin default 100 min 0 max 400
option name Pawn Structure type spin default 100 min 0 max 400
option name Passed Pawns type spin default 100 min 0 max 400
option name Toga Lazy Eval type check default true
option name Toga Lazy Eval Margin type spin default 200 min 0 max 900
option name Toga King Safety type check default false
option name Toga King Safety Margin type spin default 1700 min 500 max 3000
option name Toga Extended History Pruning type check default false
uciok
setoption name Hash value 64
isready
readyok
ucinewgame
position startpos moves g1f3
go wtime 60000 btime 60000 movestogo 40
info depth 1
info multipv 1 depth 1 seldepth 1 score cp -70 time 0 nodes 2 pv a7a5
info multipv 1 depth 1 seldepth 1 score cp -64 time 0 nodes 4 pv b7b5
info multipv 1 depth 1 seldepth 1 score cp -14 time 0 nodes 8 pv d7d5
info depth 1 seldepth 2 time 0 nodes 46 nps 0
info depth 2
info multipv 1 depth 2 seldepth 2 score cp -34 time 0 nodes 50 pv d7d5 d2d4
info depth 2 seldepth 8 time 1 nodes 206 nps 0
info depth 3
info multipv 1 depth 3 seldepth 9 score cp -20 time 0 nodes 254 pv d7d5 d2d4 g8f6
info depth 3 seldepth 9 time 1 nodes 804 nps 0
info depth 4
info multipv 1 depth 4 seldepth 9 score cp -30 time 2 nodes 607 pv d7d5 d2d4 g8f6 b1c3
info depth 4 seldepth 9 time 3 nodes 3066 nps 0
info depth 5
info multipv 1 depth 5 seldepth 11 score cp -20 time 4 nodes 2265 pv d7d5 d2d4 g8f6 b1c3 b8c6
info depth 5 seldepth 11 time 4 nodes 5580 nps 0
info depth 6
info multipv 1 depth 6 seldepth 11 score cp -20 time 7 nodes 4584 pv d7d5 d2d4 g8f6 b1c3 b8c6 c1f4
info depth 6 seldepth 13 time 10 nodes 14412 nps 0
info depth 7
info multipv 1 depth 7 seldepth 13 score cp -20 time 13 nodes 10191 pv d7d5 d2d4 g8f6 b1c3 b8c6 c1f4 c8f5
info depth 7 seldepth 13 time 16 nodes 25762 nps 0
info depth 8
info multipv 1 depth 8 seldepth 15 score cp -7 time 24 nodes 19622 pv d7d5 d2d4 g8f6 b1c3 b8c6 d1d3 c8g4 c1f4
info depth 8 seldepth 15 time 28 nodes 46694 nps 0
info depth 9
info multipv 1 depth 9 seldepth 16 score cp -23 time 38 nodes 29936 pv d7d5 d2d4 g8f6 b1c3 b8c6 c1f4 g7g6 e2e3
info depth 9 seldepth 18 time 56 nodes 87792 nps 0
info depth 10
info multipv 1 depth 10 seldepth 20 score cp -23 time 101 nodes 76644 pv d7d5 e2e3 g8f6 b1c3 b8c6 f1b5 d8d6 e1g1 a7a6 b5c6 d6c6 d2d4
info depth 10 seldepth 21 time 141 nodes 215178 nps 0
info depth 11
info multipv 1 depth 11 seldepth 21 score cp -23 time 204 nodes 152797 pv d7d5 e2e3 g8f6 b1c3 b8c6 f1b5 d8d6 e1g1 f6g4 d2d4
info depth 11 seldepth 21 time 317 nodes 474450 nps 0
info depth 12
info multipv 1 depth 12 seldepth 25 score cp -17 time 607 nodes 463004 pv d7d5 d2d4 b8c6 c2c4 e7e6 c4d5 e6d5 b1c3 g8f6 d1b3 f8b4 c1f4
info depth 12 seldepth 25 time 759 nodes 1159168 nps 0
info depth 13
info time 1001 nodes 1520000 nps 1518741 cpuload 1000
info hashfull 57
info multipv 1 depth 13 seldepth 29 score cp -25 time 1112 nodes 857205 pv d7d5 d2d4 g8f6 b1c3 b8c6 e2e3 c8g4 f1b5 e7e6 e1g1 f8d6 c1d2
info currmove b8c6 currmovenumber 2
info currmove g8f6 currmovenumber 3
info currmove d7d6 currmovenumber 4
info currmove e7e6 currmovenumber 5
info currmove b8a6 currmovenumber 6
info currmove g8h6 currmovenumber 7
info currmove f7f5 currmovenumber 8
info currmove b7b5 currmovenumber 9
info currmove b7b6 currmovenumber 10
info currmove g7g6 currmovenumber 11
info currmove a7a5 currmovenumber 12
info currmove h7h5 currmovenumber 13
info currmove c7c5 currmovenumber 14
info currmove a7a6 currmovenumber 15
info currmove h7h6 currmovenumber 16
info currmove c7c6 currmovenumber 17
info currmove f7f6 currmovenumber 18
info currmove g7g5 currmovenumber 19
info currmove e7e5 currmovenumber 20
info depth 13 seldepth 27 time 1558 nodes 2322410 nps 1490750
info time 1558 nodes 2338084 nps 1500728 cpuload 1000
info hashfull 88
bestmove d7d5 ponder d2d4
position startpos g1f3 g8f6 d2d4
go wtime 59900 btime 57890 movestogo 39
info depth 1
info multipv 1 depth 1 seldepth 1 score cp 6 time 0 nodes 2 pv b1a3
info multipv 1 depth 1 seldepth 1 score cp 34 time 0 nodes 3 pv b1c3
info multipv 1 depth 1 seldepth 1 score cp 40 time 0 nodes 13 pv d2d4
info depth 1 seldepth 1 time 0 nodes 42 nps 0
info depth 2
info multipv 1 depth 2 seldepth 2 score cp 20 time 0 nodes 45 pv d2d4 d7d5
info depth 2 seldepth 8 time 1 nodes 232 nps 0
info depth 3
info multipv 1 depth 3 seldepth 8 score cp 34 time 1 nodes 207 pv d2d4 d7d5 g1f3
info multipv 1 depth 5 seldepth 14 score cp 0 time 3 nodes 748 pv b7b5 b8c6 e2e3 a8b8 b1c3
info depth 3 seldepth 8 time 1 nodes 762 nps 0
info depth 4
info multipv 1 depth 6 seldepth 14 score cp -3 time 4 nodes 1291 pv b7b5 b8c6 d2d4 e7e5 g1f3 e5d4 f3d4 c6d4 d1d4
info depth 4 seldepth 12 time 3 nodes 3360 nps 0
info depth 5
info multipv 1 depth 7 seldepth 14 score cp -3 time 6 nodes 3531 pv b7b5 b8c6 d2d4 e7e5 g1f3 e5d4 f3d4 c6d4 d1d4
info multipv 1 depth 7 seldepth 14 score cp 4 time 7 nodes 4583 pv f7f6 b8c6 b1c3 d7d5 d2d4 c8f5 g1f3
info depth 5 seldepth 12 time 5 nodes 5986 nps 0
info depth 6
info multipv 1 depth 8 seldepth 17 score cp 10 time 9 nodes 7288 pv f7f6 b8c6 e2e4 d7d5 d1h5 g7g6 h5d5 d8d5 e4d5 c6d4 b1a3
info depth 6 seldepth 13 time 11 nodes 17694 nps 0
info depth 7
info depth 7 seldepth 14 time 22 nodes 36900 nps 0
info depth 8
info multipv 1 depth 8 seldepth 16 score cp 21 time 28 nodes 23873 pv d2d4 d7d5 b1c3 b8c6 c1f4 g8f6 c3b5 e7e5 d4e5 f8b4 c2c3
info multipv 1 depth 9 seldepth 19 score cp -17 time 31 nodes 25390 pv f7f6 d7d5 d2d4 b8c6 b1c3 e7e5 e2e3 c8e6 g1f3
info depth 8 seldepth 16 time 41 nodes 70690 nps 0
info depth 9
info multipv 1 depth 9 seldepth 22 score cp 21 time 61 nodes 48524 pv d2d4 d7d5 b1c3 b8c6 c1f4 g8f6 c3b5 e7e5 d4e5 f8b4 c2c3
info depth 9 seldepth 22 time 116 nodes 186374 nps 0
info depth 10
info multipv 1 depth 10 seldepth 23 score cp 25 time 140 nodes 113593 pv d2d4 d7d5 g1f3 g8f6 e2e3 b8c6 f1d3 c8g4 e1g1 f6e4 d3e4 d5e4
info depth 10 seldepth 23 time 175 nodes 277062 nps 0
info depth 11
info multipv 1 depth 11 seldepth 23 score cp 24 time 268 nodes 210263 pv d2d4 d7d5 e2e3 g8f6 g1f3 c8g4 b1c3 f6e4 f1b5 b8c6 e1g1 e4c3 b2c3
info depth 11 seldepth 23 time 401 nodes 629494 nps 0
info depth 12
info multipv 1 depth 12 seldepth 23 score cp 24 time 487 nodes 383972 pv d2d4 d7d5 e2e3 g8f6 g1f3 b8c6 b1c3 c8g4 f1b5 f6e4 e1g1 e4c3 b2c3
info depth 12 seldepth 23 time 716 nodes 1119738 nps 0
info depth 13
info multipv 1 depth 13 seldepth 24 score cp 25 time 879 nodes 686363 pv d2d4 d7d5 e2e3 g8f6 g1f3 b8c6 b1c3 c8g4 f1b5 e7e6 e1g1 f8d6 c1d2
info time 1011 nodes 1580000 nps 1562692 cpuload 1000
info hashfull 65
info currmove d2d3 currmovenumber 5
info currmove e2e3 currmovenumber 6
info currmove b1a3 currmovenumber 7
info currmove g1h3 currmovenumber 8
info currmove f2f4 currmovenumber 9
info currmove b2b3 currmovenumber 10
info currmove b2b4 currmovenumber 11
info currmove g2g3 currmovenumber 12
info currmove g2g4 currmovenumber 13
info currmove a2a4 currmovenumber 14
info currmove h2h4 currmovenumber 15
info currmove c2c4 currmovenumber 16
info currmove a2a3 currmovenumber 17
info currmove h2h3 currmovenumber 18
info currmove c2c3 currmovenumber 19
info currmove f2f3 currmovenumber 20
info depth 13 seldepth 28 time 1208 nodes 1859562 nps 1539662
info time 1218 nodes 1829781 nps 1501113 cpuload 1000
info hashfull 77
bestmove d2d4 ponder d7d5