Bug in xboard 4.4.4; banned for ICC rated play!

Discussion of chess software programming and technical issues.

Moderator: Ras

marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: xboard debug file w/FICS

Post by marcelk »

hgm wrote: Yes, but the purpose is not to write a new malloc, but to reproduce an error that occurs with the existing one, and print diagnostics when it occurs. So in any case I want to do exactly the same malloc calls as would be done when you call it directly.
Mac OS X has a paranoid malloc/free built in; you just have to activate it with environment variables.

Code:

DEBUGGING ALLOCATION ERRORS
     A number of facilities are provided to aid in debugging allocation errors
     in applications.  These facilities are primarily controlled via environ-
     ment variables.  The recognized environment variables and their meanings
     are documented below.

ENVIRONMENT
     The following environment variables change the behavior of the alloca-
     tion-related functions.

     MallocLogFile <f>            Create/append messages to the given file
                                  path <f> instead of writing to the standard
                                  error.

     MallocGuardEdges             If set, add a guard page before and after
                                  each large block.

     MallocDoNotProtectPrelude    If set, do not add a guard page before large
                                  blocks, even if the MallocGuardEdges envi-
                                  ronment variable is set.

     MallocDoNotProtectPostlude   If set, do not add a guard page after large
                                  blocks, even if the MallocGuardEdges envi-
                                  ronment variable is set.

     MallocStackLogging           If set, record all stacks, so that tools
                                  like leaks can be used.

     MallocStackLoggingNoCompact  If set, record all stacks in a manner that
                                  is compatible with the malloc_history pro-
                                  gram.

     MallocStackLoggingDirectory  If set, records stack logs to the directory
                                  specified instead of saving them to the
                                  default location (/tmp).

     MallocScribble               If set, fill memory that has been allocated
                                  with 0xaa bytes.  This increases the likeli-
                                  hood that a program making assumptions about
                                  the contents of freshly allocated memory
                                  will fail.  Also if set, fill memory that
                                  has been deallocated with 0x55 bytes.  This
                                  increases the likelihood that a program will
                                  fail due to accessing memory that is no
                                  longer allocated.

     MallocCheckHeapStart <s>     If set, specifies the number of allocations
                                  <s> to wait before beginning periodic heap
                                  checks every <n> as specified by
                                  MallocCheckHeapEach.  If
                                  MallocCheckHeapStart is set but
                                  MallocCheckHeapEach is not specified, the
                                  default check repetition is 1000.

     MallocCheckHeapEach <n>      If set, run a consistency check on the heap
                                  every <n> operations.  MallocCheckHeapEach
                                  is only meaningful if MallocCheckHeapStart
                                  is also set.

     MallocCheckHeapSleep <t>     Sets the number of seconds to sleep (waiting
                                  for a debugger to attach) when
                                  MallocCheckHeapStart is set and a heap cor-
                                  ruption is detected.  The default is 100
                                  seconds.  Setting this to zero means not to
                                  sleep at all.  Setting this to a negative
                                  number means to sleep (for the positive num-
                                  ber of seconds) only the very first time a
                                  heap corruption is detected.

     MallocCheckHeapAbort <b>     When MallocCheckHeapStart is set and this is
                                  set to a non-zero value, causes abort(3) to
                                  be called if a heap corruption is detected,
                                  instead of any sleeping.

     MallocErrorAbort             If set, causes abort(3) to be called if an
                                  error was encountered in malloc(3) or
                                  free(3) , such as a calling free(3) on a
                                  pointer previously freed.

     MallocCorruptionAbort        Similar to MallocErrorAbort but will not
                                  abort in out of memory conditions, making it
                                  more useful to catch only those errors which
                                  will cause memory corruption.  MallocCorrup-
                                  tionAbort is always set on 64-bit processes.

     MallocHelp                   If set, print a list of environment vari-
                                  ables that are paid heed to by the alloca-
                                  tion-related functions, along with short
                                  descriptions.  The list should correspond to
                                  this documentation.
If there is a malloc problem, you can reveal it with these.
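To see what MallocScribble does in practice, a small probe can check whether freshly allocated memory arrives pre-filled with 0xaa, as the man page excerpt above describes. This is only a sketch: `all_bytes_equal` and `fresh_block_is_scribbled` are hypothetical helper names, and on a system without this facility the probe merely reports that the pattern is absent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return 1 if every one of the n bytes at p equals value, else 0. */
int all_bytes_equal(const unsigned char *p, size_t n, unsigned char value)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; i++) {
        if (p[i] != value)
            return 0;
    }
    return 1;
}

/* Allocate n bytes and report whether they arrive filled with 0xaa,
 * the fill pattern the man page gives for MallocScribble.
 * Returns 1 if scribbled, 0 if not, -1 if malloc failed. */
int fresh_block_is_scribbled(size_t n)
{
    unsigned char *p = malloc(n);
    if (p == NULL)
        return -1;
    int scribbled = all_bytes_equal(p, n, 0xaa);
    free(p);
    return scribbled;
}
```

Calling `fresh_block_is_scribbled(64)` from a test program started with the MallocScribble variable set in its environment should then report the fill pattern; without the variable the result depends on whatever happened to be in the heap.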
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: xboard debug file w/FICS

Post by hgm »

Well, it seems that the paranoia I have built into MyFree is also doing a good job. The problem is that I am not sure what the variable two ints before the malloc'ed memory is supposed to be used for. Most of the time it is zero, but sometimes it holds a pointer to somewhere close to other malloc'ed memory.

I could make XBoard exit on a 'free corruption' in a reproducible way, by first loading one small game file, and then another. The corrupted free occurs when destroying the old game list. The corruption is in the first word of the area header (assuming the header has two 4-byte words). This is unlike the seemingly random corruption I was experiencing during the load of the huge game file, which had a change in the second word of the header (the area size). Nevertheless this is strange: the fact that immediately after malloc the first word was 0 suggests that it is not part of some linked list. So what reason could there be to change something in that header?

I tried to trace where the change of this word occurs. It turns out it happens when you open the file-browse window (to open the next file)! This dialog has its own event loop, and a test in that event loop initially finds the memory location unchanged, but after some events suddenly finds it changed. So it seems that some of the file-browser event handlers can change the header of a malloc'ed area that is still in use. This does explain why the load of the huge game file seemed to be frustrated by an asynchronous process striking at random: during the lengthy load the file-browse window remained open, and closed only after the load completed. So it probably depended on me causing events in that window during the load (e.g. moving the mouse).

Now it is very fishy that file-browser events should be able to corrupt headers of already allocated memory areas that have absolutely nothing to do with selecting the new file. (They are deeply hidden in some linked list representing the game list of the previously loaded file.) This suggests the file browser is very sick. Now the file browser does not use malloc and free; instead it uses XtMalloc and XtFree. It almost seems that this is an interfering system of memory allocation, handing out memory from the same pool as malloc/free.

None of this can be related to the crashes that Steven suffers, though, as I assume that during ICS play he won't be using the file-browse window...
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: xboard debug file w/FICS

Post by sje »

hgm wrote: Now the file browser does not use malloc and free; instead it uses XtMalloc and XtFree. It almost seems that this is an interfering system of memory allocation, handing out memory from the same pool as malloc/free.
I suspect that these X toolkit routines are just wrappers for malloc/free.
hgm wrote: None of this can be related to the crashes that Steven suffers, though, as I assume that during ICS play he won't be using the file-browse window...
Alas, there may be other X activity which is accessing malloc/free in an unpredictable manner. And somehow it crashes by itself with me being far away from the keyboard.
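The suspicion that the toolkit routines are thin wrappers can be pictured with a sketch like the one below. It is not the X Toolkit source: `toolkit_malloc` and `toolkit_free` are hypothetical names, and the only point is that such a wrapper draws from the very same heap as a direct malloc call, so both kinds of block sit next to each other and share header conventions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a toolkit allocator: forwards to malloc
 * and treats failure as fatal instead of returning NULL. */
void *toolkit_malloc(size_t size)
{
    void *p = malloc(size ? size : 1);   /* never ask malloc for 0 bytes */
    if (p == NULL) {
        fprintf(stderr, "toolkit_malloc: out of memory\n");
        abort();
    }
    return p;
}

/* Hypothetical counterpart: forwards to free, which tolerates NULL. */
void toolkit_free(void *p)
{
    free(p);
}
```

If the real routines work this way, a stray write in code that uses them can damage blocks obtained through plain malloc, and vice versa.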
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: xboard debug file w/FICS

Post by Michel »

hgm wrote: Now it is very fishy that file-browser events should be able to corrupt headers of already allocated memory areas that have absolutely nothing to do with selecting the new file.
This sounds very much like a buffer overflow.

I use two methods to debug a buffer overflow:

(1) Compile with libefence. libefence will attach a write protected page to every chunk of malloc'ed memory. This should catch typical buffer overflows.

(2) Run under valgrind.
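The idea behind method (1) can be sketched in a few lines of POSIX C. This is not Electric Fence's code, just a minimal illustration under stated assumptions: `guarded_alloc` is a hypothetical helper that places the block so that it ends exactly where an inaccessible page begins, which turns a write one byte past the end into an immediate fault.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return a block of `size` bytes whose last byte sits
 * immediately before an inaccessible page.  Returns NULL on failure.
 * The block is not necessarily aligned for types wider than char, and this
 * sketch provides no matching release routine. */
void *guarded_alloc(size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || size == 0)
        return NULL;
    size_t page = (size_t)ps;
    if (size > SIZE_MAX - 2 * page)
        return NULL;                             /* avoid overflow below */

    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page;      /* data pages + one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size;                         /* block ends where the guard begins */
}
```

The price is at least two pages per allocation, which is why this kind of tool is used for debugging runs only.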
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Seg fault while on FICS; have log files this time

Post by sje »

Segmentation fault from xboard while on FICS; I have log files this time.

From FICS transcript:

Code:

fics% 
maheshmanjrekar accepts your seek.

Creating: Symbolic (2162) maheshmanjrekar ( 834) rated blitz 5 0
{Game 210 (Symbolic vs. maheshmanjrekar) Creating rated blitz match.}

Game 210: All players agree that a disconnection will be considered a forfeit.
fics% (told maheshmanjrekar, who is playing)
fics% (told maheshmanjrekar, who is playing)
fics% Symbolic(C)(2162)[210] whispers: [Even/0/0.000/0/0] 1 d4
(whispered to 0 players)
fics% 
maheshmanjrekar tells you: Hi
fics% 
maheshmanjrekar tells you: Ur name ?
fics% 
maheshmanjrekar tells you: 1st tell me about itself
fics% 
maheshmanjrekar tells you: R u from?
fics% 
maheshmanjrekar tells you: Symbolic means
fics% 
maheshmanjrekar tells you: y did give such name?
fics% ./AutoFICSdebug: line 20:   585 Segmentation fault: 11  /usr/local/bin/xboard -debug -autoflag -fcp "./Symbolic -c xboard" -fd $HOME/Arena/Symbolic -hideThinkingFromHuman false -ics -icshost freechess.org -icslogon $HOME/Arena/Symbolic/ficslogon -sgf $HOME/Arena/Symbolic/fics.pgn -size Medium -thinking -xalarm -xanimate -xbuttons -xzab -xzadj -zippyGameEnd "seek 5" -zippyMaxGames 4 -zp
From Symbolic:

Code:

2011.10.01 05:24:33.036 Initializing for a new game
2011.10.01 05:24:34.812 C/F: [000:00:05:00.000   000:00:05:00.000] rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
2011.10.01 05:24:34.813 New game: Round 127
2011.10.01 05:24:34.813 < random
2011.10.01 05:24:34.814 Random equi-optimal candidate selection is enabled.
2011.10.01 05:24:34.814 < ics freechess.org
2011.10.01 05:24:34.815 Internet Chess Server: freechess.org
2011.10.01 05:24:34.816 < post
2011.10.01 05:24:34.817 Status update posting is enabled.
2011.10.01 05:24:34.817 < hard
2011.10.01 05:24:34.818 Ponder searching is enabled.
2011.10.01 05:24:34.819 < ping 253
2011.10.01 05:24:34.820 > pong 253
2011.10.01 05:24:46.253 < level 0 5 0
2011.10.01 05:24:46.253 Note: level command not currently processed
2011.10.01 05:24:46.254 < name maheshmanjrekar
2011.10.01 05:24:46.255 Opponent name: maheshmanjrekar
2011.10.01 05:24:46.255 > tellopponent Greetings from Symbolic 2011.09.24 by S. J. Edwards
2011.10.01 05:24:46.255 > tellopponent Hello maheshmanjrekar, I hope you enjoy this game.
2011.10.01 05:24:46.255 < rating 2162 834
2011.10.01 05:24:46.256 My rating: 2162   Opponent rating: 834
2011.10.01 05:24:46.256 < time 30000
2011.10.01 05:24:46.257 < otim 30000
2011.10.01 05:24:46.259 < go
2011.10.01 05:24:46.259 Nominal direct search time: 14.999
2011.10.01 05:24:46.278 > 0 0 0 0 1 d4
2011.10.01 05:24:46.281 Sts: Book   Nodes: 0   Time: 0.000   Freq: 0
2011.10.01 05:24:46.281 Analysis: [Even/0/0.000/0/0] 1 d4
2011.10.01 05:24:46.281 Playing: 1 d4
2011.10.01 05:24:46.281 C/F: [000:00:04:59.978 B 000:00:04:59.999] rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1
2011.10.01 05:24:46.282 > tellothers [Even/0/0.000/0/0] 1 d4
2011.10.01 05:24:46.282 > move d4
2011.10.01 05:24:46.317 Nominal ponder search time: 14.998
2011.10.01 05:24:46.317 Predicted move: 1... Nf6
2011.10.01 05:26:24.594 < result 1-0 {maheshmanjrekar forfeits by disconnection}
2011.10.01 05:26:24.605 Game over.
2011.10.01 05:26:24.606 < force
2011.10.01 05:26:24.607 < ping 254
2011.10.01 05:26:24.608 > pong 254
2011.10.01 05:26:25.316 Got EOF on command stream
2011.10.01 05:26:25.316 EOF detected on command input
2011.10.01 05:26:25.316 CpXboardTask run loop exiting
2011.10.01 05:26:25.318 DriverTask run loop exiting
2011.10.01 05:26:25.318 Selected command processor terminating
2011.10.01 05:26:25.318 CpXboardTask gone
2011.10.01 05:26:25.318 MTSearch controller ending
2011.10.01 05:26:25.318 Worker task 0 ending
2011.10.01 05:26:25.329 Worker task 0 gone
2011.10.01 05:26:25.329 Worker task 1 ending
2011.10.01 05:26:25.340 Worker task 1 gone
2011.10.01 05:26:25.340 Worker task 2 ending
2011.10.01 05:26:25.350 Worker task 2 gone
2011.10.01 05:26:25.350 Worker task 3 ending
2011.10.01 05:26:25.361 Worker task 3 gone
2011.10.01 05:26:25.361 MTSearch transposition tables terminating
2011.10.01 05:26:25.361 MTSearch controller gone
2011.10.01 05:26:25.361 MetaSearch ending
2011.10.01 05:26:25.361 MetaSearch gone
2011.10.01 05:26:25.361 Search task controller ending
2011.10.01 05:26:26.124 Search task controller gone
2011.10.01 05:26:26.125 Command processor base class instance gone
2011.10.01 05:26:26.126 WatchTask run loop exiting
2011.10.01 05:26:26.126 WatchTask gone
2011.10.01 05:26:26.126 Command stream input stack terminating
2011.10.01 05:26:26.145 Endgame tablebase cache object gone
2011.10.01 05:26:26.146 Opening book image unloaded
2011.10.01 05:26:26.146 Opening book cache object gone
2011.10.01 05:26:26.146 Chess classes termination started
2011.10.01 05:26:26.146 Chess classes termination done
2011.10.01 05:26:26.177 IntervalTimer done
2011.10.01 05:26:26.177 SignalHandlers done
2011.10.01 05:26:26.177 Symbolic 2011.09.24   Concluded Sat Oct  1 01:26:26 2011
From xboard:

Code:

Display title 'PhilidorsMate (22) vs. Symbolic (22) {5 0}, gameInfo.variant = 0'
GameEnds(26, PhilidorsMate forfeits on time, 0)
71377922 >first : result 0-1 {PhilidorsMate forfeits on time}
71377922 >first : force
71377922 >first : ping 252
>ICS: seek 5
>ICS: \015\012
Reset(1, 1) from gameMode 11
recognized 'normal' (-1) as variant normal
GameEnds(0, (null), 2)
shuffleOpenings = 0
71377924 >first : new
random
71377924 >first : ics freechess.org
71377924 >first : post
71377924 >first : hard
71377924 >first : ping 253
71377938 <first : pong 252
<ICS: You are not playing a game.\012\015fics% Your seek has been posted with index 81.\012\015(31 player(s) saw the seek.)\012\015fics% 
ics input 0, castling = 7 0 4 7 0 4
71379723 <first : pong 253
<ICS: \012\015maheshmanjrekar accepts your seek.\012\015\012\015Creating: Symbolic (2162) maheshmanjrekar ( 834) rated blitz 5 0\012\015{Game 210 (Symbolic vs. maheshmanjrekar) Creating rated blitz match.}\012\015\012\015<12> rnbqkbnr pppppppp -------- -------- -------- -------- PPPPPPPP RNBQKBNR W -1 1 1 1 1 0 210 Symbolic maheshmanjrekar 1 5 0 39 39 300000 300000 1 none (0:00.000) none 0 0 0\012\015fics% \012\015Game 210: All players agree that a disconnection will be considered a forfeit.\012\015fics% 
ics input 0, castling = 7 0 4 7 0 4
Ratings from 'Creating:' Symbolic 2162, maheshmanjrekar 834
recognized 'rated blitz match.' (-1) as variant normal
Parsing board: rnbqkbnr pppppppp -------- -------- -------- -------- PPPPPPPP RNBQKBNR W -1 1 1 1 1 0 210 Symbolic maheshmanjrekar 1 5 0 39 39 300000 300000 1 none (0:00.000) none 0 0 0

recognized 'ICS rated blitz match' (-1) as variant normal
ParseBoard says variant = 'ICS rated blitz match'
recognized as normal
Remembered ratings: W 2162, B 834
load 8x8 board
71391155 >first : level 0 5 0
71391155 >first : name maheshmanjrekar
71391156 >first : rating 2162 834
time odds: 1.000000 1.000000 
71391156 >first : time 30000
71391156 >first : otim 30000
book hit = (NULL)
71391156 >first : go
Display title 'Symbolic (39) vs. maheshmanjrekar (39) {5 0}, gameInfo.variant = 0'
71391158 <first : tellopponent Greetings from Symbolic 2011.09.24 by S. J. Edwards
>ICS: $say Greetings from Symbolic 2011.09.24 by S. J. Edwards\015\012
71391158 <first : tellopponent Hello maheshmanjrekar, I hope you enjoy this game.
>ICS: $say Hello maheshmanjrekar, I hope you enjoy this game.\015\012
71391181 <first : 0 0 0 0 1 d4
71391185 <first : tellothers [Even/0/0.000/0/0] 1 d4
>ICS: $whisper [Even/0/0.000/0/0] 1 d4\015\012
71391185 <first : move d4
machine move 0, castling = 7 0 4 7 0 4
Disambiguate in:  0(3,-1)-(3,3) = 0 (-)
Disambiguate out: 0(3,1)-(3,3) = 0 (-)
CoordsToAlgebraic, piece=0 (3,1)-(3,3) -
7 0 4 7 0 4 Legality test? d2d4
movetype=21, promochar=0=-
MateTest: K=1, my=16, his=16
move: d2d4
, parse: d4 (
)
MateTest: K=1, my=16, his=16
repeat test fmm=1 bmm=0 ep=-4, reps=6
1 ep=-3
0 ep=-4
>ICS: d2d4\015\012
<ICS: (told maheshmanjrekar, who is playing)\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
<ICS: (told maheshmanjrekar, who is playing)\012\015fics% Symbolic(C)(2162)[210] whispers: [Even/0/0.000/0/0] 1 d4\012\015(whispered to 0 players)\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='Symbolic(C)(2162)[210] whispers: [Even/0/0.000/0/0] 1 d4
' 1
<ICS: \012\015<12> rnbqkbnr pppppppp -------- -------- ---P---- -------- PPP-PPPP RNBQKBNR B 3 1 1 1 1 0 210 Symbolic maheshmanjrekar -1 5 0 39 39 300000 300000 1 P/d2-d4 (0:00.000) d4 0 0 0\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Parsing board: rnbqkbnr pppppppp -------- -------- ---P---- -------- PPP-PPPP RNBQKBNR B 3 1 1 1 1 0 210 Symbolic maheshmanjrekar -1 5 0 39 39 300000 300000 1 P/d2-d4 (0:00.000) d4 0 0 0

load 8x8 board
parseboard 1, castling = 7 0 4 7 0 4
accepted move d4 from ICS, parse it.
moveNum = 1
board = 0-8 x 8
Disambiguate in:  0(3,-1)-(3,3) = 0 (-)
Disambiguate out: 0(3,1)-(3,3) = 0 (-)
CoordsToAlgebraic, piece=0 (3,1)-(3,3) -
7 0 4 7 0 4 Legality test? d2d4
movetype=21, promochar=0=-
MateTest: K=1, my=16, his=16
Move parsed to 'd4 (0:00.000)'
Display title 'Symbolic (39) vs. maheshmanjrekar (39) {5 0}, gameInfo.variant = 0'
<ICS: \012\015maheshmanjrekar tells you: Hi\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='maheshmanjrekar tells you: Hi
' 1
<ICS: \012\015maheshmanjrekar tells you: Ur name ?\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='maheshmanjrekar tells you: Ur name ?
' 1
<ICS: \012\015maheshmanjrekar tells you: 1st tell me about itself\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='maheshmanjrekar tells you: 1st tell me about itself
' 1
<ICS: \012\015maheshmanjrekar tells you: R u from?\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='maheshmanjrekar tells you: R u from?
' 1
<ICS: \012\015maheshmanjrekar tells you: Symbolic means\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='maheshmanjrekar tells you: Symbolic means
' 1
<ICS: \012\015maheshmanjrekar tells you: y did give such name?\012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
Append: in='maheshmanjrekar tells you: y did give such name?
' 1
<ICS: \012\015Game 210: Your opponent, maheshmanjrekar, has lost contact or quit.\012\015\012\015{Game 210 (Symbolic vs. maheshmanjrekar) maheshmanjrekar forfeits by disconnection} 1-0\012\015\012\015Blitz rating adjustment: 2162 --> 2162\012\015Blitz rank:    99/21819\012\015Blitz crank:    54/86   \012\015fics% 
ics input 1, castling = 7 0 4 7 0 4
GameEnds(25, maheshmanjrekar forfeits by disconnection, 0)
71489497 >first : result 1-0 {maheshmanjrekar forfeits by disconnection}
71489497 >first : force
71489497 >first : ping 254
>ICS: seek 5
>ICS: \015\012
Reset(1, 1) from gameMode 11
recognized 'normal' (-1) as variant normal
GameEnds(0, (null), 2)
shuffleOpenings = 0
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: xboard debug file w/FICS

Post by hgm »

OK, I have done neither of those, and a buffer overflow was also my first suspicion, which is why I print what I print on a detected corruption. But I think I can already rule out such an overflow in the typical case I posted. The critical part is reposted below:

Code:

malloc 90e3bd0 0,19 
malloc 90e3be8 0,39 
free 90e3bd0 
malloc 90f4ff8 0,51 
corrupted free 90e3be8, old=39, new=50 
content = '[Board "5"] 
[WhiteTeam "Vologda region"] 
' 
before = 00 00 00 00 19 00 00 00 00 24 ffffffae 00 00 24 ffffffae 00 35 22 5d 0a 00 00 00 00 18 00 00 00 38 00 00 00 
bad free 90e3be8
XBoard is busy loading a PGN file here, parsing its way through the tags, in particular the 'non-standard' tags. It handles these by concatenating them all (separated by newlines), needing an ever larger allocated memory buffer for holding the string as tags are added.

In this particular case it encountered the tag [Board "5"], and allocated the memory at 90e3bd0 for it, 0x18 = 24 bytes long. Then it encountered a WhiteTeam tag, which it wants to strcat to it. So it allocates the area at 90e3be8, 0x38 = 56 bytes long. This is exactly 0x18 bytes after the previously allocated buffer (so we are lucky...). It copies the concatenation of the two tags there, and successfully frees the smaller one holding the Board tag at 90e3bd0.

Then it apparently encounters another tag that has to be concatenated as well, so it allocates a bigger space, 0x50 = 80 bytes at 90f4ff8, somewhere far away. Presumably it copies and concatenates the three tags there, and then tries to free the 56-byte area at 90e3be8, which was holding the two tags. Now the header of that area is corrupted, and XBoard exits after dumping the memory around it.
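The allocate, copy, free cycle described above can be sketched as follows. This is not XBoard's actual code; `append_tag` is a hypothetical helper that only mirrors the pattern: each new tag gets a fresh, larger buffer, and the previous buffer is freed afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string holding `old`
 * followed by `tag` and a newline, then free `old`.  `old` may be NULL
 * for the first tag.  Returns NULL (leaving `old` untouched) if malloc
 * fails. */
char *append_tag(char *old, const char *tag)
{
    assert(tag != NULL);
    size_t old_len = old ? strlen(old) : 0;
    size_t tag_len = strlen(tag);

    char *bigger = malloc(old_len + tag_len + 2);   /* +1 newline, +1 NUL */
    if (bigger == NULL)
        return NULL;

    if (old)
        memcpy(bigger, old, old_len);
    memcpy(bigger + old_len, tag, tag_len);
    bigger[old_len + tag_len] = '\n';
    bigger[old_len + tag_len + 1] = '\0';

    free(old);      /* the step at which the changed header was detected */
    return bigger;
}
```

Calling it first with NULL and the Board tag, then with the result and the WhiteTeam tag, reproduces the sequence malloc, malloc, free seen in the log: the second buffer is allocated while the first is still alive, so the two can end up adjacent in memory.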

The memory dump includes the 32 bytes before the address of the area to be freed, i.e. it starts at 90e3bc8, which happens to be exactly the start of the (now freed) area before it, including its header. We know this area was used to hold the [Board "5"] tag before it was freed. The freeing apparently overwrote the first 8 data bytes [Board ", but the 5"] (+ newline) can still be seen to be there: 0x35 (='5'), 0x22 (='"'), 0x5d (=']') and 0x0a (='\n'). After that come a terminating null byte and some unused null bytes, before the header of the area we are freeing now starts. The data part of the latter area is printed as ASCII, and shows the concatenated two tags undamaged.

So there is no trace of a buffer overrun. The header words of the area were changed with surgical precision from 0, 0x39 to 0x18, 0x38.

Now note that I am not sure this is an error. I don't really know what these header words are used for. The second one seems to hold length+1 (in bytes) for the area directly after allocation, based on the address of the subsequently allocated area. It is not clear what the function of the +1 is, i.e. why it stores 0x39 instead of 0x38 if the area measures 0x38 bytes. There could be a flag there. All allocated areas always seem to be aligned to 8-byte boundaries (which even on a 32-bit system would be logical, to align doubles with memory words), so in principle the three least-significant bits are available for flags. Perhaps there is a flag there to indicate that the up-stream neighbor is currently free, to be used for de-fragmentation. What the first header word is for, I have no idea at all. It looks like it contains random data after malloc, usually 0, but sometimes a recognizable ASCII snippet, sometimes clearly a pointer. (Areas are expected to contain ASCII and pointers to other allocated areas here before they are freed.) So I guess malloc does not initialize this, and you see what happened to be in the memory that was allocated. But it can apparently be changed after allocation. Perhaps the lowest bit of the second word indicates if the first word contains valid data.
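The guess about flag bits matches how boundary-tag allocators such as the GNU C library's malloc lay out their headers: the low bit of the size word says whether the preceding chunk is in use, and the word before it holds the preceding chunk's size only while that chunk is free. Whether the C library XBoard ran against here uses exactly this layout is an assumption; the sketch below simply decodes the two header words under it, with `decode_header` and `struct chunk_info` as hypothetical names.

```c
#include <stdint.h>

/* Assumed layout (boundary-tag allocator, 32-bit words): the size word's
 * three low bits are flags, and bit 0 set means "the chunk before this
 * one is in use".  The first word is meaningful only while bit 0 is
 * clear, when it holds the size of the free chunk in front. */
struct chunk_info {
    uint32_t size;          /* chunk size in bytes, header included */
    int prev_in_use;        /* 1 if the preceding chunk is allocated */
    uint32_t prev_size;     /* size of the preceding chunk, 0 if not valid */
};

struct chunk_info decode_header(uint32_t first_word, uint32_t size_word)
{
    struct chunk_info info;
    info.size = size_word & ~(uint32_t)7;          /* strip the flag bits */
    info.prev_in_use = (int)(size_word & 1u);
    info.prev_size = info.prev_in_use ? 0 : first_word;
    return info;
}
```

Under that reading, `decode_header(0, 0x39)` describes a 0x38-byte chunk whose neighbor in front is allocated, and `decode_header(0x18, 0x38)` describes the same chunk after a 0x18-byte neighbor in front has been freed, which is the order of events in the log (free 90e3bd0 comes before the free of 90e3be8).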

Note that I don't suffer a real problem; XBoard was running perfectly for me, and this whole thing only arises because of the paranoid testing I now do on the malloc/free system. It could very well be that all this is perfectly normal operation of the malloc/free system. I could ignore the first word and the lowest bit of the second word of the area headers when testing for corruption, and I might have no problem. But that would of course mean I can no longer detect corruption of that first word and lowest bit, and as I don't know what these are used for, I have no idea if corruption of them can cause a crash on free, or later reallocation of the area. I also have no idea how portable all this is. On other systems, in particular 64-bit systems, the area headers might have a completely different format...
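One way around the portability worry is a different technique from the one MyFree uses: instead of inspecting the allocator's private header, the wrapper can place its own guard words around each block and test those on free. The sketch below is only an illustration of that alternative; `checked_malloc`, `checked_check` and `checked_free` are hypothetical names, and it assumes that a header of two `size_t` words keeps the user block suitably aligned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD ((size_t)0x5ca1ab1e)

/* Private header placed in front of each user block. */
struct guard_header {
    size_t size;    /* user-visible size in bytes */
    size_t guard;   /* must still equal GUARD_WORD at free time */
};

/* Allocate `size` user bytes with a guard word before and after them.
 * Returns NULL if malloc fails.  No overflow check on `size`. */
void *checked_malloc(size_t size)
{
    struct guard_header *h = malloc(sizeof *h + size + sizeof(size_t));
    if (h == NULL)
        return NULL;
    h->size = size;
    h->guard = GUARD_WORD;
    size_t tail = GUARD_WORD;
    memcpy((unsigned char *)(h + 1) + size, &tail, sizeof tail);
    return h + 1;
}

/* Return 0 if both guard words are intact, 1 if the front guard was
 * overwritten, 2 if the rear guard was overwritten. */
int checked_check(const void *user)
{
    assert(user != NULL);
    const struct guard_header *h = (const struct guard_header *)user - 1;
    size_t tail;
    if (h->guard != GUARD_WORD)
        return 1;
    memcpy(&tail, (const unsigned char *)user + h->size, sizeof tail);
    return tail == GUARD_WORD ? 0 : 2;
}

/* Check the guards, release the block, and return the check result. */
int checked_free(void *user)
{
    int status = checked_check(user);
    free((struct guard_header *)user - 1);
    return status;
}
```

This catches overruns and underruns of the user area on any platform, but unlike the header inspection it cannot notice a write that lands only in the allocator's own bookkeeping.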
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: xboard debug file w/FICS

Post by Michel »

hgm wrote: XBoard was running perfectly for me, and this whole thing only arises because of the paranoid testing I now do on the malloc/free system. It could very well be that all this is perfectly normal operation of the malloc/free system.
Well, to validate that, you could run xboard with a debug malloc, I presume:

http://dmalloc.com/

I assume dmalloc is enabled using LD_PRELOAD, so it would preempt the system malloc/free even if the latter is accessed through wrappers like XtMalloc and XtFree.
hgm
Posts: 28390
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: xboard debug file w/FICS

Post by hgm »

OK, this sounds like a good tool; I might try it later. In the meantime I had already prepared a test version with my own implementation of malloc/free. I hope Steven can try that under conditions where he gets the allocation error.

This version now writes all malloc/free info to stderr, even if -debug is not set. This should be more convenient for ICS play, as you can keep the debug output out of the xterm by redirecting stderr: ./xboard 2>alloclog
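For readers who want to picture what such logging involves, here is a minimal sketch of allocation wrappers that write one line per call to a given stream. It is not the code in the test version: `logged_malloc` and `logged_free` are hypothetical names, and the line format (address plus requested size) is an assumption that differs from the header words printed in the log excerpts earlier in the thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate `size` bytes and record the call on `log`. */
void *logged_malloc(FILE *log, size_t size)
{
    void *p = malloc(size);
    fprintf(log, "malloc %p %zu\n", p, size);
    return p;
}

/* Hypothetical wrapper: record the call on `log`, then release the block.
 * The address is printed before free so it is still a valid pointer. */
void logged_free(FILE *log, void *p)
{
    fprintf(log, "free %p\n", p);
    free(p);
}
```

Passing `stderr` as the stream gives output that a shell redirection of the form `2>alloclog` collects in one file, separate from anything written to the terminal on stdout.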

This version is the latest commit in the sje branch of my repository: http://hgm.nubati.net/cgi-bin/gitweb.cg ... /heads/sje .
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: xboard debug file w/FICS

Post by sje »

What do you think about the logs I posted? Do they help in any way with determining the origin of the segmentation fault?