Any WinBoard bugs I missed?

hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Any WinBoard bugs I missed?

Post by hgm »

JoshPettus wrote:I'm not sure if it's just me on OSX, but if I open the Tournament window, click OK or Cancel and then open it again, Xboard quits with a segfault 11. It's the only window that does this as far as I can see.
I don't have that in the Linux version. Please uncomment the printf on line 1279 of gtk/xoptionx.c, to see what it prints in the terminal just before the segfault. (So that we can see what element of the dialog causes the trouble.)
Also, we discussed this before, but I don't remember if you intended to fix it for this release.

If you put xboard in ICS mode without the terminal (e.g. via an XOP file), you get an "end of file from keyboard" error, and clicking OK quits xboard. Closing the error box instead leaves xboard somewhat unstable and liable to quit at random times when you click on the board (e.g. sometimes while you are observing).
Hmm, I remember I did some patch that would not make EOF from keyboard a fatal error when the ICS interaction window was open. But perhaps the window is not open yet when you start, or the EOF already occurs before the window gets the chance to open.
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: Any WinBoard bugs I missed?

Post by xmas79 »

hgm wrote:...I think that should guarantee temporal ordering in single-PV mode, but that is not an obvious truth...
IMHO it still won't. Sometimes I have multiple fail-lows in a row, so the interface will sort them in the wrong order. Sometimes I have search instability which produces a fail-high with a score X, and then the re-search produces a score lower than X, again failing to sort the PVs in temporal order.
If you want the engine to give explicit output to recall the fail high, it should repeat the best line in this situation.
I think that asking engines to modify their output logic to satisfy a GUI that should "only" (really) print on the screen what the engines have to say seems a bit forced to me... Why can't it be a simple checkbox in the options dialog? "Do you want to sort engine output by score? Yes/No". "Yes" would turn on all this logic; "No" would disable all of it and display exactly what the engines have to say (which seems to me the most obvious way to look at that window).

Maybe managing a failed move (high/low) as a "non sorting point" could be a solution. In this way, if an engine outputs ! or ? then the PVs below this point should never be sorted. This saves single-PV engines (which usually output ! and ? for FH-FL moves), and multi-PV engines (which put ! and ? for FH-FL moves, and put clean PV for the multiPV part). No?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Any WinBoard bugs I missed?

Post by bob »

hgm wrote:
xmas79 wrote:Ah very good, I will definitely put ! or ? at the end, since this problem has been annoying me for about a year! I assume that the output (with ! or ? printed explicitly) will now exactly reflect the output of the engine over time. Right?
New lines bubble up from the bottom (of the lines of that depth) to the point where they encounter a higher score. But when that higher score is due to a fail (or has the same move) they would pass it anyway. When it passes it, it would adjust the score (not in the display, but in the sort key), so that lines starting with other moves can now pass it too, if there was no explicit fail indicator.

I think that should guarantee temporal ordering in single-PV mode, but that is not an obvious truth. It depends on the assumption that you cannot get fail-low lines after the engine found an exact score, and that, say, after a fail high on a later move which it prints, if search instability would cause the search on that move in the re-search to get a lower score than the PV move (or a fail low), this line will not be printed. E.g. the sequence

13 +0.12 d4 Nf6 ...
13 +0.20 e4!
13 +0.10 e4 e5

would sort the last line below the first one, because neither is a fail, and the first one has the highest score. If you want the engine to give explicit output to recall the fail high, it should repeat the best line in this situation. This would bubble to above that same line printed earlier (because they have the same move, even though they have the same score 0.12), and then also to above the fail-high line that was on top.
That is not so uncommon.

Ponder a long time, fill the hash with deep-draft entries, the opponent makes a different move, and those LOWER hash entries will cause fail highs, but the re-search produces a lower score because it doesn't have the same depth as the deep-draft hash entries...
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Any WinBoard bugs I missed?

Post by hgm »

xmas79 wrote:IMHO it still won't. Sometimes I have multiple fail-lows in a row, so the interface will sort them in the wrong order.
No, it won't. Any more recent line will pass the earlier fail low (if it was explicitly indicated as one) in the upward direction during the sort. Even when it was not indicated as such, it will likely have the same PV move (strictly speaking, a fail low should have none), and should thus be considered an improved score of the previous fail low, and sorted above it.
Sometimes I have search instability which produces a fail-high with a score X, and then the re-search produces a score lower than X, again failing to sort the PVs in temporal order.
Yes, this is the case that Bob also showed. But if the lower bound is marked as a fail high, it will never prevent the later line from floating to the top. If the later line starts with the same move as the fail high, it will also always float past the latter, because XBoard can now conclude it must have been a fail high, or the same move could not repeat with a different score.
If you want the engine to give explicit output to recall the fail high, it should repeat the best line in this situation.
I think that asking engines to modify their output logic to satisfy a GUI that should "only" (really) print on the screen what the engines have to say seems a bit forced to me...
Well, the Engine Output window is not intended to be a general log window for the engine-GUI communication. It is intended for presenting thinking output of well-defined format by the protocol, and there never was any guarantee given that output would be presented in temporal order. There could very well be a conflict of interest here between engine developer and the typical user. In general it is more convenient to have the engine's current favorite move at the top of the list. So even if it is not a protocol requirement that moves are sent with their scores in ascending order, I think it is quite justifiable that GUIs will correct engines that sin against this logic. WB Thinking Output is not intended, for instance, to keep the GUI informed on how the engine is progressing. (The stat01 commands are for this purpose.) So when an engine would start to send a line for every low-failing move it searched after the PV move, with an upper-bound score lower than the PV move, I would definitely consider it non-compliant. Sending exact scores that are lower than an earlier exact score in the same iteration is only justifiable as multi-PV output; in single-PV mode an engine should not do that. Reporting fail highs on individual late moves before you have resolved them does come close to abuse of Thinking Output, IMO. It clutters the output, and for a non-technical user that does not know in intimate detail how engines work (i.e. 99% of all users...), it does not convey any information.
Why can't it be a simple checkbox in the options dialog? "Do you want to sort engine output by score? Yes/No". "Yes" would turn on all this logic; "No" would disable all of it and display exactly what the engines have to say (which seems to me the most obvious way to look at that window).
This could be done, but the problem is that in multi-PV mode you would really want score order rather than temporal order, because some multi-PV implementations print the lower scores later. And users might want to switch between single- and multi-PV mode quite often, so it would be annoying if they were forced to toggle yet another option to do that. Plus, there is no guarantee engines won't frivolously print fail lows and highs in multi-PV mode.

So if there was to be a new option, I think a much more useful one would be a checkbox "discard fail highs and fail lows", default true. But as the ! and ? standard is not yet in common use, that would not help much in practice. (Although having Polyglot do it right would make all UCI engines compliant in one swoop.)
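
A minimal sketch of how such a "discard fail highs and fail lows" filter might recognize the markers, assuming the ! or ? is the last non-blank character of the PV text. The names classify_pv and should_display are made up for illustration and are not XBoard functions:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not XBoard code: classify a PV string by a
 * trailing '!' (fail high) or '?' (fail low). */
typedef enum { PV_EXACT, PV_FAIL_HIGH, PV_FAIL_LOW } PvKind;

PvKind classify_pv(const char *pv)
{
    assert(pv != NULL);
    size_t n = strlen(pv);
    /* Skip trailing whitespace such as a newline. */
    while (n > 0 && isspace((unsigned char)pv[n - 1])) n--;
    if (n == 0) return PV_EXACT;
    if (pv[n - 1] == '!') return PV_FAIL_HIGH;
    if (pv[n - 1] == '?') return PV_FAIL_LOW;
    return PV_EXACT;
}

/* With the (assumed) "discard fails" option set, only exact lines
 * would be shown. */
int should_display(const char *pv, int discard_fails)
{
    return !discard_fails || classify_pv(pv) == PV_EXACT;
}
```

Engines that do not print the markers would all be classified as exact here, which is the practical limitation mentioned above.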
Maybe managing a failed move (high/low) as a "non sorting point" could be a solution. In this way, if an engine outputs ! or ? then the PVs below this point should never be sorted. This saves single-PV engines (which usually output ! and ? for FH-FL moves), and multi-PV engines (which put ! and ? for FH-FL moves, and put clean PV for the multiPV part). No?
The idea of always letting more recent lines rise above failed lines was sort of an attempt at that. But I think the real issue is: what is the most useful way to present engine output like

early
13 +0.12 1:10 d4 Nf6 ...
13 +0.20 1:15 e4!
13 +0.10 1:25 e4 e5 ...
late

The current algorithm would display it as

top
13 +0.20 1:15 e4!
13 +0.12 1:10 d4 Nf6 ...
13 +0.10 1:25 e4 e5 ...
bottom

because, although the +0.10 line could pass the +0.20 line upward (the latter being a fail high, or having the same move), it is not allowed to pass the +0.12 line, as +0.10 < +0.12. But

top
13 +0.10 1:25 e4 e5 ...
13 +0.20 1:15 e4!
13 +0.12 1:10 d4 Nf6 ...
bottom

because the +0.20 reset the sorting bottom and forces everything that comes later to come above it (as a higher depth would) IMO is even more confusing. The e4-e5 line has no business being on top, as it is not the best move. And I really think the engine is the culprit for sending it, as strict temporal order would produce exactly the same display of the lines. Most GUIs do not sort, and sending lines for poor moves after good ones will just confuse the user.
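
For illustration, a simplified sketch of the bubbling rule as described in this thread, for lines of a single depth only. This is not the actual XBoard source: the PvLine struct, the lines array and add_line are assumptions made up for the example, and scores are in centipawns.

```c
#include <assert.h>
#include <string.h>

#define MAX_LINES 32

typedef struct {
    int  key;        /* sort key; may be lowered without changing the shown score */
    int  score;      /* score as displayed */
    int  fail;       /* 1 if the engine marked the line with ! or ? */
    char move[8];    /* first move of the PV */
} PvLine;

static PvLine lines[MAX_LINES];   /* lines[0] is the top of the display */
static int nlines = 0;

/* Insert a new line at the bottom and let it bubble upward. */
void add_line(int score, int fail, const char *move)
{
    assert(nlines < MAX_LINES && strlen(move) < sizeof lines[0].move);
    PvLine nl = { score, score, fail, "" };
    strcpy(nl.move, move);
    int i = nlines++;               /* start at the bottom */
    while (i > 0) {
        PvLine *above = &lines[i - 1];
        int same_move = strcmp(above->move, nl.move) == 0;
        if (!above->fail && !same_move && above->key > nl.score)
            break;                  /* blocked by a better exact line */
        if (same_move)
            above->key = nl.score;  /* older score for this move is superseded */
        lines[i] = *above;          /* shift the older line down one slot */
        i--;
    }
    lines[i] = nl;
}
```

Feeding it the three lines from the example (d4 at +12, e4! at +20 marked as a fail, then e4 at +10) leaves e4! on top, d4 second and the e4 e5 line at the bottom, which is the first of the two displays shown above.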
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: Any WinBoard bugs I missed?

Post by xmas79 »

Ok, so we agree that the problem is more general... I would simply like to point out some specific patterns that seem to be common. I will present them here in temporal order (so you see what you would see in a console window: top = oldest line, bottom = newest line).

Case 1: multiple fail lows due to a bad AB window. In fail-soft frameworks the engine could print the best (and bad) move so far. Fail-hard would simply output the ? without any move.

13 +0.20 1:15 e4?
13 +0.10 1:16 e4?
13 -0.10 1:17 e4?
13 -1.45 1:17 d4 blah blah blah

Here the engine fails low three times, lowering the best (bad) score each time, and finally finds the best move to be d4 with a score of -1.45.

Case 2: two exact scores due to bad move ordering.

13 +0.20 1:15 e4 blah blah blah
13 +0.25 1:16 f4 blah blah

Here the engine finds two exact scores due to a bad move-ordering scheme. During search it first picks the "e4" move and finds it to be the best move, then picks "f4", which turns out to be a better move.

Case 3: an exact score, followed by fail high, followed by an exact score again

13 +0.20 1:15 e4 blah blah
13 +0.90 1:16 f4!
13 +0.20 1:17 e4 blah blah

Here the engine finds a best move, then picks another move which triggers a fail high (due to a TT hash hit and lack of depth, typical in distant mates). The re-search, though, cannot "complete" the fail high, and the f4 search produces nothing, leaving e4 again as the best move.

Ideally, the thinking output window should display everything that reflects the current preferred choice of the engine, and not what the GUI thinks the engine is actually preferring. How would you display such outputs?
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Any WinBoard bugs I missed?

Post by hgm »

xmas79 wrote:

13 +0.20 1:15 e4?
13 +0.10 1:16 e4?
13 -0.10 1:17 e4?
13 -1.45 1:17 d4 blah blah blah

Here the engine fails low three times, lowering the best (bad) score each time, and finally finds the best move to be d4 with a score of -1.45.
The XBoard-4.8 algorithm would also display these in temporal order, because any line would be able to pass a fail low (including the other fail lows). So there would never be anything that prevents the latest line from floating to the top. Without the question marks there would be a problem, as d4 != e4, so it would have no idea whether the -0.10 e4 was fail low or exact (e.g. cut by hash hit).

13 +0.20 1:15 e4 blah blah blah
13 +0.25 1:16 f4 blah blah

Here the engine finds two exact scores due to a bad move-ordering scheme. During search it first picks the "e4" move and finds it to be the best move, then picks "f4", which turns out to be a better move.
These would be displayed in temporal order, as the higher score came latest, and would float to the top past the other.

13 +0.20 1:15 e4 blah blah
13 +0.90 1:16 f4!
13 +0.20 1:17 e4 blah blah

Here the engine finds a best move, then picks another move which triggers a fail high (due to a TT hash hit and lack of depth, typical in distant mates). The re-search, though, cannot "complete" the fail high, and the f4 search produces nothing, leaving e4 again as the best move.
f4! would float to the top as soon as it came in (as +0.90 > +0.20). After that the third line would float to the top, because it passes its previous copy (same score and move, but came in later), and then it passes the +0.90 because that was marked as a fail high, so it does not matter that the score is higher or the move is different. Without the ! there would be a problem, because the +0.90 could not be recognized as non-exact.

I guess an additional heuristic here could be that because the new line passed a copy of itself with the same score, it can be recognized as 'corrective output', and should prevail over everything that went on before.

Ideally, the thinking output window should display everything that reflects the current preferred choice of the engine, and not what the GUI thinks the engine is actually preferring. How would you display such outputs?
JoshPettus
Posts: 730
Joined: Fri Oct 19, 2012 2:23 am

Re: Any WinBoard bugs I missed?

Post by JoshPettus »

Sorry, I have been gone all day.
hgm wrote: I don't have that in the Linux version. Please uncomment the printf on line 1279 of gtk/xoptionx.c, to see what it prints in the terminal just before the segfault. (So that we can see what element of the dialog causes the trouble.)


Here are the results

option = 0, top = 0
option = 1, top = 1
option = 2, top = 2
option = 3, top = 3
option = 4, top = 4
option = 5, top = 5
option = 6, top = 5
option = 7, top = 6
option = 8, top = 6
option = 9, top = 7
option =10, top = 1
option =11, top = 2
option =12, top = 3
option =13, top = 4
option =14, top = 5
option =15, top = 6
option =16, top = 7
option =17, top = 8
option =18, top = 9
option =19, top =10
option =20, top =11
option =21, top =12
option =22, top =12
option =23, top =12
option =24, top =12
option =25, top =13
option =26, top =13
option = 0, top = 0
option = 1, top = 1
option = 2, top = 2
option = 3, top = 3
option = 4, top = 4
option = 5, top = 5
option = 6, top = 5
Segmentation fault: 11
logout
It starts with the first time the Tournament window is opened, then repeats itself the second time, but crashes partway through.

hgm wrote: Hmm, I remember I did some patch that would not make EOF from keyboard a fatal error when the ICS interaction window was open. But perhaps the window is not open yet when you start, or the EOF already occurs before the window gets the chance to open.
Hmm, well, closing the window, rather than clicking OK, keeps xboard alive. But we also have the instability as a result of not having the terminal. It would be nice if we could do away with the terminal altogether.
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Any WinBoard bugs I missed?

Post by hgm »

Can you add one line at the top of MatchOptionsProc() in dialogs.c, to clear textValue of the option that seems to cause the trouble, to see if that fixes it?

void
MatchOptionsProc ()
{
   matchOptions[PARTICIPANTS].textValue = NULL;
   if(matchOptions[PARTICIPANTS+1].type != ListBox) {
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Any WinBoard bugs I missed?

Post by hgm »

One more concern:

When XBoard is looking for a settings file, in response to an @INIFILE argument, it will first look in the current directory (if INIFILE is not an absolute path name, of course). If it does not find it there, it will look in ~/.xboard/themes/conf/ , and finally in DATADIR/themes/conf/ . On Linux, DATADIR is set by the configure process to /usr/(local/)share/games/xboard . This system exists so that engines with special display needs (such as HaChu for Chu Shogi) can install a settings file as part of a theme amongst XBoard's data files, so that the command "xboard @chu" is enough to launch HaChu with the kanji Chu-Shogi theme.
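
To make the search order concrete, here is a rough sketch of it. This is not the XBoard implementation: settings_candidates and open_settings are invented names, the home and data directories are passed in as plain strings, and the 512-byte buffers are an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill `out` with the candidate paths for a
 * settings-file name, in the search order described above.
 * Returns the number of candidates written. */
int settings_candidates(const char *name, const char *home,
                        const char *datadir, char out[3][512])
{
    assert(name && home && datadir && strlen(name) > 0);
    if (name[0] == '/') {                 /* absolute path: use as-is */
        snprintf(out[0], 512, "%s", name);
        return 1;
    }
    snprintf(out[0], 512, "./%s", name);                          /* current directory */
    snprintf(out[1], 512, "%s/.xboard/themes/conf/%s", home, name);
    snprintf(out[2], 512, "%s/themes/conf/%s", datadir, name);
    return 3;
}

/* Return the first candidate that can be opened for reading, or NULL. */
FILE *open_settings(const char *name, const char *home, const char *datadir)
{
    char cand[3][512];
    int n = settings_candidates(name, home, datadir, cand);
    for (int i = 0; i < n; i++) {
        FILE *f = fopen(cand[i], "r");
        if (f) return f;
    }
    return NULL;
}
```

In a sketch like this, only the datadir argument would need to change for a platform where the data directory is determined at run time.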

Is this still good for OS X, though? We replace DATADIR there by the bundle path, detected at run time.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Any WinBoard bugs I missed?

Post by bob »

hgm wrote:Ah yes, you are right. XBoard does put a & at the end of the command.

But that means it follows exactly the strategy I outlined above: system() forks off a child, which does an execv on the shell; the shell forks off a grand-child doing an execv on aplay, and does not wait() for it because of the &. So it exits immediately (because it was requested through arguments to do only one command), and orphans the aplay process.

There can be no zombies this way. If aplay processes remain, it must be because aplay is hanging. My guess is that at some point moves come in so fast that several aplays are overlapping (especially 'gong' is a long sound!) and try to access the sound hardware at the same time, and that it somehow cannot handle that and gets stuck.
I tested this when the topic was brought up and as far as I could tell, it worked OK. I've learned to be a bit cautious in my claims, however, because I am running on OS X, which does not do everything the same way as BSD or Linux flavors of Unix...
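
For reference, the fork-and-orphan pattern described in the quote can be sketched with explicit fork() calls instead of system() and a trailing &. This is only an illustration of the mechanism, not what XBoard does internally, and run_detached is a made-up name:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `cmd` through the shell without leaving a zombie: the intermediate
 * child exits at once, so the grandchild is orphaned and is no longer ours
 * to reap. Returns 0 if the grandchild was started, -1 otherwise. */
int run_detached(const char *cmd)
{
    assert(cmd != NULL);
    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        pid_t grandchild = fork();
        if (grandchild == 0) {
            execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
            _exit(127);                  /* only reached if exec failed */
        }
        _exit(grandchild < 0 ? 1 : 0);   /* do not wait for the grandchild */
    }
    int status;
    if (waitpid(child, &status, 0) < 0)  /* reap the short-lived child */
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The caller only ever waits for the intermediate child, which exits immediately, so a sound player that hangs would show up as a live process rather than a zombie.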