xboard: -keepAlive option no longer working at FICS

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: xboard: -keepAlive option no longer working at FICS

Post by matthewlai »

sje wrote:
matthewlai wrote: It could be a firewall or router along your connection that is killing idle TCP connections. I know some routers do this to avoid running out of memory with BitTorrent clients (they open thousands of connections and sometimes don't close them).
Unlikely here, as the ICC connection was unaffected and there was no unusual loading of the LAN or router.

It could just be that the ICC connection wasn't idle, and the FICS one was. FICS is a lot quieter these days.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: xboard: -keepAlive option no longer working at FICS

Post by mvk »

sje wrote:
matthewlai wrote: It could be a firewall or router along your connection that is killing idle TCP connections. I know some routers do this to avoid running out of memory with BitTorrent clients (they open thousands of connections and sometimes don't close them).
Unlikely here, as the ICC connection was unaffected and there was no unusual loading of the LAN or router.

Was Symbolic playing games on ICC when this happened on the other server? Some routers silently forget about TCP connections after 2 minutes, or sooner in cases like the one Matthew described. The end points might not find out until they hit their own timeouts. It is also worth noting the logout time as registered on the server side (as retrieved with the 'log' command), which can help determine which side timed out. There can be more than one timeout interval in the chain of events.
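
To make the last point concrete, here is a minimal Python sketch of a chain of idle timeouts. The hop names and the one-hour server limit are made up for illustration (only the 2-minute router figure and the 5-minute keepalive come from this thread), and it assumes that every hop resets its idle timer on any traffic.

```python
def first_to_expire(idle_timeouts_s, keepalive_s=None):
    """Return (name, timeout) of the hop that drops an idle connection first,
    or None if keepalive traffic arrives often enough to reset every timer."""
    # The hop with the shortest idle timeout is the first one that can fire.
    name, timeout = min(idle_timeouts_s.items(), key=lambda kv: kv[1])
    # Assumption: a keepalive resets every hop's timer, so the connection
    # survives only if keepalives come more often than the shortest timeout.
    if keepalive_s is not None and keepalive_s < timeout:
        return None
    return name, timeout

# Hypothetical chain: a router that forgets after 2 minutes, a server limit of 1 hour.
chain = {"router NAT table": 120, "server idle limit": 3600}
print(first_to_expire(chain, keepalive_s=300))
print(first_to_expire({"server idle limit": 3600}, keepalive_s=300))
```

Under these assumptions a 5-minute keepalive cannot save a connection that crosses a router with a 2-minute timeout (the first call names the router), while with no such router on the path the same keepalive resets every timer (the second call returns None).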
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: xboard: -keepAlive option no longer working at FICS

Post by sje »

All possible combinations of idle/active FICS/ICC have been seen many times. The 59-minute disconnection phenomenon has been seen only three times, all on the same day. It hasn't been seen since.

The router in use has no problem keeping an idle ssh terminal session alive for days. I seriously doubt any LAN problems.

My guesses:

1. Someone at FICS was experimenting with a one-hour idle timeout option. Note that the 59-minute interval starts at the end of the last game, not at the last transmission (the keepalive sent every five minutes). The network has no way of knowing that a game has ended; it sees only data being sent.

2. There is some obscure bug in xboard that shows up semi-randomly and only when more than one instance is running on the same user ID.

3. There is some obscure bug in XQuartz which is sending a quit command to xboard.
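
The distinction in guess 1 between time since the last game and time since the last transmission can be put in numbers. A small Python sketch, assuming for simplicity that the keepalive timer is aligned with the end of the game (xboard's actual timer need not be); the function name is hypothetical:

```python
from datetime import timedelta

def network_idle_at(since_game_end, keepalive_every):
    """Time since the last keepalive, given the time since the game ended."""
    # Assumption: keepalives go out at exact multiples of keepalive_every
    # after the game ends, so the remainder is the quiet time on the wire.
    return since_game_end % keepalive_every

idle = network_idle_at(timedelta(minutes=59), timedelta(minutes=5))
print(idle)  # 0:04:00
```

Whatever the real alignment is, the wire has been quiet for less than five minutes at the moment of the disconnect, however long ago the game ended.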
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: xboard: -keepAlive option no longer working at FICS

Post by mvk »

sje wrote: All possible combinations of idle/active FICS/ICC have been seen many times. The 59-minute disconnection phenomenon has been seen only three times, all on the same day. It hasn't been seen since.

The router in use has no problem keeping an idle ssh terminal session alive for days. I seriously doubt any LAN problems.

My guesses:

1. Someone at FICS was experimenting with a one-hour idle timeout option. Note that the 59-minute interval starts at the end of the last game, not at the last transmission (the keepalive sent every five minutes). The network has no way of knowing that a game has ended; it sees only data being sent.

2. There is some obscure bug in xboard that shows up semi-randomly and only when more than one instance is running on the same user ID.

3. There is some obscure bug in XQuartz which is sending a quit command to xboard.

The first is not possible without restarting the server software, and this didn't happen.

Do you have logging that shows that the replies to the 'date' commands came back in the hour before the disconnect? If they came back, I agree the problem must be at one of the end points. If they didn't, it could just as well be a temporary router problem on the path to the server, since ICC and FICS are in different parts of the network.
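
A check like this could be automated. The sketch below is Python and assumes the reply timestamps have already been pulled out of some log; the function names, the 30-second slack, and the sample times are hypothetical and not tied to any xboard log format.

```python
from datetime import datetime, timedelta

def reply_gaps(reply_times, disconnect, window=timedelta(hours=1)):
    """Gaps between consecutive replies in the window before the disconnect,
    including the gaps at the start of the window and up to the disconnect."""
    start = disconnect - window
    recent = sorted(t for t in reply_times if start <= t <= disconnect)
    points = [start] + recent + [disconnect]
    return [later - earlier for earlier, later in zip(points, points[1:])]

def replies_kept_coming(reply_times, disconnect,
                        keepalive=timedelta(minutes=5),
                        slack=timedelta(seconds=30)):
    """True if no gap in the last hour exceeds the keepalive interval plus slack."""
    return max(reply_gaps(reply_times, disconnect)) <= keepalive + slack

# Hypothetical data: one reply every 5 minutes, the last one 1 minute before the drop.
disconnect = datetime(2000, 1, 1, 13, 0)
replies = [disconnect - timedelta(minutes=5 * k + 1) for k in range(13)]
print(replies_kept_coming(replies, disconnect))      # replies arrived throughout
print(replies_kept_coming(replies[3:], disconnect))  # last three replies missing
```

The first call reports that replies kept arriving, which would point at one of the end points; the second reports a gap, which would leave a problem on the network path open as an explanation.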
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: xboard: -keepAlive option no longer working at FICS

Post by sje »

mvk wrote: The first is not possible without restarting the server software, and this didn't happen.

Do you have logging that shows that the replies to the 'date' commands came back in the hour before the disconnect? If they came back, I agree the problem must be at one of the end points. If they didn't, it could just as well be a temporary router problem on the path to the server, since ICC and FICS are in different parts of the network.

I don't usually have xboard generate a log because I assume that it is working properly. All I/O between xboard and Symbolic is logged, because that is the only place where I can be sure of detecting bugs in Symbolic.

I don't recall seeing anything unusual on the FICS terminal emulation window. There was no "you are being disconnected" warning.

I just can't see how a network problem could be tied to the end of a game rather than to a date command and its response.
brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: xboard: -keepAlive option no longer working at FICS

Post by brianr »

FICS disconnects Tinker far more often than ICC does, which is a problem because I often have noescape set, and the combination produces too many unexplained disconnection losses. There is nothing in the WinBoard log to explain it. Accordingly, I have cut way back on FICS. Of course, Tinker can stay on ICC for days without problems. Perhaps it is some odd timestamp vs. timeseal issue.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: xboard: -keepAlive option no longer working at FICS

Post by sje »

brianr wrote: FICS disconnects Tinker far more often than ICC does, which is a problem because I often have noescape set, and the combination produces too many unexplained disconnection losses. There is nothing in the WinBoard log to explain it. Accordingly, I have cut way back on FICS. Of course, Tinker can stay on ICC for days without problems. Perhaps it is some odd timestamp vs. timeseal issue.

You may be right in fingering timeseal, as it does know the difference between move I/O and date command processing. But with closed source and no logging, who can be sure?

Just for the record, Symbolic's FICS variables (set on every logon):

Code:

set automail      0
set autoflag      1
set bell          0
set cshout        0
set ctell         0
set examine       0
set formula       !abuser & noescape & standard & nocolor & rated & registered & (assesswin > 0) & !wild
set highlight     0
set kibitz        1
set noescape      1
set notakeback    1
set open          1
set rated         1
set seek          0
set shout         0
set silence       1
set tell          0

And the batch call (same on Linux and OS X):

Code:

xboard \
        -autoflag \
        -fcp "./Symbolic -c X" \
        -fd $HOME/Arena/FICS \
        -hideThinkingFromHuman false \
        -ics \
        -icshelper $HOME/Arena/Symbolic/timeseal \
        -icshost 167.114.65.195 \
        -icslogon $HOME/Arena/Symbolic/ficslogon \
        -keepAlive 5 \
        -sgf $HOME/Arena/Symbolic/fics.pgn \
        -size Medium \
        -thinking \
        -xalarm \
        -xanimate \
        -xbuttons \
        -xzab \
        -xzadj \
        -zippyGameEnd "seek 5 0 f\\nseek 15 0 f" \
        -zippyMaxGames 4 \
        -zp