unbuffered input/output

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: read/write vs recv/send

Post by Sven »

O.k. Bob, the point goes to you :-)

recv(0, ...) seems to work the way you have described, although it is

- very counter-intuitive,
- does not match the goal of programs like rsh/ssh to make the intermediate socket connection transparent to both client and server program, and also
- introduces an unwanted dependency on the origin of a program's input (i.e., there must be a socket connection bound to its standard input).

Therefore I would never recommend using it. Even after adding complete socket support to xboard/WinBoard or other GUIs, an engine using recv() this way will no longer run with older GUI versions, or GUIs that do not support it. It is a basic change that can break a lot.
bob wrote:What "socket parameter" are you talking about? The first argument? zero (descriptor zero, which is stdin).
Indeed I meant the first argument of recv() when asking about the "socket parameter", no other parameter is a socket. The last parameter is "flags". When I looked into various versions of rshd.c I did not find any dup() or dup2() call that does what you wrote. That was one of the reasons why I did not believe it. Either I missed it, or rshd uses some internal trick instead of dup2(). I did not look into the sshd source, though.

And I missed the non-obvious "0" as first argument to recv(), I thought it had to be a file descriptor that was the result of a socket() function call at some early time, and would therefore have a value > 2 that would be unknown in the general case.

So the essential code that would have to be present in the child process init part of the daemon would be like this:

Code:

close(0);
dup2(socket, 0);
close(socket);
I hope this can be found somewhere, otherwise I still do not understand how and why it works.
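
For what it's worth, the redirection can be tried without a daemon. The sketch below is only a stand-in under stated assumptions: it uses socketpair() instead of a real network connection, and the helper names redirect_stdin_to() and demo_recv_on_stdin() are made up for illustration. It also drops the leading close(0), since dup2() closes the target descriptor itself if it is open.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Make descriptor `sock` the process's standard input.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int redirect_stdin_to(int sock)
{
    if (sock == STDIN_FILENO)
        return 0;                        /* already in place */
    if (dup2(sock, STDIN_FILENO) < 0)    /* dup2 closes the old fd 0 itself */
        return -1;
    return close(sock);                  /* fd 0 still refers to the socket */
}

/* Self-check: a socketpair() stands in for the network connection
 * (an assumption, so the sketch runs without any daemon).
 * Returns 1 if data sent on the peer end arrives via recv(0, ...). */
int demo_recv_on_stdin(void)
{
    int sv[2];
    char buf[16] = {0};
    const char msg[] = "ping";

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;
    if (redirect_stdin_to(sv[0]) < 0)
        return 0;
    if (send(sv[1], msg, sizeof msg, 0) != (ssize_t)sizeof msg)
        return 0;
    if (recv(STDIN_FILENO, buf, sizeof buf, 0) != (ssize_t)sizeof msg)
        return 0;
    close(sv[1]);
    return strcmp(buf, msg) == 0;
}
```

If demo_recv_on_stdin() returns 1, descriptor 0 was a socket at the time of the recv() call, which is the situation discussed above.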

Sven
hgm
Posts: 27886
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: read/write vs recv/send

Post by hgm »

Sven Schüle wrote:Even after adding complete socket support to xboard/WinBoard or other GUIs, an engine using recv() this way will no longer run with older GUI versions, or GUIs that do not support it. It is a basic change that can break a lot.
And for exactly zero gain, I might add. As Bob already remarked, this whole socket business is a non-issue. Pipes work fine, pipes are perfect.

When your application cannot stand buffering, you will have to either
*) use low-level routines that bypass all buffering.
*) switch off buffering in the high-level routines you use.

That is a general truth, and thus holds for both pipes and sockets.

And, like I said before, input buffering is harmless, as long as you have a means to interrogate the buffer, and don't forget to do it. This unlike output buffering, which you really have to flush to prevent deadlocks.
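
A minimal sketch of the second option for C stdio, assuming the engine talks to the GUI over stdin/stdout. The function names are illustrative only. setvbuf() with _IONBF switches buffering off; calling fflush() after each command is the alternative for output when buffering is kept.

```c
#include <stdio.h>

/* Turn off stdio buffering on both standard streams.
 * Meant to be called before any other I/O is done on them.
 * Returns 0 on success, nonzero on failure. */
int make_stdio_unbuffered(void)
{
    int rc = 0;
    rc |= setvbuf(stdin,  NULL, _IONBF, 0);
    rc |= setvbuf(stdout, NULL, _IONBF, 0);
    return rc;
}

/* Output-only alternative: keep buffering, but flush after every command
 * so the peer is never left waiting for text stuck in our buffer.
 * Returns 0 on success, -1 on failure. */
int send_line(const char *line)
{
    if (fputs(line, stdout) == EOF || fputc('\n', stdout) == EOF)
        return -1;
    return fflush(stdout) == EOF ? -1 : 0;
}
```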
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: read/write vs recv/send

Post by bob »

hgm wrote:Note that input buffering in itself is not a bad thing, in an engine or GUI. Problems arise because we usually use incompatible routines to read input, and to check for it. If we use a call that asks the OS if it has input for us, it might say no, but we might still have input in the application's buffer. What we would have needed was a routine that first checks if there is anything in our own buffer, and only if there isn't, asks if the OS has any input for us.

Forcing unbuffered input is just a kludge to enable us to use the wrong input check, at the cost of lowered efficiency.

In practice, WB protocol gives trouble with the time/otim/MOVE commands. These are sent so quickly after one another that the engine gets them all at once when it is doing buffered input; the engine then reads the time command, but otim and MOVE stay in the buffer. When you then start pondering and use a call that asks the OS whether there is input to terminate it, you hang, because the input was already there, invisible to you, and nothing else is ever going to arrive. Some of my engines solve this by not pondering after receiving time and otim. That solves the problem in practice. But a proper check for input (i.e. buffer + OS) would be a safer, fundamental solution to the problem.
Don't disagree at all. Problem is, finding a solution that works on every platform. And that is an issue. Unless you resort to basic I/O (such as my read/write recommendation)...
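
One way such a combined check could look on POSIX systems, as a sketch only: the engine does its own buffering on top of read(), and the availability test looks at that buffer before asking the OS via poll(). The InBuf type and both function names are hypothetical, and this does not cover Windows pipes, which need a different mechanism.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical application-level input buffer. */
typedef struct {
    int    fd;
    char   data[4096];
    size_t len;                 /* bytes currently held in data[] */
} InBuf;

/* Return 1 if a complete line is already buffered or the OS reports
 * readable data (or EOF) on fd; 0 otherwise. Never blocks. */
int input_available(const InBuf *b)
{
    struct pollfd p;
    if (memchr(b->data, '\n', b->len) != NULL)
        return 1;               /* our own buffer is checked first */
    p.fd = b->fd;
    p.events = POLLIN;
    p.revents = 0;
    return poll(&p, 1, 0) > 0 && (p.revents & (POLLIN | POLLHUP)) != 0;
}

/* Copy the next line (without '\n') into out. Blocks in read() only when
 * no complete line is buffered. Returns the line length, or -1 on
 * EOF, error, or a line that does not fit. */
int read_line(InBuf *b, char *out, size_t outsz)
{
    for (;;) {
        char *nl = memchr(b->data, '\n', b->len);
        if (nl != NULL) {
            size_t n = (size_t)(nl - b->data);
            if (n >= outsz)
                return -1;
            memcpy(out, b->data, n);
            out[n] = '\0';
            b->len -= n + 1;
            memmove(b->data, nl + 1, b->len);   /* keep the leftover lines */
            return (int)n;
        }
        if (b->len == sizeof b->data)
            return -1;          /* line too long for the buffer */
        ssize_t got = read(b->fd, b->data + b->len, sizeof b->data - b->len);
        if (got <= 0)
            return -1;
        b->len += (size_t)got;
    }
}
```

With time, otim and the move arriving in a single read(), read_line() returns the time command and input_available() still reports input, because the remaining two lines sit in the application buffer.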
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: read/write vs recv/send

Post by bob »

Sven Schüle wrote:O.k. Bob, the point goes to you :-)

recv(0, ...) seems to work the way you have described, although it is

- very counter-intuitive,
- does not match the goal of programs like rsh/ssh to make the intermediate socket connection transparent to both client and server program, and also
- introduces an unwanted dependency on the origin of a program's input (i.e., there must be a socket connection bound to its standard input).
Point being the GUI handles this stuff. The program doesn't have a clue what it is reading from. send/recv is a separate issue. One could also use other protocols besides TCP/IP. UDP, for example. Or SCTP. But including the protocol only makes everyone's task harder. I like simple. And I also like doing something once. Pipes/sockets are interchangeable so long as normal I/O is being used (read/etc). Going to UDP would work from the I/O perspective, but now you have to acknowledge messages since error recovery is up to the programmer. Out-of-order data is common and has to be handled. I prefer to keep things simple.

Therefore I would never recommend using it. Even after adding complete socket support to xboard/WinBoard or other GUIs, an engine using recv() this way will no longer run with older GUI versions, or GUIs that do not support it. It is a basic change that can break a lot.
Agree completely. I don't see any advantage, and there are certainly down-side issues.
bob wrote:What "socket parameter" are you talking about? The first argument? zero (descriptor zero, which is stdin).
Indeed I meant the first argument of recv() when asking about the "socket parameter", no other parameter is a socket. The last parameter is "flags". When I looked into various versions of rshd.c I did not find any dup() or dup2() call that does what you wrote. That was one of the reasons why I did not believe it. Either I missed it, or rshd uses some internal trick instead of dup2(). I did not look into the sshd source, though.

And I missed the non-obvious "0" as first argument to recv(), I thought it had to be a file descriptor that was the result of a socket() function call at some early time, and would therefore have a value > 2 that would be unknown in the general case.

So the essential code that would have to be present in the child process init part of the daemon would be like this:

Code:

close(0);

You don't need to close stdin.  You can.  You can even dup2 it over an unused descriptor to "save it" and then later dup2 that saved descriptor back over stdin to recover it, if you want.
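
A rough sketch of that save-and-restore idea, with hypothetical helper names. One substitution to note: it uses dup() to pick the unused descriptor for the saved copy, rather than dup2() onto a descriptor number chosen by hand.

```c
#include <unistd.h>

/* Save the current stdin on a spare descriptor and put new_fd in its
 * place. Returns the saved descriptor, or -1 on failure. */
int swap_stdin(int new_fd)
{
    int saved = dup(STDIN_FILENO);       /* lowest unused descriptor */
    if (saved < 0)
        return -1;
    if (dup2(new_fd, STDIN_FILENO) < 0) {
        close(saved);
        return -1;
    }
    return saved;
}

/* Put the saved descriptor back on fd 0 and release the spare copy.
 * Returns 0 on success, -1 on failure. */
int restore_stdin(int saved)
{
    if (dup2(saved, STDIN_FILENO) < 0)
        return -1;
    return close(saved);
}
```

Between the two calls, anything that reads descriptor 0 gets its input from new_fd; after restore_stdin() the original stdin is back.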

Sven Schüle wrote:
dup2(socket, 0);
close(socket);
I hope this can be found somewhere, otherwise I still do not understand how and why it works.
You do _not_ want to close the socket. All you have are two different descriptors that are the same. If you close one, you close both. To understand dup2, you might look at xboard source where it starts a child process to execute a chess engine...


