Bullet vs regular time control, say 40/4m CCRL/CEGT

lkaufman · Post by **lkaufman** » Mon Aug 31, 2015 4:53 pm

Laskos wrote:
lkaufman wrote:I think the question needs to be restated. It is obvious that more time per game is better for a given number of games. Let's assume for now that scaling is not an issue (although it certainly is), and that results would be the same at any time control if everything was perfectly fair. Then the question becomes, how fast does the game have to be played for the advantage of playing more games with less time per game to be balanced out by the advantage of less error due to extraneous circumstances like operating system time and granularity of measuring time? Presumably playing game in one second will have too much granularity/system error, and probably beyond game in one minute the loss of games with more time is more of a problem than the benefit of further reducing random error. Somewhere in that one second to one minute range is the ideal time (I think), but where? It is probably lower with Linux than with Windows and may depend on the GUI. This is not a mathematical question but one for experts on computers, so I don't know the answer, but would like to. Does anyone have any data on this?
Empirically, I found that the time allocation with respect to desired time control suffers badly below 5 second per game. The time granularity on Windows is about 15ms. The results can be very affected by noise, and a nasty, systematic one. Above 15 second per game most of the bias is from scaling. Fixed depth or nodes games can be extremely fast, but they have limited applications.

Would you expect that the threshold time per game would be much different in Linux than in Windows, or just slightly different?

Laskos · Post by **Laskos** » Mon Aug 31, 2015 5:01 pm

lkaufman wrote:
Laskos wrote:
lkaufman wrote:I think the question needs to be restated. It is obvious that more time per game is better for a given number of games. Let's assume for now that scaling is not an issue (although it certainly is), and that results would be the same at any time control if everything was perfectly fair. Then the question becomes, how fast does the game have to be played for the advantage of playing more games with less time per game to be balanced out by the advantage of less error due to extraneous circumstances like operating system time and granularity of measuring time? Presumably playing game in one second will have too much granularity/system error, and probably beyond game in one minute the loss of games with more time is more of a problem than the benefit of further reducing random error. Somewhere in that one second to one minute range is the ideal time (I think), but where? It is probably lower with Linux than with Windows and may depend on the GUI. This is not a mathematical question but one for experts on computers, so I don't know the answer, but would like to. Does anyone have any data on this?
Empirically, I found that the time allocation with respect to desired time control suffers badly below 5 second per game. The time granularity on Windows is about 15ms. The results can be very affected by noise, and a nasty, systematic one. Above 15 second per game most of the bias is from scaling. Fixed depth or nodes games can be extremely fast, but they have limited applications.
Would you expect that the threshold time per game would be much different in Linux than in Windows, or just slightly different?

I don't know.

bob · Post by **bob** » Tue Sep 01, 2015 12:24 am

lkaufman wrote:
Laskos wrote:
lkaufman wrote:I think the question needs to be restated. It is obvious that more time per game is better for a given number of games. Let's assume for now that scaling is not an issue (although it certainly is), and that results would be the same at any time control if everything was perfectly fair. Then the question becomes, how fast does the game have to be played for the advantage of playing more games with less time per game to be balanced out by the advantage of less error due to extraneous circumstances like operating system time and granularity of measuring time? Presumably playing game in one second will have too much granularity/system error, and probably beyond game in one minute the loss of games with more time is more of a problem than the benefit of further reducing random error. Somewhere in that one second to one minute range is the ideal time (I think), but where? It is probably lower with Linux than with Windows and may depend on the GUI. This is not a mathematical question but one for experts on computers, so I don't know the answer, but would like to. Does anyone have any data on this?
Empirically, I found that the time allocation with respect to desired time control suffers badly below 5 second per game. The time granularity on Windows is about 15ms. The results can be very affected by noise, and a nasty, systematic one. Above 15 second per game most of the bias is from scaling. Fixed depth or nodes games can be extremely fast, but they have limited applications.
Would you expect that the threshold time per game would be much different in Linux than in Windows, or just slightly different?

I think you can safely play VERY fast games in linux. The issue is "what gui"? I use my own extremely lightweight referee program. But the problem becomes a simple one. You have three programs, and assuming ponder=off, a chess engine computes, makes a move, writes to a pipe and blocks. Referee has to unblock, get scheduled, read from pipe, write to other pipe and block. Other program has to unblock, be scheduled, read from pipe, search, write to pipe, and block. Repeat.

That is a pretty significant amount of context-switching going on. And it is certainly not free. Linux seems to be much more streamlined here than windows, but game in 1 second is REALLY asking a lot. 60 moves say, which means each chess program blocks and unblocks 60 times, and the referee blocks and unblocks twice as much. If a program writes garbage, that makes it even worse since the referee will unblock every time there is something to read, whether it is a move or crap doesn't matter. Crafty is pretty disciplined in xboard move and game in 1 second self-test matches run pretty normally. Other programs display too much crap (just capture xboard debug log to see which ones) and that REALLY adds to the overhead when you run one game per core...

sje · Post by **sje** » Tue Sep 01, 2015 6:10 am

bob wrote:I think you can safely play VERY fast games in linux. The issue is "what gui"? I use my own extremely lightweight referee program. But the problem becomes a simple one. You have three programs, and assuming ponder=off, a chess engine computes, makes a move, writes to a pipe and blocks. Referee has to unblock, get scheduled, read from pipe, write to other pipe and block. Other program has to unblock, be scheduled, read from pipe, search, write to pipe, and block. Repeat.

That is a pretty significant amount of context-switching going on. And it is certainly not free. Linux seems to be much more streamlined here than windows, but game in 1 second is REALLY asking a lot. 60 moves say, which means each chess program blocks and unblocks 60 times, and the referee blocks and unblocks twice as much. If a program writes garbage, that makes it even worse since the referee will unblock every time there is something to read, whether it is a move or crap doesn't matter. Crafty is pretty disciplined in xboard move and game in 1 second self-test matches run pretty normally. Other programs display too much crap (just capture xboard debug log to see which ones) and that REALLY adds to the overhead when you run one game per core...

This reminds me of the task I had back in 1994 trying to connect a late version of my old program Spector with a very early version of Bob's program Crafty. I had Spector running on a Macintosh Plus and Crafty running on a 33 MHz 486DK box under MWC Coherent, an early Unix clone.

There was no network support the 486 box couldn't speak AppleTalk and there wasn't any physical link anyway. What I did do was to connect an RS-422 serial port on the Mac to a serial/parallel converter which was attached to a parallel port on the 486DX machine. For the software glue, I wrote a small and fast mediator (I guess you'd call this a referee nowadays) and voilá: hundred game long matches without need for manual supervision.

I had thought about writing a better, much more general mediator which had the ability to have hack-ish connections to commercial chess programs so as to bleed them dry of knowledge. But other software tasks proved more interesting and the project mas shelved.

But now I'm thinking it might be time to revisit the effort and produce a new, very lightweight mediator with a very general approach and which did not have any non-critical features.

1) Overhead can be greatly reduced by maximizing the information per transfer with a specification designed from the start for such.

2) Overhead can be further reduced by eliminating redundant directives and responses.

3) Separation of the transmission layer from the data layer; use TCP/IP direct connections where possible and so avoid the need to activate any access to a filesystem using locks or pipes. Questions like "did you remember to flush (your output)?" become a thing of the past.

4) Build for speed, not for looks. If there's no need to watch a game in progress, then there's no need for a fancy bit map display and its operation overhead. Likewise, if there's no need for supporting opening books, tablebases, etc., then a mediator can run even faster.

bob · Post by **bob** » Tue Sep 01, 2015 6:44 am

sje wrote:
bob wrote:I think you can safely play VERY fast games in linux. The issue is "what gui"? I use my own extremely lightweight referee program. But the problem becomes a simple one. You have three programs, and assuming ponder=off, a chess engine computes, makes a move, writes to a pipe and blocks. Referee has to unblock, get scheduled, read from pipe, write to other pipe and block. Other program has to unblock, be scheduled, read from pipe, search, write to pipe, and block. Repeat.

That is a pretty significant amount of context-switching going on. And it is certainly not free. Linux seems to be much more streamlined here than windows, but game in 1 second is REALLY asking a lot. 60 moves say, which means each chess program blocks and unblocks 60 times, and the referee blocks and unblocks twice as much. If a program writes garbage, that makes it even worse since the referee will unblock every time there is something to read, whether it is a move or crap doesn't matter. Crafty is pretty disciplined in xboard move and game in 1 second self-test matches run pretty normally. Other programs display too much crap (just capture xboard debug log to see which ones) and that REALLY adds to the overhead when you run one game per core...
This reminds me of the task I had back in 1994 trying to connect a late version of my old program Spector with a very early version of Bob's program Crafty. I had Spector running on a Macintosh Plus and Crafty running on a 33 MHz 486DK box under MWC Coherent, an early Unix clone.

There was no network support the 486 box couldn't speak AppleTalk and there wasn't any physical link anyway. What I did do was to connect an RS-422 serial port on the Mac to a serial/parallel converter which was attached to a parallel port on the 486DX machine. For the software glue, I wrote a small and fast mediator (I guess you'd call this a referee nowadays) and voilá: hundred game long matches without need for manual supervision.

I had thought about writing a better, much more general mediator which had the ability to have hack-ish connections to commercial chess programs so as to bleed them dry of knowledge. But other software tasks proved more interesting and the project mas shelved.

But now I'm thinking it might be time to revisit the effort and produce a new, very lightweight mediator with a very general approach and which did not have any non-critical features.

1) Overhead can be greatly reduced by maximizing the information per transfer with a specification designed from the start for such.

2) Overhead can be further reduced by eliminating redundant directives and responses.

3) Separation of the transmission layer from the data layer; use TCP/IP direct connections where possible and so avoid the need to activate any access to a filesystem using locks or pipes. Questions like "did you remember to flush (your output)?" become a thing of the past.

4) Build for speed, not for looks. If there's no need to watch a game in progress, then there's no need for a fancy bit map display and its operation overhead. Likewise, if there's no need for supporting opening books, tablebases, etc., then a mediator can run even faster.

The only problem is the xboard protocol. It generates multiple messages to the chess engine per move. IE time, otim, and the move. That's 3x the traffic I'd like to see. But in order to talk to other programs, my referee has to mimic the bare necessities of xboard protocol. I despise some of the force nonsense, but that's the way it has always been since it was written explicitly for gnuchess, which was hack on top of hack.

You can use xboard with no display (something like -noGUI or some such option, but it is still not super-lightweight. What I wrote works well, but I eventually had to include pieces of Crafty to deal with converting not-SAN to SAN (some engines REFUSE to accept SAN, something that is beyond me) as well as to recognize real draws since some engines make bogus draw claims, bogus game end commands (bogus result, etc). My referee handles all game termination itself, although it does allow forwarding draw offers.

Probably could be cleaned up quite a bit as I just wrote it to make it work, not to make it easy to change. It does collect PGN, can handle multiple simultaneous games as each instance is given a specific and unique pgn filename to use, as well as a kludge (for crafty) to force the program to use specific log file numbers for the rare cases where I run matches with logs on...

sje · Post by **sje** » Tue Sep 01, 2015 1:53 pm

bob wrote:The only problem is the xboard protocol. It generates multiple messages to the chess engine per move. IE time, otim, and the move. That's 3x the traffic I'd like to see. But in order to talk to other programs, my referee has to mimic the bare necessities of xboard protocol. I despise some of the force nonsense, but that's the way it has always been since it was written explicitly for gnuchess, which was hack on top of hack.

You can use xboard with no display (something like -noGUI or some such option, but it is still not super-lightweight. What I wrote works well, but I eventually had to include pieces of Crafty to deal with converting not-SAN to SAN (some engines REFUSE to accept SAN, something that is beyond me) as well as to recognize real draws since some engines make bogus draw claims, bogus game end commands (bogus result, etc). My referee handles all game termination itself, although it does allow forwarding draw offers.

Probably could be cleaned up quite a bit as I just wrote it to make it work, not to make it easy to change. It does collect PGN, can handle multiple simultaneous games as each instance is given a specific and unique pgn filename to use, as well as a kludge (for crafty) to force the program to use specific log file numbers for the rare cases where I run matches with logs on...

I might be able to help a little here with some XBoard traffic issues. Some years ago I was digging into the source and discovered that at times multiple directives were being sent with a single write to a program's input pipe. I recall specifically that there was something like:

Code: Select all

fprintf&#40;fptr, "time %d\notime %d\n", t0, t1&#41;;

So, if a program is expecting a sequence of input directives, it can be fooled by sending them all at once with a single transmission with embedded newlines. This should cut down on total transmission overhead.

sje · Post by **sje** » Tue Sep 01, 2015 2:24 pm

bob wrote:Probably could be cleaned up quite a bit as I just wrote it to make it work, not to make it easy to change. It does collect PGN, can handle multiple simultaneous games as each instance is given a specific and unique pgn filename to use, as well as a kludge (for crafty) to force the program to use specific log file numbers for the rare cases where I run matches with logs on...

One idea I have is to build a mediator from the routines in my perft program Oscar. Oscar already knows all the rules along with SAN, FEN, etc., and has proven to be extremely reliable. What Oscar is missing at the moment is its own versions of malloc()/free(). These are needed because Oscar, when running on a GPU, has no access to the usual C run time library. It does run okay on any Linux host CPU by borrowing the standard malloc()/free(). (Excuse: Oscar's code is constrained because the GPU environment supports neither recursion nor a stack.)

Once I get my own malloc()/free()installed, I'll release Oscar under the GPL for all to use.

An Oscar mediator could be easily built. Spoofing XBoard output for input to old programs wouldn't require much new code; same for UCI -- as long as the more obscure protocol features aren't used. Everything needed to play automated games would be there.

bob · Post by **bob** » Tue Sep 01, 2015 8:55 pm

sje wrote:
bob wrote:The only problem is the xboard protocol. It generates multiple messages to the chess engine per move. IE time, otim, and the move. That's 3x the traffic I'd like to see. But in order to talk to other programs, my referee has to mimic the bare necessities of xboard protocol. I despise some of the force nonsense, but that's the way it has always been since it was written explicitly for gnuchess, which was hack on top of hack.

You can use xboard with no display (something like -noGUI or some such option, but it is still not super-lightweight. What I wrote works well, but I eventually had to include pieces of Crafty to deal with converting not-SAN to SAN (some engines REFUSE to accept SAN, something that is beyond me) as well as to recognize real draws since some engines make bogus draw claims, bogus game end commands (bogus result, etc). My referee handles all game termination itself, although it does allow forwarding draw offers.

Probably could be cleaned up quite a bit as I just wrote it to make it work, not to make it easy to change. It does collect PGN, can handle multiple simultaneous games as each instance is given a specific and unique pgn filename to use, as well as a kludge (for crafty) to force the program to use specific log file numbers for the rare cases where I run matches with logs on...
I might be able to help a little here with some XBoard traffic issues. Some years ago I was digging into the source and discovered that at times multiple directives were being sent with a single write to a program's input pipe. I recall specifically that there was something like:
Code: Select all
fprintf&#40;fptr, "time %d\notime %d\n", t0, t1&#41;;
So, if a program is expecting a sequence of input directives, it can be fooled by sending them all at once with a single transmission with embedded newlines. This should cut down on total transmission overhead.

This is a standard part of xboard communication. You have to be prepared to get multiple commands separated by \n...

But not all engines do the same from their side, and the purpose of my referee is to work for anything compatible with xboard (and UCI via polyglot). Extra crap from an engine results in extra context-switches which do affect very fast games.

bob · Post by **bob** » Tue Sep 01, 2015 8:57 pm

sje wrote:
bob wrote:Probably could be cleaned up quite a bit as I just wrote it to make it work, not to make it easy to change. It does collect PGN, can handle multiple simultaneous games as each instance is given a specific and unique pgn filename to use, as well as a kludge (for crafty) to force the program to use specific log file numbers for the rare cases where I run matches with logs on...
One idea I have is to build a mediator from the routines in my perft program Oscar. Oscar already knows all the rules along with SAN, FEN, etc., and has proven to be extremely reliable. What Oscar is missing at the moment is its own versions of malloc()/free(). These are needed because Oscar, when running on a GPU, has no access to the usual C run time library. It does run okay on any Linux host CPU by borrowing the standard malloc()/free(). (Excuse: Oscar's code is constrained because the GPU environment supports neither recursion nor a stack.)

Once I get my own malloc()/free()installed, I'll release Oscar under the GPL for all to use.

An Oscar mediator could be easily built. Spoofing XBoard output for input to old programs wouldn't require much new code; same for UCI -- as long as the more obscure protocol features aren't used. Everything needed to play automated games would be there.

I wish xboard has specified SAN, period. Every program I have ever written used SAN (except back in the English Descriptive early days, and even then it used a SAN equivalent except when ambiguities were found. today programs want Nf3, or Ng1f3, or g1f3. Seems senseless.

sje · Post by **sje** » Wed Sep 02, 2015 12:40 am

bob wrote:I wish xboard has specified SAN, period. Every program I have ever written used SAN (except back in the English Descriptive early days, and even then it used a SAN equivalent except when ambiguities were found. today programs want Nf3, or Ng1f3, or g1f3. Seems senseless.

Some glue is on the way. See: http://www.talkchess.com/forum/viewtopic.php?p=639073

Bullet vs regular time control, say 40/4m CCRL/CEGT

Re: Bullet vs regular time control, say 40/4m CCRL/CEGT

Re: Bullet vs regular time control, say 40/4m CCRL/CEGT

Re: Bullet vs regular time control, say 40/4m CCRL/CEGT

Lightweight mediators

Re: Lightweight mediators

Re: Lightweight mediators

Re: Lightweight mediators

Re: Lightweight mediators

Re: Lightweight mediators

Re: Lightweight mediators