Crafty 23.1 (JA) oddities

Don · Post by **Don** » Sun Nov 29, 2009 11:44 pm

hgm wrote:Why not simply run the WinBoard build?

And now that we are discussing this: is there already a Crafty build that can be configured completely through the WinBoard interface, understanding the memory and cores commands, and defining option features for all the options normally in its ini file? That would be much more interesting than a UCI version.

I want to run 64 bit linux. I don't have windows.

michiguel · Post by **michiguel** » Mon Nov 30, 2009 12:54 am

Don wrote:
hgm wrote:Why not simply run the WinBoard build?

And now that we are discussing this: is there already a Crafty build that can be configured completely through the WinBoard interface, understanding the memory and cores commands, and defining option features for all the options normally in its ini file? That would be much more interesting than a UCI version.
I want to run 64 bit linux. I don't have windows.

You can do everything hgm said in Linux. In fact, my engine is fully configurable through the xboard interface.

See a screenshot I just made in Ubuntu 9.04
http://sites.google.com/site/gaviotache ... figuration

As far as I know, Gaviota is the first engine to support all these new modifications to the protocol.

Miguel

Don · Post by **Don** » Mon Nov 30, 2009 2:23 am

michiguel wrote:
Don wrote:
hgm wrote:Why not simply run the WinBoard build?

And now that we are discussing this: is there already a Crafty build that can be configured completely through the WinBoard interface, understanding the memory and cores commands, and defining option features for all the options normally in its ini file? That would be much more interesting than a UCI version.
I want to run 64 bit linux. I don't have windows.
You can do everything hgm said in Linux. In fact, my engine is fully configurable through the xboard interface.

See a screenshot I just made in Ubuntu 9.04
http://sites.google.com/site/gaviotache ... figuration

As far as I know, Gaviota is the first engine to support all these new modifications to the protocol.

Miguel

I need to use my own tester, which is UCI-based. xboard is a fine program and I make good use of it, but it does not even begin to compare to my own tester, which is designed for heavy-duty, serious testing. Xboard is well suited as a GUI, and I like it because it is clean, simple and relatively bug-free, but it is not suited to the kind of testing I do.

So my situation is that I just want to run one WinBoard engine on my own UCI-based tester, which is custom-designed to do what I need it to do. I can probably just make my tester support WinBoard with a couple of hours of work and another couple of hours of debugging, but I am lazy. Or I might build an adapter just because it would be fun.
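
A minimal sketch of what the core of such an adapter could look like, in Python: it translates a UCI `position` command into the equivalent WinBoard/xboard commands. The function name `uci_position_to_xboard` and the `usermove` flag are assumptions for illustration, not code from anyone's tester in this thread. A real adapter would also have to handle feature negotiation, time controls, and turning the engine's `move ...` replies back into `bestmove ...`.

```python
def uci_position_to_xboard(line, usermove=False):
    """Translate one UCI 'position' command into a list of xboard commands.

    Hypothetical helper: it only covers position setup, not 'go' or clocks.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "position":
        raise ValueError("not a UCI position command")

    # Split the command into the setup part and the move list.
    if "moves" in tokens:
        split = tokens.index("moves")
        setup, moves = tokens[1:split], tokens[split + 1:]
    else:
        setup, moves = tokens[1:], []

    # 'new' resets the engine; 'force' stops it from replying while we feed moves.
    commands = ["new", "force"]
    if setup[:1] == ["fen"]:
        commands.append("setboard " + " ".join(setup[1:]))
    elif setup != ["startpos"]:
        raise ValueError("expected 'startpos' or 'fen ...'")

    # Engines that announced 'feature usermove=1' expect a 'usermove' prefix.
    prefix = "usermove " if usermove else ""
    commands.extend(prefix + move for move in moves)
    return commands
```

On a subsequent UCI `go`, the adapter would send the clock settings followed by the xboard `go` command, which tells the engine to start playing the side to move.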

hgm · Post by **hgm** » Mon Nov 30, 2009 9:05 am

Don wrote:I want to run 64 bit linux. I don't have windows.

I was not suggesting that you should use WinBoard or Windows; just that you could use a Crafty that speaks WinBoard protocol, like it natively does. But I understand now that your tester does not support WinBoard engines.

Btw, I do all my serious testing with WinBoard (using PSWBTM), and I have never run into any limitations doing that. (Otherwise I would have removed those, of course...) I suppose one could do the same under Linux with XBoard.

Don · Post by **Don** » Mon Nov 30, 2009 8:43 pm

hgm wrote:
Don wrote:I want to run 64 bit linux. I don't have windows.
I was not suggesting that you should use WinBoard or Windows; just that you could use a Crafty that speaks WinBoard protocol, like it natively does. But I understand now that your tester does not support WinBoard engines.

Btw, I do all my serious testing with WinBoard (using PSWBTM), and I have never run into any limitations doing that. (Otherwise I would have removed those, of course...) I suppose one could do the same under Linux with XBoard.

xboard is a wonderful tool, I'm not knocking it, it's just that my own tester is pretty hard core in comparison.

One feature of my tester that I don't see in xboard is that it schedules multiple games simultaneously on multi-core machines. It doesn't just schedule them randomly; it tracks the openings and never repeats an opening between the same two opponents. It plays massive round robins between any number of individual players I specify. I can play handicap matches because each discrete player is assigned its time control separately. My tester rates the players and produces periodic HTML reports. I could go on and on, but suffice it to say that I designed it from the point of view of a developer, not a tester. It's also super fast, and I can run time controls like game in 2 seconds. Although I rarely do anything that fast, it works in a consistent way and it was designed to be efficient to the point of being ridiculous.
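
To make the scheduling idea concrete, here is a rough Python sketch of a round-robin schedule in which no pair of players ever sees the same opening twice. This is not the tester's actual code; `schedule_round_robin` and its parameters are made up for illustration, and per-player time controls, ratings and parallel execution are left out.

```python
from itertools import combinations


def schedule_round_robin(players, openings, games_per_pair):
    """Return a list of (white, black, opening) games.

    Every pair of players meets games_per_pair times, alternating colors,
    and never plays the same opening twice against each other.
    """
    if games_per_pair > len(openings):
        raise ValueError("not enough openings to avoid repeats")

    games = []
    for pair_index, (a, b) in enumerate(combinations(players, 2)):
        for game_index in range(games_per_pair):
            # Consecutive indices modulo the book size are distinct as long
            # as games_per_pair <= len(openings), so no opening repeats.
            opening = openings[(pair_index + game_index) % len(openings)]
            # Alternate colors between consecutive games of the same pair.
            white, black = (a, b) if game_index % 2 == 0 else (b, a)
            games.append((white, black, opening))
    return games
```

A multi-core tester could then hand the resulting list to a pool of worker processes, one game per free core.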

Anyway, I think if I hacked together a lot of scripts I could get SOME or even most of that functionality with xboard, but not all of it. And if I have to do all of that, it's like building a custom tester anyway, so I might as well build my own. And of course, as a developer I like to have full ownership of the source code, so I'm free to make any change I feel like.

Can xboard play 10 games per second? I can play 3-ply games at that rate on a quad-core machine I have. Even on a single processor I can play a couple of games per second, and I can go much faster if I do 2- or 1-ply games. Admittedly, I don't do that very often, but Larry and I do that once in a while to debug something. (Usually of the general form that if it doesn't win at 3 ply we must have done something wrong.)

Zach Wegner · Post by **Zach Wegner** » Mon Nov 30, 2009 9:06 pm

Don wrote:xboard is a wonderful tool, I'm not knocking it, it's just that my own tester is pretty hard core in comparison.

One feature of my tester that I don't see in xboard is that it schedules multiple games simultaneously on multi-core machines. It doesn't just schedule them randomly; it tracks the openings and never repeats an opening between the same two opponents. It plays massive round robins between any number of individual players I specify. I can play handicap matches because each discrete player is assigned its time control separately. My tester rates the players and produces periodic HTML reports. I could go on and on, but suffice it to say that I designed it from the point of view of a developer, not a tester. It's also super fast, and I can run time controls like game in 2 seconds. Although I rarely do anything that fast, it works in a consistent way and it was designed to be efficient to the point of being ridiculous.

Anyway, I think if I hacked together a lot of scripts I could get SOME or even most of that functionality with xboard, but not all of it. And if I have to do all of that, it's like building a custom tester anyway, so I might as well build my own. And of course, as a developer I like to have full ownership of the source code, so I'm free to make any change I feel like.

Can xboard play 10 games per second? I can play 3-ply games at that rate on a quad-core machine I have. Even on a single processor I can play a couple of games per second, and I can go much faster if I do 2- or 1-ply games. Admittedly, I don't do that very often, but Larry and I do that once in a while to debug something. (Usually of the general form that if it doesn't win at 3 ply we must have done something wrong.)

Well my tester schedules multiple games across clusters.

My tester is also designed to be ultra-efficient. It's UCI, but each engine I use is modified so that only the last move has to be sent (position moves e2e4). There are no threads (except when testing on multiple cores), no timers, and it doesn't even save PGN files.

bob · Post by **bob** » Mon Nov 30, 2009 9:28 pm

Zach Wegner wrote:
Don wrote:xboard is a wonderful tool, I'm not knocking it, it's just that my own tester is pretty hard core in comparison.

One feature of my tester that I don't see in xboard is that it schedules multiple games simultaneously on multi-core machines. It doesn't just schedule them randomly; it tracks the openings and never repeats an opening between the same two opponents. It plays massive round robins between any number of individual players I specify. I can play handicap matches because each discrete player is assigned its time control separately. My tester rates the players and produces periodic HTML reports. I could go on and on, but suffice it to say that I designed it from the point of view of a developer, not a tester. It's also super fast, and I can run time controls like game in 2 seconds. Although I rarely do anything that fast, it works in a consistent way and it was designed to be efficient to the point of being ridiculous.

Anyway, I think if I hacked together a lot of scripts I could get SOME or even most of that functionality with xboard, but not all of it. And if I have to do all of that, it's like building a custom tester anyway, so I might as well build my own. And of course, as a developer I like to have full ownership of the source code, so I'm free to make any change I feel like.

Can xboard play 10 games per second? I can play 3-ply games at that rate on a quad-core machine I have. Even on a single processor I can play a couple of games per second, and I can go much faster if I do 2- or 1-ply games. Admittedly, I don't do that very often, but Larry and I do that once in a while to debug something. (Usually of the general form that if it doesn't win at 3 ply we must have done something wrong.)
Well my tester schedules multiple games across clusters.

My tester is also designed to be ultra-efficient. It's UCI, but each engine I use is modified so that only the last move has to be sent (position moves e2e4). There are no threads (except when testing on multiple cores), no timers, and it doesn't even save PGN files.

I wrote a facility to do all of that as well. I can run multiple games at a time, across a cluster, and the engines can use one to eight CPUs (depending on which cluster I run on) when I want to do SMP testing (which is relatively rare, but it does happen). I also save the PGN and automatically run the stuff through bayeselo to produce the final results. And I have a mechanism to play multiple matches, varying some internal parameter for each match if I want to tune something or test several ideas in succession.

Don · Post by **Don** » Mon Nov 30, 2009 9:29 pm

Zach Wegner wrote:
Don wrote:xboard is a wonderful tool, I'm not knocking it, it's just that my own tester is pretty hard core in comparison.

One feature of my tester that I don't see in xboard is that it schedules multiple games simultaneously on multi-core machines. It doesn't just schedule them randomly; it tracks the openings and never repeats an opening between the same two opponents. It plays massive round robins between any number of individual players I specify. I can play handicap matches because each discrete player is assigned its time control separately. My tester rates the players and produces periodic HTML reports. I could go on and on, but suffice it to say that I designed it from the point of view of a developer, not a tester. It's also super fast, and I can run time controls like game in 2 seconds. Although I rarely do anything that fast, it works in a consistent way and it was designed to be efficient to the point of being ridiculous.

Anyway, I think if I hacked together a lot of scripts I could get SOME or even most of that functionality with xboard, but not all of it. And if I have to do all of that, it's like building a custom tester anyway, so I might as well build my own. And of course, as a developer I like to have full ownership of the source code, so I'm free to make any change I feel like.

Can xboard play 10 games per second? I can play 3-ply games at that rate on a quad-core machine I have. Even on a single processor I can play a couple of games per second, and I can go much faster if I do 2- or 1-ply games. Admittedly, I don't do that very often, but Larry and I do that once in a while to debug something. (Usually of the general form that if it doesn't win at 3 ply we must have done something wrong.)
Well my tester schedules multiple games across clusters.

My tester is also designed to be ultra-efficient. It's UCI, but each engine I use is modified so that only the last move has to be sent (position moves e2e4). There are no threads (except when testing on multiple cores), no timers, and it doesn't even save PGN files.

Mine produces PGN files; I don't know how much that impacts the speed. Before this tester I wrote one in Tcl, and producing PGN files was a major drag on the speed. I suspect that I could have written much faster Tcl code for move validation and producing PGN files, but I could still see that ultimately this should be done in C.

UCI is supposed to be stateless and you broke that abstraction. Not that it really matters as long as it works! I always thought that for slow network transmitted games there should be a single token that symbolically represents "ditto" or the same exact move list we (the interface) sent last time. So instead of:

"position startpos moves e2e4 e7e5 g1f3 b8c6 f1b5 a7a6"

you could have

"position startpos moves ditto f1b5 a7a6"

Of course ditto is recursive so the next time you see ditto it means the full list through a7a6. The engine would have to maintain state which would include the latest value of ditto. (I think most UCI engines maintain state anyway, for instance I don't clear my hash tables between moves.)
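
A quick sketch of the engine-side bookkeeping, just to show how little state is involved. `ditto` is not part of the UCI protocol; it is only the idea floated above, and the class name `DittoState` is invented for this example.

```python
class DittoState:
    """Remembers the last full move list so 'ditto' can be expanded.

    Hypothetical extension: standard UCI has no 'ditto' token.
    """

    def __init__(self):
        self.last_moves = []

    def expand(self, command):
        """Return the 'position' command with a leading 'ditto' expanded."""
        tokens = command.split()
        if "moves" not in tokens:
            # A bare position (e.g. a new game) resets the remembered list.
            self.last_moves = []
            return command
        split = tokens.index("moves")
        head, moves = tokens[:split + 1], tokens[split + 1:]
        if moves and moves[0] == "ditto":
            # 'ditto' stands for the entire move list received last time.
            moves = self.last_moves + moves[1:]
        # The expanded list becomes the new meaning of 'ditto'.
        self.last_moves = moves
        return " ".join(head + moves)
```

Feeding it `position startpos moves e2e4 e7e5 g1f3 b8c6` and then `position startpos moves ditto f1b5 a7a6` yields the full six-move command, and a later `ditto` stands for the list through a7a6.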

It's just not an issue unless you are communicating over a slow network. If you are on a fast network I'm not even sure it matters. I think a lot of people are real anal about the "horror" of sending a few moves via stdout, but when you consider that I can play several games per second while sending the entire game on each move, it's a non-issue. The most important issue, which is orders of magnitude more important, is to not use a GUI when massively auto-testing.
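
For a rough sense of scale, the traffic can be estimated with a few lines of Python. The assumptions here are mine, not measurements: every move is four characters plus a separating space (promotions would add one more), one `position` command is sent per ply, and the function name `bytes_sent_per_game` is made up for the example.

```python
def bytes_sent_per_game(plies, chars_per_move=5, header="position startpos moves"):
    """Estimate the bytes of 'position' commands sent over one game.

    Before each ply the interface resends the header plus every move
    played so far, followed by a newline.
    """
    total = 0
    for played in range(plies):
        total += len(header) + played * chars_per_move + 1  # +1 for the newline
    return total
```

Under those assumptions a 100-ply game comes to 27,150 bytes, so even several games per second amounts to a small load for a local pipe.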

hgm · Post by **hgm** » Mon Nov 30, 2009 9:42 pm

What I definitely cannot do is schedule simultaneous games from one tourney. But I never really needed to do that either: I just run several independent gauntlets in parallel, to make good use of all my cores.

Massive round-robins are no problem; I just select the engines from PSWBTM's list by clicking them, enter a TC (if it is different from last time) and the tourney is ready to start. Handicap matches I do by installing versions of the engine with a time-odds factor defined; if I select that version it then plays the entire tourney with the specified time-odds. I usually play openings or start positions from a file, such that each engine pairing plays the same set of openings. I.e. if I play 20 games per pairing, I supply a file with 10 openings that are played with both colors. I usually play with engines that randomize, so I often play the same set of positions multiple times. (E.g. 216 positions in the file, and 2160 games.)
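
As an illustration of the "each opening with both colors" scheme, here is a tiny Python sketch. It is not how PSWBTM is implemented; `games_for_pairing` is a hypothetical name and the openings are just opaque labels.

```python
def games_for_pairing(engine_a, engine_b, openings):
    """Return (white, black, opening) games: every opening once per color."""
    games = []
    for opening in openings:
        games.append((engine_a, engine_b, opening))  # engine_a has White
        games.append((engine_b, engine_a, opening))  # colors reversed
    return games
```

With a file of 10 openings this gives the 20 games per pairing mentioned above.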

For none of that I have to do any programming or scripting. Just tell the software through the menus what I want done. Calculating ratings is not built in; for that I use an external program (BayesElo), which I can put on a list of commands to be executed after each round of the tournament I am running. (A facility of the tournament manager.)

I never tried to play games that fast; a quick test reveals that on my single core I seem to be limited to 2 games per second. I am not sure what limits that. I guess both your tester and XBoard will suffer the same communication and scheduling delays by communicating with the engines through pipes. I am not nearly as fast as what you quote, but the fact that you use a quad, and that the CPU I tried it on runs at only 1.3GHz, might have something to do with that too.

I am sure that your own dedicated tester will suit your needs better. (But it does seem to have some severe drawbacks as well, which would bother others a lot, such as not supporting WinBoard protocol...) All I wanted to point out is that one can do a lot of serious testing with off-the-shelf software developed around XBoard.

Don · Post by **Don** » Mon Nov 30, 2009 10:02 pm

hgm wrote:What I definitely cannot do is schedule simultaneous games from one tourney. But I never really needed to do that either: I just run several independent gauntlets in parallel, to make good use of all my cores.

Massive round-robins are no problem; I just select the engines from PSWBTM's list by clicking them, enter a TC (if it is different from last time) and the tourney is ready to start. Handicap matches I do by installing versions of the engine with a time-odds factor defined; if I select that version it then plays the entire tourney with the specified time-odds. I usually play openings or start positions from a file, such that each engine pairing plays the same set of openings. I.e. if I play 20 games per pairing, I supply a file with 10 openings that are played with both colors. I usually play with engines that randomize, so I often play the same set of positions multiple times. (E.g. 216 positions in the file, and 2160 games.)

For none of that I have to do any programming or scripting. Just tell the software through the menus what I want done. Calculating ratings is not built in; for that I use an external program (BayesElo), which I can put on a list of commands to be executed after each round of the tournament I am running. (A facility of the tournament manager.)

I never tried to play games that fast; a quick test reveals that on my single core I seem to be limited to 2 games per second. I am not sure what limits that. I guess both your tester and XBoard will suffer the same communication and scheduling delays by communicating with the engines through pipes. I am not nearly as fast as what you quote, but the fact that you use a quad, and that the CPU I tried it on runs at only 1.3GHz, might have something to do with that too.

I am sure that your own dedicated tester will suit your needs better. (But it does seem to have some severe drawbacks as well, which would bother others a lot, such as not supporting WinBoard protocol...) All I wanted to point out is that one can do a lot of serious testing with off-the-shelf software developed around XBoard.

I agree. I hope I didn't imply you couldn't be serious with xboard because I know that you can be. I actually used xboard fairly heavily when developing cilkchess a decade or more ago and it's definitely workable. I remember that I had a few scripts that made it easier to do things I wanted to do and we wrote our own program to rate the games. Either bayeselo wasn't around then or I didn't know about it.

If you are getting 2 games per second that is fast enough for serious debugging too.
