drj4759 wrote:
Thanks for pointing out the possible cause of irregularity. Here is the partial replay:
Code: Select all
# PLAYER : RATING POINTS PLAYED (%)
1 McBrain 2.7 : 3543.1 167.0 300 56
2 Raubfisch X36b : 3507.0 146.5 300 49
3 Sugar XPro 1.3 : 3502.6 144.0 300 48
4 Stockfish 17092216 : 3500.0 142.5 300 48
See details here:
http://chessowl.blogspot.com/2017/09/st ... eplay.html
Yeah ! McBrain rules !!
Stockfish 17092216 vs. Clones
Moderator: Ras
-
- Posts: 5297
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Stockfish 17092216 vs. Clones -replay
-
- Posts: 2129
- Joined: Thu May 29, 2008 10:43 am
Re: Stockfish 17092216 vs. Clones -replay
Code: Select all
CPU: AMD A8-5600K APU with Radeon(tm) HD Graphics
CORES: 4
RAM: 16GB
OS: Linux (PclinuxOs 4.9.13)
Tournament Manager: cutechess.cli v1.0
Time control: 1 minute + 1 seconds
-resign movecount=1 score=200 \
-draw movenumber=80 movecount=1 score=199 \
-concurrency 10 \
Hi Donald,
I think you might want to think about your tournament config a bit more, and make a few adjustments...
tc=80/60+1 is not 1 + 1
that's closer to 45 sec base time with a 1 sec increment
None of the games in the pgn are more than 80 ply...
and no engine ever reaches an eval of more than 200 cp
of course that makes sense if you look at the draw and resign rules closely
These are valid settings of course, but if I understand cutechess parameters properly, it appears the net effect of those two rules combined would be:
If both engines reach +-199 and at least 80 moves have been played, it's a draw
but...if one reaches -200 for 1 move they resign, win for the other side !?
so the margin between win/lose, and draw is 1 cp.
The pgn file appears to confirm this behavior.
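To make the 1 cp margin concrete, here is a minimal Python sketch of the two rules as described above. It is a simplified model for illustration only, not cutechess-cli's actual adjudication code: the function name `adjudicate` and its parameters are made up, movecount=1 is assumed for both rules, and a single score (from the side that is worse off) stands in for both engines' evaluations.

```python
def adjudicate(move_number, score_cp,
               draw_movenumber=80, draw_score=199, resign_score=200):
    """Simplified model of the -draw / -resign settings with movecount=1.

    score_cp is the evaluation in centipawns reported by the side that is
    worse off (negative means losing). Hypothetical helper, not cutechess code.
    """
    # Resign rule: a score at or below -resign_score for one move is a loss.
    if score_cp <= -resign_score:
        return "loss"
    # Draw rule: enough moves played and the score is within +-draw_score.
    if move_number >= draw_movenumber and abs(score_cp) <= draw_score:
        return "draw"
    return "continue"


print(adjudicate(80, -199))  # draw
print(adjudicate(80, -200))  # loss
print(adjudicate(30, -199))  # continue
```

With these numbers, a single centipawn at move 80 is the difference between a draw and a loss.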
The other thing, as mentioned earlier, concerns the number of concurrent games for cutechess and the Threads = 2 setting for each engine.
Since the CPU has 4 threads, an optimal config IMO would be:
4 concurrent games w/ each engine using 1 thread
or
2 concurrent games w/ each engine using 2 threads
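The arithmetic behind those two options can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: pondering is off, so only the engine on move is searching in each game, and the helper names `busy_threads` and `oversubscription` are made up for this example.

```python
def busy_threads(concurrency, threads_per_engine):
    """Search threads active at once, assuming one engine thinks per game."""
    return concurrency * threads_per_engine


def oversubscription(concurrency, threads_per_engine, cpu_threads=4):
    """Ratio of requested search threads to hardware threads (1.0 = fully used)."""
    return busy_threads(concurrency, threads_per_engine) / cpu_threads


# (concurrency, Threads) pairs: the posted setup, then the two suggestions.
for conc, thr in [(10, 2), (4, 1), (2, 2)]:
    print(conc, thr, busy_threads(conc, thr), oversubscription(conc, thr))
```

For the posted setup this gives 20 search threads on 4 hardware threads, a ratio of 5.0, while both suggested configs come out at exactly 1.0.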
Just some friendly advice, good luck with your great site!
Norm
-
- Posts: 2802
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: Stockfish 17092216 vs. Clones
drj4759 wrote:
Code: Select all
# PLAYER : RATING POINTS PLAYED (%)
1 Sugar XPro 1.3 : 3667.7 401.5 600 67
2 Raubfisch X36b : 3643.4 379.5 600 63
3 McBrain 2.7 : 3536.6 277.5 600 46
4 CorChess 1.8 : 3533.5 274.5 600 46
5 Brain Fish 170923 : 3521.0 262.5 600 44
6 Goby 170925 : 3520.5 262.0 600 44
7 Stockfish 17092216 : 3500.0 242.5 600 40
See details:
http://chessowl.blogspot.com/2017/09/st ... lones.html
Ridiculous.
A real test is here: http://www.talkchess.com/forum/viewtopic.php?t=65311
-
- Posts: 89
- Joined: Mon Nov 17, 2014 10:05 am
Re: Stockfish 17092216 vs. Clones -replay
I only recently switched to the cutechess-cli tournament manager, coming from the Arena GUI. It is very different, and some parameters are not documented, for example option.Threads, which distorted my first test because I did not specify it. I am still trying to decipher the effect of each parameter. Even the time control parameters do not seem to agree with my Arena experience.
I am particularly interested in testing the strength of the engines, so I value time and the number of games needed to produce the statistics. That's the reason why my recent tests are limited to 80 moves and 200 centipawns. Beyond 80 moves, most of the games are annoying because the pieces just dance around perpetually. When the score has a gap of 200cp, it will usually lead to a win adjudication. There are some exceptions but they are statistically irrelevant.
Cutechess-cli is amazing as it can handle lots of threads compared to Arena. I can do 7 simultaneous games with Arena/Linux/Wine, and beyond that there is a noticeable response lag which results in many time forfeits. The last post was run with cutechess-cli at concurrency 10 with 2 threads each, which is equivalent to 20 threads. It is still very responsive and no games were lost by time forfeit, on an old 4-core computer bought 5 years ago. I can produce 3 times more games with cutechess-cli than with Arena, which is important for statistics.
By the way, when can you provide Fire 6 that runs on Linux?
Thank you.
-
- Posts: 4718
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Stockfish 17092216 vs. Clones -replay
Vinvin wrote:
drj4759 wrote:
Thanks for pointing out the possible cause of irregularity. Here is the partial replay:
Code: Select all
# PLAYER : RATING POINTS PLAYED (%)
1 McBrain 2.7 : 3543.1 167.0 300 56
2 Raubfisch X36b : 3507.0 146.5 300 49
3 Sugar XPro 1.3 : 3502.6 144.0 300 48
4 Stockfish 17092216 : 3500.0 142.5 300 48
See details here:
http://chessowl.blogspot.com/2017/09/st ... eplay.html
Yeah ! McBrain rules !!
If McBrain did not play with the Brainfish book the result is irregular again...
(even with only 300 games)
Other people already mentioned that it is completely useless to play
with concurrency 10 and 2 threads on a 4 cpu machine.
It should be logical to you, even if you had never used cutechess-cli before,
because this is just about math.
Even with hyperthreading you cannot use more than 7 threads (1 thread
at least must be reserved for the OS) w/o crippling everything, but you tried to use 20!??
Also it is much easier to use the engines.json file for the specific engine options and add simply
Code: Select all
-engine conf="$name"
Currently each engine also uses just the default hash setting, which could be anything.
Guenther
-
- Posts: 4718
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Stockfish 17092216 vs. Clones -replay
@adjudication rules, Norman is right, this completely annihilates the purpose
of your tournament too.
You'll have a lot of draw adjudications which are simply not draws
and some win adjudications which are not wins.
It's really better in this case not to post such tourneys, but keep them
at home until you figured out how to run them in a way they have some value for readers of that board.
Well and 70! games are draws with 20 or less moves with your draw rule :)
-
- Posts: 1346
- Joined: Sat Apr 19, 2014 1:47 pm
Re: Stockfish 17092216 vs. Clones -replay
I ran a tournament where I flip a coin to adjudicate each game after the first 5 moves. It seems Critter is doing much better than Stockfish, Komodo and Houdini. Critter, come back plz.
-
- Posts: 89
- Joined: Mon Nov 17, 2014 10:05 am
Re: Stockfish 17092216 vs. Clones -replay
Hi! Guenther,
I am not sure what you mean by using the Brainfish book with McBrain. I guess you mean that McBrain should use the Cerebellum book. If so, it would be unfair to the other chess engines, because that book is optimized to crush engines without an equivalent book. McBrain and Sugar have the capability to use Cerebellum. If Cerebellum is allowed for these engines, they will produce a single opening, the Giuoco Piano, over and over. It is ugly and will mostly end in draws. Meanwhile Stockfish will be clobbered to death with just a 5-move opening book.
I regularly played 7 simultaneous games with Linux/Arena/Wine and could still use the computer for browsing the Internet, compiling, file maintenance tasks, etc. With cutechess-cli I regularly use a concurrency of 12 and the computer is still responsive, whereas Arena is somewhat sluggish with 7 threads. The irregular tournament with Sugar Pro was run with concurrency 10, and I was unaware that Sugar and Raubfisch were using 4 threads per instance. Within those 10 concurrent games at least 5 Sugar Pro and Raubfisch instances were running, which consumed the equivalent of 20 threads, plus 5 threads for the remaining concurrent games. Every now and then I checked the progress of that tournament and I never noticed any sluggishness, even though htop showed 100% usage on all 4 CPUs.
For me, that's fantastic serendipity, because I thought I could only have a maximum of 12 threads with cutechess, but it turned out to be capable of using more than 20 threads on a 4-core CPU. This is actual experience, and the proof is in the posted PGN games, which are all normal with no time losses.
I have not tried JSON yet but may try it someday when my simple set-up gets complicated.
Thanks.
-
- Posts: 89
- Joined: Mon Nov 17, 2014 10:05 am
Re: Stockfish 17092216 vs. Clones -replay
Hi! Guenther,
I am aware of the side effects of my 80 moves/200 centipawns rule. What drove me to this decision is that I am never happy seeing games that run to 200 moves and still end in a draw; they also consume time and a lot of disk space.
The statistics I produce are primarily for my personal curiosity, but I share them so that others may compare. After all, they are only statistics, not a life-or-death matter. Anyone can ignore them at their pleasure.
If that means I am not allowed to post in this forum, so be it. I don't care.
Thank you.
-
- Posts: 2129
- Joined: Thu May 29, 2008 10:43 am
Re: Stockfish 17092216 vs. Clones -replay
Yes cutechess is awesome...
Here's an idea for resign and draw adjudication
-draw movenumber=40 movecount=8 score=5
-resign movecount=4 score=500
TCEC uses:
draw: +0.05 to -0.05 pawns for the last 5 moves, or 10 plies.
resign: 6.50 pawns (or -6.50 in case of a black win) for 4 consecutive moves, or 8 plies
In cutechess format, that would be
-draw movecount=5 score=5
-resign movecount=4 score=650
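The conversion from TCEC's pawn units to the centipawn values in those two lines can be sketched in Python. This is just an illustration of the arithmetic; the helper names `pawns_to_cp` and `cutechess_adjudication` are made up, and the output only mirrors the two option strings shown above (it does not check them against any particular cutechess-cli version).

```python
def pawns_to_cp(pawns):
    """Convert a score in pawns to whole centipawns."""
    return round(pawns * 100)


def cutechess_adjudication(draw_pawns, draw_moves, resign_pawns, resign_moves):
    """Build -draw / -resign option strings from pawn-based thresholds."""
    draw = f"-draw movecount={draw_moves} score={pawns_to_cp(draw_pawns)}"
    resign = f"-resign movecount={resign_moves} score={pawns_to_cp(resign_pawns)}"
    return draw, resign


# TCEC-style rules quoted above: 0.05 pawns for 5 moves, 6.50 pawns for 4 moves.
for option in cutechess_adjudication(0.05, 5, 6.50, 4):
    print(option)
```

This prints the same two lines as above, so changing a threshold only means changing the pawn value.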
The engines.json file is really useful for maintaining the engine configurations...
I have a good working example for an 8 engine tournament, with corresponding batch file to start a tournament.
I'd be happy to share them with you via email...