Question for H.G. Muller


LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 2:16 am

Question for H.G. Muller

Post by LucenaTheLucid »

First just want to say what a fantastic job you have done with Winboard and thank you.

Now let me explain what I am doing. I am running UCI engines under Winboard with the Polyglot adapter. Say I am testing Stockfish 1.8A and Stockfish 1.8B and I want to play them against a variety of engines to measure which one is stronger. Is there a way I can set up a batch file with something like this:

Code: Select all

start /wait C:\WinBoard\winboard -cp -fcp "polyglot.exe Stockfish18TACTICAL.ini" -fd "C:\Polyglot15w" -scp "polyglot.exe Critter.ini" -sd "C:\Polyglot15w" -initString "new\n" /mg 10 /tc 1 -firstTimeOdds 2 -secondTimeOdds 2
But instead of setting /tc 1, can I have it play games at a fixed number of seconds per move? For instance, a gauntlet where each move gets 0.5 or 0.1 seconds?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question for H.G. Muller

Post by bob »

I think you can enter seconds with 0:01. I had done this to xboard years ago, but I believe it became a standard feature... HG can answer for sure.
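
For the gauntlet part of the question, the command line from the first post can be generated for each pairing instead of being typed by hand. The following is a minimal Python sketch, not a WinBoard feature. It only reuses the options that already appear in the original command. The .ini names other than Stockfish18TACTICAL.ini and Critter.ini are hypothetical placeholders, and the `tc` value is passed through unchanged, so something like the 0:01 suggested above can be tried, although whether WinBoard accepts it is not confirmed here.

```python
# Paths taken from the command in the first post.
WINBOARD = r"C:\WinBoard\winboard"
ADAPTER_DIR = r"C:\Polyglot15w"


def gauntlet_command(first_ini, second_ini, games=10, tc="1"):
    """Build one batch-file line that plays `games` games between two
    Polyglot-wrapped UCI engines, using only options from the original post."""
    return (
        f'start /wait {WINBOARD} -cp '
        f'-fcp "polyglot.exe {first_ini}" -fd "{ADAPTER_DIR}" '
        f'-scp "polyglot.exe {second_ini}" -sd "{ADAPTER_DIR}" '
        f'-initString "new\\n" /mg {games} /tc {tc} '
        f'-firstTimeOdds 2 -secondTimeOdds 2'
    )


def gauntlet_lines(candidates, opponents, games=10, tc="1"):
    """One line per (candidate, opponent) pairing, candidates never meet each other."""
    return [
        gauntlet_command(cand, opp, games=games, tc=tc)
        for cand in candidates
        for opp in opponents
    ]


if __name__ == "__main__":
    # Hypothetical .ini names, apart from the two used in the original post.
    candidates = ["Stockfish18DEFAULT.ini", "Stockfish18TACTICAL.ini"]
    opponents = ["Critter.ini", "Naum.ini", "Komodo.ini"]
    for line in gauntlet_lines(candidates, opponents):
        print(line)
```

Redirecting the printed lines into a .bat file gives one match per line, and because of `start /wait` each match finishes before the next one starts.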
LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 2:16 am

Re: Question for H.G. Muller

Post by LucenaTheLucid »

Dr. Hyatt, a couple of side questions:

Would games played at a time control of 10 seconds be meaningful for measuring strength differences between versions if they are played on an ordinary average Joe's computer?

Also, would games at a fixed x seconds per move tell a good story of which engine version is stronger?
Roger Brown
Posts: 782
Joined: Wed Mar 08, 2006 9:22 pm

Re: Question for H.G. Muller

Post by Roger Brown »

Hello Luis Smith,

It may be lèse majesté to speak when you specifically asked for H.G. but if you are testing UCI engines only you may want to have a look at this software by Nathan Thom:

http://kimiensoftware.com/littlethought ... litzer.php

When you are using those ultra-fast time controls with WinBoard there might be issues with engines hanging, the adapter tidying up communications between itself, the GUI and the engine, the GUI waiting for engines to terminate and restart, etc. Those issues will wreck a test run in my opinion.

Winboard does not seem to accept fractional settings directly but there are ways around it....

I hope this helps.

Later.
LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 2:16 am

Re: Question for H.G. Muller

Post by LucenaTheLucid »

Wow Roger, I was not aware such a tool existed! I wonder how accurate the results of a game played in only 1 second are compared to a game played in 10 minutes? Or even 30?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question for H.G. Muller

Post by bob »

LucenaTheLucid wrote:Dr. Hyatt, a couple of side questions:

Would games played at a time control of 10 seconds be meaningful for measuring strength differences between versions if they are played on an ordinary average Joe's computer?

Also, would games at a fixed x seconds per move tell a good story of which engine version is stronger?
A couple of issues.

(1) in general, I've found that fast games are perfectly acceptable, so long as you are careful to not go so fast that a program starts to lose on time. After making several changes, you can always do an occasional verification using longer time controls, just to be safe...

(2) a fixed time per move is OK, although you cut out a part of the engine's character, since you do not allow it to use more or less time for some moves, which most programs will do.

(3) I do not like just playing games between A and A'. My testing has shown that this often gives results that are either inflated, or even wrong. It's better to play against several programs, preferably that are about the same strength as the engine you are testing, or a bit stronger. Don't test against a bunch of patsies, nor should you play against opponents you can only win one out of every hundred games.
LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 2:16 am

Re: Question for H.G. Muller

Post by LucenaTheLucid »

bob wrote:A couple of issues.

(1) in general, I've found that fast games are perfectly acceptable, so long as you are careful to not go so fast that a program starts to lose on time. After making several changes, you can always do an occasional verification using longer time controls, just to be safe...
Yes, this was the issue I was running into while trying to find some good Stockfish settings. I was using 10 seconds per game with a .1 increment, which is exactly why I want to use a fixed time per move.

bob wrote:(2) a fixed time per move is OK, although you cut out a part of the engine's character, since you do not allow it to use more or less time for some moves, which most programs will do.
How much does this affect playing strength?

bob wrote:(3) I do not like just playing games between A and A'. My testing has shown that this often gives results that are either inflated, or even wrong. It's better to play against several programs, preferably that are about the same strength as the engine you are testing, or a bit stronger. Don't test against a bunch of patsies, nor should you play against opponents you can only win one out of every hundred games.
Agreed. I was planning on running the two Stockfish settings against these engines:
  • Critter 0.80
  • Naum 4.2
  • Komodo 1.2
  • Rybka 4
  • Houdini 1.03a
The testing methods I use are roughly the same as yours. Of course I do not have a cluster to test with...maybe "LittleBlitzer" can help me with that.

I want good accurate results and I want them quickly which is why I want to use a fixed time per move. :)
LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 2:16 am

Re: Question for H.G. Muller

Post by LucenaTheLucid »

I rechecked the games I ran with /tc 1 /inc 1 -firstTimeOdds 6 -secondTimeOdds 6, which puts the game at 10sec/0.1sec inc, and there were NO games lost on time.
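
A recheck like this can also be scripted over the saved games. The following is a rough Python sketch, under the assumption that a time loss shows up in the PGN text either as the standard `Termination "time forfeit"` tag value or as a result comment containing "forfeits on time". The exact wording a given GUI writes is an assumption here, so the pattern may need adjusting to the actual game file.

```python
import re

# Assumed markers of a loss on time; adjust to what the GUI really writes.
TIME_LOSS = re.compile(r"time forfeit|forfeits on time", re.IGNORECASE)


def split_games(pgn_text):
    """Split a multi-game PGN string at each line that starts an [Event tag."""
    parts = re.split(r"(?m)^(?=\[Event )", pgn_text)
    return [p for p in parts if p.strip()]


def count_time_forfeits(pgn_text):
    """Return (games_lost_on_time, total_games) for the given PGN text."""
    games = split_games(pgn_text)
    flagged = [g for g in games if TIME_LOSS.search(g)]
    return len(flagged), len(games)


if __name__ == "__main__":
    sample = (
        '[Event "test"]\n[Result "0-1"]\n\n'
        "1. e4 e5 {White forfeits on time} 0-1\n\n"
        '[Event "test"]\n[Result "1/2-1/2"]\n\n'
        "1. d4 d5 1/2-1/2\n"
    )
    lost, total = count_time_forfeits(sample)
    print(f"{lost} of {total} games lost on time")
```

If the count is above zero at a given time control, that control is probably too fast for at least one of the engines on that machine.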

Now these are the results I got with them:

Code: Select all

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Houdini 1.02 w32               : 3148   15  15  1500    67.5 %   3021   34.9 %
  2 IvanHoe T55                    : 3101   14  14  1500    61.3 %   3021   36.4 %
  3 Stockfish 1.8 JA Default       : 3047   11  11  2500    56.1 %   3004   32.2 %
  4 Deep Rybka 4 w32               : 3033   15  15  1500    51.7 %   3021   30.8 %
  5 Tinapa 1.01                    : 3010   11  11  2500    50.8 %   3004   32.0 %
  6 Stockfish 1.8 JA TACTICAL      : 3001   11  11  2500    49.5 %   3004   32.2 %
  7 Naum 4.2                       : 2918   15  15  1500    35.6 %   3021   32.7 %
  8 Komodo 1.2 JA                  : 2813   17  17  1500    23.2 %   3021   25.9 %
Tinapa is a small code change of Stockfish. Stockfish 1.8 TACTICAL is default except for:
  • Check Extension - 0
  • Check extension non pv - 0
  • Single pv - 0
  • Single non pv - 0
For reference defaults are:
  • Check Extension - 2
  • Check extension non pv - 1
  • Single pv - 2
  • Single non pv - 2
I wonder how accurate the results are? Are extensions really worth 46 Elo points +/- 11?
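
As a rough cross-check of that 46-point figure, the ratings in the table can be recomputed from the score percentages. This Python sketch assumes the usual logistic Elo model. Which rating tool produced the table is not stated, so the formula is an assumption, although it does reproduce the listed numbers. It also treats the two +/- 11 margins as independent, which is only an approximation because both versions played in the same pool.

```python
import math


def elo_diff(score):
    """Rating difference implied by a score fraction (0 < score < 1)
    under the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))


def performance(avg_opponent, score):
    """Performance rating: average opponent rating plus the implied difference."""
    return avg_opponent + elo_diff(score)


# Figures from the table: both Stockfish settings faced an average of 3004.
default = performance(3004, 0.561)   # Stockfish 1.8 JA Default
tactical = performance(3004, 0.495)  # Stockfish 1.8 JA TACTICAL
gap = default - tactical

# Margin of the difference, assuming the two +/- 11 margins are independent.
margin = math.hypot(11, 11)

print(f"default  ~ {default:.0f}")
print(f"tactical ~ {tactical:.0f}")
print(f"gap      ~ {gap:.0f} +/- {margin:.1f}")
```

Under these assumptions the gap stays near 46 points, but its margin is wider than 11, closer to 16, so the measured difference is still well outside the error bars at this time control.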
Bill Rogers
Posts: 3562
Joined: Thu Mar 09, 2006 3:54 am
Location: San Jose, California

Re: Question for H.G. Muller

Post by Bill Rogers »

I don't know how fast his computer is, but if he is going to run very fast time controls it might be advisable to make sure that pondering is turned off, as this might severely limit the time factor in a negative way.
Just a thought
Bill
LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 2:16 am

Re: Question for H.G. Muller

Post by LucenaTheLucid »

Pondering is indeed turned off.