Question for H.G. Muller

bob
Posts: 20912
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Question for H.G. Muller

Post by bob » Mon Aug 02, 2010 5:12 am

LucenaTheLucid wrote:

A couple of issues.

(1) in general, I've found that fast games are perfectly acceptable, so long as you are careful to not go so fast that a program starts to lose on time. After making several changes, you can always do an occasional verification using longer time controls, just to be safe...
Yes this was the issue I was running into while trying to find some good Stockfish settings. I was using 10 seconds per game with .1 increment. Which is exactly why I want to use fixed time per move.

(2) a fixed time per move is OK, although you cut out a part of the engine's character, since you do not allow it to use more or less time for some moves, which most programs will do.
How much does this affect playing strength?
That's the problem. It is difficult to know. If someone gets some really clever ideas about when to spend more time, when to spend less, and it works well, you break that with fixed time per move...

(3) I do not like just playing games between A and A'. My testing has shown that this often gives results that are either inflated, or even wrong. It's better to play against several programs, preferably that are about the same strength as the engine you are testing, or a bit stronger. Don't test against a bunch of patsies, nor should you play against opponents you can only win one out of every hundred games.
Agreed. I was planning on running the 2 individual Stockfish settings against these engines:
  • Critter 0.80
  • Naum 4.2
  • Komodo 1.2
  • Rybka 4
  • Houdini 1.03a
The testing methods I use are roughly the same as yours. Of course I do not have a cluster to test with...maybe "LittleBlitzer" can help me with that.

I want good accurate results and I want them quickly which is why I want to use a fixed time per move. :)
Time allocation is a part of a chess engine. I don't like disabling that. You can go as fast as possible while still having a more normal time control. For example, 10 secs on clock, 0.1 sec increment. Games are very fast, but the engine still has to figure out how much time to spend on each move, which is a part of game management that can be important.

Re: Question for H.G. Muller

Post by bob » Mon Aug 02, 2010 5:16 am

LucenaTheLucid wrote: I was rechecking the games I ran with /tc 1 /inc 1 -firstTimeOdds 6 -secondTimeOdds 6 which puts the game at 10sec/0.1sec inc and there were NO lost games on time.

Now these are the results I got with them:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Houdini 1.02 w32               : 3148   15  15  1500    67.5 %   3021   34.9 %
  2 IvanHoe T55                    : 3101   14  14  1500    61.3 %   3021   36.4 %
  3 Stockfish 1.8 JA Default       : 3047   11  11  2500    56.1 %   3004   32.2 %
  4 Deep Rybka 4 w32               : 3033   15  15  1500    51.7 %   3021   30.8 %
  5 Tinapa 1.01                    : 3010   11  11  2500    50.8 %   3004   32.0 %
  6 Stockfish 1.8 JA TACTICAL      : 3001   11  11  2500    49.5 %   3004   32.2 %
  7 Naum 4.2                       : 2918   15  15  1500    35.6 %   3021   32.7 %
  8 Komodo 1.2 JA                  : 2813   17  17  1500    23.2 %   3021   25.9 %
Tinapa is a small code change of Stockfish. Stockfish 1.8 TACTICAL is default except for:
  • Check Extension - 0
  • Check extension non pv - 0
  • Single pv - 0
  • Single non pv - 0
For reference defaults are:
  • Check Extension - 2
  • Check extension non pv - 1
  • Single pv - 2
  • Single non pv - 2
I wonder how accurate the results are. Are extensions really worth 46 Elo points +/- 11?
Error bars are +/-11 roughly so they are pretty accurate. I'd bet getting rid of all but check extension will lose much less. It seems to be the most important extension.
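
As a rough cross-check of the 46-point gap in the table, the two Stockfish scores can be converted to rating differences with the standard logistic Elo formula. This is only a sketch: it assumes both versions faced the same opposition (the table lists an average opponent of 3004 for each), and it assumes the two +/-11 error bars are independent when they are combined, which the rating tool may not guarantee. The function name is illustrative.

```python
import math

def elo_from_score(score):
    """Rating difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))

# Score fractions taken from the table above.
default_score = 0.561   # Stockfish 1.8 JA Default
tactical_score = 0.495  # Stockfish 1.8 JA TACTICAL

gap = elo_from_score(default_score) - elo_from_score(tactical_score)

# Assumption: the two +/-11 bars are independent, so they add in quadrature.
gap_error = math.hypot(11.0, 11.0)

print(f"gap = {gap:.1f} Elo, error on the gap = +/-{gap_error:.1f}")
```

Under those assumptions the gap comes out close to the 46 points in the table, while the uncertainty on the difference is somewhat larger than the +/-11 quoted for each engine on its own.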

hgm
Posts: 24582
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Question for H.G. Muller

Post by hgm » Mon Aug 02, 2010 5:25 am

As Bob says, the TC argument can handle seconds through the min:sec notation. It cannot directly handle fractional seconds, though.

There is an easy way around this: for each engine, you can define a time-odds factor, by which all requested TC is divided. So when you give _both_ engines the same factor, 60, the -tc or -st field will be interpreted as seconds, rather than minutes. (The -inc field is normally already in seconds, so it would be interpreted in units of 1/60 sec.) If you set /firstTimeOdds=60000 and /secondTimeOdds=60000, they will be interpreted as msec.

(Note that there is a /timeOddsMode argument that can be set to have games where both engines have time odds re-normalized to give the slowest nominal time, so that giving them the same time odds would have no effect at all. So you must be sure not to be in that mode.)
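
The time-odds arithmetic described above can be sketched in a few lines. This is not WinBoard code: `effective_time_control` is a hypothetical helper that assumes the time-odds factor simply divides both the base time (given in minutes) and the increment (given in seconds), as the post describes, and that the re-normalizing /timeOddsMode is not in effect.

```python
def effective_time_control(tc_minutes, inc_seconds, time_odds):
    """Return (base, increment) in seconds after dividing by the time-odds factor.

    Assumes plain division of both fields, as described in the post.
    """
    base_seconds = tc_minutes * 60.0 / time_odds
    increment_seconds = inc_seconds / time_odds
    return base_seconds, increment_seconds

# Factor 60: the base field effectively counts seconds, the increment 1/60 sec.
print(effective_time_control(1, 1, 60))

# Factor 60000: the base field effectively counts milliseconds.
print(effective_time_control(1, 1, 60000))

# The settings quoted earlier in the thread: tc 1, inc 1, time odds 6.
print(effective_time_control(1, 1, 6))
```

With a factor of 6 this gives a 10-second base and, under the plain-division assumption, an increment of 1/6 second.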

LucenaTheLucid
Posts: 197
Joined: Mon Jul 13, 2009 12:16 am

Re: Question for H.G. Muller

Post by LucenaTheLucid » Mon Aug 02, 2010 6:31 am

Thanks for the quick responses.
