CLOP for Noisy Black-Box Parameter Optimization

Michel · Post by **Michel** » Mon Sep 26, 2011 12:55 pm

CLOP will not estimate the strength of each individual opponent, but it will maximize the winning rate over replications.

I do not understand exactly what this means.

Does this mean that there is a risk that CLOP will optimize for
winning against the weaker opponents?

If so the pool should probably only contain stronger engines....

Rémi Coulom · Post by **Rémi Coulom** » Mon Sep 26, 2011 3:40 pm

Michel wrote:
CLOP will not estimate the strength of each individual opponent, but it will maximize the winning rate over replications.
I do not understand exactly what this means.

Does this mean that there is a risk that CLOP will optimize for
winning against the weaker opponents?

If so the pool should probably only contain stronger engines....

It does not matter if engines are of different strengths. If you have N opponents, and use N replications, this means that, for a given value of parameters, CLOP will play a game against each opponent. CLOP cannot choose the opponent it plays against. It plays against every opponent equally. So it should work well, regardless of their strengths. Of course, as always, you should make sure that the winning rate is not close to 0% or 100%.

What CLOP will maximize is the expected win rate when playing one game against each opponent. But CLOP won't produce a rating list like bayeselo would. This is what I meant with my sentence. Not producing a rating list should not be a problem.
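A minimal sketch of how a game script might map replications onto a fixed opponent pool. It assumes, as the comment in the example experiment file suggests for colors, that `seed % Replications` cycles through the replications of one parameter vector. The opponent names and the `pick_opponent` helper are hypothetical and not part of CLOP.

```python
# Hypothetical opponent pool; with "Replications 3" in the .clop file,
# each parameter vector would be tried once against every entry.
OPPONENTS = ["engine_a", "engine_b", "engine_c"]


def pick_opponent(seed: int, opponents=OPPONENTS) -> str:
    """Choose the opponent from the seed CLOP passes as argument #2.

    Assumption: the seeds of the replications for one parameter vector
    are consecutive, so seed % len(opponents) visits every opponent once.
    """
    return opponents[seed % len(opponents)]


if __name__ == "__main__":
    for seed in range(6):
        print(seed, pick_opponent(seed))
```

Because the script, not CLOP, decides which opponent a given seed maps to, every opponent gets the same share of games regardless of its strength.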

Note that, in general, you cannot use CLOP to estimate the strength of the optimised program accurately. Win rates produced by CLOP are biased. Win rate over all samples is pessimistic. Local win rate tends to be optimistic. The real win rate of optimal parameters is somewhere in-between.

Rémi

Michel · Post by **Michel** » Tue Sep 27, 2011 9:54 am

OK, thanks for the explanation. I guess I was slightly worried that maximizing win rate is not the same as maximizing Elo. But if you assume the Elo model is correct (it is not, of course), then Elo is always an increasing function of win rate. So this should not happen.
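The monotonicity can be checked numerically. A minimal sketch, assuming the standard logistic Elo model for a single pairing with no separate draw term; the function name `elo_from_win_rate` is illustrative only.

```python
import math


def elo_from_win_rate(p: float) -> float:
    """Elo difference implied by expected score p under the logistic model."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / p - 1.0)


if __name__ == "__main__":
    rates = [0.40, 0.50, 0.5366, 0.60]
    elos = [elo_from_win_rate(p) for p in rates]
    # Strictly increasing: a higher win rate always maps to a higher Elo.
    assert all(a < b for a, b in zip(elos, elos[1:]))
    for p, e in zip(rates, elos):
        print(f"{p:.4f} -> {e:+.1f}")
```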

Something else: I noticed something strange which might perhaps be a bug.

If you restart an experiment then suddenly the parameters change.

Code: Select all

    2677    1189    1027     0.536552            2            7            4            5            0
    2678    1190    1027     0.536761            2            7            4            5            0
    2679    1190    1028     0.536519            2            7            4            5            0
    2680    1191    1028     0.536728            2            7            4            5            0
    2681    1191    1029     0.536487            2            7            4            5            0
    2682    1192    1029     0.536695            2            7            4            5            0
    2683    1193    1029     0.536904            2            7            4            5            0
    2684    1193    1030     0.536662            2            7            4            5            0
    2685    1193    1031     0.536421            2            7            4            5            0
    2686    1193    1031     0.536421            2            7            4            5            0
    2687    1193    1031     0.536421            2            7            2            4            1
    2688    1194    1031     0.536629            2            7            2            4            1

Look at the next to last line. This is where the experiment was restarted after an engine crash.

Rémi Coulom · Post by **Rémi Coulom** » Tue Sep 27, 2011 10:39 am

Michel wrote:Look at the next to last line. This is where the experiment was restarted after an engine crash.

It's normal. Because quadratic regression is costly, CLOP does not update the regression after each sample. With 2000 samples, it updates the regression every 200 samples. Re-starting the experiment refreshed the regression, so this created a sudden change in estimated maximum.

Rémi

Michel · Post by **Michel** » Thu Sep 29, 2011 4:04 pm

Thanks!

This is really professional software!

I think I now understand everything except one thing. What is the meaning of the column "Central" in the "Win Rate" pane?

Michel

Rémi Coulom · Post by **Rémi Coulom** » Thu Sep 29, 2011 5:38 pm

Michel wrote:I think I now understand everything except one thing. What is the meaning of the column "Central" in the "Win Rate" pane?

"Central" is the win rate for all samples with w(x)=1.

Rémi

Dave_N · Post by **Dave_N** » Wed Oct 05, 2011 2:19 am

Intuitively I can see how CLOP could be used with Monte Carlo methods, and there is a mention of multi-armed bandits in the paper ...
Perhaps a Monte Carlo analysis of a set of positions could reveal parameter tweaks that work in special situations, e.g. aggression with disregard for king safety could be more accurate in some opening lines. I don't know if any engines already attempt to change parameters depending on situational factors.

edwardyu · Post by **edwardyu** » Fri Oct 07, 2011 1:58 pm

Hi Remi,

Thank you for your excellent software!

As a trial for my xiangqi engine, I have written the following script, pawnval.py, and experiment file, pawnval.clop. Note that this runs under Windows XP.

Code: Select all

#!/usr/bin/env python
#############################################################################
"""
 DummyScript.py

 This is an example script for use with parallel optimization. In order to
 apply clop algorithms to your own problem, you should write a script that
 behaves like this one.

 Arguments are:
  #1: processor id (symbolic name, typically a machine name to ssh to)
  #2: seed (integer)
  #3: parameter id of first parameter (symbolic name)
  #4: value of first parameter (float)
  #5: parameter id of second parameter (optional)
  #6: value of second parameter (optional)
  ...

 This script should write the game outcome to its output:
  W = win
  L = loss
  D = draw

 For instance:
  $ ./DummyScript.py node-01 4 param 0.2
  W
"""
#############################################################################
import sys
import math
import random
import time
import os

#
# Log script invocations
#
with open('pawnval.log', 'a') as f:
    print(sys.argv, file=f)

#
# Print doc if not enough parameters
#
if len(sys.argv) < 5:
    print(__doc__)
    sys.exit()

#
# Sleep for a random amount of time
#
# random.seed(int(sys.argv[2]))
# time.sleep(random.random() * 2)

#
# Parse parameter values
#
i = 4
params = []
while i < len(sys.argv):
    # params.append(float(sys.argv[i]))
    params.append(int(sys.argv[i]))
    i += 2

#
# Compute winning probability
#
# d2 = 0
# for i in range(len(params)):
#    delta = params[i] - 0.3456789
#    d2 += delta * delta * 10

# draw_elo = 100
# draw_rating = draw_elo * math.log(10.0) / 400.0
# win_p = 1.0 / (1.0 + math.exp(d2 + draw_rating))
# loss_p = 1.0 / (1.0 + math.exp(-d2 + draw_rating))

#
# Draw a random game according to this probability
#
# r = random.random()
# if r < loss_p:
#    print("L")
# elif r > 1.0 - win_p:
#    print("W")
# else:
#    print("D")

# Run a game with param1=params[0]
# Write the five parameter values to the engine's ini file
with open("eychessu.ini", "w") as fo:
    fo.write("[param]\n")
    fo.write("param1=" + str(params[0]) + "\n")
    fo.write("param2=" + str(params[1]) + "\n")
    fo.write("param3=" + str(params[2]) + "\n")
    fo.write("param4=" + str(params[3]) + "\n")
    fo.write("param5=" + str(params[4]) + "\n")

# Remove leftover .CHK files, then run the two-game match
if os.path.isfile("92h-3dc0.CHK"):
    os.remove("92h-3dc0.CHK")
if os.path.isfile("3dc-92h0.CHK"):
    os.remove("3dc-92h0.CHK")
os.system("uccileag.exe < ucci3dc.in > ucci3dc.out")

# Score the match from the engine's side: 2 per win, 1 per draw.
# In round 1 the engine is the first-named side (1-0 is a win);
# in round 2 it is the second-named side (0-1 is a win).
score = 0
roundnum = 0
with open("ucci3dc.out") as out:
    for line in out:
        parts = line.split(' ')
        # print len(parts)
        x = 0
        while x + 1 < len(parts):
            parts[x] = parts[x].strip()
            # print(parts[x])
            if roundnum == 0:
                if parts[x] == '1/2-1/2':
                    score = 1
                    roundnum = 1
                elif parts[x] == '0-1':
                    score = 0
                    roundnum = 1
                elif parts[x] == '1-0':
                    score = 2
                    roundnum = 1

            elif roundnum == 1:
                if parts[x] == '1/2-1/2':
                    score = score + 1
                    roundnum = 2
                elif parts[x] == '0-1':
                    score = score + 2
                    roundnum = 2
                elif parts[x] == '1-0':
                    # score = score + 0
                    roundnum = 2

            # Both rounds seen: report the combined result once the
            # "Final Result:" line is reached
            elif parts[x] == 'Final':
                if score == 2:
                    print("D")
                if score < 2:
                    print("L")
                if score > 2:
                    print("W")
            x = x + 1

# Keep the match report, tagged with the seed (the file is closed by now)
os.rename("ucci3dc.out", "ucci3dc.out" + sys.argv[2])
# print("roundnum=", roundnum)
# print("score=", score)

Code: Select all

#
# pawnval.clop
#
# Example of experiment definition
#

# Name (used for .log and .dat files)
Name pawnval

# Script for running a game. See DummyScript.py for details.
Script c:/python32/python.exe pawnval.py

# Parameter(s) to be optimized
# <parameter_type> <name> <min> <max>
# <parameter_type> may be:
#  LinearParameter
#  IntegerParameter
#  GammaParameter
#  IntegerGammaParameter
# For GammaParameter, quadratic regression is performed on log(x)
# Warning: 123 and not 123.0 should be used for IntegerParameter

# int g_param1 = 24; //VP default 24
# int g_param2 = 95; //VPR default 95
# int g_param3 = 32; //ENDPAWNBONUS default 32
# int g_param4 = 32; //FUTPAWN default 32
# int g_param5 = 28; //ENDFUTPAWN default 28

IntegerParameter  vp  20 40
IntegerParameter  vpr 90 100
IntegerParameter  epb 20 40
IntegerParameter  fp  20 40
IntegerParameter  efp 20 40

# LinearParameter p2  0.0 1.0
# LinearParameter p3  0.0 1.0
# LinearParameter p4  0.0 1.0
# LinearParameter p5  0.0 1.0

# This could be the list of machine names in a distributed experiment.
# In order to run 4 games in parallel, 2 on machine1, 2 on machine2:
Processor machine1
# Processor machine1
# Processor machine2
# Processor machine2

# Call the script "Replications" times in a row with identical parameter values
# Replications may be used to alternate colors, for instance.
# Seed % Replications would indicate color.
# Replications 2

# Parameters of statistical model of outcome
# For binary outcome (Win/Loss, no draws), use "DrawElo 0"
# For chess, use "DrawElo 100"
DrawElo 100

# Regression parameters
# H 3 is recommended (it is the default value)
# Correlations may be "all" (default) or "none"
# Even if variables are not correlated "all" should work well. The problem is
# that the regression might become very costly if the number of variables is
# high. So use "Correlations none" only if you are certain parameters are
# independent or you have so many variables that "all" is too costly.
H 3
Correlations all

The experiment is still running, with only 38 games so far. I can only use 1'+0" games because the UCCI emulator (uccileag.exe) only supports 1-minute games. I don't know yet whether it really converges, or whether the result is better than my manual parameters.

Anyway, it is very good software. At least now I have an automatic engine parameter tuning procedure. Anyone interested in xiangqi engine tuning can contact me at eykm(at)yahoo.com

Thanks,

Edward Yu

Rémi Coulom · Post by **Rémi Coulom** » Fri Oct 07, 2011 3:17 pm

Thanks Edward for your kind words. I am glad that programmers enjoy using CLOP.

I took a quick look at your script, and it seems you are not using CLOP optimally. If I understand correctly, each invocation of your script runs two games, and you write a W/L/D result based on the total outcome of these two games.

It would be better to let CLOP know about the detailed outcomes of these two games. If you wish to play two games for some parameters, one with White, the other with Black, then you should rather use the "Replications" parameter of CLOP. Then, your script would play a game with White or Black, depending on the parity of the seed.
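A minimal sketch of a one-game-per-invocation script along these lines, assuming "Replications 2" in the experiment file. The even-seed-plays-White convention, the helper names, and the `play_one_game` placeholder are assumptions for illustration; the actual match runner has to be supplied.

```python
import sys


def color_for_seed(seed: int) -> str:
    """Even seeds play White, odd seeds play Black (an arbitrary convention)."""
    return "white" if seed % 2 == 0 else "black"


def outcome_for_engine(result: str, engine_color: str) -> str:
    """Translate a PGN-style result into W/L/D from the tuned engine's side."""
    if result == "1/2-1/2":
        return "D"
    if result not in ("1-0", "0-1"):
        raise ValueError(f"unexpected result: {result!r}")
    white_won = result == "1-0"
    return "W" if white_won == (engine_color == "white") else "L"


def play_one_game(engine_color: str, params: dict) -> str:
    """Placeholder: run a single game and return '1-0', '0-1' or '1/2-1/2'."""
    raise NotImplementedError("hook up your own match runner here")


if __name__ == "__main__" and len(sys.argv) >= 5:
    seed = int(sys.argv[2])
    # Arguments 3.. are name/value pairs, as described in DummyScript.py.
    params = dict(zip(sys.argv[3::2], sys.argv[4::2]))
    color = color_for_seed(seed)
    print(outcome_for_engine(play_one_game(color, params), color))
```

With this layout each game is reported to CLOP separately, so a win plus a loss is no longer collapsed into a single draw.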

Rémi

edwardyu · Post by **edwardyu** » Fri Oct 07, 2011 6:01 pm

Rémi Coulom wrote:Thanks Edward for your kind words. I am glad that programmers enjoy using CLOP.

I took a quick look at your script, and it seems you are not using CLOP optimally. If I understand correctly, each invocation of your script runs two games, and you write a W/L/D result based on the total outcome of these two games.

It would be better to let CLOP know about the detailed outcomes of these two games. If you wish to play two games for some parameters, one with White, the other with Black, then you should rather use the "Replications" parameter of CLOP. Then, your script would play a game with White or Black, depending on the parity of the seed.

Rémi

Hi Remi,

Yes, you are right. My script runs two games at a time and the output is like this:

Code: Select all

Round 1:

Eychessu_1.892h       1-0   3DChess092          (92h-3dc0.PGN)

Rank List after Round 1:

No Abbr Engine-Name         ELO  K  Win Draw Loss Score
=======================================================
 1 92h  Eychessu_1.892h     2010 20   1    0    0   1
 2 3dc  3DChess092          1990 20   0    0    1   0
=======================================================

Round 2:

3DChess092          1/2-1/2 Eychessu_1.892h     (3dc-92h0.PGN)

Rank List after Round 2:

No Abbr Engine-Name         ELO  K  Win Draw Loss Score
=======================================================
 1 92h  Eychessu_1.892h     2010 20   1    1    0   1.5
 2 3dc  3DChess092          1990 20   0    1    1   0.5
=======================================================

=== End of Match Process ===

Final Result:

No Abbr Engine-Name         ELO  K  Win Draw Loss Score
=======================================================
 1 92h  Eychessu_1.892h     2010 20   1    1    0   1.5
 2 3dc  3DChess092          1990 20   0    1    1   0.5
=======================================================

This is a limitation of UCCILEAG, which only supports round-robin play with alternating colors. I don't think the "Replications" parameter will help in this case. It would be more useful to have an option that accepts two outputs, e.g. print("W") print("D"), for each invocation of the script.
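Whatever is done with them afterwards, the individual game results can be recovered from a report like the one above. A minimal sketch, assuming engine names contain no whitespace and that every game line has the shape `<first> <result> <second> (<file>)` with the first-named engine winning on `1-0`; the `per_game_outcomes` helper is hypothetical and says nothing about how CLOP itself consumes results.

```python
import re

# A game line looks like: "<first>  1-0  <second>  (<file>.PGN)".
# Assumption: engine names contain no whitespace.
GAME_LINE = re.compile(r"^\s*(\S+)\s+(1-0|0-1|1/2-1/2)\s+(\S+)\s+\(")


def per_game_outcomes(report: str, engine: str) -> list:
    """Return one of 'W', 'L', 'D' per game, from `engine`'s point of view."""
    outcomes = []
    for line in report.splitlines():
        match = GAME_LINE.match(line)
        if not match:
            continue  # rank tables, headers and blank lines are skipped
        first, result, second = match.groups()
        if engine not in (first, second):
            continue
        if result == "1/2-1/2":
            outcomes.append("D")
        else:
            winner = first if result == "1-0" else second
            outcomes.append("W" if winner == engine else "L")
    return outcomes


SAMPLE = """\
Round 1:

Eychessu_1.892h       1-0   3DChess092          (92h-3dc0.PGN)

Round 2:

3DChess092          1/2-1/2 Eychessu_1.892h     (3dc-92h0.PGN)
"""

if __name__ == "__main__":
    print(per_game_outcomes(SAMPLE, "Eychessu_1.892h"))  # ['W', 'D']
```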
