ICGA's 2015 World Computer Chess Championship/Events

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by bob »

Modern Times wrote:
bob wrote: Then one has to ask why doesn't human tournament play go this way, as is done in one type of checkers, draw an opening position from a hat and the game starts there???

That's not exactly "chess" as most of us learned to play it, where opening preparation is just as important as middle game tactics and endgame knowledge.
Human chess and computer chess aren't totally comparable, and what might be good or possible for one may not be good or possible for the other. For example it is completely possible in computer chess to play without any book at all, and have the engine think from move 1. In human chess that is not possible, you can't turn that part of the brain off that has learned opening theory. So perhaps we should stop trying to compare the two.
Humans can play without a book also. Chess960 started that way.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by bob »

That model is pretty nonsensical (first experiment, then theorize). Wonder how that would have worked for Oppenheimer and the Manhattan project? Probably would have wiped out a few cities experimenting without the theory, if they could have even done the experiments without the theory existing first...

Sometimes experimental results lead to new or improved theory, but it happens the other way more frequently.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by bob »

Milos wrote:
bob wrote:Tell you what. From someone that has been involved in this activity since 1968, the most pathetic actions I see here are by you guys. Never seen such egos, such lack of respect for others that are working hard, etc.

In short, as emissaries for computer chess, you guys suck BIG TIME. Hopefully most won't think that all computer chess people act like you guys...
They've improved their engine more in four years than you've managed with yours in two decades (not to mention that the level of improvement you've spent two decades on would take a semi-decent programmer no more than 2-3 years, because improving an engine 500 Elo below the top is an order of magnitude easier than improving the top engine).
You telling them something is like a guy from a local hardware store telling Intel how to run its business.
So much about ego.
Not surprised this is completely over your head...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by bob »

Michel wrote:
I should have been more specific. I was referring to the initial observation(s) that provide the genesis of a new theory:
Yes sorry, I now realize that. I was thinking in terms of established scientific theories (which exist in many fields).

I assume you are specifically referring to computer chess. In computer chess there is currently simply no theory to speak of (although many people here present their dogmas as indisputable truth). To start, nobody understands why minimax search is so effective in chess. The success of the tuning methods in Gaviota (to which you are contributing!) and Texel "suggests" that it is good to have a static evaluation reflecting the statistical properties of a position. But there is theoretically no reason why such an "objective evaluation" would propagate through search (min/max are functions which are notoriously difficult to handle statistically). In other words, if you think of the static evaluation as somehow statistically summarizing what a deeper search would reveal, then you run into contradictions.

So yes. Personally I appreciate very much the experiments you (and Kai Laskos) are doing!

BTW. What people refer to as "theory" in computer chess is actually tree search algorithms which is really mathematics (or perhaps theoretical computer science). But as I said above, nobody understands why tree search + static evaluation produces a good chess program.
Actually quite a few people DO understand why this works. In fact, Claude Shannon understood it back in the 1950's...
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by Laskos »

petero2 wrote:
Laskos wrote:
petero2 wrote: I can try with your draw model too. Is it correct that you use these formulas from the bayeselo documentation:

Code: Select all

f(Delta) = 1 / (1 + 10^(Delta/400))
P(WhiteWins) = f(eloBlack - eloWhite - eloAdvantage + eloDraw)
P(BlackWins) = f(eloWhite - eloBlack + eloAdvantage + eloDraw)
P(Draw) = 1 - P(WhiteWins) - P(BlackWins)
with eloAdvantage = 0 and eloDraw = 200?
Yes, I don't keep colors, and drawelo=200.
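The quoted bayeselo draw model can be written out directly; a minimal Python sketch (the function and parameter names are mine, not from bayeselo's code):

```python
def f(delta):
    # f(Delta) = 1 / (1 + 10^(Delta/400)), as in the bayeselo documentation
    return 1.0 / (1.0 + 10.0 ** (delta / 400.0))

def game_probs(elo_white, elo_black, elo_advantage=0.0, elo_draw=200.0):
    # Win/draw/loss probabilities for one game under the bayeselo draw model
    p_white = f(elo_black - elo_white - elo_advantage + elo_draw)
    p_black = f(elo_white - elo_black + elo_advantage + elo_draw)
    return p_white, 1.0 - p_white - p_black, p_black
```

With eloAdvantage = 0 and eloDraw = 200, equal opponents draw roughly 52% of the time.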
In that case I get:

Code: Select all

eng   elo  win prob
  1     0  0.734052
  2  -150  0.0947824
  3  -200  0.0381066
  4  -200  0.038102
  5  -250  0.0134178
  6  -250  0.0133863
  7  -300  0.00408722
  8  -300  0.00408363
  9  -400  0.00024338
 10  -400  0.0002405
 11  -400  0.00024259
ties:      0.0592556
Note though that this draw model also distorts the rating scale.

Code: Select all

elo  s1      s2
  0  0.5000  0.5000
 20  0.5288  0.5210
 40  0.5573  0.5420
 60  0.5855  0.5629
 80  0.6131  0.5838
100  0.6401  0.6045
120  0.6661  0.6250
140  0.6912  0.6454
160  0.7153  0.6654
180  0.7381  0.6852
200  0.7597  0.7045
220  0.7801  0.7235
240  0.7992  0.7419
260  0.8171  0.7597
280  0.8337  0.7769
300  0.8490  0.7934
320  0.8632  0.8092
340  0.8762  0.8242
360  0.8882  0.8385
380  0.8991  0.8519
400  0.9091  0.8645
s1 is the win rate according to the standard elo model and s2 is the win rate according to the above draw model.
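The s2 column follows from the same model: the expected score is P(win) plus half of P(draw). A sketch reproducing both columns (helper names are my own):

```python
def f(delta):
    return 1.0 / (1.0 + 10.0 ** (delta / 400.0))

def s1(diff):
    # Standard Elo expected score for a player rated diff points higher
    return f(-diff)

def s2(diff, elo_draw=200.0):
    # Expected score under the eloDraw=200 draw model: wins plus half the draws
    p_win = f(-diff + elo_draw)
    p_loss = f(diff + elo_draw)
    return p_win + 0.5 * (1.0 - p_win - p_loss)
```

At a 100 Elo difference this gives s1 ≈ 0.6401 and s2 ≈ 0.6045, matching the table.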
Rating distortion is a known issue with the assumption that 1 win + 1 loss = 1 draw, but for simplicity I kept this draw model. Could you do simulations with the following Elos I used to get around 60% for SF:

elo1 SF 3200
elo2 3100
elo3 3050
elo4 3000
elo5 2950
elo6 2900
elo7 2700
elo8 2700
elo9 2200
elo10 2200
elo11 2200

Here I included 3 weaker engines, which basically lose everything against stronger ones. They are usual occurrences in WCCC. Drawelo=200, white_advantage=0.
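A tournament simulation of the kind being discussed can be sketched as a Monte Carlo loop over single round robins, drawing each game result from the quoted draw model (my own illustration, not petero2's simulator; colors ignored, drawelo = 200):

```python
import random

def f(delta):
    return 1.0 / (1.0 + 10.0 ** (delta / 400.0))

def game_probs(elo_a, elo_b, elo_draw=200.0):
    # Colors ignored (eloAdvantage = 0), as in the quoted setup
    p_a = f(elo_b - elo_a + elo_draw)
    p_b = f(elo_a - elo_b + elo_draw)
    return p_a, p_b, 1.0 - p_a - p_b

def simulate(elos, cycles, rng):
    # Count outright or shared first places over many simulated round robins
    n = len(elos)
    firsts = [0] * n
    for _ in range(cycles):
        score = [0.0] * n
        for i in range(n):
            for j in range(i + 1, n):
                p_a, p_b, _ = game_probs(elos[i], elos[j])
                r = rng.random()
                if r < p_a:
                    score[i] += 1.0
                elif r < p_a + p_b:
                    score[j] += 1.0
                else:
                    score[i] += 0.5
                    score[j] += 0.5
        best = max(score)
        for i, s in enumerate(score):
            if s == best:
                firsts[i] += 1
    return firsts
```

With the Elo list above and enough cycles, the 3200 engine finishes first (outright or shared) roughly 80% of the time, in the same ballpark as the outright-plus-shared figures reported later in the thread.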
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by michiguel »

petero2 wrote:
Milos wrote:
Laskos wrote:
Milos wrote:I wrote a sim of my own, with the following assumptions:
11 participants with ELO ratings:
R, R-150, R-200, R-200, R-250, R-250, R-300, R-300, R-400, R-400, R-400
And after the sim I got probability of favorite winning of roughly 90%.
Probability of the second one winning roughly 6%, others below 1.5%.
I don't use drawelo 200 instead I use my own draw percentages for each Elo difference (these are much more realistic values for LTC matches):
0Elo - 64%, 50Elo - 60%, 100Elo - 52%, 150Elo - 45%, 200Elo - 38%, 250Elo - 34.3%, 300Elo - 28.2% and 400Elo - 17.6%
Using these parameters and ignoring the white advantage since I don't know how colors are assigned in this tournament, I got after 1e8 simulated tournaments:

Code: Select all

eng   elo  win prob
  1     0  0.801288
  2  -150  0.0576597
  3  -200  0.0154924
  4  -200  0.0155108
  5  -250  0.00279631
  6  -250  0.00279635
  7  -300  0.00041849
  8  -300  0.00042163
  9  -400  3.9e-06
 10  -400  4.1e-06
 11  -400  3.61e-06
ties:      0.103605
Note that I have not implemented the tie-break rules, so I don't know how those 10.4% ties would be distributed among the participants.
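The per-difference draw rates quoted above could be plugged into such a simulation via a table lookup; a sketch (the table is the quoted LTC data, the linear interpolation is my own choice):

```python
# (Elo difference, draw rate) pairs from the quoted long-time-control data
DRAW_TABLE = ((0, 0.64), (50, 0.60), (100, 0.52), (150, 0.45),
              (200, 0.38), (250, 0.343), (300, 0.282), (400, 0.176))

def draw_rate(diff):
    # Linear interpolation between tabulated points; clamp beyond 400 Elo
    d = abs(diff)
    if d >= DRAW_TABLE[-1][0]:
        return DRAW_TABLE[-1][1]
    for (x0, y0), (x1, y1) in zip(DRAW_TABLE, DRAW_TABLE[1:]):
        if x0 <= d <= x1:
            return y0 + (y1 - y0) * (d - x0) / (x1 - x0)
```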
I got this with the Ordo model

Code: Select all

Total engines = 11
Total games = 55
Total rounds = 11
Total boards = 5
Total cycles = 1000000
draw rate (equal strength) = 64.0%
White advantage = 50.0
rating[0]=0
rating[1]=-150
rating[2]=-200
rating[3]=-200
rating[4]=-250
rating[5]=-250
rating[6]=-300
rating[7]=-300
rating[8]=-400
rating[9]=-400
rating[10]=-400

won    = 827663
shared = 89338
loss   = 82999
total  = 1000000
won outright % = 82.8
won shared   % = 8.9
It is including white advantage and assuming 64% draw rate for equal strength.

Miguel
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by michiguel »

Laskos wrote:
petero2 wrote: [draw-model formulas, win probabilities, and the s1/s2 rating-scale table, quoted in full earlier in the thread]
Rating distortion is a known issue with the assumption that 1 win + 1 loss = 1 draw, but for simplicity I kept this draw model. Could you do simulations with the following Elos I used to get around 60% for SF:

elo1 SF 3200
elo2 3100
elo3 3050
elo4 3000
elo5 2950
elo6 2900
elo7 2700
elo8 2700
elo9 2200
elo10 2200
elo11 2200

Here I included 3 weaker engines, which basically lose everything against stronger ones. They are usual occurrences in WCCC. Drawelo=200, white_advantage=0.

Code: Select all

Total engines = 11
Total games = 55
Total rounds = 11
Total boards = 5
Total cycles = 1000000
draw rate (equal strength) = 64.0%
White advantage = 50.0
rating[0]=3200
rating[1]=3100
rating[2]=3050
rating[3]=3000
rating[4]=2950
rating[5]=2900
rating[6]=2700
rating[7]=2700
rating[8]=2200
rating[9]=2200
rating[10]=2200

won    = 631963
shared = 175635
loss   = 192402
total  = 1000000
won outright % = 63.2
won shared   % = 17.6
Miguel
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by michiguel »

michiguel wrote:
Laskos wrote: [simulation request with the Elo list, quoted in full earlier in the thread]

Code: Select all

Total engines = 11
Total games = 55
Total rounds = 11
Total boards = 5
Total cycles = 1000000
draw rate (equal strength) = 64.0%
White advantage = 50.0
rating[0]=3200
rating[1]=3100
rating[2]=3050
rating[3]=3000
rating[4]=2950
rating[5]=2900
rating[6]=2700
rating[7]=2700
rating[8]=2200
rating[9]=2200
rating[10]=2200

won    = 631963
shared = 175635
loss   = 192402
total  = 1000000
won outright % = 63.2
won shared   % = 17.6
Miguel
Just to test consistency and debug the simulation (in this case 200,000 tournament runs = 2M games for each engine), I sent all simulated games to a pgn file and recalculated the ratings. I get the proper numbers, including the white advantage and draw rate.

Code: Select all

ordo -p simulated.pgn -WD -a 3200 -A "Engine 1" -N1

   # PLAYER       : RATING    POINTS  PLAYED    (%)
   1 Engine 1     : 3200.0 1731891.0 2000000   86.6%
   2 Engine 2     : 3100.2 1551791.0 2000000   77.6%
   3 Engine 3     : 3050.4 1483871.5 2000000   74.2%
   4 Engine 4     : 3000.3 1370172.0 2000000   68.5%
   5 Engine 5     : 2950.5 1294716.0 2000000   64.7%
   6 Engine 6     : 2900.4 1181569.0 2000000   59.1%
   7 Engine 8     : 2700.4  834811.5 2000000   41.7%
   8 Engine 7     : 2700.3  847499.5 2000000   42.4%
   9 Engine 10    : 2200.2  235562.5 2000000   11.8%
  10 Engine 9     : 2199.9  234158.0 2000000   11.7%
  11 Engine 11    : 2199.7  233958.0 2000000   11.7%

White advantage = 50.09
Draw rate (equal opponents) = 64.06 %
Miguel
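The round-trip check above — dump the simulated games to PGN and let ordo recover the inputs — only needs minimal game records. A sketch of the PGN-writing side (hypothetical helpers, not Miguel's actual code):

```python
def pgn_game(white, black, result):
    # Minimal PGN record; a rating tool only needs the headers and the result
    return ('[White "%s"]\n[Black "%s"]\n[Result "%s"]\n\n%s\n\n'
            % (white, black, result, result))

def games_to_pgn(games):
    # games: iterable of (white, black, result), result in {"1-0", "0-1", "1/2-1/2"}
    return "".join(pgn_game(w, b, r) for w, b, r in games)
```

Recomputed ratings, white advantage, and draw rate should then match the simulation's inputs, as in the ordo run shown above.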
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by michiguel »

michiguel wrote:
Laskos wrote: [simulation request with the Elo list, quoted in full earlier in the thread]

Code: Select all

Total engines = 11
Total games = 55
Total rounds = 11
Total boards = 5
Total cycles = 1000000
draw rate (equal strength) = 64.0%
White advantage = 50.0
rating[0]=3200
rating[1]=3100
rating[2]=3050
rating[3]=3000
rating[4]=2950
rating[5]=2900
rating[6]=2700
rating[7]=2700
rating[8]=2200
rating[9]=2200
rating[10]=2200

won    = 631963
shared = 175635
loss   = 192402
total  = 1000000
won outright % = 63.2
won shared   % = 17.6
Miguel
But if the RR is run with the reversed colors

Code: Select all

won    = 572912
shared = 188750
loss   = 238338
total  = 1000000
won outright % = 57.3
won shared   % = 18.9
There is some difference depending on who plays white against whom, which is not surprising.

Miguel
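The color effect is easy to quantify: with an eloAdvantage term of 50, the same pairing has a different expected score depending on who holds white, so in a single round robin the color allocation matters. A sketch (my own names, bayeselo-style model):

```python
def f(delta):
    return 1.0 / (1.0 + 10.0 ** (delta / 400.0))

def expected_score(elo_me, elo_opp, i_am_white,
                   elo_advantage=50.0, elo_draw=200.0):
    # Expected score (win plus half of draws) including the white-advantage term
    adv = elo_advantage if i_am_white else -elo_advantage
    p_win = f(elo_opp - elo_me - adv + elo_draw)
    p_loss = f(elo_me - elo_opp + adv + elo_draw)
    return p_win + 0.5 * (1.0 - p_win - p_loss)
```

Between equal engines this works out to roughly 0.55 with white and 0.45 with black.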
Modern Times
Posts: 3712
Joined: Thu Jun 07, 2012 11:02 pm

Re: ICGA's 2015 World Computer Chess Championship/Events

Post by Modern Times »

bob wrote:
Modern Times wrote:
bob wrote: Then one has to ask why doesn't human tournament play go this way, as is done in one type of checkers, draw an opening position from a hat and the game starts there???

That's not exactly "chess" as most of us learned to play it, where opening preparation is just as important as middle game tactics and endgame knowledge.
Human chess and computer chess aren't totally comparable, and what might be good or possible for one may not be good or possible for the other. For example it is completely possible in computer chess to play without any book at all, and have the engine think from move 1. In human chess that is not possible, you can't turn that part of the brain off that has learned opening theory. So perhaps we should stop trying to compare the two.
Humans can play without a book also. Chess960 started that way.
Yes that's right regarding chess960 - which is why I am a huge fan of chess960.

But humans can't play conventional chess without a book - they can't just switch off the part of their brain that contains their opening knowledge and experience.