(Sorry for the long post.) Today I downloaded the PGN files of CCRL 40/4 FRC, CCRL 40/4 and CCRL 40/40 to collect some stats about promotions and underpromotions. Why did I not also download games from CEGT? Simply because it was too much work for me.
I wrote a small Fortran parser with help from the Internet (the key URL is the comment on the first line of the following code):
Code:
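! http://www.eng-tips.com/viewthread.cfm?qid=127558
program Search_promotions_and_underpromotions_by_crowning_squares
implicit none
character(len=256) :: line                   ! one line of PGN text
character(len=4), parameter :: str = 'h8=N'  ! pattern to search for
integer :: n      ! number of lines that contain str
integer :: pos    ! position of str in line (0 if absent)
integer :: ios    ! status returned by open/read
n = 0
open(unit=11, file='CCRL-4040.[462398].pgn', status='old', action='read', iostat=ios)
if (ios /= 0) stop 'Cannot open the PGN file'
do
  read(11, '(A)', iostat=ios) line
  if (ios /= 0) exit          ! end of file (or read error)
  pos = index(line, str)      ! finds the first occurrence only
  if (pos /= 0) n = n + 1
end do
close(11)
write(*,*)
write(*,*) trim(str), n
end program Search_promotions_and_underpromotions_by_crowning_squares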
Code:
h8=N 103
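The program above counts one pattern per run. Here is a minimal sketch of how all 64 square/piece patterns could be counted in a single pass. It is only an illustration, not necessarily the program that produced the tables below. The names count_all_promotions and count_occurrences are made up for the example, and a scratch file with two invented lines of movetext stands in for the real PGN so that it runs as is. Unlike the program above, it counts every occurrence on a line, because index only reports the first one.

```fortran
program count_all_promotions
  implicit none
  character(len=*), parameter :: files  = 'abcdefgh'
  character(len=*), parameter :: ranks  = '18'
  character(len=*), parameter :: pieces = 'QRBN'
  character(len=256) :: line
  character(len=4)   :: pat
  integer :: counts(8, 2, 4)   ! indexed by file, rank (1 or 8) and piece
  integer :: f, r, p, ios

  counts = 0

  ! Assumption for the demo: a scratch file with two made-up lines of
  ! movetext stands in for the real PGN file.
  open(unit=11, status='scratch', action='readwrite')
  write(11, '(A)') '55. a8=Q Rxa8 56. bxa8=Q+ Kg7 57. h8=N Kf6'
  write(11, '(A)') '61. Kd2 gxh1=Q 62. Nf7 Qh5'
  rewind(11)

  do
    read(11, '(A)', iostat=ios) line
    if (ios /= 0) exit                      ! end of file
    do r = 1, 2
      do f = 1, 8
        do p = 1, 4
          pat = files(f:f) // ranks(r:r) // '=' // pieces(p:p)
          counts(f, r, p) = counts(f, r, p) + count_occurrences(line, pat)
        end do
      end do
    end do
  end do
  close(11)

  ! Print the non-zero counts in the order used by the tables (a1=Q, a1=R, ...).
  do r = 1, 2
    do f = 1, 8
      do p = 1, 4
        if (counts(f, r, p) > 0) then
          pat = files(f:f) // ranks(r:r) // '=' // pieces(p:p)
          write(*, '(A, ": ", I0)') pat, counts(f, r, p)
        end if
      end do
    end do
  end do

contains

  ! Number of non-overlapping occurrences of sub in text.
  function count_occurrences(text, sub) result(k)
    character(len=*), intent(in) :: text, sub
    integer :: k, start, pos
    k = 0
    start = 1
    do while (start <= len(text))
      pos = index(text(start:), sub)
      if (pos == 0) exit
      k = k + 1
      start = start + pos - 1 + len(sub)
    end do
  end function count_occurrences

end program count_all_promotions
```

With the two sample lines it reports h1=Q once, a8=Q twice and h8=N once. A line-based count like the one in the program above would report a8=Q only once, because both a8=Q promotions sit on the same line. Such repeats should be rare in real games, so the difference in the totals is probably tiny.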
Code:
CCRL-404FRC.[129100].pgn
a1=Q: 5012
a1=R: 156
a1=B: 28
a1=N: 60
b1=Q: 4010
b1=R: 109
b1=B: 36
b1=N: 58
c1=Q: 3484
c1=R: 116
c1=B: 32
c1=N: 64
d1=Q: 3207
d1=R: 127
d1=B: 30
d1=N: 87
e1=Q: 3167
e1=R: 148
e1=B: 29
e1=N: 113
f1=Q: 3369
f1=R: 141
f1=B: 25
f1=N: 90
g1=Q: 3948
g1=R: 108
g1=B: 29
g1=N: 63
h1=Q: 5004
h1=R: 136
h1=B: 32
h1=N: 51
a8=Q: 5520
a8=R: 140
a8=B: 37
a8=N: 49
b8=Q: 4315
b8=R: 145
b8=B: 32
b8=N: 71
c8=Q: 3874
c8=R: 145
c8=B: 42
c8=N: 89
d8=Q: 3462
d8=R: 141
d8=B: 39
d8=N: 100
e8=Q: 3351
e8=R: 142
e8=B: 39
e8=N: 88
f8=Q: 3644
f8=R: 146
f8=B: 29
f8=N: 79
g8=Q: 4248
g8=R: 121
g8=B: 26
g8=N: 58
h8=Q: 5441
h8=R: 169
h8=B: 34
h8=N: 46
====================
a1=*: 5256 (15.89%)
b1=*: 4213 (12.74%)
c1=*: 3696 (11.18%)
d1=*: 3451 (10.44%)
e1=*: 3457 (10.45%)
f1=*: 3625 (10.96%)
g1=*: 4148 (12.54%)
h1=*: 5223 (15.79%)
SUM: 33069
--------------------
a8=*: 5746 (16.02%)
b8=*: 4563 (12.72%)
c8=*: 4150 (11.57%)
d8=*: 3742 (10.43%)
e8=*: 3620 (10.09%)
f8=*: 3898 (10.87%)
g8=*: 4453 (12.42%)
h8=*: 5690 (15.87%)
SUM: 35862
--------------------
a*=*: 11002 (15.96%)
b*=*: 8776 (12.73%)
c*=*: 7846 (11.38%)
d*=*: 7193 (10.44%)
e*=*: 7077 (10.27%)
f*=*: 7523 (10.91%)
g*=*: 8601 (12.48%)
h*=*: 10913 (15.83%)
SUM: 68931
====================
*1=*: 33069 (47.97%)
*8=*: 35862 (52.03%)
SUM: 68931
====================
*1=Q: 31201 (94.35%)
*1=R: 1041 (3.15%)
*1=B: 241 (0.73%)
*1=N: 586 (1.77%)
SUM: 33069
--------------------
*8=Q: 33855 (94.40%)
*8=R: 1149 (3.20%)
*8=B: 278 (0.78%)
*8=N: 580 (1.62%)
SUM: 35862
--------------------
*=Q: 65056 (94.38%)
*=R: 2190 (3.18%)
*=B: 519 (0.75%)
*=N: 1166 (1.69%)
SUM: 68931
====================
(68931 promotions and underpromotions in 129100 games) ~ 0.5339 (promotions and underpromotions)/game.
Code:
CCRL-404.[1045382].pgn
a1=Q: 37748
a1=R: 1385
a1=B: 259
a1=N: 322
b1=Q: 29373
b1=R: 1135
b1=B: 271
b1=N: 408
c1=Q: 23438
c1=R: 1171
c1=B: 287
c1=N: 505
d1=Q: 20346
d1=R: 1057
d1=B: 296
d1=N: 615
e1=Q: 22416
e1=R: 941
e1=B: 293
e1=N: 785
f1=Q: 26347
f1=R: 919
f1=B: 281
f1=N: 666
g1=Q: 26224
g1=R: 695
g1=B: 199
g1=N: 467
h1=Q: 33058
h1=R: 972
h1=B: 252
h1=N: 395
a8=Q: 44178
a8=R: 1534
a8=B: 291
a8=N: 382
b8=Q: 34839
b8=R: 1264
b8=B: 280
b8=N: 499
c8=Q: 27044
c8=R: 1242
c8=B: 281
c8=N: 604
d8=Q: 22980
d8=R: 1286
d8=B: 311
d8=N: 708
e8=Q: 22030
e8=R: 1015
e8=B: 308
e8=N: 897
f8=Q: 27523
f8=R: 971
f8=B: 261
f8=N: 810
g8=Q: 29502
g8=R: 779
g8=B: 187
g8=N: 547
h8=Q: 36919
h8=R: 1005
h8=B: 289
h8=N: 479
=====================
a1=*: 39714 (17.01%)
b1=*: 31187 (13.35%)
c1=*: 25401 (10.88%)
d1=*: 22314 (9.56%)
e1=*: 24435 (10.46%)
f1=*: 28213 (12.08%)
g1=*: 27585 (11.81%)
h1=*: 34677 (14.85%)
SUM: 233526
---------------------
a8=*: 46385 (17.76%)
b8=*: 36882 (14.12%)
c8=*: 29171 (11.17%)
d8=*: 25285 (9.68%)
e8=*: 24250 (9.28%)
f8=*: 29565 (11.32%)
g8=*: 31015 (11.87%)
h8=*: 38692 (14.81%)
SUM: 261245
---------------------
a*=*: 86099 (17.40%)
b*=*: 68069 (13.76%)
c*=*: 54572 (11.03%)
d*=*: 47599 (9.62%)
e*=*: 48685 (9.84%)
f*=*: 57778 (11.68%)
g*=*: 58600 (11.84%)
h*=*: 73369 (14.83%)
SUM: 494771
=====================
*1=*: 233526 (47.20%)
*8=*: 261245 (52.80%)
SUM: 494771
=====================
*1=Q: 218950 (93.76%)
*1=R: 8275 (3.54%)
*1=B: 2138 (0.92%)
*1=N: 4163 (1.78%)
SUM: 233526
---------------------
*8=Q: 245015 (93.79%)
*8=R: 9096 (3.48%)
*8=B: 2208 (0.85%)
*8=N: 4926 (1.89%)
SUM: 261245
---------------------
*=Q: 463965 (93.77%)
*=R: 17371 (3.51%)
*=B: 4346 (0.88%)
*=N: 9089 (1.84%)
SUM: 494771
=====================
(494771 promotions and underpromotions in 1045382 games) ~ 0.4733 (promotions and underpromotions)/game.
Code:
CCRL-4040.[462398].pgn
a1=Q: 6564
a1=R: 325
a1=B: 81
a1=N: 69
b1=Q: 5462
b1=R: 318
b1=B: 69
b1=N: 129
c1=Q: 4491
c1=R: 295
c1=B: 63
c1=N: 124
d1=Q: 3941
d1=R: 255
d1=B: 94
d1=N: 189
e1=Q: 3956
e1=R: 223
e1=B: 58
e1=N: 186
f1=Q: 4211
f1=R: 201
f1=B: 57
f1=N: 191
g1=Q: 4180
g1=R: 163
g1=B: 47
g1=N: 112
h1=Q: 5010
h1=R: 199
h1=B: 68
h1=N: 106
a8=Q: 8048
a8=R: 383
a8=B: 75
a8=N: 99
b8=Q: 6413
b8=R: 305
b8=B: 66
b8=N: 139
c8=Q: 5296
c8=R: 338
c8=B: 76
c8=N: 146
d8=Q: 4743
d8=R: 370
d8=B: 79
d8=N: 221
e8=Q: 3974
e8=R: 242
e8=B: 70
e8=N: 249
f8=Q: 4551
f8=R: 252
f8=B: 83
f8=N: 188
g8=Q: 4574
g8=R: 183
g8=B: 49
g8=N: 147
h8=Q: 5503
h8=R: 238
h8=B: 62
h8=N: 103
====================
a1=*: 7039 (16.99%)
b1=*: 5978 (14.43%)
c1=*: 4973 (12.00%)
d1=*: 4479 (10.81%)
e1=*: 4423 (10.67%)
f1=*: 4660 (11.25%)
g1=*: 4502 (10.86%)
h1=*: 5383 (12.99%)
SUM: 41437
--------------------
a8=*: 8605 (18.21%)
b8=*: 6923 (14.65%)
c8=*: 5856 (12.39%)
d8=*: 5413 (11.45%)
e8=*: 4535 (9.59%)
f8=*: 5074 (10.74%)
g8=*: 4953 (10.48%)
h8=*: 5906 (12.50%)
SUM: 47265
--------------------
a*=*: 15644 (17.64%)
b*=*: 12901 (14.54%)
c*=*: 10829 (12.21%)
d*=*: 9892 (11.15%)
e*=*: 8958 (10.10%)
f*=*: 9734 (10.97%)
g*=*: 9455 (10.66%)
h*=*: 11289 (12.73%)
SUM: 88702
====================
*1=*: 41437 (46.71%)
*8=*: 47265 (53.29%)
SUM: 88702
====================
*1=Q: 37815 (91.26%)
*1=R: 1979 (4.78%)
*1=B: 537 (1.30%)
*1=N: 1106 (2.67%)
SUM: 41437
--------------------
*8=Q: 43102 (91.19%)
*8=R: 2311 (4.89%)
*8=B: 560 (1.18%)
*8=N: 1292 (2.73%)
SUM: 47265
--------------------
*=Q: 80917 (91.22%)
*=R: 4290 (4.84%)
*=B: 1097 (1.24%)
*=N: 2398 (2.70%)
SUM: 88702
====================
(88702 promotions and underpromotions in 462398 games) ~ 0.1918 (promotions and underpromotions)/game.
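The per-game figure quoted with each table is simply the total count divided by the number of games given in the file name. As a quick check, here is a minimal sketch that recomputes the three rates and compares each with the 40/40 one. All numbers are copied from the tables above; the program and variable names are only for illustration.

```fortran
program promotions_per_game
  implicit none
  character(len=13), parameter :: names(3) = &
    [character(len=13) :: 'CCRL 40/4 FRC', 'CCRL 40/4', 'CCRL 40/40']
  ! Totals (the *=* SUM rows) and numbers of games (from the file names).
  integer, parameter :: promotions(3) = [68931, 494771, 88702]
  integer, parameter :: games(3)      = [129100, 1045382, 462398]
  real    :: rate(3)
  integer :: i

  rate = real(promotions) / real(games)

  do i = 1, 3
    write(*, '(A, ": ", F6.4, " per game, ", F4.2, " times the 40/40 rate")') &
      trim(names(i)), rate(i), rate(i) / rate(3)
  end do
end program promotions_per_game
```

It gives about 0.53, 0.47 and 0.19 promotions per game, so the two 40/4 lists show roughly 2.8 and 2.5 times as many promotions per game as 40/40.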
I only searched for the crowning square and the promoted piece, without caring whether the move gave check or checkmate (for example, *=Q includes moves written as *=Q, *=Q+ and *=Q#). The same goes for crowning squares: b8=Q includes axb8=Q, b7-b8=Q and cxb8=Q (again with or without check or checkmate). Still, I think these stats raise some interesting questions/conclusions:
a) Games at long time controls average far fewer promotions and underpromotions: about 0.19 per game at 40/40 versus about 0.47 at 40/4 (more accurate play at long TC?).
b) The frequency order of promoted pieces is Q > R > N > B, while I expected Q > N > R > B.
c) Corner squares (a1, a8, h1 and h8) have significantly more promotions and underpromotions than central squares (d1, d8, e1 and e8). In general, if we plot files on the x-axis and (promotions + underpromotions) on the y-axis, we obtain a V-shaped graph. The symmetry between mirrored files (a-h, b-g, c-f and d-e) is greater in FRC, while standard chess is more unbalanced (could it be because there is less theory in FRC?).
d) The white side promotes/underpromotes more than the black side: 52% to 53% of all promotions happen on the eighth rank (is there a relationship with the first-move advantage?).
e) Promotions and underpromotions are more common on the queenside than on the kingside (could it be because castling kingside is more frequent than castling queenside?).
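Points c) and e) can be put into numbers from the per-file totals. Below is a minimal sketch that prints the ratio between mirrored files (a/h, b/g, c/f, d/e) and the share of promotions on the queenside files a-d. The data come from the a*=* to h*=* rows of the three tables; the program, subroutine and variable names are only for illustration.

```fortran
program file_symmetry
  implicit none
  ! Promotions per file (a..h), copied from the a*=* ... h*=* rows.
  integer, parameter :: frc(8)    = [11002, 8776, 7846, 7193, 7077, 7523, 8601, 10913]
  integer, parameter :: tc404(8)  = [86099, 68069, 54572, 47599, 48685, 57778, 58600, 73369]
  integer, parameter :: tc4040(8) = [15644, 12901, 10829, 9892, 8958, 9734, 9455, 11289]

  call report('CCRL 40/4 FRC', frc)
  call report('CCRL 40/4', tc404)
  call report('CCRL 40/40', tc4040)

contains

  subroutine report(label, per_file)
    character(len=*), intent(in) :: label
    integer, intent(in) :: per_file(8)
    real    :: mirror(4), queenside_share
    integer :: i

    ! Ratios a/h, b/g, c/f, d/e: 1.0 would mean perfect left-right symmetry.
    do i = 1, 4
      mirror(i) = real(per_file(i)) / real(per_file(9 - i))
    end do

    ! Percentage of all promotions that happen on files a to d.
    queenside_share = 100.0 * real(sum(per_file(1:4))) / real(sum(per_file))

    write(*, '(A)') label
    write(*, '("  a/h b/g c/f d/e: ", 4F7.3)') mirror
    write(*, '("  queenside (a-d) share: ", F5.2, "%")') queenside_share
  end subroutine report

end program file_symmetry
```

With these numbers all four FRC ratios stay within about 5% of 1, while at 40/40 the a/h ratio is close to 1.39. The queenside share goes from roughly 50.5% (FRC) to 51.8% (40/4) and 55.5% (40/40).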
There are surely more questions/conclusions.
I do not know if these statistics can be helpful for programmers, for example to tune bonuses and/or penalties in chess engines. Anyway, I wanted to share them. Enjoy!
Regards from Spain.
Ajedrecista.