Some statistics about promotions and underpromotions.

Ajedrecista
Posts: 2102
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Post by Ajedrecista »

Hello to everybody:

(Sorry for the long post.) Today I downloaded the PGN files of CCRL 40/4 FRC, CCRL 40/4 and CCRL 40/40 to collect some stats about promotions and underpromotions. Why did I not also download games from CEGT? Simply because it was too much work for me.

I made a Fortran parser with the help of the Internet (the key URL is in the comment on the first line of the following code):

Code:

! http://www.eng-tips.com/viewthread.cfm?qid=127558
program Search_promotions_and_underpromotions_by_crowning_squares
implicit none
character(len=256) line
character(len=4),parameter::str='h8=N'
integer::n
integer where
n=0
open(unit=11,file='CCRL-4040.[462398].pgn',status='unknown',action='read')
1 read(11,'(A)',end=2) line
where=index (line,str)
if (where .ne. 0) then
  n=n+1
end if
go to 1
2 continue
close(11)
write(*,*);write(*,*) trim(str),n
end program Search_promotions_and_underpromotions_by_crowning_squares
I am not very sure about the reliability of this parser (maybe someone can correct these numbers), but it was the best that the Internet could offer to my non-existent programming skills. The output of this parser was something like this:

Code:

 h8=N         103
I had to change the search string (h8=N and the others) by hand and recompile each time. As you can see, it is not very automation-friendly, but at least I collected some interesting stats (sums, percentages, etc. were done with a Casio calculator, so they may contain errors). I hope I made no typos when writing the results into Notepad:

Code:

CCRL-404FRC.[129100].pgn

a1=Q: 5012
a1=R:  156
a1=B:   28
a1=N:   60

b1=Q: 4010
b1=R:  109
b1=B:   36
b1=N:   58

c1=Q: 3484
c1=R:  116
c1=B:   32
c1=N:   64

d1=Q: 3207
d1=R:  127
d1=B:   30
d1=N:   87

e1=Q: 3167
e1=R:  148
e1=B:   29
e1=N:  113

f1=Q: 3369
f1=R:  141
f1=B:   25
f1=N:   90

g1=Q: 3948
g1=R:  108
g1=B:   29
g1=N:   63

h1=Q: 5004
h1=R:  136
h1=B:   32
h1=N:   51

a8=Q: 5520
a8=R:  140
a8=B:   37
a8=N:   49

b8=Q: 4315
b8=R:  145
b8=B:   32
b8=N:   71

c8=Q: 3874
c8=R:  145
c8=B:   42
c8=N:   89

d8=Q: 3462
d8=R:  141
d8=B:   39
d8=N:  100

e8=Q: 3351
e8=R:  142
e8=B:   39
e8=N:   88

f8=Q: 3644
f8=R:  146
f8=B:   29
f8=N:   79

g8=Q: 4248
g8=R:  121
g8=B:   26
g8=N:   58

h8=Q: 5441
h8=R:  169
h8=B:   34
h8=N:   46

====================

a1=*:  5256 (15.89%)
b1=*:  4213 (12.74%)
c1=*:  3696 (11.18%)
d1=*:  3451 (10.44%)
e1=*:  3457 (10.45%)
f1=*:  3625 (10.96%)
g1=*:  4148 (12.54%)
h1=*:  5223 (15.79%)
 SUM: 33069

--------------------

a8=*:  5746 (16.02%)
b8=*:  4563 (12.72%)
c8=*:  4150 (11.57%)
d8=*:  3742 (10.43%)
e8=*:  3620 (10.09%)
f8=*:  3898 (10.87%)
g8=*:  4453 (12.42%)
h8=*:  5690 (15.87%)
 SUM: 35862

--------------------

a*=*: 11002 (15.96%)
b*=*:  8776 (12.73%)
c*=*:  7846 (11.38%)
d*=*:  7193 (10.44%)
e*=*:  7077 (10.27%)
f*=*:  7523 (10.91%)
g*=*:  8601 (12.48%)
h*=*: 10913 (15.83%)
 SUM: 68931

====================

*1=*: 33069 (47.97%)
*8=*: 35862 (52.03%)
 SUM: 68931

====================

*1=Q: 31201 (94.35%)
*1=R:  1041  (3.15%)
*1=B:   241  (0.73%)
*1=N:   586  (1.77%)
 SUM: 33069

--------------------

*8=Q: 33855 (94.40%)
*8=R:  1149  (3.20%)
*8=B:   278  (0.78%)
*8=N:   580  (1.62%)
 SUM: 35862

--------------------

 *=Q: 65056 (94.38%)
 *=R:  2190  (3.18%)
 *=B:   519  (0.75%)
 *=N:  1166  (1.69%)
 SUM: 68931

====================

(68931 promotions and underpromotions in 129100 games) ~ 0.5339 (promotions and underpromotions)/game.

Code:

CCRL-404.[1045382].pgn

a1=Q: 37748
a1=R:  1385
a1=B:   259
a1=N:   322

b1=Q: 29373
b1=R:  1135
b1=B:   271
b1=N:   408

c1=Q: 23438
c1=R:  1171
c1=B:   287
c1=N:   505

d1=Q: 20346
d1=R:  1057
d1=B:   296
d1=N:   615

e1=Q: 22416
e1=R:   941
e1=B:   293
e1=N:   785

f1=Q: 26347
f1=R:   919
f1=B:   281
f1=N:   666

g1=Q: 26224
g1=R:   695
g1=B:   199
g1=N:   467

h1=Q: 33058
h1=R:   972
h1=B:   252
h1=N:   395

a8=Q: 44178
a8=R:  1534
a8=B:   291
a8=N:   382

b8=Q: 34839
b8=R:  1264
b8=B:   280
b8=N:   499

c8=Q: 27044
c8=R:  1242
c8=B:   281
c8=N:   604

d8=Q: 22980
d8=R:  1286
d8=B:   311
d8=N:   708

e8=Q: 22030
e8=R:  1015
e8=B:   308
e8=N:   897

f8=Q: 27523
f8=R:   971
f8=B:   261
f8=N:   810

g8=Q: 29502
g8=R:   779
g8=B:   187
g8=N:   547

h8=Q: 36919
h8=R:  1005
h8=B:   289
h8=N:   479

=====================

a1=*:  39714 (17.01%)
b1=*:  31187 (13.35%)
c1=*:  25401 (10.88%)
d1=*:  22314  (9.56%)
e1=*:  24435 (10.46%)
f1=*:  28213 (12.08%)
g1=*:  27585 (11.81%)
h1=*:  34677 (14.85%)
 SUM: 233526

---------------------

a8=*:  46385 (17.76%)
b8=*:  36882 (14.12%)
c8=*:  29171 (11.17%)
d8=*:  25285  (9.68%)
e8=*:  24250  (9.28%)
f8=*:  29565 (11.32%)
g8=*:  31015 (11.87%)
h8=*:  38692 (14.81%)
 SUM: 261245

---------------------

a*=*:  86099 (17.40%)
b*=*:  68069 (13.76%)
c*=*:  54572 (11.03%)
d*=*:  47599  (9.62%)
e*=*:  48685  (9.84%)
f*=*:  57778 (11.68%)
g*=*:  58600 (11.84%)
h*=*:  73369 (14.83%)
 SUM: 494771

=====================

*1=*: 233526 (47.20%)
*8=*: 261245 (52.80%)
 SUM: 494771

=====================

*1=Q: 218950 (93.76%)
*1=R:   8275  (3.54%)
*1=B:   2138  (0.92%)
*1=N:   4163  (1.78%)
 SUM: 233526

---------------------

*8=Q: 245015 (93.79%)
*8=R:   9096  (3.48%)
*8=B:   2208  (0.85%)
*8=N:   4926  (1.89%)
 SUM: 261245

---------------------

 *=Q: 463965 (93.77%)
 *=R:  17371  (3.51%)
 *=B:   4346  (0.88%)
 *=N:   9089  (1.84%)
 SUM: 494771

=====================

(494771 promotions and underpromotions in 1045382 games) ~ 0.4733 (promotions and underpromotions)/game.

Code:

CCRL-4040.[462398].pgn

a1=Q: 6564
a1=R:  325
a1=B:   81
a1=N:   69

b1=Q: 5462
b1=R:  318
b1=B:   69
b1=N:  129

c1=Q: 4491
c1=R:  295
c1=B:   63
c1=N:  124

d1=Q: 3941
d1=R:  255
d1=B:   94
d1=N:  189

e1=Q: 3956
e1=R:  223
e1=B:   58
e1=N:  186

f1=Q: 4211
f1=R:  201
f1=B:   57
f1=N:  191

g1=Q: 4180
g1=R:  163
g1=B:   47
g1=N:  112

h1=Q: 5010
h1=R:  199
h1=B:   68
h1=N:  106

a8=Q: 8048
a8=R:  383
a8=B:   75
a8=N:   99

b8=Q: 6413
b8=R:  305
b8=B:   66
b8=N:  139

c8=Q: 5296
c8=R:  338
c8=B:   76
c8=N:  146

d8=Q: 4743
d8=R:  370
d8=B:   79
d8=N:  221

e8=Q: 3974
e8=R:  242
e8=B:   70
e8=N:  249

f8=Q: 4551
f8=R:  252
f8=B:   83
f8=N:  188

g8=Q: 4574
g8=R:  183
g8=B:   49
g8=N:  147

h8=Q: 5503
h8=R:  238
h8=B:   62
h8=N:  103

====================

a1=*:  7039 (16.99%)
b1=*:  5978 (14.43%)
c1=*:  4973 (12.00%)
d1=*:  4479 (10.81%)
e1=*:  4423 (10.67%)
f1=*:  4660 (11.25%)
g1=*:  4502 (10.86%)
h1=*:  5383 (12.99%)
 SUM: 41437

--------------------

a8=*:  8605 (18.21%)
b8=*:  6923 (14.65%)
c8=*:  5856 (12.39%)
d8=*:  5413 (11.45%)
e8=*:  4535  (9.59%)
f8=*:  5074 (10.74%)
g8=*:  4953 (10.48%)
h8=*:  5906 (12.50%)
 SUM: 47265

--------------------

a*=*: 15644 (17.64%)
b*=*: 12901 (14.54%)
c*=*: 10829 (12.21%)
d*=*:  9892 (11.15%)
e*=*:  8958 (10.10%)
f*=*:  9734 (10.97%)
g*=*:  9455 (10.66%)
h*=*: 11289 (12.73%)
 SUM: 88702

====================

*1=*: 41437 (46.71%)
*8=*: 47265 (53.29%)
 SUM: 88702

====================

*1=Q: 37815 (91.26%)
*1=R:  1979  (4.78%)
*1=B:   537  (1.30%)
*1=N:  1106  (2.67%)
 SUM: 41437

--------------------

*8=Q: 43102 (91.19%)
*8=R:  2311  (4.89%)
*8=B:   560  (1.18%)
*8=N:  1292  (2.73%)
 SUM: 47265

--------------------

 *=Q: 80917 (91.22%)
 *=R:  4290  (4.84%)
 *=B:  1097  (1.24%)
 *=N:  2398  (2.70%)
 SUM: 88702

====================

(88702 promotions and underpromotions in 462398 games) ~ 0.1918 (promotions and underpromotions)/game.
An asterisk stands for every possibility in that field (file, rank or promoted piece). The parser was not uber-fast, so the whole task took me a good while.
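
For readers who want to reproduce tables like these without recompiling for each of the 64 strings, here is a minimal Python sketch. It is not the parser used for the numbers above: it assumes SAN promotions written as `a8=Q`, skips PGN tag lines, and the names `tally_promotions` and `wildcard` are made up for illustration. It counts every occurrence in a single pass and then sums over the asterisk fields.

```python
import re
from collections import Counter

# Promotion in SAN: target square on rank 1 or 8, '=', promoted piece.
# Matches "a8=Q", "bxa8=N+", "b7-b8=Q" and "h1=R#"; check marks are ignored.
PROMO = re.compile(r"([a-h])([18])=([QRBN])")

def tally_promotions(lines):
    """Count every promotion in PGN text lines, keyed by (file, rank, piece)."""
    counts = Counter()
    for line in lines:
        if line.startswith("["):      # skip tag pairs such as [Event "..."]
            continue
        for file_, rank, piece in PROMO.findall(line):
            counts[(file_, rank, piece)] += 1
    return counts

def wildcard(counts, file_="*", rank="*", piece="*"):
    """Sum counts where '*' matches anything, e.g. wildcard(c, file_='a') is a*=*."""
    return sum(n for (f, r, p), n in counts.items()
               if file_ in ("*", f) and rank in ("*", r) and piece in ("*", p))

# Hypothetical movetext, only meant to exercise the counter.
sample = [
    '[White "Engine A"]',
    "55. a8=Q Kg7 56. Qb7+ Kh6 57. h8=N h1=Q",
    "58. bxa8=R+ Kg5 59. h8=N g1=Q#",
]
counts = tally_promotions(sample)
print("a8=*:", wildcard(counts, file_="a", rank="8"))
print(" *=N:", wildcard(counts, piece="N"))
print("*1=*:", wildcard(counts, rank="1"))
```

Because `findall` returns every match on a line, two promotions on the same line are both counted.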

I only searched for the crowning square and the promoted piece, without caring whether the move gave check or checkmate (for example, *=Q includes moves written as *=Q, *=Q+ and *=Q#). The same goes for how the pawn reached the crowning square: b8=Q includes axb8=Q, b7-b8=Q and cxb8=Q (each with or without check or checkmate). Still, I think that these stats raise some interesting questions/conclusions:

a) Games at long time controls average far fewer promotions and underpromotions (more accurate play at long TC?).

b) The frequency of promotions and underpromotions is Q > R > N > B, while I expected Q > N > R > B.

c) Corner squares (a1, a8, h1 and h8) have significantly more promotions and underpromotions than central squares (d1, d8, e1 and e8). In general, if we plot files on the x-axis and (promotions + underpromotions) on the y-axis, we obtain a V-shaped graph. The symmetry (a-h, b-g, c-f and d-e) is greater in FRC, while standard chess is more unbalanced (could it be because there is less theory in FRC?).

d) White promotes/underpromotes more than Black (is there a relationship with the first-move advantage?).

e) Promotions and underpromotions are more common on the queenside than on the kingside (could it be because kingside castling is more frequent than queenside castling?).
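
Points c) and e) can be put into numbers with a short Python sketch. It uses the a*=* to h*=* totals of the CCRL 40/4 table above; treating files a-d as "queenside" and e-h as "kingside" is an assumption of this sketch, and the function names are illustrative.

```python
# File totals (a*=* ... h*=*) copied from the CCRL 40/4 table in this post.
file_totals = {"a": 86099, "b": 68069, "c": 54572, "d": 47599,
               "e": 48685, "f": 57778, "g": 58600, "h": 73369}

def flank_shares(totals):
    """Return (queenside %, kingside %), with files a-d as queenside and e-h as kingside."""
    queenside = sum(totals[f] for f in "abcd")
    kingside = sum(totals[f] for f in "efgh")
    total = queenside + kingside
    return 100 * queenside / total, 100 * kingside / total

def mirror_ratios(totals):
    """Ratio of each file to its mirror (a/h, b/g, c/f, d/e); 1.0 means perfect symmetry."""
    return {q + k: totals[q] / totals[k] for q, k in zip("abcd", "hgfe")}

qs, ks = flank_shares(file_totals)
print(f"queenside {qs:.2f}%  kingside {ks:.2f}%")
for pair, ratio in mirror_ratios(file_totals).items():
    print(pair, round(ratio, 3))
```

Swapping in the FRC totals allows a direct comparison of the mirror ratios between the two databases.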

There are surely more questions/conclusions.

I do not know if these statistics can be helpful for programmers, for example for tuning bonuses and/or penalties in chess engines. Anyway, I wanted to share them. Enjoy!

Regards from Spain.

Ajedrecista.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Post by sje »

Ajedrecista wrote:I made a Fortran parser with the help of Internet
"Dear Father, forgive me for I have sinned. Decades ago I taught the evil that is Fortran, just because I needed the money. I deeply regret this and promise never to do it again."

"My son, you are forgiven. You must say ten Our Fathers and ten Hail Marys as a penance. Further, you will never again use the Devil's goto statement no matter how alluring it might be. And if you fear the fires of Hell, you won't even consider the mortal sins of setjmp()/longjmp()."

Perhaps I'm mistaken, but does your program count every promotion move when there is more than one on a text line?
Ajedrecista
Posts: 2102
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Post by Ajedrecista »

Hello Steven:
sje wrote: Perhaps I'm mistaken, but does your program count every promotion move when there is more than one on a text line?
Thanks for your interest.

I am not sure at all. If I parsed only '=', there would surely be more problems. Since I parsed something more specific (the string 'a1=Q', for example), the probability of two or more 'a1=Q' occurrences on the same line is very small. I also toyed with the length of the character variable line: I got smaller counts n when I lowered this length. I recall that I got the same results for various searches with length = 100, 128, 256 and 512, but different results with length = 64 or less. Honestly, I do not know all the internals of this parser (which was not mine).
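
A likely reason for the length effect, sketched in Python under the assumption that the PGN lines are wrapped below 80 columns (as the PGN export format recommends): a formatted read into a fixed-length character variable keeps only the first `len` characters of each line, so anything beyond column 64 is never searched. The function name and the sample line are hypothetical.

```python
def count_lines_with(lines, needle, buffer_len):
    """Count lines where `needle` appears within the first `buffer_len` characters,
    mimicking a fixed-length buffer that silently drops the rest of the line."""
    return sum(1 for line in lines if needle in line[:buffer_len])

# Hypothetical movetext line, shorter than 80 characters, with a promotion near its end.
line = "61. Kf6 Kd4 62. g6 c3 63. g7 c2 64. Kf7 Kd3 65. Ke7 Kd2 66. g8=Q c1=Q"

for length in (64, 100, 256):
    print(length, count_lines_with([line], "c1=Q", length))
```

Under that assumption every buffer of 80 characters or more sees the whole line, which would explain why 100, 128, 256 and 512 all gave the same counts.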

I probably did not put enough emphasis on saying that my results should be taken with great care. In fact, I wrote about the possibly poor reliability of the parser in my original post. I also recall that in the first PGN file (where I toyed with lengths), searching for the string 'a1=' gave 5255 results, while the sum of 'a1=Q' + 'a1=R' + 'a1=B' + 'a1=N' gave 5256. The numbers in the original post should not be read as exact but as good approximations (if I am lucky) that give an overall picture of promotions and underpromotions.
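
One possible explanation for the 5255 versus 5256 difference, shown as a Python sketch rather than a confirmed diagnosis: when each line contributes at most one hit per search string, a line holding two different promotions on a1 counts once for 'a1=' but once each for 'a1=Q' and 'a1=R'. The sample lines are hypothetical.

```python
def lines_containing(lines, needle):
    """Per-line presence count: a line adds 1 no matter how often `needle` occurs in it."""
    return sum(1 for line in lines if needle in line)

lines = [
    "70. Kc2 a1=Q 71. Rxa1 bxa1=R",   # hypothetical: two different promotions on a1
    "88. Kh3 a1=Q+",
]
generic = lines_containing(lines, "a1=")
specific = sum(lines_containing(lines, "a1=" + piece) for piece in "QRBN")
print("a1=   :", generic)
print("Q+R+B+N:", specific)
```

A single such line in the whole file would be enough to produce a difference of exactly one.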

Talking about Fortran: well, it is the only programming language of which I have a little knowledge (do loops, if statements, read, write and almost nothing more). Sorry for the 'go to' statements: I am definitely not a programmer.

Regards from Spain.

Ajedrecista.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Post by sje »

Try a test case. Run your program on a single line of input which has two promotion moves and see the result.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Post by Adam Hair »

Hi Jesús,

You can use a program called "Joined" (created by Andreas Stabel) to double-check the total number of promotions for each piece. You can download it from Jim Ablett's site (near the bottom of the page).
Ajedrecista wrote: a) Games at long time controls average far fewer promotions and underpromotions (more accurate play at long TC?).
This is possible. Keep in mind that adjudication is used extensively for the 40/40 games. This may reduce the total number of promotions (the threat of a promotion could cause scores to be high enough to trigger adjudication).
Ajedrecista wrote: b) The frequency of promotions and underpromotions is Q > R > N > B, while I expected Q > N > R > B.
I have seen engines underpromote to a rook when the promoted piece is obviously going to be taken on the opponent's turn. This could possibly skew the order.
Ajedrecista wrote: c) Corner squares (a1, a8, h1 and h8) have significantly more promotions and underpromotions than central squares (d1, d8, e1 and e8). In general, if we plot files on the x-axis and (promotions + underpromotions) on the y-axis, we obtain a V-shaped graph. The symmetry (a-h, b-g, c-f and d-e) is greater in FRC, while standard chess is more unbalanced (could it be because there is less theory in FRC?).
The greater symmetry of the FRC results is most likely related to the fact that every FRC starting position has a mirror position.
Ajedrecista wrote: d) White promotes/underpromotes more than Black (is there a relationship with the first-move advantage?).
That seems like a valid assumption to me.
Ajedrecista wrote: e) Promotions and underpromotions are more common on the queenside than on the kingside (could it be because kingside castling is more frequent than queenside castling?).
I think the answer is yes. Kingside castling occurs at a 9 to 1 ratio compared to queenside castling in the CCRL database. However, I had expected the promotion asymmetry to be higher than what you found. So I may be wrong.
Ajedrecista
Posts: 2102
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Post by Ajedrecista »

Hello again:
sje wrote:Try a test case. Run your program on a single line of input which has two promotion moves and see the result.
It fails miserably. I tried the line 'h8=N h8=N' and it reports the string h8=N only once. I guess that I lose promotions mostly when there are two connected passed pawns (for example on a2 and b2, with a sequence of moves like a1=Q; ..., bxa1=Q, and likewise when the promotion square is b1). OTOH I must say that my miscount rate is small, as you will see below.
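
The failing test case can be mirrored in a few lines of Python. This is only an analogy to the Fortran logic (one `index` call per line, so at most one hit per line), with hypothetical function names; the second function shows the fix of counting every occurrence.

```python
def count_first_match_per_line(lines, needle):
    """What a single search per line effectively does: at most one hit per line."""
    return sum(1 for line in lines if line.find(needle) != -1)

def count_all_matches(lines, needle):
    """Count every non-overlapping occurrence on every line."""
    return sum(line.count(needle) for line in lines)

test_lines = ["h8=N h8=N"]
print(count_first_match_per_line(test_lines, "h8=N"))
print(count_all_matches(test_lines, "h8=N"))

# The connected-pawns case from the post: a1=Q, then bxa1=Q on the same line.
pawn_lines = ["54. Kc2 a1=Q 55. Rxa1 bxa1=Q"]
print(count_first_match_per_line(pawn_lines, "a1=Q"))
print(count_all_matches(pawn_lines, "a1=Q"))
```

In Fortran the same idea would mean searching again in the remainder of the line after each hit until `index` returns 0.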
Adam Hair wrote: You can use a program called "Joined" (created by Andreas Stabel) to double-check the total number of promotions for each piece.
Thank you very much! This tool is great, a must-have. I tried Joined with CCRL-404FRC.[129100].pgn and it does not work (it does not recognize the initial FEN strings or something like that).

This is what I got from CCRL-4040.[464454].pgn after typing joined -q -v64 CCRL-4040.[464454].pgn in the command line:

Code:

Scanning file CCRL-4040.[464454].pgn containing 525847439 bytes !
Read 464454 games with 63708628 moves, 12036248 lines, 525847439 bytes !
Got max. 702 moves in game 435187 before line 11287778 !
Got 1824201160 legal moves in 64173082 positons !
Got max. 84 legal moves after move 116 in game 137555 line 3557168 !

Total move statistics:
Pawn moves = 12953536, Knight moves = 8616741, Bishop moves = 9845880
Rook moves = 14240537, Queen moves = 8203793, King moves = 10689572
Check moves = 4186178, Mate moves = 32683, Stalemate moves = 834
Hit moves = 10016211, En passant moves = 30792, Pawn two moves = 2947974
Short castlings = 767126, Long castlings = 74305
Promotion to queen = 81344, Promotion to rook = 4302
Promotion to bishop = 1105, Promotion to knight = 2404

Total game statistics:
White won = 164862, Drawn = 177554, Black won = 122038
Unknown result = 0, Illegal result = 0, Conflicting results = 0
White mated = 14212, White stalemated = 399
Black mated = 18471, Black stalemated = 435

Number of games with different lengths:
 28:   107,  29:   120,  30:   109,  31:   177,  32:   153,  33:   166
 34:   172,  35:   191,  36:   217,  37:   225,  38:   209,  39:   264

[...]
There is more info for games with length > 39. The part that matters for this thread:

Code:

Promotion to queen = 81344, Promotion to rook = 4302
Promotion to bishop = 1105, Promotion to knight = 2404
I did not change my clumsy parser, but I added more code to make it more automatic: now I only have to type the PGN file name and the program does the rest. Of course it is slow: it took 1692.9 seconds (0:28:12.9) on my PC to parse the CCRL-4040.[464454].pgn file. The results were quite accurate, which I did not expect:

Code:

Searching promotions and underpromotions...

 1/64  a1=Q    6584
 2/64  a1=R     325
 3/64  a1=B      82
 4/64  a1=N      69
 5/64  b1=Q    5477
 6/64  b1=R     320
 7/64  b1=B      69
 8/64  b1=N     130

[...]

57/64  g8=Q    4584
58/64  g8=R     185
59/64  g8=B      51
60/64  g8=N     147
61/64  h8=Q    5521
62/64  h8=R     239
63/64  h8=B      62
64/64  h8=N     104

Promotions/underpromotions to:
  Q:  81149
  R:   4302
  B:   1105
  N:   2402
I am off by around 0.24% for queenings and 0.08% for underpromotions to knight, and I got the exact counts for underpromotions to rook and bishop! (Of course, I assume that Joined outputs the true values.)
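
The percentages follow from the two sets of totals above. A small Python sketch of the arithmetic, taking the Joined counts as the reference (the same assumption as in the post); the function name is illustrative.

```python
joined = {"Q": 81344, "R": 4302, "B": 1105, "N": 2404}   # counts reported by Joined
parser = {"Q": 81149, "R": 4302, "B": 1105, "N": 2402}   # counts from the line-based parser

def undercount(reference, measured):
    """Return {piece: (missed, missed as a percentage of the reference count)}."""
    return {p: (reference[p] - measured[p],
                100 * (reference[p] - measured[p]) / reference[p])
            for p in reference}

result = undercount(joined, parser)
for piece, (missed, pct) in result.items():
    print(f"{piece}: missed {missed:4d} ({pct:.2f}%)")
```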

Furthermore, I now compute the number of games from the PGN and... oh surprise! I got the correct statistics: :)

Code:

Searching for games and results...

Games:      464454
  1-0:      164862
  0-1:      122038
  1/2-1/2:  177554
  Unknown:       0

______________________________

White score:  54.61%
______________________________

White advantage:    32.13 Elo.
______________________________

Approximated elapsed time:  1692.9 seconds.
I can complete the results with the CEGT archives, with the help of Joined and my tool, if I find spare time (something difficult right now).
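
The White score and Elo figures above can be checked with the usual formulas. The post does not say which formula the tool uses, so the logistic Elo model in this Python sketch is an assumption (it does reproduce the printed 54.61% and 32.13 Elo); the function names are illustrative.

```python
import math

def white_score(wins, draws, losses):
    """Fraction of the points scored by White, counting a draw as half a point."""
    return (wins + 0.5 * draws) / (wins + draws + losses)

def elo_advantage(score):
    """Elo difference implied by an expected score under the logistic model."""
    return 400 * math.log10(score / (1 - score))

# Results for CCRL-4040.[464454].pgn as listed above.
score = white_score(wins=164862, draws=177554, losses=122038)
elo = elo_advantage(score)
print(f"White score:     {100 * score:.2f}%")
print(f"White advantage: {elo:.2f} Elo")
```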

Regards from Spain.

Ajedrecista.
jacobbl
Posts: 80
Joined: Wed Feb 17, 2010 3:57 pm

Post by jacobbl »

I've also seen underpromotion to rook when setting a mate. So I think to get relevant frequencies of the importance of underpromotion, you should remove promotions that are captured in the next move and convert underpromotions that result in a mate (except for knights) to a queen promotion.
Ajedrecista
Posts: 2102
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Post by Ajedrecista »

Hello Jacob:
jacobbl wrote:I've also seen underpromotion to rook when setting a mate. So I think to get relevant frequencies of the importance of underpromotion, you should remove promotions that are captured in the next move and convert underpromotions that result in a mate (except for knights) to a queen promotion.
Thanks for your interest. Your proposal looks reasonable... unfortunately I am not a programmer and I do not know how to do it. Someone else (if interested) would have to do the task.
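
A rough Python sketch of how Jacob's filter could work on SAN text alone, without a chess library. It is a heuristic under stated assumptions: the moves are already split into one SAN string per ply, "captured" means the very next ply contains a capture on the promotion square, and a mating underpromotion to rook or bishop is treated as queen-equivalent (a queen on the same square covers everything a rook or bishop does). The function and label names are made up.

```python
import re

PROMO = re.compile(r"([a-h][18])=([QRBN])([+#]?)")

def classify_promotions(moves):
    """Classify each promotion in a list of SAN moves (one ply per item).

    'captured'   : the next ply captures on the promotion square
    'mate_equiv' : an underpromotion to R or B that mates (a queen would mate too)
    'genuine'    : everything else
    """
    labels = []
    for i, move in enumerate(moves):
        m = PROMO.search(move)
        if not m:
            continue
        square, piece, suffix = m.groups()
        reply = moves[i + 1] if i + 1 < len(moves) else ""
        if "x" + square in reply:
            labels.append((move, "captured"))
        elif piece in "RB" and suffix == "#":
            labels.append((move, "mate_equiv"))
        else:
            labels.append((move, "genuine"))
    return labels

# Hypothetical ply list, only meant to exercise the three branches.
plies = ["g7", "Kf7", "g8=Q+", "Kxg8", "c1=N+", "Kd3", "b8=R#"]
for move, label in classify_promotions(plies):
    print(move, label)
```

A full solution would replay the games with a move generator; this text-only version cannot tell, for instance, whether an underpromotion avoided stalemate.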

Regards from Spain.

Ajedrecista.
flok

Post by flok »

For fun and kicks I stripped most of the code from my "book maker" and let it emit the promotion counts. It analyzed 1.6 GB of PGN files containing 2511300 games (not all of them with promotions). The PGN files are a totally random collection found by googling the internet.

Code:

Q 142960
B 433
R 1596
N 2313
To parse the PGN files I use the code from pgnparse_0.2.3.tar.gz (it is a SourceForge project: https://sourceforge.net/projects/pgnpar ... rse_0.2.3/). This code works fairly well, although it doesn't understand comments ({...}) and has problems with $2, $6, etc., which are "NAGs" (and I don't know what those are).
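
For reference, NAGs are the Numeric Annotation Glyphs of the PGN standard: `$2` marks a poor move ("?") and `$6` a questionable one ("?!"). One way around both problems is to strip comments and NAGs before handing the movetext to a parser. The Python sketch below is a generic pre-processing step, not part of pgnparse; it does not handle variations in parentheses, and the function name is made up.

```python
import re

COMMENT = re.compile(r"\{[^}]*\}")      # brace comments: { ... } (they do not nest in PGN)
LINE_COMMENT = re.compile(r";[^\n]*")   # rest-of-line comments
NAG = re.compile(r"\$\d+")              # Numeric Annotation Glyphs such as $2 or $6

def strip_annotations(movetext):
    """Remove comments and NAGs from PGN movetext and normalise whitespace."""
    text = COMMENT.sub(" ", movetext)
    text = LINE_COMMENT.sub(" ", text)
    text = NAG.sub(" ", text)
    return " ".join(text.split())

raw = "1. e4 {book} e5 $2 2. Nf3 {+0.31/14 0.8s} Nc6 $6 3. Bb5"
print(strip_annotations(raw))
```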


p.s. if anyone knows sources for PGN-files, any will do (I think that bad games/positions/moves will become statistical noise due to the amount of games analyzed), please let me know!
Ajedrecista
Posts: 2102
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Post by Ajedrecista »

Hello:

I downloaded the full PGN file of CEGT 40/20. I ran Joined on it and there was some ugly output:

Code:

[...]

Got FEN tag <r2qkbnr/pp1npppb/2p4p/7P/3P4/5NN1/PPP2PP1/R1BQKB1R w KQkq - 0 1> !
Got FEN tag <rnbqkb1r/1p3ppp/p3pn2/2p5/2BP4/4PN2/PP3PPP/RNBQ1RK1 w kq - 0 1> !
Got FEN tag <r1bqkb1r/5ppp/p1np1n2/1p2p1B1/4P3/N1N5/PPP2PPP/R2QKB1R w KQkq - 0 1> !
Got FEN tag <r1bqkb1r/1p3ppp/p1nppn2/6B1/3NP3/2N5/PPPQ1PPP/2KR1B1R b kq - 0 1> !
Got FEN tag <rnbq1rk1/pp3ppp/4pn2/2pp4/1bPP4/2NBPN2/PP3PPP/R1BQ1RK1 b - - 0 1> !
*** Warning: Read game with no moves in line 16928181 !
(FEN tags and 0-move games were independent things.) There were many more messages like these (I did not get anything like that with the CCRL files). Finally, after 18 minutes, more or less:

Code:

Read 682244 games with 99247346 moves, 17101864 lines, 781729892 bytes !
Got max. 600 moves in game 2500 before line 68538 !
Got 2782019805 legal moves in 99929590 positons !
Got max. 84 legal moves after move 143 in game 254252 line 6846656 !

Total move statistics:
Pawn moves = 19299872, Knight moves = 13024433, Bishop moves = 15267624
Rook moves = 22322133, Queen moves = 12742232, King moves = 17833853
Check moves = 6898774, Mate moves = 55701, Stalemate moves = 2582
Hit moves = 14953687, En passant moves = 46822, Pawn two moves = 4335985
Short castlings = 1135175, Long castlings = 107626
Promotion to queen = 141200, Promotion to rook = 7469
Promotion to bishop = 2036, Promotion to knight = 3959

Total game statistics:
White won = 247769, Drawn = 251462, Black won = 183011
Unknown result = 2, Illegal result = 0, Conflicting results = 0
White mated = 23872, White stalemated = 1285
Black mated = 31829, Black stalemated = 1297

Number of games with different lengths:
  0:     6,  10:     2,  14:     1,  18:     5,  19:     6,  20:    22
 21:    19,  22:    23,  23:    39,  24:    74,  25:    70,  26:    74
 27:   147,  28:   140,  29:   171,  30:   160,  31:   207,  32:   206
 33:   235,  34:   226,  35:   267,  36:   248,  37:   321,  38:   308
 39:   388,  40:   302,  41:   469,  42:   341,  43:   465,  44:   417
 45:   520,  46:   443,  47:   593,  48:   525,  49:   743,  50:   619
 51:   819,  52:   702,  53:   990,  54:   794,  55:  1097,  56:   897
 57:  1224,  58:   982,  59:  1443,  60:  1148,  61:  1597,  62:  1293
 63:  1730,  64:  1408,  65:  1989,  66:  1647,  67:  2111,  68:  1875
 69:  2385,  70:  2006,  71:  2508,  72:  2173,  73:  2835,  74:  2413
 75:  3042,  76:  2689,  77:  3228,  78:  2935,  79:  3554,  80:  3033
 81:  3657,  82:  3376,  83:  4054,  84:  4612,  85:  6256,  86:  5460
 87:  5611,  88:  4289,  89:  5085,  90:  4693,  91:  5217,  92:  4560
 93:  5327,  94:  4942,  95:  5647,  96:  5112,  97:  5783,  98:  5256
 99:  5937, 100:  5392, 101:  6202, 102:  5526, 103:  6140, 104:  5692
105:  6324, 106:  5775, 107:  6341, 108:  5823, 109:  6568, 110:  5977
111:  6453, 112:  6073, 113:  6424, 114:  6248, 115:  6396, 116:  5964
117:  6503, 118:  6047, 119:  6330, 120:  5944, 121:  6418, 122:  5865
123:  6381, 124:  5845, 125:  5967, 126:  5693, 127:  6067, 128:  5445
129:  5981, 130:  5508, 131:  5635, 132:  5287, 133:  5475, 134:  5216
135:  5353, 136:  4974, 137:  5208, 138:  4906, 139:  5093, 140:  4767
141:  4735, 142:  4634, 143:  4740, 144:  4409, 145:  4537, 146:  4325
147:  4445, 148:  4106, 149:  4285, 150:  3932, 151:  4101, 152:  3762
153:  3923, 154:  3616, 155:  3596, 156:  3540, 157:  3608, 158:  3412
159:  3417, 160:  3109, 161:  3335, 162:  3076, 163:  3128, 164:  3374
165:  3965, 166:  3564, 167:  3217, 168:  2832, 169:  3019, 170:  2727
171:  2811, 172:  2566, 173:  2498, 174:  2358, 175:  2390, 176:  2203
177:  2347, 178:  2239, 179:  2303, 180:  1989, 181:  2163, 182:  2167
183:  2089, 184:  1953, 185:  2010, 186:  1858, 187:  1906, 188:  1840
189:  1918, 190:  1790, 191:  1803, 192:  1684, 193:  1748, 194:  1628
195:  1690, 196:  1549, 197:  1613, 198:  1536, 199:  1615, 200:  1501
201:  1485, 202:  1514, 203:  1552, 204:  1381, 205:  1427, 206:  1404
207:  1369, 208:  1358, 209:  1397, 210:  1287, 211:  1306, 212:  1330
213:  1337, 214:  1273, 215:  1277, 216:  1257, 217:  1230, 218:  1187
219:  1264, 220:  1192, 221:  1150, 222:  1160, 223:  1159, 224:  1107
225:  1135, 226:  1133, 227:  1097, 228:  1080, 229:  1050, 230:  1085
231:  1038, 232:  1011, 233:  1054, 234:  1032, 235:   987, 236:   937
237:   966, 238:   964, 239:   963, 240:   988, 241:   924, 242:   916
243:   892, 244:   887, 245:  1090, 246:   957, 247:   876, 248:   844
249:   818, 250:   845, 251:   821, 252:   771, 253:   740, 254:   729
255:   752, 256:   747, 257:   695, 258:   730, 259:   676, 260:   743
261:   701, 262:   668, 263:   662, 264:   593, 265:   611, 266:   614
267:   606, 268:   565, 269:   600, 270:   562, 271:   585, 272:   539
273:   565, 274:   582, 275:   582, 276:   570, 277:   493, 278:   555
279:   502, 280:   550, 281:   509, 282:   492, 283:   490, 284:   471
285:   477, 286:   501, 287:   466, 288:   456, 289:   470, 290:   483
291:   413, 292:   403, 293:   403, 294:   424, 295:   419, 296:   401
297:   374, 298:   386, 299:   373, 300:  1491, 301:   357, 302:   344
303:   364, 304:   380, 305:   357, 306:   333, 307:   352, 308:   312
309:   316, 310:   308, 311:   317, 312:   312, 313:   329, 314:   329
315:   295, 316:   323, 317:   293, 318:   293, 319:   296, 320:  1125
321:   264, 322:   252, 323:   270, 324:   285, 325:   277, 326:   273
327:   272, 328:   243, 329:   259, 330:   231, 331:   228, 332:   222
333:   243, 334:   229, 335:   209, 336:   210, 337:   208, 338:   193
339:   195, 340:   194, 341:   193, 342:   202, 343:   186, 344:   207
345:   179, 346:   176, 347:   174, 348:   153, 349:   182, 350:   165
351:   178, 352:   159, 353:   150, 354:   146, 355:   147, 356:   164
357:   159, 358:   147, 359:   148, 360:   158, 361:   139, 362:   145
363:   152, 364:   142, 365:   129, 366:   144, 367:   120, 368:   135
369:   116, 370:   122, 371:   104, 372:   118, 373:   109, 374:   123
375:   110, 376:   118, 377:   110, 378:   100, 379:   115, 380:   109
381:   109, 382:   101, 383:    92, 384:   111, 385:    82, 386:    95
387:    91, 388:    94, 389:    96, 390:   102, 391:    86, 392:    92
393:    97, 394:    92, 395:    94, 396:    82, 397:    84, 398:    90
399:    77, 400:   113, 401:    72, 402:    87, 403:    70, 404:    80
405:    97, 406:    98, 407:    79, 408:    78, 409:    60, 410:    70
411:    70, 412:    67, 413:    66, 414:    70, 415:    55, 416:    68
417:    57, 418:    53, 419:    61, 420:    50, 421:    59, 422:    72
423:    45, 424:    59, 425:    54, 426:    53, 427:    30, 428:    46
429:    52, 430:    47, 431:    47, 432:    41, 433:    30, 434:    45
435:    40, 436:    34, 437:    45, 438:    46, 439:    45, 440:    58
441:    32, 442:    51, 443:    40, 444:    31, 445:    36, 446:    31
447:    41, 448:    41, 449:    37, 450:    69, 451:    33, 452:    37
453:    34, 454:    34, 455:    27, 456:    29, 457:    19, 458:    30
459:    27, 460:    27, 461:    36, 462:    27, 463:    33, 464:    25
465:    31, 466:    27, 467:    17, 468:    22, 469:    23, 470:    29
471:    22, 472:    23, 473:    23, 474:    20, 475:    23, 476:    15
477:    29, 478:    28, 479:    14, 480:    12, 481:    21, 482:    20
483:    19, 484:    19, 485:    23, 486:    16, 487:    23, 488:    13
489:    15, 490:    19, 491:    25, 492:    20, 493:    19, 494:    18
495:    15, 496:    24, 497:    20, 498:   171, 499:    14, 500:   269
501:    12, 502:     6, 503:    11, 504:     8, 505:     9, 506:     3
507:     6, 508:     8, 509:     7, 510:     9, 511:    10, 512:     3
513:     7, 514:     4, 515:    12, 516:     6, 517:     6, 518:     8
519:    11, 520:     2, 521:     5, 522:     5, 523:     7, 524:     7
525:     9, 526:     4, 527:     6, 528:     4, 529:     7, 530:     7
531:     7, 532:     3, 533:     4, 534:     5, 535:     2, 536:     5
537:     3, 538:     1, 539:     5, 540:     6, 541:     4, 542:     6
543:     5, 544:     5, 545:     2, 546:     3, 547:     3, 548:     4
549:     5, 550:     2, 551:     3, 552:     2, 553:     3, 554:     5
555:     2, 557:     3, 558:     6, 559:     1, 560:     4, 561:     3
562:     1, 563:     3, 564:     1, 565:     4, 566:     2, 567:     4
568:     1, 569:     4, 570:     1, 571:     1, 572:     2, 573:     2
574:     2, 576:     5, 577:     1, 578:     4, 579:     1, 580:     6
581:     1, 582:     2, 584:     1, 585:     3, 586:     1, 587:     2
588:     1, 589:     1, 590:     3, 591:     1, 592:     1, 593:     2
594:     2, 595:     3, 597:     2, 599:     1, 600:    56
Sorry for the large code box, but some lengths like 300, 498, 500 and 600 caught my attention. I find it strange that the tail (600) has so many games. Maybe games longer than 600 plies are counted as if they were 600 plies long? Another issue is the 0-move games (six in this PGN file).
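
Unusual lengths like these can be flagged automatically instead of by eye. The Python sketch below compares each count with the median of its neighbours; the window size and the factor of 2 are arbitrary choices of this sketch, and it only flags spikes without explaining them.

```python
from statistics import median

def find_spikes(hist, window=4, factor=2.0):
    """Return lengths whose count exceeds `factor` times the median of nearby lengths."""
    spikes = []
    for length, count in sorted(hist.items()):
        neighbours = [hist[k] for k in range(length - window, length + window + 1)
                      if k != length and k in hist]
        if neighbours and count > factor * median(neighbours):
            spikes.append(length)
    return spikes

# Excerpt of the histogram above (game length -> number of games).
excerpt = {296: 401, 297: 374, 298: 386, 299: 373, 300: 1491,
           301: 357, 302: 344, 303: 364, 304: 380,
           496: 24, 497: 20, 498: 171, 499: 14, 500: 269, 501: 12, 502: 6}
print(find_spikes(excerpt))
```

Run on the full histogram, the same function would list every length that stands out from its surroundings, which may help to see whether the spikes fall on round numbers.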

My tool took around twice the time that Joined took. Some results:

Code:

Searching promotions and underpromotions...

 1/64  a1=Q   11225     Estimated remaining time:  2109.9 seconds.
 2/64  a1=R     590     Estimated remaining time:  2084.8 seconds.
 3/64  a1=B     108     Estimated remaining time:  2050.2 seconds.
 4/64  a1=N     119     Estimated remaining time:  2019.1 seconds.

[...]

61/64  h8=Q   10580     Estimated remaining time:   224.9 seconds
62/64  h8=R     422     Estimated remaining time:   192.8 seconds
63/64  h8=B     159     Estimated remaining time:   160.6 seconds
64/64  h8=N     187     Estimated remaining time:   128.6 seconds

Promotions/underpromotions to:
  Q: 140867
  R:   7468
  B:   2036
  N:   3957

Searching for games and results...

Games:      682244      Estimated remaining time:    96.4 seconds
  1-0:      247769      Estimated remaining time:    64.2 seconds
  0-1:      183011      Estimated remaining time:    32.1 seconds
  1/2-1/2:  251462
  Unknown:       2
______________________________

White score:  54.75%
______________________________

White advantage:    33.08 Elo.
______________________________

Approximated elapsed time:  2184.7 seconds.
My tool parses the 'games and results' section perfectly (its 'unknown' count should be read as the sum of the three groups 'unknown', 'illegal' and 'conflicting' results from Joined).

OTOH only underpromotions to bishop were counted flawlessly this time. My tool never counts more promotions/underpromotions than Joined: I was off by roughly 0.24% for queenings (333 out of 141200), 0.01% for underpromotions to rook (1 out of 7469) and 0.05% for underpromotions to knight (2 out of 3959). Not too bad considering the low level of my amateur tool.

Other interesting statistics copied from the output file:

Code:

a1=*:   12042 ( 16.96%)
b1=*:   10034 ( 14.13%)
c1=*:    8261 ( 11.64%)
d1=*:    7395 ( 10.42%)
e1=*:    7460 ( 10.51%)
f1=*:    8426 ( 11.87%)
g1=*:    7772 ( 10.95%)
h1=*:    9601 ( 13.52%)
 SUM:   70991
 
-----------------------
 
a8=*:   14636 ( 17.56%)
b8=*:   12096 ( 14.51%)
c8=*:    9691 ( 11.63%)
d8=*:    9235 ( 11.08%)
e8=*:    7969 (  9.56%)
f8=*:    9176 ( 11.01%)
g8=*:    9186 ( 11.02%)
h8=*:   11348 ( 13.62%)
 SUM:   83337
 
-----------------------
 
a*=*:   26678 ( 17.29%)
b*=*:   22130 ( 14.34%)
c*=*:   17952 ( 11.63%)
d*=*:   16630 ( 10.78%)
e*=*:   15429 ( 10.00%)
f*=*:   17602 ( 11.41%)
g*=*:   16958 ( 10.99%)
h*=*:   20949 ( 13.57%)
 SUM:  154328
 
=======================
 
*1=*:   70991 ( 46.00%)
*8=*:   83337 ( 54.00%)
 SUM:  154328
 
=======================
 
*1=Q:   64731 ( 91.18%)
*1=R:    3500 (  4.93%)
*1=B:     954 (  1.34%)
*1=N:    1806 (  2.54%)
 SUM:   70991
 
-----------------------
 
*8=Q:   76136 ( 91.36%)
*8=R:    3968 (  4.76%)
*8=B:    1082 (  1.30%)
*8=N:    2151 (  2.58%)
 SUM:   83337
 
-----------------------
 
 *=Q:  140867 ( 91.28%)
 *=R:    7468 (  4.84%)
 *=B:    2036 (  1.32%)
 *=N:    3957 (  2.56%)
 SUM:  154328
 
=======================
 
( 154328 promotions and underpromotions in  682244 games) ~ 0.2262 (promotions and underpromotions)/game.
Of course these numbers must not be taken as exact ones (numbers reported by Joined should be the correct ones), but as a good overall picture.

Regards from Spain.

Ajedrecista.