Hello Axel:
I find this topic very interesting. More than a year ago I wrote a clumsy parser (link here) in an attempt to count the frequency of promotions and underpromotions by file (a, b, ..., g, h). The results of that thread are a little off, since my programming skills are very poor and that parser could not correctly read more than one promotion per line... but promotions are somewhat uncommon, so these results give a reasonable idea of what is happening... without expecting exact results, of course!
I posted the first version of the code in the original post, but I later added more code until I got a programme that is easy to use (it is not public, but I have no problem with making it public). I analyzed some huge PGNs from CCRL and CEGT and reached the conclusion that wing pawns promote more frequently than central pawns. I am aware that my parser only reads the promotion squares: for example, a white pawn that promotes on a8 did not necessarily start on a2. Even knowing that, I found the V-shaped graph of promotion frequency by file curious. I obtained more promotions on a1/a8 than on h1/h8, probably due to the higher frequency of O-O compared with O-O-O.
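As an illustration of the kind of counting described above, here is a minimal Python sketch (it is not the original parser). It assumes the movetext is in standard SAN with promotions written like `a8=Q` or `bxa1=N+`; the names `PROMOTION` and `count_promotions` are made up for this example. Because it scans for every match, it is not limited to one promotion per line. It does not skip PGN comments or variations, so exact counts would need a real PGN reader.

```python
import re
from collections import Counter

# Matches a SAN promotion such as "a8=Q", "bxa1=N+" or "gxh8=R".
# Group 1 is the file of the promotion square, group 2 the promoted piece.
# Assumption: the PGN writes promotions with "=", as the PGN standard does.
PROMOTION = re.compile(r"(?:[a-h]x)?([a-h])[18]=([QRBN])")

def count_promotions(movetext):
    """Count promotions by file and by promoted piece in a movetext string."""
    by_file = Counter()
    by_piece = Counter()
    # findall returns every non-overlapping match, so several
    # promotions on the same line are all counted.
    for file_letter, piece in PROMOTION.findall(movetext):
        by_file[file_letter] += 1
        by_piece[piece] += 1
    return by_file, by_piece

if __name__ == "__main__":
    sample = "55. a8=Q bxa1=N+ 56. Kh2 h1=Q+ 57. gxh8=R"
    files, pieces = count_promotions(sample)
    for f in "abcdefgh":
        print(f, files[f])
    print(dict(pieces))
```

Like the original tool, this only records the square where the pawn promotes, not the file where it started.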
I know that chances of survival are not the same thing as promotions, but they could be weakly correlated, and I wanted to note it. I have not downloaded Million Base 2.2 from Ed's website because it is an .exe instead of a PGN compressed with ZIP, RAR or 7z. Otherwise, I would use my tool to provide some results.
Last but not least, thank you very much to the programmer who did this task.
Regards from Spain.
Ajedrecista.