Hi Thomas, for supplementary data for Science (and any other reputable journal), we had to select what to include based on what we (and reviewers) thought was scientifically valuable and relevant to our claims in the paper. That's why only data supporting our main claims have been included. Unfortunately it's above my pay grade to decide to release data that hasn't been released, but hopefully Lc0 will catch up soon since we have at this point released more or less all relevant details of our algorithms (and I am happy to continue clarifying any points of confusion in our pseudo-code), and then there will be a publicly-accessible way to generate those data!

Thomas A. Anderson wrote: Thu Dec 13, 2018 2:47 pm
Matthew, as I see from our conversation (and even more in other threads/posts/forums), there is a lot of uncertainty/speculation around that could best be avoided/answered by having the games from all tests at hand (especially the SF9/BF/opening-position tests). Do you think there is any chance to get them?

matthewlai wrote: Wed Dec 12, 2018 6:40 pm
That is a very reasonable explanation, too. We do find that SF and AZ win and lose for very different reasons (AZ often loses to crazy and amazing tactics SF finds, for example, that AZ just doesn't have enough NPS to see, while still being able to search deep), so although the strengths are in the same ballpark, there are certainly positions where one does much better than the other, and vice versa.

Thomas A. Anderson wrote: Wed Dec 12, 2018 4:53 pm
Sounds like a very reasonable explanation. But "being good" seems to be an attribute of a position that appears to be much more subjective than I thought. BF plays the book-resulting positions successfully against a non-book SF; that is its purpose, the reason why it exists. This superiority is, as far as I know, confirmed by any match against the "usual suspects", meaning the crowd of AB engines. Now it seems that this is reversed when using the book against AZ.
Of course, you can build books specifically against certain opponents: SF handles KID positions better than Engine A but plays them worse than Engine B. Therefore a book that forces SF into the KID might work well against Engine A but fail against Engine B. But we are talking about the starting position and a complete book, which certainly wasn't tested against only a few narrow opening lines, because the AZ used was playing with diversity activated. How big was the diversity of those games?
It's surprisingly difficult to quantify diversity. While it's obvious that if two games are exactly the same there is a lack of diversity, once we go beyond that it's very difficult to quantify, and we don't usually get identical games. For example, there are transpositions, or games that are substantially similar except for a few irrelevant pieces in different places, etc. We didn't look too much into this because there are just too many possibilities and it's not part of the main results. We really only did it because people said they wanted to see it, but I don't think there's really much scientific value.
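To make the difficulty concrete, here is a minimal sketch of two naive diversity measures: the fraction of distinct games, and the average number of plies two games share before they first diverge. This is not what was done for the paper; the function names and the toy move lists are made up for illustration. It also treats transpositions as different games, which is exactly the weakness described above.

```python
from itertools import combinations


def common_prefix_len(a, b):
    """Number of leading plies on which two move lists agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def diversity_stats(games):
    """Naive diversity measures for a list of games (each a list of SAN moves).

    Ignores transpositions: two games reaching the same position by a
    different move order count as fully different after the first mismatch.
    """
    unique = len({tuple(g) for g in games})
    pairs = list(combinations(games, 2))
    mean_prefix = (
        sum(common_prefix_len(a, b) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return {
        "unique_fraction": unique / len(games),
        "mean_shared_plies": mean_prefix,
    }


# Toy data, purely illustrative (not from any real match).
games = [
    ["e4", "e5", "Nf3", "Nc6"],
    ["e4", "e5", "Nf3", "Nf6"],
    ["d4", "d5", "c4", "e6"],
]
stats = diversity_stats(games)
print(stats)
```

A high unique fraction with a long shared prefix would suggest games that differ only late; handling transpositions would need position hashing rather than move-list comparison.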
Yeah, many moves at move 1 (and the next few moves) have very similar values. It's possible that with diversity it's just taking SF out of book earlier or something like that. This is pure speculation.

I would assume that AZ was playing different moves from move 1 onwards, because there should be some of them within the 1% range already. We would need the games to settle the question, but my gut feeling is that there is something hidden here that we can learn a lot from. The most "zero-ish" opening book we have shifts the match score of SF playing the white pieces against AZ playing black from roughly 1-95-4% to roughly 9-73-18% (both are rough values derived from the published graphs; format: SF wins-draws-AZ wins). Another interesting fact: the BF book works well for SF when it plays the black pieces and fails only as white. This evens out and leads to the statement in the paper that using the opening book didn't have a significant impact on the total match score.
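As a rough worked example of what those two distributions mean in points, the sketch below converts a win-draw-loss split into a score fraction and an Elo difference using the standard logistic formula. The input numbers are only the approximate values read off the graphs as quoted above, and the helper names are invented for this illustration.

```python
import math


def score_fraction(wins, draws, losses):
    """Score as a fraction of available points (win = 1, draw = 0.5)."""
    total = wins + draws + losses
    return (wins + 0.5 * draws) / total


def elo_diff(score):
    """Elo difference implied by a score fraction (standard logistic model)."""
    return -400 * math.log10(1 / score - 1)


# SF as White vs AZ as Black, rough percentages (SF wins, draws, AZ wins).
no_book = score_fraction(1, 95, 4)
with_book = score_fraction(9, 73, 18)

print(f"no book:   {no_book:.3f} ({elo_diff(no_book):+.1f} Elo)")
print(f"with book: {with_book:.3f} ({elo_diff(with_book):+.1f} Elo)")
```

With these rough inputs SF's score as White goes from about 48.5% to about 45.5%: it wins more games with the book, but loses even more, so the net effect for White is negative.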
Does SF do better against Lc0 with the Brainfish opening book (using whatever setting people think is optimal) at long time control, with diversity ensured with TCEC openings for example? I think the result of that would answer most of the questions here. At short time control I am pretty sure the opening book will help, but at long time control I am much less certain.