All that I can add is that if you don't shut up, my head is going to explode, much like it feels after some of Gerd's bit-wizard stuff.

Fritzlein wrote:
For comparison, if I had a bag of 7,213 game results, and I picked out of that bag with equal likelihood of getting each result, then in 100 picks I would have a 50% chance of getting no duplicates (birthday paradox). Therefore the actual result of getting 1 duplicate game in 100 playouts is consistent with there being only 7,213 playouts you will ever get, each of them equally likely. Of course, in practice there will probably be more playouts that you eventually get if you try long enough, because they won't all be equally likely; some will be more likely than others, and that's where you will get your duplicates. I just wanted to throw out a ballpark number to help guide the intuition of how much variation this is.

MartinBryant wrote:
OK, I've re-run the test on Bob's randomly selected Silver position.
[...]
In 100 games there was ONE duplicate game pair
Come to think of it, it's pretty much the same as the ballpark number you threw out there yourself, namely 4,950 (pairs of comparisons)...
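For anyone who wants to check that arithmetic, here is a quick sketch: it just multiplies out the birthday-paradox probability of no duplicate in 100 draws and hunts for the bag size where that probability first reaches 50%, assuming every game in the bag is equally likely. (The 7,213 figure is presumably the usual 100^2 / (2 ln 2) approximation; the exact product lands a little lower, same ballpark.) It also prints the 100*99/2 = 4,950 pairwise comparisons.

Code:

#include <stdio.h>

/* Probability that 'draws' picks from a bag of 'n' equally likely
   games contain no duplicates (classic birthday-paradox product). */
static double prob_no_duplicate(int n, int draws) {
  double p = 1.0;
  for (int i = 1; i < draws; i++)
    p *= (double)(n - i) / (double)n;
  return p;
}

int main(void) {
  int draws = 100;

  /* Find the smallest bag size for which P(no duplicate) >= 0.5. */
  for (int n = draws; n < 100000; n++) {
    if (prob_no_duplicate(n, draws) >= 0.5) {
      printf("bag size for ~50%% chance of no duplicates: %d\n", n);
      break;
    }
  }
  printf("pairwise comparisons among %d games: %d\n",
         draws, draws * (draws - 1) / 2);
  return 0;
}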
Yes indeed, lots of different playouts possible. But if we look at the results of the playouts (win, draw, loss), does the distribution correspond to the distribution for a randomly chosen position? If the average over all positions (or even just over all balanced positions) is 30% white wins, 45% draws, 25% black wins, while the actual playouts of this position (drawn from a pool of 4,950 or 7,213 or whatever) come out 10% white wins, 40% draws, and 50% black wins, then reusing this position is causing strong correlation in scores.

MartinBryant wrote:
There's still a mass (excuse the un-scientific term) of variation.
Blame insufficient clock jitter, blame the engine for not understanding black's strategic weaknesses, blame sunspots if you like, but if the results of playouts from this position are not the same shape as the results of playouts from a random position, then the results are correlated with the position.
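Whether the shape is "the same" or not is easy to put a number on. A minimal sketch, using nothing but the hypothetical 30/45/25 baseline and 10/40/50 split from the example above (not measured data), is a plain chi-square goodness-of-fit test on the win/draw/loss counts:

Code:

#include <stdio.h>

int main(void) {
  /* Hypothetical numbers from the example above: 100 playouts of one
     position versus an assumed all-positions baseline. */
  double observed[3] = {10.0, 40.0, 50.0};   /* white wins, draws, black wins */
  double baseline[3] = {0.30, 0.45, 0.25};   /* assumed average over positions */
  double games = 100.0;

  /* Chi-square goodness-of-fit: sum of (observed - expected)^2 / expected. */
  double chi2 = 0.0;
  for (int i = 0; i < 3; i++) {
    double expected = baseline[i] * games;
    double d = observed[i] - expected;
    chi2 += d * d / expected;
  }
  printf("chi-square = %.1f (2 degrees of freedom; ~5.99 is the 5%% cutoff)\n",
         chi2);
  return 0;
}

For those example numbers it comes out around 39, far past the ~5.99 cutoff for 2 degrees of freedom at the 5% level, i.e. a split like that would be flagged as not the same shape as the baseline.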
One interesting experiment I could try later is to determine just how many different games are possible from a single position. I could just start playing those 1/2 second games, and continue until I play a hundred thousand or so without getting a "new" result. That won't give an exact count of course (strictly it is only a lower bound), but it would be a sort of Monte Carlo approach to get reasonably close to the right number.
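The loop itself is about this simple. This is only a sketch: play_one_game() below is a stand-in that fakes a skewed pool of 7,213 possible games so the thing runs standalone; a real half-second engine playout (hashing the complete move list) would go in its place, and the table size and stopping rule are just illustrative numbers.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>

#define TABLE_SIZE (1 << 22)      /* open-addressing table of game hashes seen */
#define STOP_AFTER 100000         /* stop after this many games with nothing new */
#define POSSIBLE_GAMES 7213       /* size of the fake pool in the stub below */

/* Stand-in for the real playout: simulate a pool of POSSIBLE_GAMES games
   with a skewed distribution (some games much more likely than others). */
static uint64_t play_one_game(void) {
  int a = rand() % POSSIBLE_GAMES;
  int b = rand() % POSSIBLE_GAMES;
  int game = a < b ? a : b;                            /* bias toward low indices */
  return 0x9E3779B97F4A7C15ULL * (uint64_t)(game + 1); /* spread the bits */
}

/* Insert a game hash; return 1 if it was not seen before, 0 if a duplicate. */
static int insert_game(uint64_t *table, uint64_t h) {
  if (h == 0) h = 1;                        /* reserve 0 as the empty-slot marker */
  size_t i = (size_t)(h & (TABLE_SIZE - 1));
  while (table[i] != 0) {
    if (table[i] == h) return 0;            /* already played this exact game */
    i = (i + 1) & (TABLE_SIZE - 1);         /* linear probing */
  }
  table[i] = h;
  return 1;
}

int main(void) {
  uint64_t *table = calloc(TABLE_SIZE, sizeof(uint64_t));
  long distinct = 0, played = 0, since_new = 0;
  srand((unsigned)time(NULL));

  /* Keep playing until STOP_AFTER consecutive games produce nothing new. */
  while (since_new < STOP_AFTER) {
    played++;
    if (insert_game(table, play_one_game())) {
      distinct++;
      since_new = 0;
    } else {
      since_new++;
    }
  }
  printf("%ld games played, %ld distinct games seen\n", played, distinct);
  free(table);
  return 0;
}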
I am somewhat interested in the issue because of the way I measure time. Depending on the system, getting the current time can be an expensive operation. On the old Crays, for example, if you checked the time every 10 nodes or so, the NPS would drop by a factor of 4 or so because of the rather large context-switching time required to get down into the O/S to get that value. So I started using a counter that is set to some fixed value and decremented each time a node is searched. When the counter reaches zero, I check the time.
The only problem is, I was perhaps too clever, as you have to answer the question "how often do you want to check?" and the answer is "often enough to not burn too much extra time by sampling too infrequently." If you play long games, 3 minutes per move, and sample once per second, you can never be more than a second over your target time. But for game in 1 minute, where the searches end up way less than a second each, you would lose on time quickly. So I have a dynamically-adjusted "counter" that is reduced as the target search time is reduced.
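The general shape of the idea is roughly this. A sketch only, not the actual Crafty code: the names are made up, and the "20 clock reads per search, never fewer than one read per 1000 nodes" numbers are purely illustrative stand-ins for the real tuning.

Code:

#include <stdio.h>
#include <time.h>

static long   nodes_per_check;   /* nodes searched between clock reads */
static long   nodes_to_check;    /* countdown until the next clock read */
static double target_time;       /* seconds allotted to this search */
static double start_time;
static int    abort_search;

static double now_seconds(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return ts.tv_sec + ts.tv_nsec / 1e9;
}

static void start_timer(double seconds, double estimated_nps) {
  target_time = seconds;
  start_time = now_seconds();
  /* Aim for ~20 clock reads per search so the overshoot is bounded by
     roughly 5% of the target, but never go below one read per 1000 nodes. */
  double n = estimated_nps * seconds / 20.0;
  nodes_per_check = n < 1000.0 ? 1000 : (long)n;
  nodes_to_check = nodes_per_check;
  abort_search = 0;
}

/* Called once per node searched. */
static void time_check(void) {
  if (--nodes_to_check <= 0) {
    nodes_to_check = nodes_per_check;
    if (now_seconds() - start_time >= target_time)
      abort_search = 1;
  }
}

int main(void) {
  long nodes = 0;
  start_timer(0.5, 1.0e6);         /* a 1/2-second search at ~1M nodes/sec */
  while (!abort_search) {          /* stand-in for the real search loop */
    nodes++;
    time_check();
  }
  printf("stopped after %ld nodes, %.3f seconds\n",
         nodes, now_seconds() - start_time);
  return 0;
}

The point of scaling nodes_per_check to the target time is exactly the trade-off above: a fixed, large interval is fine at 3 minutes per move but loses on time at game in 1 minute, while a fixed, tiny interval burns NPS on clock reads.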
What all that means is that in longish games, where the sample interval is one second or longer, I would probably see only two different node counts on the same position: one where I searched exactly N nodes and got a time of T-.01 (10 milliseconds short of the actual target time), so I go for another time step and burn more nodes; and a second where, reaching the same position, I sample after N nodes, get T+.01, and stop. Clock jitter is not going to be measured in seconds except perhaps on supercomputers. But as the games get shorter and shorter, 10 ms might be 10K nodes at 1M nodes per second, or on my 8-core, 200K nodes at 20M nodes per second. So if the clock jitter is no more than 10 ms (it is probably more than that on a PC), and my sample interval is small enough that I take several samples within a "jitter interval", then any of those node counts could be the "final value"...
I did run a test with this interval set to 1, so that I checked the time after every node, and the variability did not change significantly or go away. Enough games would probably show how this sample counter affects the number of different games, but it would take a really large number of games to get a real handle on this.
BTW, it has been a breath of fresh air to see real discussions all of a sudden, rather than the old "stampee feet" stuff that has happened each time I brought this up in the past.
I think the realization that this is actually happening, that it is a real problem, and that it affects results is slowly creeping in as more and more people are beginning to test and join in...