hgm wrote: I have completely lost track of what you are doing. Which is OK, because I am only interested in comparing the outcome to my own results. But the standard deviations and errors you quote, how many games did you have to play to obtain those?
I am not sure what you want to compare. What I am doing is comparing the effect of splitting using the different formulas available to me. AFAIK you expand to a fixed depth uniformly, so that will be "uniform" in my tests. All tests are done with a fixed number of simulations, so the actual number doesn't matter. Anyway, I have mentioned it many times already: 100 million.
Perft(13) betting pool
Moderators: hgm, Rebel, chrisw
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
And here is the missing Log-tree size result with the free-splitting versions. I think it has come in fourth place (i.e. after the new number one is added).
And the detailed result:
Code:
Mean 6.286348E+16
StdDev 4.966971E+12
Error 8.583E+12
Code:
Log-tree size
0 h2h3 1.81E+15 +- 6.93E+11 1.81E+15 2939466 33797 4.80E+23
0 h2h4 2.61E+15 +- 8.55E+11 2.62E+15 3660969 58215 7.30E+23
0 g2g3 2.49E+15 +- 8.18E+11 2.50E+15 3566734 54204 6.68E+23
0 g2g4 2.21E+15 +- 7.32E+11 2.22E+15 3335250 42521 5.36E+23
0 f2f3 1.55E+15 +- 6.03E+11 1.55E+15 2636787 20806 3.63E+23
0 f2f4 2.04E+15 +- 7.52E+11 2.05E+15 3179819 39369 5.66E+23
0 e2e3 7.10E+15 +- 2.17E+12 7.16E+15 5629451 141422 4.70E+24
0 e2e4 7.20E+15 +- 2.35E+12 7.27E+15 5657350 141739 5.52E+24
0 d2d3 4.57E+15 +- 1.17E+12 4.59E+15 4763097 84516 1.37E+24
0 d2d4 6.28E+15 +- 1.65E+12 6.33E+15 5388945 125126 2.73E+24
0 c2c3 2.74E+15 +- 8.93E+11 2.75E+15 3756436 67732 7.97E+23
0 c2c4 3.10E+15 +- 1.00E+12 3.12E+15 4001560 72858 1.00E+24
0 b2b3 2.40E+15 +- 8.08E+11 2.41E+15 3494193 51282 6.52E+23
0 b2b4 2.40E+15 +- 8.00E+11 2.41E+15 3499493 52710 6.39E+23
0 a2a3 1.82E+15 +- 6.99E+11 1.83E+15 2951167 34338 4.89E+23
0 a2a4 2.56E+15 +- 8.42E+11 2.57E+15 3623677 57003 7.10E+23
0 g1h3 2.12E+15 +- 7.44E+11 2.13E+15 3257143 42625 5.54E+23
0 g1f3 2.69E+15 +- 8.31E+11 2.70E+15 3723154 53329 6.91E+23
0 b1c3 2.72E+15 +- 9.35E+11 2.73E+15 3741753 59205 8.75E+23
0 b1a3 2.09E+15 +- 7.69E+11 2.10E+15 3226556 42175 5.91E+23
6.252344E+16 6.286348E+16 76033000 1274972 2.467080E+25
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Yet another piece of perft(13)
Yet another piece of perft(13), the first draft-11 result from the current run:
[D]rnbqkbnr/1ppppppp/p7/8/8/N7/PPPPPPPP/R1BQKBNR w KQkq - 0 2
perft(11): 1,865,423,942,798,492
One down, 399 to go.
-
- Posts: 27795
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Daniel Shawul wrote: All tests are done with a fixed number of simulations, so the actual number doesn't matter. Anyway, I have mentioned it many times already: 100 million.
So this would be comparable (in effort) to when I start from a 6-ply full-width expansion of the opening position, which is about 119M paths? The errors you quote all seem larger, which made me wonder.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
hgm wrote: So this would be comparable (in effort) to when I start from a 6-ply full-width expansion of the opening position, which is about 119M paths? The errors you quote all seem larger, which made me wonder.
No. Your MC part is heavy. Peter's (a.k.a. pure Monte Carlo) simulation considers only one move, so it is not comparable in effort. Also, I am not sure how you do your tests, but do you visit the perft(6) nodes only once to get your result? And the maximum depth of expansion for the uniform selector is 5, not 6 (i.e. the maximum depth of nodes I store in memory). See the second-to-last column, which prints the number of nodes stored for each move.
Edit: I see that with the expansion method a certain number of simulations is required at a node before it splits. So the equivalent would be to visit each perft(5) node 100,000,000 / perft(5) ~= 20.55 times, and that with a simple MC should be equivalent in effort.
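For concreteness, that arithmetic checks out (perft(5) = 4,865,609 is the standard value from the opening position):

```python
# Visits per perft(5) frontier node when the total budget is
# 100 million single-path simulations.
PERFT_5 = 4_865_609                      # known perft(5) value
visits_per_node = 100_000_000 / PERFT_5
print(round(visits_per_node, 2))         # 20.55
```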
Anyway, you are missing the whole point. The MC part is a black box to me, and I am only comparing different ways of splitting. Your version of MC does more work, but that doesn't mean it is better.
(a) Pure MC gave better results (lower variance) per _simulation time_, or so I read, and it is also easier to implement.
(b) If you indeed have a better MC simulator, I can use that, and I should be able to get better results with one of the non-uniform splitters, because the uniform method, as you do it now, is consistently performing worst.
(c) I have tested doing 2 Monte Carlo simulations instead of 1 for the leaf nodes, and it gives lower variance, as expected. Obviously, the more work you do in the MC part, the lower the variance.
Given all of the above, it is good to take the MC part out of the equation, unless there is some interaction with the upper part.
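For reference, the light "one move per ply" simulation being discussed is essentially Knuth's random-walk estimator: walk one random path and multiply the branching factors along it. A minimal sketch on an abstract tree; the `children` callback is a stand-in for a real move generator, not anyone's actual code:

```python
import random

def walk_estimate(children, root, depth):
    """One Knuth-style random walk: the product of branching factors
    along a single random path is an unbiased estimate of the number
    of nodes at the given depth (i.e. perft(depth) for a game tree)."""
    node, estimate = root, 1
    for _ in range(depth):
        moves = children(node)
        if not moves:            # the path dies: contributes 0
            return 0
        estimate *= len(moves)   # one factor of the product
        node = random.choice(moves)
    return estimate

def mc_perft(children, root, depth, n_sims):
    """Average many independent walks; variance shrinks as 1/n_sims."""
    return sum(walk_estimate(children, root, depth)
               for _ in range(n_sims)) / n_sims
```

On a perfectly uniform tree every walk returns the exact count; on a real game tree only the average converges, which is why the thread compares variance per simulation.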
-
- Posts: 27795
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Well, there is obviously more than one point I don't get. Like what you mean by "MC part" or "heavy". Also, with 1 move you cannot get perft(12) in any method. That needs at least 12 moves, and the standard deviation would be huge (close to 100%, or even larger) in that case.
When I am talking about 119M paths I do indeed mean I visit every perft(6) leaf exactly once. But in fact the number of paths that make it all the way to 11 ply is only 62M (because the acceptance probability 1/32 is smaller than the inverse average branching factor). On the 12th ply I don't do any move at all, of course, but I count (and thus have to generate) moves in the 11-ply nodes. But I guess Peter's method has to do that too, to get the factor for that ply.
So it seems my effort is somewhere between 62M and 119M 12-ply games (the initial 5 plies being highly combined, and thus not really contributing to the effort). Why is that more effort than your generation of 100M games?
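My reading of the scheme described above is: every move is independently accepted with probability p = 1/32 and its subtree reweighted by 1/p, so a node can spawn zero accepted continuations or several. A sketch under that assumption; the `children` callback and the recursion shape are illustrative, not the actual implementation:

```python
import random

def sampled_perft(children, node, depth, p=1/32):
    """Each child is accepted with probability p and its subtree count
    is weighted by 1/p, so every child's expected contribution equals
    its true count: the estimator is unbiased. With p below the inverse
    branching factor, most walks die out before reaching full depth."""
    if depth == 0:
        return 1.0
    total = 0.0
    for child in children(node):
        if random.random() < p:                      # accept this move?
            total += sampled_perft(children, child, depth - 1, p) / p
    return total
```

With p = 1 this degenerates into exact full-width enumeration, which gives a convenient sanity check.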
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
hgm wrote: Well, there is obviously more than one point I don't get. Like what you mean by "MC part" or "heavy". Also, with 1 move you cannot get perft(12) in any method. That needs at least 12 moves, and the standard deviation would be huge (close to 100%, or even larger) in that case.
The MC part is the "selective" part in your case. Most of us consider only one move, so that makes it more like a conventional Monte Carlo. OTOH you prune many moves but can still have two or more moves to investigate at a ply; at least that is how I understood it. I don't pretend to know everything about your method, so do care to explain if I misunderstood. That makes it heavy, because it investigates more than one move at a ply, and it is not surprising if it gives lower variance. The point is whether it can perform better within a limited time, because that is the only comparison that makes sense in this case. From what I gathered, Peter's improvement is that considering only one move gave a lower variance for a fixed time.
hgm wrote: So it seems my effort is somewhere between 62M and 119M 12-ply games (the initial 5 plies being highly combined, and thus not really contributing to the effort). Why is that more effort than your generation of 100M games?
Well, the point is that your visits use a heavier selective part than the rest of us do. If I raise the value of N in your method (Peter decreased it to 1), then obviously I will get lower variance per visit. But the test will be meaningless if it is still based on reduction per visit, because you do more work per visit with higher N. Also, your old test results took longer when you started from a larger depth: starting from depth=3 took 54 sec, and from depth=6 took 1392 sec (23 min). Sure, the latter will give lower variance, but that is mainly due to the higher number of simulations. That is why I insisted on an equal number of simulations, whichever depth you start selectivity from. For instance, it would be interesting to compare the variance for the above two tests by visiting the depth=3 tree (perft(6) / perft(3)) times more often; both tests should then take a roughly equal amount of time.
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
Daniel Shawul wrote: But the test will be meaningless if it is still based on reduction per visit, because you do more work per visit with higher N.
As I have been saying all the time: the only correct measure to compare methods is
work*variance
If you halve the variance for twice the work you have gained nothing since you could just as well repeat your original method twice.
The question is: how to measure work? Ideally it should be CPU time, but that depends on the particular computer architecture being used. So an approximation would be "nodes visited" or even "leaf nodes visited". But this is also not ideal, since it does not take into account the overhead associated with calling the move generator (less if you visit more children per node) or cache access.
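The point about repeating the cheap method can be made concrete with the work*variance figure of merit (the numbers below are made up for illustration):

```python
# Figure of merit: work * variance (lower is better). Averaging n
# independent runs divides the variance by n but multiplies the work
# by n, so the product is invariant under simple repetition.
def merit(work_per_sample, variance_per_sample):
    return work_per_sample * variance_per_sample

light = merit(1.0, 4.0)   # one cheap simulation
heavy = merit(2.0, 2.0)   # double the work, half the variance
# light == heavy: the heavier simulation buys nothing over
# simply averaging two cheap ones
```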
-
- Posts: 27795
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Daniel Shawul wrote: OTOH you prune many moves but can still have two or more moves to investigate at a ply. At least that is how I understood it. I don't pretend to know everything about your method, so do care to explain if I misunderstood. That makes it heavy, because it investigates more than one move at a ply.
You overlook the fact that I can also have zero moves per node, and that that probability is larger than that of having two or more. On average I search less than one move per node, because my acceptance probability is 1/32, and there are on average fewer than 32 moves per node. So I guess that in your terminology my method is actually lighter than all the others.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
Michel wrote: As I have been saying all the time: the only correct measure to compare methods is work*variance. If you halve the variance for twice the work you have gained nothing, since you could just as well repeat your original method twice. The question is: how to measure work? Ideally it should be CPU time, but that depends on the particular computer architecture being used. So an approximation would be "nodes visited" or even "leaf nodes visited". But this is also not ideal, since it does not take into account the overhead associated with calling the move generator (less if you visit more children per node) or cache access.
Agreed. I think with the same type of algorithm, fixing the number of simulations is a good way. It is similar to fixed-nodes tests for engine-engine matches. If the algorithms used for the MC simulations are different, the comparison should be made by time instead. For that we have to use the _same_ program (Java and C++ perfts can't be compared). For example, no one has yet compared the gain from starting the MC simulation from depth=3 instead of depth=0 with the same number of simulations. [A hunch] I suspect the gain could be small for a uniform full-width and uniform MC selection, that is, if the tree is more or less uniform. [/hunch]