New formula as suggested by Don is up, see: http://www.top-5000.nl/motest2.htm
First results at: http://www.top-5000.nl/moresu2.htm
I took the liberty to use Don's and Peter's posted numbers. If they are not good please fill in the form or lemme know in another way.
-------
We can do a third and last test later that deals with nodes that get no A/B at all.
Move ordering test (2)
Moderators: hgm, Rebel, chrisw
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Move ordering test (2)
If we are going to study this, I would suggest a change to the methodology:Rebel wrote:New formula as suggested by Don is up, see: http://www.top-5000.nl/motest2.htm
First results at: http://www.top-5000.nl/moresu2.htm
I took the liberty to use Don's and Peter's posted numbers. If they are not good please fill in the form or lemme know in another way.
-------
We can do a third and last test later that deals with nodes that get no A/B at all.
(1) fail_high += move_number * move_number;
This tries to differentiate between the case where a program fails high on the 2nd move twice, and the fourth once. In the current test, there is little to differentiate between the two cases, by squaring the number, failing high on the 4th move is way worse...
I suppose that if you want to, you could then compute the average and take the square root of that to approximate the "average" position. The raw number would be just as informative without the sort().
-
- Posts: 6995
- Joined: Thu Aug 18, 2011 12:04 pm
Re: Move ordering test (2)
The move_count * move_count makes sense.bob wrote:If we are going to study this, I would suggest a change to the methodology:Rebel wrote:New formula as suggested by Don is up, see: http://www.top-5000.nl/motest2.htm
First results at: http://www.top-5000.nl/moresu2.htm
I took the liberty to use Don's and Peter's posted numbers. If they are not good please fill in the form or lemme know in another way.
-------
We can do a third and last test later that deals with nodes that get no A/B at all.
(1) fail_high += move_number * move_number;
This tries to differentiate between the case where a program fails high on the 2nd move twice, and the fourth once. In the current test, there is little to differentiate between the two cases, by squaring the number, failing high on the 4th move is way worse...
I suppose that if you want to, you could then compute the average and take the square root of that to approximate the "average" position. The raw number would be just as informative without the sort().
There are a couple of more lose ends,
1. If you (like me) have the factor updated every second on the screen you will notice a pattern that after a fail-high at the root during the research and the wider window that comes with it the factor will increase rapidly because many new variants (not in TT) are searched.
2. The same counts for a new iteration searching the best move at the new depth, the factor goes up. While searching the rest of the moves (provided no new best move is found), the factor rapidly goes down.
Meaning, that the 30 seconds limit is highly unreliable, if after 30 seconds you are in the middle of several fail-high's than just 10 positions can influence the overall performance in a negative way. I can only see one good solution, increase the number of positions.
3. There has to be made a decision about the nodes where no A/B is possible. My suggestion using your example code:
Code: Select all
if ( move_number <= 5 ) fail_high += move_number * move_number;
And at the end of the move list:
Code: Select all
fail_high += 36;
-
- Posts: 6995
- Joined: Thu Aug 18, 2011 12:04 pm
Re: Move ordering test (2)
8 entries received so for. Results start to make more sense.
-
- Posts: 454
- Joined: Sat Apr 04, 2009 6:44 pm
- Location: Bulgaria
Re: Move ordering test (2)
Hi Ed! I apologize if this is a stupid question, or already explained, but should we use the search in it's current state, or we should eliminate everything that could cause more noise to the result. I'll try to explain...
If an engine have a good move ordering scheme in general, but there's a bug somewhere (IID, FP, e.t.c.) this would raise a rather different result than if all the search "add-ons" have been disabled during the measurement.
I understand that the move-picking is influenced by the entire search as it is, but still...
Again sorry if I misunderstood the concept, but I think is a plausible question.
If an engine have a good move ordering scheme in general, but there's a bug somewhere (IID, FP, e.t.c.) this would raise a rather different result than if all the search "add-ons" have been disabled during the measurement.
I understand that the move-picking is influenced by the entire search as it is, but still...
Again sorry if I misunderstood the concept, but I think is a plausible question.
-
- Posts: 6995
- Joined: Thu Aug 18, 2011 12:04 pm
Re: Move ordering test (2)
Hi Mincho,Mincho Georgiev wrote:Hi Ed! I apologize if this is a stupid question, or already explained, but should we use the search in it's current state, or we should eliminate everything that could cause more noise to the result. I'll try to explain...
If an engine have a good move ordering scheme in general, but there's a bug somewhere (IID, FP, e.t.c.) this would raise a rather different result than if all the search "add-ons" have been disabled during the measurement.
I understand that the move-picking is influenced by the entire search as it is, but still...
Again sorry if I misunderstood the concept, but I think is a plausible question.
In an almost perfect world such as the early days of CC when most chess programs were pure brute-force searchers with no reductions, pruning and only check extensions (hence the "almost") this test we are doing now would be conclusive.
So you are right, nowadays a conclusive result is impossible. It doesn't mean there is no value in what we are currently doing. Hope this helps.
-
- Posts: 454
- Joined: Sat Apr 04, 2009 6:44 pm
- Location: Bulgaria
Re: Move ordering test (2)
Yes, it does help and it makes sense, the search - as it is...
Probably though, someone would be interested to see his results in bare-bones search, compared to his search - as it is currently.
I would for one.
And finally, should we use the counters at the root as well, I supposed so, but tell me if we shouldn't.
Probably though, someone would be interested to see his results in bare-bones search, compared to his search - as it is currently.
I would for one.
And finally, should we use the counters at the root as well, I supposed so, but tell me if we shouldn't.
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Move ordering test (2)
Counting the square of the move number seems exactly the opposite of what is needed, as late moves are often reduced, and thus hardly count. If per node on average only the first 6 moves are not reduced, having all cutoffs on the 6th move is much worse than having half at the 1st, and half at the 11th ((1+6.5)/2 = 3.75, if we count a reduced move for 0.1). With the square counting you would get sqrt((1 + 121)/2) = sqrt(61) ~ 8.
Anyway, Fairy-Max scores ~2.5 - ~3.5. I guess its main problem is that Pawn double pushes (and castlings) cannot become hash move, as it scores especially bad when there are many 2nd-rank Pawns. That it doesn't have killers of course also doesn't help. This is not counting earlier IID iterations, which score of course much worse (but take very little time compared to the last one).
Anyway, Fairy-Max scores ~2.5 - ~3.5. I guess its main problem is that Pawn double pushes (and castlings) cannot become hash move, as it scores especially bad when there are many 2nd-rank Pawns. That it doesn't have killers of course also doesn't help. This is not counting earlier IID iterations, which score of course much worse (but take very little time compared to the last one).
Last edited by hgm on Fri May 31, 2013 3:07 pm, edited 1 time in total.
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Move ordering test (2)
We should also account for the depth. Ordering badly at depth=1 is not a disaster, but at depth 10... And there are much more nodes at depth 1 thant nodes at depth 10 due to the "exponential shape" of the tree. So we need some kind of depth rescaling here.bob wrote: (1) fail_high += move_number * move_number;
Anyway, I think this move ordering contest doesn't make much sense, because we are comparing apples and pears, with engines that do a lot of things differently that will pollute these results. For example if I make move count pruning more aggressive, I will improve my score at Ed's "move ordering contest", but in reality I will make my engine weaker.
That's why I am not very interested to participate to this contest. The only contest that makes any real sense is an ELO contest
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Move ordering test (2)
Well, this was inherent with adopting a defenition of move-ordering quality that does not correlate (or perhaps even anti-correlates) with search efficiency...