Move ordering test (2)

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
User avatar
Rebel
Posts: 4156
Joined: Thu Aug 18, 2011 10:04 am

Move ordering test (2)

Post by Rebel » Thu May 30, 2013 9:15 pm

New formula as suggested by Don is up, see: http://www.top-5000.nl/motest2.htm

First results at: http://www.top-5000.nl/moresu2.htm

I took the liberty to use Don's and Peter's posted numbers. If they are not good please fill in the form or lemme know in another way.

-------

We can do a third and last test later that deals with nodes that get no A/B at all.

bob
Posts: 20346
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Move ordering test (2)

Post by bob » Thu May 30, 2013 11:28 pm

Rebel wrote:New formula as suggested by Don is up, see: http://www.top-5000.nl/motest2.htm

First results at: http://www.top-5000.nl/moresu2.htm

I took the liberty to use Don's and Peter's posted numbers. If they are not good please fill in the form or lemme know in another way.

-------

We can do a third and last test later that deals with nodes that get no A/B at all.
If we are going to study this, I would suggest a change to the methodology:

(1) fail_high += move_number * move_number;

This tries to differentiate between the case where a program fails high on the 2nd move twice, and the fourth once. In the current test, there is little to differentiate between the two cases, by squaring the number, failing high on the 4th move is way worse...

I suppose that if you want to, you could then compute the average and take the square root of that to approximate the "average" position. The raw number would be just as informative without the sort().

User avatar
Rebel
Posts: 4156
Joined: Thu Aug 18, 2011 10:04 am

Re: Move ordering test (2)

Post by Rebel » Fri May 31, 2013 10:48 am

bob wrote:
Rebel wrote:New formula as suggested by Don is up, see: http://www.top-5000.nl/motest2.htm

First results at: http://www.top-5000.nl/moresu2.htm

I took the liberty to use Don's and Peter's posted numbers. If they are not good please fill in the form or lemme know in another way.

-------

We can do a third and last test later that deals with nodes that get no A/B at all.
If we are going to study this, I would suggest a change to the methodology:

(1) fail_high += move_number * move_number;

This tries to differentiate between the case where a program fails high on the 2nd move twice, and the fourth once. In the current test, there is little to differentiate between the two cases, by squaring the number, failing high on the 4th move is way worse...

I suppose that if you want to, you could then compute the average and take the square root of that to approximate the "average" position. The raw number would be just as informative without the sort().
The move_count * move_count makes sense.

There are a couple of more lose ends,

1. If you (like me) have the factor updated every second on the screen you will notice a pattern that after a fail-high at the root during the research and the wider window that comes with it the factor will increase rapidly because many new variants (not in TT) are searched.

2. The same counts for a new iteration searching the best move at the new depth, the factor goes up. While searching the rest of the moves (provided no new best move is found), the factor rapidly goes down.

Meaning, that the 30 seconds limit is highly unreliable, if after 30 seconds you are in the middle of several fail-high's than just 10 positions can influence the overall performance in a negative way. I can only see one good solution, increase the number of positions.

3. There has to be made a decision about the nodes where no A/B is possible. My suggestion using your example code:

Code: Select all

 if ( move_number <= 5 ) fail_high += move_number * move_number;
So first 5 moves only.

And at the end of the move list:

Code: Select all

  fail_high += 36;

User avatar
Rebel
Posts: 4156
Joined: Thu Aug 18, 2011 10:04 am

Re: Move ordering test (2)

Post by Rebel » Fri May 31, 2013 11:31 am

8 entries received so for. Results start to make more sense.

Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 4:44 pm
Location: Bulgaria
Contact:

Re: Move ordering test (2)

Post by Mincho Georgiev » Fri May 31, 2013 11:42 am

Hi Ed! I apologize if this is a stupid question, or already explained, but should we use the search in it's current state, or we should eliminate everything that could cause more noise to the result. I'll try to explain...
If an engine have a good move ordering scheme in general, but there's a bug somewhere (IID, FP, e.t.c.) this would raise a rather different result than if all the search "add-ons" have been disabled during the measurement.
I understand that the move-picking is influenced by the entire search as it is, but still...
Again sorry if I misunderstood the concept, but I think is a plausible question.

User avatar
Rebel
Posts: 4156
Joined: Thu Aug 18, 2011 10:04 am

Re: Move ordering test (2)

Post by Rebel » Fri May 31, 2013 12:25 pm

Mincho Georgiev wrote:Hi Ed! I apologize if this is a stupid question, or already explained, but should we use the search in it's current state, or we should eliminate everything that could cause more noise to the result. I'll try to explain...
If an engine have a good move ordering scheme in general, but there's a bug somewhere (IID, FP, e.t.c.) this would raise a rather different result than if all the search "add-ons" have been disabled during the measurement.
I understand that the move-picking is influenced by the entire search as it is, but still...
Again sorry if I misunderstood the concept, but I think is a plausible question.
Hi Mincho,

In an almost perfect world such as the early days of CC when most chess programs were pure brute-force searchers with no reductions, pruning and only check extensions (hence the "almost") this test we are doing now would be conclusive.

So you are right, nowadays a conclusive result is impossible. It doesn't mean there is no value in what we are currently doing. Hope this helps.

Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 4:44 pm
Location: Bulgaria
Contact:

Re: Move ordering test (2)

Post by Mincho Georgiev » Fri May 31, 2013 12:58 pm

Yes, it does help and it makes sense, the search - as it is...
Probably though, someone would be interested to see his results in bare-bones search, compared to his search - as it is currently.
I would for one.
And finally, should we use the counters at the root as well, I supposed so, but tell me if we shouldn't.

User avatar
hgm
Posts: 22319
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: Move ordering test (2)

Post by hgm » Fri May 31, 2013 1:05 pm

Counting the square of the move number seems exactly the opposite of what is needed, as late moves are often reduced, and thus hardly count. If per node on average only the first 6 moves are not reduced, having all cutoffs on the 6th move is much worse than having half at the 1st, and half at the 11th ((1+6.5)/2 = 3.75, if we count a reduced move for 0.1). With the square counting you would get sqrt((1 + 121)/2) = sqrt(61) ~ 8.

Anyway, Fairy-Max scores ~2.5 - ~3.5. I guess its main problem is that Pawn double pushes (and castlings) cannot become hash move, as it scores especially bad when there are many 2nd-rank Pawns. That it doesn't have killers of course also doesn't help. This is not counting earlier IID iterations, which score of course much worse (but take very little time compared to the last one).
Last edited by hgm on Fri May 31, 2013 1:07 pm, edited 1 time in total.

User avatar
lucasart
Posts: 2957
Joined: Mon May 31, 2010 11:29 am
Contact:

Re: Move ordering test (2)

Post by lucasart » Fri May 31, 2013 1:06 pm

bob wrote: (1) fail_high += move_number * move_number;
We should also account for the depth. Ordering badly at depth=1 is not a disaster, but at depth 10... And there are much more nodes at depth 1 thant nodes at depth 10 due to the "exponential shape" of the tree. So we need some kind of depth rescaling here.

Anyway, I think this move ordering contest doesn't make much sense, because we are comparing apples and pears, with engines that do a lot of things differently that will pollute these results. For example if I make move count pruning more aggressive, I will improve my score at Ed's "move ordering contest", but in reality I will make my engine weaker.

That's why I am not very interested to participate to this contest. The only contest that makes any real sense is an ELO contest :lol:
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

User avatar
hgm
Posts: 22319
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: Move ordering test (2)

Post by hgm » Fri May 31, 2013 1:09 pm

Well, this was inherent with adopting a defenition of move-ordering quality that does not correlate (or perhaps even anti-correlates) with search efficiency...

Post Reply