Richard Allbert wrote: Hi,
I'm currently testing new versions of Lime - having squeezed a whole 50 points out of it (could even be less), but I seem to have something wrong that I can't put my finger on.
I've been adjusting the move ordering scores using test suites to try to get some more strength, and on one position, IQ4 no. 96, Lime_v64 finds the winning line from depth 8.
Lime_v65 (with different move ordering) doesn't find the winning move until depth 10.
Other than the ordering scores, the search is performed exactly the same.
Does this indicate a bug? What should I look for?
How do you (engine programmers 99% stronger than me) tune your move ordering?
Thanks
Richard
I gather statistics on the move number where a beta cutoff occurs.
That way, after a search I can see the average number of moves tried before the cutoff move is found.
So the closer to 1.00 that average is, the better.
This gives a rough indication of how good the move ordering is. It worked for me at the early stages of development, but as the move ordering improves, it becomes more and more difficult to see an improvement in this number.
Also, the number just gives an indication. The only way to verify whether the changes are good or bad is to run a lot of test games.
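In case it's useful, here is roughly what that bookkeeping can look like. This is just a minimal sketch in C; the names (stats_cutoffs, record_cutoff, and so on) are made up for illustration, not taken from Pupsi or any other engine.

    #include <stdio.h>

    static unsigned long stats_cutoffs  = 0; /* beta cutoffs seen so far */
    static unsigned long stats_move_sum = 0; /* sum of the 1-based move
                                                numbers where they occurred */

    /* Call from the search whenever a move fails high; move_number is
       the 1-based position of that move in the ordered move list. */
    void record_cutoff(int move_number)
    {
        stats_cutoffs++;
        stats_move_sum += (unsigned long)move_number;
    }

    /* After the search, print the average move number of the cutoff move.
       A perfect ordering fails high on move 1, so closer to 1.00 is better. */
    void print_ordering_stats(void)
    {
        if (stats_cutoffs > 0)
            printf("average cutoff move: %.2f over %lu cutoffs\n",
                   (double)stats_move_sum / (double)stats_cutoffs,
                   stats_cutoffs);
    }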
And now slightly off topic ...
I have found it to be very important to run test matches against a diverse set of opponents.
Warning! Rambling based on pure speculation!
The reason for this is that computer chess is a "game of bottlenecks".
When two engines play, the game tends to gravitate towards positions where they disagree on the evaluation.
They typically end up in situations where each of them thinks they are slightly better.
In these situations, the engine that is "right" about the evaluation tends to do better.
A small example to illustrate: my engine had poor passed pawn and king safety evaluation. Against an engine with better passed pawn evaluation it would therefore lose, because the opponent would, more often than not, create a passed pawn, thereby gaining an advantage and typically winning the game.
Therefore my "bottleneck" against this engine was my poor passed pawn evaluation.
In essence, poor passed pawn evaluation was preventing me from improving my performance against this particular engine.
Now, there are typically many, many bottlenecks hindering improvement.
Lack of search depth caused by poor move ordering could be a bottleneck!
If I make a change to my evaluation function improving the passed pawn evaluation, then how much improvement I gain against an opponent depends on whether or not passed pawns were a bottleneck against that particular opponent!
If they were, the games now tend to steer towards another "disagreement", and if that is another bottleneck for me, then apparently no strength is gained!
There might actually even be a slight decrease in performance against this engine, because the added evaluation makes the engine run slightly slower.
When I added better pawn evaluation to Pupsi2, the test results ranged from a huge gain against some opponents all the way to slightly worse performance against others. The net total was a huge plus, but if I had tested against only a few unluckily selected opponents, I might have thrown the changes away, thinking they were bad!
There it is! My "Simplified theory of bottlenecks in computer chess"
Summa summarum:
Test against a relatively large number of opponents.
I test against 10 opponents selected from the rankings in the "UCI Engines League" - more specifically, the 10 engines just above Pupsi in the rankings. I then run 80 Noomen games against each at 1+1.
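As a side note on reading such results: the standard logistic formula converts a score fraction into an Elo estimate, which makes it easy to compare the pooled result against the per-opponent swings. A tiny C sketch; the score in main() is invented purely for illustration.

    #include <math.h>
    #include <stdio.h>

    /* Standard logistic Elo estimate from a score fraction 0 < s < 1. */
    double elo_from_score(double s)
    {
        return -400.0 * log10(1.0 / s - 1.0);
    }

    int main(void)
    {
        /* Invented example: 435.5 points out of the 800 games above. */
        double s = 435.5 / 800.0;
        printf("score %.3f -> %+.0f Elo\n", s, elo_from_score(s));
        return 0;
    }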
The end
Kind regards,
Jesper