How effective is move ordering from TT?

Discussion of chess software programming and technical issues.

Moderator: Ras

diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: How effective is move ordering from TT?

Post by diep »

rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
I wouldn't stake my life on it, but I tend to believe Komodo will always be weaker than the latest DeepSjeng version.

There is a whole stack of engines doing more or less the same thing, with only tiny differences between them, and nobody accuses them of illegal behaviour. The only real difference is search. What all these programmers share is that they aren't very good at evaluation. They either brag about it online or keep quiet, but they wouldn't be able to write an evaluation for tic-tac-toe. If they had to build that eval themselves, they of course couldn't 'join the pack' of strong programs, as they'd be at least 600 Elo weaker.

If you beat the latest DeepSjeng, which is a well-working parallel engine, in tests, I bet you will beat Komodo 100%.

Hiarcs, Diep and Deep Junior are totally different from this: real original engines.

Of course if you produce 1 engine, it's easy to produce 20 similar engines...

Yet only 1 of them always will lead the pack. That seems to be DeepSjeng.

Evaluation IS better than the ivanhoe thing that the clones such as houdini are using.

A note on test suites: I remember that Tiger, even years later, was still scoring higher on the endgames in all the test suites, just because it always pushed passed pawns :)
Uri Blass
Posts: 10906
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: How effective is move ordering from TT?

Post by Uri Blass »

Don wrote:
rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
That is a nice test, but I'm not sure it proves anything. We cannot be weaker positional and tactically and yet still be so strong so Komodo must be doing something right. I think that perhaps Komodo just doesn't test well on any sort of problem set and I don't have an explanation of why.

Do you?
If tests prove nothing then maybe you are not weaker tactically.

A possible test of whether you are weaker tactically or positionally may be to take a long match in which you score 50% and look only at the games where the programs disagree for some moves (for example, each program believes that it is better by at least +0.3 pawns for 3 consecutive moves).

If Komodo scores significantly more than 50% in this subset of games then probably it is stronger positionally and weaker tactically.
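The filtering step of this test can be sketched as follows. This is only a minimal illustration: the `Game` record, its field names and the assumption that each engine's evaluation is logged once per move from its own point of view are all hypothetical, not the output format of any particular GUI.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Game:
    # Hypothetical log format: one evaluation per move, in pawns, each from
    # that engine's own point of view (positive means "I am better").
    evals_a: Sequence[float]
    evals_b: Sequence[float]
    score_a: float  # 1.0 = win, 0.5 = draw, 0.0 = loss for engine A


def has_disagreement(game: Game, margin: float = 0.3, run: int = 3) -> bool:
    """True if both engines claim an edge of at least `margin` for `run` moves in a row."""
    streak = 0
    for eval_a, eval_b in zip(game.evals_a, game.evals_b):
        if eval_a >= margin and eval_b >= margin:
            streak += 1
            if streak >= run:
                return True
        else:
            streak = 0
    return False


def disagreement_score(games: Sequence[Game], margin: float = 0.3,
                       run: int = 3) -> Optional[float]:
    """Engine A's score fraction over the disagreement subset, or None if it is empty."""
    subset = [g for g in games if has_disagreement(g, margin, run)]
    if not subset:
        return None
    return sum(g.score_a for g in subset) / len(subset)
```

A result clearly above 0.5 on this subset, in a match that is 50% overall, would be the signal described above. The subset may be small, so the usual caution about sample size applies.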

Uri
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: How effective is move ordering from TT?

Post by diep »

ZirconiumX wrote:
rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
Because Strategy isn't Positional Play. Even I could work that out.

Matthew:out
Without commenting on the evaluation here: in Komodo's case it's a search issue. We can deduce that, because its evaluation is nearly the same as that of a bunch of Rybka-type clones, with maybe 1 or 2 additions similar to what DeepSjeng has, Komodo must be searching its search tree more selectively.

Basically Komodo is just searching its mainline over and over again.

How do you find a BETTER move in such a case more easily than engines that do this to a lesser extent?

Realize Komodo is a single-threaded engine; that's his own choice, so he has little choice but to search really selectively...

So nothing bad about Don here; it's a simple fact that if you check the mainline deeply and try to win Elo that way, your engine is not so well suited to correspondence players.

To find moves that appeal more to the human mind, you want mobility programs. The father of mobility is Mark Uniacke with Hiarcs.

Diep's mobility is just based upon his ideas and uses more knowledge, that's all.

So those programs are far better suited if you prepare openings as a chess player or want to find that brilliant genius move.

In human games deep tactics are simply less relevant... it's about that first move you make, and if you prune a lot and search ultra-selectively you simply cannot search it as deep as you search the mainline.

That's not a judgement, that's a fact.

For super bullet it's obvious which of the two works better... the deeper mainline.
Ajedrecista
Posts: 2134
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Off-topic, sorry...

Post by Ajedrecista »

Hello Kai:
Laskos wrote:
Houdini wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world. There is no test that can prove that I am right or wrong. I base this on the fact that we are one of the top 2 programs and yet we are probably only top 10 in tactical problem sets. We must be doing something right.
Apparently you don't understand your own engine ;).

Being poor in tactics but having a strong engine over-all doesn't demonstrate the quality of the evaluation, it's a by-product of the LMR and null move reductions. Tactics are based on playing non-obvious, apparently unsound moves. If you LMR/NMR much, you'll miss tactics, it's as simple as that.
Stockfish is, probably to an even higher degree than Komodo, relatively poor in tactical tests but very good over-all, for exactly the same reason.

Instead I would measure the quality of the evaluation function by the performance at very fast TC. If you take out most of the search, what remains is evaluation.

Robert
1-ply match 400 games

Games Completed = 400 of 10000 (Avg game length = 1.029 sec)
Settings = RR/16MB/100ms per move/M 400000cp for 1000 moves, D 120000 moves/PGN:C:\Users\Ani\Downloads\LittleBlitzer\swcr.pgn(5120)
Time = 516 sec elapsed, 12385 sec remaining
 1.  Komodo 4                 	246.0/400	188-96-116  	(L: m=96 t=0 i=0 a=0)	(D: r=94 i=4 f=6 s=12 a=0)	(tpm=11.6 d=1.00 nps=843878)
 2.  Houdini 1.5a             	154.0/400	96-188-116  	(L: m=188 t=0 i=0 a=0)	(D: r=94 i=4 f=6 s=12 a=0)	(tpm=10.3 d=1.00 nps=543249)
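For readers who want the result above as a rating gap, here is a minimal sketch that converts a match score into an Elo difference. It assumes the standard logistic Elo model; the function name is made up for illustration, and with only 400 games the error margin on the figure is wide.

```python
import math


def elo_diff(points: float, games: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Komodo 4 scored 246.0/400 in the listing above, which is roughly +81 Elo.
print(round(elo_diff(246.0, 400)))
```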
Sorry for being off-topic: may I ask how you managed to run a fixed-depth test under LB? I have run some tests using this GUI, so I know how to build the Engines.lbe file, but I am unable to get the engines to play to the fixed depth I want... what option must I write in the Engines.lbe file?

I have LB 2.5 (I know that the latest version is 2.74, which is what you used, judging by the two decimal places in the average depth), but I suppose there will not be big differences. Thanks in advance!

Regards from Spain.

Ajedrecista.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: How effective is move ordering from TT?

Post by lkaufman »

diep wrote:
Jan Brouwer wrote:
Rebel wrote:Quite funny you think Diep would win such a match, I would put my money on Komodo. One requirement of a top-engine programmer is accuracy and punctuality. I see that in Don's postings, I don't see that quality in Vincent's postings. I think it matters.
Agreed!
Vincent regularly complains about the complexity (== bugs) and mis-tuned weights of his evaluation function.
Komodo has fought its way to the top with a methodical approach, so should be relatively bug-free.
My money would also be on Komodo, even with possible tactical handicap.

Come on, let's all agree that such a match proves nothing, and do it. :)
I have no idea why they didn't take the 1 ply challenge.
If you want to play a one ply match with the engines as is, go ahead. But Don doesn't want to waste programming time to play a match that will only show which program has more tactical knowledge in eval, which we expect would be Diep. One ply chess is something like 1200 elo human chess; it has nothing to do with subtlety and is just about tactical blunders.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Off-topic, sorry...

Post by diep »

Ajedrecista wrote:Hello Kai:
Laskos wrote:
Houdini wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world. There is no test that can prove that I am right or wrong. I base this on the fact that we are one of the top 2 programs and yet we are probably only top 10 in tactical problem sets. We must be doing something right.
Apparently you don't understand your own engine ;).

Being poor in tactics but having a strong engine over-all doesn't demonstrate the quality of the evaluation, it's a by-product of the LMR and null move reductions. Tactics are based on playing non-obvious, apparently unsound moves. If you LMR/NMR much, you'll miss tactics, it's as simple as that.
Stockfish is, probably to an even higher degree than Komodo, relatively poor in tactical tests but very good over-all, for exactly the same reason.

Instead I would measure the quality of the evaluation function by the performance at very fast TC. If you take out most of the search, what remains is evaluation.

Robert
1-ply match 400 games

Games Completed = 400 of 10000 (Avg game length = 1.029 sec)
Settings = RR/16MB/100ms per move/M 400000cp for 1000 moves, D 120000 moves/PGN:C:\Users\Ani\Downloads\LittleBlitzer\swcr.pgn(5120)
Time = 516 sec elapsed, 12385 sec remaining
 1.  Komodo 4                 	246.0/400	188-96-116  	(L: m=96 t=0 i=0 a=0)	(D: r=94 i=4 f=6 s=12 a=0)	(tpm=11.6 d=1.00 nps=843878)
 2.  Houdini 1.5a             	154.0/400	96-188-116  	(L: m=188 t=0 i=0 a=0)	(D: r=94 i=4 f=6 s=12 a=0)	(tpm=10.3 d=1.00 nps=543249)
Sorry for being off-topic: may I ask you how you managed to run fixed depth test under LB? I have ran some test using this GUI, so I know how to built Engines.lbe file, although I am unable of getting the engines play until the fixed depth I want... what option must I write in Engines.lbe file?

I have LB 2.5 (I know that the last version is 2.74, which is what you have used in view of the two decimal numbers of average depth) but I suppose that there will not be big differences. Thanks in advance!

Regards from Spain.

Ajedrecista.
Yes, I wondered about this as well, as most of those engines do not allow 1 ply but have 3 plies as a minimum depth.

The 8-ply contest is of course not possible, as Houdini has more extensions than Komodo and in Houdini you cannot turn those off.
Last edited by diep on Mon Aug 13, 2012 3:48 pm, edited 1 time in total.
Uri Blass
Posts: 10906
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: How effective is move ordering from TT?

Post by Uri Blass »

Uri Blass wrote:
Don wrote:
rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
That is a nice test, but I'm not sure it proves anything. We cannot be weaker positional and tactically and yet still be so strong so Komodo must be doing something right. I think that perhaps Komodo just doesn't test well on any sort of problem set and I don't have an explanation of why.

Do you?
If tests prove nothing then maybe you are not weaker tactically.

A possible test to see if you are weaker tactically or positionally may be to look in a long match when you score 50% and look only at the cases when both programs disagree for some moves(for example every program believes that it is better by at least +0.3 pawns for consecutive 3 moves).

If Komodo scores significantly more than 50% in this subset of games then probably it is stronger positionally and weaker tactically.

Uri
I think this test against Diep may also be interesting if you give Diep enough of a time advantage that it scores 50% against Komodo, and then calculate the score for the subset of games in which the programs disagree.

I basically expect the program with the better evaluation and inferior search to score better when the programs disagree in their scores for some moves.

Note that I believe Komodo has both superior evaluation and superior search relative to Diep, but if we give Diep a time advantage we can create a situation in which Diep has something equivalent to superior search relative to Komodo.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: How effective is move ordering from TT?

Post by diep »

Uri Blass wrote:
Uri Blass wrote:
Don wrote:
rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
That is a nice test, but I'm not sure it proves anything. We cannot be weaker positional and tactically and yet still be so strong so Komodo must be doing something right. I think that perhaps Komodo just doesn't test well on any sort of problem set and I don't have an explanation of why.

Do you?
If tests prove nothing then maybe you are not weaker tactically.

A possible test to see if you are weaker tactically or positionally may be to look in a long match when you score 50% and look only at the cases when both programs disagree for some moves(for example every program believes that it is better by at least +0.3 pawns for consecutive 3 moves).

If Komodo scores significantly more than 50% in this subset of games then probably it is stronger positionally and weaker tactically.

Uri
I think that this test against diep may be also interesting if you give enough time advantage to diep so diep scores 50% against Komodo
and calculate the score for the relevant part of the games when there is a disagreement between the programs.

I expect basically the program with the better evaluation and inferior search to score better when there is a disagreement in scores for some moves.

Note that
I believe that komodo has superior evaluation and superior search relative to diep but if we give time advantage for diep we can generate a situation that diep has something equivalent to superior search relative to komodo.
If they don't accept any match as proving anything, then it doesn't make sense to do any test, Uri.

If I play a normal match against Komodo, they claim Diep is parallel and uses all cores while they use just 1, so logically they lose.

On the other hand, I didn't optimize Diep for single-core contests, where just forward pruning a lot and super-selective search helps you. Today's evaluations seem to need 20 plies (selective plies, not really comparable with real plies) for the evaluation functions to get the maximum Elo scaling, and above 20 plies you see most engines hardly win Elo with each ply.

So the struggle is to get to those 20 plies quickly, no matter how dubiously you search. If you search SMP you can get there at rapid levels if you have a bunch of cores.

In most of the superbullet tests they do nowadays, of course, no one reaches those 20 plies yet.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: How effective is move ordering from TT?

Post by Don »

Uri Blass wrote:
Don wrote:
rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
That is a nice test, but I'm not sure it proves anything. We cannot be weaker positional and tactically and yet still be so strong so Komodo must be doing something right. I think that perhaps Komodo just doesn't test well on any sort of problem set and I don't have an explanation of why.

Do you?
If tests prove nothing then maybe you are not weaker tactically.

A possible test to see if you are weaker tactically or positionally may be to look in a long match when you score 50% and look only at the cases when both programs disagree for some moves(for example every program believes that it is better by at least +0.3 pawns for consecutive 3 moves).

If Komodo scores significantly more than 50% in this subset of games then probably it is stronger positionally and weaker tactically.

Uri
Without having put a lot of thought into your suggestion, it seems like a very reasonable test to me right now. Surely much better than running a 1 ply match between two totally different programs.

I'm sure it has some flaws, but I do not expect to find a perfect test and I like this one.
Uri Blass
Posts: 10906
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: How effective is move ordering from TT?

Post by Uri Blass »

diep wrote:
Uri Blass wrote:
Uri Blass wrote:
Don wrote:
rvida wrote:
Don wrote:I have written many time that I BELIEVE we have the best positional program in the world.
lkaufman wrote: If we miss tactics but have the same rating as another engine, we must be better at positional play, what else is there?
Then how do you explain Komodo's relatively weaker results with the Strategic Test Suite?
That is a nice test, but I'm not sure it proves anything. We cannot be weaker positional and tactically and yet still be so strong so Komodo must be doing something right. I think that perhaps Komodo just doesn't test well on any sort of problem set and I don't have an explanation of why.

Do you?
If tests prove nothing then maybe you are not weaker tactically.

A possible test to see if you are weaker tactically or positionally may be to look in a long match when you score 50% and look only at the cases when both programs disagree for some moves(for example every program believes that it is better by at least +0.3 pawns for consecutive 3 moves).

If Komodo scores significantly more than 50% in this subset of games then probably it is stronger positionally and weaker tactically.

Uri
I think that this test against diep may be also interesting if you give enough time advantage to diep so diep scores 50% against Komodo
and calculate the score for the relevant part of the games when there is a disagreement between the programs.

I expect basically the program with the better evaluation and inferior search to score better when there is a disagreement in scores for some moves.

Note that
I believe that komodo has superior evaluation and superior search relative to diep but if we give time advantage for diep we can generate a situation that diep has something equivalent to superior search relative to komodo.
If they don't accept any match as proving anything, then it doesn't make sense to do any test Uri.

If i have a normal match against Komodo, they claim diep is parallel and uses all cores and they just 1, so logically they lose.
I do not see why they would logically lose, even with a single processor against parallel Diep.

Single-processor Komodo is rated above many quad-core entries in the CCRL list, and I have no proof that Diep can beat programs like Shredder, which with 4 processors against one is 130 Elo weaker than Komodo (which means that even with 8 processors against one it would still lose to Komodo).

From the ccrl list:

Komodo 5 64-bit 3266 +19 −19
Deep Shredder 12 64-bit OA On 4CPU 3136 +14 −14
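
To put the 130 Elo gap in terms of match results, here is a minimal sketch of the expected score under the standard logistic Elo formula. The function name is made up for illustration, the rating list may use a slightly different model, and the ± margins quoted above are ignored.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Komodo 5 (3266) against Deep Shredder 12 on 4 CPUs (3136), ratings as quoted above.
print(f"{expected_score(3266, 3136):.3f}")
```

Under these assumptions single-processor Komodo would be expected to take roughly two thirds of the points.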