Komodo MP vs Houdini 3

lkaufman · Post by **lkaufman** » Tue Jun 18, 2013 4:31 pm

So far I've run 100 games of Komodo 5.1 MP (released version) against Houdini 3: 50 games on 4 cores and 50 games on 12. Time limit 3' + 2", Noomen test suite, 256 MB hash, no ponder. Much to my surprise, the result is dead even and almost identical on both tests (+1 on 4 cores, -1 on 12). The sample is still small and I expect we will end up losing, but considering that we are not yet close to Houdini 3 on one core (per IPON), this is more evidence that our MP scaling is good and likely to be better than Houdini's. I'll run another hundred games.
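
To put a rough number on "still small", here is a minimal sketch of how many games a match needs before the 95% margin on the score shrinks to a given size. The function name `games_needed` and the 50% draw rate are assumptions for illustration, not values measured in this test; the sketch also assumes two roughly equal engines and independent games.

```python
import math

def games_needed(margin, draw_rate, z=1.96):
    """Games needed so that the z-sigma margin on the score is <= margin.

    Assumes (hypothetically) two equal engines: wins and losses are
    equally likely, the rest are draws, and games are independent.
    """
    # A decisive game deviates from the 50% mean by 0.5; a draw by 0.
    per_game_variance = (1.0 - draw_rate) * 0.25
    return math.ceil(z * z * per_game_variance / margin ** 2)

if __name__ == "__main__":
    for margin in (0.10, 0.05, 0.02):
        n = games_needed(margin, draw_rate=0.5)  # draw rate is an assumption
        print(f"+/-{margin:.0%} on the score: about {n} games")
```

Under those assumptions, narrowing the margin to about 5 percentage points takes roughly 190 games, so a 100-game match cannot reliably separate a 45% score from a 50% one.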

beram · Post by **beram** » Tue Jun 18, 2013 4:58 pm

I am currently partway through 100 games with my own test suite.
After 50 games on 4 CPUs at 4m2s, I have a 41% score for Komodo 5.1 MP.
So this doesn't confirm your amazing results.

In previous tests against Houdini 3 at different TCs, Komodo 5 scored 37-38%.

lkaufman · Post by **lkaufman** » Tue Jun 18, 2013 5:48 pm

beram wrote: I am currently partway through 100 games with my own test suite.
After 50 games on 4 CPUs at 4m2s, I have a 41% score for Komodo 5.1 MP.
So this doesn't confirm your amazing results.

In previous tests against Houdini 3 at different TCs, Komodo 5 scored 37-38%.

After 114 games (combining 4 and 12 core results as they are similar) Komodo MP is just one game behind. I will be the first to admit that this must be a lucky result for Komodo, while your 41% is probably unlucky. If the true result ends up somewhere in the middle, say 45% or so, this would still indicate better scaling for Komodo as on one core I would only expect around 40% against Houdini 3, as the one core version is only slightly stronger than Komodo 5 which you say scored 37-38%.
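
For reference, the percentages above can be translated into rating differences with the standard logistic Elo formula. This is a minimal sketch; the function name `elo_diff` is hypothetical, and the conversion ignores draw modelling and any rating-list specifics.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score (0 < score < 1),
    using the standard logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

if __name__ == "__main__":
    for pct in (37.5, 40.0, 45.0, 50.0):
        print(f"{pct:.1f}% -> {elo_diff(pct / 100):+.0f} Elo")
```

On this model, 40% corresponds to about 70 Elo behind and 45% to about 35 Elo behind, so a true MP score of 45% would mean roughly 35 Elo gained relative to the expected one-core result.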

Laskos · Post by **Laskos** » Tue Jun 18, 2013 7:45 pm

lkaufman wrote:
beram wrote: I am currently partway through 100 games with my own test suite.
After 50 games on 4 CPUs at 4m2s, I have a 41% score for Komodo 5.1 MP.
So this doesn't confirm your amazing results.

In previous tests against Houdini 3 at different TCs, Komodo 5 scored 37-38%.
After 114 games (combining 4 and 12 core results as they are similar) Komodo MP is just one game behind. I will be the first to admit that this must be a lucky result for Komodo, while your 41% is probably unlucky. If the true result ends up somewhere in the middle, say 45% or so, this would still indicate better scaling for Komodo as on one core I would only expect around 40% against Houdini 3, as the one core version is only slightly stronger than Komodo 5 which you say scored 37-38%.

You seem to have a bit different scaling of K5.1 with time compared to K CCT. At ultra-fast controls, K 5.1 on one core is some 10 points stronger than the CCT version, on IPON it seems 30 points weaker. Do you have data on how it scales at long TC on one core? Does it catch CCT version again?

As for MP (4 cores), it seems to scale well, as time-to-depth goes, but one has to play many not-so-short TC games.

yanquis1972 · Post by **yanquis1972** » Tue Jun 18, 2013 8:55 pm

i posted this in the other forum, but it belongs here. i am running a very much unofficial test (as other processes are running when i'm at the computer), but under these conditions:

5+3, 256MB hash, 4CPU ponder off, HT enabled, silver suite--

the result is +12 =15 -13 so far.

i just realized there is some sort of problem with the log (fritz 13 GUI), as i've run 51 games, but only 40 are reported. i'll have to check up what happened in the missing games. nevertheless, very good results from komodo that are in line with larry's more accurate test.

yanquis1972 · Post by **yanquis1972** » Tue Jun 18, 2013 9:12 pm

here are the missing 11 -- somewhat to my surprise, komodo still comes out looking great:

draw
~+2.70 adv komodo, move 109 (+2.80 komodo, -2.60 houdini)
draw
draw
draw
draw
-10.5 eval (houdini), +5.3 eval (komodo), move 110. adj. score win for komodo.
draw
draw
draw
draw

so barring the 2nd game, which i will try to play out after i stop the tournament, the score is in fact:

+13 =24 -13

edit: in my haste i misread the score of the arbitrated game. komodo in fact had a clearly winning position w/ the black pieces.
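
As a rough check on how much a 50-game result like this can say, the score and its 95% margin can be computed directly from the win/draw/loss counts. This is a minimal sketch assuming independent games; the function name `score_with_margin` is hypothetical.

```python
import math

def score_with_margin(wins, draws, losses, z=1.96):
    """Return (score, margin) for a match result, where margin is the
    z-sigma half-width computed from the observed per-game variance."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Mean squared deviation of the per-game results (1, 0.5, 0) from the score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    return score, z * math.sqrt(variance / n)

if __name__ == "__main__":
    score, margin = score_with_margin(13, 24, 13)  # +13 =24 -13 from the post
    print(f"score {score:.1%}, 95% margin +/-{margin:.1%}")
```

For +13 =24 -13 this gives a 50% score with a margin of about 10 percentage points, so this sample alone cannot rule out a true score in the low 40s.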

lkaufman · Post by **lkaufman** » Tue Jun 18, 2013 9:37 pm

Laskos wrote:
lkaufman wrote:
beram wrote: I am currently partway through 100 games with my own test suite.
After 50 games on 4 CPUs at 4m2s, I have a 41% score for Komodo 5.1 MP.
So this doesn't confirm your amazing results.

In previous tests against Houdini 3 at different TCs, Komodo 5 scored 37-38%.
After 114 games (combining 4 and 12 core results as they are similar) Komodo MP is just one game behind. I will be the first to admit that this must be a lucky result for Komodo, while your 41% is probably unlucky. If the true result ends up somewhere in the middle, say 45% or so, this would still indicate better scaling for Komodo as on one core I would only expect around 40% against Houdini 3, as the one core version is only slightly stronger than Komodo 5 which you say scored 37-38%.
You seem to have a bit different scaling of K5.1 with time compared to K CCT. At ultra-fast controls, K 5.1 on one core is some 10 points stronger than the CCT version, on IPON it seems 30 points weaker. Do you have data on how it scales at long TC on one core? Does it catch CCT version again?

As for MP (4 cores), it seems to scale well, as time-to-depth goes, but one has to play many not-so-short TC games.

I haven't seen the ultra-fast results, but I think they just mean that the start-up time is a small fraction of a second less per move on K5.1, so basically they are meaningless. On one core, K5.1 is basically just a somewhat slower version of Komodo CCT (partly to accommodate MP, partly for unknown reasons we hope to fix for the next version). I think the IPON result so far is a bit unlucky; I'm expecting a final rating around midway between Komodo 5 and Komodo CCT. There should be no difference in scaling between Komodo 5.1 and Komodo CCT except for the usual rule that rating differences contract with more time.
There is one substantive difference between Komodo CCT and Komodo 5.1 (on one core). We reduced the value of a pawn by 2 centipawns. This makes Komodo a bit more willing to play a gambit, and a bit more inclined to favor a piece or two minors vs. rook against pawns. I felt that these changes were primarily beneficial in terms of "predicting" theoretical moves, and of course will make Komodo a bit more fun to play against. The impact on the elo rating is negligible, probably plus or minus one elo or less.
After 167 MP games, my results (Komodo MP vs Houdini 3) are dead even (+3 on 4 cores, -3 on 12). Needless to say, these results are way better than I hoped for. There is no way this version can be as strong as Houdini 3, but it seems the gap (on 4 or more cores) is much less than we thought, so we can realistically hope to catch/pass Houdini 3 with our next release, whenever that might be.

yanquis1972 · Post by **yanquis1972** » Tue Jun 18, 2013 9:43 pm

amazing, & once again congratulations to you & don. it's great to have this pressure on robert (and you two!).

however, one flaw: unless i missed it, you are still running houdini with contempt. this is going to lead it to play, i believe, objectively suboptimal chess, even if it leads to better scores against inferior opponents. now that we seem to have an opponent on its level, it's looking like contempt should be disabled for houdini.

IWB · Post by **IWB** » Tue Jun 18, 2013 9:50 pm

lkaufman wrote:... I think the IPON result so far is a bit unlucky, I'm expecting a final rating around midway between Komodo 5 and Komodo CCT. ....

Just my 2 cents. I have now played a few Komodo matches with different versions, and Komodo's rating always rose toward the end. I expect the same this time.

Why: My 75 openings are in a random order chosen years ago. I play the openings in the Classic GUI with "variaty ON". That simply means that the first match (K51-H3) gets the first position, the second (K51-Cr14) the second position, and so on. In my case the tourney starts with the first 20 openings, then goes step by step through all the openings and cycles back to the first positions toward the end. Just by luck, it seems that this order plays the positions least suited to Komodo first ...
I don't have to defend anything if I turn out to be wrong, so I am free to speculate: I expect something very close to CCT (10 Elo difference!*) when the 2800 games are finished and have been properly calculated by Bayeselo :-)

Bye
Ingo

* If this turns out to be right, I will consider substituting K5.1 for CCT in my RRRL, as CCT seems to be some kind of dead end while K5.1 is the real Komodo.

yanquis1972 · Post by **yanquis1972** » Tue Jun 18, 2013 10:16 pm

ingo, you run a fantastic testing site (my favorite), but why do you no longer test 2 cores? is it a matter of resources, or just time or preference? i feel that'd be an excellent addition, if it's possible, as it'd give us an idea of how well multicore is implemented. in this day & age, of course, everyone has at least a dual core machine.

i notice among the outliers so far (50 elo from performance elo), komodo's results are pretty evenly distributed between stronger & weaker engines, despite larry's (great, imo) decision to make it a bit more dynamic. this seems like a very good thing, as there's no real ratings inflation against weaker competition. again, because houdini defaults with contempt, i wonder if this may've been the case, but i can't find a page showing its test results.
