Long TC matches with Houdini 3 Beta

Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Long TC matches with Houdini 3 Beta

Post by Houdini »

- 90+30 TC, single-core, 512 MB hash, 60 positions from Noomen 2006/2008 test suites played from both sides.
lucasart
Posts: 3241
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Long TC matches with Houdini 3 Beta

Post by lucasart »

Houdini wrote:- 90+30 TC, single-core, 512 MB hash, 60 positions from Noomen 2006/2008 test suites played from both sides.
OK. Thank you for the screenshots, I understand the setup much better now. You have a 64-CPU machine with 2x31 processes running, each using its "own CPU", leaving 2 for the OS so that nothing pollutes the experiment. In that context, pondering could/should be on, right?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Long TC matches with Houdini 3 Beta

Post by Houdini »

The server is a quad AMD Opteron 6274 box running Windows Server 2008 R2.
It has 32 Bulldozer modules with 64 cores, but I consider this a 32-core server with hyper-threading and only use 1 core of each Bulldozer module. You can see this in the CPU usage screenshot.

I don't use the second core of each module because this would add unpredictability. If both cores of a Bulldozer module are used, each runs at about 65% of the speed it reaches when only a single core is used. That complicates the testing, as there is no guarantee that all threads run at the same speed all the time.
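For readers who want to reproduce the "one core per module" idea, here is a minimal sketch of how the affinity masks could be generated. It is not the setup actually used in this thread. It assumes that the two cores of a module show up as adjacent logical CPUs (0 and 1, 2 and 3, and so on), which should be checked on the actual machine, and the helper names are made up for illustration. The hexadecimal masks are in the form accepted by the Windows `start /affinity` command.

```python
def one_core_per_module(logical_cpus: int, cores_per_module: int = 2) -> list[int]:
    """Pick the first logical CPU of every module.

    Assumption: sibling cores of a module are numbered consecutively.
    """
    return list(range(0, logical_cpus, cores_per_module))


def affinity_mask(cpu: int) -> str:
    """Hex bitmask with only the bit for `cpu` set (no 0x prefix)."""
    return format(1 << cpu, "X")


cpus = one_core_per_module(64)           # 64 logical CPUs, 2 per module
masks = [affinity_mask(c) for c in cpus]

# Print the first few CPU -> mask pairs as a sanity check.
for cpu, mask in list(zip(cpus, masks))[:4]:
    print(f"logical CPU {cpu:2d} -> affinity mask {mask}")
```

Each mask could then be passed to one engine process, so that no two processes ever share a module.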
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Long TC matches with Houdini 3 Beta

Post by Don »

Houdini wrote:The second match just finished.

Final result of the Houdini 3 - Stockfish 2.3.1 match: +44 -16 =60
74-46 (+82 Elo ± 42 Elo).

Final result of the Houdini 3 - Houdini 2.0c match: +48 -15 =57
76.5-43.5 (+94 Elo ± 42 Elo).

The results of both matches have very large confidence intervals, please consider these when discussing results.
My impression is that Houdini 3 had a very favorable run against Houdini 2.0c and a slightly unfavorable run against Stockfish 2.3.1.
Neither engine scored 40% against Houdini 3, let's see whether Komodo 5 fares any better.

Houdini 3 - Komodo 5 match has now started - no games finished yet.
Hi Robert,

The results so far are pretty impressive. It looks like you have a nice gain here.

I think you have to run some long tests like you are doing here in order to really know that you have improved the program. I have noticed that a lot of programs are coming out with new versions that have impressive ELO gains until they are tested at "real" time controls. It's almost certainly a by-product of the fact that you are forced to test this fast to resolve small ELO improvements. It's more and more difficult to get big ELO improvements from a single change.

What makes your results impressive, ignoring the large error margin of course, is that at long time controls the relative ELO difference between programs tends to close up significantly.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
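The trade-off Don describes (fast games are needed to resolve small Elo differences) can be made concrete with a rough estimate of how many games a given error margin requires. This is only a back-of-the-envelope sketch: it assumes two engines of nearly equal strength, independent games, a fixed draw ratio, the standard logistic Elo curve and a normal approximation. The function names and the default draw ratio are illustrative choices, not anything taken from the posts.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score for a given Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_margin: float, draw_ratio: float = 0.45, z: float = 1.96) -> int:
    """Games needed so that the ~95% error margin is about `elo_margin`.

    Assumes near-equal engines: the per-game variance of the score
    is then (1 - draw_ratio) / 4.
    """
    sigma = math.sqrt(1.0 - draw_ratio) / 2.0
    score_margin = expected_score(elo_margin) - 0.5
    return math.ceil((z * sigma / score_margin) ** 2)


for margin in (42, 10, 5, 2):
    print(f"+/- {margin:2d} Elo: about {games_needed(margin)} games")
```

Under these assumptions a margin of around 40 Elo needs on the order of a hundred games, while a margin of 5 Elo needs several thousand, which is why small improvements are usually measured at fast time controls.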
LudiBuda
Posts: 76
Joined: Sat Mar 03, 2012 7:53 pm

Re: Long TC matches with Houdini 3 Beta

Post by LudiBuda »

I would expect memory to be a huge bottleneck here.
How different engines handle memory access may influence the results more than on normal machines.
Or maybe each CPU gets its own memory module and bus? Do you have any info on that?
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Long TC matches with Houdini 3 Beta

Post by lkaufman »

M ANSARI wrote:I have been out for a while, did Komodo 5 get MP going?
Not yet, but hopefully soon.
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Long TC matches with Houdini 3 Beta

Post by Houdini »

Don wrote:Hi Robert,

The results so far are pretty impressive. It looks like you have a nice gain here.

I think you have to run some long tests like you are doing here in order to really know that you have improved the program. I have noticed that a lot of programs are coming out with new versions that have impressive ELO gains until they are tested at "real" time controls. It's almost certainly a by-product of the fact that you are forced to test this fast to resolve small ELO improvements. It's more and more difficult to get big ELO improvements from a single change.

What makes your results impressive, ignoring the large error margin of course, is that at long time controls the relative ELO difference between programs tends to close up significantly.

Don
Don, thank you, we both know how hard work every Elo point gain is.
Before this run my slowest test match with Houdini 3 was at 2'+2" so I'm very happy that at about 30 times longer TC the gain is still significant.
Hopefully you'll catch up with Komodo, it's more fun for everyone if there's a good competition at the top.

Robert
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Long TC matches with Houdini 3 Beta

Post by MM »

Houdini wrote:
Don wrote:Hi Robert,

The results so far are pretty impressive. It looks like you have a nice gain here.

I think you have to run some long tests like you are doing here in order to really know that you have improved the program. I have noticed that a lot of programs are coming out with new versions that have impressive ELO gains until they are tested at "real" time controls. It's almost certainly a by-product of the fact that you are forced to test this fast to resolve small ELO improvements. It's more and more difficult to get big ELO improvements from a single change.

What makes your results impressive, ignoring the large error margin of course, is that at long time controls the relative ELO difference between programs tends to close up significantly.

Don
Don, thank you, we both know how hard work every Elo point gain is.
Before this run my slowest test match with Houdini 3 was at 2'+2" so I'm very happy that at about 30 times longer TC the gain is still significant.
Hopefully you'll catch up with Komodo, it's more fun for everyone if there's a good competition at the top.

Robert
Hi Robert, I read somewhere that you added some knowledge to H3, but I would like to ask whether you also improved the search. Will H3 choose its moves in a different way? You said some time ago that H3 in normal mode wasn't better than H2 at solving tactical puzzles, so I wonder whether H3 is nevertheless better than H2 in tactics overall, because as I understand it, it will certainly be better in positional play.

P.S. Any news about the H3-K5 match?

Thank you in advance.

Best Regards
MM
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Long TC matches with Houdini 3 Beta

Post by Don »

Houdini wrote:
Don wrote:Hi Robert,

The results so far are pretty impressive. It looks like you have a nice gain here.

I think you have to run some long tests like you are doing here in order to really know that you have improved the program. I have noticed that a lot of programs are coming out with new versions that have impressive ELO gains until they are tested at "real" time controls. It's almost certainly a by-product of the fact that you are forced to test this fast to resolve small ELO improvements. It's more and more difficult to get big ELO improvements from a single change.

What makes your results impressive, ignoring the large error margin of course, is that at long time controls the relative ELO difference between programs tends to close up significantly.

Don
Don, thank you, we both know how hard work every Elo point gain is.
Before this run my slowest test match with Houdini 3 was at 2'+2" so I'm very happy that at about 30 times longer TC the gain is still significant.
Hopefully you'll catch up with Komodo, it's more fun for everyone if there's a good competition at the top.

Robert
Actually, I would not want everyone to lie down and die. I want the competition and I doubt Komodo would be very strong if everything had stagnated 5 years ago.

So yes, we are trying to catch Houdini - our current dev version is almost certainly better than Houdini 1.5 but it's difficult to catch a moving target so please sit still for a minute or two.

Here are some results from my distributed tester, where volunteers use their machines to help me test at much longer time controls (the time controls are adjusted to the hardware, with the stated time control corresponding to a very fast overclocked machine). In these tests Komodo never plays other versions of itself.

60+1

Rank Name                       Elo      +      -    games   score   oppo.   draws 
   1 Komodo 4485.00 64 bit    3032.5    6.8    6.8    8867   56.7%  2972.4   42.6% 
   2 Komodo 4476.04 64 bit    3022.3    6.3    6.3   10206   54.6%  2982.3   44.1% 
   3 Komodo 4483.00 64 bit    3021.5    6.3    6.3   10181   54.7%  2979.3   44.0% 
   4 Komodo 4485.19 64 bit    3020.2    6.8    6.8    8848   54.2%  2983.2   44.8% 
   5 Komodo 4477.45 64 bit    3019.6    6.8    6.8    8714   54.8%  2977.5   44.5% 
   6 Komodo 4487.06 64 bit    3019.3    6.4    6.4    9859   54.1%  2983.2   43.9% 
   7 Houdini 1.5a x64         3018.0    3.7    3.7   30090   50.6%  3012.1   41.9% 
   8 Komodo 4481.02 64 bit    3017.7    6.9    6.9    8555   54.5%  2978.3   44.5% 
   9 Komodo 4482.02 64 bit    3017.3   11.0   11.0    3358   54.0%  2981.2   44.3% 
  10 Komodo 4481.00 64 bit    3017.0    5.8    5.8   12144   53.8%  2983.2   43.9% 
  11 Komodo 4477.08 64 bit    3015.6    9.0    9.0    5051   54.7%  2974.5   42.6% 
  12 Komodo 4479.00 64 bit    3014.6   11.0   11.0    3349   53.9%  2979.5   44.3% 
  13 Komodo 4477.15 64 bit    3013.8    5.4    5.4   13705   54.6%  2973.8   44.7% 
  14 Critter 1.4 64-bit SSE4  3000.0    3.0    3.0   44334   48.6%  3011.8   45.8% 
  15 Stockfish 2.2.2 JA       2931.7    3.0    3.0   44317   40.3%  3017.8   44.6% 


90+1

Rank Name                       Elo      +      -    games   score   oppo.   draws 
   1 Komodo 4471.02 64 bit    3060.9   12.9   12.9    2530   57.9%  2990.0   35.4% 
   2 Komodo 4467.01 64 bit    3027.6    8.7    8.7    5300   54.4%  2990.0   42.0% 
   3 Houdini 1.5a x64         3025.2    7.2    7.2    7884   49.7%  3027.6   39.1% 
   4 Komodo 4468.00 64 bit    3024.8    8.7    8.7    5298   54.1%  2990.1   42.4% 
   5 Komodo 4471.01 64 bit    3021.6    8.6    8.6    5321   53.7%  2990.0   43.5% 
   6 Komodo 5 64 bit dev      3020.7    8.7    8.7    5313   53.7%  2990.1   43.1% 
   7 Critter 1.4 64-bit SSE4  3000.0    7.1    7.1    7957   46.7%  3027.6   44.4% 
   8 Stockfish 2.2.2 JA       2945.0    7.1    7.1    7921   40.4%  3027.6   42.4% 


120+2

Rank Name                       Elo      +      -    games   score   oppo.   draws 
   1 Komodo 4467.01 64 bit    3036.1    8.5    8.5    5594   55.0%  2992.1   43.9% 
   2 Houdini 1.5a x64         3030.6    6.0    6.0   11500   50.4%  3027.7   42.1% 
   3 Komodo 4463.00 64 bit    3029.7    6.1    6.1   10939   54.3%  2992.3   44.6% 
   4 Komodo 4466.02 64 bit    3027.1    7.5    7.5    7127   53.9%  2992.2   45.0% 
   5 Komodo 5 64 bit dev      3021.9    6.1    6.1   10906   53.4%  2992.2   44.4% 
   6 Critter 1.4 64-bit SSE4  3000.0    5.9    5.9   11530   46.8%  3027.7   45.5% 
   7 Stockfish 2.2.2 JA       2946.2    5.9    5.9   11536   40.7%  3027.7   45.9% 
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Long TC matches with Houdini 3 Beta

Post by MM »

http://rybkaforum.net/cgi-bin/rybkaforu ... 25731;pg=3

Houdini 3 - Komodo 5 match: +43 -19 =58
72-48 (+68 Elo ± 42 Elo).

Houdini 3 - Stockfish 2.3.1 match: +44 -16 =60
74-46 (+82 Elo ± 42 Elo).

Houdini 3 - Houdini 2.0c match: +48 -15 =57
76.5-43.5 (+94 Elo ± 42 Elo).

Best Regards
MM
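As a sanity check on the figures above, here is a minimal sketch that converts a win/loss/draw record into an Elo difference with an approximate 95% interval. It uses the plain logistic Elo formula and a normal approximation for the score; the tool that produced the posted numbers is not stated in the thread, so this is not necessarily the same method. It gives roughly +83, +98 and +70 Elo for the three matches, close to but not identical with the posted +82, +94 and +68, and an interval width of similar size to the posted ± 42.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, low, high) for a match result.

    The interval comes from a normal approximation on the score,
    which is only a rough guide for a 120-game match.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points).
    var = (wins * (1 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


results = {
    "Houdini 3 - Komodo 5": (43, 19, 58),
    "Houdini 3 - Stockfish 2.3.1": (44, 16, 60),
    "Houdini 3 - Houdini 2.0c": (48, 15, 57),
}
for name, (w, l, d) in results.items():
    elo, low, high = match_elo(w, l, d)
    print(f"{name}: {elo:+.0f} Elo (about {low:+.0f} to {high:+.0f})")
```

The intervals of the three matches overlap heavily, so the ordering of the three opponents cannot be read off these results.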