Long TC matches with Houdini 3 Beta
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Long TC matches with Houdini 3 Beta
- 90+30 TC, single-core, 512 MB hash, 60 positions from Noomen 2006/2008 test suites played from both sides.
-
- Posts: 3241
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Long TC matches with Houdini 3 Beta
Houdini wrote: 90+30 TC, single-core, 512 MB hash, 60 positions from Noomen 2006/2008 test suites played from both sides.
OK. Thank you for the screenshots, I understand the setup much better now. You have a 64-CPU machine with 2x31 processes running, each using its "own CPU", leaving 2 for the OS to make sure there's no pollution of the experiment. In that context, pondering could/should be on, right?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Long TC matches with Houdini 3 Beta
The server is a quad AMD Opteron 6274 box running Windows 2008 R2.
It has 32 Bulldozer modules with 64 cores, but I consider this a 32-core server with hyper-threading and only use 1 core of each Bulldozer module. You'll note this on the CPU usage screen shot.
I don't use each second core because this would add unpredictability. If two cores of a Bulldozer module are used, they run at about 65% of the speed when only a single core is used. That complicates the testing, as there is no guarantee that all threads run at the same speed all the time.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Long TC matches with Houdini 3 Beta
Houdini wrote: The second match just finished.
Final result of the Houdini 3 - Stockfish 2.3.1 match: +44 -16 =60
74-46 (+82 Elo ± 42 Elo).
Download Games
Final result of the Houdini 3 - Houdini 2.0c match: +48 -15 =57
76.5-43.5 (+94 Elo ± 42 Elo).
Download Games
The results of both matches have very large confidence intervals, please consider these when discussing results.
My impression is that Houdini 3 had a very favorable run against Houdini 2.0c and a slightly unfavorable run against Stockfish 2.3.1.
Neither engine scored 40% against Houdini 3, let's see whether Komodo 5 fares any better.
Houdini 3 - Komodo 5 match has now started - no games finished yet.
Hi Robert,
The results so far are pretty impressive. It looks like you have a nice gain here.
I think you have to run some long tests like you are doing here in order to really know that you have improved the program. I have noticed that a lot of programs are coming out with new versions that have impressive ELO gains until they are tested at "real" time controls. It's almost certainly a by-product of the fact that you are forced to test this fast to resolve small ELO improvements. It's more and more difficult to get big ELO improvements from a single change.
What makes your results impressive, ignoring the large error margin of course, is that at long time controls the relative ELO difference between programs tends to close up significantly.
Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 76
- Joined: Sat Mar 03, 2012 7:53 pm
Re: Long TC matches with Houdini 3 Beta
I would expect memory to be a huge bottleneck here.
How different engines handle memory access may influence results more than on normal machines.
Or maybe each CPU gets its own memory module and bus? Do you have any info on that?
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Long TC matches with Houdini 3 Beta
M ANSARI wrote: I have been out for a while, did Komodo 5 get MP going?
Not yet, but hopefully soon.
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Long TC matches with Houdini 3 Beta
Don wrote: Hi Robert,
The results so far are pretty impressive. It looks like you have a nice gain here.
I think you have to run some long tests like you are doing here in order to really know that you have improved the program. I have noticed that a lot of programs are coming out with new versions that have impressive ELO gains until they are tested at "real" time controls. It's almost certainly a by-product of the fact that you are forced to test this fast to resolve small ELO improvements. It's more and more difficult to get big ELO improvements from a single change.
What makes your results impressive, ignoring the large error margin of course, is that at long time controls the relative ELO difference between programs tends to close up significantly.
Don
Don, thank you, we both know how hard work every Elo point gain is.
Before this run my slowest test match with Houdini 3 was at 2'+2" so I'm very happy that at about 30 times longer TC the gain is still significant.
Hopefully you'll catch up with Komodo, it's more fun for everyone if there's a good competition at the top.
Robert
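As a rough check on the "about 30 times longer" figure, the thinking time per side can be estimated for both time controls. This is only a sketch: it assumes 2'+2" means 2 minutes plus 2 seconds per move, that 90+30 means 90 minutes plus 30 seconds per move, and that a game lasts 60 moves per side (the game length is an assumption, not something stated in the thread). The helper name `time_per_side` is hypothetical.

```python
def time_per_side(base_seconds: float, increment_seconds: float, moves: int = 60) -> float:
    """Total thinking time one side gets over `moves` moves (base + increments)."""
    return base_seconds + increment_seconds * moves


# 2'+2": 2 minutes base, 2 seconds increment per move.
fast = time_per_side(2 * 60, 2)
# 90+30: 90 minutes base, 30 seconds increment per move (assumed reading).
slow = time_per_side(90 * 60, 30)

ratio = slow / fast
print(f"{fast:.0f} s vs {slow:.0f} s per side, ratio {ratio:.1f}")
```

With a different assumed game length the ratio shifts somewhat, since the increment makes up a different share of the total time in each time control.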
-
- Posts: 766
- Joined: Sun Oct 16, 2011 11:25 am
Re: Long TC matches with Houdini 3 Beta
Houdini wrote: Don, thank you, we both know how hard work every Elo point gain is.
Before this run my slowest test match with Houdini 3 was at 2'+2" so I'm very happy that at about 30 times longer TC the gain is still significant.
Hopefully you'll catch up with Komodo, it's more fun for everyone if there's a good competition at the top.
Robert
Hi Robert, I read somewhere that you added some knowledge to H3, but I would like to ask whether you also improved the search. Will H3 have a different way of choosing moves? You said some time ago that H3 in normal mode wasn't better than H2 at solving tactical puzzles, so I wonder whether H3 is generally better than H2 in tactics; as I understood it, it will certainly be better in positional play.
P.S. Any news about the H3-K5 match?
Thank you in advance.
Best Regards
MM
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Long TC matches with Houdini 3 Beta
Houdini wrote: Don, thank you, we both know how hard work every Elo point gain is.
Before this run my slowest test match with Houdini 3 was at 2'+2" so I'm very happy that at about 30 times longer TC the gain is still significant.
Hopefully you'll catch up with Komodo, it's more fun for everyone if there's a good competition at the top.
Robert
Actually, I would not want everyone to lie down and die. I want the competition, and I doubt Komodo would be very strong if everything had stagnated 5 years ago.
So yes, we are trying to catch Houdini - our current dev version is almost certainly better than Houdini 1.5 but it's difficult to catch a moving target so please sit still for a minute or two.
Here are some results based on my distributed tester, where volunteers use their machines to help me test at much longer time controls (the time controls are adjusted to the hardware where the stated time control represents a very fast overclocked machine.) In these tests Komodo never plays other versions of itself.
60+1
Rank  Name                        Elo     +     -  games  score   oppo.  draws
1     Komodo 4485.00 64 bit    3032.5   6.8   6.8   8867  56.7%  2972.4  42.6%
2     Komodo 4476.04 64 bit    3022.3   6.3   6.3  10206  54.6%  2982.3  44.1%
3     Komodo 4483.00 64 bit    3021.5   6.3   6.3  10181  54.7%  2979.3  44.0%
4     Komodo 4485.19 64 bit    3020.2   6.8   6.8   8848  54.2%  2983.2  44.8%
5     Komodo 4477.45 64 bit    3019.6   6.8   6.8   8714  54.8%  2977.5  44.5%
6     Komodo 4487.06 64 bit    3019.3   6.4   6.4   9859  54.1%  2983.2  43.9%
7     Houdini 1.5a x64         3018.0   3.7   3.7  30090  50.6%  3012.1  41.9%
8     Komodo 4481.02 64 bit    3017.7   6.9   6.9   8555  54.5%  2978.3  44.5%
9     Komodo 4482.02 64 bit    3017.3  11.0  11.0   3358  54.0%  2981.2  44.3%
10    Komodo 4481.00 64 bit    3017.0   5.8   5.8  12144  53.8%  2983.2  43.9%
11    Komodo 4477.08 64 bit    3015.6   9.0   9.0   5051  54.7%  2974.5  42.6%
12    Komodo 4479.00 64 bit    3014.6  11.0  11.0   3349  53.9%  2979.5  44.3%
13    Komodo 4477.15 64 bit    3013.8   5.4   5.4  13705  54.6%  2973.8  44.7%
14    Critter 1.4 64-bit SSE4  3000.0   3.0   3.0  44334  48.6%  3011.8  45.8%
15    Stockfish 2.2.2 JA       2931.7   3.0   3.0  44317  40.3%  3017.8  44.6%
90+1
Rank  Name                        Elo     +     -  games  score   oppo.  draws
1     Komodo 4471.02 64 bit    3060.9  12.9  12.9   2530  57.9%  2990.0  35.4%
2     Komodo 4467.01 64 bit    3027.6   8.7   8.7   5300  54.4%  2990.0  42.0%
3     Houdini 1.5a x64         3025.2   7.2   7.2   7884  49.7%  3027.6  39.1%
4     Komodo 4468.00 64 bit    3024.8   8.7   8.7   5298  54.1%  2990.1  42.4%
5     Komodo 4471.01 64 bit    3021.6   8.6   8.6   5321  53.7%  2990.0  43.5%
6     Komodo 5 64 bit dev      3020.7   8.7   8.7   5313  53.7%  2990.1  43.1%
7     Critter 1.4 64-bit SSE4  3000.0   7.1   7.1   7957  46.7%  3027.6  44.4%
8     Stockfish 2.2.2 JA       2945.0   7.1   7.1   7921  40.4%  3027.6  42.4%
120+2
Rank  Name                        Elo     +     -  games  score   oppo.  draws
1     Komodo 4467.01 64 bit    3036.1   8.5   8.5   5594  55.0%  2992.1  43.9%
2     Houdini 1.5a x64         3030.6   6.0   6.0  11500  50.4%  3027.7  42.1%
3     Komodo 4463.00 64 bit    3029.7   6.1   6.1  10939  54.3%  2992.3  44.6%
4     Komodo 4466.02 64 bit    3027.1   7.5   7.5   7127  53.9%  2992.2  45.0%
5     Komodo 5 64 bit dev      3021.9   6.1   6.1  10906  53.4%  2992.2  44.4%
6     Critter 1.4 64-bit SSE4  3000.0   5.9   5.9  11530  46.8%  3027.7  45.5%
7     Stockfish 2.2.2 JA       2946.2   5.9   5.9  11536  40.7%  3027.7  45.9%
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 766
- Joined: Sun Oct 16, 2011 11:25 am
Re: Long TC matches with Houdini 3 Beta
http://rybkaforum.net/cgi-bin/rybkaforu ... 25731;pg=3
Houdini 3 - Komodo 5 match: +43 -19 =58
72-48 (+68 Elo ± 42 Elo).
Houdini 3 - Stockfish 2.3.1 match: +44 -16 =60
74-46 (+82 Elo ± 42 Elo).
Houdini 3 - Houdini 2.0c match: +48 -15 =57
76.5-43.5 (+94 Elo ± 42 Elo).
Best Regards
MM
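For readers who want to see how an Elo difference and a rough error margin follow from a win/loss/draw count, here is a minimal sketch. It assumes the standard logistic Elo formula and a normal approximation on the per-game score; the thread does not say which tool produced the posted figures, so the output should be close to, but not necessarily identical to, the numbers above. The function names `elo_from_score` and `match_elo` are hypothetical.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Assumption of this sketch: the interval comes from a normal approximation
    on the per-game score (win = 1, draw = 0.5, loss = 0), with z = 1.96 for
    roughly 95% coverage. It breaks down for scores at or near 0% or 100%.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Variance of the per-game score around its mean.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))


# Match results as posted in the thread: (opponent, wins, losses, draws).
for name, w, l, d in [("Komodo 5", 43, 19, 58),
                      ("Stockfish 2.3.1", 44, 16, 60),
                      ("Houdini 2.0c", 48, 15, 57)]:
    elo, low, high = match_elo(w, l, d)
    print(f"Houdini 3 - {name}: {elo:+.0f} Elo (interval {low:+.0f} to {high:+.0f})")
```

With only 120 games per match the interval spans several dozen Elo on either side, which is why the posted results come with a warning about large confidence intervals.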