Testing New IvanHoe compiles.

Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Post by Tomcass »

TESTING NEW COMPILES OF IVANHOE
240 games

TEST 1

Quad 2.33
Gui: Fritz 12
Ponder: Off
No Triple or Robbo Bases
Book: HS Masterbook 2
Time Control: 5 min+ 1 sec.
2010-07Ivan55bPeterpan2-2 2010
4 cores
120 games

                                1            2            3            4            Total
1  Ivanhoe-B 55mPPsse3w32       *            10.5 - 9.5   10.5 - 9.5   12.0 - 8.0   33.0/60
2  Ivanhoe-B57d_11_w32(x4)Bill  9.5 - 10.5   *            10.0 - 10.0  11.0 - 9.0   30.5/60
3  Houdini 1.02 w32 4_CPU       9.5 - 10.5   10.0 - 10.0  *            10.0 - 10.0  29.5/60
4  Deep Rybka 4 w32             8.0 - 12.0   9.0 - 11.0   10.0 - 10.0  *            27.0/60



TEST 2.
I7 975
Gui: Fritz 12
Ponder: Off
No Triple or Robbo Bases
Book: HS Masterbook 2
Time Control: 5 min+ 1 sec.
4 cores and 8 threads
120 games

201007IvanPeterpanB55-1 2010

                                1            2            3            4            Total
1  Deep Rybka 4 x64             *            10.5 - 9.5   11.5 - 8.5   10.5 - 9.5   32.5/60
2  Ivanhoe-B 55mPPsse4.2(x8)    9.5 - 10.5   *            12.0 - 8.0   11.0 - 9.0   32.5/60   0.00
3  Houdini DEVEL x64 8_CPU      8.5 - 11.5   8.0 - 12.0   *            11.0 - 9.0   27.5/60   0.00
4  Ivanhoe-B57d_11_w32(x8)Bill  9.5 - 10.5   9.0 - 11.0   9.0 - 11.0   *            27.5/60   0.00

GLOBAL RESULTS:
1 Ivanhoe-B 55mPP              65.5/120
2 Deep Rybka 4                 59.5/120
3 Ivanhoe-B57d_11_w32_Bill     58.0/120
4 Houdini 1.02                 57.0/120
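
As a cross-check, the global totals can be reproduced by adding each engine's crosstable score from the two tests. This is only a sketch in Python: the dictionary names, the function name and the shortened engine labels are made up for illustration, and the scores are copied from the two crosstables above.

```python
# Points out of 60 for each engine, copied from the two crosstables.
# The shortened engine labels are illustrative, not official names.
test1 = {
    "Ivanhoe-B 55mPP": 33.0,
    "Ivanhoe-B57d_11_w32 Bill": 30.5,
    "Houdini 1.02": 29.5,
    "Deep Rybka 4": 27.0,
}
test2 = {
    "Deep Rybka 4": 32.5,
    "Ivanhoe-B 55mPP": 32.5,
    # Listed as "Houdini DEVEL x64" in Test 2; the global table
    # groups it under Houdini 1.02, so the same label is used here.
    "Houdini 1.02": 27.5,
    "Ivanhoe-B57d_11_w32 Bill": 27.5,
}

GAMES_PER_TEST = 60


def combine(*tests):
    """Sum the points per engine and sort from best to worst."""
    totals = {}
    for scores in tests:
        for engine, points in scores.items():
            totals[engine] = totals.get(engine, 0.0) + points
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = combine(test1, test2)
games = 2 * GAMES_PER_TEST
for rank, (engine, points) in enumerate(ranking, start=1):
    print(f"{rank} {engine:<26} {points:5.1f}/{games}  {100 * points / games:.2f}%")
```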

COMMENTS.

PeterPan’s compile is extremely strong on my slower computer and very strong on my faster i7.
Bill’s compile performs very well on my slower computer in a w32 environment, but not so well on the i7, because it runs as w32 there while all the other engines run as w64 (a speed handicap of about 40%).

Regards from Barcelona.

Tom.
Marc MP

Re: Testing New IvanHoe compiles.

Post by Marc MP »

Hi Tom,

Did you consider doing gauntlets against various opponents? If you have the resources and the time to do so, I would encourage you along this way; I'm a bit sceptical of the results coming from the same family of engines.

Cheers,

Marc

Re: Testing New IvanHoe compiles.

Post by Tomcass »

Hi, Marc.

Thank you for your suggestion.

I really can test new compiles against a set of opponents. In fact, I have bought all the commercial programmes available. And I have one old quad and another faster i7 975. And time to test. The reasons for not testing this way are:

1.- The number of new compiles generated is enormous. I prefer to test them against the best commercial engine today: Deep Rybka 4.

2.- When I have tested in gauntlet format against various non-Rybka opponents, with the possible exception of Stockfish 1.8.1, there is a substantial imbalance in strength and I get results such as 75%-25% or similar, which makes the testing process less exciting. (I try to watch as many games as I can.)
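
For a sense of scale, a match score can be converted into an approximate rating gap with the standard logistic Elo formula. The snippet below is a generic Python sketch, not something taken from the tests in this thread, and the function name is made up.

```python
import math


def elo_gap(score_fraction):
    """Rating gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# A few score percentages and the rating gap each one implies.
for pct in (50, 55, 65, 75):
    print(f"{pct}% -> {elo_gap(pct / 100):+.0f} Elo")
```

Under this model a 75%-25% result corresponds to a gap of roughly 190 Elo, which fits the imbalance described above.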

Anyway, probably I will test the way you suggest if there is a quiet period of new compiles in the Igorrit family.

Regards from Barcelona.

Tom.

Re: Testing New IvanHoe compiles.

Post by Tomcass »

Tomcass wrote: Anyway, probably I will test the way you suggest if there is a quiet period of new compiles in the Igorrit family.
... whoops, I obviously meant the Ippolit family. Sorry! :oops:

Tom.
David Dahlem
Posts: 900
Joined: Wed Mar 08, 2006 9:06 pm

Re: Testing New IvanHoe compiles.

Post by David Dahlem »

Tomcass wrote: Anyway, probably I will test the way you suggest if there is a quiet period of new compiles in the Igorrit family.
Not likely, there are already new compiles, T54 and T54A. :lol:

Re: Testing New IvanHoe compiles.

Post by Tomcass »

David Dahlem wrote:
Tomcass wrote: Anyway, probably I will test the way you suggest if there is a quiet period of new compiles in the Igorrit family.
Not likely, there are already new compiles, T54 and T54A. :lol:
With half a dozen such great compilers one cannot make any plans for the future, David! I have my two quads working 24 hours a day, 7 days a week, and I am still not able to test everything! :D

Regards,

Tom.

Re: Testing New IvanHoe compiles.

Post by Tomcass »

240 games

TEST 1


Quad 2.33
Gui: Fritz 12
Ponder: Off
No Triple or Robbo Bases
Book: HS Masterbook 2
Time Control: 5 min+ 1 sec.
4 cores
120 games

2010-07IvanAhmed+Vlad0 2010

Deep Rybka 4 w32 - IvanHoe 55mU-x32Vlad0(x4)   25.5 - 34.5   +7/=37/-16    42.50%
Deep Rybka 4 w32 - IvanHoe T63DAhmed(x4)       29.0 - 31.0   +13/=32/-15   48.33%



TEST 2

I7 975
Gui: Fritz 12
Ponder: Off
No Triple or Robbo Bases
Book: HS Masterbook 2
Time Control: 5 min+ 1 sec.
4 cores and 8 threads
120 games

201007IvanAhmed+Vlad0 2010

Deep Rybka 4 x64 - IvanHoe T63DAhmed       23.0 - 17.0   +10/=26/-4    57.50%
Deep Rybka 4 x64 - IvanHoe 57aU-x64Vlad0   20.0 - 20.0   +10/=20/-10   50.00%
Deep Rybka 4 x64 - IvanHoe 55mU-x64Vlad0   18.5 - 21.5   +6/=25/-9     46.25%

GLOBAL RESULTS:

IvanHoe 55mUVlad0      56/100    56.00%
IvanHoe 57aU64Vlad0    20/40     50.00%
Deep Rybka 4           116/240   48.33%
IvanHoe T63DAhmed      48/100    48.00%

COMMENTS.

I know that 100 games are not enough to conclude anything. But in my tests Ivan55mUVlad0 has achieved the best result so far against Deep Rybka 4 among all the Ippolit engines/compiles: +25 -13 =62 (!!!).
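
To put a rough number on how uncertain 100 games are, here is a small Python sketch that turns a +W -L =D record into a score with an approximate 95% interval. It assumes the games are independent and uses a plain normal approximation, which is a simplification; the function names are made up for illustration.

```python
import math


def score_interval(wins, draws, losses, z=1.96):
    """Score fraction with a normal-approximation confidence interval."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    margin = z * math.sqrt(variance / games)
    return score, score - margin, score + margin


def elo_gap(score_fraction):
    """Rating gap implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# IvanHoe 55mU Vlad0 against Deep Rybka 4 over both tests: +25 -13 =62.
score, low, high = score_interval(wins=25, draws=62, losses=13)
print(f"score {score:.1%}, approx. 95% interval {low:.1%} to {high:.1%}")
print(f"Elo {elo_gap(score):+.0f} (from {elo_gap(low):+.0f} to {elo_gap(high):+.0f})")
```

With these numbers the interval spans tens of Elo points and its lower end sits close to an even score, which is consistent with the caution about 100 games.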

T63 Ahmed performed better than DR4 on my slower computer, but worse on my i7 975.

Regards from Barcelona.

Tom.
Marc MP

Re: Testing New IvanHoe compiles.

Post by Marc MP »

Tomcass wrote:Hi, Marc.

Thank you for your suggestion.

I really can test new compiles against a set of opponents. In fact, I have bought all the commercial programmes available. And I have one old quad and another faster i7 975. And time to test. The reasons for not testing this way are:

1.- The number of new compiles generated is enormous. I prefer to test them against the best commercial engine today: Deep Rybka 4.

<snip>

Anyway, probably I will test the way you suggest if there is a quiet period of new compiles in the Igorrit family.

Regards from Barcelona.

Tom.
Yes, that is the problem. With so many new compiles, there is some sort of data mining going on. The compiles might be close in strength, but chances are that one of them will come up with a very strong result.

I remember Vas saying so when he used to supply quantities of new compiles in the Rybka 1 and 2 series.

Maybe skipping some (unless an obvious strength improvement is shown elsewhere) could be the right strategy.
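
The selection effect is easy to see in a toy simulation. The Python sketch below plays matches between engines of exactly equal strength and reports the average score of the best of several "compiles". The draw rate of 60%, the 100-game match length and the number of compiles are assumptions picked only to resemble the matches in this thread, and the function names are made up.

```python
import random


def simulate_match(games, draw_rate, rng):
    """Points scored in one match between two exactly equal engines."""
    points = 0.0
    for _ in range(games):
        roll = rng.random()
        if roll < draw_rate:
            points += 0.5          # draw
        elif roll < draw_rate + (1.0 - draw_rate) / 2.0:
            points += 1.0          # win; the remaining rolls are losses
    return points


def mean_best_score(compiles, games, draw_rate, trials, seed=1):
    """Average score fraction of the best of several equal compiles."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        results = [simulate_match(games, draw_rate, rng) for _ in range(compiles)]
        total += max(results) / games
    return total / trials


single = mean_best_score(compiles=1, games=100, draw_rate=0.6, trials=1000)
best_of_ten = mean_best_score(compiles=10, games=100, draw_rate=0.6, trials=1000)
print(f"one compile:      {single:.1%}")
print(f"best of ten:      {best_of_ten:.1%}")
```

Even though every simulated compile is equal to its opponent, the best of ten tends to score visibly above 50%, so the top result in a batch of compiles overstates its real strength.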

Anyway, I'm following your tourneys with interest,

Cheers,

Re: Testing New IvanHoe compiles.

Post by Tomcass »

I find your comments very constructive and useful, Marc. In fact, in this test I introduced Deep Shredder 12 and Naum 4.2 following your suggestions. On the other hand, it is difficult for me to skip a new version. I get excited testing. :wink:


TESTING IvanHoe 55mU-x32Vlad0(x4)
100 (+60) games

TEST 1
Quad 2.33
Gui: Fritz 12
Ponder: Off
No Triple or Robbo Bases
Book: HS Masterbook 2
Time Control: 5 min+ 1 sec.
4 cores
100 games (+60)

2010-07Ivan55Vlad0 2010

IvanHoe 55mU-x32Vlad0(x4) - Deep Shredder 12 UCI      14.0 - 6.0    +9/=10/-1   70.00%
IvanHoe 55mU-x32Vlad0(x4) - Houdini 1.02 w32 4_CPU    9.5 - 10.5    +2/=15/-3   47.50%
IvanHoe 55mU-x32Vlad0(x4) - FireBird 1.2 w32 new x4   11.5 - 8.5    +3/=17/-0   57.50%
IvanHoe 55mU-x32Vlad0(x4) - Igorrit 0.086v9_w32       9.0 - 11.0    +2/=14/-4   45.00%
IvanHoe 55mU-x32Vlad0(x4) - Naum 4.2 (x4)             12.5 - 7.5    +7/=11/-2   62.50%

From a previous test: 60 games.

Deep Rybka 4 w32 - IvanHoe 55mU-x32Vlad0(x4) 25.5 - 34.5 +7/=37/-16 42.50%

COMMENTS:

IvanHoe 55mU-x32Vlad0(x4) performs very well against non-Ippo engines, but not so well against Houdini 1.02 and Igorrit V.9. It seems especially well tuned against Deep Rybka 4.

Regards from Barcelona.

Tom.

Re: Testing New IvanHoe compiles.

Post by Tomcass »

TESTING Ivanhoe-B57d_whm01_w64 (WHMoveryJr)

120 GAMES

Both tests in:
i7 975
Gui: Fritz 12
Ponder: off
4 cores + 4 hyperthreads
No Robbobases, Triple or Nalimov.
Book: HS Masterbook 1.0

201007IvanBillw64-1 2010

Time control: 10 min +0

Ivanhoe-B57d_whm01_w64+PP(x8) - Deep Rybka 4 x64                15.0 - 15.0   +5/=20/-5   50.00%
Ivanhoe-B57d_whm01_w64+PP(x8) - Houdini 1.02 x64 POPCNT 8_CPU   17.0 - 13.0   +4/=26/-0   56.67%

201007IvanBillw64_b 2010

Time control: 5 min +0

Ivanhoe-B57d_whm01_w64+PP(x8) - Deep Rybka 4 x64                16.0 - 14.0   +8/=16/-6   53.33%
Ivanhoe-B57d_whm01_w64+PP(x8) - Houdini 1.02 x64 POPCNT 8_CPU   14.5 - 15.5   +4/=21/-5   48.33%

GLOBAL

Ivanhoe-B57d_whm01_w64+PP(x8) - Deep Rybka 4 x64                31.0 - 29.0
Ivanhoe-B57d_whm01_w64+PP(x8) - Houdini 1.02 x64 POPCNT 8_CPU   31.5 - 28.5
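
The global figures can be rebuilt from the +W/=D/-L tokens of the two time controls. The Python sketch below parses those tokens and adds them up; the regular expression, the function names and the dictionary layout are made up for illustration, and the result strings are copied from the tables above.

```python
import re

# Matches a token such as "+8/=16/-6" (wins, draws, losses).
WDL_PATTERN = re.compile(r"\+(\d+)/=(\d+)/-(\d+)")


def parse_wdl(text):
    """Extract (wins, draws, losses) from a result line."""
    match = WDL_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no +W/=D/-L token in {text!r}")
    wins, draws, losses = (int(group) for group in match.groups())
    return wins, draws, losses


def combine(lines):
    """Add up the W/D/L counts of several result lines."""
    totals = [0, 0, 0]
    for line in lines:
        for index, value in enumerate(parse_wdl(line)):
            totals[index] += value
    return tuple(totals)


# Results of Ivanhoe-B57d_whm01_w64+PP(x8) at 10 min and at 5 min.
results = {
    "Deep Rybka 4 x64": ["15.0 - 15.0 +5/=20/-5 50.00%", "16.0 - 14.0 +8/=16/-6 53.33%"],
    "Houdini 1.02 x64 POPCNT 8_CPU": ["17.0 - 13.0 +4/=26/-0 56.67%", "14.5 - 15.5 +4/=21/-5 48.33%"],
}

combined = {opponent: combine(lines) for opponent, lines in results.items()}
for opponent, (wins, draws, losses) in combined.items():
    games = wins + draws + losses
    points = wins + 0.5 * draws
    print(f"vs {opponent}: {points:.1f} - {games - points:.1f}  +{wins}/={draws}/-{losses}")
```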

Regards,

Tom.