Hello everyone!
How would you expect the current number one (Houdini 3) to perform against the number one engine in seven years' time, in 2019? Perhaps a decent defeat (a score around 20%) or perhaps worse? It is of course impossible to predict, but considering the strength of Houdini 3 it’s simply hard to imagine that the current number one would be totally outplayed. However, if you look at a small test I have made, it might just come true…
Let’s go seven years back in time. In December 2005 (before the Rybka era) engines like Fritz 9, Shredder 9, Hiarcs 10 and Fruit 2.2.1 were fighting for the top spot in the rating lists. It was a tight struggle, but Fruit was number one in at least two rating lists from that time:
http://www.stmintz.com/ccc/index.php?id=485608
http://www.computerschach.de/index.php? ... Itemid=248
Seven years ago I was also testing a lot, and I was running my own rating list (the PEJ Ratinglist) before I had the chance to join Klaus Wlotzka and make an MP version of the CSS Ratinglist. In this period I tested thousands of games, and I must confess that Fruit was one of my favourite engines (which made my disappointment all the bigger when a multicore version was never released). In my opinion Fruit was (and still is) a strong all-round engine, and I was impressed that it could compete with its strong German rivals (Fritz 9 and Shredder 9). If someone had asked me exactly seven years ago how my favourite engine Fruit would perform against the number one seven years later, I think I would have predicted a score of 20 or maybe even 25 percent…
I have had a very long break from testing, but once or twice a month I have always checked the Talkchess forum and some of the leading rating lists. When I discovered that Houdini 3 had made a huge step forward I couldn’t help buying this extraordinary engine, and soon I decided to run a match between the past (Fruit 2.2.1) and the present (Houdini 3), based on the way Klaus Wlotzka ran his great rating list. The tests were made without opening books, using 10 carefully selected opening positions that each engine had to play with both white and black (20 games altogether in one match). Here you can see the fixed opening positions:
http://www.computerschach.de/index.php? ... Itemid=169
And here are the exact test conditions:
Operating system: Windows 7 64-bit
CPU: Intel Core i7-2670QM 2,20 GHz (each engine using one core)
GUI: Fritz 13
Time control: 10+10 (10 minutes for the whole game plus 10 seconds for each move)
Ponder = ON
Tablebases: 3+4 pieces (8 MB cache)
Hash tables: 1024 MB for each engine
Books: No books allowed, engines play on their own from the starting point of each opening position
Before starting the match I checked a couple of rating lists to see what I should expect regarding the final result, and the prediction was a clear victory like 18,5 - 1,5 or even 19 - 1 in favour of Houdini. It was (and still is) hard for me to believe that good old Fruit was about to face not only a defeat but a true massacre.
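For readers who want to see how a rating gap turns into a predicted match score like the one above, here is a minimal Python sketch. It assumes the standard logistic Elo expectation formula; the rating gaps in the loop are hypothetical values chosen for illustration, not figures taken from any of the rating lists mentioned.

```python
def expected_score(elo_diff):
    """Expected score (0..1) of the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


GAMES = 20  # 10 opening positions, each played with both colours

# Hypothetical rating gaps, for illustration only (not read from a rating list)
for gap in (400, 450, 500):
    points = expected_score(gap) * GAMES
    print(f"gap {gap} Elo -> about {points:.1f} points out of {GAMES}")
```

Under this model a predicted 18,5 - 1,5 corresponds to a gap somewhere between 400 and 450 Elo.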
So how did the match go between the past and the present? Well, first of all I must say that I really enjoyed following these games. Again and again Houdini surprised and amazed both Fruit and me, and very soon I realized that the difference between the two engines was bigger than I expected. In fact I had the feeling that there were more than seven years between them! In a typical middlegame position Fruit would evaluate the position at something like minus 0,6 or 0,7, while Houdini had an evaluation around 1,5 or even more. And time and again Houdini proved that its evaluation was correct and won game after game. The final result was:
Houdini 3 – Fruit 2.2.1 19,5 - 0,5
In fact it was a small miracle that Fruit achieved a draw. After 19 games Houdini was leading 19 to 0, but then in the 20th and final game Fruit managed to secure a hard-fought draw! The total massacre was avoided, but the match proved to someone like me (who hasn’t been testing for several years) that chess engines have gained hugely in strength over the last seven years.
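The same logistic Elo model can be run backwards to see what rating gap a result like 19,5 - 0,5 would imply. This is only a rough sketch: the function name is made up for illustration, and with just 20 games and a single draw the estimate is very uncertain.

```python
import math


def elo_diff_from_score(points, games):
    """Rating gap implied by a match score under the logistic Elo model."""
    score = points / games
    if not 0.0 < score < 1.0:
        # A clean sweep gives no finite estimate
        raise ValueError("a 0% or 100% score implies an unbounded gap")
    return 400.0 * math.log10(score / (1.0 - score))


implied = elo_diff_from_score(19.5, 20)    # the actual match result
predicted = elo_diff_from_score(18.5, 20)  # the pre-match prediction
print(f"19.5/20 -> about {implied:.0f} Elo, 18.5/20 -> about {predicted:.0f} Elo")
```

A single extra draw or loss would move the implied gap a lot, so a 20-game match says more about the order of magnitude than about the exact difference.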
I wonder if this development can continue at the same speed over the next seven years? Will the number one engine in December 2019 be able to humiliate Houdini 3 the way this engine humiliated Fruit 2.2.1? I really don’t know, but I do know that I will run this match in December 2019 and publish it here in the Talkchess forum!
Best regards
Per
A match between the past (Fruit) and the present (Houdini 3)
Moderator: Ras
-
- Posts: 12
- Joined: Thu Dec 13, 2012 10:31 am
- Location: Odense
-
- Posts: 3241
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: A match between the past (Fruit) and the present (Houdin
Yarget wrote: Let’s go seven years back in time. In December 2005 (it was before the Rybka-era) engines like Fritz 9, Shredder 9, Hiarcs 10 and Fruit 2.2.1 were fighting for the top spot in the rating lists. A tight struggle but with Fruit as the number one in at least two rating lists from that time:
http://www.stmintz.com/ccc/index.php?id=485608
http://www.computerschach.de/index.php? ... Itemid=248
The best version of Fruit at that time is Fruit 05/11/03, the last "beta" version by Fabien Letouzey. This version is stronger than Fruit 2.2.1, and even stronger than many of Ryan Benitez's private versions released after:
http://www.computerchess.org.uk/ccrl/40 ... t_all.html
But, as always, rating lists are lagging behind. **Especially** SSDF, which got updated once every year or something (I can't remember, but I remember it took forever, so I guess by now, everybody has lost interest in SSDF).
Anyway, seen from Houdini 3's perspective, Fruit 2.2.1, Fruit 05/11/03, Toga II, etc. are all newbie opponents...
Last edited by lucasart on Sat Dec 29, 2012 4:33 am, edited 1 time in total.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 44636
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: A match between the past (Fruit) and the present (Houdin
lucasart wrote: The best version of Fruit at that time is Fruit 05/11/03, the last "beta" version by Fabien Letouzey. This version is stronger than Fruit 2.2.1, and even stronger than Fruit 2.3.1 by Ryan Benitez:
http://www.computerchess.org.uk/ccrl/40 ... t_all.html....
Not in our 40/40 testing, it isn't. Fruit 2.3.1 is the strongest. And yes, I know that error margins have to be taken into account, but that works both ways.
Code: Select all
CCRL 40/40 Rating List - Custom engine selection
427978 games played by 1259 programs, run by 18 testers
Ponder off, General books (up to 12 moves), 3-4-5 piece EGTB
Time control: Equivalent to 40 moves in 40 minutes on Athlon 64 X2 4600+ (2.4 GHz)
Computed on December 22, 2012 with Bayeselo based on 427'978 games
Tested by CCRL team, 2005-2012, http://computerchess.org.uk/ccrl/4040/
Rank Engine          Elo    +    -   Score   AvOp  Games
   1 Fruit 2.3.1    2798  +13  -13   50.3%   -2.3   1737
     Fruit 051103   2781  +19  -19   49.9%   -1.5    761
     Fruit 2.2.1    2749  +12  -12   52.3%  -16.6   2037
gbanksnz at gmail.com
-
- Posts: 2127
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
A match between the past (Fruit) and the present (Houdini 3)
Hello:
lucasart wrote: The best version of Fruit at that time is Fruit 05/11/03, the last "beta" version by Fabien Letouzey. This version is stronger than Fruit 2.2.1, and even stronger than Fruit 2.3.1 by Ryan Benitez:
http://www.computerchess.org.uk/ccrl/40 ... t_all.html....
Graham Banks wrote: Not in our 40/40 testing, it isn't. Fruit 2.3.1 is the strongest. And yes, I know that error margins have to be taken into account, but that works both ways.
Code: Select all
CCRL 40/40 Rating List - Custom engine selection
427978 games played by 1259 programs, run by 18 testers
Ponder off, General books (up to 12 moves), 3-4-5 piece EGTB
Time control: Equivalent to 40 moves in 40 minutes on Athlon 64 X2 4600+ (2.4 GHz)
Computed on December 22, 2012 with Bayeselo based on 427'978 games
Tested by CCRL team, 2005-2012, http://computerchess.org.uk/ccrl/4040/
Rank Engine          Elo    +    -   Score   AvOp  Games
   1 Fruit 2.3.1    2798  +13  -13   50.3%   -2.3   1737
     Fruit 051103   2781  +19  -19   49.9%   -1.5    761
     Fruit 2.2.1    2749  +12  -12   52.3%  -16.6   2037
I just took a glance at your CCRL 40/40 data, and the difference between two engines, including their error bars, can be modelled as a Normal Difference Distribution if I am not wrong... the new mean is the rating difference while the new standard deviation is the SRSS (Square Root of Squares Sum):
Code: Select all
2798 - 2781 ± sqrt(|13|² + |19|²) = 17 ± sqrt(530) ~ 17 ± 23.02 ~ 17 ± 23 ~ [-6, 40]
Taking a look at the LOS, it gives 96% for Fruit 2.3.1 against the 051103 version, which obviously has 4%.
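The error-bar arithmetic above can be sketched in a few lines of Python. The first function just reproduces the square-root-of-sum-of-squares step. The second is a rough likelihood-of-superiority estimate that rests on two assumptions of mine: that the listed + / - values are roughly 95% bounds (about 1.96 standard deviations), and that the two ratings are independent. Both function names are made up for illustration, and the result need not reproduce the LOS value published by the rating list.

```python
import math


def rating_difference(elo_a, err_a, elo_b, err_b):
    """Return (difference, combined error) for two ratings treated as independent."""
    diff = elo_a - elo_b
    combined = math.sqrt(err_a ** 2 + err_b ** 2)
    return diff, combined


def approx_los(diff, combined_error, z=1.96):
    """Rough P(A is stronger than B), assuming normal errors.

    Assumption: combined_error spans z standard deviations (z=1.96 ~ 95%).
    """
    sigma = combined_error / z
    return 0.5 * (1.0 + math.erf(diff / (sigma * math.sqrt(2.0))))


# Fruit 2.3.1 (2798 +/- 13) against Fruit 051103 (2781 +/- 19)
diff, err = rating_difference(2798, 13, 2781, 19)
los = approx_los(diff, err)
print(f"{diff} +/- {err:.2f} -> [{diff - err:.0f}, {diff + err:.0f}], rough LOS {los:.1%}")
```

Because this treats the two error bars as independent, it is only a sanity check on the interval, not a replacement for the LOS table of the list itself.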
------------------------
Just a thing I have noticed: with the inclusion of pgn codes, the button of underline and even the shortcut alt+u now print pgn instead of u (of underlining), which I had to enter manually on the keyboard... well, I usually use the keyboard for opening and closing tags, but I say it for other people on the forum.
Regards from Spain.
Ajedrecista.
Last edited by Ajedrecista on Sat Dec 29, 2012 1:08 pm, edited 2 times in total.
-
- Posts: 12
- Joined: Thu Dec 13, 2012 10:31 am
- Location: Odense
Re: A match between the past (Fruit) and the present (Houdin
The best version of Fruit at that time is Fruit 05/11/03, the last "beta" version by Fabien Letouzey.
Not in our 40/40 testing, it isn't. Fruit 2.3.1 is the strongest.
True, I remember that after the release of Fruit 2.2.1 quite a lot of Fruit versions appeared (Fruit 2.3.1, Fruit 05/11/03, Fruit 2.3.4n, Fruit 2.3.5m and several more), and some of them are stronger than the original version 2.2.1. However, my aim was not to test Houdini 3 against the strongest Fruit version but to run a match between the current number one and the engine that was number one exactly seven years ago (and in December 2005 Fruit 2.2.1 was the number one, although very close to Hiarcs 10 and Fritz 9).
Naturally I agree with you, Lucas: no matter which Fruit version Houdini faces, the result will be the same. I'm still amazed by the brutal way Houdini crushed Fruit, and I still wonder if the engine that is leading the rating lists in December 2019 will be able to humiliate Houdini 3 the way this engine treated Fruit 2.2.1. Quite frankly I don't think so, but this would probably also have been my answer if I had been asked the same question seven years ago!
Best regards
Per
-
- Posts: 12
- Joined: Thu Dec 13, 2012 10:31 am
- Location: Odense
Re: A match between the past (Fruit) and the present (Houdin
Oops, sorry, I made a small mistake. The first two lines in my recent post were quotes from Lucas and Graham....
-
- Posts: 5298
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: A match between the past (Fruit) and the present (Houdin
And here are all the Fruit versions from the unified list with more than 500 games :
Code: Select all
Rank  Name                           Elo      +      -  Games   Score    Oppo.
 429  Fruit 090705 64-bit        2936.26  12.69  10.35   6047  44.22%  2976.98
 450  Fruit 2.3.5m p15           2925.92  18.77  18.12   1150  49.17%  2930.33
 496  Fruit 2.4 Beta A           2908.94  15.62  15.67   1922  49.92%  2909.33
 516  Grapefruit 1.0             2902.56  13.02  10.14   4637  48.56%  2911.93
 540  Fruit 090705               2892.31  15.28  15.35   2303  43.75%  2937.54
 599  Fruit 2.3.3f Beta          2872.45  13.07  12.99   2900  48.22%  2883.26
 626  Fruit 2.3.3j Beta          2863.41  19.87  20.14    959  46.98%  2882.12
 640  Fruit 2.3 Lac              2858.89  21.33  21.36    895  47.71%  2873.19
 668  Fruit 2.3.1                2850.58  14.87  14.83   2615  50.08%  2849.18
 694  Fruit 05/11/03             2844.69  12.75  10.35   5459  44.06%  2884.99
 703  Fruit 2.3 agg              2841.23  23.02  22.55    762  48.43%  2850.52
 773  Fruit 2.2.1                2813.95  11.15  11.75  12957  55.51%  2774.81
 803  Fruit 2.2 Uri              2802.91  26.30  25.70    578  64.97%  2702.23
 843  Fruit 1.0 Gambit Beta 4bx  2787.90  19.77  19.50   1106  50.27%  2788.19
 931  Fruit 2.1                  2754.06  12.35  11.08   5377  55.18%  2718.58
1308  Fruit 2.0                  2656.34  17.00  16.29   1841  48.94%  2664.16
-
- Posts: 8514
- Joined: Thu Mar 09, 2006 3:25 am
- Location: Jerusalem Israel
Re: A match between the past (Fruit) and the present (Houdin
When it knows the job, it knows the job.
Rybka 1 was way above everything that came before it.
Rybka 2 did almost the same thing again, and then Rybka 3 made Rybka 2 look like a joke, as Rybka 4.1 did to Rybka 3.
Then Houdini 1.5a basically toppled Rybka 4.1 convincingly, and Houdini 3 is a little stronger than Houdini 1.5a.
I would have expected Houdini 3 to beat anything that came before Rybka 1 in every game. At worst, perhaps a very rare draw from the best of them.
Even if there were a few draws in a hundred games, I would expect it to take many more games before there was a single win. (This has always been my speculation, because Houdini ought to know its job vs Fruit, whilst a draw might happen if Fruit did not make a bad enough mistake, and Houdini didn't provoke it strongly enough to make one.)
Besides, I thought there was something better than Fruit before Rybka came out. Maybe Toga, or something similar or even better than that?
-
- Posts: 3241
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: A match between the past (Fruit) and the present (Houdin
S.Taylor wrote: Besides, I thought there was something which was better than Fruit, before Rybka came out. Maybe Toga? or something similar or even better than that ?
Versions of Toga that were stronger than Fruit 05/11/03 eventually appeared. But that was later. I still say that, as of December 2005 (which is the context of this thread), Fruit 05/11/03 was the strongest.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: A match between the past (Fruit) and the present (Houdin
S.Taylor wrote: Besides, I thought there was something which was better than Fruit, before Rybka came out. Maybe Toga? or something similar or even better than that ?
lucasart wrote: Versions of Toga that were stronger than Fruit 05/11/03, eventually appeared. But that was later. I still say that, as of december 2005 (which is the context of this thread), Fruit 05/11/03 was the strongest.
Fruit 05/11/03 was the strongest Fruit that existed in December 2005. But it does not appear to have been available until 2007. Fruit 2.2.1 was the strongest version available to the community in December 2005.