Stockfish 2.3.1 running for the IPON

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Harvey Williamson, bob

Houdini
Posts: 1471
Joined: Mon Mar 15, 2010 11:00 pm

Re: Stockfish 2.3.1 running for the IPON

Post by Houdini » Tue Sep 25, 2012 8:21 pm

lkaufman wrote:I'm pretty sure that SF 2.3.1 is not actually worse than 2.2.2, and if it finishes lower on the IPON list it will probably just be due to margin-of-error. Still, we see the same thing happening that happened with Critter and with Houdini 2.0, namely that newer versions look like big improvements in ultra-fast testing but the gains disappear once you get beyond blitz levels. Some diminution of gains with longer time limits is normal due to increased draws, we get that with Komodo, but when the gains disappear (or nearly so) with depth it indicates that the changes did not scale well. Finding improvements that help at any level is the challenge.
I have good hopes that Houdini 3 will change this trend.
I'm currently running the first long TC matches with Houdini 3 Beta, against Houdini 2.0c, Stockfish 2.3.1 and Komodo 5. Against each opponent 120 games will be played at 90 min+30 sec/move starting from 60 Noomen suite positions. The first match against Houdini 2.0c is running, current standing after 70 games is 45-25 (+27 -7 =36).
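
As a rough illustration of how to read the 45-25 (+27 -7 =36) standing, the sketch below converts a win/loss/draw record into a score, an Elo difference and a likelihood of superiority (LOS). It assumes the standard logistic Elo formula and a common normal approximation for LOS that ignores draws; the function name `match_summary` is made up for this example, and rating tools use more elaborate models.

```python
import math

def match_summary(wins, losses, draws):
    """Summarise a match from the first engine's point of view.

    Hypothetical helper for illustration only.
    """
    games = wins + losses + draws
    # A win counts 1 point, a draw half a point.
    score = (wins + 0.5 * draws) / games
    # Logistic Elo model: score = 1 / (1 + 10 ** (-diff / 400)).
    elo_diff = 400.0 * math.log10(score / (1.0 - score))
    # Normal approximation to LOS; draws carry no information here.
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return score, elo_diff, los

score, elo_diff, los = match_summary(wins=27, losses=7, draws=36)
print(f"score={score:.1%}  Elo diff={elo_diff:+.0f}  LOS={los:.2%}")
```

Under these assumptions the record corresponds to a score of about 64% and a difference of roughly +100 Elo, although 70 games is a small sample and the error bars on that figure are wide.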

More information and updates can be found in a Rybka Forum thread or on the Houdini Facebook page.

Robert

MM
Posts: 766
Joined: Sun Oct 16, 2011 9:25 am

Re: Stockfish 2.3.1 running for the IPON

Post by MM » Tue Sep 25, 2012 9:02 pm

Houdini wrote:
lkaufman wrote:I'm pretty sure that SF 2.3.1 is not actually worse than 2.2.2, and if it finishes lower on the IPON list it will probably just be due to margin-of-error. Still, we see the same thing happening that happened with Critter and with Houdini 2.0, namely that newer versions look like big improvements in ultra-fast testing but the gains disappear once you get beyond blitz levels. Some diminution of gains with longer time limits is normal due to increased draws, we get that with Komodo, but when the gains disappear (or nearly so) with depth it indicates that the changes did not scale well. Finding improvements that help at any level is the challenge.
I have good hopes that Houdini 3 will change this trend.
I'm currently running the first long TC matches with Houdini 3 Beta, against Houdini 2.0c, Stockfish 2.3.1 and Komodo 5. Against each opponent 120 games will be played at 90 min+30 sec/move starting from 60 Noomen suite positions. The first match against Houdini 2.0c is running, current standing after 70 games is 45-25 (+27 -7 =36).

More information and updates can be found in a Rybka Forum thread or on the Houdini Facebook page.

Robert
Thank you, Robert. Sorry for the off-topic question, but may I ask whether you have planned a release date for Houdini 3?

(Indeed a very impressive result by Houdini 3 beta)

Thank you
Best Regards
MM

gladius
Posts: 537
Joined: Tue Dec 12, 2006 9:10 am

Re: Stockfish 2.3.1 running for the IPON

Post by gladius » Tue Sep 25, 2012 9:27 pm

IWB wrote:
lkaufman wrote:Finding improvements that help at any level is the challenge.
Is that a petition, and if so, where can I sign it?

I believe that ultra-fast engine testing with thousands of games is reaching its limit for "normal" chess when it comes to Elo improvements that should be visible to humans!
Maybe we need a change (or extension) of paradigm in engine development, as we had a few years ago with the shift from knowledge to search ... maybe the pendulum is swinging back ...

OK, implementing knowledge is more difficult than "speeding up" the search or cutting the tree, but I would find that very interesting!

Bye
Ingo

Edit: "implementing knowledge is more difficult than "speeding up" the search" is easy to say ... no offence meant, I know that this is difficult enough!
There were quite a few "improvements" to the evaluation for 2.3.1, and unfortunately they didn't seem to help much :). There were not many search- or pruning-related changes.

I'm going to go back through the evaluation changes, and test them against other engines. Hopefully there are some good ones in there that we can keep. Would welcome help testing the changes!

Houdini
Posts: 1471
Joined: Mon Mar 15, 2010 11:00 pm

Re: Stockfish 2.3.1 running for the IPON

Post by Houdini » Tue Sep 25, 2012 9:51 pm

MM wrote:Thank you, Robert. Sorry for the off-topic question, but may I ask whether you have planned a release date for Houdini 3?
The engine development is completely finished, but there's still a lot of work left in packaging and documentation (e.g. writing the User's Guide).
Hopefully around October 10.

bupalo
Posts: 82
Joined: Fri Mar 16, 2012 1:04 pm

Re: Stockfish 2.3.1 running for the IPON

Post by bupalo » Wed Sep 26, 2012 4:35 am

Impressive. Let's hope Houdini 3 doesn't kill the development of the other engines. We are about to see a real shredder in action.

IWB
Posts: 1539
Joined: Thu Mar 09, 2006 1:02 pm

Re: Stockfish 2.3.1 running for the IPON

Post by IWB » Thu Sep 27, 2012 8:24 am

Hi all,

The run is finished, and Stockfish 2.3.1 ended with exactly the same rating as 2.2.2 in the overall rating list according to Bayeselo (but it would be 4 Elo better than 2.2.2 according to Elostat).

However, if I compute the IPON-RRRL (just the top 20) with both 2.2.2 and 2.3.1, then 2.3.1 is 7 Elo lower than 2.2.2.

IPON-RRRL:
Rank Name                 Elo    +    -  Games  Score  Oppo.  Draws
   4 Stockfish 2.2.2 JA  2972   10   10   2850    70%   2830    40%
   4 Stockfish 2.3.1 JA  2965   10   10   2850    69%   2827    41%


The LOS that 2.2.2 is better than 2.3.1 is 52%, which is not too convincing, but I will continue with Stockfish 2.2.2!
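
For readers wondering where error bars of about 10 Elo come from, here is a minimal sketch. It assumes the columns in the two list lines above are rank, name, Elo, plus and minus margins, games, score, average opponent rating and draw rate, and it uses a plain normal approximation with the logistic Elo curve. The helper name `elo_margin_95` is invented for this example, and this is not the calculation the rating tools themselves perform.

```python
import math

def elo_margin_95(games, score, draw_rate):
    """Approximate 95% Elo error margin from a game count and result mix.

    Hypothetical helper; treats every game as an independent trial.
    """
    # Recover the win rate from score = wins + draws / 2.
    win_rate = score - 0.5 * draw_rate
    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = win_rate + 0.25 * draw_rate - score ** 2
    se_score = math.sqrt(variance / games)
    # Slope of the logistic Elo curve at this score (Elo per unit of score).
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return 1.96 * se_score * slope

for name, games, score, draw_rate in [
    ("Stockfish 2.2.2 JA", 2850, 0.70, 0.40),
    ("Stockfish 2.3.1 JA", 2850, 0.69, 0.41),
]:
    print(f"{name}: +/- {elo_margin_95(games, score, draw_rate):.1f} Elo")
```

Both lines come out at roughly 10 Elo under these assumptions, so a 7 Elo gap between the two versions sits well within the noise of a run of this size.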

Full results here: http://www.inwoba.de.

Bye
Ingo

Jouni
Posts: 1953
Joined: Wed Mar 08, 2006 7:15 pm

Re: Stockfish 2.3.1 running for the IPON

Post by Jouni » Mon Oct 01, 2012 3:52 pm

Progress: zero (0) Elo points in 9 months :cry: It takes a really loooooong time to catch Houdini.
Jouni
