Houdini 2.0 : Settings (Z, T3, Baracuda, Baracuda T3)

geots · Post by **geots** » Mon Jul 09, 2012 12:44 pm

Tennison wrote:I completely agree with you Robert: it's impossible to get a huge gain over the original because you have tested it a lot and chosen the best you can.

But maybe it's possible to get a small gain (up to +5/+10), but not more I think.

As an example I have another test running for the moment :

Houdini 1.5a - Houdini 1.5a T3

It's a 10000 games test ... As soon as it is finished I can give you the games if you want.

And for the moment the result is :

Houdini 1.5a : 2627 / 5334
Houdini 1.5a T3 : 2707 / 5334

As you can see, there is no significant plus, as above.
But each time I run a test with the Storm values set to 40, 50, 60, I get slightly better results than the original.

Maybe a way to follow.

Don't get me wrong, if you enjoy what you do, far be it from me to cast aspersions. But I just wondered if I am the first person to ever ask you if you had considered helping a programmer who might really need it. God knows, there are plenty who could use your "cpu time" in their testing. I imagine Robert has got things covered with Houdini.

Best,

george

Uri Blass · Post by **Uri Blass** » Mon Jul 09, 2012 12:55 pm

Houdini wrote:
Tennison wrote:T3 settings and Z settings seem (as in Sedat Rating List) a little bit better than original. A small plus to Z here, but more games are needed to confirm because it is so close!
If you want to make the claims above, you need to play more games. For the moment everything is buried under the tower of statistical uncertainty.
For T3 and Z settings there is only one valid conclusion after 1000 games: they are not significantly different from the default settings.

On the Rybka Forum I published the results obtained with "z", "s" and your "T4" settings in 16,000-game matches against the default Houdini 2.0c.
Here are the results against the standard Houdini 2.0c.
- "s" scored 7797-8203 (38% draws), -8 Elo (+/- 4 Elo)
- "z" scored 7922-8078 (39% draws), -4 Elo (+/- 4 Elo)
- "T4" scored 7914-8086 (40% draws), -4 Elo (+/- 4 Elo)
Again, the only conclusion can be that these settings change very little to the objective strength of the engine.

No

The only conclusion is that they change very little in the time control that you tested.

I read that with longer time control things are different

http://rybkaforum.net/cgi-bin/rybkaforu ... ?tid=25226

+172,=240,-113 seems convincing but we are going to have 1000 games at 5+3 blitz time control.
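
For reference, a W/D/L record like the one above can be turned into an Elo estimate with an approximate 95% interval. This is only a minimal sketch, assuming independent games, a normal approximation for the mean score and the standard logistic Elo formula. The function names are made up for illustration, and this is not necessarily how any rating tool mentioned in this thread computes its error bars.

```python
import math

def score_to_elo(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Assumption: games are independent and the mean score is roughly normal,
    so the interval is score +/- z * standard error, mapped to Elo.
    Only meaningful when score +/- z * se stays strictly inside (0, 1).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (score_to_elo(score),
            score_to_elo(score - z * se),
            score_to_elo(score + z * se))

elo, low, high = elo_with_margin(172, 240, 113)
print(f"{elo:+.1f} Elo, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

Under these assumptions the 525 games quoted above give an interval that stays above zero but is still a few tens of Elo wide.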

Houdini · Post by **Houdini** » Mon Jul 09, 2012 1:01 pm

Uri Blass wrote:No

The only conclusion is that they change very little in the time control that you tested.

I read that with longer time control things are different

http://rybkaforum.net/cgi-bin/rybkaforu ... ?tid=25226

+172,=240,-113 seems convincing but we are going to have 1000 games at 5+3 blitz time control.

We are discussing Ben's tests and my tests.
There may be different tests played with different conditions on different computers by different people leading to different results.
But that's a different thread.

Sedat Canbaz · Post by **Sedat Canbaz** » Mon Jul 09, 2012 1:15 pm

Uri Blass wrote:
Houdini wrote:
Tennison wrote:T3 settings and Z settings seem (as in Sedat Rating List) a little bit better than original. A small plus to Z here, but more games are needed to confirm because it is so close!
If you want to make the claims above, you need to play more games. For the moment everything is buried under the tower of statistical uncertainty.
For T3 and Z settings there is only one valid conclusion after 1000 games: they are not significantly different from the default settings.

On the Rybka Forum I published the results obtained with "z", "s" and your "T4" settings in 16,000-game matches against the default Houdini 2.0c.
Here are the results against the standard Houdini 2.0c.
- "s" scored 7797-8203 (38% draws), -8 Elo (+/- 4 Elo)
- "z" scored 7922-8078 (39% draws), -4 Elo (+/- 4 Elo)
- "T4" scored 7914-8086 (40% draws), -4 Elo (+/- 4 Elo)
Again, the only conclusion can be that these settings change very little to the objective strength of the engine.
No

The only conclusion is that they change very little in the time control that you tested.

I read that with longer time control things are different

http://rybkaforum.net/cgi-bin/rybkaforu ... ?tid=25226

+172,=240,-113 seems convincing but we are going to have 1000 games at 5+3 blitz time control.

Agreed...

I'd just like to add that we need to concentrate on game results with:
-Latest fast CPU machines (engines should be tested with at least 6 cores; more cores are even better)
-Popular time controls (ultra-fast time controls are not recommended)
-The strongest modern openings (the winning percentage should be min. 40% for Black and min. 50% for White)
-For reliable rating min 1000 games is required

Btw, I run such testing under the above conditions:
http://www.sedatcanbaz.com/chess/scct-rating/

Best Wishes,
Sedat

Houdini · Post by **Houdini** » Mon Jul 09, 2012 1:35 pm

Sedat Canbaz wrote:-For reliable rating min 1000 games is required

That is only correct if "reliable" means "with a 20 Elo confidence interval".

The number of games has to match the precision or Elo difference you want to measure. To measure to 10 Elo precision you need at least 4000 games.
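
The relationship between number of games and precision can be sketched roughly as follows. This is a back-of-the-envelope estimate, assuming two evenly matched engines, independent games, a normal approximation and a fixed draw ratio. The function name `games_needed` and the default 40% draw ratio are assumptions chosen for illustration, and the exact counts shift with the draw ratio and the confidence level, so it need not reproduce the figures quoted above exactly.

```python
import math

def games_needed(margin_elo, draw_ratio=0.4, z=1.96):
    """Approximate games needed for a +/- margin_elo interval near a 50% score.

    Assumptions: equal-strength engines (wins == losses), independent games,
    normal approximation, and Elo treated as linear in score around 50%.
    """
    # Per-game standard deviation of the result when wins and losses are equal:
    # variance = (1 - draw_ratio) * 0.25
    sigma = math.sqrt((1.0 - draw_ratio) / 4.0)
    # Slope of the logistic Elo curve at 50%: d(Elo)/d(score) = 400 / (ln(10) * 0.25)
    slope = 400.0 / (math.log(10) * 0.25)
    return math.ceil((z * sigma * slope / margin_elo) ** 2)

for margin in (20, 10, 5):
    print(f"+/- {margin} Elo: about {games_needed(margin)} games")
```

The key point is the square-root law: halving the error margin requires roughly four times as many games, whatever the exact constants are.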

Sedat Canbaz · Post by **Sedat Canbaz** » Mon Jul 09, 2012 1:54 pm

Houdini wrote:
Sedat Canbaz wrote:-For reliable rating min 1000 games is required
That is only correct if "reliable" means "with a 20 Elo confidence interval".

The number of games has to match the precision or Elo difference you want to measure. To measure to 10 Elo precision you need at least 4000 games.

Unfortunately so far i did not notice a such rating (including SCCT),where each participant is based on 4000 games per player

The SCCT Rating List does not include 4000 games per player, probably because I have a lot of other computer chess activities

From my experience I can say (this is my opinion about SCCT results) that the Elo difference of any engine has a very small chance of changing

I mean about SCCT results,e.g between 1000 games and 4000 games per player we can see approx. +/- 5 or 10 Elo,no more no less...

Best,
Sedat

geots · Post by **geots** » Mon Jul 09, 2012 1:57 pm

Houdini wrote:
Sedat Canbaz wrote:-For reliable rating min 1000 games is required
That is only correct if "reliable" means "with a 20 Elo confidence interval".

The number of games has to match the precision or Elo difference you want to measure. To measure to 10 Elo precision you need at least 4000 games.

Yea Robert, but the only thing left out could be the most important. How do we know that the results from 16,000- 30 second games are as valuable as the results from 4,000- 2 minute games? I mean who is supposed to be the one to decide at which time control you decide anything faster than that would lose accuracy?

Best,

george

Houdini · Post by **Houdini** » Mon Jul 09, 2012 2:12 pm

Sedat Canbaz wrote:Unfortunately so far i did not notice a such rating (including SCCT),where each participant is based on 4000 games per player

That's why most rating lists are only accurate to about 20 Elo.
And that is just the random error, we're not even talking about the systematic errors arising from hardware and opening choice.

Sedat Canbaz wrote:I mean about SCCT results,e.g after 1000 games per player we can see approx. +/- 5 or 10 Elo,no more no less...

Obviously you don't accept the significance of the "+" and "-" columns in your own rating list which show around +/- 20 Elo.
Why would your rating list somehow escape the basic laws of statistics?

Robert

Sedat Canbaz · Post by **Sedat Canbaz** » Mon Jul 09, 2012 2:27 pm

Houdini wrote:
Sedat Canbaz wrote:Unfortunately so far i did not notice a such rating (including SCCT),where each participant is based on 4000 games per player
That's why most rating lists are only accurate to about 20 Elo.
And that is just the random error, we're not even talking about the systematic errors arising from hardware and opening choice.

Sedat Canbaz wrote:I mean about SCCT results,e.g after 1000 games per player we can see approx. +/- 5 or 10 Elo,no more no less...
Obviously you don't accept the significance of the "+" and "-" columns in your own rating list which show around +/- 20 Elo.
Why would your rating list somehow escape the basic laws of statistics?

Robert

Of course... this is my view about SCCT results

Note also that my opinion is for Top 20 Engines

And once more I'd like to mention that so far I have not noticed a +/- 20 Elo or 40 Elo difference (after 1000 games per player)

With only 300-500 games per player, for example, it is possible to see such a large Elo difference

Greetings,
Sedat

Laskos · Post by **Laskos** » Mon Jul 09, 2012 2:41 pm

geots wrote:
Houdini wrote:
Sedat Canbaz wrote:-For reliable rating min 1000 games is required
That is only correct if "reliable" means "with a 20 Elo confidence interval".

The number of games has to match the precision or Elo difference you want to measure. To measure to 10 Elo precision you need at least 4000 games.

Yea Robert, but the only thing left out could be the most important. How do we know that the results from 16,000- 30 second games are as valuable as the results from 4,000- 2 minute games? I mean who is supposed to be the one to decide at which time control you decide anything faster than that would lose accuracy?

Best,

george

The most relevant are 20 games at extremely long time controls like 2hours/40moves. The errors in this case are less than 3-4 Elo points, in fact there are almost no errors.

Kai
