Singular Extensions

bob · Post by **bob** » Mon Aug 02, 2010 10:31 pm

Don wrote:
Daniel Shawul wrote: Duh, I am _not_ comparing 1.7.1 and 1.8.
The point is that Stockfish 1.7 and 1.8 both have SE, yet their blitz and long time control ratings remain the same. If SE gave it a push we should see its benefits there too, no?
No. If SF gets the same rating at both short and long time controls, why is it you think you can pick out one thing (such as SE) and claim that this is proof that SE does not help or hurt it at long time controls?

I go with Daniel here. Most of those programs in the lists are not SE-based. If SE "picks up Elo" as the depth increases should it not widen the gap between itself and other programs below it that won't pick up that same boost since they don't have SE?

It could be (and almost certainly is the case) that some things in SF scale better than others. They have the same trouble everyone else does, it's very difficult to get a lot of games in at long time controls.

So some of the things in SF probably help the program even more at longer time controls and some things help less, or even hurt it at longer time controls.

The fact that it does not get weaker or stronger at long time controls means that on average they balance out. It doesn't mean you can pick a feature at random and say this proves that feature does not help or hurt at long time controls.

No, but if you have features that do worse and offset any potential gain from something else, just because you go deeper, should not those kinds of features be removed a.s.a.p.? Why have something that does _worse_ as hardware improves?

Ralph Stoesser · Post by **Ralph Stoesser** » Mon Aug 02, 2010 10:55 pm

Daniel Shawul wrote: 5 + 5 gives enough depth so why ask for more ??

Because 10+10 was measured (comparatively much) stronger, with an increasing tendency? Don't you believe in holy cluster test results??

But suddenly the test was stopped ... surprise, surprise.

Uri Blass · Post by **Uri Blass** » Mon Aug 02, 2010 11:11 pm

Daniel Shawul wrote:That was the only escape route available to them *, so they can take it ...
No offence, but the exaggerated SE benefit numbers are based on one's hunch so far as I can tell. CEGT and Bob's tests say otherwise.

If you want to test the effect of SE then you need to have SE as the only change between version X and version X+1.
X has been matched against X+1 at a time control that I believe no one used before releasing SE versions. See the results from Bob. Even if 60+60 is done, someone would say it does not count unless it is 200+200, and so on and so forth.

* 'SE is superb at long tc' believers

I saw Bob's results:
3 elo improvement at 5+5 (statistical noise of 4 or 5)
18 elo improvement at 10+10 (statistical noise of 8)

There may be some statistical noise but it seems that SE
is better at longer time control.
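
One way to read the two results together is to look at the gap between them rather than at each one alone. The sketch below assumes the quoted "statistical noise" figures are symmetric error margins at the same confidence level and that the two runs are independent, so the margins add in quadrature. The function name is illustrative, and 5 is taken as the upper end of the "4 or 5" quoted for 5+5.

```python
import math

def difference_with_margin(elo_a, margin_a, elo_b, margin_b):
    """Return (difference, combined margin) for two independent Elo estimates."""
    diff = elo_b - elo_a
    # Independent error margins combine as the root of the sum of squares.
    combined = math.sqrt(margin_a ** 2 + margin_b ** 2)
    return diff, combined

# 5+5: +3 Elo (margin assumed 5); 10+10: +18 Elo (margin 8)
diff, margin = difference_with_margin(3, 5, 18, 8)
print(f"gain from 5+5 to 10+10: {diff} +/- {margin:.1f} Elo")
```

Under those assumptions the gap is larger than its combined margin, though not by a wide factor, which fits the cautious wording here.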

It is possible that you get something like 35 elo at 60+60 and 45 elo at 200+200

Note that even if SE performs better at long time control I do not expect it to be linear and I expect something like diminishing returns.

I am not sure whether SE performs significantly better at long time control, and
the only way to know is by testing (I consider an improvement of 30 elo significant).

Uri

Don · Post by **Don** » Mon Aug 02, 2010 11:12 pm

Daniel Shawul wrote:
You compare 64 bits with 32 bits
I saw that but it is the best I can get by looking at the page ....

1.5.1 has a higher rating at blitz, and the same goes for most Stockfish versions,
and it proves nothing

Note that if SE gives an additional 10 elo for each doubling of the time control (possible based on Bob's result) then the difference between 40/4 and 40/20 is only 23 elo, and I think that other changes may help more at blitz and less at long time control (I believe that changes that are equivalent to a speed improvement help more at blitz).
Uri
The question is really simple.
If an engine performs more or less the same with or without SE, then what is the effect of SE? Not on the overall strength of the engine, but on how its performance changes at different tcs.

The question is simple but totally irrelevant, because we are not talking about engines where the only change is that SE was added.
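
The 23 elo figure in Uri's quoted note can be reproduced with a short calculation. This is a sketch under his stated assumption of a constant 10 elo per doubling of the time control; the function name is illustrative, and 40/4 and 40/20 are read as 40 moves in 4 and 20 minutes, a factor of 5 in time.

```python
import math

def se_gain_between(tc_short_minutes, tc_long_minutes, elo_per_doubling=10.0):
    """Extra Elo from SE when moving from the short to the long time control,
    assuming a fixed gain per doubling of thinking time."""
    doublings = math.log2(tc_long_minutes / tc_short_minutes)
    return elo_per_doubling * doublings

# 40/4 versus 40/20: log2(5) is about 2.32 doublings
print(round(se_gain_between(4, 20), 1))  # 23.2
```

A gap of that size is small next to the other differences between engine versions, which is why the rating lists alone cannot isolate it.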

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 02, 2010 11:13 pm

Don't you think maybe it is because it just takes _too_ much time? I can confirm the test results at 5 + 5 again and come up with 5 elo or so, don't dare me.

While you, on the other hand, do the 60 + 60 and show us the results with the +70 elo. I dare you.

Uri Blass · Post by **Uri Blass** » Mon Aug 02, 2010 11:20 pm

bob wrote:
Don wrote:
Daniel Shawul wrote: Duh, I am _not_ comparing 1.7.1 and 1.8.
The point is that Stockfish 1.7 and 1.8 both have SE, yet their blitz and long time control ratings remain the same. If SE gave it a push we should see its benefits there too, no?
No. If SF gets the same rating at both short and long time controls, why is it you think you can pick out one thing (such as SE) and claim that this is proof that SE does not help or hurt it at long time controls?
I go with Daniel here. Most of those programs in the lists are not SE-based. If SE "picks up Elo" as the depth increases should it not widen the gap between itself and other programs below it that won't pick up that same boost since they don't have SE?

It could be (and almost certainly is the case) that some things in SF scale better than others. They have the same trouble everyone else does, it's very difficult to get a lot of games in at long time controls.

So some of the things in SF probably help the program even more at longer time controls and some things help less, or even hurt it at longer time controls.

The fact that it does not get weaker or stronger at long time controls means that on average they balance out. It doesn't mean you can pick a feature at random and say this proves that feature does not help or hurt at long time controls.
No, but if you have features that do worse and offset any potential gain from something else, just because you go deeper, should not those kinds of features be removed a.s.a.p.? Why have something that does _worse_ as hardware improves?

The reason to have something that does worse as hardware improves is that it simply does better than previous versions at all time controls.

I believe that a simple speed improvement does worse as hardware improves because of small diminishing returns.

Being 10 times faster may give you 220 elo at 5+5 time control and only 200 elo at 10+10 time control, and there may be improvements that are equivalent to speed improvements.

Don · Post by **Don** » Mon Aug 02, 2010 11:24 pm

Ralph Stoesser wrote:
Daniel Shawul wrote: 5 + 5 gives enough depth so why ask for more ??

Because 10+10 was measured (comparatively much) stronger, with an increasing tendency? Don't you believe in holy cluster test results??

But suddenly the test was stopped ... surprise, surprise.

It was a governmental cover up.

Uri Blass · Post by **Uri Blass** » Mon Aug 02, 2010 11:25 pm

Daniel Shawul wrote: Don't you think maybe it is because it just takes _too_ much time? I can confirm the test results at 5 + 5 again and come up with 5 elo or so, don't dare me.
While you, on the other hand, do the 60 + 60 and show us the results with the +70 elo. I dare you.

assuming 20,000 games
I will be surprised to see +70 elo at 60+60 but I will not be surprised to see +35 elo at 60+60 that is significantly more than the elo improvement at 5+5 or 10+10.
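
To see what a run of that size can resolve, here is a rough sketch of the error margin as a function of game count. It assumes two near-equal engines, a normal approximation, a 95% interval, and a draw ratio of 0.3; the draw ratio and the function name are hypothetical choices, not figures from any of the tests discussed here.

```python
import math

def elo_margin(games, draw_ratio=0.3, z=1.96):
    """Approximate +/- Elo margin for a match between near-equal engines."""
    # Per-game score variance when wins and losses are equally likely.
    variance = (1.0 - draw_ratio) / 4.0
    score_margin = z * math.sqrt(variance / games)
    # Slope of the Elo curve at a 50% score: 1600 / ln(10) Elo per unit of score.
    elo_per_score = 1600.0 / math.log(10)
    return score_margin * elo_per_score

for n in (1000, 20000):
    print(n, round(elo_margin(n), 1))
```

With these assumed inputs the margin shrinks with the square root of the number of games, so telling +35 apart from the 5+5 or 10+10 gains needs far fewer games than pinning any one of them down to a couple of elo.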

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 02, 2010 11:28 pm

I won't be surprised if it doesn't get any better, or even gets worse, at longer tcs. The result could simply be explained by the fact that depthleft = 8 is a bit too high, and with that constrained form of SE the chances of extensions are too rare even for an engine averaging 21+ in the middle game. Or it could be just due to luck; also, the improvement is not that significant compared to what has been claimed. CEGT tests at 40/4 and 40/20. So if we _assume_ it improves a bit, then it will most probably fall in the 20-30 elo range max at 40/20.
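
The depth gate can be illustrated with a generic sketch of a singular extension test. This is not the code of Stockfish, Komodo, or any engine mentioned in this thread: `reduced_search` is a hypothetical caller-supplied function, `margin` (in centipawns) is an arbitrary illustrative value, and only `min_depth=8` is taken from the depthleft figure above.

```python
def singular_extension(depth_left, tt_move, tt_score, moves, reduced_search,
                       min_depth=8, margin=50):
    """Return 1 if the hash move should be extended by one ply, else 0.

    reduced_search(move, beta) stands in for a reduced-depth search of
    `move` and returns its score.
    """
    # Depth gate: below min_depth the singularity test is never run.
    if tt_move is None or depth_left < min_depth:
        return 0
    beta = tt_score - margin
    for move in moves:
        if move == tt_move:
            continue
        if reduced_search(move, beta) >= beta:
            return 0  # an alternative comes close enough: not singular
    return 1

# Toy position: one move is far better than the alternatives.
scores = {"Qh5": 120, "Nf3": 20, "d4": 10}
search = lambda move, beta: scores[move]
print(singular_extension(10, "Qh5", 120, scores, search))  # 1: extended
print(singular_extension(6, "Qh5", 120, scores, search))   # 0: too shallow
```

In this toy form, raising `min_depth` means fewer nodes ever reach the singularity test, which is the rarity effect described above.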

What do you think is the reason for believing SE would give more improvement at longer tc? Has the test been done before? No. Just a hunch.
Also, why would I believe it when extensions have always been worse at larger depth?

Don · Post by **Don** » Mon Aug 02, 2010 11:28 pm

Ralph Stoesser wrote:
Daniel Shawul wrote: 5 + 5 gives enough depth so why ask for more ??

Because 10+10 was measured (comparatively much) stronger, with an increasing tendency? Don't you believe in holy cluster test results??

But suddenly the test was stopped ... surprise, surprise.

Here is my test so far:

  RANK      ELO     +/-     Tme/Gme  Tot Gms  PLAYER
-------  -------  -----  ----------  -------  ----------------
     1    3000.0   36.4     109.456      364  komodo 1.2
     2    2974.2   36.4     108.495      364  komodo 1.2-noSing

Don
