New release Revenge 3.0
Moderator: Ras
-
Krzysztof Grzelak
- Posts: 1588
- Joined: Tue Jul 15, 2014 12:47 pm
Re: New release Revenge 3.0
Fabio Gobbato wrote: ↑Sat Jul 02, 2022 1:25 pm Of course testing at a longer time control is better than testing at a short time control. The problem is that with the hardware we have today, playing 10000 games or more to get a small error margin at 5:00+3 or 15:00+5 takes a lot of time, even if you have a high-end CPU. Developers have to find a good compromise between testing more things and using a suitable TC.
This margin will always be there. It's just like running at the Olympics: everyone knows what's going on, but the best always wins.
-
Chessqueen
- Posts: 5685
- Joined: Wed Sep 05, 2018 2:16 am
- Location: Moving
- Full name: Jorge Picado
Re: New release Revenge 3.0
Fabio Gobbato wrote: ↑Sat Jul 02, 2022 1:25 pm Of course testing at a longer time control is better than testing at a short time control. The problem is that with the hardware we have today, playing 10000 games or more to get a small error margin at 5:00+3 or 15:00+5 takes a lot of time, even if you have a high-end CPU. Developers have to find a good compromise between testing more things and using a suitable TC.
Krzysztof Grzelak wrote: ↑Sat Jul 02, 2022 1:31 pm This margin will always be there. It's just like running at the Olympics: everyone knows what's going on, but the best always wins.
In statistics you can always take a small sample, say 250 games. The standard error depends on the sample size: most of the time, larger samples produce smaller standard errors (here at time controls in the vicinity of 5:00+3), which gives you a better estimate with higher precision. But even if you take a small sample, you can proportionally predict the outcome.
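The cost side of the trade-off Fabio describes in the quote can be put in rough numbers. This is only a back-of-the-envelope sketch: the function name `match_hours`, the 60 moves per side and the 8 parallel games are assumptions for illustration, not figures from this thread, and it assumes both sides use all of their time.

```python
def match_hours(games, base_s, inc_s, moves_per_side=60, concurrency=1):
    """Rough upper bound on the wall-clock hours needed for a match.

    Assumption (hypothetical): each side plays `moves_per_side` moves and
    uses its whole base time plus every increment.
    """
    seconds_per_game = 2 * (base_s + inc_s * moves_per_side)
    return games * seconds_per_game / concurrency / 3600


# Compare a small sample with a large one at the two time controls mentioned.
for games in (250, 10000):
    for label, base_s, inc_s in (("5:00+3", 300, 3), ("15:00+5", 900, 5)):
        hours = match_hours(games, base_s, inc_s, concurrency=8)
        print(f"{games:>6} games at {label:>7}: roughly {hours:,.0f} h with 8 games in parallel")
```

Real games often end earlier, so the true cost would be lower, but the ratio between the 250-game and the 10000-game run stays the same.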
-
Wolfgang
- Posts: 989
- Joined: Sat May 13, 2006 1:08 am
Re: New release Revenge 3.0
Krzysztof Grzelak wrote: ↑Sat Jul 02, 2022 1:08 pm Disagreement. And he doesn't write nonsense Wolfgang.
He? I never wrote that! Not he, YOU!
Krzysztof Grzelak wrote: You think silly.
No.
Krzysztof Grzelak wrote: I prefer to have games for longer times than for shorter times.
I don't care what you prefer. Simple solution: test it yourself.
Krzysztof Grzelak wrote: Unfortunately, you also have no tests for a long time for games.
Sorry to say, but this is completely ridiculous. Inform yourself instead of presenting fake news!
We, CEGT, have lists with
40/4 (40 moves / 4 minutes, no Ponder)
5'+3" (with Ponder)
40/20 (40 moves / 20 minutes, no Ponder)
25'+8" (no Ponder)
on fast hardware (our testers page with hardware infos, http://www.cegt.net/testers/testers.html, is outdated, will be revised shortly)
Revenge 3.0 will be in these lists soon.
Once again: inform yourself first, then post here!
EoD
I don't want to feed the troll any further!
-
pohl4711
- Posts: 2843
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: New release Revenge 3.0
Wolfgang wrote: ↑Sat Jul 02, 2022 12:08 pm Do you really believe what you say/write?? LOL (for you: laughing out loud) What you write is simply wrong, not to say nonsense. With longer TC the difference between the programs may become smaller, but the sequence remains the same. Maybe not every single place in a rating list, but no program is e.g. No. 3 with short TC and suddenly No. 10 or so with longer TC. That's simply not true!
And this is clearly proven in the FGRL rating lists:
http://www.fastgm.de/
There the rating list with the longest time control is 60min+15secs!
Testing with long time controls is indeed a double disadvantage:
(a) you get fewer games, which leads to wider error bars, and
(b) you get more draws, which shrinks the Elo distances between the engines in the rating list. That means you need many more games to get results outside the error bars, even if the error bars did not get wider because of (a).
So long thinking times are doubly bad for testing (and for development).
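Points (a) and (b) can be sketched with a simple win/draw/loss model. This is a rough illustration under stated assumptions, not anybody's actual rating-list method: games are treated as independent, the normal approximation is used, the logistic Elo formula is assumed, and all function names and input numbers are hypothetical.

```python
import math


def elo_to_score(elo_diff):
    """Expected score of the stronger engine under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def per_game_variance(score, draw_rate):
    """Variance of one game result (win=1, draw=0.5, loss=0)."""
    win_rate = score - draw_rate / 2.0
    return win_rate + draw_rate / 4.0 - score ** 2


def score_margin(games, score, draw_rate, z=1.96):
    """Approximate 95% margin on the mean score after `games` games."""
    return z * math.sqrt(per_game_variance(score, draw_rate) / games)


def games_needed(elo_diff, draw_rate, z=1.96):
    """Games until the margin shrinks to the expected score edge.

    `elo_diff` must be non-zero, otherwise there is no edge to resolve.
    """
    score = elo_to_score(elo_diff)
    edge = score - 0.5
    return math.ceil(z ** 2 * per_game_variance(score, draw_rate) / edge ** 2)


# (a) fewer games give a wider margin (hypothetical 50% draw rate).
for games in (1000, 10000):
    margin = 100 * score_margin(games, 0.5, 0.5)
    print(f"{games:>6} games: margin of about {margin:.2f} score percentage points")

# (b) a smaller Elo gap needs far more games to leave the error bar.
for elo_diff in (20, 10):
    print(f"{elo_diff} Elo apart: about {games_needed(elo_diff, 0.5)} games")
```

Note that in this model a higher `draw_rate` also lowers the per-game variance, so the net effect of more draws depends on how much the Elo gap actually shrinks. The numbers are illustrative only.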
Last edited by pohl4711 on Sat Jul 02, 2022 3:03 pm, edited 1 time in total.
-
Krzysztof Grzelak
- Posts: 1588
- Joined: Tue Jul 15, 2014 12:47 pm
Re: New release Revenge 3.0
Fabio Gobbato wrote: ↑Sat Jul 02, 2022 1:25 pm Of course testing at a longer time control is better than testing at a short time control. The problem is that with the hardware we have today, playing 10000 games or more to get a small error margin at 5:00+3 or 15:00+5 takes a lot of time, even if you have a high-end CPU. Developers have to find a good compromise between testing more things and using a suitable TC.
Gentlemen (Wolfgang and pohl4711), take to heart what the author of the engine wrote. Do not write nonsense and stupid things.
-
pohl4711
- Posts: 2843
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: New release Revenge 3.0
I find it quite funny that there are people out there who truly believe that mathematical laws are an opinion and that they can see a truth beyond math. That's really insane.
Fun fact for these people: one engine tests its patches with 10sec+100ms games, followed by 60sec+600ms, single-threaded. Guess which engine this is... It starts with "S" and ends with "ish"... It seems this kind of bullet-speed testing and development does not work so badly after all. What a huge surprise...
But I am out of here. Discussing math is the perfect example of wasting time.
-
Uri Blass
- Posts: 11124
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: New release Revenge 3.0
Wolfgang wrote: ↑Sat Jul 02, 2022 12:08 pm Do you really believe what you say/write?? LOL (for you: laughing out loud) What you write is simply wrong, not to say nonsense. With longer TC the difference between the programs may become smaller, but the sequence remains the same. Maybe not every single place in a rating list, but no program is e.g. No. 3 with short TC and suddenly No. 10 or so with longer TC. That's simply not true!
In theory it is possible that the engine has a bug that causes problems only at LTC, so I think it is better if the developer also tests at a long time control before the release.
Of course the developer cannot test 10000 games at a longer time control, but even playing 300 games at 5+3 between the new version and the old version before the release of the new version, and reporting the result, is better than nothing.
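To get a feeling for what a 300-game check like this can and cannot show, here is a minimal sketch that turns a win/draw/loss count into an Elo estimate with an approximate 95% interval. The result of 90 wins, 150 draws and 60 losses is invented purely for illustration, the function name is hypothetical, and the normal approximation with independent games is assumed.

```python
import math


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Uses the logistic Elo model and a normal approximation on the mean
    score; it assumes the score interval stays strictly inside (0, 1).
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Sample variance of a single game result (1, 0.5 or 0 points).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    half_width = z * math.sqrt(variance / games)

    def to_elo(s):
        return -400.0 * math.log10(1.0 / s - 1.0)

    return to_elo(score), to_elo(score - half_width), to_elo(score + half_width)


# Hypothetical 300-game result of the new version against the old one.
elo, low, high = elo_with_margin(wins=90, draws=150, losses=60)
print(f"Elo {elo:+.1f}, approximate 95% interval {low:+.1f} to {high:+.1f}")
```

With only 300 games the interval is wide, so such a run mainly serves as a sanity check for bugs at long time control rather than as a precise measurement.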