mclane wrote:
.....
i wonder how those different results happen.

Come on Thorsten, how long have you been testing chess computers and engines? 20 years or 20 days? Different conditions, time controls, books, number of games, and so on. Look at the CCRL list: CT2007.1 has only played 34 games so far (error bar +/- 100) and is about 130 points (!) behind CT2007. Even though very few games have been played so far, I can't believe it will be significantly ahead of the older version after (many) more games.
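An error bar of +/- 100 on only 34 games is plausible, and you can sanity-check it yourself. A minimal sketch, using a normal approximation on the score fraction and ignoring draws (my own simplification, not CCRL's exact method):

```python
import math

def elo_diff(score):
    # logistic Elo model: expected score s -> Elo difference
    return -400 * math.log10(1 / score - 1)

def elo_error_bar(score, n_games, z=1.96):
    # 95% confidence interval on the score fraction,
    # converted to an Elo interval (draws ignored for simplicity)
    se = math.sqrt(score * (1 - score) / n_games)
    lo = elo_diff(max(1e-6, score - z * se))
    hi = elo_diff(min(1 - 1e-6, score + z * se))
    return lo, hi

lo, hi = elo_error_bar(0.5, 34)
print(round((hi - lo) / 2))  # roughly +/- 120 Elo for 34 games
```

With this crude model, 34 games at a 50% score gives roughly +/- 120 Elo, in the same ballpark as the +/- 100 quoted, so a 130-point gap on that sample means very little.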
mclane wrote:
i do mainly test in ARENA.

That's my favourite one too. But CT also runs fine under other GUIs; I also tested with Shredder Classic and ERT and did not run into problems.
mclane wrote:
since the name of the engine-executable of tiger2007 is similar to that of tiger2007.1, it could be possible that someone tests e.g. the old version but believes he is testing the new version.

Maybe "someone", but not experienced testers like Werner. We all know the problem with similar names, but it is quite easy to check which version is actually playing: double-click the .exe, type "uci" and press Enter, and read the name the engine reports.
mclane wrote:
as you maybe know, the CEGT list also has problems differentiating different Loop versions. So CT is not the only case where the list cannot tell versions apart; I wonder why. There is another strange thing in the CEGT list: Shredder 10.1 is behind. So we have in fact three cases where something very strange is going on. I wonder if there is a hidden parameter in the testing methods that produces wrong data or wrong interpretation.

As Werner just wrote, your posting is quite unfair, because you know very well that things like this can happen (they need not, but they can). It is ridiculous to speak of a "hidden parameter in testing". Our testing methods are well known and have not changed. And even if there were such a "hidden" thing (how could I even do that?), why would it affect only three engines and not 300?
Wolfgang