Marco, can you change SF time management so that it does not use the increment till after it plays its move?
mcostalba wrote: ...
Marco
See http://74.220.23.57/forum/viewtopic.php?p=484664#484664
Matthias.
This is interesting. Thanks for reporting. I will look at this issue.
Matthias Gemuh wrote: See http://74.220.23.57/forum/viewtopic.php?p=484664#484664
No problem, blitz isn't a huge effort.
mcostalba wrote: I would like also to thank Ingo, Werner and the CEGT, Ray and all the other people that are testing this release: I know I made your job a tad difficult due to the small ELO increase and the different releases. I promise, also to myself, that the next one will be better prepared.
Thanks
Marco
Hi Mathias,
Matthias Gemuh wrote: Marco, can you change SF time management so that it does not use the increment till after it plays its move?
mcostalba wrote: ...
Marco
See http://74.220.23.57/forum/viewtopic.php?p=484664#484664
Matthias.
ChessGUI has a near-perfect solution against GUI time lag.
zamar wrote: Hi Mathias,
I've played >1000 1+1' test blitz games with SF using XBoard. SF often goes really low on time (0.3 seconds), but it has never stepped over.
The current time management code doesn't use the increment before the move is played (unless there is a bug).
My first thought is that the cause is a slow interaction between GUI and engine. Default Emergency Base Time = 300ms, you may want to increase that...
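The behaviour discussed in this exchange (do not spend the increment before the move is played, and keep an emergency reserve against GUI lag) can be sketched roughly as follows. This is an illustration only, not Stockfish's actual time-management code: the function name, the `moves_to_go` horizon of 30 and the planning formula are assumptions; only the 300 ms emergency default comes from the post above.

```python
def move_budget_ms(remaining_ms, increment_ms, moves_to_go=30, emergency_base_ms=300):
    """Hypothetical per-move time budget in milliseconds.

    The increment is only credited after a move is played, so it may be
    planned for on later moves but must never be spent on the current one.
    """
    # Hold back an emergency reserve to absorb GUI/engine communication lag.
    usable = max(remaining_ms - emergency_base_ms, 0)
    # Plan with the increments that will arrive after each of the next moves.
    planned = usable + increment_ms * (moves_to_go - 1)
    budget = planned // moves_to_go
    # Cap at what is on the clock right now: the pending increment is excluded.
    return min(budget, usable)


if __name__ == "__main__":
    # 1 second left with a 1 second increment: the cap applies.
    print(move_budget_ms(1000, 1000))
    # Exactly at the emergency reserve: no thinking time is budgeted.
    print(move_budget_ms(300, 1000))
```

Raising `emergency_base_ms` in this sketch corresponds to the suggestion of increasing Emergency Base Time when the GUI is slow: the engine stops earlier and leaves more slack on the clock.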
After 100 games (which is far too few) with 1/5 TC the two engines scored equally.
Lyudmil Tsvetkov wrote: Hi Gary,
I think you have made some changes and tested them against 2.2.2,
and then tuned the values until 2.3 was clearly superior to 2.2.2.
In this case the reasonable thing to do would be to try tuning the
changes with respect to a wider range of opponents, the way you did with 2.2.2. The changes might be good, but not tuned adequately.
It is also interesting to know how 2.3 fares in blitz against 2.2.2.
Best regards,
Ludmil
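To put the remark that 100 games is too few into numbers, here is a small sketch using the standard logistic Elo model and a normal approximation for the match score. The win/draw/loss split of 30/40/30 is a made-up example of an even 100-game match, not a result reported in this thread, and the function names are hypothetical.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic model."""
    # Clamp so that a 0% or 100% score does not divide by zero.
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo estimates; z=1.96 gives roughly 95% coverage."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score - half_width),
            elo_from_score(score),
            elo_from_score(score + half_width))


if __name__ == "__main__":
    low, mid, high = elo_interval(30, 40, 30)
    print(f"{low:.0f} .. {mid:.0f} .. {high:.0f} Elo")
```

With that assumed split the interval comes out at roughly 50 Elo either side of zero, far wider than the single-digit differences being discussed, so an even 100-game match cannot tell the two versions apart.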
Seriously I don't think that there is anything wrong with the current testing method.
gladius wrote: Interesting, thanks Larry. A few of the evaluation changes were more tactical terms (pinned piece penalty, undefended pieces, and rook-pawn-rank bonus). So, that could be an explanation.
lkaufman wrote: My tests indicate that 2.3.1 is a clear improvement even against foreign opponents at hyperspeed levels. So the problem, if there is one, is not the choice of opponents but the time control of the tests. My guess is that the change involving lateral attacks on pawns, being a tactical term, is great at speeds like game/10" but pretty useless at IPON levels.
gladius wrote: Agreed, the results are very disappointing. The improvements were tested against Stockfish 2.2.2. It seems while they were good in heads up matches, they made things worse against weaker opponents, and didn't help against stronger ones.
I'm going back now and applying each eval change to 2.2.2, and testing against a wider set of opponents (still at hyperblitz, 4s+0.05). It will be interesting to see how the changes do there. If they do the same, I guess testing at longer TC is the only way to go.
Yes, the progress on Stockfish has been great! However, with each change we made for 2.3.1, things looked quite positive. The sum of all those changes seems to be that 2.3.1 is about equal, and maybe a bit stronger. So, something is definitely amiss.
zamar wrote: Seriously I don't think that there is anything wrong with the current testing method.
Looking at the results now (CCRL FRC: +9 elo, CCRL 40/4: +9 elo, IPON: -7 elo, CEGT 40/20: -1 elo), it's fully possible and I'd say even likely that SF 2.3.1 is ~5 ELO stronger than SF 2.2.2.
If you look at the history of recent SF releases, in almost every release the new version has done worse than the previous version on some rating list. Still, in the long run it has been going up in all rating lists.
This is completely natural and one just needs to learn to live with it and have confidence in long term slow progress.
Not necessarily. Two things must be kept in mind:
gladius wrote: Yes, the progress on Stockfish has been great! However, with each change we made for 2.3.1, things looked quite positive. The sum of all those changes seems to be that 2.3.1 is about equal, and maybe a bit stronger. So, something is definitely amiss.