Cheers, Arjun.
arjuntemurnikar wrote: I don't think so. There were other functional changes in the period you describe too. Why do you only see the simplifications as candidates for bad scalability? All simplifications committed in recent times have passed both STC and LTC, so I see no reason why they should be less scalable than standard patches that also pass STC and LTC. Anyway, you are again not taking into account that as TC/depth increases, and as the gap in relative strength widens, the apparent elo difference curve flattens out naturally, so what you are seeing are ghosts. You may also cross-check this with regression tests before SF DD release. Each consecutive regression test against master seemed to gain less and less elo, especially in the later stages (~40-65 elo) and that is because of what I explained. It is only natural.
Lyudmil Tsvetkov wrote: Still, I support my point that simplifications scale worse than knowledge patches at LTC.
It doesn't look to me like the test did well. 60,60 did better than 40,40 (which failed quite clearly), but it was still neutral. Perhaps I might try increasing it further to 80,80 or 100,100 and see if there is improvement. After that, if it doesn't work, I think it would be good to give that idea a rest for a while.
Lyudmil Tsvetkov wrote: Now that we are here, Arjun, and I depend entirely upon you for submitting tests, could you please schedule the more successful blocked pawn patch for standard SPRT? I guess this would be the 40cps bonus. It would be interesting to see the difference in performance at both TCs.
Many thanks in advance, Arjun!
Cheers!
I am speaking of 80% SPRT 0;6 patches in the first period and 50% SPRT 0;6 patches in the last period, so the 80% obviously did a great job back then; it is not those patches that scale badly. The same behaviour was observed in the second period. Do you really not see it: when a test scores negatively, with 5-6 more lost games after 100 000, although it passes SPRT, that is a liability that is going to have consequences for performance. The same goes for a test that scores only 50 more won games out of 40 000. It passes SPRT, but it contributes very little to strength. +200 games after 40 000 is another thing; there you need not be afraid that the patch does not contribute. Flattening out of the curve is a different matter; we are speaking here of the changes in the last month performing 3 times worse in terms of scalability than the changes in the first period (13 elo gained and 1.6 elo SMP loss vs 39 elo gained and 1.7 elo SMP loss). I think this is too obvious, too blatant.
Do you really believe a game in 5 sec. says anything about how good an idea is? I bet I would win all the games if I had 5 min. per game against SF with just 5 sec. You do storm tuning with 5 sec., it works great, it still scales well with 15 sec., and then it fails with 60 sec. It is probably the same, but the other way round, with those blocked pawns: they might perform better at 15 or 60 sec., although failing at 5 sec. Some changes perform better at LTC, others at STC.
Please do not get me wrong, Arjun, I just care for SF; I would not like to see it suddenly start scaling worse at very long TC, which is the time control that matters.
Many thanks again for the blocked test. I see you have queued another test, many thanks indeed! So take 2 was the 60cps value? What is the 3rd try, 80 or 100cps? Please, Arjun, be so kind as to schedule just one more test at standard 15 sec. SPRT with the version that performed best, although it might have scored negatively. Do it for the sake of curiosity at least, but also because blocked pawns on the 6th are a big problem for SF, and a fixable one. Just to see the difference in performance at longer TC. I am sure it will do better there.