One of the greatest disappointments in this field is that it simply is what it is. For the sake of completeness I even ran an executable 1:1 against itself once, and the most bizarre results popped up.

bob wrote:
Now down to "dead even" with last version, but after only 6000 games. That's usually the way it goes...

JVMerlino wrote:
Good to know that it isn't only my program that regularly shows something like 50+ Elo through the first 20% of the testing, only to end up with no improvement at all.
jm

bob wrote:
I don't know whether it is an artifact introduced by BayesElo, but I generally see an Elo that is too high, and then it drops back down to the previous level (or occasionally lower). I don't know why I don't see just as many runs where the Elo starts off too low and climbs. I have even randomized my starting positions a couple of times, just to make sure that favorable positions are not tested first to inflate the Elo, or that bad ones are done last... No explanation, just an observation.

Don wrote:
I don't think it's really happening; it's just that fast starts are much more memorable. We have a different standard for KEEPING a change versus stopping a test when it looks bad, because we don't mind taking a chance on throwing out a change that looks bad but might be good, but we really don't want to keep a change that looks good but might be bad! So if a test starts out badly and we have reasonable belief that it's bad, we stop it before it's complete. If it starts out well, we run it to completion regardless before accepting it.

bob wrote:
I never stop a test early unless the Elo dropped out the bottom due to a programming error, as mentioned previously. You might be right, but I have a utility that continuously displays the Elo as games complete, and my perception is that things often start off high and settle down, sort of like a one-tailed normal curve, rather than just as often starting low and climbing up to the expected value.
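[Editorial aside: to put rough numbers on how far an early Elo estimate can wander, the usual logistic model maps an observed match score s to an Elo difference of -400*log10(1/s - 1), and the standard error of s shrinks only as 1/sqrt(games). The small C program below is purely illustrative; the 52% score and 50% draw ratio are made-up values, not anyone's actual test data.]

Code:

#include <math.h>
#include <stdio.h>

/* Elo difference implied by a match score s (fraction of points won). */
static double elo_from_score(double s)
{
    return -400.0 * log10(1.0 / s - 1.0);
}

int main(void)
{
    const double s     = 0.52;   /* observed score of the new version (assumed)    */
    const double draws = 0.50;   /* assumed draw ratio; draws lower the variance    */
    const double var   = s * (1.0 - s) - draws / 4.0;   /* per-game score variance */

    for (int n = 1000; n <= 32000; n *= 2) {
        double se = sqrt(var / (double)n);               /* std. error of the score */
        printf("%6d games: %+6.1f Elo, 95%% band roughly %+6.1f .. %+6.1f\n",
               n, elo_from_score(s),
               elo_from_score(s - 2.0 * se), elo_from_score(s + 2.0 * se));
    }
    return 0;
}

[Even at several thousand games the band is still tens of Elo wide, which is why a +50 start over the first 20% of a run can end up as no measurable change.]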
Don wrote:
If losing capture:
    reduce search using normal schedule, set scout to alpha-MARGIN

zamar wrote:
SF got something like +5 Elo using this idea. The problem is that although the engine is a bit stronger, it's much weaker in tactical puzzles (which often involve a bad capture, or even two of them!).

Don wrote:
But Joona claimed earlier that it didn't work! I would kill for 5 Elo! Are you saying that you would give up 5 Elo for better tactics but a weaker program? It's possible that we have already tried this. I have probably only tried about a million things in the last 3 years and don't remember all of them.
Don

zamar wrote:
Earlier I was referring strictly to bad-capture reducing/pruning, and that didn't work. But reducing aggressively at low depths AND tweaking alpha gave us something (take a look at SF 2.1); the downside is, as I said, that the engine became much weaker in tactics. As H.G. wisely put it, if a capture is futile _enough_ it can be reduced aggressively. But I still view this as a separate thing from LMR, because we apply this technique only close to the leaves, reduce much more aggressively, and tweak alpha around 100cp.
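[Editorial aside: one plausible reading of Don's pseudocode and zamar's description is sketched below. It assumes a negamax search(), a static-exchange evaluator see(), and make/undo helpers exist elsewhere in the engine; the 100cp margin, depth limit, reduction amount, and all names are placeholders for illustration, not Stockfish's or Crafty's actual code.]

Code:

typedef struct Position Position;
typedef unsigned int Move;

/* Assumed to exist elsewhere in the engine (placeholder signatures). */
int  search(Position *pos, int alpha, int beta, int depth);
int  see(const Position *pos, Move m);          /* static exchange evaluation */
int  is_capture(const Position *pos, Move m);
void make_move(Position *pos, Move m);
void undo_move(Position *pos, Move m);

enum {
    LC_MARGIN    = 100,  /* "tweak alpha around 100cp"                     */
    LC_MAX_DEPTH = 4,    /* apply only close to the leaves (made-up limit) */
    LC_REDUCTION = 2     /* more aggressive than the usual LMR step        */
};

/* Search one move; losing captures near the leaves are reduced aggressively
   and scouted against a bound lowered by LC_MARGIN, so that anything coming
   near alpha still triggers a normal full-depth, full-window re-search. */
int search_one_move(Position *pos, Move m, int alpha, int beta, int depth)
{
    int new_depth = depth - 1;
    int scout     = alpha;

    if (depth <= LC_MAX_DEPTH && is_capture(pos, m) && see(pos, m) < 0) {
        new_depth -= LC_REDUCTION;       /* reduce the losing capture...           */
        scout      = alpha - LC_MARGIN;  /* ...but verify against a lowered bound  */
    }

    make_move(pos, m);
    /* zero-width scout search at the (possibly lowered) bound */
    int score = -search(pos, -(scout + 1), -scout, new_depth > 0 ? new_depth : 0);

    /* if the reduced search did not fail low against the lowered bound,
       re-search at normal depth with the full window */
    if (score > scout)
        score = -search(pos, -beta, -alpha, depth - 1);
    undo_move(pos, m);

    return score;
}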
bob wrote:
Ok, reducing captures by 1 vs 2 made no significant change. (This is captures where SEE < 0; captures with SEE >= 0 are never reduced.)
However, at 63, I am not going to bet that my perception is always "right". But I can say it is strange that, when playing so many games, the results start off in never-never land and then come back.
In any case, the first test completed with no Elo change that can be measured with just 30K games. I've got a second run going just to see whether the two disagree by much...
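[Editorial aside: restated schematically, the experiment toggles a single reduction knob; BAD_CAPTURE_REDUCTION and the helper name below are placeholders for illustration, not Crafty's actual code.]

Code:

/* Schematic restatement of the experiment above; not Crafty's implementation. */
#define BAD_CAPTURE_REDUCTION 1   /* the A/B test compared 1 vs 2 plies */

int capture_reduction(int see_value)
{
    if (see_value >= 0)
        return 0;                     /* captures with SEE >= 0 are never reduced */
    return BAD_CAPTURE_REDUCTION;     /* losing captures get the fixed reduction  */
}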
JVMerlino wrote:
You mentioned that you are not reducing these moves "near the tips". What is your minimum depth for this reduction, and did you change it for this test?
jm
bob wrote:
I didn't change anything except that the reduction for losing captures is just one ply rather than two. Here are the ideas I am currently using...
Code:

if (moveIsTactical(move))
    /* tactical moves: reduce only when the normal LMR reduction would be at
       least 3 plies, and then by two plies less than the table value */
    newdepth -= (LMRTable[depth][movesplayed] >= 3) ? (LMRTable[depth][movesplayed] - 2) : 0;
else
    /* all other moves: the full LMR reduction from the table */
    newdepth -= LMRTable[depth][movesplayed];
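[Editorial aside: the post does not show how LMRTable[depth][movesplayed] is filled in. A common way to build such a depth/move-count indexed table is a log-based formula; the sketch below is only a generic illustration of that idea with an arbitrary scaling constant, not Crafty's actual table.]

Code:

#include <math.h>
#include <stdio.h>

#define MAX_DEPTH 64
#define MAX_MOVES 64

static int LMRTable[MAX_DEPTH][MAX_MOVES];   /* [remaining depth][moves searched] */

static void init_lmr_table(void)
{
    for (int depth = 1; depth < MAX_DEPTH; depth++)
        for (int moves = 1; moves < MAX_MOVES; moves++)
            /* reduction grows slowly with both remaining depth and move number;
               2.25 is an arbitrary scaling constant chosen for illustration */
            LMRTable[depth][moves] = (int)(0.5 + log(depth) * log(moves) / 2.25);
    /* row 0 and column 0 stay zero: no reduction at depth 0 or for the first move */
}

int main(void)
{
    init_lmr_table();
    for (int depth = 2; depth <= 16; depth *= 2)
        printf("depth %2d, move 10 -> reduce by %d plies\n", depth, LMRTable[depth][10]);
    return 0;
}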