Dann Corbit wrote: ↑Mon Jun 29, 2020 4:00 am
The best is BluefishXI with a couple small tweaks, which has no rivals so far in my testing.
Would you be willing to share it?
How good is your engine?
Re: How good is your engine?
Dann Corbit wrote: ↑Mon Jun 29, 2020 4:00 am
The best is BluefishXI with a couple small tweaks, which has no rivals so far in my testing.
Ovyron wrote: ↑Mon Jun 29, 2020 4:45 am
Would you be willing to share it?
https://drive.google.com/file/d/1utvbyS ... sp=sharing
You will want to turn off logging in the UCI options.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
Re: How good is your engine?
It is very similar to Bluefish XILP FD (Tactical=2; defensive=off),
the only functional difference being that I prune less when mobility is extremely cramped and prune more when mobility is extremely high.
For reference:
http://dorszcz.blogspot.com/p/ttt1.html
Look at the distance between both bluefish entries and the next one, with the tweaked version doing incredibly well.
Re: How good is your engine?
Here are the distinct results of the experiment:
Code: Select all
losses: 36 wins: 64 ties: 0
You will probably get slightly different results if you change the value of the seed.
So, if you had seen this:
losses: 36 wins: 64 ties: 0
would you have concluded that engine A was enormously stronger than engine B?
And if you had seen this:
losses: 63 wins: 37 ties: 0
would you have concluded that engine A was enormously weaker than engine B?
As we know, these two "engines" are exactly the same strength.
If you increase the game count, you will see things drifting sensibly towards the mean.
Here is the program:
Code: Select all
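#include <iostream>
#include <random>
#include <vector>

// Tally for one contest between two equally strong "engines".
struct results {
    int lt_h = 0;  // values below 0.5, reported as losses
    int gt_h = 0;  // values above 0.5, reported as wins
    int ties = 0;  // values exactly equal to 0.5
};

int main()
{
    const int contest_count = 1000;     // number of simulated contests
    const int games_per_contest = 100;  // games played in each contest

    std::mt19937 generator(17);  // fixed seed, so every run gives the same output
    std::uniform_real_distribution<double> urd(0.0, 1.0);
    std::vector<results> contests(contest_count);

    // Each game is a fair coin flip.
    for (int contest = 0; contest < contest_count; contest++)
    {
        for (int result = 0; result < games_per_contest; result++)
        {
            double value = urd(generator);
            if (value < 0.5) contests[contest].lt_h++;
            else if (value > 0.5) contests[contest].gt_h++;
            else contests[contest].ties++;
        }
    }
    // One line per contest; sort and de-duplicate externally to get the distinct results.
    for (int contest = 0; contest < contest_count; contest++)
    {
        std::cout << "losses: " << contests[contest].lt_h << " wins: " << contests[contest].gt_h << " ties: " << contests[contest].ties << std::endl;
    }
    return 0;
}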
Re: How good is your engine?
In case you are wondering, "Why so few draws?"
A tie was recorded only when the coin landed on its edge (value == 0.5).
Re: How good is your engine?
Result from changing game count per contest to 1000 (AFTER filtering with sortuniq):
Code: Select all
losses: 450 wins: 550 ties: 0
Re: How good is your engine?
And 10,000:
Code: Select all
losses: 4857 wins: 5143 ties: 0
So now you know why CCRL and CEGT run all those games.
Re: How good is your engine?
thewhip wrote: ↑Sun Jun 28, 2020 8:16 pm
Capablanca was a genius for playing game endings and was also a Chess World Champion.
The problem I left on this topic was solved by Tahl and he was also the Chess World Champion.
So a good engine is one that can solve a problem and also be a Champion in engine vs engine match.
corres wrote: ↑Sun Jun 28, 2020 9:09 pm
This is only your idea.
The old time Chess World Champions were not machines and they were also not perfect.
There will never be an engine that is totally universal. You should know that.
thewhip wrote: ↑Sun Jun 28, 2020 9:43 pm
What I mean is that one thing does not remove the other. That is the deficit of the engines. The better they are for competition the worse they are at solving problems and vice versa. Older, lower-range engines perform better than current ones.
If you want to solve problems you need an engine other than Stockfish, but if you analyse positions from a game in progress you need Stockfish (together with Leela), not a weak engine.
Re: How good is your engine?
Hello Dann:
Dann Corbit wrote: ↑Mon Jun 29, 2020 7:28 am
And 10,000:
Code: Select all
losses: 4857 wins: 5143 ties: 0 [...] losses: 5000 wins: 5000 ties: 0 [...] losses: 5156 wins: 4844 ties: 0
So now you know why CCRL and CEGT run all those games.
Wouldn't it be the direct application of the binomial distribution with P = 0.5? Then:
Code: Select all
n = wins + losses
(n >> 1)
Mean = n*P = 0.5*n
Standard deviation = sqrt[n*P*(1 - P)] = 0.5*sqrt(n)
z = (wins - mean)/(standard deviation) = 2*wins/sqrt(n) - sqrt(n)

For n = 10^4
z = wins/50 - 100
Where z is the z-score. Chauvenet's criterion might be of interest for you but you may know it.
Regards from Spain.
Ajedrecista.
Re: How good is your engine?
Dann Corbit wrote: ↑Mon Jun 29, 2020 4:00 am
The best is BluefishXI with a couple small tweaks, which has no rivals so far in my testing.
Ovyron wrote: ↑Mon Jun 29, 2020 4:45 am
Would you be willing to share it?
Dann Corbit wrote: ↑Mon Jun 29, 2020 5:59 am
https://drive.google.com/file/d/1utvbyS ... sp=sharing
You will want to turn off logging in the UCI options.
Thanks Dann. I wonder if some pseudo-MultiPV 4 setting would outperform everything in the future, one that starts with Tactical=2 but reduces it to 0 as the search gets deeper (and the best move has already been caught).