There is still much to develop. It can be boring, as most of these are technical issues. The pawn structure of Kayra 1.0 v6 is much better than in the old versions, but it is still far from perfect. What I most want to do right now is to minimize the number of parameters used in Kayra without losing much strength.
My test for Kayra 1.0 v6 bmi2
Moderator: Ras
-
mehmet123
- Posts: 695
- Joined: Sun Jan 26, 2020 10:38 pm
- Location: Turkey
- Full name: Mehmet Karaman
Re: My test for Kayra 1.0 v6 bmi2
-
mehmet123
- Posts: 695
- Joined: Sun Jan 26, 2020 10:38 pm
- Location: Turkey
- Full name: Mehmet Karaman
Re: My test for Kayra 1.0 v6 bmi2
Things I changed in the code in a short time:
Number of parameters in the Stockfish evaluation section: ~300
Number of parameters in the Kayra evaluation section: ~100
Number of parameters in the Stockfish PSQT section: 416
Number of parameters in the Kayra PSQT section: 36
Parameter changes in the pawns section: 12
Some code changes in the evaluation section, some changes to piece values, etc.
There is no similarity between the Kayra and Stockfish parameters. Having fewer parameters will make Kayra easier to optimize, because these values were not well optimized.
Kayra 1.0 will be released as open source very soon, but I am really excited for Kayra 1.1. If Kayra is well optimized with the help of people with powerful systems, then Kayra will be at the top.
-
Eduard
- Posts: 1439
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: N.N.
Re: My test for Kayra 1.0 v6 bmi2
Now 487 games undefeated.
Now I've started a really tough test. As White, the engine only plays 1.f4 and 1.e4, where after 1...e5 it continues with 2.d4 or 2.f4. As Black, it only ever plays 1...d6. That is hard! I'm curious when the first defeat will come.
-
kranium
- Posts: 2130
- Joined: Thu May 29, 2008 10:43 am
Re: My test for Kayra 1.0 v6 bmi2
Hi Mehmet-
there's a couple of ways to truly determine if Kayra is better..
download littleblitzer
compile sf_dev and kayra in exact same way (very important)
load the two engines into Littleblitzer
http://www.kimiensoftware.com/software/ ... tleblitzer
run many thousands (32,000 to 40,000 is what I normally do) of games
this lowers the error bars to just 1/2 Elo.
set concurrency (simultaneous games) to as many threads as your system has to offer...minus a couple if you intend to use the system while the test is running
run the games at the default ultra-fast TC (1000ms +100ms)
on a fast system with 12-16 threads, you can finish 30-40,000 in about 8-10 hours.
https://ibb.co/511d1dz
when finished (or at any time during the test), run the pgn thru Ordo...check details.txt
you'll also get a good indication of Elo @ ultra-fast time control
if you see 99.99 or 100% confidence (CFS) you have obtained a result indicating one or the other engine's superiority, about as close to empirical truth as possible
alternatively, you could also use cutechess and SPRT, but I like the above method more because it gives an indication of Elo difference...
but that must be taken with a grain of salt, as Elo differences at ultra-fast shrink with longer and longer TCs
by testing in this manner, you will avoid the false appearance of superiority, which can be so easily made when running fewer games
a rigorous testing methodology (for dev) would require testing each singular change in this manner...
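To illustrate why the number of games matters so much, here is a minimal sketch of how an Elo difference and its approximate 95% error margin can be estimated from win/draw/loss counts. This is not how Ordo or LittleBlitzer compute their figures; it assumes a simple logistic Elo model and a normal approximation, and the function names and the game counts in the usage lines are made up for illustration.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, margin); margin is an approximate 95% half-width in Elo.

    Assumes games are independent and uses a normal approximation, so it is
    only meaningful for reasonably large samples with a score away from 0 or 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points per game).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    low = elo_from_score(score - z * stderr)
    high = elo_from_score(score + z * stderr)
    return elo_from_score(score), (high - low) / 2.0


if __name__ == "__main__":
    # Hypothetical results with the same W/D/L ratio but different sample sizes.
    for w, d, l in [(100, 200, 100), (10000, 20000, 10000)]:
        elo, margin = elo_with_margin(w, d, l)
        print(f"{w + d + l} games: {elo:+.1f} Elo, +/- {margin:.1f}")
```

The margin shrinks roughly with the square root of the number of games, so 100 times more games gives an error bar about 10 times narrower. The exact width also depends on the draw rate.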
Last edited by kranium on Tue Dec 21, 2021 8:14 pm, edited 1 time in total.
-
mehmet123
- Posts: 695
- Joined: Sun Jan 26, 2020 10:38 pm
- Location: Turkey
- Full name: Mehmet Karaman
Re: My test for Kayra 1.0 v6 bmi2
There is only one problem. Kayra's performance against Stockfish gets much better as the time control gets longer. That's why I'm not very hopeful about Kayra at ultra-fast time control.
-
kranium
- Posts: 2130
- Joined: Thu May 29, 2008 10:43 am
Re: My test for Kayra 1.0 v6 bmi2
developing the engine while relying on long TCs is almost impossible: the # of games you'd get would be so low that the error margins would be huge
You might as well flip a coin
I can pretty much guarantee you that a result at ultra-fast will be directly proportional to a result at longer TCs
If you don't want to know the truth, then avoid this method
...or SPRT
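For readers unfamiliar with SPRT, the idea can be sketched as follows. This is only an illustration under stated assumptions: it uses a normal approximation to the log-likelihood ratio and Wald's stopping bounds, and it is not the exact implementation used by cutechess or any other tool. The function names and the `elo0`/`elo1` hypotheses are assumptions chosen for the example.

```python
import math


def expected_score(elo: float) -> float:
    """Expected score for a given Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_llr(wins: int, draws: int, losses: int,
             elo0: float, elo1: float) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0).

    Treats the mean game score as normally distributed with the observed
    per-game variance (an approximation, not an exact trinomial model).
    """
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n
    var = (wins + 0.25 * draws) / n - mean ** 2
    if var <= 0.0:
        return 0.0  # not enough variety in the results to say anything
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)


def sprt_decision(llr: float, alpha: float = 0.05, beta: float = 0.05) -> str:
    """Apply Wald's bounds: stop when the LLR leaves (lower, upper)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    if llr >= upper:
        return "H1"
    if llr <= lower:
        return "H0"
    return "continue"


if __name__ == "__main__":
    # Hypothetical running totals, testing "0 Elo better" against "5 Elo better".
    llr = sprt_llr(wins=100, draws=200, losses=100, elo0=0.0, elo1=5.0)
    print(sprt_decision(llr))
```

The test stops as soon as the evidence clearly favours one hypothesis, which is why it usually needs fewer games than a fixed-length match, at the cost of not giving an Elo estimate.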
-
mehmet123
- Posts: 695
- Joined: Sun Jan 26, 2020 10:38 pm
- Location: Turkey
- Full name: Mehmet Karaman
Re: My test for Kayra 1.0 v6 bmi2
For now, I don't care much if Kayra is stronger than Stockfish. For now, my main goal is to develop an engine with a different playing style.
In order to reduce the number of parameters in Kayra, I accepted some loss of strength.
-
kranium
- Posts: 2130
- Joined: Thu May 29, 2008 10:43 am
Re: My test for Kayra 1.0 v6 bmi2
Ok fair enough...
I only replied because from your post above I assumed you wanted to concentrate on strength
Best of luck with your project!
-
mehmet123
- Posts: 695
- Joined: Sun Jan 26, 2020 10:38 pm
- Location: Turkey
- Full name: Mehmet Karaman
Re: My test for Kayra 1.0 v6 bmi2
After the release of Kayra 1.0 (very soon) the aim will be to increase the power of Kayra. I expect you and people like you to help in the development process of Kayra.
-
Eduard
- Posts: 1439
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: N.N.
Re: My test for Kayra 1.0 v6 bmi2
I see, Kayra 1.0 is released now. Thank you.
I have now played another 32 games with the moves indicated above. There was still no defeat. More tomorrow.