My test for Kayra 1.0 v6 bmi2

mehmet123 · Post by **mehmet123** » Tue Dec 21, 2021 6:21 pm

Eduard wrote: ↑Tue Dec 21, 2021 5:36 pm I asked my question because I can't think of anything else I could test. I thought, maybe you'd like to see certain pawn structures in the games of kayra, maybe more king attack? What exactly do you want to improve?

There is much to be developed. It can be boring, as most of these are technical issues. The pawn structure of Kayra 1.0 v6 is much better than in the old versions, but it is still far from perfect. What I want to do most right now is to minimize the number of parameters used in Kayra without losing much playing strength.

mehmet123 · Post by **mehmet123** » Tue Dec 21, 2021 6:22 pm

Things I changed in the code in a short time:

Number of parameters in the Stockfish evaluation section: ~300
Number of parameters in the Kayra evaluation section: ~100

Number of parameters in the Stockfish PSQT section: 416
Number of parameters in the Kayra PSQT section: 36

Parameter changes in the pawns section: 12

Some code changes in the evaluation section, some changes to piece values, etc.

There is no similarity between Kayra's and Stockfish's parameters. Having fewer parameters will make it easier to optimize Kayra, because these values were not well optimized.

Kayra 1.0 will be released as open source very soon, but I am really excited for Kayra 1.1. If Kayra is well optimized with the help of people with powerful systems, then Kayra will be at the top.

Eduard · Post by **Eduard** » Tue Dec 21, 2021 7:25 pm

Now 487 games undefeated.

Now I've started a really tough test. As White, the engine only plays 1.f4 and 1.e4, where after 1.e4 e5 it continues with 2.d4 or 2.f4. As Black, it only ever plays 1...d6. That is hard! I'm curious when the first defeat will come.

kranium · Post by **kranium** » Tue Dec 21, 2021 8:01 pm

mehmet123 wrote: ↑Tue Dec 21, 2021 6:22 pm Kayra 1.0 will be released as open source very soon, but I am really excited for Kayra 1.1. If Kayra is well optimized with the help of people with powerful systems, then Kayra will be at the top.

Hi Mehmet-

there's a couple of ways to truly determine if Kayra is better..
download littleblitzer
compile sf_dev and kayra in exact same way (very important)

load the two engines into Littleblitzer
http://www.kimiensoftware.com/software/ ... tleblitzer

run many thousands (32,000 to 40,000 is what I normally do) of games
this lowers the error bars to just 1/2 Elo.

set concurrency (simultaneous games) to as many threads as your system has to offer...minus a couple if you intend to use the system while the test is running

run the games at the default ultra-fast TC (1000ms +100ms)
on a fast system with 12-16 threads, you can finish 30-40,000 in about 8-10 hours.

https://ibb.co/511d1dz

when finished (or at any time during the test), run the pgn thru Ordo...check details.txt
you'll also get a good indication of Elo @ ultra-fast time control

if you see 99.99 or 100% confidence (CFS) you have obtained a result indicating one or the other engine's superiority, about as close to empirical truth as possible

alternatively, you could also use cutechess and SPRT, but I like the above method more because it gives an indication of Elo difference...
but that must be taken with a grain of salt, as Elo differences at ultra-fast shrink with longer and longer TCs

by testing in this manner, you will avoid the false appearance of superiority, which can be so easily made when running fewer games
a rigorous testing methodology (for dev) would require testing each singular change in this manner...
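
To check the error bars on your own match result, here is a rough Python sketch that turns a win/draw/loss record into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation of the score; it is not Ordo's actual algorithm, and the function names and the W/D/L numbers are made up for illustration.

```python
import math

def elo_from_score(score: float) -> float:
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for the first engine from its W/D/L record.

    z = 1.96 gives an approximate 95% interval (normal approximation).
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)

    def clamp(s: float) -> float:
        # Keep the score away from 0 and 1 so the logarithm stays finite.
        return min(max(s, 1e-6), 1.0 - 1e-6)

    return (elo_from_score(clamp(score)),
            elo_from_score(clamp(score - z * stderr)),
            elo_from_score(clamp(score + z * stderr)))

# Hypothetical W/D/L numbers, purely for illustration.
elo, low, high = elo_with_margin(wins=11000, draws=19000, losses=10000)
print(f"Elo {elo:+.1f}  (95% interval {low:+.1f} .. {high:+.1f})")
```

The interval narrows roughly with the square root of the number of games, which is why a few hundred games say very little about a small Elo difference.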

mehmet123 · Post by **mehmet123** » Tue Dec 21, 2021 8:14 pm

kranium wrote: ↑Tue Dec 21, 2021 8:01 pm
mehmet123 wrote: ↑Tue Dec 21, 2021 6:22 pm Kayra 1.0 will be released as open source very soon, but I am really excited for Kayra 1.1. If Kayra is well optimized with the help of people with powerful systems, then Kayra will be at the top.
Hi Mehmet-

there's a couple of ways to truly determine if Kayra is better..
download littleblitzer
compile sf_dev and kayra in exact same way (very important)

load the two engines into Littleblitzer
http://www.kimiensoftware.com/software/ ... tleblitzer

run many thousands (32,000 to 40,000 is what I normally do) of games
this lowers the error bars to just 1/2 Elo.

set concurrency (simultaneous games) to as many threads as your system has to offer...minus a couple if you intend to use the system while the test is running

run the games at the default ultra-fast TC (1000ms +100ms)
on a fast system with 12-16 threads, you can finish 30-40,000 in about 8-10 hours.

https://ibb.co/511d1dz

when finished (or at any time during the test), run the pgn thru Ordo...check details.txt
you'll also get a good indication of Elo @ ultra-fast time control

if you see 99.99 or 100% confidence (CFS) you have obtained a result indicating one or the other engine's superiority, about as close to empirical truth as possible

alternatively, you could also use cutechess and SPRT, but I like the above method more because it gives an indication of Elo difference...
but that must be taken with a grain of salt, as Elo differences at ultra-fast shrink with longer and longer TCs

by testing in this manner, you will avoid the false appearance of superiority, which can be so easily made when running fewer games

There is only one problem. Kayra's performance against Stockfish gets much better as the time control gets longer. That's why I'm not very hopeful about Kayra's results at ultra-fast time control.

kranium · Post by **kranium** » Tue Dec 21, 2021 8:17 pm

developing the engine while relying on long TCs is almost impossible, the # of games you'd get would be so low that error margins would be huge
You might as well flip a coin

I can pretty much guarantee you that a result at ultra-fast will be directly proportional to a result at longer TCs

If you don't want to know the truth, then avoid this method
...or SPRT
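
For the SPRT route, the idea can be sketched in Python as below. This is only an illustration using the normal-approximation form of the log-likelihood ratio; it is not cutechess's exact implementation, and the hypotheses (elo0, elo1), the error rates (alpha, beta) and the function names are assumptions you would choose yourself.

```python
import math

def expected_score(elo: float) -> float:
    """Expected score for a given Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_llr(wins: int, draws: int, losses: int,
             elo0: float, elo1: float) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points).
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * mean ** 2) / n
    if var == 0.0:
        return 0.0  # all games had the same result; no usable estimate
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

def sprt_decision(llr: float, alpha: float = 0.05, beta: float = 0.05) -> str:
    """Compare the LLR with the two stopping bounds of the test."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"

# Hypothetical running totals, checked after each batch of games.
llr = sprt_llr(wins=520, draws=1010, losses=470, elo0=0.0, elo1=5.0)
print(f"LLR = {llr:.2f} -> {sprt_decision(llr)}")
```

The test stops as soon as the LLR crosses either bound, so a clearly better or clearly worse change usually needs far fewer games than a fixed-length match.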

mehmet123 · Post by **mehmet123** » Tue Dec 21, 2021 8:20 pm

kranium wrote: ↑Tue Dec 21, 2021 8:17 pm developing the engine while relying on long TCs is almost impossible, the # of games you'd get would be so low that error margins would be huge
You might as well flip a coin

For now, I don't care much if Kayra is stronger than Stockfish. For now, my main goal is to develop an engine with a different playing style.

In order to reduce the number of parameters in Kayra, I accepted some loss of playing strength.

kranium · Post by **kranium** » Tue Dec 21, 2021 8:25 pm

mehmet123 wrote: ↑Tue Dec 21, 2021 6:22 pm Kayra 1.0 will be released as open source very soon, but I am really excited for Kayra 1.1. If Kayra is well optimized with the help of people with powerful systems, then Kayra will be at the top.

Ok fair enough...
I only replied because from your post above I assumed you wanted to concentrate on strength

Best of luck with your project!

mehmet123 · Post by **mehmet123** » Tue Dec 21, 2021 8:40 pm

kranium wrote: ↑Tue Dec 21, 2021 8:25 pm
mehmet123 wrote: ↑Tue Dec 21, 2021 6:22 pm Kayra 1.0 will be released as open source very soon, but I am really excited for Kayra 1.1. If Kayra is well optimized with the help of people with powerful systems, then Kayra will be at the top.
Ok fair enough...
I only replied because from your post above I assumed you wanted to concentrate on strength

Best of luck with your project!

After the release of Kayra 1.0 (very soon) the aim will be to increase the power of Kayra. I expect you and people like you to help in the development process of Kayra.

Eduard · Post by **Eduard** » Tue Dec 21, 2021 10:53 pm

Eduard wrote: ↑Tue Dec 21, 2021 7:25 pm Now 487 games undefeated.

Now I've started a really tough test. As White, the engine only plays 1.f4 and 1.e4, where after 1.e4 e5 it continues with 2.d4 or 2.f4. As Black, it only ever plays 1...d6. That is hard! I'm curious when the first defeat will come.

I see, Kayra 1.0 is released now. Thank you.

I have now played another 32 games with the moves indicated above. There was still no defeat. More tomorrow.
