To generate quality games with strong engines, what is a good balance between average search depth and computation time?
I'm trying to find that balance so I can choose good time controls.
There are testing groups that do that.
CCRL
CEGT
There are contests that make even higher quality games such as TCEC.
You will have a very difficult time trying to replicate these efforts.
They have large teams of workers with lots of hardware.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Even if it is an approximation, I was wondering if using strong engines at depth 25/30 for example would create games interesting enough to find nice novelties that humans can explore.
I am looking for a good quality/quantity ratio. My hardware is "good": 3970X processors and dual 2070 Super GPUs.
People already collect CCRL, CEGT, ICCF, etc. games.
Your data will be interesting if:
1) it is better than the current data sources
OR
2) it has interesting theoretical novelties
OR
3) it focuses on openings that are in vogue and other testing sites are not analyzing those openings
There are other things that might make it interesting to some people. For instance, I am interested in Orangutan games, but most other people are not.
But you have one machine. Testing organizations have a giant pile of them, and very dedicated testers.
Consider for CCRL 40/15:
Total: 1187224 games
played by 2729 programs
And for CEGT 40/20:
Database: 1395658 games
So well over one million games by both organizations. They spend a lot of effort ensuring the correctness of contests and weeding out bad book lines.
Throughout its entire history, TCEC has an average depth of analysis of 28 plies, and much deeper for recent contests.
So ask yourself:
How can I improve upon these efforts?
You will also be competing with Lichess and Playchess on sheer data volume.
You have chosen a tall mountain to scale.
I don't want to discourage you, only to show you that you have chosen the mountain K2 as your first assault. So plan carefully.
That was a nice paper. So H1.5a is rated around 2900, compared with the top human players who took part in the 2013 Candidates, whose average rating was around 2800.
In a small sample of games, Sf12 at depth 16 came out around 200 rating points above H1.5a at depth 20.
Score of Sf12_d18 vs Sf12_d16: 18 - 1 - 41 [0.642] 60
... Sf12_d18 playing White: 10 - 1 - 19 [0.650] 30
... Sf12_d18 playing Black: 8 - 0 - 22 [0.633] 30
... White vs Black: 10 - 9 - 41 [0.508] 60
Elo difference: 101.2 +/- 46.8, LOS: 100.0 %, DrawRatio: 68.3 %
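For anyone who wants to check figures like these by hand, here is a minimal sketch of the usual logistic Elo formula applied to a win/loss/draw count. The function name `match_stats` is only illustrative, and the error margin (the +/- part) is not computed here.

```python
import math

def match_stats(wins, losses, draws):
    """Return (score, elo_diff, los, draw_ratio) for one engine's W-L-D record.

    Assumes 0 < score < 1 and at least one decisive game; otherwise the
    formulas below divide by zero.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Logistic Elo model: expected score = 1 / (1 + 10^(-diff/400))
    elo_diff = -400.0 * math.log10(1.0 / score - 1.0)
    # Likelihood of superiority, based on decisive games only
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    draw_ratio = draws / games
    return score, elo_diff, los, draw_ratio

score, elo_diff, los, draw_ratio = match_stats(18, 1, 41)
print(f"score={score:.3f} elo={elo_diff:+.1f} LOS={los:.1%} draws={draw_ratio:.1%}")
```

With the 18 - 1 - 41 result above, this gives an Elo difference of about 101, in line with the reported value.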
So Sf12 at depth 18 could reach 2900 + 200 + 100 = 3200 on the same scale as the players from the 2013 Candidates.
Note that Sf12 reaches depth 16 faster than H1.5a reaches depth 20, so game generation using Sf12 would be faster. With aggressive win and draw adjudication, since you are only looking for novelties, and with that hardware, you could probably generate around 1 game per 25 seconds per core.
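To put that rate in perspective, here is a rough back-of-the-envelope sketch. It assumes a single 3970X with 32 physical cores, one game per core, and that the 25-seconds-per-game estimate holds on average; none of that has been measured here, and the function names are only illustrative.

```python
SECONDS_PER_DAY = 86_400

def games_per_day(workers, seconds_per_game=25.0):
    """Games finished per day if each worker plays one game at a time."""
    return int(workers * SECONDS_PER_DAY / seconds_per_game)

def days_to_reach(target_games, workers, seconds_per_game=25.0):
    """Days of continuous running needed to collect target_games."""
    return target_games * seconds_per_game / (workers * SECONDS_PER_DAY)

# Assumption: 32 physical cores, one fixed-depth game per core.
print(games_per_day(32))
print(round(days_to_reach(1_000_000, 32), 1))
```

Under those assumptions, volume is not the bottleneck; whether fixed-depth games are good enough to be worth collecting is the real question.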
Does it make a difference whether I use multiple threads per game or spread the same number of threads across simultaneous games? That is, 20 games running simultaneously with 1 thread each, versus 20 games with 20 threads each, played one at a time?
Peperoni wrote: ↑Sat Nov 07, 2020 3:36 pm
Interesting Ferdy
Does it make a difference whether I use multiple threads per game or spread the same number of threads across simultaneous games? That is, 20 games running simultaneously with 1 thread each, versus 20 games with 20 threads each, played one at a time?
I don't know which of them generates more games. Just try both and see which one is faster. But with more threads, it will reach the required depth faster.
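One way to "try both" is to run the same number of fixed-depth games under each layout and compare wall-clock time. The sketch below only builds the two command lines. It assumes cutechess-cli as the match runner (the thread does not say which tool is used), and the engine path `./stockfish`, the engine names, the game count of 100 and the output file name are placeholders.

```python
import shlex

def cutechess_args(engine_cmd, depth, threads_per_engine, concurrent_games, rounds):
    """Build a cutechess-cli command for a fixed-depth self-play match."""
    return [
        "cutechess-cli",
        "-engine", f"cmd={engine_cmd}", "name=A",
        "-engine", f"cmd={engine_cmd}", "name=B",
        # tc=inf plus depth=N makes every move a fixed-depth search
        "-each", "proto=uci", "tc=inf", f"depth={depth}",
        f"option.Threads={threads_per_engine}",
        "-concurrency", str(concurrent_games),
        "-rounds", str(rounds),
        "-pgnout", "games.pgn",
    ]

# Layout 1: 20 games at once, 1 thread per engine.
many_games = cutechess_args("./stockfish", 20, 1, 20, 100)
# Layout 2: 1 game at a time, 20 threads per engine.
many_threads = cutechess_args("./stockfish", 20, 20, 1, 100)

print(shlex.join(many_games))
print(shlex.join(many_threads))
```

Timing each command over the same 100 games would show which layout produces games faster on a given machine.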
Peperoni wrote: ↑Sat Nov 07, 2020 3:36 pm
By core you mean physical core or thread?
Core, but you can use threads as well, since these are fixed-depth games anyway.
BTW my estimate of 200 rating points of Sf12 at depth 16 over H1.5a at depth 20 was off by 100. Sf12 depth 16 is only around 100 rating points over H1.5a at depth 20. You can also personally test this.
So probably just go for Sf12 at depth 20 to achieve 3200.
Ferdy wrote: ↑Sat Nov 07, 2020 6:10 pm
So probably just go for Sf12 at depth 20 to achieve 3200.
That is the depth of the data generated by Bojun Guo for his opening project.
He has more than a billion nodes calculated.
As an example, the FEOBOS project used 82 billion nodes, with Excel and 16 million formulas.
The project ended in 2018!
My website has around 500,000 of the more than 2.5 million games my systems have played.
I never create a download file for my blitz games.
Most of the games are of higher quality than CEGT's (stronger hardware; the FCP Rating-List, for example).
FCP-Tourney-2020 is currently still running (over 35,000 of 41,000 games played) with a time control of 20 minutes per game plus 5 seconds increment, on strong 5.3 GHz Intel i9-10900 hardware. The next one will follow with a Fischer time control of 30 minutes plus 6 seconds.
Please note that CEGT and CCRL are not the only ones with game material.
On the other hand, I like CEGT, CCRL and FGRL a lot.
I am no longer active with rating lists, only with test material around the Wasp tourneys.
I think better rating material can be produced with tourneys in which every engine plays every other engine. Ratings do not get better if the same old results and games are included again and again. The advantage of CCRL, CEGT and FGRL is that they give us results for new engine versions very quickly. Another point is that they do not cover only the best engines, which is much more interesting than having results from top engines only.
FGRL has a lot of strong game material.
Jurek Chess Engine Ratings.
Others ...
There are lots of good contests for rating (I have all of your games in my database, and there are many others as well). I mention CCRL and CEGT a lot because they are the oldest thorough testers of the free engines. Of course, SSDF is older than those, but for a long time it tested only the professional engines.