"Opening Books" for Endgames
Moderators: hgm, Dann Corbit, Harvey Williamson
-
AndrewGrant
- Posts: 1660
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
"Opening Books" for Endgames
I'm looking to build an "Opening Book" for specific endgames. For now, I happen to be interested in Rook(s) + Pawn(s) endgames. I'm hoping to build a book of ~4 million positions, and then play an Ethereal match starting from each of the positions.
My (lazy) gut instinct would be to try to grab a massive collection of Fishtest games. It appears to me, in my data, that somewhere between 5% and 10% of all games descend into a Rook(s) + Pawn(s) endgame. This would mean that I need somewhere around 60 million games. I could pull games down with some dirty script, but I'd rather not scrape the Fishtest site and hurt their performance.
Does there exist a collection of tens of millions, if not billions, of games in PGN format? I don't think quality is a huge concern, but I would not want human games. Had I saved every OpenBench game ever played, I would have 140 million. I should have done that, but I would not put that network strain on users.
Talkchess is dead without moderation. If you want my attention, contact me via andrew@grantnet.us
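The "around 60 million games" figure in the opening post can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes each qualifying game contributes exactly one book position, which the post does not state, and the function name is made up for illustration.

```python
def games_needed(target_positions: int, hit_rate_percent: int) -> int:
    """Games required if hit_rate_percent of all games yield one usable position each."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_positions * 100 // hit_rate_percent)


# Bounds quoted in the post: 5% to 10% of games reach a Rook(s) + Pawn(s) endgame.
low = games_needed(4_000_000, 10)   # optimistic end of the range
high = games_needed(4_000_000, 5)   # pessimistic end of the range
midpoint = (low + high) // 2

print(low, high, midpoint)
```

Under that assumption the range is 40 to 80 million games, and its midpoint matches the 60 million quoted above.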
-
Dann Corbit
- Posts: 12476
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: "Opening Books" for Endgames
AndrewGrant wrote: ↑Thu Oct 01, 2020 8:50 am
I'm looking to build an "Opening Book" for specific endgames. For now, I happen to be interested in Rook(s) + Pawn(s) endgames. I'm hoping to build a book of ~4 million positions, and then play an Ethereal match starting from each of the positions.
My (lazy) gut instinct would be to try to grab a massive collection of Fishtest games. It appears to me, in my data, that somewhere between 5% and 10% of all games descend into a Rook(s) + Pawn(s) endgame. This would mean that I need somewhere around 60 million games. I could pull games down with some dirty script, but I'd rather not scrape the Fishtest site and hurt their performance.
Does there exist a collection of tens of millions, if not billions, of games in PGN format? I don't think quality is a huge concern, but I would not want human games. Had I saved every OpenBench game ever played, I would have 140 million. I should have done that, but I would not put that network strain on users.
https://database.lichess.org/
One and a half billion games.
Quality is suspect.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
Dann Corbit
- Posts: 12476
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: "Opening Books" for Endgames
Mostly human games at Lichess
You can get CCRL, CEGT, and similar contest games for free download, and they are decorated with the analysis.
There used to be an easy way to collect the playchess computer games but that seems to have dried up.
-
yurikvelo
- Posts: 710
- Joined: Sat Dec 06, 2014 1:53 pm
Re: "Opening Books" for Endgames
Retrieve 6-piece FENs from Syzygy?
-
Ajedrecista
- Posts: 1950
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: "Opening Books" for Endgames.
Hello Andrew:
Fishtest allows downloading the workers' PGNs for a short period of time, let's say four days or so. Currently, the oldest test where I can download PGNs is this one:
https://tests.stockfishchess.org/tests/ ... afa50699a2
Please take a look at the Idx column. However, I do not know whether there are adjudications that prevent engines from reaching endgames with few pieces.
A hard task could be to scan all the URLs of the tests that allow PGN downloads. Once you have them, the URLs of the PGNs are easy to fill in... but you must know where to stop in each test:
Code:
https://tests.stockfishchess.org/tests/view/5f7232ee3b22d6afa50699a2
https://tests.stockfishchess.org/api/pgn/5f7232ee3b22d6afa50699a2-0.pgn
https://tests.stockfishchess.org/api/pgn/5f7232ee3b22d6afa50699a2-1.pgn
https://tests.stockfishchess.org/api/pgn/5f7232ee3b22d6afa50699a2-2.pgn
[...]
https://tests.stockfishchess.org/api/pgn/5f7232ee3b22d6afa50699a2-407.pgn
https://tests.stockfishchess.org/api/pgn/5f7232ee3b22d6afa50699a2-408.pgn
https://tests.stockfishchess.org/api/pgn/5f7232ee3b22d6afa50699a2-409.pgn
------------
https://tests.stockfishchess.org/tests/view/5f723b483b22d6afa5069a99
https://tests.stockfishchess.org/api/pgn/5f723b483b22d6afa5069a99-0.pgn
https://tests.stockfishchess.org/api/pgn/5f723b483b22d6afa5069a99-1.pgn
https://tests.stockfishchess.org/api/pgn/5f723b483b22d6afa5069a99-2.pgn
[...]
https://tests.stockfishchess.org/api/pgn/5f723b483b22d6afa5069a99-635.pgn
https://tests.stockfishchess.org/api/pgn/5f723b483b22d6afa5069a99-636.pgn
https://tests.stockfishchess.org/api/pgn/5f723b483b22d6afa5069a99-637.pgn
------------
[...]
Some of the PGNs will contain only one game, others only two... up to the limit per task/batch, which seems to be 200 right now. You can get thousands or even millions of games in a few days or weeks if you track Fishtest. I do not know if Fishtest can block aggressive or automated download managers like wget.
Regards from Spain.
Ajedrecista.
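The URL pattern listed in this post can be filled in mechanically once the number of tasks in a test is known. Here is a minimal sketch that only builds the URLs and downloads nothing. The function name and the `task_count` parameter are hypothetical; the count itself would have to be read from the Idx column of the test page.

```python
# Base path taken from the URLs listed in the post above.
BASE = "https://tests.stockfishchess.org/api/pgn"


def pgn_urls(test_id: str, task_count: int) -> list[str]:
    """Build one PGN URL per task index, 0 .. task_count - 1."""
    return [f"{BASE}/{test_id}-{idx}.pgn" for idx in range(task_count)]


# First test from the post: its last listed index is 409, i.e. 410 tasks.
urls = pgn_urls("5f7232ee3b22d6afa50699a2", 410)
print(urls[0])
print(urls[-1])
```

Any real download loop built on top of this should probably pause between requests, given the concern raised earlier in the thread about hurting Fishtest's performance.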
-
AndrewGrant
- Posts: 1660
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
Re: "Opening Books" for Endgames
Well, in this case I don't want any positions that are already solved by Syzygy. Learning on them would be futile.
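A filter along these lines could be sketched as follows: keep a position only if it contains nothing but kings, rooks and pawns, and has more pieces than the tablebases cover. The function name and the `min_pieces` default of 8 are assumptions (the default presumes tables up to seven pieces; use 7 if only 6-piece tables count as "solved"). The check reads only the piece-placement field of a FEN.

```python
def is_rook_pawn_endgame(fen: str, min_pieces: int = 8) -> bool:
    """True if the position holds only kings, rooks and pawns,
    includes at least one rook and one pawn, and has at least
    min_pieces pieces in total (kings included)."""
    placement = fen.split()[0]                        # first FEN field: piece placement
    pieces = [c for c in placement if c.isalpha()]    # drop digits and '/'
    if any(c not in "KkRrPp" for c in pieces):
        return False                                  # some other piece type is present
    has_rook = any(c in "Rr" for c in pieces)
    has_pawn = any(c in "Pp" for c in pieces)
    return has_rook and has_pawn and len(pieces) >= min_pieces


# A 10-piece rook ending passes; a 4-piece one is treated as tablebase territory.
print(is_rook_pawn_endgame("8/5pk1/6p1/7p/R6P/6P1/r4PK1/8 w - - 0 40"))
print(is_rook_pawn_endgame("8/8/8/4k3/8/4K3/R3P3/8 w - - 0 1"))
```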
-
D Sceviour
- Posts: 570
- Joined: Mon Jul 20, 2015 5:06 pm
Re: "Opening Books" for Endgames
AndrewGrant wrote: ↑Thu Oct 01, 2020 8:50 am
I'm looking to build an "Opening Book" for specific endgames. For now, I happen to be interested in Rook(s) + Pawn(s) endgames. I'm hoping to build a book of ~4 million positions, and then play an Ethereal match starting from each of the positions.
My (lazy) gut instinct would be to try to grab a massive collection of Fishtest games. It appears to me, in my data, that somewhere between 5% and 10% of all games descend into a Rook(s) + Pawn(s) endgame. This would mean that I need somewhere around 60 million games. I could pull games down with some dirty script, but I'd rather not scrape the Fishtest site and hurt their performance.
Does there exist a collection of tens of millions, if not billions, of games in PGN format? I don't think quality is a huge concern, but I would not want human games. Had I saved every OpenBench game ever played, I would have 140 million. I should have done that, but I would not put that network strain on users.
You could build a polyglot book with four million hash keys. Why 4 million when 4 Gb would be better? Rook and pawn endings used to occur about 50% of the time in human vs human games. What about rook and knight endings or any combination greater than 7 pieces? What would you do with such a book? The ICS tournaments may allow it, but CCRL and a number of other testers do not allow large learning files.
One method may be to simply let your engine run from a given position. Save all hash positions with a terminal ending (0 or MATE) to an EPD file. At 3 million nodes per second, you should be able to fill up a hard drive with unique positions very quickly.
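The saving step could look roughly like the sketch below. It assumes the engine can hand over (FEN, score) pairs from its hash table, which is not something the post specifies. `MATE`, `MAX_PLY` and both function names are hypothetical, and real engines encode mate scores in their own ways. Each kept position is written as the four EPD position fields plus a `ce` (centipawn evaluation) operation.

```python
MATE = 32000     # hypothetical mate score used by the engine
MAX_PLY = 128    # hypothetical bound on mate distance


def is_terminal_score(score: int) -> bool:
    """A dead-draw score of 0, or any score within MAX_PLY of mate."""
    return score == 0 or abs(score) >= MATE - MAX_PLY


def terminal_epd_lines(entries):
    """Turn (fen, score) pairs into unique EPD lines for terminal scores only."""
    seen = set()
    lines = []
    for fen, score in entries:
        if not is_terminal_score(score):
            continue
        # EPD keeps the first four FEN fields and drops the move counters.
        epd = " ".join(fen.split()[:4])
        if epd not in seen:
            seen.add(epd)
            lines.append(f"{epd} ce {score};")
    return lines


sample = [
    ("8/8/8/4k3/8/4K3/R3P3/8 w - - 0 1", 0),
    ("8/8/8/4k3/8/4K3/R3P3/8 w - - 5 9", 0),      # same position, different counters
    ("8/8/8/4k3/8/4K3/R3P3/8 b - - 0 1", 31990),  # mate score
    ("8/5pk1/6p1/7p/R6P/6P1/r4PK1/8 w - - 0 40", 35),  # ordinary score, skipped
]
for line in terminal_epd_lines(sample):
    print(line)
```

Deduplicating on the four position fields means transpositions that differ only in move counters are stored once.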