
nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 10:15 am
by flok
Hi,

Does anyone know of the nalimov database (3+4+5) in ascii/fen format?
Or are there tools for doing so?

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 10:25 am
by hgm
For one, Nalimov is a specific binary format, so this is the same thing as asking for a Windows .exe file in English ascii. As converting to FEN would blow up end-game tables in general by a factor of 1000 or so, I don't think FEN is a really practical format for any EGT except perhaps the simplest 3-men...

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 12:00 pm
by Joost Buijs
flok wrote:Hi,

Does anyone know of the nalimov database (3+4+5) in ascii/fen format?
Or are there tools for doing so?
Do you want to put a fen string in and get the result from the Nalimov database out?
I don't know whether it exists but it is not very difficult to translate a fen string to the format that is needed by the Nalimov probe code.
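
As a rough illustration of that translation step, here is a minimal Python sketch that turns a FEN string into per-colour piece lists plus the side to move. The name parse_fen, the a1 = 0 ... h8 = 63 square numbering and the output layout are assumptions made for this example; the actual Nalimov probe code has its own calling convention, which is not reproduced here.

```python
def parse_fen(fen):
    """Split a FEN string into piece lists and the side to move.

    Returns ({"w": [(piece, square), ...], "b": [...]}, side), where
    square uses the assumed numbering a1 = 0 ... h8 = 63.
    """
    fields = fen.split()
    placement, side = fields[0], fields[1]
    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("FEN must describe exactly 8 ranks")

    pieces = {"w": [], "b": []}
    for i, row in enumerate(ranks):
        rank = 7 - i  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # a digit is a run of empty squares
            else:
                colour = "w" if ch.isupper() else "b"
                pieces[colour].append((ch.upper(), rank * 8 + file))
                file += 1
        if file != 8:
            raise ValueError("each rank must cover exactly 8 files")
    return pieces, side


pieces, side = parse_fen("8/8/8/4k3/8/8/4P3/4K3 w - - 0 1")
print(pieces, side)
```

From piece lists like these, a wrapper would still have to reorder and renumber things the way the particular probe code expects.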

Or do you mean the entire database in ascii/fen format?
As Harm Geert already said, this file would be so large that a common PC would choke on it.

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 12:10 pm
by flok
Joost Buijs wrote:Or do you mean the entire database in ascii/fen format?
As Harm Geert already said, this file would be so large that a common PC would choke on it.
Yes, that is indeed what I meant.

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 12:13 pm
by flok
hgm wrote:For one, Nalimov is a specific binary format, so this is the same thing as asking for a Windows .exe file in English ascii. As converting to FEN would blow up end-game tables in general by a factor of 1000 or so, I don't think FEN is a really practical format for any EGT except perhaps the simplest 3-men...
Do you know how many positions are in it?
Because then we can calculate how much diskspace it'll use.
I mean let's say currently 16 bytes per position and 10GB in nalimov format. That's about 671088640 positions. Let's say that each position with all data uses 128 bytes (a fen string is in my test on average less than 70 bytes). That gives you 85899345920 bytes which is 80GB. Then some mysql-overhead, maybe 200GB.
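
Spelled out as a small Python sketch, the arithmetic looks like this. The 16 bytes per stored position, the 10GB total and the 128 bytes per text row are only the guesses from this post, not measured values, and estimate_text_size is just a name made up for the example.

```python
GIB = 1024 ** 3  # "GB" is taken to mean 2**30 bytes here


def estimate_text_size(binary_bytes, bytes_per_entry, bytes_per_row):
    """Return (number of positions, size in bytes of a text dump)."""
    positions = binary_bytes // bytes_per_entry
    return positions, positions * bytes_per_row


positions, text_bytes = estimate_text_size(10 * GIB, 16, 128)
print(positions)         # 671088640
print(text_bytes)        # 85899345920
print(text_bytes / GIB)  # 80.0
```

The result scales inversely with the assumed bytes per stored position, so a smaller value there gives a proportionally larger dump.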

Am I overlooking something here?

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 12:54 pm
by Joost Buijs
flok wrote:
hgm wrote:For one, Nalimov is a specific binary format, so this is the same thing as asking for a Windows .exe file in English ascii. As converting to FEN would blow up end-game tables in general by a factor of 1000 or so, I don't think FEN is a really practical format for any EGT except perhaps the simplest 3-men...
Do you know how many positions are in it?
Because then we can calculate how much diskspace it'll use.
I mean let's say currently 16 bytes per position and 10GB in nalimov format. That's about 671088640 positions. Let's say that each position with all data uses 128 bytes (a fen string is in my test on average less than 70 bytes). That gives you 85899345920 bytes which is 80GB. Then some mysql-overhead, maybe 200GB.

Am I overlooking something here?
You were talking about 5 pieces only, maybe it is doable.
This still leaves the question why on earth you want to do something like this?

It is not so straightforward to tell how many positions are stored in the Nalimov database: a lot of positions are probably mirrored and reflected, and they are also LZW compressed.
Maybe Ronald de Man can tell how many different positions are stored in his 5-piece Syzygy database.

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 1:38 pm
by Edmund
This will give you an idea:
http://kirill-kryukov.com/chess/nulp/results.html
flok wrote:
hgm wrote:For one, Nalimov is a specific binary format, so this is the same thing as asking for a Windows .exe file in English ascii. As converting to FEN would blow up end-game tables in general by a factor of 1000 or so, I don't think FEN is a really practical format for any EGT except perhaps the simplest 3-men...
Do you know how many positions are in it?
Because then we can calculate how much diskspace it'll use.
I mean let's say currently 16 bytes per position and 10GB in nalimov format. That's about 671088640 positions. Let's say that each position with all data uses 128 bytes (a fen string is in my test on average less than 70 bytes). That gives you 85899345920 bytes which is 80GB. Then some mysql-overhead, maybe 200GB.

Am I overlooking something here?

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 4:20 pm
by hgm
flok wrote:Do you know how many positions are in it?
Roughly, for a single material combination:

Pawnless
3 men: 64^3/8 = 32k
4 men: 64^4/8 = 2M
5 men: 64^5/8 = 128M

with Pawns
3 men: 64^3/2 = 128k
4 men: 64^4/2 = 8M
5 men: 64^5/2 = 512M

Nalimov saves a bit on that by excluding illegal positions with neighboring Kings, but that is less than a factor of 2.
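
The rough counts above can be reproduced with a few lines of Python. This is only a sketch of the same back-of-the-envelope formula: 64^men placements divided by the symmetry factor (8 without Pawns, 2 with Pawns), for one side to move. The name positions_per_table and the 70 bytes per FEN (the average mentioned earlier in the thread) are assumptions for illustration, and the saving from excluding neighboring Kings is ignored.

```python
GIB = 1024 ** 3
FEN_BYTES = 70  # assumed average FEN length, taken from earlier in the thread


def positions_per_table(men, has_pawns):
    """Rough upper bound on positions for one material combination."""
    symmetry = 2 if has_pawns else 8  # Pawns leave only left-right mirroring
    return 64 ** men // symmetry


for has_pawns in (False, True):
    label = "with Pawns" if has_pawns else "Pawnless"
    for men in (3, 4, 5):
        n = positions_per_table(men, has_pawns)
        print(f"{label}, {men} men: {n} positions, "
              f"about {n * FEN_BYTES / GIB:.3f} GiB as bare FEN strings")
```

With these assumptions a single 5-men table with Pawns already comes to about 35 GiB of FEN text, before adding any result field or database overhead.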

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 4:50 pm
by bob
flok wrote:Hi,

Does anyone know of the nalimov database (3+4+5) in ascii/fen format?
Or are there tools for doing so?
Do you have any idea how big that would be? 7+ gigs at 1 or 2 bytes per entry. Compressed. This would choke a mule...

Re: nalimov database for 3, 4 and 5 in FEN format

Posted: Mon Jul 13, 2015 4:52 pm
by bob
flok wrote:
hgm wrote:For one, Nalimov is a specific binary format, so this is the same thing as asking for a Windows .exe file in English ascii. As converting to FEN would blow up end-game tables in general by a factor of 1000 or so, I don't think FEN is a really practical format for any EGT except perhaps the simplest 3-men...
Do you know how many positions are in it?
Because then we can calculate how much diskspace it'll use.
I mean let's say currently 16 bytes per position and 10GB in nalimov format. That's about 671088640 positions. Let's say that each position with all data uses 128 bytes (a fen string is in my test on average less than 70 bytes). That gives you 85899345920 bytes which is 80GB. Then some mysql-overhead, maybe 200GB.

Am I overlooking something here?
Quite probably. :)

Nalimov stores most positions using one byte. Some require 2 when the distance to mate passes something like 125 or so. And they are highly compressed on top of that...