Compression of Underdetermined Data in a 7-Piece Chess Table

Dave Gomboc
Posts: 19
Joined: Sun Aug 15, 2021 12:22 am
Full name: Dave Gomboc

Compression of Underdetermined Data in a 7-Piece Chess Table

Post by Dave Gomboc »

It seems surprising to me that the paper "Compression of Underdetermined Data in a 7-Piece Chess Table" (https://link.springer.com/article/10.31 ... 1916010076), while citing the authors of the RE-PAIR compression method (https://ieeexplore.ieee.org/document/755679), doesn't also cite Syzygy's use of the RE-PAIR compression method. The commit history of https://github.com/syzygy1/tb shows that by April 2013, Syzygy was using the RE-PAIR compression method for compressing its six-piece endgame tables, which precedes the paper by at least a couple of years.

Can anyone clearly explain which portions of this paper represent novel contribution(s) by its authors, and which portions of this paper are reproducing or making a minor update to prior work?
BeyondCritics
Posts: 410
Joined: Sat May 05, 2012 2:48 pm
Full name: Oliver Roese

Re: Compression of Underdetermined Data in a 7-Piece Chess Table

Post by BeyondCritics »

As far as I know, the author never produced anything citable, which in turn means he never gave anybody credit either. He should explain what he has done there; that would help both science and himself.
Last edited by BeyondCritics on Mon Apr 07, 2025 3:54 am, edited 1 time in total.
phhnguyen
Posts: 1517
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Compression of Underdetermined Data in a 7-Piece Chess Table

Post by phhnguyen »

It is not a new problem. Ronald de Man explained his anger about that paper, and his reasons for it, in an old post on this forum a long time ago. I'm in the office, so I can't look for it right now, but you can search for that post.
https://banksiagui.com
The most feature-rich chess GUI, based on the open-source Banksia chess tournament manager
Dave Gomboc
Posts: 19
Joined: Sun Aug 15, 2021 12:22 am
Full name: Dave Gomboc

Re: Compression of Underdetermined Data in a 7-Piece Chess Table

Post by Dave Gomboc »

phhnguyen wrote: Mon Apr 07, 2025 3:52 am It is not a new problem. Ronald de Man explained his anger about that paper, and his reasons for it, in an old post on this forum a long time ago. I'm in the office, so I can't look for it right now, but you can search for that post.
Thanks. It seems that viewtopic.php?t=60222 may be the post to which you're referring. Reviewing it, I found reference to an older thread that is still online today at the CCRL Discussion Board, which has an area for discussing Endgame Tablebases. Ronald de Man references RE-PAIR in this 2008 posting: http://kirill-kryukov.com/chess/discuss ... 300#p37300.
syzygy
Posts: 5671
Joined: Tue Feb 28, 2012 11:56 pm

Re: Compression of Underdetermined Data in a 7-Piece Chess Table

Post by syzygy »

Dave Gomboc wrote: Sun Apr 06, 2025 10:50 pm It seems surprising to me that the paper "Compression of Underdetermined Data in a 7-Piece Chess Table" (https://link.springer.com/article/10.31 ... 1916010076), while citing the authors of the RE-PAIR compression method (https://ieeexplore.ieee.org/document/755679), doesn't also cite Syzygy's use of the RE-PAIR compression method. The commit history of https://github.com/syzygy1/tb shows that by April 2013, Syzygy was using the RE-PAIR compression method for compressing its six-piece endgame tables, which precedes the paper by at least a couple of years.

Can anyone clearly explain which portions of this paper represent novel contribution(s) by its authors, and which portions of this paper are reproducing or making a minor update to prior work?
Indeed, this paper is a near-total rip-off. I once posted some of my compression ideas to their Google Plus page, and their response was entirely dismissive. Then they ended up implementing all the ideas (which is fine) and wrote them down in a paper without any attribution (which is not fine).

I considered writing to the editors of the journal, but in the end I could not be bothered enough :).
Also, I can blame myself for never properly writing stuff up.
abulmo2
Posts: 460
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Compression of Underdetermined Data in a 7-Piece Chess Table

Post by abulmo2 »

Dave Gomboc wrote: Sun Apr 06, 2025 10:50 pm It seems surprising to me that the paper "Compression of Underdetermined Data in a 7-Piece Chess Table" (https://link.springer.com/article/10.31 ... 1916010076), while citing the authors of the RE-PAIR compression method (https://ieeexplore.ieee.org/document/755679), doesn't also cite Syzygy's use of the RE-PAIR compression method. The commit history of https://github.com/syzygy1/tb shows that by April 2013, Syzygy was using the RE-PAIR compression method for compressing its six-piece endgame tables, which precedes the paper by at least a couple of years.

Can anyone clearly explain which portions of this paper represent novel contribution(s) by its authors, and which portions of this paper are reproducing or making a minor update to prior work?
As I understand it, Zakharov et al. are behind the Lomonosov 7-piece endgame tables, published in 2012. It is possible that their use of RE-PAIR compression predates Syzygy's, although their papers were published later.
Richard Delorme
syzygy
Posts: 5671
Joined: Tue Feb 28, 2012 11:56 pm

Re: Compression of Underdetermined Data in a 7-Piece Chess Table

Post by syzygy »

abulmo2 wrote: Sat Apr 19, 2025 3:03 am
Dave Gomboc wrote: Sun Apr 06, 2025 10:50 pm It seems surprising to me that the paper "Compression of Underdetermined Data in a 7-Piece Chess Table" (https://link.springer.com/article/10.31 ... 1916010076), while citing the authors of the RE-PAIR compression method (https://ieeexplore.ieee.org/document/755679), doesn't also cite Syzygy's use of the RE-PAIR compression method. The commit history of https://github.com/syzygy1/tb shows that by April 2013, Syzygy was using the RE-PAIR compression method for compressing its six-piece endgame tables, which precedes the paper by at least a couple of years.

Can anyone clearly explain which portions of this paper represent novel contribution(s) by its authors, and which portions of this paper are reproducing or making a minor update to prior work?
As I understand it, Zakharov et al. are behind the Lomonosov 7-piece endgame tables, published in 2012. It is possible that their use of RE-PAIR compression predates Syzygy's, although their papers were published later.
Nope, they took it from my generator and/or my posts on various forums. This is of course fine, but they committed severe academic fraud by then publishing it without attribution.
https://web.archive.org/web/20180713195 ... C252LggQQS
7-man TB on April 12, 2013 wrote:Found an interesting research.

http://kirill-kryukov.com/chess/discuss ... 3e52f02879

Although 6-man generator is not something exciting there are two very interesting ideas in the project.

1) Using RE-PAIR algorithm for tablebases compress. Compared to LZMA (which is used for LTB) the advantage of RE-PAIR is having common dictionary for the whole file (LZMA builds separate dictionary for each block of data). This provides better compression for small blocks allowing to make them as small as you need. So two important problems can be solved: (a) better compression, (b) smaller blocks and accordingly faster access time.

2) Probing code keeps compressed blocks in memory. So memory usage can be times better. But you need to pay access time for this. So it is still unclear whether to use or not to use this method.

Possibly some other findings from the project should be checked, but we definetly need to check how RE-PAIR compression will work for 7-man DTM-tables.

It would be good if we will be able to migrate to 4K or 8K blocks instead of current 16K blocks. But many things are unclear for a while. compression ratio, reasonable dictionary size, compression time,
And the similarities between what is in the paper and what is in my generator go way beyond the use of Re-Pair compression.
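
For readers who have not looked at Re-Pair before, point 1) of the 2013 post quoted above (one dictionary shared by the whole file, so that small blocks still compress well and can be decoded on their own) can be illustrated with a toy sketch. This is not the code from my generator or from the paper. The function names (repair_compress, replace_pair, decode_block) and the min_count parameter are made up for illustration, and a real implementation would use priority queues and an entropy coder instead of rescanning everything on each pass.

```python
from collections import Counter

def replace_pair(seq, pair, sym):
    """Replace non-overlapping occurrences of `pair` in `seq` with `sym`, left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(sym)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

def repair_compress(blocks, min_count=2):
    """Toy Re-Pair: repeatedly replace the most frequent adjacent pair,
    counted over ALL blocks, with a fresh symbol. Pairs never span a
    block boundary, so every block stays independently decodable."""
    seqs = [list(b) for b in blocks]   # symbols 0..255 are literal bytes
    rules = {}                         # shared dictionary: symbol -> (left, right)
    next_sym = 256
    while True:
        counts = Counter()
        for s in seqs:
            # Naive count: overlapping pairs (e.g. in "aaa") are overcounted,
            # which costs some compression but never correctness.
            counts.update(zip(s, s[1:]))
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n < min_count:
            break
        rules[next_sym] = pair
        seqs = [replace_pair(s, pair, next_sym) for s in seqs]
        next_sym += 1
    return rules, seqs

def decode_block(seq, rules):
    """Expand one block using only the shared dictionary, no other block needed."""
    out = bytearray()
    for sym in seq:
        stack = [sym]
        while stack:
            s = stack.pop()
            if s < 256:
                out.append(s)
            else:
                left, right = rules[s]
                stack.append(right)    # pushed first so that left is expanded first
                stack.append(left)
    return bytes(out)

blocks = [b"abababab", b"ababcc", b"zz"]
rules, seqs = repair_compress(blocks)
restored = [decode_block(s, rules) for s in seqs]
```

Because the rules are learned from all blocks at once, even a very short block benefits from repetition that mostly occurs elsewhere in the file, which is the property the quoted post contrasts with per-block LZMA dictionaries.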