jja: convert CTG books to PolyGlot format (and more!)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

alpltl
Posts: 57
Joined: Tue Mar 14, 2023 3:04 pm
Location: Berlin
Full name: Ali Polatel

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

I am happy to announce that jja-0.6.1 has been released with many important fixes for BrainLearn experience files. With this version, the jja subcommands find, dump, and restore work with BrainLearn files, which means you can query single positions, convert to PGN, dump to a stream of JSON arrays, edit them as necessary, and restore them again with "jja restore". BrainLearn experience files must have the "exp" extension to be recognized by jja. Since BrainLearn version 25.1 (with this commit) and ShashChess version 33.1 (with this commit), both engines also expect experience files to have the ".exp" extension, so jja is consistent with recent upstream. You can simply rename old experience files from the ".bin" extension to ".exp".

Download

- Windows: jja-0.6.1.exe, sha512sum, signature
- Linux-Glibc: jja-0.6.1-glibc.bin, sha512sum, signature
- Linux-Musl: jja-0.6.1-musl.bin, sha512sum, signature

To build from source, use

cargo install jja

ChangeLog
## 0.6.1

- fix deserializing of promotions of BrainLearn experience file entries.
- edit learned to convert BrainLearn experience files to PGN. Note a lot of memory
may be required in converting these files. This is planned to be improved in the
future.
- fix deserializing of castling moves of BrainLearn experience file entries.
- fix `jja restore` which incorrectly wrote BrainLearn entries in big-endian rather
than little-endian.
- fix sorting of BrainLearn entries in `jja find` tree output.
- upgrade `rust-embed` crate from `6.7` to `6.8`.
- upgrade `num_cpus` crate from `1.15` to `1.16`.
- find can now correctly query BrainLearn experience files using Stockfish
compatible Zobrist hashes.
- hash learned `-S, --stockfish` to generate Stockfish compatible Zobrist hashes.
- New public module `jja::stockfish`, and new public function
`jja::stockfish::stockfish_hash` to generate Stockfish compatible Zobrist hashes.
Before calling this function, the `jja::stockfish::zobrist::init` function must be
called once to compute hashtables used in Zobrist hashing.
- New public functions `jja::chess::{de,}serialize_chess` to serialize/deserialize a
`shakmaty::Chess` instance to/from an array of 5 unsigned 64-bit numbers. `jja
dump` uses this functionality in PGN dumps.
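
The byte-order fix for `jja restore` mentioned in the changelog can be illustrated with a minimal sketch. The `Entry` struct and its 12-byte layout are hypothetical and chosen only for illustration; they are not the real BrainLearn record layout or jja's code. The point is only that the same value produces different bytes in little-endian and big-endian order, so a reader expecting one order misreads data written in the other.

```rust
/// Hypothetical experience entry, for illustration only; the real
/// BrainLearn record layout may differ.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Entry {
    key: u64,
    depth: i32,
}

/// Serialize the entry in little-endian byte order.
fn to_le(entry: &Entry) -> [u8; 12] {
    let mut buf = [0u8; 12];
    buf[..8].copy_from_slice(&entry.key.to_le_bytes());
    buf[8..].copy_from_slice(&entry.depth.to_le_bytes());
    buf
}

/// Deserialize an entry from little-endian bytes.
fn from_le(buf: &[u8; 12]) -> Entry {
    let mut key = [0u8; 8];
    let mut depth = [0u8; 4];
    key.copy_from_slice(&buf[..8]);
    depth.copy_from_slice(&buf[8..]);
    Entry {
        key: u64::from_le_bytes(key),
        depth: i32::from_le_bytes(depth),
    }
}

fn main() {
    let entry = Entry { key: 0x0102_0304_0506_0708, depth: 20 };
    let bytes = to_le(&entry);
    // In little-endian order the least significant byte comes first.
    assert_eq!(bytes[0], 0x08);
    // Round-tripping through the same byte order restores the entry.
    assert_eq!(from_le(&bytes), entry);
    // The big-endian encoding of the key is a different byte sequence.
    assert_ne!(entry.key.to_be_bytes(), entry.key.to_le_bytes());
    println!("{:02x?}", bytes);
}
```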
Caissa-AI, Caissa-Test, and Caissa-X on LiChess
ChessWoB: Chess without Boundaries
jja: Jin, Jîyan, Azadî!
Follow @alip on Mastodon!

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

Hello Chess enthusiasts,

I'm thrilled to announce the release of jja-0.7.0, the newest iteration of our powerful chess opening book tool. This version comes with a significant number of enhancements, performance improvements, and some key bug fixes.

In jja-0.7.0, we've managed to improve the efficiency of our tool significantly, thanks to the implementation of buffered I/O during PGN conversion. We've also enabled a progress bar during PGN conversion to give you a clear idea of the operation's progress.

We've redesigned key aspects of the CTG interface to make it leaner and the PGN converter is now much more memory efficient, making it possible to convert huge opening books without running out of system memory.

We've updated the PolyGlotBook implementation for better usability and improved the overall efficiency of various functions including edit, merge, and find. These include the new ability to merge BrainLearn experience files, in-place editing of these files, and more.

We've also brought about substantial improvements in the OBK, ABK, and CTG opening book functions, including the efficient memory mapping of the files, implementation of progress bars, and improved editing capabilities.

Another key enhancement is that the dump command learned a binary format to dump PGNs in PostgreSQL binary output format, bringing more flexibility to your data handling processes.

We're excited to share that the tool now supports the full range of Numeric Annotation Glyphs (NAGs) in CTG, and it prioritizes CTG coloured move recommendations over NAGs in both ABK priority calculation and Polyglot weight calculation.

The release comes with a multitude of other improvements, bug fixes, and changes aimed at improving the tool's usability, efficiency, and accuracy. Please see the full changelog for more details.

We continue to value your feedback, and we're looking forward to making the jja tool even better with your help. Gens una sumus!

Download

- Windows: jja-0.7.0.exe, sha512sum, signature
- Linux-Glibc: jja-0.7.0-glibc.bin, sha512sum, signature
- Linux-Musl: jja-0.7.0-musl.bin, sha512sum, signature

To build from source, use

cargo install jja

ChangeLog
## 0.7.0

- enable the lint `#![deny(clippy::cast_precision_loss)]` for jja library and fix
offending code.
- edit learned to use buffered I/O, rather than direct I/O during PGN conversion to
improve efficiency, especially with huge opening books.
- edit learned to display a progress bar during PGN conversion.
- CTG interface has seen many **breaking change**s to make the interface leaner,
with regards to the recent mmap changes. As a result, the function
`jja::ctg::find_piece` returns an `Option<i32>` rather than a `Result<i32,
std::io::Error>`, the functions `jja::ctgbook::CtgBook::extract_all{,2}` return a
`CtgTree` rather than a `Result<CtgTree, std::io::Error>`, and the function
`jja::ctgbook::CtgBook::lookup_moves` returns an `Option<Vec<CtgEntry>>` rather
than a `Result<Option<Vec<CtgEntry>>, std::io::Error>`.
- edit now uses a much more efficient PGN converter implementation which does not
load the whole tree of variations in memory. This makes it possible to convert
huge opening book files into PGN without running out of system memory. Moreover,
the PGN converter now respects the `-f, --fen`, and `-p, --pgn` arguments of `jja
edit` so it is possible to convert a subtree of the opening book into PGN. See the
respective issues [#14](https://todo.sr.ht/~alip/jja/14), and
[#16](https://todo.sr.ht/~alip/jja/16) for more details. This change deprecates
the function `jja::chess::pgn_from_tree`, it is recommended to use the new
`write_pgn` function of the respective opening book file.
- `jja::polyglotbook::PolyGlotBook` implementation has seen many improvements to be
a leaner interface. `PolyGlotBook::get{,_key}` functions no longer panic on
invalid indexes; instead they return an `Option<BookEntry>` rather than a `BookEntry`.
This implementation avoids an ugly hack to map a dummy anonymous memory region for
PolyGlot books with zero-size to allow creating PolyGlot books from scratch (e.g:
`touch new-file.bin && jja edit -i new-file.bin`). As these functions are public,
this is a **breaking change**.
- merge now supports merging BrainLearn experience files together. The option
`--weight-cutoff` has been renamed to `--cutoff` and now supports filtering out by
min depth in BrainLearn experience files.
- edit now supports editing BrainLearn experience files. In-place editing of such
files is also supported.
- reduce minimum supported Rust version (MSRV) from `1.70` to `1.64` for portability.
- bring back the dependency on `is_terminal` crate rather than depending on >=rust-1.70.
- downgrade `pgn-reader` crate from `0.25` to `0.24`.
- downgrade `shakmaty` crate from `0.26` to `0.25`.
- upgrade `regex` crate from `1.8` to `1.9`.
- upgrade `smallvec` crate from `1.10` to `1.11`.
- The `progress_bar` member of `jja::obkbook::ObkBook` has been removed, in return
the public functions `jja::obkbook::ObkBook::{tree,traverse_tree}` require an
optional reference to a progress bar now. This avoids a needless clone of the
progress bar and it is a **breaking change**. Moreover, the function
`jja::obkbook::ObkBook::read_moves` has been renamed to `load` which is again a
**breaking change**.
- find now memory maps the OBK opening book files rather than reading the whole file
into memory at once for efficiency. This caused a change in public function
signatures of `jja::obkbook::ObkBook::{read_moves,traverse_tree,tree}` which is a
**breaking change**.
- dump learned `binary` format to dump PGNs in PostgreSQL binary output format. This
brings in a dependency on crate `pgcopy`.
- find now memory maps the PolyGlot opening book files rather than maintaining a
`BufReader<File>` handle to them. This removes the `book` public member of
`jja::polyglotbook::PolyGlotBook`, and changes signatures of public functions
`jja::polyglotbook::PolyGlotBook::{lookup_moves,tree}` which is a **breaking
change**. This also changes names of the public functions
`jja::polyglotbook::PolyGlotBook::{find_book_key,read_book_entry,read_book_key}`
to `jja::polyglotbook::PolyGlotBook::{find,get,get_key}` respectively which is
again a **breaking change**. Moreover `jja::polyglotbook::PolyGlotBook`'s default
iterator implementation has been changed to iterate over single book entries. The
newly introduced function `jja::polyglotbook::PolyGlotBook::into_iter_grouped()` may be
used to iterate over entries grouped by key.
- check for whether standard error is a TTY, rather than standard output when
displaying progress bars. This allows commands such as dump to display progress
bars during execution.
- optimize the stockfish hash function implementation, making it almost twice as
fast.
- hash learned `-B`, `--benchmark`, and `-I`, `--benchmark-iterations` to benchmark
Stockfish and Zobrist hash functions. This brings in a dependency on
`benchmarking` crate.
- Avoid translating CSV headers of ABK entries in `jja find` output. Note, this
change is for CSV output only which is printed when the output is not a TTY or the
option `--porcelain=csv` is given.
- The `progress_bar` member of `jja::ctgbook::CtgBook` has been removed, in return
the public functions `jja::ctgbook::CtgBook::extract_all{,2}` require an optional
reference to a progress bar now. This avoids a needless clone of the progress bar
and it is a **breaking change**.
- avoid needless conversion to and from EPD in `jja::ctgbook::CtgBook` functions
improving efficiency. The function `jja::ctgbook::CtgBook::lookup_moves` now
accepts a `&dyn shakmaty::Position` rather than an EPD string which is a
**breaking change**.
- drop unused public functions `jja::ctg::find_piece`, `jja::ctg::decode_fen_board`,
`jja::ctg::invert_board`, `jja::ctg::needs_flipping`, `jja::ctg::flip_board`
which is a **breaking change**.
- quote now also accepts a search term as a case-insensitive regular expression as
well as a quote index.
- The `progress_bar` member of `jja::abkbook::AbkBook` has been removed, in return
many public functions of `jja::abkbook::AbkBook` require an optional reference to
a progress bar now. This avoids a needless clone of the progress bar and it is a
**breaking change**.
- find now memory maps the ABK opening book files rather than reading the whole file
into memory at once for efficiency. `jja::abkbook::AbkBook` no longer implements
`Clone` which is a **breaking change**.
- edit now uses buffered writing when converting books to the ABK opening book
format which improves efficiency. The function `jja::AbkBook::write_file` has been
changed to take a `BufWriter<W: Seek + Write>` as argument rather than a `File`
which is a **breaking change**.
- the positions table, `p`, in jja-0 databases now has an index on id, `p_idx` so
as to be able to query for Zobrist hash collisions more efficiently.
- info prints file type in uppercase rather than lowercase now.
- edit learned to print Polyglot book information after a successful CTG conversion,
like we already do for CTG to ABK conversions.
- CTG gained support for the full range of NAGs, `$0 - $255`, thus edit no longer
panics when stumbling upon a previously unsupported NAG. Note, only the move
assessments, `$1 - $9`, are used in Polyglot weight and ABK priority calculation
during edit. Other NAGs are merely used for display for the find subcommand. Note,
this changes the `jja::ctg::Nag` public type, and hence is a **breaking change**.
- dump learned `-f=<FORMAT>`, `--format=<FORMAT>` argument to choose the dump format
of PGN dumps. This option has no effect on non-PGN dumps. The function
`jja::pgn::pgn2csv` has been renamed to `jja::pgn::pgn_dump` which is a
**breaking change**.
- prioritize CTG coloured move recommendations over Numeric Annotation Glyphs (NAGs)
in ABK priority calculation. edit learned three command-line parameters which are
`--color-priority-green`, `--color-priority-blue`, and `--color-priority-red`.
Their default values are `9`, `5`, and `1` respectively.
- make learned `--min-wins` which can be used to filter moves by their win count.
The default value is `0` which has no effect.
- prioritize CTG coloured recommendations over Numeric Annotation Glyphs (NAGs) in
Polyglot weight calculation. edit learned three command-line parameters which are
`--color-weight-green`, `--color-weight-blue`, and `--color-weight-red`. Their
default values are `10000`, `1000`, and `1` respectively.
- CTG numerical annotation glyph `7` which is a forced move is now supported.
Previously, we mistakenly used `8` which was an only move, not a forced move.
Although the distinction is not really clear, we've implemented `ctg::Nag::Only`
in addition to `ctg::Nag::Forced`, and the corresponding command-line flags for
jja edit are `--nag-weight-only=<WEIGHT>`, and `--nag-priority-only=<PRIO>`. Both
defaults are identical to the default values of `--nag-weight-forced`, and
`--nag-priority-forced`.
- **important fix**: prevent the `jja::CtgBook::search_position` function from mistakenly missing
some positions, which caused some huge CTG books, larger than ~2.5-3G, to be seen as
having 0 positions during edit, or caused position lookups to fail during find. A
multiplication overflow in `jja::CtgBook::read_page` function is also fixed.
- make informs the user about the `--{win,draw,loss}-factor` values during filtering as
they're also important in determining weight, and preservation of entries.
- fix a bug in make where negative values in `--draw-factor`, and `--loss-factor`
were not counted as deficits.
- dump and restore learned an experimental conversion from PGN to an array of chess position Zobrist
keys and setups in streaming JSON. restore can save these dumps into a sqlite3
database with `.jja-0` extension. The format of this file is **experimental**, and
is subject to change. Once this format is stable, the extension `.jja-1` is going
to be used. We're using this format currently only to detect Zobrist hash
collisions.
- use the [`XorShift`](https://www.jstatsoft.org/v08/i14/paper) random number
generator to randomly pick moves during random playouts using the play command. This
algorithm is cryptographically insecure but is very fast. See the benchmark
[here](https://git.sr.ht/~alip/jja/commit/b265 ... 80b528919b)
- **breaking change**: `jja::quote::print_quote` now requires a second argument which
is a boolean which specifies whether the output should be formatted with ANSI
colour codes or not. By default, when the standard output is not a TTY,
`print_quote` will now print quote, and author information without styling.
- fix PGN dumps to produce proper JSON arrays, previously the array markers `[]`
were erroneously not printed out.
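
For readers unfamiliar with the XorShift generator mentioned in the changelog, here is a minimal sketch of a 64-bit variant using Marsaglia's 13/7/17 shift triple. This is not jja's code: the type name `XorShift64`, the zero-seed fallback constant, and the modulo-based `pick` helper are assumptions made for illustration. As the changelog notes, this family of generators is fast but not cryptographically secure.

```rust
/// Minimal xorshift64 generator (shift triple 13, 7, 17).
/// Illustrative sketch only; not taken from jja.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// The state must never be zero, otherwise the generator stays at zero.
    /// The fallback constant is an arbitrary non-zero value.
    fn new(seed: u64) -> Self {
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        XorShift64 { state }
    }

    /// Advance the state and return the next pseudo-random value.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Pick an index in `0..len`. `len` must be non-zero. The modulo
    /// introduces a slight bias, which is acceptable for a sketch.
    fn pick(&mut self, len: usize) -> usize {
        (self.next_u64() % len as u64) as usize
    }
}

fn main() {
    // Hypothetical candidate moves for one position.
    let moves = ["e2e4", "d2d4", "g1f3", "c2c4"];
    let mut rng = XorShift64::new(2023);
    for _ in 0..3 {
        println!("{}", moves[rng.pick(moves.len())]);
    }
}
```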
chesskobra
Posts: 216
Joined: Thu Jul 21, 2022 12:30 am
Full name: Chesskobra

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by chesskobra »

Hi, I downloaded jja-0.7.0-glibc.bin, but when I run jja-0.7.0-glibc.bin -V, I get jja v0.6.1-111-gf057314. So I am wondering if I have the correct version.

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

chesskobra wrote: Thu Jul 13, 2023 1:45 am
Hi, I downloaded jja-0.7.0-glibc.bin, but when I run jja-0.7.0-glibc.bin -V, I get jja v0.6.1-111-gf057314. So I am wondering if I have the correct version.

Good spot! I am very sorry to say that I uploaded the wrong build. It is identical to 0.7.0, except that it reports a different version.
I have now uploaded the correct builds; you may download them again using the same URL.

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

Hello everyone,

The trusty "jja" tool has rolled out its v0.8.0 update. Let's cut to the chase and highlight some of the advanced changes.

## New Commands
- Perft: Count legal move paths of a set length.
- Probe: Akin to Fathom, it probes Syzygy tablebases up to 7 pieces and introduces a swift "--test --fast" mode.

## PGN Generation Fix
- Re-export your PGNs. The repetition tracker issue that was omitting certain lines has been rectified.

## Enhanced Editing & Dumping
- Edit: Convert PGNs (including compressed) to EPD files with new command options.
- Dump: Precision PGN dumping with the "-e"/"--elements" option.

## Crate Updates
- Fresh versions for "pgn-reader", "shakmaty", and "tempfile".

## Breaking Changes
Be sure to skim through these, as a few public API function signatures have been altered.

## Error Handling
Transition to the "anyhow" crate should refine the error handling process.

## Misc
The command "hash" has morphed into "digest".

## Full Changelog
- new "perft" subcommand to count legal move paths of a given length
- upgrade "pgn-reader" crate from "0.24" to "0.25"
- upgrade "shakmaty" crate from "0.25" to "0.26"
- upgrade "tempfile" crate from "3.6" to "3.7"
- important fix in PGN generation to avoid skipping some lines due to the incorrect
usage of the repetition tracker. This affects PGN generation from all supported
opening book formats ("abk", "bin", "ctg", "exp", "obk") so users are highly
recommended to re-export their previously exported PGNs using `jja edit
book.{abk,bin,ctg,exp,obk} -o book.pgn`.
- bump MSRV (minimal supported Rust version) from "1.64" to "1.70".
- quote now matches at word boundaries unless the given quote pattern includes the
wildcard characters ".", "?", or "*".
- jja now ignores invalid castling rights and en passant squares in PGN games' FEN
header rather than skipping the game.
- edit learned to dump all positions in a PGN file to an output EPD file. Compressed
PGN files are supported, hence the usage is
"jja edit source.pgn{,.bzip2,.gz,.lz4,.xz,.zst} -o output.epd".
- dump learned "-e", "--elements" to specify a list of elements to dump which
defaults to "id, position". This option works only for PGN dumps.
- new subcommand "probe" which can be used to probe Syzygy tablebases up to 7
pieces. This is almost functionally identical to the awesome Fathom tool, but also
offers some alternative modes such as "--test --fast" to skip walking the DTZ line
at the cost of misevaluating "MaybeWin" and "MaybeLoss" positions.
- edit learned two new command line options "--max-ply=<PLY>", and
"--look-ahead=<PLY>". The former limits PGN generation to a certain number of
plies, and defaults to "1024". The latter is used to specify the number of plies
to look ahead on PolyGlot book lookup misses during PGN generation which is useful
to generate PGNs of books generated by "--only-black|white". Currently
"--look-ahead" only supports "0" and "1" as argument, with "0" being the
default value; other values will generate an "unimplemented" panic which directs
the user to report a bug. This change comes along with a change in the public
function signature of "jja::polyglotbook::PolyGlotBook::write_pgn" which is a
breaking change.
- The function "jja::polyglot::from_move" now takes a reference to a
"shakmaty::Move" as argument rather than a "shakmaty::Move", which is a breaking change.
- The default values of some weight conversion arguments to "jja edit" have been
changed to be more intuitive, in that default value of "--color-weight-green" has
been changed from "10000" to "65520", "--color-weight-blue" from "1000" to
"32768", "--nag-weight-good" from "9000" to "65280", "--nag-weight-hard" from
"10000" to "65520", "--nag-weight-interesting" from "7500" to "61440",
"--nag-weight-forced" from "10000" to "65520", and "--nag-weight-only" from
"10000" to "65520". The new defaults are empirical, and advanced users are
recommended to use "--color-weight-*", and "--nag-weight-*" arguments with "jja
edit" for fine-tuning during "CTG->BIN" or "CTG->ABK" conversions.
- Important fix in "jja::ctgbook::CtgBook::read_page" function to prevent panics
due to out-of-bounds access. This fixes search & conversion with some huge CTG
books.
- The function "jja::abkbook::AbkBook::traverse_book_and_merge" now returns
nothing rather than "Result<(), Box<dyn std::error::Error>>" which is a **breaking
change**.
- The function "jja::chess::lines_from_tree" now returns a "Table" rather than a
"Result<Table, Box<dyn std::error::Error>>" which is a breaking change.
- Use "anyhow" crate for error handling. This is mainly used in the main code, and
not the library code but there are points where the library code is changed, and
there are breaking changes: "jja::pgnbook::create_opening_book" now returns an
"anyhow::Result<GameBase>", rather than
"Result<GameBase, Box<dyn std::error::Error>>". Similarly the function
"jja::pgn::pgn_dump" now returns "anyhow::Result<()>" rather than
"Result<(), Box<dyn std::error::Error>>".
- The functions "jja::system::get_progress_{bar,spinner}" now return a "ProgressBar"
rather than a "Result<ProgressBar, Box<dyn std::error::Error>>". These functions
now panic when there is an error in the template, which shouldn't happen normally.
This change is a breaking change.
- The function "jja::system::edit_tempfile" now returns "EditTempfileError" on error
rather than "Box<dyn std::error::Error>". Moreover the "EditTempfileError" enum
has a new member "EditTempfileError::InputOutputError(std::io::Error)". These
changes are in the public API and hence are breaking changes.
- restore learned experimental EPD output support. This may be used in a pipeline
with dump, e.g: "jja dump -fcsv file.pgn | jja restore file.epd" to create an
Extended Position Description file of all the positions in the given Portable Game
Notation file. The EPD entries include the Zobrist hash of the positions in the "id"
field.
- fix deserializing of Chess960 castling rights in "jja::chess::deserialize_chess".
- "jja::chess::deserialize_chess" function now panics on invalid piece indexes
rather than silently continuing.
- "hash" subcommand has been renamed to "digest". The former will work as an alias
until the next major version bump.

## Download
* Windows: jja-0.8.0.exe, sha512sum, signature
* Linux GLibc: jja-0.8.0-glibc.bin, sha512sum, signature
* Linux Musl: jja-0.8.0-musl.bin, sha512sum, signature

To install from source, use "cargo install jja".

For those knee-deep in "jja" integrations, be sure to delve into the nitty-gritty of the changelog to maintain smooth operations.

Warm regards,
Ali

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

jja-0.8.1 has been released.

## Download

* Windows: jja-0.8.1.exe, sha512sum, signature
* Linux GLibc: jja-0.8.1-glibc.bin, sha512sum, signature
* Linux Musl: jja-0.8.1-musl.bin, sha512sum, signature

To install from source, use "cargo install jja".

## ChangeLog

- upgrade "tempfile" crate from "3.7" to "3.8".
- upgrade "clap" crate from "4.3" to "4.4".
- info learned to print the sha256 checksum of the chess file.
- fix a memory leak in builds with the "i18n" feature disabled.
- edit now converts CTG books to ABK books around 5% faster using 30% less
memory thanks to the new public functions "jja::ctgbook::CtgBook::extract_map{,2}"
which directly return a "BTreeMap<u64, Vec<SBookMoveEntry>>" rather than processing a
"CtgTree" into a "SBookMoveEntryHashMap" as an additional step.
- info now prints information about the size of a single entry.
- info now prints information about ABK number of entries.
- New public function "jja::abkbook::AbkBook::total_entries()" to calculate the
number of entries in an Arena opening book file.
- Various memory efficiency improvements for "CTG->ABK" conversion.
- edit now reads from standard input rather than spawning the default editor if the
standard input does not refer to a terminal, i.e. is not a TTY. This is useful for
programmatic editing of opening books. find output may be piped to edit which
makes it practical to write entries from an opening book to another opening book
of the same type. Finally this also gives users who are unable to use the editor
functionality a chance to edit their opening books by submitting a custom CSV
file via standard input.
- replace the final "lazy_static!" usages with "once_cell" and drop the dependency
on "lazy_static" crate. Our usage of "once_cell" is currently in Rust nightly and
is hopefully soon going to land in stable Rust.

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

jja-0.9.0 has been released.

## Download

* Windows: jja-0.9.0.exe, sha512sum, signature
* Linux GLibc: jja-0.9.0-glibc.bin, sha512sum, signature
* Linux Musl: jja-0.9.0-musl.bin, sha512sum, signature

To install from source, use "cargo install jja".

## A note on merge strategies

This release brings various merge strategies to be used in merging Polyglot opening books. Here is some brief information about them:

## MERGE STRATEGIES
Merge subcommand offers various merge strategies to customize how the
weight of equivalent move entries in both books are merged together.
Below you may find brief descriptions of different merge strategies
available, for more information read the code documentation of
`jja::merge::MergeStrategy`:

* avg: Calculates the average weight of the moves in both books.
* pavg (default): Merges by computing the weighted average of percentage
weights, taking into account the total number of entries in each book.
This approach gives higher importance to moves from larger books versus
smaller ones, ensuring that the resultant weights reflect the relative
contributions of each book based on its size.
* wavg: Calculates the weighted average weight of the moves in both
books, considering the given weights for each book. This strategy
requires the user to specify respective book weights using -w,
--weight1, and -W, --weight2 command line options.
* sum: Calculates the sum of the weights of the moves in both books.
* max: Takes the maximum weight of the moves in both books.
* min: Takes the minimum weight of the moves in both books.
* ours: Prioritizes the moves of the first book, ignoring moves from the
second book.
* geometric: Geometric scaling strategy focuses on
multiplying numbers together rather than adding, which can help in
equalizing disparities.
* harmonic: Calculates the harmonic mean weight of the moves in both
books. This approach tends to favor more balanced weights and is less
influenced by extreme values.
* sigmoid: Remap the weights in a non-linear fashion using the sigmoid
function. The idea here is to diminish the influence of extreme values,
which might be causing the dissatisfaction in previous strategies.
* sort: Calculates weight based on relative position in sorted move
entries.
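
To make the simpler strategies above concrete, here is a rough sketch of how two 16-bit PolyGlot weights for the same move might be combined. It is not the code behind `jja::merge::MergeStrategy`: the `Strategy` enum, the `merge_weight` function, the rounding, and the saturation at 65535 are all assumptions made for illustration. The pavg, geometric, sigmoid, and sort strategies are left out because they depend on details not given in the descriptions above.

```rust
/// Hypothetical subset of merge strategies, for illustration only.
#[derive(Clone, Copy)]
enum Strategy {
    Avg,
    Sum,
    Max,
    Min,
    Ours,
    Harmonic,
    /// Weighted average with per-book weights; w1 + w2 is assumed non-zero.
    Wavg { w1: f64, w2: f64 },
}

/// Combine the weights `a` (first book) and `b` (second book) of one move.
/// The result is rounded and clamped to the 16-bit range (an assumption).
fn merge_weight(a: u16, b: u16, strategy: Strategy) -> u16 {
    let (x, y) = (f64::from(a), f64::from(b));
    let merged = match strategy {
        Strategy::Avg => (x + y) / 2.0,
        Strategy::Sum => x + y,
        Strategy::Max => x.max(y),
        Strategy::Min => x.min(y),
        Strategy::Ours => x,
        // Harmonic mean of two values: 2xy / (x + y), defined as 0 when both are 0.
        Strategy::Harmonic => {
            if x + y == 0.0 {
                0.0
            } else {
                2.0 * x * y / (x + y)
            }
        }
        Strategy::Wavg { w1, w2 } => (x * w1 + y * w2) / (w1 + w2),
    };
    merged.round().clamp(0.0, f64::from(u16::MAX)) as u16
}

fn main() {
    let (a, b) = (100, 300);
    println!("avg      = {}", merge_weight(a, b, Strategy::Avg));
    println!("harmonic = {}", merge_weight(a, b, Strategy::Harmonic));
    println!("sum      = {}", merge_weight(60000, 60000, Strategy::Sum));
}
```

Note how the harmonic mean of 100 and 300 lands below the plain average, which matches the description of it being less influenced by the larger value.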

## ChangeLog

* merge cutoff specified using -c, --cutoff is now applied after the merge strategy assigns new weights. This way, cutoff may also be used to filter out unwanted entries after the merge.
* the default value of edit --color-weight-blue has been changed from 32768 to 65280 for consistency with --nag-weight-good.
* merge learned new merge strategy jja::merge::MergeStrategy::Sort to calculate weight based on the relative position of the move entry in sorted move entries.
* merge learned new merge strategy jja::merge::MergeStrategy::GeometricScaling to calculate the weight using geometric scaling: The geometric scale focuses on multiplying numbers together rather than adding, which can help in equalizing disparities.
* merge learned new merge strategy jja::merge::MergeStrategy::HarmonicMean to calculate the harmonic mean weight of the identical move entries in both books. This approach tends to favor more balanced weights and is less influenced by extreme values.
* merge learned new merge strategy jja::merge::MergeStrategy::Sigmoid to remap the weights in a non-linear fashion using the sigmoid function. The idea here is to diminish the influence of extreme values, which might be causing the dissatisfaction in previous strategies.
* edit now properly prioritizes CTG move entries with higher combined color, NAG, performance score such that the ordering is more close to the original.
* edit learned new commandline flags -C, --no-colors, and -N, --no-nags to avoid using CTG move colours and NAGs to assign PolyGlot move weights or Arena move priorities.
* merge and make commands now output info on the output book after a successful run.
* merge strategy defaults to the new pavg strategy rather than sum now.
* merge learned --strategy pavg which stands for jja::merge::MergeStrategy::PercentageAverage. This strategy merges by computing the weighted average of percentage weights, taking into account the total number of entries in each book. This approach gives higher importance to moves from larger books versus smaller ones, ensuring that the resultant weights reflect the relative contributions of each book based on its size.
* merge learned --rescale to rescale weights of merged book entries globally to fit into 16 bits. This is similar to edit --rescale but works on the final merged book rather than the input books.
* make now uses estimate-num-keys property of the temporary RocksDB database to determine the approximate unique position count rather than bulk scanning the database twice which improves performance.
* make now uses read ahead during the bulk scan of the temporary RocksDB database. This increases performance significantly especially on spinning disks. The read ahead size may be configured using the --read-ahead=<SIZE> commandline option. This defaults to 4 MB and may be disabled by passing 0 as size.
* make now uses asynchronous I/O during the bulk scan of the temporary RocksDB database. This increases performance significantly especially with the io_uring feature enabled. This may be disabled using the new --sync commandline argument or using the environment variable JJA_SYNC.
* drop smallvec crate usage, and remove dependency.
* replace humansize with bytefmt crate.
* jja::abk::traverse_tree function now returns a BTreeMap<u64, Vec<CompactBookEntry>> rather than a BTreeMap<u64, Vec<BookEntry>> which is a breaking change. Moreover this function now sorts the value vectors using binary search which makes abk->bin both more memory efficient and faster.
* jja::ctg::ByteBoard::from_position has been replaced with the function from_board which is a breaking change.
* jja::ctgbook::CtgBook::process_move function now utilizes binary search for move table lookup improving efficiency. jja::ctg::MOVETABLE's encoding element has changed its type from char to u8 which is a breaking change.
* edit converts CTG books to BIN books with 20% less memory usage using the new compact jja::polyglot::CompactBookEntry data structure.
* jja::abk::PackedSBookMoveEntry has been renamed to CompactSBookMoveEntry which is a breaking change.
* drop the unused functions jja::ctgbook::CtgBook::extract_all{,2} and the unused type jja::ctgbook::CtgTree both of which are breaking changes.
* rename jja::ctgbook::CtgBook::extract_map{,2} to extract_abk{,2} for consistency which is a breaking change.
* use extract_bin in CTG->BIN to improve efficiency
* implement new functions jja::ctgbook::CtgBook::extract_bin{,2}
* use human-panic crate for user friendly panic messages.
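
The idea behind rescaling merged weights globally so that they fit into 16 bits can be sketched as follows. This is only one plausible approach and not jja's implementation: the function name `rescale`, the linear scaling by the largest weight, and the choice to keep non-zero weights at a minimum of 1 are assumptions.

```rust
/// Linearly rescale weights so the largest one fits into a u16.
/// Sketch only; the scaling rule used by jja may differ.
fn rescale(weights: &[u32]) -> Vec<u16> {
    let max = weights.iter().copied().max().unwrap_or(0);
    let limit = u32::from(u16::MAX);
    if max <= limit {
        // Everything already fits, so the cast below cannot truncate.
        return weights.iter().map(|&w| w as u16).collect();
    }
    weights
        .iter()
        .map(|&w| {
            // Use 64-bit arithmetic so the multiplication cannot overflow.
            let scaled = u64::from(w) * u64::from(limit) / u64::from(max);
            // Assumption: a move with a non-zero weight should not vanish.
            if w > 0 && scaled == 0 {
                1
            } else {
                scaled as u16
            }
        })
        .collect()
}

fn main() {
    // Hypothetical merged weights, two of which exceed 16 bits.
    let merged = [131_070, 90_000, 1, 0];
    println!("{:?}", rescale(&merged));
}
```

Scaling by the global maximum preserves the relative ordering of moves across the whole book, which is what distinguishes a global rescale from clamping each weight independently.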

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by chesskobra »

I have a book of size about 3.3 MB. I ran the command

jja edit -o book.pgn book.bin

and it keeps writing the PGN. The PGN had grown to 6 GB when I pressed Ctrl-C. The original PGN from which I had constructed the book is only 400 MB in size. Why is the output so large?

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by alpltl »

chesskobra wrote: Tue Sep 12, 2023 2:57 pm
I have a book of size about 3.3 MB. I ran the command

jja edit -o book.pgn book.bin

and it keeps writing the PGN. The PGN had grown to 6 GB when I pressed Ctrl-C. The original PGN from which I had constructed the book is only 400 MB in size. Why is the output so large?

This is because the PolyGlot format has no information about move orders, and therefore jja includes all transpositions in the PGN. Without this, the PGN actually ends up too small and misses many legitimate lines. This behaviour was changed recently due to a user request.
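
A toy model shows how quickly the number of lines can outgrow the number of positions once transpositions are expanded. The sketch below is not jja's PGN converter. It builds a hypothetical position graph (the `lattice` function) in which every position has two moves and different move orders reach the same position, then counts root-to-leaf lines with a memoized walk.

```rust
use std::collections::HashMap;

/// Build a toy position graph: node (i, j) means "i moves of kind A and
/// j moves of kind B have been played", so different move orders transpose
/// into the same position. Play stops after `n` moves in total.
fn lattice(n: u32) -> HashMap<u32, Vec<u32>> {
    let id = |i: u32, j: u32| i * (n + 1) + j;
    let mut graph = HashMap::new();
    for i in 0..=n {
        for j in 0..=(n - i) {
            let mut moves = Vec::new();
            if i + j < n {
                moves.push(id(i + 1, j));
                moves.push(id(i, j + 1));
            }
            graph.insert(id(i, j), moves);
        }
    }
    graph
}

/// Count the distinct lines (root-to-leaf paths) starting at `node`.
fn count_lines(
    graph: &HashMap<u32, Vec<u32>>,
    node: u32,
    memo: &mut HashMap<u32, u64>,
) -> u64 {
    if let Some(&known) = memo.get(&node) {
        return known;
    }
    let mut total = 0u64;
    if let Some(children) = graph.get(&node) {
        for &child in children {
            total += count_lines(graph, child, memo);
        }
    }
    // A position without moves ends exactly one line.
    if total == 0 {
        total = 1;
    }
    memo.insert(node, total);
    total
}

fn main() {
    let graph = lattice(20);
    let lines = count_lines(&graph, 0, &mut HashMap::new());
    println!("positions: {}, lines: {}", graph.len(), lines);
}
```

With 20 plies this toy book has only 231 positions but 1,048,576 distinct lines, which is the same kind of blow-up as a 3.3 MB book producing a multi-gigabyte PGN.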
amchess
Posts: 345
Joined: Tue Dec 05, 2017 2:42 pm

Re: jja: convert CTG books to PolyGlot format (and more!)

Post by amchess »

I tried the ctg->abk conversion with the following command:

jja edit Book.ctg -o Book.abk -C -N --author=amchess --comment=amchessBook --probability-priority=2 --probability-games=15 --probability-win-percent=0
The book is created, but the probabilities in Arena are totally wrong (decreasing order), and overall the book is unusable with an engine (in demo mode, for example), unlike other abk books.
What am I doing wrong?