The Stockfish of shogi

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Fabian Fichter
Posts: 50
Joined: Mon Dec 12, 2016 2:14 pm

Re: The Stockfish of shogi

Post by Fabian Fichter »

In Fairy-Stockfish the base piece values are the same for all variants; they are only adjusted for a few rules that can heavily influence the dynamics, such as losing-chess rules, piece drops, and board size (for sliders). E.g., for drop games the piece values are scaled by a v_max/(v_max+v) formula, where v_max is around three times the value of a queen, so pieces with a high value v lose relative strength in drop games. Additionally, the piece values are halved for drop games to give a more natural scale for thresholds in futility pruning, razoring, SEE, etc., but this of course does not change their relative values.
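A rough sketch of that scaling in Python (the constants here are illustrative centipawn guesses, not Fairy-Stockfish's actual numbers):

```python
# Sketch of the drop-game scaling described above: v -> v * v_max / (v_max + v),
# then halved for a more natural pruning-threshold scale. All values are
# illustrative assumptions, not the engine's real constants.
QUEEN = 2500
V_MAX = 3 * QUEEN  # roughly three times a queen's value

def drop_value(v):
    """Scaled-then-halved piece value for drop variants."""
    return round(v * V_MAX / (V_MAX + v) / 2)

for name, v in [("pawn", 100), ("knight", 325), ("rook", 500), ("queen", QUEEN)]:
    print(f"{name}: {v} -> {drop_value(v)}")
```

Note the relative effect: with these assumed inputs the queen falls from 25 pawns to roughly 19 pawns, so high-valued pieces lose relative strength in drop games, as described.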

In crazyhouse, Fairy-Stockfish is only around 200 Elo weaker (~100 of which is due to speed) than the multi-variant Stockfish used on lichess, for which we heavily tuned dozens/hundreds of parameters (including piece values) specifically for crazyhouse, so the generic adaptations already seem to work well. However, I have little doubt that playing strength could be increased a lot for shogi by improving the evaluation, especially since the Stockfish-based engines using evaluation files show that the same search engine can yield a ~1500 Elo stronger engine; but currently my focus is on adding more features/variants.

Back to the original topic of the thread, I found some more info on the most recent generation of evaluation files for shogi at https://github.com/ynasu87/nnue/blob/ma ... s/nnue.pdf (in Japanese).
Abstract
Most of the strongest shogi programs nowadays employ a linear evaluation function, which is computationally efficient but lacks nonlinear modeling capability. This report presents a new class of neural-network-based nonlinear evaluation functions for computer shogi, called NNUE (Efficiently Updatable Neural-Network-based evaluation functions). NNUE evaluation functions are designed to run efficiently on CPU using various acceleration techniques, including incremental computation. The first shogi program with a NNUE evaluation function, the end of genesis T.N.K.evolution turbo type D, will be unveiled at the 28th World Computer Shogi Championship.
GregNeto
Posts: 35
Joined: Thu Sep 08, 2016 8:07 am

Re: The Stockfish of shogi

Post by GregNeto »

May I point to AobaZero? I don't know how strong it is.

http://www.yss-aya.com/aobazero/index_e.html

Hiroshi's Computer Shogi and Go site is a treasure for me; the samples for the UEC Computer Go workshop are great.

http://www.yss-aya.com/
User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: The Stockfish of shogi

Post by Ovyron »

Raphexon wrote: Wed Jan 08, 2020 1:42 pm Fairly sure her networks store chess piece values.
Sure, but the value depends on the position; it'd be something like "this Bishop here is useless, so it has some 0.80 value, and this other Bishop is a killer, so it's worth 6.00", not like A/B's "all Bishops are 3.00, but this one is locked in, so let's subtract some 2.20 penalty, and the other one is great, so let's add some 3.00 bonus." I guess my question is how strongly a chess engine can play without the starting base piece values, and whether in games like shogi (where pieces might be more valuable in the hand than on the board) a different approach from base piece values would work better.
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: The Stockfish of shogi

Post by Raphexon »

Ovyron wrote: Wed Jan 08, 2020 5:11 pm
Raphexon wrote: Wed Jan 08, 2020 1:42 pm Fairly sure her networks store chess piece values.
Sure, but the value depends on the position, it'd be something like "this Bishop here is useless so it has some 0.80 value and this other Bishop is a killer so it's worth 6.00" [...]
Sounds like a piece square table.
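For readers following along, a piece-square table in a classical evaluator looks roughly like this (a toy sketch with invented numbers, not any real engine's tables):

```python
# Toy piece-square table: a fixed base value plus a square-dependent
# bonus, as opposed to a network producing a fully position-dependent
# value. All numbers are invented for illustration.
BISHOP_BASE = 300  # centipawns

def centrality_bonus(rank, file):
    """Central squares earn a bonus; edges and corners earn nothing."""
    dist = max(abs(rank - 3.5), abs(file - 3.5))  # 0.5 (center) .. 3.5 (corner)
    return int(10 * (3.5 - dist))

BISHOP_PST = [[centrality_bonus(r, f) for f in range(8)] for r in range(8)]

def bishop_value(rank, file):
    return BISHOP_BASE + BISHOP_PST[rank][file]

print(bishop_value(3, 3), bishop_value(0, 0))  # prints: 330 300
```

The point of the contrast in the quote: here every bishop starts from the same 3.00-ish base and gets a square-dependent adjustment, whereas a network can in principle assign each bishop a value conditioned on the whole position.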
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: The Stockfish of shogi

Post by lkaufman »

Fabian Fichter wrote: Wed Jan 08, 2020 1:52 pm In Fairy-Stockfish the base piece values are the same for all variants, they are only adjusted for a few rules that can heavily influence dynamics like losing chess rules, piece drops, and board size (for sliders). [...]
Are there any chess engines using this NNUE? Is shogi programming now ahead of chess programming?
Komodo rules!
Fabian Fichter
Posts: 50
Joined: Mon Dec 12, 2016 2:14 pm

Re: The Stockfish of shogi

Post by Fabian Fichter »

lkaufman wrote: Wed Jan 08, 2020 7:29 pm
Fabian Fichter wrote: Wed Jan 08, 2020 1:52 pm [...]
Are there any chess engines using this NNUE? Is shogi programming now ahead of chess programming?
Yes, it seems like that is the most recent standard; "Kristallweizen" apparently also uses it: https://github.com/Tama4649/Kristallweizen/
See also http://www.uuunuuun.com/single-post/201 ... -v2019-May
there are remarkable developments in shogi engines. The strongest software at that time had a rating of 4150, but now it reaches R4400! (We note that the Elo rating of top human players is about R3100.) This dramatic change comes from the invention of a neural-network system for evaluating the position. It should be distinguished from software that uses deep learning, such as AlphaZero. Instead, people use shallow layers, which are manageable on a CPU. This new method was invented by Yu Nasu, who belonged to the team "the end of genesis T.N.K.evolution turbo type D" at the World Computer Shogi Championship 2018 (WCSC28). While the evaluation file in my last instruction (based on the so-called KPPT format) is considerably large (nearly 1 GB), the new evaluation file (in the so-called NNUE format) is merely 64 MB, yet the latter is much stronger than the previous format!
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: The Stockfish of shogi

Post by Daniel Shawul »

This sounds very interesting! Stockfish should definitely try out the NNUE evaluation function to catch up to Lc0 with regard to evaluation.
How is this NNUE trained? Was a bigger net with convolutions trained first and the result distilled into this shallow net?
Direct training of a shallow neural net is often weaker than one distilled from a bigger net ...
GregNeto
Posts: 35
Joined: Thu Sep 08, 2016 8:07 am

Re: The Stockfish of shogi

Post by GregNeto »

Some older info for the non-programmers:

Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: The Stockfish of shogi

Post by Gian-Carlo Pascutto »

I don't speak Japanese but their architecture seems to be:

W1 = 125388 x 256
W2 = 512 x 32
W3 = 32 x 32
W4 = 32 x 1

And I can make some educated guesses: they exploit the fact that W1 doesn't change much to compute the result of that layer incrementally, which is coincidentally the heaviest layer. The other layers have a structure that's perfect for SIMD optimization.

However, much of the devil is in the details, I'm sure. They also seem to exploit the fact that the white or black inputs to W1 don't change per turn, so you can just flip them after a move, but wouldn't captures be an issue there?

Also, *what* are the inputs exactly? This is critical. 125388 inputs is a lot, so this is in itself the product of something. Piece-on-square x piece-on-square?

Note also that the W1 output does not match up with the W2 input; it's only half.
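The incremental trick guessed at above can be sketched as follows (toy dimensions and NumPy in place of real SIMD code; this is an interpretation of the guess, not the actual NNUE implementation, which also has cases that force a full accumulator refresh):

```python
import numpy as np

# Toy version of the incremental first-layer computation: the input to W1
# is a sparse binary feature vector, so after a move we only add/subtract
# the rows for the few features that flipped, instead of redoing the full
# matrix product. Dimensions are illustrative, not the real 125388 x 256.
N_FEATURES, N_HIDDEN = 1000, 16
rng = np.random.default_rng(0)
W1 = rng.standard_normal((N_FEATURES, N_HIDDEN))

def full_accumulate(active_features):
    """Recompute the first-layer accumulator from scratch."""
    return W1[sorted(active_features)].sum(axis=0)

def incremental_update(acc, added, removed):
    """Update the accumulator only for the features that changed."""
    for f in added:
        acc = acc + W1[f]
    for f in removed:
        acc = acc - W1[f]
    return acc

acc = full_accumulate({3, 42, 500})
# A "move": piece-feature 42 disappears, feature 43 appears; a capture
# would simply add one more entry to the removed list.
acc = incremental_update(acc, added=[43], removed=[42])
assert np.allclose(acc, full_accumulate({3, 43, 500}))
```

The win is that a move flips only a handful of features, so the heaviest layer costs a few row additions per move instead of a 125388-row matrix product.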
User avatar
aphirst
Posts: 8
Joined: Thu Mar 19, 2020 3:24 pm
Full name: Adam Hirst

Re: The Stockfish of shogi

Post by aphirst »

I'm a (bad) amateur shōgi player, and I came upon this thread while trying to find information about the NNUE format for shogi engines, having already asked myself the following two questions:
  • "How come this approach isn't being used for FIDE chess?"
  • "How many parameters would be needed in an NNUE nn.bin for FIDE chess?"
For the second one, it's necessary to understand where the magic number 125388 for the shogi evaluation comes from, but even after attempting to read the (earlier-linked) NNUE.pdf paper, I remain ignorant of how it is derived from the board state. To be honest, I haven't even been able to work out how the older KKP/KPP formats work; English documentation is between scarce and nonexistent.

For the 125388 I tried to break it down into prime factors, (2^2) * (3^6) * 43, and then tried to make some very basic observations:
  • 81 squares = 9*9 = 3^4 factor for whatever the per-square representation is
  • empty + (black, white)*(pawn, tokin, lance, lance+, knight, knight+, silver, silver+, gold, bishop, bishop+, rook, rook+, king) = 1 + (2*14) = 29 options per square; one could either be wasteful and use a sparse 29-bit pattern per square OR a compact 5-bit encoding (2^5 = 32 > 29)
  • Whose turn it is = 1 bit
  • Something for the piece stands = 2*(38, all pieces minus kings) or a bit-representation of the number of each piece-in-hand?
As you can see, I get lost very quickly, so there's currently no chance of me working out the equivalent for FIDE chess. I'd be very interested to see it tried, though, as NNUE shogi engines are now obscenely powerful; orqha1018+dolphin1 is no AlphaZero, but it's still a monster.
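For what it's worth, the factorization above checks out, and the 81 board squares do appear as a factor (a quick check; this doesn't by itself reveal the feature encoding):

```python
# Verify the prime factorization of 125388 given earlier in the thread.
def factorize(n):
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

print(factorize(125388))  # {2: 2, 3: 6, 43: 1}, i.e. (2^2) * (3^6) * 43
print(125388 // 81)       # 1548, so 125388 = 81 * 1548: consistent with a factor of 81 squares
```

So the input count is 81 times 1548, which at least fits the guess that one factor of the encoding ranges over the 81 squares of the board.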