I posted in the wrong thread, I see. Here is my reply:
Volker Annuss wrote:
Hi Stephan,
Hermann uses neural networks for material evaluation and for time allocation.
Material evaluation
A neural network with 11 input nodes, one hidden layer with 5 nodes and one output node. It calculates the average score you can expect for the material on the board. This average score is then transformed into a normal centipawn evaluation.
The input nodes are the number of pieces for each type (except kings) and a flag for even/opposite coloured bishops.
A small hash table is sufficient to get a hit rate very close to 100%, so there are no problems with hundreds of floating point operations per evaluation.
It took a great deal of work before it gave 20 or 30 Elo. So if I were to write another engine from scratch, I would not do that again.
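A minimal sketch of what such an 11-5-1 material network with a small cache in front of it could look like. Everything specific here is an assumption, not Hermann's code: the weights are placeholders, the softsign activation, the mapping from expected score to centipawns, the cache size and the `key` (some hash of the piece counts) are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { N_IN = 11, N_HID = 5, CACHE_SIZE = 1024 };

/* All weights are placeholders; the post does not publish the real values. */
typedef struct {
    double w_hid[N_HID][N_IN]; /* input -> hidden weights */
    double b_hid[N_HID];       /* hidden biases */
    double w_out[N_HID];       /* hidden -> output weights */
    double b_out;              /* output bias */
} MaterialNet;

/* Assumed activation: softsign, x / (1 + |x|), range (-1, 1). */
static double softsign(double x) {
    return x / (1.0 + (x < 0.0 ? -x : x));
}

/* Forward pass: expected score in (0, 1), where 0.5 means balanced material. */
static double material_net_score(const MaterialNet *net, const double in[N_IN]) {
    assert(net != NULL && in != NULL);
    double out = net->b_out;
    for (int h = 0; h < N_HID; h++) {
        double a = net->b_hid[h];
        for (int i = 0; i < N_IN; i++)
            a += net->w_hid[h][i] * in[i];
        out += net->w_out[h] * softsign(a);
    }
    return 0.5 + 0.5 * softsign(out);
}

/* Assumed mapping: undo the output squashing and scale so that an
   expected score of 0.75 corresponds to +100 centipawns. */
static int score_to_centipawns(double p) {
    double t = 2.0 * p - 1.0;
    if (t > 0.99) t = 0.99;    /* keep the result finite */
    if (t < -0.99) t = -0.99;
    double x = 100.0 * t / (1.0 - (t < 0.0 ? -t : t));
    return (int)(x + (x >= 0.0 ? 0.5 : -0.5)); /* round to nearest */
}

/* Small direct-mapped cache keyed by a (hypothetical) material key. */
typedef struct { uint32_t key; int cp; int valid; } CacheEntry;
static CacheEntry cache[CACHE_SIZE];
static unsigned long cache_misses;

static int material_eval(const MaterialNet *net, uint32_t key, const double in[N_IN]) {
    CacheEntry *e = &cache[key % CACHE_SIZE];
    if (!e->valid || e->key != key) {
        cache_misses++;        /* only here are the floating point ops paid */
        e->key = key;
        e->cp = score_to_centipawns(material_net_score(net, in));
        e->valid = 1;
    }
    return e->cp;
}
```

Since the material configuration changes only on captures and promotions, almost every call to `material_eval` returns from the cache and the network itself runs rarely.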
Time allocation
Another neural network in Hermann is for time allocation. It has 22 input nodes, one hidden layer with 7 nodes and one output node. It calculates the probability that the best move changes when iterating one ply deeper.
Input nodes are
- Scores from the last 2 iterations
- Number of possible moves
- Changes of the best move in the last 2 iterations
- Search instabilities in the last 2 iterations
- Checks and captures in the first moves of the PV
- Differences and transpositions in the first moves of the last 2 PVs
I would like to add
- Time for searching the best move (in % of the total time)
- Times for the 2 non-best moves that took the most time
but this does not work with my primitive multiprocessor search with a shared hash table, because times are massively influenced by hash hits from moves that were searched by another thread.
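The post does not say how the predicted probability is turned into a time decision. As a rough sketch only, one could scale the nominal time budget by that probability before deciding whether to start the next iteration. The function name, the parameters and the 0.5x to 1.5x scaling are all hypothetical.

```c
#include <assert.h>

/* Hypothetical rule, not Hermann's: the more likely the best move is to
   change one ply deeper, the more time we are willing to spend. */
static int should_search_deeper(double p_change, long elapsed_ms, long nominal_ms) {
    assert(p_change >= 0.0 && p_change <= 1.0);
    assert(elapsed_ms >= 0 && nominal_ms > 0);
    /* budget ranges from 0.5x nominal (p = 0) to 1.5x nominal (p = 1) */
    double budget_ms = (double)nominal_ms * (0.5 + p_change);
    return (double)elapsed_ms < budget_ms;
}
```

With a nominal 1000 ms, a stable position (p = 0.1) stops after 600 ms, while an unstable one (p = 0.9) may run up to 1400 ms.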
Volker
I'm a bit amazed by the +20 Elo claim I saw for Hermann, as the gain from a neural network for its material values. I would really like to know the original values, to understand the small Elo gain better.
In 2005 I bugfixed Diep's old material values quite a bit, and I did NOT change the code at all. I just changed Diep's material values from
pawn = 1000, knight = 3625, bishop = 3675
to the current basic values. Realize there are all kinds of other influences in Diep's material code, but these are the basic values. I did not change that influence code back in 2005 after the world champs 2005.
{ 1000, 3875, 3875, 6175, 12350 }, /* 0 */
pawn=1000, knight = 3875
These are also the CURRENT values. Note they were initially a bit more modest, like 3.825, so I did change them very slightly, but not a lot.
I hardly noticed those slight changes in Elo rating either. However, the jump from 3.6 to 3.8 of course first took some readjusting of other parameters. Then it instantly went up something like 200 Elo points, not 20.
A magic change it was, really. Not +20 Elo.
So I don't really understand the +20 Elo change that Hermann got.
Here is a small sample snippet that your colleague Alex Trofimov dug up. Maybe it is time you took a photo of him at work?
Lots of the code looks like the code below. If you look at it, does it look like it was produced by a neural network or by a human?
static UINT64
materialize_valuations (int white_pawns_count, int white_knight_count,
                        int white_bishop_count, int white_bishop_count_1,
                        int white_bishop_count_2, int white_rook_count,
                        int white_queen_count, int black_pawns_count,
                        int black_knight_count, int black_bishop_count,
                        int black_bishop_count_1, int black_bishop_count_2,
                        int black_rook_count, int black_queen_count)
{
  UINT64 value = 0;
  value += (white_bishop_count / 2 - black_bishop_count / 2) * ((((UINT64) 55) << 48) + (((UINT64) 50) << 32) + (((UINT64) 40) << 16) + (((UINT64) 35) << 0));
  value += (white_pawns_count - black_pawns_count) * ((((UINT64) 125) << 48) + (((UINT64) 110) << 32) + (((UINT64) 90) << 16) + (((UINT64) 80) << 0));
  value += (white_knight_count - black_knight_count) * ((((UINT64) 355) << 48) + (((UINT64) 320) << 32) + (((UINT64) 280) << 16) + (((UINT64) 265) << 0));
  value += (white_rook_count - black_rook_count) * ((((UINT64) 610) << 48) + (((UINT64) 550) << 32) + (((UINT64) 450) << 16) + (((UINT64) 405) << 0));
  value += (white_queen_count - black_queen_count) * ((((UINT64) 1150) << 48) + (((UINT64) 1025) << 32) + (((UINT64) 875) << 16) + (((UINT64) 800) << 0));
  value += (white_bishop_count - black_bishop_count) * ((((UINT64) 360) << 48) + (((UINT64) 325) << 32) + (((UINT64) 295) << 16) + (((UINT64) 280) << 0));
  if (white_rook_count == 2)
    value -= ((((UINT64) 32) << 48) + (((UINT64) 28) << 32) + (((UINT64) 20) << 16) + (((UINT64) 16) << 0));
  if (black_rook_count == 2)
    value += ((((UINT64) 32) << 48) + (((UINT64) 28) << 32) + (((UINT64) 20) << 16) + (((UINT64) 16) << 0));
  if (white_queen_count + white_rook_count >= 2)
    value -= ((((UINT64) 16) << 48) + (((UINT64) 14) << 32) + (((UINT64) 10) << 16) + (((UINT64) 8) << 0));
  if (black_queen_count + black_rook_count >= 2)
    value += ((((UINT64) 16) << 48) + (((UINT64) 14) << 32) + (((UINT64) 10) << 16) + (((UINT64) 8) << 0));
  value -= (white_pawns_count - 5) * white_rook_count * ((((UINT64) 0) << 48) + (((UINT64) 2) << 32) + (((UINT64) 4) << 16) + (((UINT64) 5) << 0));
  value += (white_pawns_count - 5) * white_knight_count * ((((UINT64) 5) << 48) + (((UINT64) 4) << 32) + (((UINT64) 2) << 16) + (((UINT64) 0) << 0));
  value += (black_pawns_count - 5) * black_rook_count * ((((UINT64) 0) << 48) + (((UINT64) 2) << 32) + (((UINT64) 4) << 16) + (((UINT64) 5) << 0));
  value -= (black_pawns_count - 5) * black_knight_count * ((((UINT64) 5) << 48) + (((UINT64) 4) << 32) + (((UINT64) 2) << 16) + (((UINT64) 0) << 0));
  return value;
}
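For readers not used to this trick: every term in the function above packs four 16-bit numbers into one 64-bit word, so a single addition updates four evaluation terms at once. What the four lanes stand for is not stated in the snippet. When a difference such as `white_pawns_count - black_pawns_count` is negative, C converts it to unsigned and the arithmetic wraps modulo 2^64, so a negative low lane borrows from the lane above it. A small sketch of packing and of unpacking that undoes this borrow; the helper names `pack4` and `unpack4` are mine, not from the decompiled code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four signed lanes the way the quoted code writes its constants:
   (a << 48) + (b << 32) + (c << 16) + (d << 0). */
static uint64_t pack4(int a, int b, int c, int d) {
    return ((uint64_t)(int64_t)a << 48) + ((uint64_t)(int64_t)b << 32)
         + ((uint64_t)(int64_t)c << 16) + (uint64_t)(int64_t)d;
}

/* Unpack into lanes[0] (lowest 16 bits) .. lanes[3] (highest 16 bits).
   Each lane is subtracted out before shifting, so a borrow caused by a
   negative lower lane does not corrupt the lane above it. */
static void unpack4(uint64_t v, int lanes[4]) {
    assert(lanes != NULL);
    for (int k = 0; k < 4; k++) {
        int f = (int)(v & 0xFFFFu);
        if (f >= 0x8000)
            f -= 0x10000;                 /* sign-extend the 16-bit lane */
        lanes[k] = f;
        v = (v - (uint64_t)(int64_t)f) >> 16;
    }
}
```

For example, being one pawn down gives `(uint64_t)(int64_t)(-1) * pack4(125, 110, 90, 80)`, which `unpack4` turns back into -80, -90, -110 and -125. This only works as long as each accumulated lane stays within the signed 16-bit range.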
My point is, in some buggy Diep version it already made a difference of +200 Elo. (The 2005 Diep had all kinds of bugs, in passed pawns for example, which was a disaster. It still suffers from that to some extent, though I am fixing it rapidly now.)
If I put back the values I had during and before the world champs 2005 in Diep's material evaluation, Diep would of course lose more than 200 Elo points nowadays.
You'll now of course say in chorus: "But if you gave the tip in 2004 to put a piece at 4.2 for Fruit, why didn't you do it yourself?"
Yeah, you know, how silly can one be. I have other 'compensation code' of course for when you have a bunch of pawns against a piece. That just didn't work in the case where you have 1 pawn against a piece.
The compensation you have in that case with the old values is +2.6, versus +2.8 today. Looked at like this it seems like peanuts to talk about, and it is, but it really mattered a lot of Elo for Diep.
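To spell out that arithmetic with the knight values quoted earlier in this post (in thousandths of a pawn), here is a tiny sketch. The function name is made up, and it covers only the basic values, not Diep's other material code.

```c
#include <assert.h>

enum { PAWN = 1000, KNIGHT_OLD = 3625, KNIGHT_NEW = 3875 };

/* Net material for the side with one piece against n_pawns pawns,
   in thousandths of a pawn, using only the basic values. */
static int piece_vs_pawns(int piece_value, int n_pawns) {
    assert(n_pawns >= 0);
    return piece_value - n_pawns * PAWN;
}
```

`piece_vs_pawns(KNIGHT_OLD, 1)` gives 2625 and `piece_vs_pawns(KNIGHT_NEW, 1)` gives 2875, so the whole change is a quarter of a pawn.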
The 1930s values of 3 or 3.25 pawns for a piece are just so ugly bad Elo-wise; it takes values up to 3.8-4.3 pawns before it flattens off and really works well.
It really matters a lot of Elo. Yet if you look at that code and the other functions around it, does it look like a neural network to you, or does it look like someone tried to win the world championships by bit shifting?
Vincent