
I'm on generation 6 now via the random-weight-init 'boot' net, and testing of successive generations has shown Elo gains of roughly +inf, +inf, +inf, +400, +130, +100. I haven't tested gen 6 against the existing net yet (the others were all -inf). I made the mistake (I think) of leaving lerp at 0.5, but I'll live with it for this experiment. If this experiment fails I may restart and use lerp 1.0 (WDL), at least for the early generations.
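For readers unfamiliar with the term used above: "lerp" here presumably interpolates the training target linearly between the game result (WDL) and the engine's search score. A minimal sketch of that blending, with illustrative names and an assumed sigmoid scaling constant (not Lozza's actual trainer code):

Code: Select all

#include <math.h>

/* Illustrative sketch only, not Lozza's trainer: blend the game result
   (1.0 win / 0.5 draw / 0.0 loss) with the search score squashed into the
   same 0..1 range. lerp = 1.0 trains on pure WDL; lerp = 0.5 mixes both. */
double training_target(double wdl, int scoreCp, double lerp)
{
    double K = 400.0;                           /* assumed scaling constant */
    double s = 1.0 / (1.0 + exp(-scoreCp / K)); /* sigmoid of the score */
    return lerp * wdl + (1.0 - lerp) * s;
}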
lithander wrote: ↑Fri Jan 17, 2025 2:10 pm
It's definitely possible to get off the ground without "external" chess knowledge like EGTB or opening books, which is really cool.

It seems like it's not really important that the positions you train with are always correctly labeled and scored. (I completely ignore the score.) What is important is that the errors are evenly balanced. The first net, trained on my first batch of training data, showed really odd behaviour and was unfit to generate viable training data on its own. So I kept the temperature high, at +/-100cp; the engine was now basically a random mover with a tendency towards what the net says. And *that* did the trick.
So from then on the question was: when is the net good enough that I should set the temperature to zero? I haven't yet found a clear answer. So far the randomness in the training data still seems to do more good than harm.
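A minimal sketch of what such a +/-100cp temperature could look like during data generation (illustrative names, not Leorik's actual code): each move's score gets a uniform random offset before the maximum is picked, so temp = 0 degenerates to plain greedy play.

Code: Select all

#include <limits.h>
#include <stdlib.h>

typedef struct { int move; int score; } ScoredMove;

/* Sketch only: add uniform noise in [-temp, +temp] centipawns to every
   move's score and play the noisy maximum. With temp = 100 this is close
   to a random mover that still leans towards the net's preferences. */
int pick_move(const ScoredMove *moves, int count, int temp)
{
    int bestIdx = 0, bestNoisy = INT_MIN;
    for (int i = 0; i < count; i++) {
        int noise = temp > 0 ? rand() % (2 * temp + 1) - temp : 0;
        int noisy = moves[i].score + noise;
        if (noisy > bestNoisy) { bestNoisy = noisy; bestIdx = i; }
    }
    return moves[bestIdx].move;
}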
Hi Mike! Your interest in real-time learning is not forgotten. I recently implemented correction history (https://github.com/lithander/Leorik/com ... 11aa72af5a), and while I worked on it I thought that, instead of using a small hashtable and discriminating positions only by pawn structure, it could also be promising to keep track of correction PSQTs instead!

Mike Sherwin wrote: ↑Sat Jan 18, 2025 4:59 am
The warmth of life has entered my tomb.
I'm glad you're back!
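For context, a hedged sketch of the correction-history idea mentioned above (illustrative names and constants, not Leorik's actual implementation): a small table indexed by the pawn-structure hash accumulates the difference between search result and static eval, and the running average later nudges the static eval.

Code: Select all

enum { CORR_SIZE = 16384, CORR_GRAIN = 256 }; /* table size, fixed-point scale */

static int corrHist[2][CORR_SIZE]; /* [sideToMove][pawnHash % CORR_SIZE] */

/* After a search: fold the observed (searchScore - staticEval) difference
   into the entry as an exponential moving average, weighted by depth. */
void update_correction(int stm, unsigned long long pawnHash,
                       int searchScore, int staticEval, int depth)
{
    int *e = &corrHist[stm][pawnHash % CORR_SIZE];
    int diff = (searchScore - staticEval) * CORR_GRAIN;
    int w = depth < 16 ? depth : 16; /* deeper searches get more weight */
    *e = (*e * (256 - w) + diff * w) / 256;
}

/* Before using the static eval: apply the learned correction. */
int corrected_eval(int stm, unsigned long long pawnHash, int staticEval)
{
    return staticEval + corrHist[stm][pawnHash % CORR_SIZE] / CORR_GRAIN;
}

The correction-PSQT variant suggested in the post would replace the hash-indexed entry with a per-piece, per-square table updated the same way.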
Interesting choice to discard the data at each generation. Considering that both quantity and quality matter, I have kept positions from multiple generations and phase out the data of old generations as new batches come in. Each generation only replaces about 100M positions in the pool (see the sketch after this post).

op12no2 wrote: ↑Sun Jan 19, 2025 1:16 am
Interesting. I kept the hidden layer size the same (128) in this experiment but am gradually increasing the number of games played.
https://github.com/op12no2/lozza/blob/master/zero.md
Early days...
86 threads is impressive! I might have to requisition additional hardware, too, if I want to stay ahead.
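A hedged sketch of the generation pool described above (illustrative sizes and names, not Leorik's actual tooling): the pool is a ring of fixed-size generation slices, and each incoming batch overwrites the oldest slice, so old generations phase out gradually.

Code: Select all

#include <string.h>

enum { SLICES = 10, SLICE_CAP = 100 }; /* stand-in sizes; think ~100M positions */

typedef struct { int positions[SLICE_CAP]; int count; } Slice;

static Slice pool[SLICES];
static int oldest = 0;

/* Overwrite the oldest slice with the newest generation's batch. */
void add_generation(const int *batch, int count)
{
    Slice *s = &pool[oldest];
    s->count = count < SLICE_CAP ? count : SLICE_CAP;
    memcpy(s->positions, batch, (size_t)s->count * sizeof(int));
    oldest = (oldest + 1) % SLICES; /* advance the ring */
}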
You made my day!

lithander wrote: ↑Sun Jan 19, 2025 11:48 am
Hi Mike! Your interest in real-time learning is not forgotten. [...]
Code: Select all
// Knight Tables
moves = knight[sq] & ~aPieces;                    // available (non-occupied) destination squares
trgts = knight[sq] & (bRooks | bQueens | bKings); // attacks on high-value enemy pieces
wKnightTbl[sq] = PopCnt(moves) * 4 + PopCnt(trgts) * 8;              // mobility and target bonuses
wKnightTbl[sq] += ((cenRow + cenRow) * (cenCol + cenCol) + row * 3); // centralization and advancement
if(cenCol == 0) wKnightTbl[sq] -= 20;             // rim penalty for a knight on an edge file
g = evalHistTblg[WN][sq]; n = evalHistTbln[WN][sq]; // <- // learned adjustment sum g, sample count n
wKnightTbl[sq] += (n > 99 ? g/n/2 : 0); // <- // blend in half the learned average once n > 99
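Purely as illustration of how such a table might be fed (the update scheme isn't shown in the post and the actual one may differ; all names below are hypothetical): after each game, every (piece, square) occurrence could be credited with the game result, so that g/n converges to an average per-square adjustment.

Code: Select all

typedef struct { int piece; int square; } Visit;

extern Visit gameTrace[1024];     /* hypothetical record of the game's positions */
extern int traceLen;
extern int evalHistTblg[16][64], evalHistTbln[16][64];

/* Hypothetical update: credit every visited (piece, square) with the
   game result so g/n becomes an average centipawn adjustment. */
void update_eval_hist(int result) /* +1 win, 0 draw, -1 loss */
{
    for (int i = 0; i < traceLen; i++) {
        evalHistTblg[gameTrace[i].piece][gameTrace[i].square] += result * 10;
        evalHistTbln[gameTrace[i].piece][gameTrace[i].square] += 1;
    }
}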
Code: Select all
--> 256HL-1047M-Tmixed-Q5-v13.nnue
Score of Leorik-3.0.11v13 vs Leorik-3.0.10: 2702 - 2285 - 5013 [0.521] 10000
... Leorik-3.0.11v13 playing White: 1566 - 862 - 2572 [0.570] 5000
... Leorik-3.0.11v13 playing Black: 1136 - 1423 - 2441 [0.471] 5000
... White vs Black: 2989 - 1998 - 5013 [0.550] 10000
Elo difference: 14.5 +/- 4.8, LOS: 100.0 %, DrawRatio: 50.1 %
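For reference, the reported Elo difference follows from the overall score s via the standard logistic model, Elo = -400 * log10(1/s - 1); with s = 0.521 this gives about +14.6, matching the reported 14.5 +/- 4.8 within rounding.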