Best Nets Made from Human Games Data

BrendanJNorman · Post by **BrendanJNorman** » Thu Jun 08, 2023 3:37 am

Hi Guys,

Just wondering what the best available nets are that were trained on human games (Lichess, Kingbase, whatever).

These tend to have an interesting style and I'd love to check a few out.

Thanks for any info in this regard,

Brendan

Dann Corbit · Post by **Dann Corbit** » Thu Jun 08, 2023 4:44 am

I guess that there are very few (if any) nets made exclusively from human games. I guess you could build the nets based on the win/loss/draw alone.
You would need a lot of statistics to create such a net.

The Lichess games would be a good candidate for that, because the volume is so high.
I use a formula like this to estimate centipawns, based on wins/losses/draws:
UPDATE Epd SET coef = (white_wins - black_wins)* 1.0/(1e-20+white_wins+black_wins+draws)
UPDATE Epd SET coef = -coef WHERE Epd like '% b %'
and then:
cce = round(coef * 444.0,0)
It's only a good fit for a score of +/- one full piece. At 444 or -444 it is all wins or losses.

I would not want to use the data if there are less than 32 decisive games for the position.
Also, I would use only games where both players were at or above 2500.
Because there are so many Lichess games, that will still leave you more than ten million games from which to draw analysis.
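
A minimal Python sketch of the formula above, for anyone who wants to try it outside a database. The function name, the parameter names, and the decision to return `None` for thin data are assumptions made for illustration; only the 444 scale factor, the sign flip for Black to move, and the 32-decisive-game cutoff come from the post.

```python
def wdl_to_centipawns(white_wins, black_wins, draws, white_to_move=True,
                      scale=444.0, min_decisive=32):
    """Estimate a centipawn score from the side to move's point of view.

    Returns None when there are fewer than `min_decisive` decisive games
    (hypothetical convention), or when the position has no games at all.
    """
    decisive = white_wins + black_wins
    total = decisive + draws
    if total == 0 or decisive < min_decisive:
        return None
    # Net White score per game, in the range -1.0 .. +1.0.
    coef = (white_wins - black_wins) / total
    # Flip the sign when Black is to move, as the second UPDATE does.
    if not white_to_move:
        coef = -coef
    # Python's round() may differ from SQL ROUND() on exact .5 ties.
    return round(coef * scale)


if __name__ == "__main__":
    print(wdl_to_centipawns(60, 20, 20))                       # White to move
    print(wdl_to_centipawns(60, 20, 20, white_to_move=False))  # Black to move
    print(wdl_to_centipawns(10, 5, 100))                       # too few decisive games
```

The explicit `total == 0` check plays the role of the `1e-20` term in the SQL, which is only there to avoid dividing by zero.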

smatovic · Post by **smatovic** » Thu Jun 08, 2023 6:19 am

There is Maia Chess:

https://www.chessprogramming.org/Maia_Chess
https://maiachess.com/

--
Srdja

dkappe · Post by **dkappe** » Fri Jun 09, 2023 6:04 am

Dann Corbit wrote: ↑Thu Jun 08, 2023 4:44 am I guess that there are very few (if any) nets made exclusively from human games. I guess you could build the nets based on the win/loss/draw alone.
You would need a lot of statistics to create such a net.

The Lichess games would be a good candidate for that, because the volume is so high.
I use a formula like this to estimate centipawns, based on wins/losses/draws:
UPDATE Epd SET coef = (white_wins - black_wins)* 1.0/(1e-20+white_wins+black_wins+draws)
UPDATE Epd SET coef = -coef WHERE Epd like '% b %'
and then:
cce = round(coef * 444.0,0)
It's only a good fit for a score of +/- one full piece. At 444 or -444 it is all wins or losses.

I would not want to use the data if there are less than 32 decisive games for the position.
Also, I would use only games where both players were at or above 2500.
Because there are so many Lichess games, that will still leave you more than ten million games from which to draw analysis.

From Kingbase with material qsearch mixed in. Uses the original SF nnue format. Could use some better, deeper material search.

https://www.patreon.com/posts/harmon-nnue-44549655

jdart · Post by **jdart** » Sat Jun 10, 2023 4:35 pm

There are two issues with human games for training: first, humans blunder a lot. Even strong players. Secondly, you won't get enough games that way. You need billions of positions to get a well-trained net.

chrisw · Post by **chrisw** » Sat Jun 10, 2023 7:47 pm

jdart wrote: ↑Sat Jun 10, 2023 4:35 pm There are two issues with human games for training: first, humans blunder a lot. Even strong players. Secondly, you won't get enough games that way. You need billions of positions to get a well-trained net.

Could be possible. A bit of crowd sourced help would be useful (supply of cores).
I would guess there are enough human-human games on Lichess between players of "reasonable" strength.
Positions could be blunderchecked by SF or whatever, that's the time consuming part.
A 50-core machine can blunder-check about 5000 positions per second at d9 (which would be enough, probably) = roughly 400M positions a day. 10B positions would take about 25 days per 50-core machine. Any core volunteers? I'll organise it.
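
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the 5000-per-second figure counts positions (that is the reading under which the 400M-a-day number works out), and the function name and the `machines` parameter are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_check(total_positions, positions_per_second, machines=1):
    """Days needed to blunder-check `total_positions` at a steady rate."""
    per_day = positions_per_second * SECONDS_PER_DAY * machines
    return total_positions / per_day


if __name__ == "__main__":
    per_day = 5000 * SECONDS_PER_DAY
    print(f"positions per day on one machine: {per_day:,}")
    print(f"days for 10B positions, 1 machine:  {days_to_check(10_000_000_000, 5000):.1f}")
    print(f"days for 10B positions, 10 machines: {days_to_check(10_000_000_000, 5000, machines=10):.1f}")
```

Without rounding, one machine manages 432M positions a day, so 10B positions takes a little over 23 days; rounding down to 400M a day gives the 25 days quoted.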

chesskobra · Post by **chesskobra** » Sat Jun 10, 2023 8:45 pm

Dann Corbit wrote: ↑Thu Jun 08, 2023 4:44 am I guess that there are very few (if any) nets made exclusively from human games. I guess you could build the nets based on the win/loss/draw alone.
You would need a lot of statistics to create such a net.

I have a naive question. Suppose we have a certain amount of data, say 10M games. Can we make many copies of the games, shuffle them together, and use it for training?

jdart · Post by **jdart** » Sat Jun 10, 2023 9:15 pm

chesskobra wrote: ↑Sat Jun 10, 2023 8:45 pm I have a naive question. Suppose we have a certain amount of data, say 10M games. Can we make many copies of the games, shuffle them together, and use it for training?

Well, you can do that, but you are not going to get improved results that way. Training already visits the same positions over and over, because it is iterative and each iteration goes through the positions, or a subset of them. If you put multiple copies of a position set into the training set, that is pretty much the same process: it is like bunching some of those iterations together in one batch.

Also note there is no technical barrier to training on a small position set. You just won't get good results, for multiple reasons, one of them being that you might not visit positions with all the features that you are trying to train for.
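
A toy Python illustration of the point about copies, under simplifying assumptions: it only counts how often each position is visited and ignores batching and the order of gradient updates, which do differ slightly between the two schemes. The function names and the three placeholder positions are hypothetical.

```python
import random
from collections import Counter


def visits_with_copies(positions, copies, seed=0):
    """One pass over a data set that holds `copies` shuffled copies of each position."""
    data = list(positions) * copies
    random.Random(seed).shuffle(data)
    return Counter(data)


def visits_with_epochs(positions, epochs, seed=0):
    """`epochs` passes over the original data set, reshuffled each pass."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(epochs):
        epoch = list(positions)
        rng.shuffle(epoch)
        counts.update(epoch)
    return counts


if __name__ == "__main__":
    positions = ["pos_a", "pos_b", "pos_c"]
    print(visits_with_copies(positions, 4))
    print(visits_with_epochs(positions, 4))
```

Both schemes show every position to the trainer the same number of times, so duplicating the data adds no new information; it just rearranges when each position is seen.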

chesskobra · Post by **chesskobra** » Sun Jun 11, 2023 12:26 am

Let us say I play a game and analyse it, and I learn some things. A few months later, after I have improved a little, I look at the same old game again and may learn a few more things. At least I have experienced such a thing in other domains. Is this not likely with neural networks?

BrendanJNorman · Post by **BrendanJNorman** » Sun Jun 11, 2023 2:34 pm

jdart wrote: ↑Sat Jun 10, 2023 4:35 pm There are two issues with human games for training: first, humans blunder a lot. Even strong players. Secondly, you won't get enough games that way. You need billions of positions to get a well-trained net.

Really depends on the goal, doesn't it?

For example, a lot of chess engines are described (and marketed!) as "human-like" in their playing style, despite playing at 3000+ Elo and being essentially flawless from a cp-loss perspective compared with humans.

If you *really* want to make a human-like net, wouldn't it be cool/better for it to actually blunder once in a while?

This is literally what makes "human-like", human-like. "To err is human".

I'd love to get hold of a net that occasionally "misses" something in a complicated position, chooses the wrong plan, etc, even if said position is only complicated at a human level.

Who cares about Elo at this stage?
