Turns out there were two problems with datagen.
The first problem was that my idea of duplicating positions where static eval was far from the search score led to way too many positions being duplicated. I'm not sure what changed: when I first implemented it, only about 10% of the positions were duplicates, but when I tested it now the number was nearly 30%.
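For anyone wanting to check this on their own data, here is a minimal sketch of how the duplication rate could be measured. It is not Patricia's actual datagen code: the function name, the (static eval, search score) tuple layout, and the 150 centipawn cutoff are all assumptions made up for illustration.

```python
def duplication_rate(positions, threshold_cp=150):
    """Fraction of positions that a "duplicate when static eval is far
    from search score" rule would flag.

    positions    -- iterable of (static_eval_cp, search_score_cp) pairs
    threshold_cp -- hypothetical cutoff; the post does not state the real one
    """
    positions = list(positions)
    if not positions:
        return 0.0
    flagged = sum(
        1
        for static_cp, search_cp in positions
        if abs(static_cp - search_cp) > threshold_cp
    )
    return flagged / len(positions)


# Made-up sample data, only to show the call.
sample = [(20, 35), (-40, 310), (100, 90), (0, -250), (55, 60)]
print(f"{duplication_rate(sample):.0%} of positions would be duplicated")
```

Tracking this number per batch would make a jump like 10% to 30% visible as soon as it happens.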
The other problem was that my unbalanced datagen of 9k vs 1k nodes was simply too unbalanced. I removed the duplication code, changed the unbalanced datagen to 7k nodes vs 3k nodes, and switched the opening book used for some positions to the UHO opening book (previously 4moves), and this fixed the problem.
In small net testing, the new data is as strong as Willow data and creates networks with significantly higher EAS. Hopefully this will translate to larger nets, and in particular to the net retraining stage.
Gabe and I are trying a couple more datagen patches too, to squeeze what we can out of datagen before starting another big dataset.
patricia devlog
Moderators: hgm, Rebel, chrisw
-
- Posts: 219
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
-
- Posts: 7140
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: patricia devlog
Idea, create 2 versions
1. Patricia EAS
2. Patricia ELO
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 2568
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: patricia devlog
IMHO, this is not a good idea. Patricia is all about sacrifices and spectacular games. If anybody wants more solid play, there are tons of engines out there that play "normal" chess.
The new Velvet 8 has 2 nets (normal and risky), my tests are running. The risky net looks very promising in EAS (around 230000) and is only a little bit weaker (Elo) than the normal net.
-
- Posts: 219
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Perhaps I could do that later. Right now the problem is that even Patricia ELO is probably only about the same strength as Willow and is certainly not near the top levels. I don't want to release a maximal strength Patricia if it's only going to be 25th place; that contributes much less to the chess engine scene than Patricia as is does.
go and star https://github.com/Adam-Kulju/Patricia!
-
- Posts: 219
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Datagen is fixed and even some improvements were found. A quick summary of what I do (in addition to the usual):
1. I play 7k nodes vs 3k nodes to increase the number of decisive games and winning attacks. When I then train on that data, the resulting net is more aggressive even before retraining.
2. I adjudicate games when either the eval gets past 1000 centipawns, or the game goes on beyond 200 plies. This is done to avoid oversaturating the net with worthless positions such as opposite colored bishop draws that go on for hundreds of moves.
3. I use datagen with an opening book: I set the start position of each game to a random position from a UHO testing book containing 2 million positions, then make 4-5 random moves. This also helps reduce the number of worthless positions/games (not much to learn when the game is already completely over out of book).
4. I disable search-related aggression during datagen, because (1) the engine is stronger without it and (2) it makes little sense for the network to learn what search can handle.
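The first three points above can be sketched roughly as follows. This is only an illustration, not the real Patricia datagen: the function names are hypothetical, scoring the 200-ply cutoff as a draw is an assumption (the post only says the game is adjudicated), and so is picking the stronger side by coin flip.

```python
import random

WIN_CP = 1000                 # adjudicate once |eval| passes this (from the post)
MAX_PLIES = 200               # adjudicate once the game goes beyond this (from the post)
NODE_BUDGETS = (7000, 3000)   # unbalanced node limits (from the post)


def adjudicate(eval_cp_white, ply):
    """Return a result string, or None if the game should continue.

    eval_cp_white is the score from White's point of view (assumption).
    """
    if eval_cp_white > WIN_CP:
        return "1-0"
    if eval_cp_white < -WIN_CP:
        return "0-1"
    if ply > MAX_PLIES:
        return "1/2-1/2"      # assumption: long games are scored as draws
    return None


def assign_node_budgets(rng):
    """Decide which colour searches 7k nodes and which 3k.

    Assumption: a coin flip; the post does not say how sides are chosen.
    """
    strong, weak = NODE_BUDGETS
    if rng.random() < 0.5:
        return {"white": strong, "black": weak}
    return {"white": weak, "black": strong}


def pick_opening(book, rng):
    """Pick a random book position and how many random moves to play after it."""
    return rng.choice(book), rng.randint(4, 5)


rng = random.Random(0)
book = ["<FEN 1>", "<FEN 2>", "<FEN 3>"]   # placeholders for UHO book entries
print(assign_node_budgets(rng), pick_opening(book, rng))
print(adjudicate(1200, 40), adjudicate(15, 201), adjudicate(15, 80))
```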
The results in smallnet testing are promising. Patricia data seems to be 25 ELO stronger than Willow data, and small nets trained on Patricia data have an EAS score of over 200k (before filtering) as opposed to Willow nets having an EAS score of about 100k. This is very good for both strength and style - if the retraining stage works as well on Patricia data networks as it does on Willow data networks, that's 500k EAS, which would be... incredible to say the least.
-
- Posts: 219
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
1.2 billion positions generated and counting.
Patricia now uses a bitboard board representation instead of the 0x88 I have known and loved for so long! Well, kind of; for now it's still a weird hybrid, but it's already fast enough to be used over regular mailbox. It didn't take nearly as long as I thought it would, partly because much of the implementation wasn't difficult and partly because Gabe was giving me advice on what to do and fixing small bugs here and there. Could be another 50 or so elo STC.
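For readers who have not seen the two board representations side by side, here is a generic sketch of knight move generation in each. It is not taken from Patricia's source (which is not Python); the function names are made up, and the layout assumes a1 = square 0 and h8 = square 63.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
NOT_A = 0xFEFEFEFEFEFEFEFE    # every square except the a-file
NOT_AB = 0xFCFCFCFCFCFCFCFC   # every square except the a- and b-files
NOT_H = 0x7F7F7F7F7F7F7F7F    # every square except the h-file
NOT_GH = 0x3F3F3F3F3F3F3F3F   # every square except the g- and h-files

# In 0x88 a square is 16 * rank + file, so a knight jump is one of these offsets.
KNIGHT_OFFSETS_0X88 = (33, 31, 18, 14, -14, -18, -31, -33)


def knight_targets_0x88(sq64):
    """Loop over offsets; one AND with 0x88 rejects off-board squares."""
    sq88 = ((sq64 >> 3) << 4) | (sq64 & 7)
    targets = 0
    for offset in KNIGHT_OFFSETS_0X88:
        t = sq88 + offset
        if (t & 0x88) == 0:
            targets |= 1 << ((t >> 4) * 8 + (t & 7))
    return targets


def knight_targets_bitboard(sq64):
    """All eight jumps at once via shifts, with file masks to stop wraparound."""
    b = 1 << sq64
    attacks = (
        ((b << 17) & NOT_A) | ((b << 15) & NOT_H)
        | ((b << 10) & NOT_AB) | ((b << 6) & NOT_GH)
        | ((b >> 17) & NOT_H) | ((b >> 15) & NOT_A)
        | ((b >> 10) & NOT_GH) | ((b >> 6) & NOT_AB)
    )
    return attacks & MASK64


# Both representations agree on every square.
assert all(knight_targets_0x88(s) == knight_targets_bitboard(s) for s in range(64))
```

The bitboard version has no loop and no per-square branch, which is the usual reason it ends up faster in a real engine.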
-
- Posts: 1888
- Joined: Thu Sep 18, 2008 10:24 pm
Re: patricia devlog
Whiskers wrote: ↑Tue Aug 27, 2024 8:47 pm
Perhaps I could do that later, right now the problem is even Patricia ELO is probably only about the same strength as Willow and is certainly not near the top levels. I don't want to release a maximal strength Patricia if it's only going to be 25th place, that contributes much less to the chess engine scene than Patricia as is does.
+1
Aggression is everything for this engine. Don't get sidetracked.
-
- Posts: 219
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Still having trouble with Patricia data: it's better than Willow data for small networks, but on large networks it seems to be 50 elo weaker. Not sure if Gabe and I are doing something wrong with training somehow or if the data is just broken. I hope it's not the latter, again.
On the bright side full power Patricia is approaching 3600 CCRL blitz. Not bad.