Whiskers wrote: ↑Wed Apr 10, 2024 4:58 pm I definitely understand this for max depth (and will come back around to revising Patricia's skill levels before releasing), but for endgames why does it need more nodes? Thanks to the transposition table engines can hit very high depths with comparatively very few nodes.
OK, this might be true. I referred to TheKing-Chesscomputer, which only has less than 400 Kilobytes for Hashtables...
patricia devlog
-
- Posts: 2559
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: patricia devlog
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Whiskers wrote: ↑Wed Apr 10, 2024 7:59 am I decided to extract data from SPCC testing to get some better data for retraining Patricia's net on. To do this, I grabbed Patricia's games, as well as all the games played in SPCC testing (found on the site), used the interesting wins filter to search for, well, interesting games, used pgn-extract to grab the FENs (with best moves and scores) from the PGNs, and wrote a script to perform filtering + conversion on those FENs. This yielded about 8.25m "interesting" FENs; if retraining Patricia's network on it yields positive results, I'll probably grab CCRL games as well.
For testing the new retrained net I'm going to remove the features that directly force sacrifices in Patricia. I feel like they're a bit unhealthy for how she plays, especially as the bonuses get *huge* for some sacrifices. I think I'm also not going to let Patricia give bonuses for sacrifices if she's losing, because sacrifices in losing positions are really just throwing pieces in the garbage and are not conducive whatsoever to style of play.
After several games, I've come to the conclusion that this isn't working. However I'm finding promise with a new filtering script for Willow data. Once I maximize style and minimize strength loss I'll make a post about it!
I also think I swatted another bug. This one was brought to my attention by a GitHub issue and apparently the code that breaks it only fails in rare circumstances at the root node. I still don't understand why none of these bugs cause Patricia to crash on my machine.
go and star https://github.com/Adam-Kulju/Patricia!
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Progress is still somewhat slow. I got a decent net that sacrifices at the same rate as Komodo on its aggressive setting with no other aggression changes, but am having trouble improving on it.
One thing I've noticed is how thin the line to tread is when it comes to learning rate. I've had nets where going from a learning rate of 0.000001 to 0.0000025 has no effect on ELO, but going from 0.0000025 to 0.000005 instantly loses 200 or more ELO. So for each net I usually have to spend several runs just zeroing in on the LR that maximizes the net's response to retraining without killing its strength.
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Finally success!
I managed to get net retraining to make a significantly more aggressive engine after months of attempts and hundreds of nets. More importantly, Patricia has a very high EAS score against engines much stronger than her (100-250 ELO at STC on an unbalanced book in my testing), which is extremely important for TCEC/CCC where Patricia will mostly be playing engines a cut above her.
She scored similarly on EAS against weaker engines, which is a good sign. Unlike Patricia 2, this Patricia doesn't fall off against stronger opposition.
Gauntlet test (3000 games):
Rank EAS-Score sacs shorts draws moves Engine/player
-------------------------------------------------------------------
1 330048 44.02% 17.95% 06.67% 66 Patricia
2 84442 01.55% 35.27% 32.77% 59 Willow 3.0
3 63628 01.21% 33.47% 51.67% 58 Avalanche 2.1
4 51556 00.84% 23.68% 43.88% 60 Pawn 3.0
5 35185 01.64% 21.05% 38.96% 65 Saturn 1.3
6 28182 00.37% 17.60% 42.45% 66 Stockdory 0.1
7 10522 01.45% 06.65% 39.17% 70 Virithidas 9.0
-------------------------------------------------------------------
Oh yeah, I neglected to mention, this is 330k EAS with only the neural network change. All other search/eval modifications to make it play more aggressively were reverted, making this result a very impressive one as I can add back in a lot of handcoded heuristics to crank up the aggression absurdly high. This time around, though, I will be very careful to test against much stronger engines, at a variety of time controls.
The way I did it is as follows:
Filtering was done by going through my big Willow dataset and, for each position, grabbing the material score as well as the search score. If the search eval is greater than both material + 300 and material + (abs(eval) * 0.75), or if the opposite is true (eval is less than both material - 300 and material - (abs(eval) * 0.75)), I saved that position, writing it into another file. The second condition is there to handle cases where you're way up in material and also completely winning: if you're up a rook and the eval is +8, I don't think that should be used as part of the training.
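For readers who want to see that rule as code, here is a minimal sketch. It is not the actual filter script from the utils directory; the function name `is_interesting` is made up, and it assumes both scores are in centipawns and taken from the same side's point of view.

```python
def is_interesting(material_cp: int, eval_cp: int,
                   margin_cp: int = 300, ratio: float = 0.75) -> bool:
    """Sketch of the filter described in the post (names are hypothetical).

    Assumption: material_cp and eval_cp are centipawn scores from the
    same side's perspective.
    """
    slack = abs(eval_cp) * ratio
    # Search likes the position far more than the material count suggests.
    above = eval_cp > material_cp + margin_cp and eval_cp > material_cp + slack
    # Mirror case: search dislikes it far more than material suggests.
    below = eval_cp < material_cp - margin_cp and eval_cp < material_cp - slack
    return above or below


if __name__ == "__main__":
    # Level material, search says +4: kept.
    print(is_interesting(0, 400))
    # Up a rook (500) and the eval is +8: rejected by both conditions.
    print(is_interesting(500, 800))
```

The rook example shows why the 0.75 term exists: once the material lead already explains most of the eval, the position is just "winning", not compensation for a sacrifice.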
That gave me about 100m positions; I then started retraining my "original" net on that dataset using JW's Bullet trainer's resume function. The retraining runs for 25 total epochs with a WDL of 0.2, starting at an LR of 0.000000075 (yes, you have to be very careful for it to not overfit!) and dropping that LR by 10x at epoch 10. I tried retraining twice with two filtered datasets but to little avail.
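The learning-rate schedule above can be written out as a tiny step function. This is only an illustration and not Bullet's API; the function name is made up, and it assumes epochs are numbered from 1 and that the 10x drop applies from epoch 10 onward.

```python
def retrain_lr(epoch: int, start_lr: float = 7.5e-8,
               drop_epoch: int = 10, factor: float = 0.1) -> float:
    """Step schedule sketch: constant LR, then one 10x drop.

    Assumptions: epochs count from 1, and the drop takes effect at
    drop_epoch itself (the post does not say which side of epoch 10).
    """
    if epoch < 1:
        raise ValueError("epochs are numbered from 1 in this sketch")
    return start_lr * factor if epoch >= drop_epoch else start_lr


if __name__ == "__main__":
    # Print the LR for each of the 25 retraining epochs.
    for epoch in range(1, 26):
        print(epoch, retrain_lr(epoch))
```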
I am grateful for this breakthrough and I am hopeful that Patricia 3 will be released within a couple months.
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
A few search patches later:
Rank EAS-Score sacs shorts draws moves Engine/player
-------------------------------------------------------------------
1 378056 48.09% 25.30% 05.83% 62 Patricia
2 74092 00.60% 35.56% 44.97% 56 Willow 3.0
3 71035 00.45% 34.42% 46.22% 56 Avalanche 2.1
4 58484 01.09% 23.74% 36.84% 58 Pawn 3.0
5 39852 00.97% 21.19% 38.41% 61 Saturn 1.3
6 26243 00.39% 16.81% 43.85% 65 Stockdory 0.1
7 17009 00.65% 07.50% 34.77% 67 Virithidas 9.0
-------------------------------------------------------------------
I want to hit 400k EAS against strong opponents, work on filtering a bit, make the engine strong enough to not get wiped out by absolutely everyone at TCEC (it's significantly weaker than Willow 3.0, which is not a very good sign), add a couple of skill level options, and then it should be ready to release. That's a bunch of things but I have a clear path to a release now!
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Patricia's strength is up to 3300-3350 CCRL blitz at this point, which is probably good enough.
A game against Viri 9.0 (3515 CCRL blitz):
[pgn]
[Event "?"]
[Site "?"]
[Date "2024.06.11"]
[Round "400"]
[White "Patricia"]
[Black "Virithidas 9.0"]
[Result "1-0"]
[TimeControl "10+0.1"]
[SetUp "1"]
[FEN "rnbqkbnr/p2p1pp1/4p2p/1pp5/8/P1PP3P/1P2PPP1/RNBQKBNR w KQkq - 0 1"]
[PlyCount "51"]
[GameDuration "00:00:16"]
[GameEndTime "2024-06-11T14:40:32.563 CDT"]
[GameStartTime "2024-06-11T14:40:16.450 CDT"]
1. e4 Nf6 2. e5 Nd5 3. Nf3 Bb7 4. Be2 Nc6 5. O-O d6 6. d4 a6 7. a4 b4 8. c4
Nde7 9. exd6 Qxd6 10. Nbd2 Nxd4 11. Nxd4 cxd4 12. Bd3 Nc6 13. Nb3 Be7 14.
c5 Qd5 15. Re1 f5 16. Qh5+ Kf8 17. Qg6 Qxb3 18. Rxe6 Qxd3 19. Bf4 Qc4 20.
Rc1 Qd5 21. Rce1 d3 22. R1e3 Rg8 23. Bd6 Re8 24. Rxd3 Qxd3 25. Rf6+ gxf6
26. Qxf6# 1-0
[/pgn]
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Some updates on what has happened:
I finally tracked down the bug that was plaguing me since Patricia 2's release (it was a PV printing bug, of all things!) Most miserable bug I've ever encountered, took 2 weeks to find and fix.
I worked on strength a little bit (adding stuff such as improving and TT static eval), and added a few more search aggression features such as bringing back the bonuses for sacrificing during the search tree, though not to the same extent as before. Finally she hit 400,000 against my gauntlet of TCEC tier engines at both short and long time controls, so I felt ready to release.
I also spent a bit of time cleaning up the code. I think it looks pretty nice now (thank you Ciekce).
I'm probably going to start generating Patricia data to see how effective it is in training an aggressive net. Additionally, Patricia 3 has only been out for 15 minutes and I already have requests to add MultiPV, so I guess I'm adding that too.
Patricia is my pride and joy!
-
- Posts: 240
- Joined: Thu Jul 21, 2022 12:30 am
- Full name: Chesskobra
Re: patricia devlog
Would you at some point release the training data? I am particularly interested in the collections of filtered positions according to the filters in the utils directory. I am not training any neural network, but I am always interested in collections of positions classified by different characteristics.
I have compiled the filtering programs (with the same options that you use for compiling the engine). But I don't know how to use them. I didn't understand your comment in Readme "Additionally, the converter.cpp file allows you to transform bullet-format data into text data, so that you can then use the filtering scripts on the resulting file.". What is bullet-format?
Congrats on the new version. I noticed that the executable on linux is pretty small compared to most engines I have. I just played a couple of games between Patricia 3 and Crafty 25.6. I plan to sometime test the skill adjustment parameters and also the endgame skills.
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
chesskobra wrote: ↑Mon Jul 15, 2024 12:04 am Would you at some point release the training data? I am particularly interested in the collections of filtered positions according to the filters in the utils directory. I am not training any neural network, but I am always interested in collections of positions classified by different characteristics.
I have compiled the filtering programs (with the same options that you use for compiling the engine). But I don't know how to use them. I didn't understand your comment in Readme "Additionally, the converter.cpp file allows you to transform bullet-format data into text data, so that you can then use the filtering scripts on the resulting file.". What is bullet-format?
Congrats on the new version. I noticed that the executable on linux is pretty small compared to most engines I have. I just played a couple of games between Patricia 3 and Crafty 25.6. I plan to sometime test the skill adjustment parameters and also the endgame skills.
https://www.kaggle.com/datasets/adamkulju/willowdata is where most of the Willow data is stored; I have a few hundred million more stored locally.
In order to use them: ./[exe name] [input text file] [output text file]. For example, "./position_filter_9 input.txt filtered.txt".
Bullet-format is a binary format that the Bullet trainer (https://github.com/jw1912/bullet) uses, its main advantage being that it takes up much less space than FENs. Compiling and running the converter.cpp file (the command format is the same as for the filtering scripts) will transform the bullet-format data into plain-text FENs and write them to an output file, where you can then use the filtering scripts on them.
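If you would rather prototype your own filter on the converted text file before touching the C++ scripts, the same "input file in, output file out" shape is easy to mimic. The sketch below is not one of the scripts in utils; `filter_file` and `white_to_move` are made-up names, and it only assumes that each line of the text file begins with a FEN.

```python
import sys
from typing import Callable


def filter_file(input_path: str, output_path: str,
                keep: Callable[[str], bool]) -> int:
    """Copy the lines that pass `keep` to output_path; return how many were kept."""
    kept = 0
    with open(input_path, encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as dst:
        for raw in src:
            line = raw.rstrip("\n")
            if line and keep(line):
                dst.write(line + "\n")
                kept += 1
    return kept


def white_to_move(line: str) -> bool:
    """Placeholder predicate. Assumes the line starts with a FEN,
    whose second field is the side to move."""
    fields = line.split()
    return len(fields) > 1 and fields[1] == "w"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Same argument order as the scripts: input text file, output text file.
    print(filter_file(sys.argv[1], sys.argv[2], white_to_move), "positions kept")
```

Saved as, say, my_filter.py, it would be run as "python my_filter.py input.txt filtered.txt".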
-
- Posts: 218
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: patricia devlog
Patricia's UCI_Elo option has been renamed to Skill_Level because some GUIs hide the UCI_Elo option and don't let you set it, and I've been getting a couple complaints about it. I still need to tune the skill levels because 1100 Patricia plays closer to 2000 strength!
I added datagen support for Patricia and am starting to mess around with training small nets (50 million positions, 64 size hidden layer). I want to see how good Patricia data is for strength and aggressiveness. (After a few hundred games, her strength with just the small net is somewhere around 3200-3250, which I think is pretty high for a small net; Willow smallnets were around 3100 strength when I tested them.)