Pedantic Developer's Log Stardate...


JoAnnP38
Posts: 250
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: Pedantic Developer's Log Stardate...

Post by JoAnnP38 »

I'm still working hard on the v0.3 release of Pedantic. I was just reviewing my release plan, and compared to what I released in 0.2, 0.3 has over 2.5 times the amount of content. Of course, not all of this new content is related to playing strength. Some of the release is dedicated to bug fixes and quality-of-life enhancements to encourage people to use Pedantic. As happens with all engines, the increase in playing strength starts to slow once all of the low-hanging fruit has been picked. However, I still have one "big" idea left to implement: having 4 piece-square tables (PSqT) per piece, selected by a 2-bit index formed as follows:
  • The high-order bit is set to 0 if the friendly king is on the "king-side" of the board, and 1 if "queen-side"
  • The low-order bit is set the same way, but for the enemy king.
This two-bit index will allow Pedantic to choose an appropriate PSqT based on the relative positions of the kings. Originally, I was just going to base this purely off the location of the enemy king, but I think this is a better plan that will apply to more games. I can't wait to finish the implementation, tune the PSqT values and get it into SPRT testing because I think this could provide a large increase in playing strength. (fingers-crossed!)
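
For readers who want to see the indexing spelled out, here is a minimal Python sketch of the 2-bit bucket described above. It is not Pedantic's actual code: the 0..63 square numbering (a1 = 0, h8 = 63), the reading of "king-side" as files e through h, the table layout and all function names are assumptions made for illustration.

```python
# Hypothetical sketch: pick one of four piece-square tables (PSqT) per piece
# from a 2-bit index built from the two king locations.
# Assumptions (not from Pedantic's source): squares are numbered 0..63 with
# a1 = 0 and h8 = 63, and "king-side" means files e through h.

def is_queen_side(square: int) -> int:
    """Return 1 if the square lies on files a-d, else 0."""
    return 1 if (square & 7) < 4 else 0

def king_bucket(friendly_king_sq: int, enemy_king_sq: int) -> int:
    """High-order bit: friendly king's side; low-order bit: enemy king's side."""
    return (is_queen_side(friendly_king_sq) << 1) | is_queen_side(enemy_king_sq)

def psqt_value(tables, piece: int, square: int,
               friendly_king_sq: int, enemy_king_sq: int) -> int:
    """tables[bucket][piece][square] holds the tuned value (layout is assumed)."""
    bucket = king_bucket(friendly_king_sq, enemy_king_sq)
    return tables[bucket][piece][square]

# The four possible king placements map onto the four buckets.
G1, C1, G8, C8 = 6, 2, 62, 58
buckets = [king_bucket(G1, G8), king_bucket(G1, C8),
           king_bucket(C1, G8), king_bucket(C1, C8)]
```

The bucket only changes when a king crosses the d/e file boundary, so in an incrementally updated evaluation the piece-square sums would need a full recompute only on those king moves.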

Currently, Pedantic 0.2 is competing in Graham's "The Queen and the Princhess" tournament and while it's still early, it appears to be doing well. Princhess 0.11.0 is a strong competitor; I have played some games with it against the dev version of Pedantic 0.3. I get so many ideas for improving my engine from watching these matches and noting strengths and weaknesses of play. So, if there is a delay in the release of Pedantic 0.3 (looking like 1-2 weeks from now) you can blame it on Graham! :D :mrgreen:
lithander
Posts: 881
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Pedantic Developer's Log Stardate...

Post by lithander »

JoAnnP38 wrote: Sat May 20, 2023 9:40 pm However, I still have one "big" idea left to implement and it has to do with having 4 piece-square tables (PSqT) per piece that are selected based on a 2-bit index [...] This two bit index will allow Pedantic to choose an appropriate PSqT based on the relative positions of the kings. Originally, I was just going to base this purely off the location of the enemy king, but I think this is a better plan and will have more application over more games.
That's closely related to the "big" idea I'm working on, which will hopefully result in a future Leorik 2.5 release. Maybe mine is even a little more ambitious. If you think of PSTs with their two values for midgame and endgame, it's like there's a very simple PieceSquareValue() formula for each piece: value = mg + eg * phase
I plan not to use different sets of simple PSTs but to add more coefficients per piece alongside mg and eg. In the end, a PieceSquareValue() formula could, for example, look like this: value = c0 + c1 * phase + c2 * my_king_file + c3 * my_king_rank + c4 * enemy_king_file + c5 * enemy_king_rank
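
As a rough illustration of the formula above (not Leorik's code), the value can be read as a dot product between a coefficient vector and a feature vector. The function name, argument order and the sample coefficients below are made up for the example; real coefficients would come from tuning.

```python
# Hypothetical sketch of a linear piece-square value with king-position terms.
# coeffs = (c0, c1, c2, c3, c4, c5) would be stored per piece and per square;
# the numbers used below are invented for illustration, not tuned values.

def piece_square_value(coeffs, phase, my_king_file, my_king_rank,
                       enemy_king_file, enemy_king_rank):
    """value = c0 + c1*phase + c2*my_king_file + c3*my_king_rank
             + c4*enemy_king_file + c5*enemy_king_rank"""
    features = (1.0, phase, my_king_file, my_king_rank,
                enemy_king_file, enemy_king_rank)
    return sum(c * f for c, f in zip(coeffs, features))

# With only c0 and c1 non-zero this reduces to the two-value formula
# value = mg + eg * phase quoted above.
plain = piece_square_value((100, 20, 0, 0, 0, 0), 0.5, 4, 0, 4, 7)

# With all six coefficients, the king files and ranks shift the value.
extended = piece_square_value((100, 20, 1, 2, 3, 4), 0.5, 4, 0, 4, 7)
```

Because the value stays linear in the coefficients, the same kind of tuner that fits mg and eg could in principle fit the extra terms as well.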

Right now I'm working on a tuning pipeline that takes just self-play games as input and produces these coefficients automatically.
JoAnnP38 wrote: Sat May 20, 2023 9:40 pm I can't wait to finish the implementation, tune the PSqT values and get it into SPRT testing because I think this could provide a large increase in playing strength. (fingers-crossed!)
I think I know how you feel! :) Never before have I spent so much time on a single idea without having anything testable. If it fails it will be my biggest disappointment so far, but meanwhile, with each additional evening spent on some tiny component of the increasingly complex pipeline, I get more giddy and excited.

Maybe I should update Leorik's devlog instead of hijacking yours, though! ;)
JoAnnP38 wrote: Sat May 20, 2023 9:40 pm Currently, Pedantic 0.2 is competing in Grahams's "The Queen and the Princhess" tournament and while it's still early, it appears to be doing well.
Good luck in the tourney! :)
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
hgm
Posts: 27931
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Pedantic Developer's Log Stardate...

Post by hgm »

Wouldn't it be better to use king-relative tables, pst[piece][sqr] + pst2[piece][sqr-friendlyKing] + pst3[piece][sqr-enemyKing]?
JoAnnP38

Re: Pedantic Developer's Log Stardate... (BIG IDEA DOWNGRADED)

Post by JoAnnP38 »

Okay, my big idea that I discussed earlier turned out to be not so original, as it seems several other engines have employed this technique of using relative king positions to index four piece-square tables per piece. It wasn't difficult to implement; I just had to be careful to introduce as little overhead as possible for the feature. So, after tuning my new piece-square tables and running an SPRT test, I can officially say that my "big idea" is mostly a smallish-medium idea. Perhaps it was the hopefulness of a new engine developer or maybe just downright foolishness, but I thought I might be able to gain 30-40 Elo from this feature. Instead, it looks like it was worth around +19 Elo. I will be committing this feature shortly.

Code:

Score of Pedantic 0.3A vs Pedantic 0.3B: 539 - 434 - 905  [0.528] 1878
...      Pedantic 0.3A playing White: 303 - 206 - 429  [0.552] 938
...      Pedantic 0.3A playing Black: 236 - 228 - 476  [0.504] 940
...      White vs Black: 531 - 442 - 905  [0.524] 1878
Elo difference: 19.4 +/- 11.3, LOS: 100.0 %, DrawRatio: 48.2 %
SPRT: llr 2.97 (100.9%), lbound -2.94, ubound 2.94 - H1 was accepted
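
The Elo difference in the output above can be reproduced from the win/loss/draw counts with the usual logistic Elo formula. The sketch below is a hand calculation under that assumption, not code taken from the match runner, and it does not attempt the error margin or the SPRT log-likelihood ratio.

```python
import math

def elo_difference(wins: int, losses: int, draws: int) -> float:
    """Convert a match result to an Elo difference via the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games      # fraction of points scored
    return -400.0 * math.log10(1.0 / score - 1.0)

# Pedantic 0.3A vs Pedantic 0.3B: 539 - 434 - 905
elo = elo_difference(539, 434, 905)
```

Here the score fraction is 991.5 / 1878, which is the 0.528 shown in brackets in the output.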
JoAnnP38

Re: Pedantic Developer's Log Stardate...

Post by JoAnnP38 »

hgm wrote: Sun May 21, 2023 7:13 am Wouldn't it be better to use king-relative tables, pst[piece][sqr] + pst2[piece][sqr-friendlyKing] + pst3[piece][sqr-enemyKing]?
This looks really interesting, but I have to admit I may not be following everything. Wouldn't several combinations of (sqr - enemyKing) or (sqr - friendlyKing) resolve to the same square, but actually represent drastically different positions on the board? I think I'm missing something key here.
hgm

Re: Pedantic Developer's Log Stardate...

Post by hgm »

Ah, I assumed 0x88-style square numbering, where the difference is always unique. If you use 0-63 numbering, you would either have to convert it to 0x88 (e.g. by sqr+(sqr&070) or a lookup), or use an extra dimension pst[piece][sqr][kingSqr]. The latter would allow more flexibility (and goes in the direction of NNUE), but would consume much more space.
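
To make the uniqueness point concrete, here is a small Python sketch of the conversion and of a king-relative lookup. The + 0x77 offset (so that indices start at 0), the table size of 239 entries and the function names are assumptions for illustration; only the sqr + (sqr & 070) conversion and the three-table sum come from the posts above.

```python
# Hypothetical sketch: king-relative table indexing via 0x88 square numbers.

def to_0x88(sq: int) -> int:
    """Map 0..63 (rank*8 + file) to 0x88 numbering (rank*16 + file)."""
    return sq + (sq & 0o70)

def relative_index(sq: int, king_sq: int) -> int:
    """0x88 difference shifted into 0..238 so it can index a flat table."""
    return to_0x88(sq) - to_0x88(king_sq) + 0x77

def king_relative_value(pst, pst2, pst3, piece, sq, friendly_king, enemy_king):
    """pst[piece][sqr] + pst2[piece][sqr-friendlyKing] + pst3[piece][sqr-enemyKing]"""
    return (pst[piece][sq]
            + pst2[piece][relative_index(sq, friendly_king)]
            + pst3[piece][relative_index(sq, enemy_king)])

# In plain 0..63 numbering, h1 - a1 and a2 - b1 are both 7 although the
# geometry differs; in 0x88 numbering they become 7 and 15.
plain_collision = (7 - 0) == (8 - 1)

# Check that every index corresponds to exactly one (file, rank) offset.
seen = {}
unique = True
for sq in range(64):
    for king in range(64):
        delta = ((sq & 7) - (king & 7), (sq >> 3) - (king >> 3))
        if seen.setdefault(relative_index(sq, king), delta) != delta:
            unique = False
```

Only 225 of the 239 slots are ever used (15 file offsets times 15 rank offsets), which is still far smaller than a full pst[piece][sqr][kingSqr] table.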
JoAnnP38

Re: Pedantic Developer's Log Stardate...

Post by JoAnnP38 »

My work on the release of Pedantic 0.3 is coming to a close. All major features planned for this release are now code complete and I will start both functional testing and estimation of overall Elo gain. The previous version of Pedantic (0.2) has been rated 2474 at CCRL's blitz time controls (i.e. 2'+1") and 2442 at 40/15. Throughout this release cycle I have been keeping an eye on the relative play of Pedantic with other engines that at the start of this release were much stronger. Here are the most recent results of small round-robin tournaments with the following engines. Kudos to the developers of these engines that have made them publicly available.
Blitz (2+1)

Code:

-----------------Pedantic-----------------
Pedantic - ApotheosisV4.0.1 : 4.0/6 3-1-2 (0=11=1)  67%  +123
Pedantic - Knightx37b_64    : 3.5/6 3-2-1 (0=0111)  58%   +56
Pedantic - Peacekeeper-140  : 1.5/6 0-3-3 (00==0=)  25%  -191
Pedantic - Zevra 2.5        : 5.0/6 5-1-0 (111101)  83%  +275
40/15

Code:

-----------------Pedantic-----------------
Pedantic - ApotheosisV4.0.1 : 4.0/6 2-0-4 (=11===)  67%  +123
Pedantic - Knightx37b_64    : 3.5/6 2-1-3 (=0=1=1)  58%   +56
Pedantic - Peacekeeper-140  : 1.5/6 1-4-1 (0=1000)  25%  -191
Pedantic - Zevra 2.5        : 3.5/6 2-1-3 (=1=1=0)  58%   +56
You may ask, "why so few games?" The reason for that is two-fold. I watched every single game of the blitz matches and much of the 40/15. I find that I can learn different things about Pedantic's style of play that I would never learn by just running SPRT tests to measure Elo gains. This process also brought to my attention some "bugs" that I wouldn't have caught otherwise. Additionally, I have run these same round-robins a few different times during the development process to get a "feel" for where Pedantic stands overall, and even with only 6 games per match, it still takes a lot of time to wait for a 40/15 tournament to complete.

I should be wrapping up final testing over the next day or two and will announce final availability in this thread.
lithander

Re: Pedantic Developer's Log Stardate...

Post by lithander »

Looks like there's a big increase in strength from version 0.2 to version 0.3! I'm curious about the total list of changes you made and how much the individual strength contributions were. I love detailed release descriptions! ;)

Your rapid progress is definitely an inspiration to try and pick up my own pace again!
JoAnnP38

Re: Pedantic Developer's Log Stardate...

Post by JoAnnP38 »

lithander wrote: Fri Jun 02, 2023 11:37 pm Looks like there's a big increase in strength from version 0.2 to version 0.3! I'm curious about the total list of changes you made and how much the individual strength contributions were. I love detailed release descriptions! ;)

Your rapid progress is definitely an inspiration to try and pick up my own pace again!
While I try to update my commit comments with SPRT testing results, I'm not as pedantic as I probably need to be and there are probably some smaller changes that were checked in without being tested individually. Looking through history for this release I see the following changes had positive Elo gains:
  • Add Static Null Move (Reverse Futility) Pruning +35 Elo
  • Redesign/enhance time management +34 Elo
  • Optimize move generation and choose between Magic or PEXT bitboards at startup +20 Elo
  • Change bonus used by history table when recording cutoff +20 Elo
  • Add 2x2 King location buckets to select between 4 PST / piece +19 Elo
  • Implement IID +18 Elo
  • Disable pruning during endgame (total material on board < 1300) +11 Elo
  • Reduce killers per ply from four to two +10 Elo
  • Replace quiescence search with tiered or staged quiescence search +7 Elo
  • Various parameter/constant optimizations that control various features +10 Elo
Strangely enough, when the Elo gains above are summed, they are just a few points shy of the Elo gain I'm seeing now in self-play over Pedantic 0.2. Perhaps there were other changes I made that I did not document. I'll have to fire the developer. :wink:
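
For anyone unfamiliar with the first item in the list, static null move (reverse futility) pruning is commonly written along the lines of the Python sketch below. This is a generic textbook form, not Pedantic's implementation: the depth limit and the margin per ply are placeholder numbers, and the function name is made up. Only the material threshold of 1300 is taken from the list above, and combining it with this particular pruning rule is an assumption.

```python
# Hypothetical sketch of static null move (reverse futility) pruning.
# max_depth and margin_per_ply are placeholders, not Pedantic's tuned values.

def static_null_move_prune(static_eval, beta, depth, in_check, total_material,
                           max_depth=6, margin_per_ply=85,
                           endgame_material=1300):
    """Return a cutoff score if the node can be pruned, otherwise None."""
    if in_check or depth > max_depth:
        return None                      # too risky or too far from the leaves
    if total_material < endgame_material:
        return None                      # pruning disabled in the endgame
    margin = margin_per_ply * depth
    if static_eval - margin >= beta:
        return static_eval - margin      # position is already good enough
    return None
```

A real engine would usually add further guards, for example skipping the rule at PV nodes or when beta is a mate score.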
JoAnnP38

Re: Pedantic Developer's Log Stardate...

Post by JoAnnP38 »

Wow, I am pretty dense sometimes. As I was testing Pedantic in order to release version 0.3, I noticed that its Elo rating was almost 150 points higher when I enabled adjudications in Cute Chess!! I first thought to myself that maybe the other engines I've been testing against just had better endgame logic or maybe used EGTB. My initial self-play test indicated that Pedantic had gained +184 Elo in this release:

Code:

SELF-PLAY ELO
Score of Pedantic 0.3 vs Pedantic 0.2: 573 - 88 - 339  [0.743] 1000
...      Pedantic 0.3 playing White: 309 - 39 - 152  [0.770] 500
...      Pedantic 0.3 playing Black: 264 - 49 - 187  [0.715] 500
...      White vs Black: 358 - 303 - 339  [0.527] 1000
Elo difference: 184.0 +/- 18.4, LOS: 100.0 %, DrawRatio: 33.9 %
I'm not going to lie: this has felt like a long, tough slog of a release, since many enhancements I tried yielded either no gain or a negative one (e.g. counter move history, history + gravity, enhanced root node move ordering, ...). Even some of the features that gained Elo didn't yield everything I had hoped for. However, when I started reviewing the PGNs I noticed Pedantic was losing endgames that even I could have won. Additionally, it was making blunders while trying to force mate even when search had already revealed a line that forced mate-in-N. This was particular to the 0.3 release, since Pedantic 0.2 looked like an endgame genius in comparison. Going back through my change log for this release, one of the first things I did was implement changes to speed up search. One of those changes was that I no longer clear the transposition table after every search. This seems to contribute significantly to search efficiency, at least until the endgame. Also, when I looked at the blundered games in analysis mode, Pedantic's analysis and suggested moves seemed spot on. To me, this all added up to a misunderstanding of the dynamics of the transposition table. As an experiment I decided to clear the transposition table after every search once I approached the endgame and voilà!! Pedantic immediately gained about 130 Elo!!!

Code:

SELF-PLAY ELO
Score of Pedantic 0.3 vs Pedantic 0.2: 385 - 28 - 87  [0.857] 500
...      Pedantic 0.3 playing White: 200 - 12 - 38  [0.876] 250
...      Pedantic 0.3 playing Black: 185 - 16 - 49  [0.838] 250
...      White vs Black: 216 - 197 - 87  [0.519] 500
Elo difference: 311.1 +/- 35.2, LOS: 100.0 %, DrawRatio: 17.4 %
While I still don't have a clear understanding of why clearing the TT is having such a pronounced effect, I am going to leave this "hack" in for now and release. I am running a gauntlet now with no adjudications that should give me a better idea of where Pedantic 0.3 stands in terms of playing strength. I should be able to finally release Pedantic 0.3 in the next day or so.
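
In outline, the "hack" amounts to something like the Python sketch below. The dictionary-backed table, the function name and the material threshold that decides when the endgame is "approaching" are all placeholders; the post does not say what trigger Pedantic actually uses.

```python
# Hypothetical sketch: keep the transposition table between searches in the
# middlegame, but wipe it after every search once the endgame is near.

class TranspositionTable:
    def __init__(self):
        self._entries = {}

    def store(self, key, depth, score):
        self._entries[key] = (depth, score)

    def probe(self, key):
        return self._entries.get(key)

    def clear(self):
        self._entries.clear()

    def __len__(self):
        return len(self._entries)

def after_search(tt, total_material, endgame_material=1300):
    """Call once per completed search; endgame_material is a placeholder."""
    if total_material < endgame_material:
        tt.clear()

tt = TranspositionTable()
tt.store(0xABCDEF, depth=8, score=35)
after_search(tt, total_material=3000)   # middlegame: entries are kept
kept = len(tt)
after_search(tt, total_material=900)    # near the endgame: table is wiped
wiped = len(tt)
```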