Progress on Rustic

mvanthoor
Posts: 982
Joined: Wed Jul 03, 2019 2:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Progress on Rustic

Post by mvanthoor » Mon Jan 25, 2021 9:38 pm

unserializable wrote:
Mon Jan 25, 2021 9:01 pm
Cool, sounds like the reason has indeed been found. Though it is also a bit sad :(. If bugfixes continue at such a pace, soon Monchester 1.0 will not be able to beat Rustic under any circumstances. I will also put it to play some more games for the night before the code gets updated :D.
Thanks for testing :)

I dare say that I don't often have severe bugs in the software I write. The drawback is that I'm relatively slow (compared to others) with regard to writing new code, and I test quite extensively. I *REALLY* dislike having to go back and debug code after I've written it. I prefer to only touch a piece of code when I'm actually going to improve it somehow. (Structure, speed, functionality, whatever...)

I wonder how this bug could have happened. The code actually says:

Code:

    // Determine if we are in check.
    ...
    // If so, extend search depth by 1 to determine the best way to get
    // out of the check before we go into quiescence search.
Then WTH did I put it AFTER "depth = 0 / qsearch"?! It is well known that an engine should NOT go into qsearch if the side being searched is in check. Being in check is BY DEFINITION not quiet, because it COULD be mate. The only thing I can imagine having caused this is merging multiple branches and then resolving a merge conflict the wrong way, or something weird like that.
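
For readers following along, the ordering problem can be sketched roughly as below. This is a toy model and not Rustic's actual code: `Node`, `search`, `qsearch`, `MATE` and `MAX_PLY` are hypothetical names, the tree is hand-built instead of being generated from a chess position, and `qsearch` is a stub that only returns the static evaluation. The only point is that the in-check test and the extension come before the `depth == 0` test.

```rust
// Toy model: all names here are hypothetical, not taken from Rustic.
const INF: i32 = 1_000_000;
const MATE: i32 = 100_000;
const MAX_PLY: i32 = 64; // cap so a long series of checks cannot extend forever

/// A hand-built tree node standing in for a chess position.
struct Node {
    in_check: bool,      // is the side to move in check?
    static_eval: i32,    // score from the side to move's point of view
    children: Vec<Node>, // positions after each legal move
}

/// Stub: a real quiescence search would also try captures.
fn qsearch(node: &Node) -> i32 {
    node.static_eval
}

fn search(node: &Node, mut depth: u32, mut alpha: i32, beta: i32, ply: i32) -> i32 {
    // 1. Determine if we are in check and, if so, extend by one ply.
    if node.in_check && ply < MAX_PLY {
        depth += 1;
    }

    // 2. Only now test for the horizon.
    if depth == 0 {
        return qsearch(node);
    }

    // No legal moves: checkmate if in check, stalemate otherwise.
    if node.children.is_empty() {
        return if node.in_check { -MATE + ply } else { 0 };
    }

    for child in &node.children {
        let score = -search(child, depth - 1, -beta, -alpha, ply + 1);
        if score >= beta {
            return beta;
        }
        if score > alpha {
            alpha = score;
        }
    }
    alpha
}
```

Calling `search(&node, 0, -INF, INF, 0)` on a node that is in check and has no replies returns `-MATE` in this sketch. With the two tests swapped, the same call would return the static score, so a mate right at the horizon would be missed.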

Ah well. In a hyperspeed time control, the fix doesn't seem to matter much. I'll test the fixed version against Alpha 1 in the mentioned 1m + 0.6s time control (which is also used in Stockfish testing if I remember correctly). Looking at the playout of some games reminds me that time management is indeed quite basic.
Rustic's upcoming preference for Bd6-e5, with the bishop active on an open diagonal instead of blocked by its own pawns, seems like a true grandmaster move.
I don't know how the engine suddenly decided on that move. I (as a person) would be extremely scared of that pawn on f5. Bd6-e5 guards f6, but I would rather have that pawn off the board. I also couldn't really understand the offered sacrifice of Be5xd4, but after I took the bishop with white, Rustic demolished me in short order.
EDIT: wow, from the statistics in your last post, Rustic really loves the black pieces :)
No comment... must have been the opening book. I'm switching to the much bigger GrandMaster 1950 book, for a more varied opening play. I'll see if that fixes it. It's not logical for the engine to prefer the white or black pieces, because there's nothing in the evaluation to make it do so. It just does ("material + PST for white" - "material + PST for black") and that's it for now. (Apart from the extra King endgame table.)
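
To illustrate why an evaluation of this shape cannot favour a colour on its own, here is a minimal sketch. Everything in it is an assumption made up for the example: the piece values, the `pst` formula, the square indexing (0 = a1, 63 = h8) and the function names are not Rustic's, and the extra King endgame table is left out.

```rust
// Illustrative only: values and tables are made up, not Rustic's.
#[derive(Clone, Copy)]
enum Color { White, Black }

#[derive(Clone, Copy)]
enum Piece { Pawn, Knight, Bishop, Rook, Queen, King }

fn material(piece: Piece) -> i32 {
    match piece {
        Piece::Pawn => 100,
        Piece::Knight => 320,
        Piece::Bishop => 330,
        Piece::Rook => 500,
        Piece::Queen => 900,
        Piece::King => 0,
    }
}

/// Toy piece-square bonus seen from White's side of the board
/// (square 0 = a1, 63 = h8): central files and advanced ranks score higher.
fn pst(_piece: Piece, square: u8) -> i32 {
    let file = (square % 8) as i32;
    let rank = (square / 8) as i32;
    let centre = 7 - (2 * file - 7).abs();
    centre + 2 * rank
}

/// ("material + PST for white") - ("material + PST for black").
fn evaluate(pieces: &[(Color, Piece, u8)]) -> i32 {
    let (mut white, mut black) = (0, 0);
    for &(color, piece, square) in pieces {
        match color {
            Color::White => white += material(piece) + pst(piece, square),
            // Mirror the rank so Black reads the same table from its own side.
            Color::Black => black += material(piece) + pst(piece, square ^ 56),
        }
    }
    white - black
}
```

In this sketch a mirrored position, for example a white knight on b1 (square 1) against a black knight on b8 (square 57), evaluates to exactly 0, because both sides go through the same tables.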

Sometimes, I see the engine do truly heroic stuff with regard to attacks and sacrifices, making it look like a tactical genius... and then a few moves later, it gets distracted by a weak pawn on the other side of the board, breaking off a massive attack on the king because it can eat just a bit more material... which makes it look like a complete positional idiot.

PS: I took a look at Monchester after I saw the score of your program against Rustic... it's in the list at ELO 837. How did you get it all the way down there? I don't understand... Have you not yet implemented quiescence search, MVV-LVA and check extension? I intended to leave them out of my very first version, but when I saw the _MASSIVE_ blunders the program played, I quickly decided not to release it like that.

By the way: I also tried to look at the code: https://github.com/unserializable/monchester, but there doesn't seem to be any. Am I missing something? Edit: Oh. It's not in the repository, but attached to the release. That's a novel idea :D
Author of Rustic.
Releases | Code | Docs


Re: Progress on Rustic

Post by mvanthoor » Mon Jan 25, 2021 10:00 pm

With regard to testing: if you want to test the fixed version, you can pull the qsearch-fix branch. If the 1m+0.6s test checks out tonight, I'll merge it into master tomorrow so it'll be in the next release down the road. (The next release will focus on the transposition table, at least with regard to the playing part of the engine.)

unserializable
Posts: 63
Joined: Sat Oct 24, 2020 4:39 pm
Full name: Taimo Peelo

Re: Progress on Rustic

Post by unserializable » Mon Jan 25, 2021 10:18 pm

mvanthoor wrote:
Mon Jan 25, 2021 9:38 pm
PS: I took a look at Monchester after I saw the score of your program against Rustic... it's in the list at ELO 837. How did you get it all the way down there? I don't understand... Have you not yet implemented quiescence search, MVV-LVA and check extension? I intended to leave them out of my very first version, but when I saw the _MASSIVE_ blunders the program played, I quickly decided not to release it like that.
Of course Monchester 1.0 does not implement any of this fancy stuff; 1.0 is as basic as conceived :)! There are plans for the next version and also for one fun experiment, but I have not had much time for this. Hopefully there will be a fun update by 1st April or 2.0 at Halloween, though. With regard to CCRL 404, Monchester is probably a bit underrated there, as adjudication rules are used and many of the engines at the lower end might not be able to actually convert the 'won' positions; e.g. it is rated a bit lower than ChessPuter on CCRL 404. But against ChessPuter the constant-strength Monchester performed +210 =250 -40 (+123 ELO / 100 Bayes ELO) in my local tests at 2+1 (ChessPuter was just totally unable to convert won positions; most of these were drawn when played out).

For amusement, a typical draw in ChessPuter style, when played out.



There are obscure results for some of its CCRL games. As the commented .pgns are not available, I am not certain how these happened; e.g. one game ends with the opponent putting its queen en prise and winning...


mvanthoor wrote:
Mon Jan 25, 2021 9:38 pm
By the way: I also tried to look at the code: https://github.com/unserializable/monchester, but there doesn't seem to be any. Am I missing something? Edit: Oh. It's not in the repository, but attached to the release. That's a novel idea :D
Engine code is also present in the version branches ('1.0-branch'); the master branch has the README and binaries only. The repository structure is also explained at the start of the README :).
Monchester 1.0, chess engine playing at scholastic level: https://github.com/unserializable/monchester ("Daddy, it is gonna take your horsie!")
Tickle Monchester at: https://lichess.org/@/monchester


Re: Progress on Rustic

Post by mvanthoor » Mon Jan 25, 2021 10:29 pm

unserializable wrote:
Mon Jan 25, 2021 10:18 pm
Of course Monchester 1.0 does not implement any of this fancy stuff; 1.0 is as basic as conceived :)! There are plans for the next version and also for one fun experiment, but I have not had much time for this. Hopefully there will be a fun update by 1st April or 2.0 at Halloween, though.
I haven't extensively checked the code, but it seems Monchester is either stuck at 4 moves deep, or it limits itself to that search depth. That would be one explanation for the low rating.

I've just run a test of 200 games at 10s+0, with the following results:

Code:

Score of Rustic QSearch Fix vs Monchester 1.0.1: 199 - 0 - 1 [0.998]
...      Rustic QSearch Fix playing White: 100 - 0 - 0  [1.000] 100
...      Rustic QSearch Fix playing Black: 99 - 0 - 1  [0.995] 100
...      White vs Black: 100 - 99 - 1  [0.502] 200
Elo difference: 1040.4 +/- nan, LOS: 100.0 %, DrawRatio: 0.5 %
200 of 200 games finished.
It seems the score is now at 99.8% for Rustic... Monchester managed only one draw, which was a move repetition, almost right out of the opening book:
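
As a sanity check on the "Elo difference" line, the figure can be reproduced from the win/draw/loss counts with the usual logistic formula. The sketch below assumes nothing beyond that formula; `elo_diff` is just a name chosen here, and the error margin (the "+/-" part) is not computed.

```rust
/// Elo difference implied by a match result, using the logistic model:
/// diff = -400 * log10(1 / score - 1), where score is the fraction of points won.
fn elo_diff(wins: u32, draws: u32, losses: u32) -> f64 {
    let games = (wins + draws + losses) as f64;
    let score = (wins as f64 + 0.5 * draws as f64) / games;
    -400.0 * (1.0 / score - 1.0).log10()
}
```

For 199 wins, 1 draw and 0 losses the score is 0.9975 and the result is about 1040.4, matching the output above. The +210 =250 -40 result against ChessPuter quoted earlier in the thread gives about +123 the same way. A 100% score makes the formula diverge to infinity, so an Elo difference from a near-clean sweep says very little.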



If given a few seconds to think, however, Rustic doesn't repeat the moves; after the bishop retreats to d3, it captures on c4.

I look forward to the time when you implement iterative deepening search... until then, Monchester isn't really a viable sparring partner for Rustic (or basically any other engine that implements iterative deepening).
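
For anyone unfamiliar with the term, a bare-bones iterative deepening driver could look something like the sketch below. It is only an outline under simplifying assumptions: `iterative_deepening` and `search_at_depth` are hypothetical names, the fixed-depth search is passed in as a closure, and the clock is only checked between iterations (a real engine would also abort in the middle of an iteration).

```rust
use std::time::{Duration, Instant};

/// Search depth 1, 2, 3, ... until the time budget is used up or
/// `max_depth` is reached. Returns the last completed (depth, score).
fn iterative_deepening<F>(mut search_at_depth: F, max_depth: u32, budget: Duration) -> (u32, i32)
where
    F: FnMut(u32) -> i32,
{
    let start = Instant::now();
    let mut best = (0, 0);

    for depth in 1..=max_depth {
        // Each iteration is a complete fixed-depth search.
        let score = search_at_depth(depth);
        best = (depth, score);

        // Stop deepening once the budget is spent.
        if start.elapsed() >= budget {
            break;
        }
    }
    best
}
```

The difference from a fixed 4-ply search is that the depth reached now depends on the time available, so faster hardware or longer time controls translate into deeper searches.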


Re: Progress on Rustic

Post by unserializable » Mon Jan 25, 2021 10:48 pm

mvanthoor wrote:
Mon Jan 25, 2021 10:29 pm
I haven't extensively checked the code, but it seems Monchester is either stuck at 4 moves deep, or it limits itself to that search depth. That would be one explanation for the low rating.
Yes, it is constant strength at 4 plies, no matter the hardware or time controls. It is meant as a reference engine and for scholastic chess; for sparring, it is suitable for beginning engines or mass-testing. In blitz, it is currently rated around 1550 at lichess.

The draw you posted has a notable depth-253 extension search in it: 14. ... Kh8 {0.00/253 0.003s}.

EDIT: I will take a look at Rustic's 'qsearch-fix' branch tomorrow; additional games against the older version are still running.


Re: Progress on Rustic

Post by mvanthoor » Mon Jan 25, 2021 11:39 pm

unserializable wrote:
Mon Jan 25, 2021 10:48 pm
EDIT: I will take a look at Rustic's 'qsearch-fix' branch tomorrow; additional games against the older version are still running.
Thanks :)

To be honest, I think the 10s+0 games are low quality, but for giggles I'm running a few sets of 200 to just see what the results are. The engine Rustic plays against makes a huge difference (the ratings I quote are a bit older; they may have slightly changed).

Rustic vs:

Pulse 1.7.2 (CCRL 1650): +150 ELO
TSCP 1.81 (CCRL 1713): -100 ELO
ShallowBlue 2.0 (CCRL 1724): +30 ELO
Pigeon 1.5.1 (CCRL 1798): +35 ELO

These results put Rustic somewhere between 1600 (against TSCP) and 1830 (against Pigeon).

Note that I don't put a lot of stock in 10s+0 games. My own tests to see if Rustic improved after a change are going to be run at either 1m+0.6s, or at the same time controls as CCRL blitz.

Guenther
Posts: 3938
Joined: Wed Oct 01, 2008 4:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Progress on Rustic

Post by Guenther » Tue Jan 26, 2021 8:08 am

unserializable wrote:
Mon Jan 25, 2021 10:18 pm
mvanthoor wrote:
Mon Jan 25, 2021 9:38 pm
PS: I took a look at Monchester after I saw the score of your program against Rustic... it's in the list at ELO 837. How did you get it all the way down there? I don't understand... Have you not yet implemented quiescence search, MVV-LVA and check extension? I intended to leave them out of my very first version, but when I saw the _MASSIVE_ blunders the program played, I quickly decided not to release it like that.
...
There are obscure results for some of its CCRL games. As the commented .pgns are not available, I am not certain how these happened; e.g. one game ends with the opponent putting its queen en prise and winning...


Hi Taimo, you should report the strangest ones (as above) to CCRL.
(They have their own forum, BTW.)

Maybe there was a strange adjudication rule active, e.g. something like x plies into a won 4-men TB;
also, there is no Termination tag or reason given at all.
Sometimes they fix wrong (not really checked) game results.

Actually, I think games for programs below, let's say, 1400 probably should not be adjudicated at all.
OTOH, this would probably reduce the motivation to test some of them, due to marathon games.
https://rwbc-chess.de
'chessqueen' 2018-present, aka: 'George' 2013-2016, 'pichy' 2006-2013, 'Jorge Pichard' 2000-2006 (old forum)
Troll barometer:
https://docs.google.com/spreadsheets/d/ ... KSptBx9AUs


Re: Progress on Rustic

Post by unserializable » Tue Jan 26, 2021 9:30 am

unserializable wrote:
Mon Jan 25, 2021 10:48 pm
EDIT: I will take a look at Rustic's 'qsearch-fix' branch tomorrow; additional games against the older version are still running.
Some more interesting games from 7000 10s+0 games against Rustic Alpha 1 without the search fix (99.36%, +6947 =17 -36 for Rustic). No identifying round numbers, as those were lost when merging games from parallel runs. I have set my desktop at home to play another 7k games at the same time control against the search fix branch.

Stalemates

Stalemates, probably in time trouble: Rustic reported search depth 1 on the last moves -- but there are many moves available at depth 1 that do not result in stalemate.







Fun stalemate, but that one is probably beyond the horizon at such a fast time control:



Maybe avoidable losses

Loss after material grabbing 23. Qxh8 with reported 6-ply lookahead by Rustic:



Fun multi-queen ending with an epaulette mate. It seems that if check extension is applied as reported in the new fix branch, the mate should be successfully avoided.


Guenther wrote:
Tue Jan 26, 2021 8:08 am
Hi Taimo, you should report the strangest ones (as above) to CCRL.
Hey Guenther! I do not see much point in complaining to CCRL anymore about the testing of the 0.99 beta, but I have thought about asking CCRL to do 1.0+ testing when I get 1.0.1+ out; maybe then I will also take the time to point out some questionable adjudications.


Re: Progress on Rustic

Post by Guenther » Tue Jan 26, 2021 10:33 am

unserializable wrote:
Tue Jan 26, 2021 9:30 am

...

Fun multi-queen ending with an epaulette mate. It seems that if check extension is applied as reported in the new fix branch, the mate should be successfully avoided.


Guenther wrote:
Tue Jan 26, 2021 8:08 am
Hi Taimo, you should report the strangest ones (as above) to CCRL.
Hey Guenther! I do not see much point in complaining to CCRL anymore about the testing of the 0.99 beta, but I have thought about asking CCRL to do 1.0+ testing when I get 1.0.1+ out; maybe then I will also take the time to point out some questionable adjudications.
You are probably right; it also gives your rating gain with a newer version the chance to look better ;-)

Back to the Rustic thread. The last game you showed above might reveal a bug in Monchester too.
It still showed a negative eval after the wrong f1=Q, and I don't understand why. It only changed the eval sign
with mate in two on the board. Shouldn't it see it earlier at least, or is this just because depth = 4 plies
and everything above is just material count?

BTW, is it also possible that Rustic thought it should play f1=N+!, but underpromotion is not implemented yet?

Sven
Posts: 3969
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Progress on Rustic

Post by Sven » Tue Jan 26, 2021 12:42 pm

unserializable wrote:
Mon Jan 25, 2021 10:48 pm
mvanthoor wrote:
Mon Jan 25, 2021 10:29 pm
I haven't extensively checked the code, but it seems Monchester is either stuck at 4 moves deep, or it limits itself to that search depth. That would be one explanation for the low rating.
Yes, it is constant strength at 4 plies, no matter the hardware or time controls. It is meant as a reference engine and for scholastic chess; for sparring, it is suitable for beginning engines or mass-testing. In blitz, it is currently rated around 1550 at lichess.

The draw you posted has a notable depth-253 extension search in it: 14. ... Kh8 {0.00/253 0.003s}.

EDIT: I will take a look at Rustic's 'qsearch-fix' branch tomorrow; additional games against the older version are still running.
Hi Taimo,

your engine could reach much more than depth 4 by simply changing your search from full minimax to negamax + alphabeta and very basic move ordering. After looking briefly into your code I think the changes would not be too hard to do for you as an experienced developer, most of them in compmove.c, I guess:

1) In function Score() replace the two-level move loop (i.e. loop over all pieces and there over all moves per piece) by a simple loop over a linear array of MoveInfo instances, aka "move list" (and do necessary preparations for that). The size of that move list (which I would keep as local as possible) can be limited to e.g. 256 moves since there is no known FIDE chess position with that many legal moves (the known maximum is 218).

2) Extend struct MoveInfo by a "score" member and set it to the MVV/LVA score for all captures, zero for all other moves. Sort the move list in descending order based on that score (for the purpose of alphabeta there is even a slightly faster way than that, as an optimization for later).

3) Change function Score() from minimax to negamax logic by defining all scores from the viewpoint of the active color and making the recursive call like "score = -Score(Board, ..., depth - 1, ...)". The "judged" parameter would become obsolete. Updating the best score "r" would become simpler since there would no longer be two cases.

4) Add parameters "int alpha, int beta" to Score(), drop the variable "r" and let "alpha" track your best score instead, do the recursive call as "score = -Score(Board, ..., depth - 1, -beta, -alpha, ..)", add the cutoff "if (score >= beta) return beta;" prior to updating alpha ("if (score > alpha) alpha = score;"), and make sure you only update the PV if alpha < score < beta.

That should bring you to depth 7 at least within the same thinking time ...
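
To make steps 2 to 4 a bit more concrete, here is a rough sketch on a hand-built toy tree. It is written in Rust only for brevity and is not Monchester's code: `Node`, `Move`, `mvv_lva`, `negamax` and the helper constructors are hypothetical names, the `16 * victim - attacker` weighting is just one possible MVV/LVA formula, and mate/stalemate detection and PV tracking are left out.

```rust
// Toy model: names are hypothetical and not Monchester's.
use std::cmp::Reverse;

struct Move {
    victim: i32,   // value of the captured piece, 0 for a quiet move
    attacker: i32, // value of the moving piece
    child: Node,   // position after the move
}

struct Node {
    eval: i32,        // static score from the side to move's point of view
    moves: Vec<Move>, // the linear "move list" of step 1
}

/// Step 2: MVV/LVA score for captures, zero for all other moves.
fn mvv_lva(m: &Move) -> i32 {
    if m.victim == 0 { 0 } else { 16 * m.victim - m.attacker }
}

/// Steps 3 and 4: negamax with a fail-hard alpha-beta window.
/// `visited` counts nodes so the effect of move ordering can be measured.
fn negamax(node: &Node, depth: u32, mut alpha: i32, beta: i32, visited: &mut u64) -> i32 {
    *visited += 1;
    // Toy simplification: no mate or stalemate detection.
    if depth == 0 || node.moves.is_empty() {
        return node.eval;
    }

    // Sort in descending order of MVV/LVA score.
    let mut ordered: Vec<&Move> = node.moves.iter().collect();
    ordered.sort_by_key(|m| Reverse(mvv_lva(*m)));

    for m in ordered {
        let score = -negamax(&m.child, depth - 1, -beta, -alpha, visited);
        if score >= beta {
            return beta; // cutoff: the opponent will avoid this line
        }
        if score > alpha {
            alpha = score;
        }
    }
    alpha
}

// Small helpers for building test trees.
fn leaf(eval: i32) -> Node { Node { eval, moves: Vec::new() } }
fn quiet(child: Node) -> Move { Move { victim: 0, attacker: 0, child } }
```

On a two-ply tree where the root has one quiet move and one pawn-takes-queen capture, and the capture leads to the better line, sorting the capture first lets the quiet move's subtree be cut off after its first reply. In the tree used to check this sketch, that meant 6 visited nodes instead of all 7. The saving is tiny here, but it compounds with depth.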

EDIT: Negamax is not strictly required for alphabeta to work, but it makes life easier... And I would also throw out all kinds of score randomization and add very basic positional scoring, e.g. mobility. And then the next step would be qsearch ;-)
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
