Put that PGN into the MRI

Rebel
Posts: 5241
Joined: Thu Aug 18, 2011 10:04 am

Put that PGN into the MRI

Post by Rebel » Tue Feb 18, 2020 11:01 am

Some snippets

MRI is a PGN utility for engine developers and/or beta testers to extract critical information from an eng-eng match, even from rating list PGN downloads with multiple engines such as CCRL, CEGT and probably others.

...

7. Lost Games Analysis - This function tries (emphasis added) to pinpoint the move where it all went wrong. It checks the last 4 moves, and if the scores go down slowly but steadily from move to move until the margin (default 1.50) is reached, then it makes sense to have a look at that game fragment. While it's far from perfect, this function is probably the most useful one in the MRI utility.
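
As a rough illustration of the idea described in item 7, here is a minimal sketch in Python. It is not MRI's code: the function name `find_decline`, the sliding-window reading of "the last 4 moves", and the requirement that every score is strictly lower than the previous one are all assumptions made for the example.

```python
from typing import List, Optional


def find_decline(scores: List[float], window: int = 4,
                 margin: float = 1.50) -> Optional[int]:
    """Return the index where a steady decline starts, or None.

    scores: evaluations of the losing engine, one per own move,
            from that engine's point of view (assumed input format).
    window: how many consecutive moves are inspected (4 in the description).
    margin: total drop over the window that triggers a report (default 1.50).
    """
    for start in range(len(scores) - window + 1):
        chunk = scores[start:start + window]
        # "Slowly but steadily down": each score is lower than the one before.
        steadily_down = all(b < a for a, b in zip(chunk, chunk[1:]))
        if steadily_down and chunk[0] - chunk[-1] >= margin:
            return start
    return None


if __name__ == "__main__":
    game = [0.10, 0.05, -0.30, -0.80, -1.60, -2.50]
    print(find_decline(game))  # index of the first move of the suspicious fragment
```

The returned index only marks where to start looking; as the description says, the result is a hint for manual inspection, not a verdict.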

...

Option 5c is one of the better features of MRI: download the annotated games of your latest version from CCRL and/or CEGT and get the results of all games played against all opponents.

...

http://rebel13.nl/download/mri.html
90% of coding is debugging, the other 10% is writing bugs.

Guenther
Posts: 3280
Joined: Wed Oct 01, 2008 4:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Put that PGN into the MRI

Post by Guenther » Tue Feb 18, 2020 12:18 pm

I downloaded it and tested it on a fresh pgn and have a few suggestions and questions.
This was a test of SF11 vs. SF11x2 (approx. half time used)

1. Unfortunately it cuts off the name in the general overview, so both entities now look the same (this could be cured by allowing more characters, I guess).
From the results and average depths I can still recognize who is who, though.

2. It gives wrong values for book depth (we already had this with another tool of yours, if you remember?). The book is always exactly 6 plies and was created by me. The time output also says 0:00 for one program. BTW, is this supposed to be accumulated time or average time? I see no explanation in the output.

3. How does it parse and calculate the suspect openings? No record is given, although I know there are several entries beyond the 1.00 margin (mostly because the games are extremely fast and search gives too pessimistic evals at lower depths). Example below.

The PGN was created by cutechess-cli (this was correctly assumed), but modified by a script of mine, which removes the 's' at the end of the move times
and creates perfect columns. It is also a slightly modified cutechess compiled by me, which saves move times with at least 2 digits of precision. Maybe this confuses MRI?

Here is an example; they are all that way (I left the PGN headers out here).

Code:

1. c4 {book} e5 {book}
2. Nc3 {book} f5 {book}
3. g3 {book} Nf6 {book}
4. e3 {+1.05/16 0.92} Nc6 {-0.30/17 1.92}
5. d4 {+0.93/14 0.25} Bb4 {+0.01/13 0.15}
BTW, the one probably missed win is a nice find of a fortress (Festung), if I am not mistaken.
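
Regarding the book depth in point 2, the expected value can be checked independently of MRI by counting the leading `{book}` comments in the move text. The sketch below is only an illustration; the helper name `book_plies` is made up for this example, and it assumes that book moves are annotated with exactly `{book}`, as in the fragment above.

```python
import re


def book_plies(movetext: str) -> int:
    """Count the leading plies whose comment is exactly 'book'."""
    comments = re.findall(r"\{([^}]*)\}", movetext)
    count = 0
    for comment in comments:
        if comment.strip() != "book":
            break  # first engine move: the book has ended
        count += 1
    return count


if __name__ == "__main__":
    fragment = """
    1. c4 {book} e5 {book}
    2. Nc3 {book} f5 {book}
    3. g3 {book} Nf6 {book}
    4. e3 {+1.05/16 0.92} Nc6 {-0.30/17 1.92}
    5. d4 {+0.93/14 0.25} Bb4 {+0.01/13 0.15}
    """
    print(book_plies(fragment))  # plies, not full moves
```

For the fragment above this counts 6 plies, i.e. 3 book moves per side.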


Part of the MRI output referred above:

Code:

Assume CuteChess PGN.

Current Settings:
Margin for suspect Openings          : 1.00
Margin for crazy games               : 1.00
Margin for possible missed won games : 5.00
Drop in Score Margin                 : 1.50
Phase Margin                         : 0.50
Lost Game Score Margin               : 1.50
Opening Repertory                    : on 
Double Game Detection                : off

Phase          Won Games (numbers)           Late Endgame             Match
Overview       MIDG END1 END2 END3      QUEEN ROOK LIGHT PAWN         Score
Stockfish_11    150    7   96   23          0   17    14   14   713.5 / 1000 (71.3%)
Stockfish_11     22    1   14    3          0    3     0    4   286.5 / 1000 (28.6%)

Phase              Won games %               Late Endgame             Match
Overview       MIDG END1 END2 END3      QUEEN ROOK LIGHT PAWN         Score
Stockfish_11   87.2 87.5 87.3 88.5        0.0 85.0  100% 77.8   713.5 / 1000 (71.3%)
Stockfish_11   12.8 12.5 12.7 11.5        0.0 15.0   0.0 22.2   286.5 / 1000 (28.6%)

Depths         MIDG END1 END2 END3     BOOK             TIME
Stockfish_11   17.1 17.5 22.9 25.2     3.5 (moves)     20:31
Stockfish_11   15.3 15.6 20.0 21.7     4.1 (moves)      0:00

Suspect opening lines overview (1.00)  View

Crazy games (incompatible scores)  (1.00)  View

Possible missed wins overview (5.00)  View
Stockfish_11-64 possibly missed a win in game 928 with a score of 5.26 (move 71... Qd2+) draw only.  Stockfish
8/2p5/8/3k4/p7/B1R5/K7/3q4 b - - sm Qd2+; acd 20; ce 5.26; acs 0.0s; 


Score drop overview (1.50)  View
Drop in score of 1.74 (0.00 to -1.74) for Stockfish_11x2-64 in game 3 during moves 18-19, game lost  Stockfish
rr4k1/1q1nbpp1/p1pp1nbp/4P3/3P1B2/P1N5/BP1Q1NPP/3R1RK1 b - - sm Nh5; acd 13; ce -1.74; acs 0.0s; 
https://rwbc-chess.de
https://rwbc-chess.de/chronology.htm
--------------------------------------------------
The troll explosion at talkchess:
https://docs.google.com/spreadsheets/d/ ... KSptBx9AUs


Re: Put that PGN into the MRI

Post by Rebel » Tue Feb 18, 2020 3:42 pm

Guenther wrote:
Tue Feb 18, 2020 12:18 pm
I downloaded it and tested it on a fresh pgn and have a few suggestions and questions.
This was a test of SF11 vs. SF11x2 (approx. half time used)

1. Unluckily it cuts off the name in the general overview, thus both entities look now the same (could be cured by allowing more chars I guess)
From the results and avg. depths I can still recognize who is who though.
The name length is currently fixed at 12 characters; I will make it user-defined.
Guenther wrote:
Tue Feb 18, 2020 12:18 pm
2. It gives wrong values for book depth (we had this already for another tool of yours, if you remember?), the book is exactly 6 plies always and created by me. The time output also says 0:00 for one program. BTW should this be accumulated time or average time, I see no explanation in the output?
I have an idea why; I will look into it.
Guenther wrote:
Tue Feb 18, 2020 12:18 pm
3. How does it parse and calculate the suspect openings? There is no record given, despite I know there are several entries with 1.00 margin (mostly for being extremely fast games and search gives too pessimistic evals at lower depths). Example below.
An engine may play its first move with a score of -5.xx; the opening is only recorded if that engine then loses the game.
Guenther wrote:
Tue Feb 18, 2020 12:18 pm
The pgn was created by cutechess-cli (it was correctly assumed), but modified by a script by me, which removes 's' at the end of the move times
and creates perfect columns. Also it is a slightly modified cutechess compiled by me, which adds at least 2 digits precision for the saved move times. May be this confuses MRI?
Yes.

The "s" suffix is typical for the Cute format; without it, MRI assumes a more Arena-like format, i.e. full seconds only. Run the original PGN instead.
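
For illustration, a parser that does not depend on the "s" suffix or on the number of decimals could look like the sketch below. This is not how MRI reads the file; the regular expression and the name `parse_comment` are assumptions for the example, and mate scores or other comment variants are deliberately not handled.

```python
import re
from typing import Optional, Tuple

# score/depth time, e.g. "+1.05/16 0.92s" (Cute style) or "+1.05/16 0.92" (suffix stripped)
COMMENT_RE = re.compile(
    r"^\s*([+-]?\d+(?:\.\d+)?)/(\d+)\s+(\d+(?:\.\d+)?)s?\s*$"
)


def parse_comment(comment: str) -> Optional[Tuple[float, int, float]]:
    """Return (score, depth, seconds), or None for 'book' and unknown comments."""
    match = COMMENT_RE.match(comment)
    if match is None:
        return None
    score, depth, seconds = match.groups()
    return float(score), int(depth), float(seconds)


if __name__ == "__main__":
    for text in ("+1.05/16 0.92", "+1.05/16 0.92s", "-0.30/17 1.9s", "book"):
        print(text, "->", parse_comment(text))
```

Treating the suffix as optional and the time as a decimal number accepts both the original and the modified PGN, with 1, 2 or 3 digits after the decimal point.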
Guenther wrote:
Tue Feb 18, 2020 12:18 pm
Here an example - they are all that way. (left the pgn headers out here)

Code:

1. c4 {book} e5 {book}
2. Nc3 {book} f5 {book}
3. g3 {book} Nf6 {book}
4. e3 {+1.05/16 0.92} Nc6 {-0.30/17 1.92}
5. d4 {+0.93/14 0.25} Bb4 {+0.01/13 0.15}
BTW the one probably missed win is a nice find of a festung, if I am not mistaken.
Correct.

Re: Put that PGN into the MRI

Post by Guenther » Tue Feb 18, 2020 5:40 pm

Thanks for the answers!

I am still not sure about point 3. Shouldn't it then report a 'suspect opening' if, e.g., the eval is > 1.0 for White and White wins?
Or does it consider the eval of the reply too?
That would be a big improvement anyway. What if you just record the opening as 'suspect' when the eval is > X for a few moves (let's say 3 or 4 again)?
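
The suggestion in the last sentence can be sketched as follows. It is only an illustration of the proposal, not of MRI's current rule; the name `suspect_by_streak`, the default streak of 3 and the symmetric treatment of positive and negative evals are assumptions.

```python
from typing import Sequence


def suspect_by_streak(evals: Sequence[float], margin: float = 1.00,
                      streak: int = 3) -> bool:
    """True if the first `streak` out-of-book evals of one engine all lie
    beyond the margin in the same direction."""
    first = list(evals[:streak])
    if len(first) < streak:
        return False  # game too short to judge
    return all(e > margin for e in first) or all(e < -margin for e in first)


if __name__ == "__main__":
    # White's first out-of-book evals from the example fragment: +1.05, +0.93
    print(suspect_by_streak([1.05, 0.93, 0.80]))
    print(suspect_by_streak([1.05, 1.20, 1.10]))
```

With a margin of 1.00, the example fragment posted earlier would not qualify under this rule, because White's second eval (+0.93) is already below the margin.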

Re: Put that PGN into the MRI

Post by Guenther » Tue Feb 18, 2020 6:26 pm

After the change in the PGN, I at least got two time results that differ from before.
But something still seems to be wrong: the test lasted over 16:03:xx.

Code:

after adding (s) again to move times:

Depths         MIDG END1 END2 END3     BOOK             TIME
Stockfish_11   17.1 17.5 22.9 25.2     3.5 (moves)   5:09:40
Stockfish_11   15.3 15.6 20.0 21.7     4.1 (moves)   2:17:46

----------------------------------------------------------------------------------------
before:

Depths         MIDG END1 END2 END3     BOOK             TIME
Stockfish_11   17.1 17.5 22.9 25.2     3.5 (moves)     20:31
Stockfish_11   15.3 15.6 20.0 21.7     4.1 (moves)      0:00
Also still no crazy games and no suspect openings saved.

Shouldn't this be suspect by your definition with a margin of 1.00 or not?


Re: Put that PGN into the MRI

Post by Rebel » Tue Feb 18, 2020 11:31 pm

Guenther wrote:
Tue Feb 18, 2020 5:40 pm
I am still not sure about point 3., shouldn't it report then a 'suspect openings', if e.g. eval > 1.0 for White and White wins?
Or does it consider the eval of the reply too?
This would be a big improvement anyway. What, if you just record it as 'suspect', if the eval is >X for a few moves (let's say 3 or 4 again)?
I hear you, but it's complicated. If you look at the game, Black starts with a score of -1.07, but during the game the score climbs close to a draw score, so following the code the opening is not suspect. Following your suggestion, the game wouldn't reach suspect status either :) Let me think about it.

Re: Put that PGN into the MRI

Post by Rebel » Tue Feb 18, 2020 11:45 pm

Guenther wrote:
Tue Feb 18, 2020 6:26 pm
After the change in the pgn I got at least two different time results compared to before.
But still something there seems to be wrong? The test lasted over 16:03:xx.
16 minutes or 16 hours?

Code:

after adding (s) again to move times:

Depths         MIDG END1 END2 END3     BOOK             TIME
Stockfish_11   17.1 17.5 22.9 25.2     3.5 (moves)   5:09:40
Stockfish_11   15.3 15.6 20.0 21.7     4.1 (moves)   2:17:46

----------------------------------------------------------------------------------------
before:

Depths         MIDG END1 END2 END3     BOOK             TIME
Stockfish_11   17.1 17.5 22.9 25.2     3.5 (moves)     20:31
Stockfish_11   15.3 15.6 20.0 21.7     4.1 (moves)      0:00
Note that you need to divide time by the number of cores you used.

But I noticed a small bug anyway: MRI stops counting time after 190 moves because of a typo.

Also still no crazy games and no suspect openings saved.
Crazy games only occur when the 2 engines firmly disagree with each other and both show a positive score > margin.
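
A minimal sketch of that condition, assuming each score is reported from the point of view of the engine that made the move; the name `crazy_moves` and the pairing of White's and Black's scores per move are made up for this illustration:

```python
from typing import List, Tuple


def crazy_moves(pairs: List[Tuple[float, float]],
                margin: float = 1.00) -> List[int]:
    """Return the 1-based positions of move pairs where BOTH engines
    claim an advantage above the margin.

    pairs: (white_score, black_score) per move pair, each score from the
           point of view of the engine that played the move.
    """
    return [
        number
        for number, (white, black) in enumerate(pairs, start=1)
        if white > margin and black > margin
    ]


if __name__ == "__main__":
    scores = [(1.05, -0.30), (0.93, 0.01), (1.40, 1.25)]
    print(crazy_moves(scores))
```

A pair such as +1.05 for White and -0.30 for Black does not qualify, since only one side shows a positive score above the margin.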

Shouldn't this be suspect by your definition with a margin of 1.00 or not?

See previous post.

Re: Put that PGN into the MRI

Post by Rebel » Wed Feb 19, 2020 12:08 am

One more thing: I noticed your Cutechess PGN has the time with 2 decimals (0.92s), while mine has only 1 decimal (0.9s).

Re: Put that PGN into the MRI

Post by Guenther » Wed Feb 19, 2020 7:03 am

Rebel wrote:
Wed Feb 19, 2020 12:08 am
One more thing, I noticed your Cutechess PGN has the time in 2 decimals (0.92s), mine has only 1 decimal (0.9s).
Yes, I already wrote this in my first post. I compiled Cutechess myself to do so, because for stats at short time controls I need more precision.
Other people use that build too, BTW. This will become a future improvement in cutechess/cli anyway.

BTW, even the default cutechess can write move times with up to 3 digits of precision for very low times.
https://github.com/cutechess/cutechess/ ... me.cpp#L42
Last edited by Guenther on Wed Feb 19, 2020 7:15 am, edited 3 times in total.

Re: Put that PGN into the MRI

Post by Guenther » Wed Feb 19, 2020 7:08 am

Rebel wrote:
Tue Feb 18, 2020 11:31 pm
Guenther wrote:
Tue Feb 18, 2020 5:40 pm
I am still not sure about point 3., shouldn't it report then a 'suspect openings', if e.g. eval > 1.0 for White and White wins?
Or does it consider the eval of the reply too?
This would be a big improvement anyway. What, if you just record it as 'suspect', if the eval is >X for a few moves (let's say 3 or 4 again)?
I hear you, but it's complicated. If you look at the game black starts with a -1.07 score but during the game the score increases close to a draw score and following the code the opening is not suspect. Following your suggestion the game wouldn't make it to suspect status either :) Lemme think about it.
Well, that's why I asked what exactly you are doing to recognize a suspect opening.
If you don't tell me, I cannot check whether something is wrong.