Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 6:50 am
by Necromancer
So while browsing my engine results at
http://ccrl.chessdom.com/ccrl/404/cgi/e ... 1_1_64-bit
I saw this:
Tunguska 1.1 (2471) vs Jonny 4.0 (2746)
+36−5=8
It's a 275 Elo difference, so Tunguska's expected score is ~17%. I looked at some games and they look normal. The weird thing is that Jonny appears to play well against engines of its own level. Maybe a bug in the stronger engine?
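The ~17% figure follows from the standard logistic Elo formula, which strictly gives an expected score (draws count half) rather than a pure win probability. A minimal sketch, where the function name is my own and the ratings are the ones quoted above:

```python
def expected_score(elo_self: float, elo_opp: float) -> float:
    """Expected score (win = 1, draw = 0.5) under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((elo_opp - elo_self) / 400.0))


# Ratings as listed on CCRL 40/4 in this post.
tunguska = expected_score(2471, 2746)
print(f"Tunguska expected score: {tunguska:.3f}")  # roughly 0.17
```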
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 7:07 am
by Dann Corbit
Lots of possible explanations.
Some engines have a nemesis that simply beats them better than one would expect given the Elo difference, even over a large number of trials.
The game count is small. Random fluctuation can cause all sorts of strange-looking things with just a few trials.
Jonny seems to perform well with a giant pile of cores. Was the test single threaded?
Jonny could be misconfigured.
There are many other possibilities.
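One rough way to weigh the random-fluctuation explanation for this particular match is to compare Jonny's actual score (9 points from 49 games, i.e. the +36−5=8 result seen from Jonny's side) with the score the rating gap predicts. The sketch below is only an approximation under stated assumptions: it takes the listed ratings at face value, treats games as independent, and uses the binomial variance p(1-p) per game, which is an upper bound when draws occur. The helper names are my own.

```python
import math


def expected_score(elo_self: float, elo_opp: float) -> float:
    """Expected score per game under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((elo_opp - elo_self) / 400.0))


def z_score(points: float, games: int, elo_self: float, elo_opp: float) -> float:
    """Standard deviations between the actual and the expected match score.

    Assumption: per-game variance is p * (1 - p), the no-draws upper bound,
    so the true deviation is if anything larger in magnitude.
    """
    p = expected_score(elo_self, elo_opp)
    mean = games * p
    sd = math.sqrt(games * p * (1.0 - p))
    return (points - mean) / sd


# Jonny 4.0 (2746) vs Tunguska 1.1 (2471): 9.0 points out of 49 games.
z = z_score(points=9.0, games=49, elo_self=2746, elo_opp=2471)
print(f"z = {z:.1f}")  # about -12
```

A deviation of this size is far outside what 49 games of noise would normally produce, so for this match the other explanations in the list look more plausible than chance alone.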
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 7:26 am
by Necromancer
Dann Corbit wrote: ↑Tue May 14, 2019 7:07 am
Was the test single threaded?
Jonny could be misconfigured.
There are many other possibilities.
I don't know, it's from CCRL 40/4. Reading about Jonny
...Jonny uses a 0x88 board representation, and applies a sophisticated distributed and parallel search.
So maybe that was the problem, thanks!
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 9:40 am
by Guenther
Necromancer wrote: ↑Tue May 14, 2019 7:26 am
Dann Corbit wrote: ↑Tue May 14, 2019 7:07 am
Was the test single threaded?
Jonny could be misconfigured.
There are many other possibilities.
I don't know, it's from CCRL 40/4. Reading about Jonny
...Jonny uses a 0x88 board representation, and applies a sophisticated distributed and parallel search.
So maybe that was the problem, thanks!
Normally we would need eval/depth (which is not available for the 40/4 games) to investigate the case properly (misconfigs or CPU overloads happen...), but in this case it is extremely unlikely that Jonny 4.00 was hit 3 times by an asteroid in the last 2 or 3 months.
I am saying this because Jonny 4.00 suffered the same extreme negative outlier result not only vs. Tunguska,
but also vs. Topple and FranMAD. I am convinced something went wrong here.
Code:
– Topple 0.5.0 64-bit 4CPU 2845 +24 -24 (+99) 0.5 - 44.5 (+0-44=1) 1.1% 0.5/45 0.0% -590
– Francesca MAD 0.21 64-bit 2709 +20 -20 (-37) 1.5 - 37.5 (+0-36=3) 3.8% 1.5/39 100.0% -532
A fishy example game below.
(Even without eval/depth it should be possible to analyse whether Jonny failed to reach normal depth here and lost that way.)
I checked a few positions after the opening, and Jonny often played moves that are already discarded here at depth 10-12
after one second (on my slow 10-year-old quad-core, using 1 CPU of course), e.g. 13. Qa4?? and others.
[pgn][Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2019.02.23"]
[Round "469.4.161"]
[White "Jonny 4.00"]
[Black "Francesca MAD 0.21 64-bit"]
[Result "0-1"]
[ECO "A04"]
[Opening "Reti opening"]
[PlyCount "64"]
[WhiteElo "2746"]
[BlackElo "2709"]
1. Nf3 e6 2. c4 b6 3. d4 Bb7 4. Nc3 Nf6 5. Bf4 Be7 6. h4 O-O 7. h5 d5 8. e3 c5
9. h6 g6 10. dxc5 bxc5 11. cxd5 exd5 12. Bb5 Nc6 13. Qa4 Qb6 14. O-O-O Rfd8 15.
Ne5 Na5 16. f3 a6 17. Bd7 Bd6 18. Bh3 Bxe5 19. Bxe5 Nc4 20. Nxd5 Bxd5 21. Rxd5
Rxd5 22. Bxf6 Nxe3 23. Qf4 c4 24. Re1 Rd3 25. a4 Qb3 26. Kb1 Nd5 27. Bf5 c3 28.
Qc1 Nb4 29. Re8+ Rxe8 30. Be6 c2+ 31. Qxc2 Qxc2+ 32. Ka1 Rd1# 0-1[/pgn]
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 9:59 am
by Graham Banks
Sergio ran the Tunguska v Jonny games under Arena 3.51.
Might pay to check with him.
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 10:23 am
by xr_a_y
Same thing happens with Minic 0.47 on CCRL 40/4.
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 10:30 am
by Guenther
Guenther wrote: ↑Tue May 14, 2019 9:40 am
Normally we would need eval/depth (which is not available for the 40/4 games) to investigate the case properly (misconfigs or CPU overloads happen...), but in this case it is extremely unlikely that Jonny 4.00 was hit 3 times by an asteroid in the last 2 or 3 months.
...
Code:
– Topple 0.5.0 64-bit 4CPU 2845 +24 -24 (+99) 0.5 - 44.5 (+0-44=1) 1.1% 0.5/45 0.0% -590
– Francesca MAD 0.21 64-bit 2709 +20 -20 (-37) 1.5 - 37.5 (+0-36=3) 3.8% 1.5/39 100.0% -532
Around that time Jonny 4.00 had two other results vs. FranMAD 0.22 and 0.23:
14.5 : 20.5 FranMAD 0.22
16.5 : 15.5 FranMAD 0.23
both inside normal error bars, unlike the result vs. FranMAD 0.21.
After downloading the 40/4 games file of Jonny 4.00, it appears that the problem has been
present at least since February 2019, and there are other completely unlikely bad results and
lots of quick, strange losses in the PGN. Probably all were played on the same quirky setup.
(In between there are normal results.)
Code:
CCRL 40/4 2019 (Jonny 4.00 games in 2019)
Jonny 4.00 2746 - Winter 0.4a 64-bit 2811 10.0 - 22.0 +6/=8/-18 31.25%
Jonny 4.00 2746 - Topple 0.3.4 64-bit 2664 23.5 - 7.5 +20/=7/-4 75.81%
Jonny 4.00 2746 - Francesca MAD 0.21 64-bit 2709 1.5 - 37.5 +0/=3/-36 3.85% XXX
Jonny 4.00 2746 - Topple 0.3.5 64-bit 2701 15.0 - 15.0 +12/=6/-12 50.00%
Jonny 4.00 2746 - chess22k 1.12 64-bit 3083 0.0 - 1.0 +0/=0/-1 0.00%
Jonny 4.00 2746 - Dirty CUCUMBER 64-bit 2928 0.0 - 1.0 +0/=0/-1 0.00%
Jonny 4.00 2746 - Amyan 1.72 2604 0.0 - 1.0 +0/=0/-1 0.00%
Jonny 4.00 2746 - Floyd 0.9 64-bit 2585 0.0 - 1.0 +0/=0/-1 0.00%
Jonny 4.00 2746 - Ruffian 2.1.0 2609 0.5 - 0.5 +0/=1/-0 50.00%
Jonny 4.00 2746 - Nebula 2.0 64-bit 2656 0.5 - 0.5 +0/=1/-0 50.00%
Jonny 4.00 2746 - Pharaon 3.5.1 2604 1.0 - 0.0 +1/=0/-0 100.00%
Jonny 4.00 2746 - Ktulu 9 2782 0.5 - 0.5 +0/=1/-0 50.00%
Jonny 4.00 2746 - BugChess2 1.9 64-bit 2758 1.0 - 0.0 +1/=0/-0 100.00%
Jonny 4.00 2746 - Gaviota 1.0 64-bit 2871 0.0 - 1.0 +0/=0/-1 0.00%
Jonny 4.00 2746 - Gogobello 1.4 64-bit 2756 1.0 - 0.0 +1/=0/-0 100.00%
Jonny 4.00 2746 - Delfi 5.4 2683 1.0 - 0.0 +1/=0/-0 100.00%
Jonny 4.00 2746 - Gogobello 2.0 64-bit 2834 15.0 - 17.0 +11/=8/-13 46.88%
Jonny 4.00 2746 - Francesca MAD 0.22 64-bit 2730 14.5 - 20.5 +11/=7/-17 41.43%
Jonny 4.00 2746 - Igel 1.4 64-bit 2634 22.0 - 10.0 +19/=6/-7 68.75%
Jonny 4.00 2746 - Fridolin 3.10 64-bit 4CPU 2796 3.0 - 26.0 +0/=6/-23 10.34% XXX
Jonny 4.00 2746 - Topple 0.5.0 64-bit 2829 10.0 - 22.0 +5/=10/-17 31.25%
Jonny 4.00 2746 - Topple 0.5.0 64-bit 4CPU 2845 0.5 - 44.5 +0/=1/-44 1.11% XXX
Jonny 4.00 2746 - Minic 0.47 64-bit 4CPU 2880 5.0 - 42.0 +1/=8/-38 10.64% XXX
Jonny 4.00 2746 - Francesca MAD 0.23 64-bit 2771 16.5 - 15.5 +12/=9/-11 51.56%
Jonny 4.00 2746 - Tunguska 1.1 64-bit 2471 9.0 - 40.0 +5/=8/-36 18.37% XXX
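A rough sketch of how such outliers could be flagged programmatically: compare each match score with the score the rating gap predicts and mark results that are more than a few standard deviations off. The five rows are copied from the list above; the threshold of 3 and all function names are my own assumptions, the variance p(1-p) per game is an upper bound because it ignores draws, and the ratings themselves already include these games, so this is only a sanity check.

```python
import math

JONNY_ELO = 2746

# (opponent, opponent Elo, Jonny's points, games), taken from the list above.
RESULTS = [
    ("Francesca MAD 0.21 64-bit", 2709, 1.5, 39),
    ("Francesca MAD 0.22 64-bit", 2730, 14.5, 35),
    ("Francesca MAD 0.23 64-bit", 2771, 16.5, 32),
    ("Topple 0.5.0 64-bit 4CPU", 2845, 0.5, 45),
    ("Tunguska 1.1 64-bit", 2471, 9.0, 49),
]


def expected_score(elo_self: float, elo_opp: float) -> float:
    """Expected score per game under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((elo_opp - elo_self) / 400.0))


def z_score(points: float, games: int, elo_self: float, elo_opp: float) -> float:
    """Deviation of the match score from expectation, in standard deviations."""
    p = expected_score(elo_self, elo_opp)
    sd = math.sqrt(games * p * (1.0 - p))  # no-draws upper bound on the spread
    return (points - games * p) / sd


def flag_outliers(results, elo_self, threshold=3.0):
    """Return the opponents whose result deviates by more than `threshold` sigma."""
    return [
        name
        for name, elo_opp, points, games in results
        if abs(z_score(points, games, elo_self, elo_opp)) > threshold
    ]


flagged = flag_outliers(RESULTS, JONNY_ELO)
for name in flagged:
    print("XXX", name)
```

On these five rows the sketch marks FranMAD 0.21, Topple 0.5.0 4CPU and Tunguska 1.1, and leaves FranMAD 0.22 and 0.23 alone.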
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 10:31 am
by Guenther
xr_a_y wrote: ↑Tue May 14, 2019 10:23 am
Same thing happens with Minic 0.47 on CCRL 40/4.
Yes, that's true. I have now added a bigger result list for the whole of 2019 regarding Jonny 4.00.
(Irregular results are marked with 'XXX'.)
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 11:31 am
by xr_a_y
I checked some of the games; they are not that bad.
Re: Weird results Tunguska 1.1 vs Jonny 4.0
Posted: Tue May 14, 2019 11:41 am
by Guenther
Removed the irrelevant single-game matches from gauntlets.
Jonny 4.00 CCRL 40/4 2017-2019:
Again, 2 extreme outliers already in 2017, both in December.
Check this example game: 20. f4? and 21. Qd6?? appear only for a fraction of a second
in Jonny 4.00's search at depth 6-8 or so... (hash filled by stepping through the game quickly)
[pgn][Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2017.12.14"]
[Round "148.5"]
[White "Jonny 4.00"]
[Black "Scorpio 2.7.8 64-bit"]
[Result "0-1"]
[ECO "D45"]
[WhiteElo "2746"]
[BlackElo "2861"]
[PlyCount "47"]
[EventDate "2017.??.??"]
1. d4 d5 2. Nf3 c6 3. c4 Nf6 4. e3 e6 5. Nc3 Nbd7 6. Qc2 Bd6 7. Bd3 O-O 8. O-O
dxc4 9. Bxc4 b5 10. Bd3 Bb7 11. Bd2 b4 12. Na4 c5 13. dxc5 Nxc5 14. Nxc5 Bxc5
15. Qxc5 Qxd3 16. Bxb4 Bxf3 17. gxf3 Nd5 18. Ba3 Qg6+ 19. Kh1 Qh5 20. f4 Rfc8
21. Qd6 Qf3+ 22. Kg1 Nxe3 23. fxe3 Qg4+ 24. Kh1 0-1[/pgn]
or this one:
16. Rxe8+?? and 17. Qxf5??? never appear here in Jonny 4.00's search, at least from depth 4 or so onward
(this game is from an omitted mini-match - gauntlet)
[pgn][Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2017.09.27"]
[Round "140.6"]
[White "Jonny 4.00"]
[Black "Chronos 1.9.9 64-bit"]
[Result "0-1"]
[ECO "C89"]
[WhiteElo "2746"]
[BlackElo "2739"]
[PlyCount "38"]
[EventDate "2017.??.??"]
1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 7. Bb3 O-O 8. c3
d5 9. exd5 Nxd5 10. Nxe5 Nxe5 11. Rxe5 c6 12. Re1 Bd6 13. d4 Bf5 14. Nd2 Qc7
15. Qf3 Rfe8 16. Rxe8+ Rxe8 17. Qxf5 Re1+ 18. Nf1 Bxh2+ 19. Kh1 Rxf1# 0-1[/pgn]
Code:
CCRL 40/4 2017
Jonny 4.00 2746 - Laser 1.3 64-bit 2948 9.0 - 28.0 +5/=8/-24 24.32%
Jonny 4.00 2746 - Amoeba 2.1 64-bit 2747 25.5 - 14.5 +20/=11/-9 63.75%
Jonny 4.00 2746 - Tornado 8.0 64-bit 2825 11.5 - 22.5 +7/=9/-18 33.82%
Jonny 4.00 2746 - Zurichess Jura 64-bit 2773 20.0 - 12.0 +16/=8/-8 62.50%
Jonny 4.00 2746 - Carballo 1.7 64-bit 2723 14.5 - 19.5 +10/=9/-15 42.65%
Jonny 4.00 2746 - ChessBrainVB 3.20 2803 21.0 - 29.0 +15/=12/-23 42.00%
Jonny 4.00 2746 - Amoeba 2.3 64-bit 2782 22.5 - 27.5 +13/=19/-18 45.00%
Jonny 4.00 2746 - Cheese 1.9 64-bit 2731 22.5 - 8.5 +19/=7/-5 72.58%
Jonny 4.00 2746 - RuyDos 1.0.2 64-bit 2669 19.5 - 9.5 +14/=11/-4 67.24%
Jonny 4.00 2746 - Zurichess Luzern 64-bit 2842 13.0 - 18.0 +7/=12/-12 41.94%
Jonny 4.00 2746 - ChessBrainVB 3.31 2828 15.5 - 12.5 +10/=11/-7 55.36%
Jonny 4.00 2746 - chess22k 1.4 64-bit 2689 20.0 - 12.0 +12/=16/-4 62.50%
Jonny 4.00 2746 - Gandalf 7 64-bit 2668 23.0 - 9.0 +18/=10/-4 71.88%
Jonny 4.00 2746 - chess22k 1.5 64-bit 2738 16.5 - 15.5 +12/=9/-11 51.56%
Jonny 4.00 2746 - GNU Chess 6.25 64-bit 2683 6.5 - 23.5 +3/=7/-20 21.67%
Jonny 4.00 2746 - Defenchess (SCTR) 1.0 64-bit 2843 11.5 - 14.5 +7/=9/-10 44.23%
Jonny 4.00 2746 - RuyDos 1.0.27 64-bit 2758 14.0 - 14.0 +9/=10/-9 50.00%
Jonny 4.00 2746 - Fruit 2.3.1 2780 10.5 - 15.5 +7/=7/-12 40.38%
Jonny 4.00 2746 - Ethereal 8.28 64-bit 2754 30.5 - 19.5 +23/=15/-12 61.00%
Jonny 4.00 2746 - Marvin 2.2.0 64-bit 2697 19.5 - 12.5 +15/=9/-8 60.94%
Jonny 4.00 2746 - ECE X3 64-bit 2656 20.5 - 11.5 +17/=7/-8 64.06%
Jonny 4.00 2746 - The Baron 3.41 64-bit 2823 5.5 - 24.5 +3/=5/-22 18.33%
Jonny 4.00 2746 - Devel 1.8090 2707 19.0 - 13.0 +16/=6/-10 59.38%
Jonny 4.00 2746 - Ethereal 8.37 64-bit 2825 12.5 - 19.5 +8/=9/-15 39.06%
Jonny 4.00 2746 - chess22k 1.6 64-bit 2829 10.5 - 21.5 +5/=11/-16 32.81%
Jonny 4.00 2746 - Defenchess (SCTR) 1.1e 64-bit 3035 7.0 - 25.0 +4/=6/-22 21.88%
Jonny 4.00 2746 - Scorpio 2.7.8 64-bit 2861 0.5 - 29.5 +0/=1/-29 1.67% XXX
Jonny 4.00 2746 - Tucano 7.00 64-bit 2871 3.5 - 25.5 +1/=5/-23 12.07% XXX
CCRL 40/4 2018
Jonny 4.00 2746 - Scorpio 2.7.9 64-bit 2883 14.0 - 42.0 +4/=20/-32 25.00%
Jonny 4.00 2746 - GreKo 2017 64-bit 2614 24.0 - 8.0 +21/=6/-5 75.00%
Jonny 4.00 2746 - Karballo 1.8 64-bit 2753 15.0 - 51.0 +5/=20/-41 22.73%
Jonny 4.00 2746 - Shield 2.1 64-bit 2735 18.5 - 12.5 +12/=13/-6 59.68%
Jonny 4.00 2746 - Marvin 3.0.0 64-bit 2706 19.0 - 14.0 +14/=10/-9 57.58%
Jonny 4.00 2746 - RuyDos 1.1.0 64-bit 2777 21.0 - 40.0 +15/=12/-34 34.43%
Jonny 4.00 2746 - Daydreamer 2.0.0-pre2 64-bit 2896 8.5 - 21.5 +4/=9/-17 28.33%
Jonny 4.00 2746 - Devel 2.0000 2713 32.5 - 29.5 +20/=25/-17 52.42%
Jonny 4.00 2746 - Godel 4.0.7 64-bit 2805 30.5 - 81.5 +20/=21/-71 27.23%
Jonny 4.00 2746 - Marvin 3.1.0 64-bit 2789 24.0 - 56.0 +17/=14/-49 30.00%
Jonny 4.00 2746 - Counter 2.9 64-bit 2715 26.0 - 43.0 +19/=14/-36 37.68%
Jonny 4.00 2746 - RubiChess 1.0 64-bit 2651 42.0 - 18.0 +34/=16/-10 70.00%
Jonny 4.00 2746 - Pirarucu 2.3.8 64-bit 2865 12.0 - 20.0 +7/=10/-15 37.50%
Jonny 4.00 2746 - RofChade 1.0 64-bit 2792 13.5 - 18.5 +9/=9/-14 42.19%
Jonny 4.00 2746 - GreKo 2018.08 64-bit 2717 19.0 - 14.0 +14/=10/-9 57.58%
Jonny 4.00 2746 - RubiChess 1.1 64-bit 2780 13.5 - 18.5 +9/=9/-14 42.19%
Jonny 4.00 2746 - Monolith 1.0 64-bit 2818 15.0 - 17.0 +9/=12/-11 46.88%
Jonny 4.00 2746 - Donna 4.1 64-bit 2700 36.0 - 26.0 +26/=20/-16 58.06%
Jonny 4.00 2746 - Marvin 3.2.0 64-bit 2826 10.0 - 22.0 +6/=8/-18 31.25%
Jonny 4.00 2746 - Cheese 2.0 64-bit 2765 14.5 - 17.5 +11/=7/-14 45.31%
Jonny 4.00 2746 - Winter 0.3 64-bit 2739 11.5 - 17.5 +8/=7/-14 39.66%
Jonny 4.00 2746 - RofChade 2.0 64-bit 3129 3.0 - 20.0 +0/=6/-17 13.04%
Jonny 4.00 2746 - Arminius 2018-12-23 64-bit 2760 10.0 - 18.0 +9/=2/-17 35.71%
CCRL 40/4 2019
Jonny 4.00 2746 - Winter 0.4a 64-bit 2811 10.0 - 22.0 +6/=8/-18 31.25%
Jonny 4.00 2746 - Topple 0.3.4 64-bit 2664 23.5 - 7.5 +20/=7/-4 75.81%
Jonny 4.00 2746 - Francesca MAD 0.21 64-bit 2709 1.5 - 37.5 +0/=3/-36 3.85% XXX
Jonny 4.00 2746 - Topple 0.3.5 64-bit 2701 15.0 - 15.0 +12/=6/-12 50.00%
Jonny 4.00 2746 - Gogobello 2.0 64-bit 2834 15.0 - 17.0 +11/=8/-13 46.88%
Jonny 4.00 2746 - Francesca MAD 0.22 64-bit 2730 14.5 - 20.5 +11/=7/-17 41.43%
Jonny 4.00 2746 - Igel 1.4 64-bit 2634 22.0 - 10.0 +19/=6/-7 68.75%
Jonny 4.00 2746 - Fridolin 3.10 64-bit 4CPU 2796 3.0 - 26.0 +0/=6/-23 10.34% XXX
Jonny 4.00 2746 - Topple 0.5.0 64-bit 2829 10.0 - 22.0 +5/=10/-17 31.25%
Jonny 4.00 2746 - Topple 0.5.0 64-bit 4CPU 2845 0.5 - 44.5 +0/=1/-44 1.11% XXX
Jonny 4.00 2746 - Minic 0.47 64-bit 4CPU 2880 5.0 - 42.0 +1/=8/-38 10.64% XXX
Jonny 4.00 2746 - Francesca MAD 0.23 64-bit 2771 16.5 - 15.5 +12/=9/-11 51.56%
Jonny 4.00 2746 - Tunguska 1.1 64-bit 2471 9.0 - 40.0 +5/=8/-36 18.37% XXX