I missed this at the time:
lkaufman wrote: ↑
Fri Oct 16, 2020 9:35 pm
Have any rated human players played games against Sargon on a modern computer to provide any data on the human rating it might earn today? If so at what time control?
When Larry Kaufman takes an interest in your project you should definitely take note! So when I saw this around a month ago I decided to fill the gap myself by playing 3+2 Blitz and 15+5 Rapid matches against Sargon 1978 V1.0, to try to measure Sargon's "Human" Elo. Its computer Elo seems to be about 1300. I am a capable player, I even have a FIDE CM title, although I am getting old and it's some time since I was even 2000 FIDE. My current FIDE ratings are 1858 (Classical) and 1924 (Rapid). No Blitz rating unfortunately. I am definitely weak at Blitz; I maintain around 1700 on chess.com and 1900 on lichess.org. So let's estimate my Blitz rating at 1700. The results of the match were:
Blitz: Sargon scored 12/16; according to FIDE's calculator that translates to 1860 (vs a 1700).
Rapid: Sargon scored 4.5/10; according to FIDE's calculator that translates to 1888 (vs a 1924).
A PGN of the match is available here: https://triplehappy.com/downloads/Bill ... n 1978.pgn
I used this resource, https://www.sp-cc.de/unbalanced-human-openings.htm, to provide interesting starting positions.
Embarrassed by my own performance, I sat on the results, thinking I would try again: treat the first match as a training match, take it more seriously, force myself to play anti-computer tactics, skip the blitz and focus on rapid, etc. But who has time? So I decided to swallow my pride and publish the raw results, in the hope that Larry sees this and is still interested.
I had previously stated that I didn't think Sargon 1978 could beat me if I paid attention. That was based on innumerable "games" played while I wrestled the project into life. But in those games I never put myself on the clock, and I suppose I just abandoned them if my slow burning sacrificial attacks (normal fun approach) didn't draw the horizon effect slaughter I was going for. I didn't use an opening book then either for obvious reasons, and using a book in the match definitely helped Sargon as it is poor in the opening and this way it gets something to work with in the middle game (definitely its happy place).
Sargon 1978's positional wisdom is limited to "try to castle early" and "try to control squares". In the endgame it suffers horribly from the horizon effect, since its SOMA boost doesn't help with passed pawn advances. It also has a bad repetition issue (indeed like a scoundrel I got many draws in games I deserved to lose because of this). It would very rarely be able to convert even very advantageous endings: even the elementary Q+K v K mate is beyond it, because controlling squares misses the point and its unpruned search without a transposition cache is too shallow to brute force the mate unless the K is already cornered.
Despite this, Sargon 1978 running on a modern PC is capable of beating weak humans like me, especially at blitz, simply because it doesn't offer or miss simple tactical opportunities.
Some other things that came up, which I can answer authoritatively if required, were:
A) The hardware speedup.
lkaufman wrote: ↑
Sat Oct 17, 2020 12:21 am
mwyoung wrote: ↑
Fri Oct 16, 2020 11:47 pm
lkaufman wrote: ↑
Fri Oct 16, 2020 11:32 pm
OK, as best as I can figure out from all the posts, the elo rating of Sargon 1 in 1978 was about 1400, and its estimated elo vs. humans on an i7 now is around 1700. The hardware speedup is either 50 to 1 or 6000+ to 1 depending on which post you believe; if it is 50 to 1 this all makes sense, since 1700 elo is not too far out of line with 1254 CCRL Blitz if you contract rating differences by a third or so. I get 52knps on my very fast i7 which means about 40k on a typical one, so if there is some evidence that the original hardware got around 800 nps then everything fits. The 6000 to 1 figure is hard to credit, especially since it referred to using some old hardware, not an i7, so an i7 might be 10000 to 1 which means the original Sargon got 4 nodes per second?? That seems impossibly low.
I note that the 1700 elo mentioned was based on Shredder and SF versions set to 1600, but does anyone know whether those ratings were themselves based on human games or on games with CCRL rated engines?
You can estimate the speed up. The level times 1ply meaning level 1, 2 to 2..... level 6 for 6 ply took an average of 4 hours. Sargon did not have time controls only levels. Take a few positions and see how long your Sargon takes to search 6 ply. And compare that to 4 hours. This will give you a speed ratio compared to a TRS-80 computer.
I get anywhere from 3 to 36 seconds with depth set to 6, maybe 15 seconds or so average, which would give a 960 to 1 ratio. That implies that Sargon got about 50 nodes per second on the TRS-80. Was it really that slow, or could it perhaps be nonlinear, maybe the ratio is much lower for a 4 ply search than for a six ply search for example?
I estimated 2 or 3 orders of magnitude speedup, but measured a 6000 times speedup. This measurement was fairly robust and the details are up-thread. I am surprised you measure "only" a 1000 times speedup on your much faster computer, although all those thread-ripper threads will make no difference since this is strictly single threaded stuff. A possible source of trouble is the way I measure Nodes (I didn't give this much attention, I just count the position evaluations). I am willing to dig deeper if necessary to answer this more authoritatively.
(Actually I just did a different type of measurement: one of my regression test suites checks 33 positions in about 50 seconds. 20 of those are at depth 5 (the rest run essentially instantaneously), so this means an average of 2.5 seconds for a depth 5 calculation, compared to 40 minutes according to the TRS-80 version manual kindly provided by mwyoung upthread. 40 x 60 = 2400 seconds and 2400 / 2.5 = 960, so very close to a 1000 times speedup by this alternative approach too.)
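For anyone who wants to redo the arithmetic with their own timings, here is a minimal sketch. The figures are the ones quoted above (4 hours vs about 15 seconds at depth 6, 40 minutes vs 2.5 seconds at depth 5, and Larry's 52 knps); the function and variable names are just illustrative.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """Ratio of the original search time to the modern search time."""
    return old_seconds / new_seconds


# Depth 6: about 4 hours on the TRS-80 vs about 15 seconds on a modern PC.
depth6_ratio = speedup(4 * 3600, 15)

# Depth 5: 40 minutes (TRS-80 manual) vs 2.5 seconds (regression suite average).
depth5_ratio = speedup(40 * 60, 2.5)

# Implied TRS-80 speed if a modern PC manages 52,000 nodes per second.
implied_trs80_nps = 52_000 / depth6_ratio

print(f"depth 6 speedup: {depth6_ratio:.0f}x")
print(f"depth 5 speedup: {depth5_ratio:.0f}x")
print(f"implied TRS-80 speed: {implied_trs80_nps:.0f} nodes per second")
```

Both measurements land on the same ratio, which is where the "about 50 nodes per second" estimate for the TRS-80 comes from.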
B) The repetition problem.
lkaufman wrote: ↑
Sat Oct 17, 2020 2:53 am
It seems that its blindness to repetition draws makes any ratings or results it obtains rather questionable; probably it would have a significantly higher CCRL blitz rating with this one bug fixed. I imagine it was fixed with a later Sargon version, although I don't know this. I don't think of this as a question of evaluation, it is just not knowing the rules of the game.
It's true Sargon 1978 doesn't know anything about this rule; I think this was entirely normal at the time. Later versions of Sargon undoubtedly addressed this problem, as that became normal practice. I thought that this problem was of about the same magnitude as Sargon's inability to complete elementary K+Q (or K+R) v K mates. For my match I planned to use these holes to seek shelter and get undeserved draws if necessary. The elementary mates problem never arose, but the repetition problem did keep coming up (and I used it shamelessly).
Conveniently, I can fix the repetition problem in the UCI wrapper, without altering the core Sargon code at all. I can create a new API call, basically "Calculate best move, but this list of moves (which create repetition) is off limits". Then in pseudo code:
Calculate best move
If Sargon is better and best move repeats
    Calculate best move excluding repetition moves
    If Sargon is still better
        use new best move
    else
        use the original best move
If anyone shows any interest (maybe even if nobody does), I shall go ahead and implement this, resulting in a new Sargon 1978 V1.01 to replace the existing Sargon 1978 V1.00.
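To make the pseudo code above concrete, here is a rough Python sketch of the wrapper logic. It is not the actual wrapper code: the `search` callable, its `(move, score)` return value, the "score > 0 means Sargon is better" convention and the `make_table_search` helper are all assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

# Hypothetical search interface (an assumption, not the real Sargon API):
# takes the set of forbidden moves and returns (best_move, score), where
# score > 0 means Sargon thinks it is better.
Search = Callable[[FrozenSet[str]], Tuple[Optional[str], float]]


def choose_move(search: Search, repetition_moves: FrozenSet[str]) -> Optional[str]:
    """Avoid a repetition only while Sargon stays better without it."""
    best, score = search(frozenset())
    if score > 0 and best in repetition_moves:
        # Search again with the repeating moves off limits.
        alt, alt_score = search(repetition_moves)
        if alt is not None and alt_score > 0:
            return alt
    # Otherwise keep the original choice (repeating is fine when worse).
    return best


def make_table_search(scores: Dict[str, float]) -> Search:
    """Stand-in 'engine' for the demo: picks the best move from a fixed table."""
    def search(excluded: FrozenSet[str]) -> Tuple[Optional[str], float]:
        legal = {m: s for m, s in scores.items() if m not in excluded}
        if not legal:
            return None, 0.0
        move = max(legal, key=legal.get)
        return move, legal[move]
    return search


# Sargon is better either way, so it sidesteps the repeating move.
print(choose_move(make_table_search({"Qh5": 3.0, "Qd1": 2.5}), frozenset({"Qh5"})))
# Avoiding the repetition would throw away the advantage, so it repeats.
print(choose_move(make_table_search({"Qh5": 3.0, "Qd1": -1.0}), frozenset({"Qh5"})))
```

The design point is that the second search is only triggered when Sargon is ahead and about to repeat, so the extra cost is paid rarely and the core engine stays untouched.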