History of Memory Wall in Computer Chess?

Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: History of Memory Wall in Computer Chess?

Post by Vinvin »

Joost Buijs wrote: Thu Aug 13, 2020 8:34 am ...
Why the port of Sargon I to the PC is only 50 times faster is not very clear to me, if you look at clock-frequency alone the speed became at least 1000 times higher, and the IPC is much higher too. I would expect the port to run at least 1000 times faster, I wonder what the bottleneck is.
Are you talking about the port by Bill Forster? http://talkchess.com/forum3/viewtopic.p ... 25#p845725
Some tests show a speedup of about 6000x ...
Bill Forster wrote: Wed Jun 03, 2020 2:20 am ...6000 times faster is much better than I hoped for. Credit for Intel (and AMD I suppose), my understanding is that many of the Z80 instructions translate to old fashioned 16 bit 8086 instructions that are kept around only for backwards compatibility and are implemented in microcode rather than in hardware. Still pretty damn quick.
...
Joost Buijs
Posts: 1564
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: History of Memory Wall in Computer Chess?

Post by Joost Buijs »

I was only replying to the 50x figure that Kai mentioned; a 6000x speedup is something I understand much better.

But the question remains why it plays so weakly on modern hardware. Let's say the original on the Z80 reached ~1600 Elo. 6000x is more than 12 speed doublings, and each doubling is worth roughly 60 Elo, so I would expect it to reach at least 1600 + 720 = 2320 Elo, but it appears to be much weaker than this. Of course I would have to let it play a much larger number of games to get meaningful results; maybe it is not as bad as it looks.
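
For anyone who wants to play with the numbers, the estimate above can be sketched in a few lines of Python. The function name `expected_elo` is made up for illustration, and both the ~1600 Elo baseline and the 60 Elo per doubling are only the rough assumptions from this post, not measured values.

```python
import math

def expected_elo(base_elo, speedup, elo_per_doubling=60.0):
    """Estimate Elo after a speedup, assuming a fixed Elo gain per speed doubling.

    Both the baseline and the per-doubling gain are rough assumptions.
    """
    doublings = math.log2(speedup)
    return base_elo + doublings * elo_per_doubling

# 6000x is a bit more than 12 doublings (2**12 = 4096).
print(f"doublings for 6000x: {math.log2(6000):.2f}")

# Exactly 12 doublings (4096x) reproduces the 1600 + 720 = 2320 figure.
print(expected_elo(1600, 4096))

# Using the full 6000x gives a slightly higher estimate.
print(round(expected_elo(1600, 6000)))
```

The model is only as good as the fixed gain per doubling it assumes.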
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: History of Memory Wall in Computer Chess?

Post by Vinvin »

IIRC, Sargon I on the Z80 was around 1000 Elo.
Maybe doubling the speed isn't worth +60 Elo because of the poor (full-width) search ...
Joost Buijs
Posts: 1564
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: History of Memory Wall in Computer Chess?

Post by Joost Buijs »

I think it was much stronger than 1000 Elo, but that was 42 years ago and my memory is starting to let me down a bit, so I could be wrong.
In the past (with a full-width search) 1 extra ply of search depth was worth ~100 Elo and the branching factor was ~6, so a speed doubling bought well under half a ply. You are right, this explains it.
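
A quick sketch of that arithmetic, assuming the rough figures above (~100 Elo per ply, effective branching factor ~6) and a simple model in which each extra ply costs a constant factor more time. The function names are hypothetical.

```python
import math

def elo_per_doubling(elo_per_ply, branching_factor):
    """Elo gained per speed doubling when one extra ply costs
    `branching_factor` times more time (a simplifying assumption)."""
    doublings_per_ply = math.log2(branching_factor)
    return elo_per_ply / doublings_per_ply

def elo_gain(speedup, elo_per_ply, branching_factor):
    """Total Elo gain from a speedup under the same model."""
    extra_plies = math.log(speedup) / math.log(branching_factor)
    return extra_plies * elo_per_ply

# With ~100 Elo per ply and a branching factor of ~6, a doubling is
# worth noticeably less than the 60 Elo often quoted for modern engines.
print(f"{elo_per_doubling(100, 6):.1f} Elo per doubling")
print(f"{elo_gain(6000, 100, 6):.0f} Elo for a 6000x speedup")
```

Under these assumptions a doubling is worth somewhat under 40 Elo, and the gain shrinks further as the effective branching factor grows.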
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: History of Memory Wall in Computer Chess?

Post by Rebel »

That's how I remember it also.
90% of coding is debugging, the other 10% is writing bugs.
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: History of Memory Wall in Computer Chess?

Post by jdart »

I wrote a lot of Z80 assembler back in the day, including a word processing program that was sold commercially. That program could not fit in 64k RAM so it had to use disk overlays: the printing module and editing module were loaded into RAM as required and couldn't be present at the same time.

A little later I had a professional programming job that involved a fair amount of 80386 assembler. That program was so big it needed to run under a DOS extender. It was basically running a custom protected mode OS. The bootstrap code that brought up the environment was especially tricky, because it switched into protected mode and there was really no debugger that could trace it through that process.

Since memory got cheap and abundant, this sort of thing has gone the way of the dinosaurs.

--Jon
smatovic
Posts: 2658
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: History of Memory Wall in Computer Chess?

Post by smatovic »

Okay, so the 386/486/586 CPUs ran at higher clocks than main memory and had L1 and/or L2 cache of maybe 256KB on the motherboard, but what about the Motorola 68k series, for example?

In the Wikipedia specs for the last Amiga model, the A4000, I see a clock rate of 25 MHz for the 68040 but no L2 cache, so was the RAM here clocked in step with the CPU/MMU? What about higher-clocked 68ks or later models like the 68060? Did the RAM frequency catch up, or did they add L2 cache on the motherboards or accelerator boards?

--
Srdja
smatovic
Posts: 2658
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: History of Memory Wall in Computer Chess?

Post by smatovic »

Okay, after some web research I see that it is not that easy... I mixed up memory access latency (measured in ns) with the memory bus clock, and ignored CPU memory wait states and pipelines...


--
Srdja
Bill Forster
Posts: 76
Joined: Mon Sep 21, 2015 7:47 am
Location: New Zealand

Re: History of Memory Wall in Computer Chess?

Post by Bill Forster »

Coming to this late, but I can speak with authority about Sargon 1, having spent weeks immersed in its intricacies as I refined my port. It does run approximately 6000 times faster than on the original Z80 hardware. I was hoping that a massive speedup would turn a weak program into a strong one, but sadly it was not to be. About 1300 Elo seems to be the consensus. I think the main reason for this disappointing outcome is the full-width search. It sucks up all the extra capacity. Exponential growth is a harsh taskmaster. Looking at every move means each ply takes approximately 20 times longer. 20 x 20 x 20 = 8000, so 40 years of hardware progress buys just 3 ply (at least on Sargon 1). So a 3 or 4 ply search from 1978 becomes a 6 or 7 ply search in 2020. In most positions it doesn't make a huge difference. The SOMA algorithm protects Sargon from the grossest blunders even at 1 ply, and the complete lack of positional knowledge means extra plies provide diminishing returns unless there are tactics in the position. But not deep tactics! A 7 ply search might not be sufficient to see that it's worth pushing an unstoppable passed pawn in an ending.
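
The "20 x 20 x 20" reasoning generalises to any branching factor. Here is a minimal Python sketch, assuming every extra ply multiplies the search time by a constant factor (roughly 20 for a search that looks at every move). The function name `extra_plies` is made up for illustration and is not taken from the Sargon port.

```python
import math

def extra_plies(speedup, branching_factor):
    """Extra search depth bought by a speedup, assuming each additional
    ply costs `branching_factor` times more time than the previous one."""
    return math.log(speedup) / math.log(branching_factor)

# Compare a look-at-every-move search (~20) with smaller effective
# branching factors.
for b in (20, 6, 2):
    print(f"branching factor {b:>2}: {extra_plies(6000, b):.1f} extra plies")
```

At a branching factor of 20 the 6000x speedup comes out just short of 3 plies, which matches the estimate above.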
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: History of Memory Wall in Computer Chess?

Post by Vinvin »

smatovic wrote: Sun Aug 16, 2020 9:24 am Okay, so the 386/486/586 CPUs ran at higher clocks than main memory
IIRC, on the 486DX the CPU and the memory bus ran at the same speed. On the 486DX2 the CPU ran at double the speed of the memory bus.
On the 60 MHz and 66 MHz Pentiums the CPU and the memory bus also ran at the same speed. On the 90 MHz and 100 MHz Pentiums the CPU ran at 1.5 times the speed of the memory bus.

All the clock speeds and bus speeds are listed here:
https://en.wikipedia.org/wiki/Intel_80486
https://en.wikipedia.org/wiki/P5_(microarchitecture)