The Gigatron revisited

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

The Gigatron revisited

Post by hgm »

Unfortunately I had little time recently to work on the home-built retro-computer kit designed by Marcel van Kervinck and friends, (see viewtopic.php?f=7&t=65905 ), both for building the actual hardware (I bought two kits), and for writing the Chess program for it. And it doesn't seem this will change before I return from the ICGA Computer Olympiad in July.

I did arrive at a final design for enhancing the instruction set, however. (The reason I bought two kits was that I wanted to build one orthodox one, and another one with modifications.) To recapitulate: the original Gigatron uses instruction encoding with 3 bits specifying the operation (LD, ADD, SUB, AND, OR, XOR, ST, branch), 2 bits specifying a data source (accumulator A, immediate data D, input port IN or memory). The remaining 3 bits formed a combined addressing mode / destination field, where destination A could be combined with 4 addressing modes ([D], [X], [Y:D], [Y:X]), destinations X or Y only with [D] (i.e. an address in the zero page specified by the instruction's 8-bit data field), and destination OUT (the output port) with modes [D] or [Y:X++] (post-incrementing X as a side effect). On ST (memory-store) instructions writing to the specified destinations A or OUT would be suppressed, and the data would only go to memory. Branches by definition have no destination register, and use the 3-bit mode/destination field for specifying the branch condition. (And the data source provides the target address when the branch is taken, always with addressing mode [D] when memory is specified.)

When I started to write programs for the Gigatron, it turned out that things I wanted to do frequently were only possible in a quite cumbersome way, saving and loading registers all the time. Of course I blamed the instruction set! In particular, having no 'indexed' mode when loading X or Y was an annoyance: you first have to load the data in A (which does support such a mode) and then transfer that to X (destroying the contents of A, which was of course exactly what you wanted to store at the address given by X). Implementing a 'stack frame' for local variables of recursive routines was also a pain; only Y was available (through mode [Y:D]) to access a variable specified by D for an instance specified by Y. And using Y as frame pointer leads to severe overloading of Y, as it also has to be used for indexing global arrays, and for specifying the high address of distant jumps. So you would have to reload Y nearly every other instruction.

The problem is that the mode [Y:D] is the only mode in the original Gigatron that can combine the instruction-specified constant D with a calculated address, while this is what you most often want to do. So it would be very helpful if there would also be a mode that made an address out of X and D, such as [X:D] or [D:X]. The latter could actually be used to replace the mode [X] = [0:X], which just would be the special case with D=0. As it happens, the high byte of the memory address in the Gigatron is obtained by routing Y through a 2-input multiplexer, but this multiplexer only ever selects Y, while the output enable of it is used to force the high byte to 0 in the case of modes [X] and [D]. The second multiplexer input is not connected, and could be wired to X or D. By properly operating the control inputs of this multiplexer, this would then make it possible to provide addressing modes [X:X] and [X:D], or [D:X] and [D:D] (only one of the two being useful in either case).

The remaining problem is how to fit the new addressing mode into the opcode map. I finally decided that [X:D] is the more useful of the two (and thus will connect the originally unused multiplexer input to X). This is not an upward-compatible extension of an existing mode, however. The existing instruction format can handle only 8 combinations of destination register and addressing mode, all used. And I would like to have more already-existing modes on destinations X and Y, as well as introducing this new addressing mode at least for one destination. So it is clear the encoding format will have to be changed, completely redefining the opcode map (and destroying all binary backward-compatibility).

So instead of using a 2-bit data-source field and a 3-bit combined destination/addressing field I will use a 2-bit destination field (A, X, Y or OUT) and a 3-bit combined source/addressing field. This is more natural, as the memory addressing is only relevant in case the memory is the data source (or in store instructions). So many combinations of data source and addressing mode would be intrinsically redundant. Ideally the 8 possible encodings could have specified A, D, IN and memory in 5 different modes (the 4 original modes plus the new one [X:D]), which can then be used on all destinations. This would lead to a rather complex instruction decoder, though. So I decided to abandon the use of IN as data source on all instructions except stores.

For the latter the situation is very different anyway: memory there never is a data source, but the addressing mode is always relevant, as the memory by definition is active for writing. Instructions that use D both as data source as for generating the address are nearly useless, though. But for the modes [X] and [Y:X] it would make sense to have them both for storing A and D. It must also be possible to store IN (or we could not access it at all), preferably in mode [D] (so that it does not need an index register). So most modes would be needed multiple times, for storing data from different sources.

It turns out the needs of store and other instructions can both be met reasonably well with the following encoding:

Code: Select all

code  address  src     bra      ALU
0      0:X      A       RAM      RAM
1      0:D      IN      RAM      RAM
2      0:X      D
3      0:D      A
4      Y:X      A       RAM      RAM
5      Y:D      IN      RAM      RAM
6      Y:X      D                RAM
7      X:D      A                RAM
As we see most of the addressing modes occur twice. Store instructions exploit that by using one instance of the mode for storing A, the other for storing D (or IN when storing D makes no sense because it was already used in the address). Other instructions eliminate the duplicats by simply ignoring the value read from memory, and selecting a register as data source. So the data-source selection by the decoder based on the two lowest bits in the source/mode field can be suppressed by disabling the decoder, enabling the RAM output instead for certain combinations of the upper two bits, dependent on instruction type. This is highly orthogonal, which allows for a simple instruction decoder.

There are a few irregularities: branch instructions have only 4 modes, and they should all have memory modes [D] and [X] rather than [Y:D] and [Y:X] so branch instructions will have to force the high address to 0 always, not only for codes 0-3. Selection of X as high address occurs only for src/mode = 7 (and possibly 3, but there it is invisible as the high address is forced to 0 anyway). This is OK for ALU instructions, which already have [Y:D] as case 5. But in store instructions case 5 stores IN rather than A, so storing A at [Y:D] would no longer be possible. Redefining the data source as A only for this case would require rather complex logic. But here the fact that store instructions suppress writing to A or OUT destinations comes to the rescue, as that means that plain store instructions occur twice in the opcode map anyway. So the versions with (dummy) destination OUT can be given slightly different addressing modes from those used with all other destinations. This involves using [Y:D] instead of [X:D] in case 7, and incrementing X in cases 4 and 6.

This would then create the following instruction set:
LDA (and other ALU operations) have modes A, D, [D], [X], [Y:D], [Y:X] and [X:D]. (The latter instead of IN.)
LDX, LDY (and other ALU operations) have modes A, D, [D], [X], [Y:D], [Y:X] and [X:D]. (The latter 4 new, but no IN.)
LDOUT (and other ALU operations) has modes A, D, [D], [X], [Y:D] and [Y:X++]. ([X] and [Y:D] new, but no IN.)
Branches have modes A, D, [D] and [X]. (The latter instead of IN; branching on IN was virtually useless.)
STA has modes [D], [X], [Y:D], [Y:X], [Y:X++], [X:D]. (The latter new.)
Storing immediate data has modes [X], [Y:X], [Y:X++]. (No [D] or [Y:D] anymore; these were virtually useless.)
STIN has modes [D] and [Y:D]. (No [X], [Y:X] or [Y:X++] anymore.)
STA which also loads X or Y has modes [D], [X], [X:D] and [Y:X]. (The latter 3 new.)
Storing immediate data, also loading it into X or Y has modes [X] and [Y:X]. (No [D] anymore; this was virtually useless.)

The only real loss is that IN no longer can be used as data source in ALU instructions. As a work-around you would have to store IN in the zero page (mode [D]), from where all instructions could then use it as operand. This requires one extra instruction/cycle for accessing IN. But how often would you want to poll the input? It seems a small price to pay for the availability of the new mode [X:D], and the ability to use all modes rather than just zero page on instructions with X and Y destination.