The Alto, Part 2:  Microcode

These two boards contain the logic for the Alto’s main processor

The Alto proposal was a bold one, with a short timeframe and lofty goals.  In a few short months, Chuck Thacker and the rest of the crew had designed and implemented a complete computer encapsulated in just a handful of circuit boards.  The processor consisted of two small boards (approximately 7″x10″) containing a mere 138 integrated circuits.

Peripheral controllers were similarly simple — single boards of the same size for each of the Display, Ethernet, and Disk.

How was such economy of hardware possible?  Microcode.

Microcode and you:

In many computers from the Alto’s era, the architecture of the processor was hard-wired into the circuitry — there was dedicated hardware to fetch, decode, and execute each instruction in the processor’s repertoire.  This meant that once designed and built, the computer’s instruction set could not be modified or extended, and any bugs in the hardware were very expensive to fix. It also meant that the hardware was more complicated — involving more components and longer development cycles.  As processors became more and more advanced these limitations became more pressing.

Microcode and micro-programming helped solve these problems by replacing large swaths of control logic with software.

So what *is* microcode?

Microcode is software.  It’s very low-level software, and it’s written in a language tailored to a very specific domain — the control of a specific set of hardware — but it’s software all the same.  Let’s look at a small snippet of a program from the highest level, down to a microcode representation.

At the highest level, we have human language:

      Add 5 to 6 and give me the result.

If you’re able to read this blog, you likely understand what that means.

In a somewhat-high-level programming language, like C, this translates roughly to:

      i = 5 + 6;

This isn’t too much different from the English statement above and if you’re familiar with programming or math at all, you can figure out that the above statement adds 5 to 6 and stores the result in the variable i.

In Nova assembly language, the above C program might look something like:

     FIVE:  5
     SIX:   6
     I:     0
     ADD:
            LDA 0,FIVE
            LDA 1,SIX
            ADD 1,0
            STA 0,I

Now we’re starting to get a little bit further from the source material.  This code consists of three memory locations (containing the operands and the result) and four instructions.  Each of these instructions performs a small part of the “i=5+6” operation.  “LDA” is Nova for “Load Accumulator”; the two LDA instructions load 5 into Accumulator 0 and 6 into Accumulator 1.

“ADD” adds Accumulator 1 to Accumulator 0 and stores the result in Accumulator 0.  (The similar-looking Nova mnemonic “ADC” means “Add Complement,” which is not what we want here.)  “STA” means “Store Accumulator” and puts the contents of Accumulator 0 (the result of 5+6) into a location in memory (designated as “I” in this example).

In Alto microcode, a direct analog to the Nova instruction sequence above could be implemented as:

       MAR← FIVE
       NOP
       T← MD

       MAR← SIX
       NOP
       L← MD+T

       MAR← I
       NOP
       MD← L

The number of instructions is larger (9 vs 4) and the microcode operates at a lower level than the equivalent Nova code.

In English:

  • The “MAR← FIVE” operation tells the Memory Address Register hardware to load a 16-bit address, in this case the address of FIVE. The subsequent “NOP” here (and after every MAR← operation) is necessary to give the memory hardware time to accept the memory request.  If it is omitted, unpredictable things will happen.
  • The “T← MD” operation puts the Memory Data (the contents of memory at the address loaded by the preceding MAR← operation) onto the ALU bus and loads it into the T register.
  • The “L← MD+T” instruction puts the memory data onto the ALU bus and tells the processor’s ALU to add the T register to it; finally, it causes the result to be stored in the L register.
  • The “MD← L” operation writes the contents of the L register to memory at the address loaded by the preceding “MAR← I” operation.
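
As a rough illustration, the nine micro-instructions above can be mimicked in a few lines of Python. This is a toy model under stated assumptions, not a description of the real hardware: the `ToyAlto` class, its method names, and the addresses picked for FIVE, SIX, and I are all invented for this sketch, and the NOP wait states appear only as comments.

```python
MASK16 = 0xFFFF  # the Alto works on 16-bit words


class ToyAlto:
    """Toy model of MAR, MD, T and L. All names here are illustrative."""

    def __init__(self, memory):
        self.memory = memory  # maps address -> 16-bit word
        self.mar = 0          # Memory Address Register
        self.t = 0            # T register (feeds the ALU)
        self.l = 0            # L register (receives the ALU result)

    def load_mar(self, address):
        self.mar = address & MASK16             # MAR← address

    def read_md(self):
        return self.memory[self.mar]            # ... ← MD

    def write_md(self, value):
        self.memory[self.mar] = value & MASK16  # MD← value


# Hypothetical addresses for the three memory locations.
FIVE, SIX, I = 0o100, 0o101, 0o102
cpu = ToyAlto({FIVE: 5, SIX: 6, I: 0})

cpu.load_mar(FIVE)                        # MAR← FIVE  (followed by NOP)
cpu.t = cpu.read_md()                     # T← MD
cpu.load_mar(SIX)                         # MAR← SIX   (followed by NOP)
cpu.l = (cpu.read_md() + cpu.t) & MASK16  # L← MD+T
cpu.load_mar(I)                           # MAR← I     (followed by NOP)
cpu.write_md(cpu.l)                       # MD← L

print(cpu.memory[I])  # 11
```

Each Python statement stands in for one micro-instruction, which is the point: the microcode moves data between individual registers one step at a time.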

What you may notice in the above examples is that the Nova machine language instructions tell the computer as a whole what to do (i.e. load a register, or perform an addition) whereas the Alto microcode instructions tell the computer how to go about doing it (i.e. “put this data on the memory bus,” or “tell the ALU to add this to that”).  The microcode directs the individual components of the processor (the memory bus, the ALU, the register file, and the shifter) to perform the right operations in the right order.

So what components is the Alto processor made of?  Here’s a handy diagram of the Alto CPU:

Block diagram of the Alto processor

The nexus of the processor is the Processor Bus, a 16-bit wide channel that most of the other components in the Alto connect to in order to transfer data.  The microcode controls which components read from and write to this channel, and at what time.

The R and S registers are 16-bit register sets; the R bank contains 32 16-bit registers, and the S register bank contains up to 8 sets of 32 16-bit registers.  These are general-purpose registers and are used to store data for any purpose by the microcode.

The L, M, and T registers are intermediate registers used to provide inputs and outputs to the ALU and the general-purpose registers.

The ALU (short for Arithmetic Logic Unit) hardware takes two 16-bit inputs (A and B), a Function (for example, ADD or XOR) and produces a 16-bit result.  The A input comes directly from the Processor Bus (and thus can operate on data from a variety of sources) and the B input comes from the T register.

The Shifter applies a shift or rotate operation to the output of the ALU before it gets written back to the R registers.
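
To make the data path concrete, here is a minimal Python sketch of the ALU and Shifter as described above. The function names, the handful of ALU functions shown, and the three shifter operation names are assumptions chosen for illustration; the real ALUF field selects among more functions than these.

```python
MASK16 = 0xFFFF  # results are truncated to 16 bits

# A few illustrative ALU functions (not the complete set).
ALU_FUNCTIONS = {
    "ADD": lambda a, b: a + b,
    "XOR": lambda a, b: a ^ b,
    "AND": lambda a, b: a & b,
}


def alu(function, bus, t):
    """A input comes from the Processor Bus, B input from the T register."""
    return ALU_FUNCTIONS[function](bus, t) & MASK16


def shifter(value, operation="NONE"):
    """Optionally shift or rotate the ALU output (operation names are hypothetical)."""
    if operation == "LEFT1":
        return (value << 1) & MASK16
    if operation == "RIGHT1":
        return value >> 1
    if operation == "ROTATE8":
        return ((value << 8) | (value >> 8)) & MASK16
    return value


result = alu("ADD", 5, 6)
print(result)                                # 11
print(shifter(result, "LEFT1"))              # 22
print(hex(shifter(0x12AB, "ROTATE8")))       # 0xab12
```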

The MAR (Memory Address Register) contains the 16-bit address for the current operation (read or write) to the Alto’s Main Memory.   Data is transferred to and from the specified address via the Memory Data Bus.

Each of the components above and their interactions with each other are carefully controlled by the microcode.  Each microcode instruction is a 32-bit word consisting of the following fields:

Alto microcode word format

RSEL selects the general-purpose Register to be read from or written to.

ALUF selects the Function the ALU should apply to its A and B inputs.

BS selects what source on the Processor Bus should be sent to the ALU.

F1 and F2 are Special Functions that control the shifter, memory, branching operations, and other operations.

T and L specify whether the output from the ALU should be written back to the T and L intermediate registers.

And the NEXT field specifies the address of the next Microinstruction.  Unlike most conventional machine language where by default each instruction follows the last, Alto microcode must explicitly specify the address of the next instruction to be executed.
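
For readers who think better in code, here is a hedged sketch of how such a word could be split into its fields. The field order follows the list above, but the bit widths (5, 4, 3, 4, 4, 1, 1, 10) are an assumption for this sketch, since the text here does not state them; check them against the hardware manual before relying on them. The `decode` and `encode` helpers are likewise invented for illustration.

```python
from collections import namedtuple

# (name, width in bits), most significant field first.
# The widths are an assumption for this sketch; they add up to 32.
FIELDS = [("RSEL", 5), ("ALUF", 4), ("BS", 3), ("F1", 4),
          ("F2", 4), ("T", 1), ("L", 1), ("NEXT", 10)]

MicroInstruction = namedtuple("MicroInstruction", [name for name, _ in FIELDS])


def decode(word):
    """Split a 32-bit microcode word into its named fields."""
    values = []
    shift = 32
    for _, width in FIELDS:
        shift -= width
        values.append((word >> shift) & ((1 << width) - 1))
    return MicroInstruction(*values)


def encode(instruction):
    """Pack the fields back into a 32-bit word (the inverse of decode)."""
    word = 0
    for (_, width), value in zip(FIELDS, instruction):
        word = (word << width) | (value & ((1 << width) - 1))
    return word


# An arbitrary example word: load L from the ALU, then continue at address 0o20.
example = MicroInstruction(RSEL=3, ALUF=0, BS=5, F1=0, F2=0, T=0, L=1, NEXT=0o20)
print(decode(encode(example)))
```

Note that NEXT is just another field of the word: an interpreter loop would fetch the instruction at NEXT rather than incrementing a counter.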

The Alto is similar to many other computers of its era in that it uses microcode to interpret and execute a higher-level instruction set — in this case an instruction set very similar to that of the Data General Nova minicomputer (which was popular at PARC at the time).  The Alto’s microcode was also directly involved in controlling hardware — more on this later.

This means the earlier microcode example is contrived — it shows what an addition might look like in microcode, but not what the execution of the equivalent Nova assembly code example actually entails.  So as to provide full disclosure, here’s the microcode instruction sequence the Alto goes through when executing a single Nova instruction, the “LDA 0,FIVE” instruction from the earlier code sequence:

       START:   T← MAR← PC+SKIP
       START1:  L← NWW, BUS=0
                :MAYBE, SH<0, L← 0+T+1
       MAYBE:   PC← L, L← T, :DOINT
       DIS0:    L← T← IR← MD
       DIS1:    T← ACSOURCE, :GETAD
       G1:      T← PC -1, :DOINS
                L← DISP + T, TASK, :SAVAD
       SAVAD:   SAD← L, :XCTAB
       XLDA:    MAR← SAD, :FINLOAD
       FINLOAD: NOP
       LOADX:   L← MD, TASK
       LOADD:   ACDEST← L, :START

Whew.  That’s 13 micro-instructions to execute a single Nova LDA instruction.  (Keep in mind that each micro-instruction executes in 170 nanoseconds, meaning that these 13 micro-instructions execute in about 2.2 microseconds, which is within spitting distance of the instruction time on a real Nova.)
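
The timing arithmetic is easy to check. A tiny sketch, using only the two figures quoted above (the variable names are mine):

```python
CYCLE_NS = 170           # time per micro-instruction, as quoted above
MICRO_INSTRUCTIONS = 13  # length of the sequence shown above

total_ns = CYCLE_NS * MICRO_INSTRUCTIONS
print(f"{total_ns} ns = {total_ns / 1000:.2f} microseconds")  # 2210 ns = 2.21 microseconds
```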

The above sequence takes the current instruction, decodes it as an LDA instruction, and performs the operations needed to execute it.  If it doesn’t make a whole lot of sense at first glance, this is entirely normal.  For the hardcore hackers among us: the above code sequence is taken directly from the microcode listing available on Bitsavers, and the hardware manual is available online as well.  If you want to explore it in depth, you can use the ContrAlto emulator to step through the code line by line and see how it all works.

The Alto used microcode to simplify implementation of its processor.  This was state-of-the-art at the time, but not altogether novel.  What was novel about the Alto was its use of the processor’s microcode engine to drive every part of the Alto, not just the CPU.  The display, disk, Ethernet, and even the refreshing of the Alto’s dynamic RAM were driven by microcode running on the Alto’s processor.  In essence, software replaced much of what would have been done in dedicated hardware on other computers.

The Alto was also unique in that it provided “CRAM” (Control RAM) – a microcode store that could be changed on the fly.  This allowed the processor’s behavior to be extended or modified at the whim of the programmer – while the Alto was running.

Tasks

To allow the devices in the Alto to share the Alto’s processor with minimal overhead, the Alto’s designers developed very simple cooperative task-switching hardware.

Conceptually, the processor is shared between sixteen microcode Tasks, with a priority assigned to each one:  Task 0 is the lowest priority task, and Task 15 the highest.  Each device in the Alto has one or more tasks dedicated to it.  To make task switching fast and cheap to implement in hardware, the only state saved by the task-switching hardware for each Task is that task’s program counter — the MPC (“Micro Program Counter”).

Only one task is running on the processor at a time, and at any time the microcode for a task may invoke a “task switch” function (named, strangely enough, TASK).  When a TASK instruction is executed, the processor selects the highest priority task that needs to run (is “Requesting Wakeup”) and switches to it by jumping to the instruction pointed to by that Task’s MPC.

The Emulator Task (Task 0) is always eligible for wakeup but runs at the lowest priority, so it can be interrupted at any time by hardware that needs attention.  As suggested by the name, this task contains microcode that “Emulates” the Nova instruction set: it fetches, interprets, and executes the instructions for the user software that runs on the Alto.
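
The task-switching rules described so far fit in a short Python sketch. This is a simplified model under stated assumptions, not the actual hardware logic: the `TaskSwitcher` class, its method names, and the microcode addresses used in the example are all hypothetical.

```python
EMULATOR_TASK = 0
NUM_TASKS = 16


class TaskSwitcher:
    """Toy model: the only state kept per task is its MPC."""

    def __init__(self):
        self.mpc = [0] * NUM_TASKS         # saved Micro Program Counter per task
        self.wakeup = [False] * NUM_TASKS  # which tasks are requesting wakeup
        self.wakeup[EMULATOR_TASK] = True  # the emulator is always eligible
        self.current = EMULATOR_TASK

    def request_wakeup(self, task):
        self.wakeup[task] = True

    def clear_wakeup(self, task):
        if task != EMULATOR_TASK:          # the emulator never stops requesting
            self.wakeup[task] = False

    def task_switch(self, next_address):
        """Model of TASK: save our MPC, then resume the highest-priority requester."""
        self.mpc[self.current] = next_address
        self.current = max(t for t in range(NUM_TASKS) if self.wakeup[t])
        return self.mpc[self.current]


switcher = TaskSwitcher()
switcher.mpc[9] = 0o400    # hypothetical microcode addresses
switcher.mpc[14] = 0o700
switcher.request_wakeup(9)
switcher.request_wakeup(14)

switcher.task_switch(0o21)     # emulator yields
print(switcher.current)        # 14
switcher.clear_wakeup(14)
switcher.task_switch(0o701)    # task 14 yields
print(switcher.current)        # 9
switcher.clear_wakeup(9)
print(oct(switcher.task_switch(0o401)), switcher.current)  # 0o21 0
```

Notice that nothing but the MPC is saved or restored in `task_switch`; any other state a task cares about is that task's own problem.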

The Disk Word Task (Task 14) is the highest priority task implemented in a standard Alto.  It needs to run at the highest priority because its main job is pulling in words of data off the Diablo 30 disk drive as they move under the read head.

The Display Word Task (Task 9) is responsible for picking up words out of the Alto’s memory and painting them on the display as the CRT’s electron beam moves across the screen.

Since the Alto’s Task system is cooperative (meaning that task switches only happen when microcode explicitly requests them) the microcode for each task must be careful to yield to other tasks every so often so that they don’t get starved of time.  If the Disk Word Task runs late, data from the disk is corrupted.  If the Display Word Task doesn’t get enough time, the display will flicker or show glitches.  And since the Alto hardware saves only the MPC for each task, each task’s microcode must make sure that any important state is saved somewhere (for example, in the 32 general-purpose R registers) before the task switch happens.

As you might imagine, this puts some interesting constraints on the microcode (and results in much hair-pulling on the part of the micro-coder).  But despite these limitations, and despite the extremely low-level details the microcode has to deal with, the wizards behind the Alto’s design managed to fit microcode implementing the basic Nova instruction set (with extensions for graphics operations), disk controller, Ethernet controller, display, and memory tasks into a 1,024-word microcode ROM.  With three words to spare.

The Task-based microcode architecture, in tandem with the writable microcode Control RAM (CRAM), made the Alto a very flexible computer: new hardware could be quickly implemented and added, and the microcode to drive it could easily be loaded and debugged.  Entirely new instruction sets were devised, experimented with, and revised quickly.  Applications could load custom microcode to accelerate graphics rendering.

In the next article, we will go into depth on the rest of the Alto’s hardware.