A Journey Into the Ether: Debugging Star Microcode

Back in January I unleashed my latest emulation project Darkstar upon the world. At that time I knew it still had a few areas that needed more refinement, and a few areas that were very rough around the edges. The Star’s Ethernet controller fell into that latter category: No detailed documentation for the Ethernet controller has been unearthed, so my emulated version of it was based on a reading of the schematics and diagnostic microcode listings, along with a bit of guesswork.

Needless to say, it didn’t really work: The Ethernet controller could transmit packets just fine but it wasn’t very good at receiving them. I opted to release V1.0 of Darkstar despite this deficiency — while networking was an important part of Xerox’s computing legacy, there were still many interesting things that could be done with the emulator without it. I’d get the release out the door, take a short break, and then get back to debugging.

Turns out the break wasn’t exactly short — sometimes you get distracted by other shiny projects — but a couple of weeks back I finally got back to working on Darkstar and I started with an investigation of the Receiver end of the Ethernet interface — where were things going wrong?

The first thing I needed to do was come up with some way to see what was actually being received by the Star, at the macrocode level. While I lack sources for the Interlisp-D Ethernet microcode, I could see it running in Darkstar’s debugger, and it seemed to be picking up incoming packets, reading in the words of data from these packets and then finally shuffling them off to the main memory. From this point things got very opaque — what was the software (in this case the operating system itself) doing with that data, and why was it apparently not happy with it?

The trickiest part here was finding diagnostic software to run on the Star that could show me the raw Ethernet data being received, and after a long search through available Viewpoint, XDE, and Interlisp-D tools and finding nothing that met my needs I decided to write my own in Interlisp-D. The choice to use Interlisp-D was mainly due to the current lack of XDE compilers, but also because the Interlisp-D documentation covered exactly what I needed to accomplish, using the ETHERRECORDS library. I wrote some quick and dirty code to print out the contents of any packets coming in, and got… crickets. Nothing. NIL, as the Lisp folks say.


So I went back and watched the microcode read a packet in. It was indeed pulling in data, but upon closer inspection it was discarding the packet after the first few words. The microcode was comparing the packet’s Destination MAC address (which begins each Ethernet packet’s header) against the Star’s own MAC address and concluding that the packet in question wasn’t addressed to it. This is reasonable behavior, but the packets it was receiving from my test harness were all Broadcast packets, which use a destination address of ff:ff:ff:ff:ff:ff and which are, by definition, destined for all machines on the network. That is when I finally noticed that hey wait a minute… the words the microcode is reading in for the destination address aren’t all FF’s as they should be… and then I slapped my forehead when I saw what I had done:


I’d accidentally used the “PayloadData” field (which contains just the actual data in the packet) rather than the “Data” field (which contains the full packet including the Ethernet header). So the microcode was never seeing Ethernet headers at all, instead it was trying to interpret packet data as the header!
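To make the mix-up concrete, here is a minimal Python sketch. It is not Darkstar's code; it only assumes the usual Ethernet layout of a 6-byte destination, a 6-byte source and a 2-byte type ahead of the payload. The helper names `build_frame` and `destination_of`, the source address and the type value are all made up for illustration.

```python
BROADCAST = bytes([0xFF] * 6)

def build_frame(dest: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Assemble a full frame: destination, source, type, then payload."""
    return dest + src + ethertype.to_bytes(2, "big") + payload

def destination_of(buffer: bytes) -> bytes:
    """What a receiver treats as the destination: the first six bytes it is handed."""
    return buffer[:6]

payload = b"hello, star!"
# Source address and type value are arbitrary placeholders.
frame = build_frame(BROADCAST, bytes.fromhex("0000aa001234"), 0x0800, payload)

# Handing over the whole frame (the "Data" field): the header is where it should be.
full_frame_matches = destination_of(frame) == BROADCAST
# Handing over only the payload ("PayloadData"): payload bytes pose as the header.
payload_matches = destination_of(payload) == BROADCAST
```

With the payload alone, the first six bytes are just whatever the packet happened to carry, so a broadcast check on them will almost never succeed.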

I fixed that and things were looking much, much better. I was able to configure TCP/IP on Interlisp-D and connect to a UNIX host and things were generally working, except when they weren’t. On rare occasions the Star would drop a single word (two bytes) from an incoming packet with no fanfare or errors:

The case of the missing words. Note the occasional loss of two characters in the above directory listing.

This was puzzling to say the least. After some investigation it became clear that the lost word was randomly positioned within the packet; it wasn’t lost at the beginning or end of the packet due to an off-by-one error or something not getting reset between packets. Further investigation indicated that without fail, the microcode was reading in each word from the packet via the ←EIData function (which reads the next incoming word from the Ethernet controller and puts it on the Central Processor’s X Bus). On the surface it looked like the microcode was reading each word in properly… but then why was one random word getting lost?

It was time to take a good close look at the microcode. I lack source code for the Interlisp-D Ethernet microcode but my hunch was that it would be pretty similar to that used in Pilot since no one in their right mind rewrites microcode unless they absolutely have to. I have some snippets of Pilot microcode, fortunately, and as luck would have it the important portions of it matched up with what Interlisp was using, notably the below loop:

{main input loop}
EInLoop: MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong], c1;
MDR ← EIData, DISP4[ERead, 0C], c2;
ERead: EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];
E ← uESize, GOTO[EReadEnd], c3, at[0D,10,ERead];
E ← EIData, uETemp2 ← EE, GOTO[ERCross], c3, at[0E,10,ERead];
E ← EIData, uETemp2 ← EE, L6←L6.ERCrossEnd, GOTO[ERCross], c3, at[0F,10,ERead];

The code starting with the label EInLoop (helpfully labeled “main input loop”) loads the Memory Address Register (MAR) with the address where the next word from the Ethernet packet will be stored; and the following line invokes ←EIData to read the word in and write it to memory via the Memory Data Register (MDR). The next instruction then decrements a word counter in a register named EE and loops back to EInLoop (“GOTO[EInLoop]”). (If this word counter underflows then the packet is too large for the microcode to handle and is abandoned.)

An important diversion is in order to discuss how branches work in Star microcode. By default, each microinstruction has an INIA (Initial Next Instruction Address) field that tells the processor where to find the next instruction to be executed. Microinstructions need not be ordered sequentially in memory, and in fact, generally are not (this makes looking at a raw dump of microcode highly entertaining). At the end of every instruction, the processor looks at the INIA field and jumps to that address.

To enable conditional jumps, a microinstruction can specify one of several types of Branches or Dispatches. These cause the processor to modify the INIA of the next instruction by OR’ing in one or more bits based on a condition or status present during the current instruction. (This is then referred to as NIA, for Next Instruction Address). For example, the aforementioned word counter underflow is checked by the line:

ERead:    EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];

The EE register is decremented by 1 and the ZeroBr field specifies a branch if the result of that operation was zero. If that was the case, then the INIA of the next instruction (at EInLoop) is modified — ZeroBr will OR a “1” into it.

EInLoop:    MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong],    c1;

This branch is expressed by the BRANCH[$,EITooLong] assembler macro, which names the two possible destinations of the branch. The dollar sign ($) indicates that in the case of no branch, the next sequential instruction should be executed, and that this instruction needs no special address. In the case of a branch (indicating underflow) the processor will jump to EITooLong instead.
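The OR-into-the-address idea can be sketched in a few lines of Python. This is a toy model of the mechanism as described here, not a cycle-accurate one: the two addresses are invented, and the 16-bit register width is an assumption.

```python
def next_address(inia: int, condition_bits: int) -> int:
    """Form the NIA by ORing branch condition bits into the INIA."""
    return inia | condition_bits

def zero_br(alu_result: int) -> int:
    """ZeroBr contributes a 1 when the ALU result was zero, otherwise 0."""
    return 1 if alu_result == 0 else 0

# Hypothetical addresses: the two targets of a branch differ only in the
# bit that the branch ORs in.
NO_BRANCH_TARGET = 0x40
BRANCH_TARGET = 0x41      # stands in for EITooLong

ee = 1
ee = (ee - 1) & 0xFFFF    # EE <- EE - 1 (16-bit register assumed); result is zero
nia = next_address(NO_BRANCH_TARGET, zero_br(ee))
```

Because the branch can only set bits, never clear them, the "not taken" target has to sit at an address whose low bit is 0 and the "taken" target at the same address with that bit set.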

Clear as mud? Good! So how does this loop exit under normal conditions? In the microcode instruction at EInLoop there is the clause EtherDisp. This causes a microcode dispatch — a multi-way jump — based on two status bits from the Ethernet controller. The least-significant bit in this status is the Attn bit, used to indicate that the Ethernet controller has something to report: A completed packet, a hardware error, etc. The other bit is always zero if the Ethernet controller is installed. (If it’s not present, the bit is always 1).

Just like a conditional branch, a dispatch modifies the INIA of the next instruction by ORing those status bits in to form the final NIA. The instruction following EInLoop is:

MDR ← EIData, DISP4[ERead, 0C],    c2;

The important part to us right now is the DISP4 assembler macro: this sets up a dispatch table starting with the label ERead which it places at address 0x0C (binary: 1100). Note how the lower two bits in this address are clear, to allow branches and dispatches to OR modified bits in. In the case where EtherDisp specifies no special conditions (all bits zero) the INIA of this instruction is unmodified and left as 0x0C and the loop continues. In the case of a normal packet completion, EtherDisp will indicate that the Attn bit is set, ORing in 1, resulting in an NIA of 0x0D (binary: 1101).
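A dispatch is the same trick with more than one bit. The Python sketch below models only the two cases just described. The table base comes from the DISP4[ERead, 0C] clause and the targets from the listing; the function names are hypothetical.

```python
EREAD_BASE = 0x0C   # from DISP4[ERead, 0C]; low two bits are clear

# The two entries discussed so far, with targets taken from the listing.
EREAD_TABLE = {
    0x0C: "GOTO[EInLoop]",    # keep reading words
    0x0D: "GOTO[EReadEnd]",   # Attn set: packet complete
}

def ether_disp(attn: bool) -> int:
    """Status bits ORed in by EtherDisp. Only Attn (bit 0) is modeled;
    the other bit reads as zero when a controller is installed."""
    return 1 if attn else 0

def dispatch(base: int, bits: int) -> int:
    """Form the NIA by ORing the dispatch bits into the table base."""
    return base | bits

still_reading = dispatch(EREAD_BASE, ether_disp(attn=False))
packet_done = dispatch(EREAD_BASE, ether_disp(attn=True))
```

Placing the table at an address with clear low bits is what makes this work: every combination of ORed-in bits lands on its own table entry.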

This all looked pretty straightforward and I didn’t see any obvious way a single word could get lost here, so I looked at the other ways this loop could be exited — how do we get to the instruction at 0x0E (binary: 1110) from the dispatch caused by EtherDisp? At first this left me scratching my head — as mentioned earlier, the second bit masked in by EtherDisp is always zero! The clue is in what the instruction at 0x0E does: it jumps to a Page Cross handler for the routine.

This of course requires another brief (not so brief?) diversion into Central Processor minutiae. The Star’s Central Processor contains a simple mechanism for providing virtual memory via a Page Map, which maps virtual addresses to physical addresses. Each page is 256 words in size, and the CP has special safeguards in place to trap memory accesses that might cross a page boundary both to prevent illegal memory accesses and so the map can be maintained. In particular, any microinstruction that loads MAR via an ALU operation that causes a carry out of the low 8 bits (i.e. calculating an address that crosses a 256-word boundary) results in any memory access in the following instruction being aborted and a PageCross branch being taken. This allows the microcode to deal with Page Map-related activities (update access bits or cause a page fault, for example) before resuming the aborted memory access.
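The page-cross condition boils down to a carry out of the low 8 bits of the address. A minimal Python sketch of that check follows; `crosses_page` is a hypothetical name and models only the carry test, not the aborted access or the map update.

```python
PAGE_WORDS = 256   # page size in words, as stated above

def crosses_page(address: int, increment: int = 1) -> bool:
    """True when adding `increment` carries out of the low 8 bits of `address`,
    meaning the computed address lands on a different 256-word page."""
    return (address & 0xFF) + increment > 0xFF

last_word_of_page = crosses_page(0x12FF)   # 0x12FF + 1 = 0x1300, new page
middle_of_page = crosses_page(0x1280)      # 0x1280 + 1 stays in the same page
```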

Whew. So, in the case of the code in question:

{main input loop}
EInLoop: MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong], c1;
MDR ← EIData, DISP4[ERead, 0C], c2;
ERead: EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];
E ← uESize, GOTO[EReadEnd], c3, at[0D,10,ERead];
E ← EIData, uETemp2 ← EE, GOTO[ERCross], c3, at[0E,10,ERead];
E ← EIData, uETemp2 ← EE, L6←L6.ERCrossEnd, GOTO[ERCross], c3, at[0F,10,ERead];

Imagine (if you will) that register E (the Ethernet controller microcode gets two whole CPU registers of its very own and their names are E and EE) contains 0xFF (255) and the processor is running the instruction at EInLoop. The ALU adds 1 to it, resulting in 0x100. This is a carry out of the low 8 bits, so a PageCross branch is forced during the next instruction. A PageCross branch will OR a “2” into the INIA of the next instruction.

The next instruction attempts to store the next word from the Ethernet’s input FIFO into memory via the MDR←EIData operation but this store is aborted due to the Page Cross caused during the last instruction.  And at last, a 2 is ORed into INIA, causing a dispatch to 0x0E (binary: 1110).  So in answer to our (now much earlier) question:  The routine at 0x0E is invoked when a Page Cross occurs while reading in an Ethernet packet.  (How the code gets to the routine at 0x0F is left as an exercise to the reader.)

And as it turns out, it’s the instruction at 0x0E that’s triggering the bug in my emulated Ethernet controller. 

E ← EIData, uETemp2 ← EE, GOTO[ERCross],    c3, at[0E,10,ERead];

Note the E←EIData operation being invoked: it’s reading in the word from the Ethernet controller for a second time during this turn through the loop, and remember that the first time it did this, it threw the result away since the MDR← operation was canceled. This second read is done with the intent to store the abandoned word away (in register E) until the Map operation is completed.

So what’s the issue here?  On the real hardware, those two ←EIData operations return the same data word rather than reading the next word from the input packet.  This is in fact one of the more clearly spelled-out details in the Ethernet schematic — it even explains why it’s happening! — one that I completely, entirely missed when writing the emulation:

Seems pretty clear to me…

Microinstructions in the Star’s Central Processor are grouped into clicks of three instructions each; a click’s worth of instructions execute atomically — they cannot be interrupted.  Each instruction in a click executes in a single cycle, referred to as Cycle 1, Cycle 2, and Cycle 3 (or c1, c2, and c3 for short).  You can see these cycles notated in the microcode snippet above.  Some microcode functions behave differently depending on what cycle they fall on.  ←EIData only loads in the next word from the Ethernet FIFO when executed during a c2; an ←EIData during c1 or c3 returns the last word loaded.  I had missed this detail, and as a result, my emulation caused any invocation of ←EIData to pull the next word from the FIFO.  As demonstrated above this nearly works, but causes a single word to be lost when a packet read crosses a page boundary.
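The effect of that one missed detail can be reproduced with a toy model in Python. This is a deliberately simplified sketch, not Darkstar's implementation: the class and function names are hypothetical, the map update is skipped, and the loop ends when the FIFO runs dry rather than on Attn.

```python
from collections import deque

class EtherFifo:
    """Toy input FIFO. With advance_only_on_c2=True it follows the hardware
    behaviour described above; with False, every read pulls a new word."""
    def __init__(self, words, advance_only_on_c2):
        self.words = deque(words)
        self.advance_only_on_c2 = advance_only_on_c2
        self.last = 0

    def ei_data(self, cycle):
        """Model of an EIData read issued in cycle 1, 2 or 3."""
        advance = cycle == 2 or not self.advance_only_on_c2
        if advance and self.words:
            self.last = self.words.popleft()
        return self.last

def receive(words, first_address, advance_only_on_c2):
    """Run a simplified input loop and return the words that reach memory."""
    fifo = EtherFifo(words, advance_only_on_c2)
    stored = []
    address = first_address - 1          # E is incremented before each store
    while fifo.words:
        # c1: MAR <- E <- E + 1; a carry out of the low 8 bits is a page cross
        page_cross = (address & 0xFF) == 0xFF
        address += 1
        # c2: MDR <- EIData; after a page cross this store is aborted
        word = fifo.ei_data(cycle=2)
        if page_cross:
            # c3 at 0x0E: E <- EIData re-reads the word so it can be
            # stored once the page map has been dealt with
            word = fifo.ei_data(cycle=3)
        stored.append(word)
    return stored

packet = list(range(1, 9))
fixed = receive(packet, 0xFD, advance_only_on_c2=True)
buggy = receive(packet, 0xFD, advance_only_on_c2=False)
```

Starting at address 0xFD, the fourth word is the one whose store crosses the page boundary. In this model the c2-only version delivers all eight words, while the version that advances on every read skips exactly that fourth word.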

I fixed the ←EIData issue in Darkstar and at long last, Ethernet is working properly.  I was even able to connect to one of the machines here at the museum:

The release on GitHub has been updated; grab a copy and let me know how it works for you!

If you’re interested in learning more about how the Star works at the microcode level, the Hardware Reference and Microcode Reference are a good starting point. Or drop me a line!