Unix Version 0 on the PDP-7 at LCM+L

In February 2016, a wonderful piece of news came to the attention of the international vintage computing community: the source code of the original implementation of the Unix operating system, written for the DEC PDP-7 computer, had come to light, in the form of listings for the kernel and several user programs (including the editor and the assembler). The announcement came from Warren Toomey, founder of The Unix Heritage Society (TUHS) in Australia.

Warren asked if we might be interested in participating in the recovery of this historic software, and perhaps run it on the PDP-7 here.1 We were very interested, and Josh Dersch began thinking about how to do this without a disk drive, since our PDP-7 did not have one.

Two months later, Fred Yearian visited the museum and told us that he had a PDP-7 in the basement of his house. Fred, a retired Boeing engineer, had acquired the system many years earlier from Boeing Surplus, and kept it in semi-running condition. In 2018, Fred made arrangements to donate his PDP-7 to the museum, and it was moved to the computer room on the third floor, where Fred worked with Jeff Kaylin to put it into good running order.

This PDP-7 included a non-DEC interface which was installed by Boeing, and which was apparently a controller for a magnetic tape drive. Jeff and Fred traced the circuitry for the interface, and Jeff created schematics for it.

Fred had a utility program for the PDP-7 which was written as a set of binary numbers (represented in octal, base 8) in an assembly language program for a Varian minicomputer. He translated the binary representations of the PDP-7 instructions into an assembly language program for the PDP-7, which Rich Alderson assembled using a program for Windows originally created for the family of simulation programs for DEC’s 18-bit computers.2 Fred added new features as the restoration of his PDP-7 progressed and new needs were recognized.

Meanwhile, enthusiasts had added features to the SimH PDP-7 simulator such as a simulated disk of the kind used on the PDP-7 at Bell Labs where Unix was created, which allowed the operating system to be run under a PDP-7 simulation. Some programs were missing from the source listings provided to Warren Toomey, including the shell command processor, and these had to be recreated from scratch based on early documentation and programming notes.

As part of our Unix@50 programming, celebrating the 50th anniversary of the Unix operating system, we moved Fred’s PDP-7 from the third floor computer room to the second floor exhibit hall in June 2019. It formed one of the anchors of a private event hosted by https://SDF.org for attendees of the Usenix Technical Conference in July.

Following the move to the exhibit hall, Fred and Jeff continued the restoration in view of the public, answering questions as they worked. Jeff also designed a device, dubbed the JK09,3 which looks to the PDP-7 like a kind of disk connected to the interface installed by Boeing. Once that was debugged, it was time for the software to be revisited.

Josh Dersch took the lead, copying the Unix Version 0 sources and modified SimH PDP-7 simulator from GitHub, then modifying that to use the JK09 device instead of the RB09 simulated by SimH. Once he had Version 0 running under SimH, he created a disk image which was loaded into the JK09 attached to the PDP-7, and Unix was booted on the system.

IMLAC PDS-1 Power Supply

In early July of 2019, the power supply of one of our rarest and most iconic machines started to fail. This is the IMLAC PDS-1, originally produced from 1970 to 1972. Despite the efforts of our staff to troubleshoot and replace components, we were soon left with a completely failed power supply.

As is typical in these situations, we set about doing an engineering evaluation toward designing a form, fit, and function replacement. The photos below show the power supply system in its original form.

IMLAC Power Supply – Power Input, Rectifier, and Filtering (left chassis)

IMLAC Power Supply – Regulator Chassis (right chassis)

Accessing the schematics, we found what voltages and currents the power supply system had to provide. Next, using the power supply components we have available for this purpose, we had to design a system which fits in the chassis and interfaces with the control and power signals the computer needs to run.

Surprise! The Power Supply Generates a Non-DC Timing Signal

The schematic below shows a section which we couldn’t figure out initially. At first glance, the collection of four diodes on the left looks like a bridge rectifier. On closer examination, though, the anodes and cathodes are not hooked up like a bridge rectifier. What we have here instead is a frequency doubler, which generates a 120 Hz signal from the 60 Hz power line (one pulse per half cycle of the line); that signal is used as a periodic interrupt for the video display logic. Not at all expected.

Original IMLAC schematic showing 120 Hz sync signal generator (frequency doubler)

Below is our replication of this circuit. We used a miniature 120 VAC to 10 VAC transformer.

Schematic for 120 Hz sync signal generator

After the surprise above, we set about removing the original power supply components and installing a new configurable supply along with the necessary modifications to the internal wiring harness.

Below are two images showing the right and left power supply chassis as modified.

IMLAC Power Supply – After modification, the Power Input, Rectifier, and Filtering are no longer needed (left chassis)
The regulator hardware has been replaced by a configurable power supply (on the right) and the 120 Hz signal generator on the left (orange-colored board with caution label).

We completed and tested the above modifications in about 1 1/2 weeks. A week after the unit was put back in service, a third power supply (located in the control console) also failed. We replaced it with another configurable supply as shown below.

Control Console rear showing replacement supply. This powers the control console LEDs.

The system has now been running since late July without incident.

This is a good example of the type of work we have been performing for the last 15 years.

RS232 In to CMOS

Schematic

We often use RS232 serial communication in our projects. The USB converters are very convenient. In the transmit direction, these converters accept CMOS levels (0 to 3.3 V) directly. In the receive direction, the ones we use drive about six and a half volts positive and negative. To interface with 3.3 V CMOS logic, I have created a circuit that drops the positive 6.5 V down to a little under 3 V and completely blocks any negative voltage. It works well at 9600 baud.

Update: I tried different baud rates. The maximum normal rate is 921600. The waveform received at 460800 did not have sharp rise and fall times, and was not symmetrical. At 230400, the rise and fall times made less difference and the symmetry was closer. But I wondered if it was my circuit introducing the difference in rise and fall times, as 2.2K resistors do not pull that hard. Nope, it is actually the USB converter that has the bit jitter. The differing rise and fall times are the converter's doing as well; its method of creating the negative voltage must have less oomph. I measured bit times of 4.48 µs for the first bit, then 4.28, 4.2, and 4.32 µs for the next bits. I ran a test overnight and had no communication errors at 230400. That is 24 times faster than 9600.
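For a quick sanity check of those numbers, here is the arithmetic in a few lines of Python (just an illustration, not part of the test setup):

# Nominal bit time at 230400 baud versus the bit times measured above.
nominal_baud = 230_400
nominal_bit_us = 1_000_000 / nominal_baud            # about 4.34 us per bit
for i, t in enumerate([4.48, 4.28, 4.2, 4.32], 1):   # measured values from above
    print(f"bit {i}: {t:.2f} us ({(t - nominal_bit_us) / nominal_bit_us:+.1%})")
print(230_400 // 9_600, "times faster than 9600 baud")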

Add Memory to an FPGA board

DRAM memory
DRAM memory board

Xilinx FPGA chips have limited memory available. For those projects requiring more memory, I have created a daughter board. This board may have up to 4 memory chips, each providing 16 Mbits of memory (4M x 4).

The MESA boards have 24 signals on each of their connectors. Finding memory requiring 24 or fewer signals was the trick. The 4M x 4 configuration has 19 signals: 11 address, 4 data, RAS, CAS, WE, and OE. I select one of the 4 memory chips by giving each its own CAS.
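As a sanity check on the signal counting and capacity, here is the arithmetic spelled out (plain Python, nothing board-specific; just an illustration):

# One 4M x 4 DRAM: 11 multiplexed address lines, 4 data lines, RAS, CAS, WE, OE.
signals_per_chip = 11 + 4 + 4          # 19 signals, comfortably under 24
locations = 2 ** (11 + 11)             # row and column share the 11 address lines: 4M
bits_per_chip = locations * 4          # 16 Mbit per chip
total_bits = 4 * bits_per_chip         # four chips, each with its own CAS: 64 Mbit
print(signals_per_chip, locations, bits_per_chip, total_bits)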

Apple I Keyboard

We are using an Apple ][ keyboard to control our original Apple I.

Of course the connectors are not compatible, so a simple rewiring might work. However, by putting some logic in between, one can easily adjust for differences: a Strobe that is active low instead of active high, a Reset that is active low instead of active high, or data bits that idle high instead of low. And with enough logic, one could pretend to be typing in programs.

That is what I have done. With a BASYS3 FPGA board I listen to the Apple ][ keyboard and pass that data along. But I also listen to a telephone keypad, and if a button is pressed I send my own data instead. The most useful program to type in is the BASIC interpreter. It takes 6 minutes! That is about 30 characters a second to enter a line of data, plus an appropriate pause at the end of each line so the data can be processed. Then a RUN command and BASIC is going. After that, another key on the keypad can load a BASIC program. That is a slower process, as BASIC takes some time to process each character, but BASIC programs are usually short anyway.
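The real pacing is done in FPGA logic, but the idea is simple enough to sketch. In the Python sketch below, send_char() is a hypothetical stand-in for the hardware that presents one keystroke to the Apple:

import time

def send_char(c):
    # Hypothetical stand-in for the FPGA logic that presents one keystroke
    # to the Apple's keyboard port; the real design does this in hardware.
    pass

def type_listing(lines, chars_per_second=30, end_of_line_pause=0.5):
    # Pace the keystrokes so the Apple can keep up: roughly 30 characters a
    # second, with a pause after each line while the machine digests it.
    delay = 1.0 / chars_per_second
    for line in lines:
        for c in line:
            send_char(c)
            time.sleep(delay)
        send_char('\r')
        time.sleep(end_of_line_pause)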

To blow, or not to blow!

That is the question that one of the blowers in the base of KATIA, our 1967 PDP10-KA, was apparently asking itself a couple of weeks ago.

KATIA had been running happily for a couple of months, and when I came in one day, she wasn’t. I tried rebooting her a couple of times, without any luck. With my usual collection of tasty puzzle morsels on my plate, it took me a few days to get around to poor KATIA. It was Friday when I started running diagnostics, and they all passed till it got down to one of the last ones: DAKCB.

DAKCB would fail while trying to test the FDVL instruction: Double precision Floating Point Divide. This was at the end of the day, so since I was scheduled to work half a day Saturday, I could work on it then. That might incite interesting conversations with the Museum Visitors.

After going to my 6:30 AM Saturday morning WW meeting, studying for a test for a while, then taking (and passing) the test for the highest class of Amateur Radio Operator license, I managed to stagger into work a little after 1 PM.

I was assisted by Leda, our Post-Grad Ethnography student, who is trying to figure out why Grown Adults would want to spend their time playing with these ancient computers.

While looking at the machine when it failed DAKCB, we had noticed a couple of things: it had stopped, as in the PC wasn’t changing, but the RUN light was still on! This is a symptom of losing a pulse! In Asynchronous machines like the KA and KI, there is no clock that times the instructions. When you poke the start button, a single pulse is launched into a bunch of logic and delay lines. The little pulses and logic fetch the instruction, examine it to figure out what instruction it is, and wiggle all the logic at the correct time to do whatever that instruction is supposed to do. When it is done doing one instruction, and the RUN light is still on, it will take the pulse coming out of the end of the logic, and put it back in the front, starting the next instruction.

In my Zeal and Enthusiasm, I conned Leda into running the oscilloscope probe for me as we tried to follow this tiny pulse all the way through the logic till it disappeared. Normal troubleshooting practice would be to see the pulse at the beginning, look at the end and not see it, then look somewhere in the middle. We call this a binary search, and it is the fastest way to do this.
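For the curious, that bisection looks like this in miniature (a Python sketch; pulse_present() is a hypothetical stand-in for whoever is holding the scope probe):

def find_break(test_points, pulse_present):
    # test_points are probe locations listed in signal order; pulse_present(p)
    # reports whether the pulse shows up there.  The pulse is present at the
    # start and gone at the end, so bisect to the first point where it vanishes.
    lo, hi = 0, len(test_points) - 1
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if pulse_present(test_points[mid]):
            lo = mid        # pulse still alive here; the fault is downstream
        else:
            hi = mid        # pulse already gone; the fault is upstream
    return test_points[hi]  # first point where the pulse is missing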

Unfortunately I am not sure where the end is, and I certainly don’t know where the middle of this process would be, so we started at the beginning and followed each step till we got to a pulse amplifier at 1F19: we could see the pulse go in, but it didn’t come out! We can just change the module, right?

I did that, and then noticed that the module I had taken out was REALLY warm in my hand: not so hot it burned me, but it shouldn’t have been that hot. Why? Being as how this is an updraft machine, I put my hand above the card cage, and it was nice and toasty warm up there. Where was all the air, like what comes out of Bay 2’s cage, keeping all those cards cool? Going back around to the front of the machine, I noticed that the blower in the bottom of Bay 1 was not turning, and shut the machine off. Now we know why it quit working!

This is not going to be pretty! Leda and I went down to the basement, and stole a blower from one of the other KA’s we have down there, and installed it. Of course the freight elevator had gotten its tail tied in a knot on Friday and was on strike for better working conditions, so I carried the 30lb awkward thing up the stairs, and managed to get it installed around 5:30PM. We turned on the machine, and luckily this blower was not having an existential crisis, and so it was blowing precious cooling air all through the cards in Bay 1: Yay!

Since we figured that it was probably bad bearings, David took the blower apart to change the bearings, and here is a picture of it sitting on a 2 foot by 3 foot cart:

KATIA’s blower, disassembled.

Looking at the stator of the motor a little closer:

Blower stator.

Those wires should not be black! I’m afraid this motor is toast!

I have so far spent about a week trying to get basic instructions to work on KATIA, without a lot of success. What I know so far, is that if I try to run out of fast memory, the ACs that live inside the processor, it REALLY doesn’t work. At a guess, I suspect that the logic receives all zeros for the second instruction. If I disable fast memory, or run in main memory, things work better, but some very simple things don’t work. It will add a constant to an accumulator, but it won’t load a constant into an accumulator. This reduces its usefulness as a computer a little. It is still pretty good as a room heater.

I will try to blog more frequently so you can follow along as we discover what else is wrong with KATIA.

Connecting to the PDP-7

Two 55-pin connectors are available on the PDP-7 SN129. One provides 18 bits of output data, and the other can read 18 bits of input data.

My project was to solder 50-pin ribbon cables to the 55-pin connectors. This will allow us to easily connect to external logic.

55-pin round connector
Pin numbers spiral in, so it is best to begin soldering at the inside and work out. Here are the first four pairs of wire. The green wire is the 25th pair of the ribbon cable.

Each wire has heat-shrink tubing to both insulate and to provide structural strength.

55-pin round connector
This is the other connector, and being of the other gender it spirals the other direction. The final two wires are ready to be soldered.

Computer Maintenance Hell!

Back in October 2018, our PDP10-KI went down, and it didn’t want to come back up. I ran all the normal diagnostics, and they all worked, but TOPS-10 would hang when I tried to boot it. That is the definition of Computer Maintenance Hell: everything works, but the operating system won’t run!

Running the normal diagnostics sounds like an easy thing, but that isn’t always the case! The first bunch of diagnostics run from paper tape, and that is pretty easy. As we continue past DBKAG, the tapes don’t fit well in the reader, so we switch over to getting them off of DECTape, and herein lies the rub: the TD10 DECTape controller on the KI is almost always broken when I need it.

After much gnashing of teeth, and tearing of hair, there was enough blood on the floor for the dust bunnies to leave tracks in that pointed to what was wrong with the TD10, and we were off once again. I ran the rest of the usual diagnostics, and they all passed! Still didn’t boot.

I had plenty of things to keep me occupied, so our poor PDP10-KI didn’t get a lot of my attention. During our group session bringing up KATIA, we played with the KI some, and found that the KI didn’t like its memory! The KA liked it, but the KI didn’t! It would run the DDMMD memory diagnostic for about 10 or 15 minutes, then fail. The KA would happily run the KI’s memory well past where the KI would fail. Looking at the errors, it appeared as if things were getting confused about which particular bit of memory it was talking to. It would always start failing at location 0374000, where either it hadn’t inverted the contents of those locations, or it had done it twice.

Now it didn’t fail all the time. The part of the test that failed was going through memory incrementing the address by a more significant bit than the LSB, then wrapping around to the LSB. When it started with the LSB, bit 35, everything was fine. It worked when it did bit 34. It had to get up to bit 25 before we had problems; it would fail between bits 25 and 21, and bits 20 through 18 worked too.

I spent quite a while trying to write a diagnostic that did what DDMMD was doing, but in a quick and repeatable way. I believe I got pretty close, but nothing I wrote would tickle the problem… bother!

Months have now passed, and I broke down and plugged in the logic analyzer. Most of the time, I use an oscilloscope as my main debug tool. ‘Scopes don’t lie as much as logic analyzers do! If the logic is working, producing 1’s and 0’s as it should, a logic analyzer is a good tool. When things are broken, sometimes you get a half, or a third, instead of a one or zero, and this is where the ‘scope is better about telling the truth, and the logic analyzer will lie. Here the machine was pretty much working, at least the diagnostics thought so.

Here is one of the first logic analyzer traces I took, just showing the logic analyzer sample number, the memory operation, and the address:

1581 wr 626415
1601 wr 626435
1621 wr 626455
1641 wr 626475
1661 wr 626515
1681 wr 626535

I did a bunch of work with PERL to go through the 100MB of data that came out of the logic analyzer, and boil it down to what you see here.

Now it turns out that the way this part of DDMMD works is that it fills memory from the bottom to the top stepping by 1, complements each location using the funny addressing pattern, then verifies from bottom to top normally. I added the top 8 bits from the CPU’s MA (Memory Address) register to the logic analyzer:

104873 rd 377774, 376
104903 rd 377775, 376
104933 rd 377776, 376
104963 rd 377777, 376
104989 rd 777000, 400 ***
105019 rd 400001, 400
105047 rd 400002, 400
105075 rd 400003, 400
105103 rd 400004, 400

This is where it is doing the final verification, and you can see something funny here: the upper bits from the MA register incremented like I expected them to, but when a whole bunch of them changed, the address going to the actual memory didn’t follow as quickly! Instead of going from 377777 to 400000, it went to 777000! Here we get into a bit of logic called the “Pager”.

A PDP10 can really only talk to 256K words of memory at one time. How can the KI use 4MW of memory? That is the Pager’s job! The Pager translates the logical address that the CPU provides into a physical address in a (hopefully larger) memory. While running diagnostics, the Pager should be turned off, resulting in a maximum of 256KW of memory directly addressed from the MA register to the address lines going to memory. Something was going wrong here!
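To make the Pager's job concrete, here is a toy translation in Python, assuming the KI's standard 512-word pages. It is only an illustration of the idea, not the actual hardware path:

PAGE_SIZE = 512                       # words per page (2**9), the KI's standard page size

def translate(virtual_addr, page_map):
    # The CPU names at most 256K words (an 18-bit address).  The Pager swaps the
    # virtual page number for a physical one, so up to 4M words can be reached.
    vpn = virtual_addr // PAGE_SIZE    # virtual page number (the upper 9 bits)
    offset = virtual_addr % PAGE_SIZE  # word within the page (the low 9 bits)
    return page_map[vpn] * PAGE_SIZE + offset

# With the Pager off (as it should be during diagnostics) the mapping is 1:1:
identity_map = list(range(512))        # 512 pages of 512 words = 256K words
assert translate(0o377777, identity_map) == 0o377777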

I added another set of 8 probes from the logic analyzer, and started moving backwards from the physical address going to the memory, to where the MA register fed into the Pager. When I got to the output of the CAMs, there was something I didn’t understand.

What is a CAM, you ask? CAM stands for “Content Addressable Memory”. You give it the logical address that you want, and it tells you whether it knows about that address, with a single match line for each location inside itself. All four of them.

I got lucky, the first group of 8 output bits looked like this:

536691 wr 360631, 360, 400
536793 wr 362631, 362, 100
536844 wr 363631, 362, 040
536896 wr 364631, 364, 020
536947 wr 365631, 364, 010
536999 wr 366631, 366, 004
537052 wr 367631, 366, 006

Near as I can tell, there should be only a single 1 in the right column. It is octal, so we can watch which location in the CAM has the data as the addresses change, and when we get to 367631, we get two ones! I believe that output should have been a 002, not 006!
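To show why two ones is a problem, here is a toy CAM in Python. A healthy CAM asserts at most one match line for a given lookup; the entry values below are made up for the example:

def cam_match(entries, lookup):
    # Toy content-addressable memory: compare the lookup value against every
    # entry at once, and return one match line per location, packed as bits.
    match = 0
    for i, entry in enumerate(entries):
        if entry == lookup:
            match |= 1 << i
    return match

# A healthy CAM asserts at most one match line (a one-hot result):
assert cam_match([0o360, 0o362, 0o364, 0o366], 0o364) == 0o004
# The failing module was asserting two lines at once (e.g. octal 006 = binary 110),
# which is why swapping out 2PR09 mattered.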

That output came from board 2PR09, so I swapped it and 2PR08, and I couldn’t run the diagnostic at all due to a “Page Fail Trap Error”! Ah, I think we are very close here! I checked the inventory, and we didn’t have a record for an M260 board, so I stole one from one of the machines that came in in September, and Voila, the memory test passed! It can even run TOPS-10 if we don’t try to initialize its serial ports. This could be correct since we stole a bunch of its serial lines to use on KATIA while the KI was asleep.

OK, since the KI, the KA, and the CDC are all working, I seem to have made it out of Computer Maintenance Hell for now. Give them a little while; one of them will fail.

Bruce Sherry

Adventures in PDP10 land

The PDP10-KI went down sometime in the fall, maybe October. This is the machine just to the right of the CDC6500 as you come in the second floor computer room. I noticed this fairly quickly and tried to reboot it, but it would hang when I tried, several times.

OK, it must be time to run diagnostics, which I proceeded to do. It passed all the diagnostics from DBKAA to DBKAH, which are the ones on paper tape, but the TD10 DECTape controller, just to the right of the console had quit again, as it has done almost every time I need to use it.

This TD10 and I just do not get along very well. It always kicks me around the block several times before it will let me know what is wrong, so I can fix it. Because of this history, it sometimes takes a while for me to generate the gumption to work on it. This time was no exception, since I was busy trying to get the KA working. If you don’t know about gumption and gumption traps, you should read the classic “Zen and the Art of Motorcycle Maintenance”, which doesn’t really talk about motorcycles, or Zen much.

Back to our story: We fixed the KA, moved it down to the second floor computer room, fixed it again, and had a small gathering to boot it the first time. The CDC was being its normal self, but was refusing to be down when I arrived at work, which is my signal to drop everything and work on it.
I was running out of reasons to avoid working on the KI.

Contrary to normal behavior, it only took a few hours to figure out the problem with the TD10, so I could run the diagnostics that are very awkward to load from paper tape.

All the normal diagnostics ran except for DBKAL. I think it took a few days to remember that there is a diagnostic for which the binary we have is wrong, and I have to patch a couple of locations after loading, in order for it to work. Now DBKAL and DBKAM work. On to some more obscure ones. All the CPU ones seem to pass, does it boot now? No, it still hangs after the OS is loaded, and we type “GO” to fire up timesharing.

What else can we test? We ran DDRHA, which tests the RH10 disk controller, but it passed. We then ran DDRPI, which tests the disk drives. Now we don’t really use the disk drives it is expecting to test; we use our MDE (Massbus Disk Emulator). We have been using this MDE for about 5 years now, but there could still be a bug hiding in there somewhere.

DDRPI looked like everything was fine for about 20 minutes, while it was doing register tests, seek tests, all ones and zeros tests. When it got to testing the surface of the disk, things started to go wrong. It would get an error where it looked like the data was misplaced, like it was reading the wrong sector or something like that.

How could that be? This thing has been working fine for over 5 years, in fact when we ran DDRPI from the KA, using the KI’s Memory, RH10s, DAS33, and MDE, EVERYTHING was FINE, even the surface test!

How about the memory? It passes my little MARCH memory test, from the KI or the KA. Dragging out the DECTape again, we loaded up DDMMD, which is one of the memory diagnostics. We fired it up, and it ran fine for about 15 minutes, whereupon it started spewing out errors. We have run this a BUNCH, and the errors usually seem to start at location 374000, and the data seems to be inverted from what it should be. The test complains about address bit 24.

We run the same test from the KA against the KI’s memory, and it works fine! It is handy to have the KA right across the room to enable this kind of testing.

What is really going on here? Let’s look at the console:

OK, TN=2, that means it is doing the “Address” test. AS=F24 turns out to mean that it was doing fast addressing on address bit 24. “What does that mean?” you may ask. I did! After much grovelling over the DDMMD listing, and consulting with Rich Alderson, I found that they would fill memory with the address and its complement, and then go through reading a location, verifying it was correct, and writing the complement in it, then go back and read and verify that they all had the complement. Ah, but what about that F24 part? When they are reading and complementing the data they start skipping locations, by changing which address bit they increment first. The first time they do this, they use bit 35 as the LSB, but the next time through they shift the LSB over to bit 34, then 33, etc. When they get to bit 24, we run into this problem.
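Here is my reading of that addressing pattern, sketched in Python (an illustration only, not DDMMD's actual code). Shifting the "LSB" from bit 35 over to, say, bit 33 amounts to stepping by 4 and wrapping:

def fast_address_order(total_words, stride):
    # Visit every address once, but bump the address by 'stride' first and only
    # advance the lower bits when the sweep wraps around at the top -- my reading
    # of DDMMD's shifted-LSB pattern.  stride = 1 is a plain sequential sweep.
    for low in range(stride):
        for high in range(0, total_words, stride):
            yield high + low

# Bit 35 as the LSB (stride 1) versus bit 33 as the LSB (stride 4), over 8 words:
assert list(fast_address_order(8, 1)) == [0, 1, 2, 3, 4, 5, 6, 7]
assert list(fast_address_order(8, 4)) == [0, 4, 1, 5, 2, 6, 3, 7]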

How do I figure out what is really going on here? I decided to write a version of my MARCH that does this, MRCHFA. It took a while, but I finally got it to work on the KA, and tried it on the KI: unfortunately the KI passed it too! What else are they doing differently? OK, more grovelling over the listing: they are stuffing all their inner loops down in the Fast ACs to speed them up. On to MRCHF3, which pushes the inner loops down to the Fast ACs. Does the KI fail that one? Nope!

I’m running out of ideas, where do we go from here? I decide to just watch it for a while, and see what happens AFTER it starts to fail. I see it fail bit 24 from 374000 to 377777, both bit 23 and 22 over the same range, then it starts failing location 700000, in the same way. Shortly thereafter, the program gives up in disgust, and stops printing the results, and starts ignoring the errors. Now I just watch the lights on the ARM10 memory.

I got used to the way the lights blink while working on MRCHFA, and MRCHF3, as the LSB moves up the address, the slow blinking address follows it.

But wait, that isn’t what I am seeing: I see the lights increment from the bottom, do the FA thing, then increment from the bottom again, and then shift the LSB over. What is going on here? Is there anything on the ARM10 that will tell me anything? Yes, there are the read and write lights. While writing, both lights come on, but a read only lights the read light. After the FA thing, it just reads! Back to grovelling over the listing some more.

OK, what they do is: fill memory from the bottom to the top, check and complement using the shifting LSB, and then check from the bottom to the top. So I wrote another new test called THEIRS, which of course doesn’t catch the problem either. I am running out of hair to pull out here!

As I write this, both machines are running DDMMD against the other’s memory, happily, no errors. No happy ending… YET!

A Journey Into the Ether: Debugging Star Microcode

Back in January I unleashed my latest emulation project Darkstar upon the world. At that time I knew it still had a few areas that needed more refinement, and a few areas that were very rough around the edges. The Star’s Ethernet controller fell into that latter category: No detailed documentation for the Ethernet controller has been unearthed, so my emulated version of it was based on a reading of the schematics and diagnostic microcode listings, along with a bit of guesswork.

Needless to say, it didn’t really work: The Ethernet controller could transmit packets just fine but it wasn’t very good at receiving them. I opted to release V1.0 of Darkstar despite this deficiency — while networking was an important part of Xerox’s computing legacy, there were still many interesting things that could be done with the emulator without it. I’d get the release out the door, take a short break, and then get back to debugging.

Turns out the break wasn’t exactly short — sometimes you get distracted by other shiny projects — but a couple of weeks back I finally got back to working on Darkstar and I started with an investigation of the Receiver end of the Ethernet interface — where were things going wrong?

The first thing I needed to do was come up with some way to see what was actually being received by the Star, at the macrocode level. While I lack sources for the Interlisp-D Ethernet microcode, I could see it running in Darkstar’s debugger, and it seemed to be picking up incoming packets, reading in the words of data from these packets and then finally shuffling them off to the main memory. From this point things got very opaque — what was the software (in this case the operating system itself) doing with that data, and why was it apparently not happy with it?

The trickiest part here was finding diagnostic software to run on the Star that could show me the raw Ethernet data being received, and after a long search through available Viewpoint, XDE, and Interlisp-D tools and finding nothing that met my needs I decided to write my own in Interlisp-D. The choice to use Interlisp-D was mainly due to the current lack of XDE compilers, but also because the Interlisp-D documentation covered exactly what I needed to accomplish, using the ETHERRECORDS library. I wrote some quick and dirty code to print out the contents of any packets coming in, and got… crickets. Nothing. NIL, as the Lisp folks say.

Hmm.

So I went back and watched the microcode read a packet in, and while it was indeed pulling in data, upon closer inspection it was discarding the packet after the first few words. The microcode was checking that the packet’s Destination MAC address (which begins each Ethernet packet’s header) matched the Star’s MAC address, and it was concluding that the packet in question wasn’t addressed to it. This is reasonable behavior, but the packets it was receiving from my test harness were all Broadcast packets, which use a destination address of ff:ff:ff:ff:ff:ff and which are, by definition, destined for all machines on the network. That is when I finally noticed that, hey, wait a minute… the words the microcode was reading in for the destination address weren’t all FFs as they should be… and then I slapped my forehead when I saw what I had done:

Whoops.

I’d accidentally used the “PayloadData” field (which contains just the actual data in the packet) rather than the “Data” field (which contains the full packet including the Ethernet header). So the microcode was never seeing Ethernet headers at all, instead it was trying to interpret packet data as the header!
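In miniature, the mix-up looked something like this (illustrative Python; the field names simply mirror the description above, not any real capture library's API, and the byte values are made up):

BROADCAST = bytes([0xFF] * 6)
header = BROADCAST + bytes(6) + bytes([0x08, 0x00])   # destination, source, type
payload = bytes([0xAA] * 46)

data_field = header + payload      # what the receiver should have been handed ("Data")
payload_data_field = payload       # what I was actually handing it ("PayloadData")

# The microcode's destination check looks at the first six bytes of the frame:
assert data_field[:6] == BROADCAST           # accepted as a broadcast
assert payload_data_field[:6] != BROADCAST   # looks like someone else's packet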

I fixed that and things were looking much, much better. I was able to configure TCP/IP on Interlisp-D and connect to a UNIX host and things were generally working, except when they weren’t. On rare occasions the Star would drop a single word (two bytes) from an incoming packet with no fanfare or errors:

The case of the missing words. Note the occasional loss of two characters in the above directory listing.

This was puzzling to say the least. After some investigation it became clear that the lost word was randomly positioned within the packet; it wasn’t lost at the beginning or end of the packet due to an off-by-one error or something not getting reset between packets. Further investigation indicated that without fail, the microcode was reading in each word from the packet via the ←EIData function (which reads the next incoming word from the Ethernet controller and puts it on the Central Processor’s X Bus). On the surface it looked like the microcode was reading each word in properly… but then why was one random word getting lost?

It was time to take a good close look at the microcode. I lack source code for the Interlisp-D Ethernet microcode but my hunch was that it would be pretty similar to that used in Pilot since no one in their right mind rewrites microcode unless they absolutely have to. I have some snippets of Pilot microcode, fortunately, and as luck would have it the important portions of it matched up with what Interlisp was using, notably the below loop:

{main input loop}
EInLoop: MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong], c1;
MDR ← EIData, DISP4[ERead, 0C], c2;
ERead: EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];
E ← uESize, GOTO[EReadEnd], c3, at[0D,10,ERead];
E ← EIData, uETemp2 ← EE, GOTO[ERCross], c3, at[0E,10,ERead];
E ← EIData, uETemp2 ← EE, L6←L6.ERCrossEnd, GOTO[ERCross], c3, at[0F,10,ERead];

The code starting with the label EInLoop (helpfully labeled “main input loop”) loads the Memory Address Register (MAR) with the address where the next word from the Ethernet packet will be stored; and the following line invokes ←EIData to read the word in and write it to memory via the Memory Data Register (MDR). The next instruction then decrements a word counter in a register named EE and loops back to EInLoop (“GOTO[EInLoop]”). (If this word counter underflows then the packet is too large for the microcode to handle and is abandoned.)

An important diversion is in order to discuss how branches work in Star microcode. By default, each microinstruction has an INIA (Initial Next Instruction Address) field that tells the processor where to find the next instruction to be executed. Microinstructions need not be ordered sequentially in memory, and in fact, generally are not (this makes looking at a raw dump of microcode highly entertaining). At the end of every instruction, the processor looks at the INIA field and jumps to that address.

To enable conditional jumps, a microinstruction can specify one of several types of Branches or Dispatches. These cause the processor to modify the INIA of the next instruction by OR’ing in one or more bits based on a condition or status present during the current instruction. (This is then referred to as NIA, for Next Instruction Address). For example, the aforementioned word counter underflow is checked by the line:

ERead:    EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];

The EE register is decremented by 1 and the ZeroBr field specifies a branch if the result of that operation was zero. If that was the case, then the INIA of the next instruction (at EInLoop) is modified — ZeroBr will OR a “1” into it.

EInLoop:    MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong],    c1;

This branch is written with the BRANCH[$,EITooLong] assembler macro, which names the two possible destinations of the branch. The dollar sign ($) indicates that in the case of no branch, the next sequential instruction should be executed, and that that instruction needs no special address. In the case of a branch (indicating underflow) the processor will jump to EITooLong instead.

Clear as mud? Good! So how does this loop exit under normal conditions? In the microcode instruction at EInLoop there is the clause EtherDisp. This causes a microcode dispatch — a multi-way jump — based on two status bits from the Ethernet controller. The least-significant bit in this status is the Attn bit, used to indicate that the Ethernet controller has something to report: A completed packet, a hardware error, etc. The other bit is always zero if the Ethernet controller is installed. (If it’s not present, the bit is always 1).

Just like a conditional branch, a dispatch modifies the INIA of the next instruction by ORing those status bits in to form the final NIA. The instruction following EInLoop is:

MDR ← EIData, DISP4[ERead, 0C],    c2;

The important part to us right now is the DISP4 assembler macro: this sets up a dispatch table starting with the label ERead, which it places at address 0x0C (binary: 1100). Note how the lower two bits in this address are clear, to allow branches and dispatches to OR modified bits in. In the case where EtherDisp specifies no special conditions (all bits zero) the INIA of this instruction is unmodified and left as 0x0C and the loop continues. In the case of a normal packet completion, EtherDisp will indicate that the Attn bit is set, ORing in 1, resulting in an NIA of 0x0D (binary: 1101).
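Put as plain arithmetic (a tiny Python illustration, not the microassembler's notation):

# The dispatch in miniature: EtherDisp ORs the Ethernet status bits into the
# INIA of the following instruction, whose dispatch table sits at 0x0C with
# its low two bits left clear.
EREAD = 0x0C
assert EREAD | 0b00 == 0x0C   # no status bits set: keep looping at ERead
assert EREAD | 0b01 == 0x0D   # Attn set (packet complete): fall out to 0x0D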

This all looked pretty straightforward and I didn’t see any obvious way a single word could get lost here, so I looked at the other ways this loop could be exited — how do we get to the instruction at 0x0E (binary: 1110) from the dispatch caused by EtherDisp? At first this left me scratching my head — as mentioned earlier, the second bit masked in by EtherDisp is always zero! The clue is in what the instruction at 0x0E does: it jumps to a Page Cross handler for the routine.

This of course requires another brief (not so brief?) diversion into Central Processor minutiae. The Star’s Central Processor contains a simple mechanism for providing virtual memory via a Page Map, which maps virtual addresses to physical addresses. Each page is 256 words in size, and the CP has special safeguards in place to trap memory accesses that might cross a page boundary both to prevent illegal memory accesses and so the map can be maintained. In particular, any microinstruction that loads MAR via an ALU operation that causes a carry out of the low 8 bits (i.e. calculating an address that crosses a 256-word boundary) results in any memory access in the following instruction being aborted and a PageCross branch being taken. This allows the microcode to deal with Page Map-related activities (update access bits or cause a page fault, for example) before resuming the aborted memory access.

Whew. So, in the case of the code in question:

{main input loop}
EInLoop: MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong], c1;
MDR ← EIData, DISP4[ERead, 0C], c2;
ERead: EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];
E ← uESize, GOTO[EReadEnd], c3, at[0D,10,ERead];
E ← EIData, uETemp2 ← EE, GOTO[ERCross], c3, at[0E,10,ERead];
E ← EIData, uETemp2 ← EE, L6←L6.ERCrossEnd, GOTO[ERCross], c3, at[0F,10,ERead];

Imagine (if you will) that register E (the Ethernet controller microcode gets two whole CPU registers of its very own and their names are E and EE) contains 0xFF (255) and the processor is running the instruction at EInLoop.  The ALU adds 1 to it, resulting in 0x100 — this is a carry out from the low 8-bits and so a PageCross branch is forced during the next instruction.  A PageCross branch will OR a “2” into the INIA of the next instruction.

The next instruction attempts to store the next word from the Ethernet’s input FIFO into memory via the MDR←EIData operation but this store is aborted due to the Page Cross caused during the last instruction.  And at last, a 2 is ORed into INIA, causing a dispatch to 0x0E (binary: 1110).  So in answer to our (now much earlier) question:  The routine at 0x0E is invoked when a Page Cross occurs while reading in an Ethernet packet.  (How the code gets to the routine at 0x0F is left as an exercise to the reader.)
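Reduced to arithmetic, the example looks like this (illustrative Python only):

# The same worked example, in miniature.  Pages are 256 words, so a carry out
# of the low 8 bits of the address means the access crossed a page boundary.
E = 0xFF
new_E = E + 1                          # MAR <- E <- [rhE, E + 1]
page_cross = (E & 0xFF) + 1 > 0xFF     # carry out of the low 8 bits
assert new_E == 0x100 and page_cross
# PageCross ORs a 2 into the next instruction's INIA, so the dispatch that
# would otherwise stay at 0x0C lands at 0x0E (the ERCross path) instead.
assert 0x0C | 0b10 == 0x0E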

And as it turns out, it’s the instruction at 0x0E that’s triggering the bug in my emulated Ethernet controller. 

E ← EIData, uETemp2 ← EE, GOTO[ERCross],    c3, at[0E,10,ERead];

Note the E←EIData operation being invoked — it’s reading in the word from the Ethernet controller for a second time during this turn through the loop, and remember that the first time it did this, it threw the result away since the MDR← operation was canceled.  This second read is done with the intent to store the abandoned word away (in register E) until the Map operation is completed.

So what’s the issue here?  On the real hardware, those two ←EIData operations return the same data word rather than reading the next word from the input packet.  This is in fact one of the more clearly spelled-out details in the Ethernet schematic — it even explains why it’s happening! — one that I completely, entirely missed when writing the emulation:

Seems pretty clear to me…

Microinstructions in the Star’s Central Processor are grouped into clicks of three instructions each; a click’s worth of instructions execute atomically — they cannot be interrupted.  Each instruction in a click executes in a single cycle, referred to as Cycle 1, Cycle 2, and Cycle 3 (or c1, c2, and c3 for short).  You can see these cycles notated in the microcode snippet above.  Some microcode functions behave differently depending on what cycle they fall on.  ←EIData only loads in the next word from the Ethernet FIFO when executed during a c2; an ←EIData during c1 or c3 returns the last word loaded.  I had missed this detail, and as a result, my emulation caused any invocation of ←EIData to pull the next word from the FIFO.  As demonstrated above this nearly works, but causes a single word to be lost when a packet read crosses a page boundary.
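The fix amounts to making ←EIData cycle-aware. Here is a minimal sketch of the corrected behavior in Python (Darkstar itself is not written in Python; the names here are illustrative):

from collections import deque

class EthernetInputFifo:
    # <-EIData only advances the FIFO during cycle 2 of a click; during c1 or c3
    # it returns the word that was read last time.
    def __init__(self, words):
        self.fifo = deque(words)
        self.last_word = 0

    def ei_data(self, cycle):
        if cycle == 2 and self.fifo:
            self.last_word = self.fifo.popleft()
        return self.last_word

fifo = EthernetInputFifo([0x1234, 0x5678])
assert fifo.ei_data(2) == 0x1234   # c2: a fresh word from the packet
assert fifo.ei_data(3) == 0x1234   # c3 (the page-cross re-read): the same word again
assert fifo.ei_data(2) == 0x5678   # next c2: now the FIFO advances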

I fixed the ←EIData issue in Darkstar and at long last, Ethernet is working properly.  I was even able to connect to one of the machines here at the museum:

The release on Github has been updated; grab a copy and let me know how it works for you!

If you’re interested in learning more about how the Star works at the microcode level, the Hardware Reference and Microcode Reference are a good starting point. Or drop me a line!