At Home With Josh Part 8: Lisp System Installation

In our last installment our intrepid adventurer had gotten the Interphase 2181 SMD controller running again, and had used it to do a low-level format of a gigantic 160mb Fujitsu hard drive. This left him with all the ingredients needed to put together a running LMI Lambda system, at least in theory.

Tapes and Tape Drives

I had intended to wait until I’d found the proper 9-track tape drive for the system before attempting to go through the installation process. As you might recall, the Qualstar drive I have on the system is functional but extremely slow; it takes several minutes to find and load tiny diagnostic programs from tape. As system installation requires copying 20-30 megabytes from tape (i.e. a lot of data), it seemed to me that doing an installation from the Qualstar would simply take too long to be practical.

But on the other hand, the drive was functional, and it occurred to me that the SDU “tar” utility’s simplicity might be what was causing the extremely slow transfer rate: if it was overly conservative in its reads from tape, on an unbuffered drive like the Qualstar it might end up being very inefficient. Maybe the “load” tool would be a bit more intelligent in its tape handling. Or perhaps not — but there’s no harm in trying, right? And while I’d tracked down a proper Cipher F880 tape drive, it would require waiting until the current quarantine was lifted to go and pick it up. I demanded instant gratification, so off I went. Except…

Shiny new 9-track tapes! Ok, they’re still twenty years old. That’s pretty new!

The other pressing issue was one of tapes. I have a small pile of blank (or otherwise unimportant) 9-track tapes here at home but all of them were showing signs of shedding and none of them worked well enough for me to write out a complete Lambda Install tape. Despite a few cleaning passes, eventually enough oxide would shed off the tape to gum up the heads and cause errors. Clearly I would need to find some better tapes, so I hit up eBay and found a stack of 5 tapes, new-old stock (apparently from NASA). And waited patiently for them to arrive.

[About a week passes…]

The Actual Installation

With new tapes in hand I was finally able to write out the “Install” tape without errors. And thus, with my fingers crossed and a rabbit’s foot in my pocket I started the installation process. The “load” utility is used to set up new hard disks and can copy files to and from tape to do installation and maintenance tasks. Here’s a transcription of the operation:

SDU Monitor version 102 
>> disksetup 
What kind of disk do you have? 
Select one of { eagle cdc-515 t-302 micro-169 cdc-9766 }: micro-169

>> /tar/load
using 220K in slot 9
load version 307
(creating block-22 mini-label)
(creating mini-label)
Disk is micro-169
Loading "/tar/bigtape"
Loading "/tar/st2181"
Disk unit 0 needs to be initialized:
Disk has no label, or mini-label is wrong.
Create new unit 0 label from scratch? (y/n) y
Creating lisp label from scratch.
How many LAMBDA processors: 1
Type "?" for command list.
load >

The initial steps above tell the SDU that I have a “micro-169” disk (the 8-inch equivalent of the giant 14” Fujitsu I actually have installed). This is necessary to allow the load program to know the characteristics of the system’s disk. /tar/load is then executed and, since it finds an empty disk, it sets up the disk’s label, the LMI’s equivalent of a partition table — information written to the beginning of the disk that describes the disk and slices its space into partitions that can be used to hold files or entire filesystems. Even though this Lambda is a “2X2” system (with two LAMBDA processors) it would be a tight squeeze to run both of them in the 160mb capacity of the drive, so for now I will only be running one of the two processors. Or trying to, anyway. (Oooh, foreshadowing!)

Continuing on:

load > install

*****************************************************
The new backup label track number is 16340.
Record this number and keep it with the machine.
*****************************************************
Writing unit 0 label
Using half-inch tape
Installing track-0 disk driver ...
copying 10 blocks from "/tar/disk" to "disk"
copy done

Tape-ID = "FRED gm 7/23/86 12:33:34 522520414 "
File is "SDU5 3.0 rev 14"; 1500 blocks.
"SDU5 3.0 rev 14" wants to be loaded into UNX6.
reading 1500 blocks into UNX6.
copying 1500 blocks from "bigtape" to "UNX6"
copy done

Next file ...

File is "ULAMBDA 1764"; 204 blocks.
Default partition to load into is LMC3
reading 204 blocks into LMC3.
copying 204 blocks from "bigtape" to "LMC3"
copy done

Next file ...

File is " 500.0 (12/8)"; 23189 blocks.
Default partition to load into is LOD1
reading 23189 blocks into LOD1.
copying 23189 blocks from "bigtape" to "LOD1"
copy done
Next file ...

End of tape.
Writing unit 0 label
load >

There are three tape files that the install process brings in; you can see them being copied above. The first (“SDU5 3.0 rev 14”) contains a set of tools for the SDU to use: diagnostics and bootstrap programs. The second (“ULAMBDA 1764”) contains a set of microcode files for use by the Lambda processor. The Lambda CPU is microcoded, and the SDU must load the proper microcode into the processor before it can run. The final file (cryptically named “ 500.0 (12/8)”) is a load band. (The Symbolics analogue is a “world” file.) This is (roughly) a snapshot of a running Lisp system’s virtual memory. At boot time, the load band is copied to the system’s paging partition, and memory-resident portions are paged into the Lambda’s memory and executed to bring the Lisp system to life.

Loading from the installation tape on the Qualstar

As suspected, the tape drive’s throughput was higher during installation than during the diagnostic load. But not by much. The above process took about two hours and, as you can see, it completed without errors, or much fanfare. But it did complete!

Time now for the culmination of the last month’s time and effort: will it actually boot into Lisp? Nervously, I walk over to the LMI’s console, power it on, and issue the newboot command:

The “newboot” herald, inviting me to continue…

Newboot loaded right up and prompted me for a command. To start the system, all you need to do is type boot. And so I did, and away it went, loading boot microcode from disk and executing it, to bring the Lisp system in from the load band. Then the breaker tripped. Yes, I’m still running this all off a standard 15A circuit in my basement, and the addition of the Fujitsu drive has pushed it to its limit. Don’t do this at home, people.

I unplugged the tape drive to reduce the power load a bit, reset the breaker and turned the Lambda on again. Let’s have us another go, shall we?

(I apologize in advance for the poor quality of the videos that follow. One of the side-effects of being stuck at home is that all I have is a cellphone camera…)

And awayyyyyy we go!

(Warning, the above video is long, and also my phone gave out after 3:12. Just watch the first 30 seconds or so and you’ll get the gist of it.)

Long story short: about two minutes after the video above ended, the screen cleared. This normally indicates that Lisp is starting up, and is a good sign. And then… nothing. And more nothing. No disk activity. I gave it another couple of minutes, and then I pinged my friend Daniel Seagraves, the LMI expert. He told me to press “META-CTRL-META-CTRL-LINE” on the keyboard (that’s the META and CTRL keys on both the left and right side of the keyboard, and the LINE key, all held down at once). This returns control to the SDU and to newboot; at this point the “why” command will attempt to provide context detailing what’s going on with the Lambda CPU:

Tell me why, I gotta know why!

Since Daniel knows the system inside and out, he was able to determine exactly where things were going off the rails during Lisp startup. The error being reported indicated that a primitive operator expected an integer as an operand and was getting some other type. This hints at a problem inside the CPU logic that either ended up loading a bogus operand, or reported a valid operand as having a bogus type.

Out of superstition, I tried rebooting the system to see if anything changed but it failed identically, with exactly the same trace information from “why.”

In the absence of working diagnostics, schematics, or even detailed hardware information, debugging this problem was going to be an interesting endeavor.

But all was not lost. This is a 2×2 system, after all. There’s a second set of CPU boards in the chassis just waiting to be tested…

This time, after the screen clears (where the video above starts) you can see the “run lights” flashing at the bottom of the screen. (These tiny indicators reflect system and CPU activity while the system is running). Then the status line at the bottom loaded in and I almost fell over from shock. Holy cow, this thing is actually working after all this time!

I have one working Lambda CPU out of the two. I’m hoping that someday soon I can devise a plan for debugging the faulty processor. In particular, I think the missing “double-double” TRAM file lamented in Part 6 of this series has turned up on one of the moldy 9-track tapes I rescued from the Pennsylvania garage — this should hopefully allow me to run the Lambda CPU diagnostics, but it will have to wait until I have a larger disk to play with, as this file resides in a UNIX partition that I don’t currently have space for.

In the meantime, since I have a known working set of CPU boards (recall from Part 2 that the Lambda processor consists of four boards), it was a simple matter to isolate the fault to a single board by swapping boards between the sets one at a time. The issue turns out to be somewhere on the CM (“Control Memory”) board in CPU 0.

Meanwhile, not everything is exactly rosy with CPU 1… what’s with the system clock?

Time keeps slippin’ into the future…

System beeps are high-pitched squeaks and the wall clock on the status line counts about 4x faster than it should. Daniel and I are unsure exactly what the cause is at this time, but we narrowed it down to the RG (“ReGisters”) board. In many systems there is a periodic timer, sometimes derived from the AC line frequency (60Hz in the US), that is used to keep time and run the operating system’s process scheduler. The LMI uses something similar, and clearly it is malfunctioning.

Another fairly major issue is the lack of a working mouse. Way back in Part 2 I noted that the RJ11 connector had corroded into a green blob. This still needs repair and as it turns out, getting a working mouse on this system ended up being a journey all its own…

But that’s for my next installment. Until then, keep on keepin’ on!

Lookin’ good, LMI. Lookin’ good.

At Home With Josh Part 7: Putting the “Mass” in “Mass Storage”

Continuing from the conclusion of my last post, I had gotten to the point of testing the LMI’s Interphase SMD 2181 disk controller, but was getting troubling looking diagnostic output:

SDU Monitor version 102
>>/tar/2181 -C
Initializing controller
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 10 test 0 no completion (either ok or error) from iopb status
iopb: cyl=0 head=0 sector=0 (TRACK 0)
87 11 00 00 00 00 00 00 00 00 00 00 10 00 c5 62 00 40 00 00 00 00 c5 3a

My immediate suspicion was that this was truly indicating a real failure with the controller. The “gave up waiting for IO completion” message was the canary in the coal mine here. The way a controller like this communicates with the host processor (in this case the SDU) is via a block of data in memory that the controller reads; this is the “iopb” (likely “I/O Program Block”) mentioned in the output above. The iopb contains the command to the controller; the controller executes that command, then returns the status of the operation in the same iopb, and may interrupt the host processor to let it know that it’s done so. (More on interrupts later.)

What the above diagnostic failure appears to be indicating is that the SDU is setting up an initialization command in the iopb and waiting for the 2181 to return a result. And it waits. And it waits. And it waits. And then it gives up after a few milliseconds because the response has taken too long: the 2181 is not replying, indicating a hardware problem.
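
In pseudocode terms, the handshake the diagnostic is performing looks something like the sketch below. To be clear, the 2181’s actual iopb layout and status values are undocumented, so the offsets and constants here are purely illustrative:

import time

# Hypothetical values -- the real iopb layout and status codes for the
# 2181 are unknown.
IOPB_STATUS_PENDING = 0x00
TIMEOUT_SECONDS = 0.005          # "a few milliseconds"

def ring_doorbell():
    pass                         # on real hardware: a Multibus I/O write

def run_iopb_command(memory, iopb_addr, command):
    memory[iopb_addr] = command              # host writes the command...
    ring_doorbell()                          # ...and nudges the controller
    deadline = time.monotonic() + TIMEOUT_SECONDS
    while time.monotonic() < deadline:
        status = memory[iopb_addr + 1]       # controller posts status here
        if status != IOPB_STATUS_PENDING:
            return status                    # completion: either ok or error
    raise TimeoutError("gave up waiting for IO completion")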

But the absence of any real documentation or instructions for these diagnostics or the 2181 controller itself left open other possibilities. The biggest one was that I did not at that time have an actual disk hooked up to the controller. The “-C” option to the 2181 diagnostic looked like it was supposed to run in the absence of a disk, but that could be an incorrect assumption on my part. It may well be that the 2181 itself requires a disk to be connected in order to be minimally functional, though based on experience with other controllers this seemed to me to be unlikely. But again: no documentation, anything could be possible.

The lack of a disk was a situation I could rectify. The Lambda’s original disk was a Fujitsu Eagle (model M2351), a monster of a drive storing about 470mb on 10.5″ platters. It drew 600 watts and took up most of the bottom of the cabinet. At the time of this writing I am still trying to hunt one of these drives down. The Eagle used the industry-standard SMD interface, so in theory another SMD drive could be made to work in its stead. And I had just such a drive lying dormant…

If the Eagle is a monster of a drive, its predecessor, the M2284 is Godzilla. This drive stores 160MB on 14″ platters and draws up to 9.5 Amps while getting those platters spinning at 3,000 RPM. The drive itself occupies the same space as the Eagle so it will fit in the bottom of the Lambda. It has an external power supply that won’t, so it’ll be hanging out the back of the cabinet for awhile. It also has a really cool translucent cover, so you can watch the platters spinning and the heads moving:

The Fujitsu M2284, freshly installed in the Lambda.

The drive is significantly smaller in capacity than the Eagle, but it’s enough to test things out with. It also conveniently has the same geometry as another, later Fujitsu disk that the SDU’s “disksetup” program knows about (the “Micro-169”), which makes setup easy. I’d previously had this drive hooked up to a PDP-11/44, and it was working at that time. With any amount of luck, it still is.

Only one thing needed to be modified on the drive to make it compatible with the Lambda — the sector size. The drive was configured to provide 32 sectors per track; the Lambda wants 18. This sector division is handled by the drive hardware. The physical drive provides storage for 20,480 bytes per track, and these 20,480 bytes can be divided up into any number of equally sized sectors (up to 128 per track) by setting a bank of DIP switches inside the drive. Different drive controllers or different operating systems might require a different sector size.

The 32-sector configuration was for a controller that wanted 512-byte sectors — but dividing 20,480 by 32 yields 640. Why 640? Each sector requires a small amount of overhead: among other things there are two timing gaps at the beginning and end of each sector, as well as an address that uniquely identifies the sector, and a CRC at the end of the sector. The address allows the controller to verify that the sector it’s reading is the one it’s expecting to get. The CRC allows the controller to confirm that the data that was read was valid.

What a single sector looks like on the Fujitsu.

The more sectors you have per track, the more data space you lose to this overhead. The Lambda wants 1024-byte sectors, which means we can fit 18 sectors per track. 20,480 divided by 18 is approximately 1138 bytes — 114 bytes are used per sector as overhead. The configuration of the DIP switches is carefully described in the service manual:

Everyone got that? There will be a quiz later. No calculators allowed.

Following the instructions and doing the math here yields: 20,480 / 18 = 1137.7777…, so we truncate to 1137 and add 1, yielding 1138. Then we subtract 1 again (Fujitsu enjoys wasting my time, apparently) and configure the DIP switches to add up to 1137. 1137 in binary is 10 001 110 001 (1024 + 64 + 32 + 16 + 1), so switches SW1-1, SW1-5, SW1-6, SW1-7 are turned on, along with SW2-4. Simple as falling off a log!
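
If you’d rather let a computer do the arithmetic (no calculators allowed, but nobody said anything about Python), here’s the same calculation, assuming the manual’s formula as I’ve described it above:

BYTES_PER_TRACK = 20480

def fujitsu_switch_setting(sectors_per_track):
    sector_length = BYTES_PER_TRACK // sectors_per_track + 1  # truncate, add 1
    setting = sector_length - 1             # ...then subtract 1 again
    on_bits = [i for i in range(11) if setting & (1 << i)]
    return sector_length, setting, on_bits

# For 18 sectors: sector length 1138, switch value 1137,
# bits 0, 4, 5, 6 and 10 set (1 + 16 + 32 + 64 + 1024).
print(fujitsu_switch_setting(18))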

With that rigamarole completed, I hooked the cables up, powered the drive up and set to loading the Interphase 2181 diagnostic again:

SDU Monitor version 102
>>/tar/2181 -C
Initializing controller
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 10 test 0 no completion (either ok or error) from iopb status
iopb: cyl=0 head=0 sector=0 (TRACK 0)
87 11 00 00 00 00 00 00 00 00 00 00 10 00 c5 62 00 40 00 00 00 00 c5 3a

Darn. Looks like having a drive present wasn’t going to make this issue go away.

About that time, a local friend of mine had chimed in and let me know he had a 2181 controller in his collection. It had been installed in a Sun-1 workstation at some point in its life, and was a slightly different revision. I figured that if nothing else, comparison in behavior between his and mine might shed a bit of light on my issue so I went over to his house to do a (socially distanced) pickup.

Annoyingly, the revisional differences between his 2181 and mine were fairly substantial:

You can see the commonality between the two controllers, but there are many differences, especially with regard to configuration jumpers — and since (as I have oft repeated) there is no documentation, I have no idea how to configure the newer board to match the old.

So this is a dead end; the revisional differences are just too great. I did attempt to run diagnostics against the new board, but it simply reported a different set of failures — though at least it was clear that the controller was responding.

Well, it was past time to start actually thinking about the problem rather than hoping for a deus ex machina to swoop in and save the day. I wasn’t going to find another 2181, and documentation wasn’t about to fall out of the sky. As with my earlier SDU debugging expedition, it seemed useful to start poking at the 2181’s processor, in this case an Intel 8085. This is an 8-bit processor, an update of the 8080 with a few enhancements. As with the SDU’s 8088, looking at power, clock and reset signals was a prudent way to start off.

Unlike with the SDU, all three of these looked fine — power was present, the clock was counting out time, and the processor wasn’t being reset. Well, let’s take a look at the pinout of the 8085 and see what else we might be able to look at:

8085 pinout, courtesy Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Anschlussbelegung_8085.gif)
Oscillation overthruster

The AD0 through AD7 and A15 pins are the multiplexed address/data bus: When the 8085 is addressing memory, AD0-AD7 plus the A8-A15 pins form the 16-bit memory address; when a read or write takes place, AD0-AD7 contain the 8-bits of data being read or written. Looking for activity on these pins is a good way to see if the CPU is actually running — a running CPU will be accessing and addressing memory constantly — and sure enough, looking with an oscilloscope showed pulsing on these pins.
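
As an aside, demultiplexing this bus is simple in principle: external logic latches AD0-AD7 when the 8085 pulses its ALE (Address Latch Enable) pin, and the full 16-bit address is just those two bytes glued together. In toy form:

def full_address(a8_a15: int, latched_ad0_ad7: int) -> int:
    # High byte from the dedicated address pins, low byte from the
    # external latch loaded on ALE.
    return (a8_a15 << 8) | latched_ad0_ad7

assert full_address(0x12, 0x34) == 0x1234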

The TRAP, RST7.5, RST6.5, RST5.5, and INTR signals are used to allow external devices to interrupt the 8085’s operation and are typically used to let software running on the CPU know that a hardware event has occurred: a transfer has completed or a button was pushed, for example. When such an interrupt occurs, the CPU jumps to a specific memory location (called an interrupt vector) and begins executing code from it (referred to as an interrupt service routine), then returns to where it was before the interrupt happened. If any of these signals were being triggered erroneously it could cause the software running on the CPU to behave badly.

Probing the RST7.5, 6.5 and 5.5 signals revealed a constant 3.5V signal at RST7.5, a logic “1” — something connected to the 8085 was constantly interrupting it! This would result in the CPU running nothing but the interrupt service routine, over and over again. No wonder the controller was unable to respond to the Lambda’s SDU.

Now the question is: what’s connected to the RST7.5 signal? It could potentially come from anywhere, but the most obvious source to check on this controller is one chip, an Intel 8254 Programmable Interval Timer. As the name suggests, this device can be programmed to provide timing signals — it contains three independent clocks that can be used to provide precise timing for hardware and software events. The outputs of these timers are often connected to interrupt pins on microprocessors, to allow the timers to interrupt running code.
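
For a flavor of how a counter gets set up (per the 8254 datasheet; how the 2181’s firmware actually programs its timer is anyone’s guess, so the mode and divisor below are invented), software writes a control word selecting a counter and a counting mode, then writes the count itself:

def control_word(counter: int, mode: int, bcd: int = 0) -> int:
    RW_LSB_THEN_MSB = 0b11      # write count as LSB followed by MSB
    return (counter << 6) | (RW_LSB_THEN_MSB << 4) | (mode << 1) | bcd

MODE_RATE_GENERATOR = 2         # OUT pulses each time the count elapses
cw = control_word(counter=2, mode=MODE_RATE_GENERATOR)   # 0xB4
count = 1000                    # invented divisor
lsb, msb = count & 0xFF, (count >> 8) & 0xFF
# On real hardware: write cw to the 8254's control port, then lsb and
# msb to counter 2's data port.
print(hex(cw), lsb, msb)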

The Intel 8254 Programmable Interval Timer

And, as it turns out, pin 17 (OUT 2) of the 8254 is directly connected to pin 7 (RST7.5) of the 8085. OUT 2 is the data output for the third counter, and goes high (logic “1”) when that timer elapses. Based on what I’m seeing on the oscilloscope, this signal is stuck high, likely indicating that the 8254 is faulty. Fortunately it’s socketed, so it’s easy to test that theory. I simply swapped the 8254s between my controller and the one I’m borrowing from my friend and…

Success! Probing RST7.5 on the 8085 now shows a logic “0”, the CPU is no longer constantly being pestered by a broken interval timer and is off and running. The diagnostic LEDs on the board reflect this change in behavior — now only one is lit, instead of both. This may still indicate a fault, but it’s at least a different fault, and that’s always exciting.

Well, the controller is possibly fixed, and I already have a disk hooked up and spinning… let’s go for broke here and see if we can’t format the sucker. The “-tvsFD” flags tell the controller to format and test the drive, doing a one-pass verify after formatting. Here’s a shaky, vertically oriented video (sorry) of the diagnostic in action:

Look at that text!

And here’s the log of the output:

SDU Monitor version 102
>> reset
>> disksetup
What kind of disk do you have?
Select one of { eagle cdc-515 t-302 micro-169 cdc-9766 }: micro-169
>> /tar/2181 -tvsFD
Initializing controller
2181: status disk area tested is from cyl 0 track 0 to cyl 822 track 9
2181: status format the tracks

Doing normal one-pass format ...
2181:at test 0 test reset                 passed
2181: test 1 test restore                 passed
2181: test 2 test interrupt               passed
failedginning of cyl 159 ...              at beginning of cyl 0 ...
2181: error 18          test 4 header read shows a seek error
iopb: cyl=0 head=0 sector=0 (TRACK 0)
00 00 82 12 00 00 00 00 00 00 00 12 10 00 c5 62 00 40 00 00 00 00 c5 3a
2181: error 18          test 4 header read shows a seek error
The 1 new bad tracks are:...
bad: track 1591; cyl=159 head=1
        ... mapped to track 8229; cyl=822 head=9

There were 1 new bad tracks
Number of usable tracks is 8228 (822 cyls).
(creating block-10 mini-label)
Disk is micro-169
2181: test 5 read random sectors in range   passed
2181: status read 500 random sectors
2181: test 6 write random sectors in range  passed
2181: status write to 500 random sectors
2181: test 8 muliple sector test            passed
2181: test 9 iopb linking test              passed
2181: test 10 bus-width test                passed
2181: test 0 test reset                     0 errors
2181: test 1 test restore                   0 errors
2181: test 2 test interrupt                 0 errors
2181: test 4 track verify                   2 errors
2181: test 5 read random sectors in range   0 errors
2181: test 6 write random sectors in range  0 errors
2181: test 8 muliple sector test            0 errors
2181: test 9 iopb linking test              0 errors
2181: test 10 bus-width test                0 errors
>>

And some video of the drive doing its thing during the verification pass:

Look at those random seeks!

As the log indicates, one bad track was found. This is normal — there is no such thing as a perfect drive. (Modern drives, both spinning rust and SSDs, have embedded controllers that automatically remap bad sectors from a set of spares, providing the illusion of a flawless disk.) Drives of this Fujitsu’s era actually came with a long list of defects (the “defect map”) from the factory. A longer verification phase would likely have revealed more bad spots on the disk.
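
Incidentally, the linear track numbers in the log are just the cylinder and head flattened together. A quick sanity check, using the micro-169’s 10 heads per cylinder:

HEADS_PER_CYLINDER = 10

def linear_track(cyl: int, head: int) -> int:
    return cyl * HEADS_PER_CYLINDER + head

assert linear_track(159, 1) == 1591   # the new bad track
assert linear_track(822, 9) == 8229   # the spare track it was mapped to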

Holy cow. I have a working disk controller. And a working disk. And a working tape drive. Can a running system be far off? Find out next time!

At Home With Josh Part 6: Diagnostic Time!

In our last exciting episode, after a minor setback I got the Lambda’s SDU to load programs from 9-track tape. Now it’s time to see if I can actually test the hardware with the available diagnostics.

Tape Images

Tape images of the Lambda Release and System tapes are available online. Daniel Seagraves has been working on updating the system, and his latest and greatest are available here. A tape image is a file that contains a bit-for-bit copy of the data on the original tape. Using this file in conjunction with a real 9-track drive allows an exact copy of the original media to be made. In my case, I have an HP 7980S 9-track drive connected to a Linux PC for occasions such as these. At the museum we have an M4 Data 9-track drive set up to do the same thing. The old UNIX workhorse tool “dd” can be used to write these files back to tape, one at a time:

$ dd if=file1 of=/dev/nst0 bs=1024

(Your UNIX might name tape devices differently, consult your local system administrator for more information.)

Data on 9-track tapes is typically stored as a sequence of files, each file being separated by a file mark. The Lambda Release tape contains five such files, the first two being relevant for diagnostics and installation, and the remainder containing Lisp load bands and microcode that get copied onto disk when a Lisp system is installed.
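
Putting those two facts together, writing out a whole tape image is just the dd recipe above repeated once per file against the no-rewind device; on Linux, the st driver lays down a file mark when the device is closed after writing. A sketch of the equivalent in Python (the file names and device are placeholders, and I’m assuming the same 1024-byte block size as above):

BLOCK_SIZE = 1024
TAPE_DEVICE = "/dev/nst0"       # no-rewind device, so files stack up in order

def write_tape_file(path: str) -> None:
    # buffering=0 so each write() becomes exactly one tape block
    with open(path, "rb") as src, open(TAPE_DEVICE, "wb", buffering=0) as tape:
        while block := src.read(BLOCK_SIZE):
            tape.write(block)
    # closing the device after writing lays down a file mark

for name in ["file1", "file2", "file3", "file4", "file5"]:
    write_tape_file(name)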

The first file on tape is actually an executable used by the SDU — it is a tiny 2K program that can extract files from UNIX tar archives on tape and execute them. Not coincidentally, this program is called “tar.” The second tape file is an actual tar archive that contains a variety of utility programs and diagnostics. Here’s a rundown of the interesting files we have at our disposal:

  • 3com – Diagnostic for the Multibus 3Com Ethernet controller
  • 2181 – Diagnostic for the Interphase 2181 SMD controller
  • cpu – Diagnostic for the 68010 UNIX processor
  • lam – Diagnostic for the Lambda’s Lisp processors
  • load – Utility for loading a new system: partitioning disks and copying files from tape.
  • ram – Diagnostic for testing NuBus memory
  • setup – Utility for configuring the system
  • vcmem – Diagnostic for testing the VCMEM (console interface) boards.

The unfortunate thing is: there is no documentation for most of these, beyond whatever secrets searching the binaries for strings might reveal. Daniel worked out the syntax for some of them while writing his LambdaDelta emulator, but a lot of details are still mysterious.

In case you missed it, I summarized the hardware in the system along with a huge pile of pictures of the installed boards in an earlier post — it might be helpful to reacquaint yourself to get some context for the following diagnostic runs. Plus pictures are pretty.

I arbitrarily decided to start by testing the NuBus memory boards, starting with the 16mb board in slot 9 (which I’d moved from slot 12 since the last writeup). The diagnostic is loaded and executed using the aforementioned tar program as below. The “-v” is the verbose flag, so we’ll get more detailed output; the “-S 9” tells the diagnostic that we want to test the board in slot 9.

SDU Monitor version 102
>> reset
>> /tar/ram -v -S 9
ram: error 6 test 1 bad reset state, addr=0xf9ffdfe0, =0x1, should=0x4
ram: error11 test 3 bad configuration rom
ram: error 1 test 6 bad check bits 0xffff, should be 0xc, data 0x0
ram: error 1 test 7 bad check bits 0xffff, should be 0xc, data 0xffffffff
ram: error 7 test 8 for dbe w/flags off, DBE isn't on
ram: error 7 test 9 for dbe w/flags off, DBE isn't on
ram: status fill addr 0xf9000000
ram: status fill addr 0xf9002000
... [elided for brevity] ...
ram: status fill addr 0xf903c000
ram: status fill addr 0xf903e000
ram: status fill check addr 0xf9000000
ram: status fill check addr 0xf9002000
... [elided for brevity] ...
ram: status fill check addr 0xf903c000
ram: status fill check addr 0xf903e000

Well, the first few lines don’t look exactly promising what with all the errors being reported. The test does continue on to fill and check regions of the memory but only up through address 0xf907e000 (the first 512KB of memory on the board, that is). Thereafter:

ram: status fill check addr 0xf907c000
ram: status fill check addr 0xf907e000
ram: status block of length 0x4000 at 0xf9000000
ram: status stepsize 4 forward
ram: error 4 test 16 addr 0xf9000004 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000008 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf900000c is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000010 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000014 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000018 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf900001c is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000020 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000024 is 0xffffffff sb 0x0 (data f/o)
ram: error 4 test 16 addr 0xf9000028 is 0xffffffff sb 0x0 (data f/o)

And so on and so forth, probably across the entire region from 0xf9000000-0xf907ffff. This would take a long time to run to completion (remember, this output is coming across a 9600bps serial line — each line takes about a second to print) so I wasn’t about to test this theory. The output appears to be indicating that memory reads are returning all 1’s (0xffffffff) where they’re supposed to be 0 (0x0).

So this isn’t looking very good, but there’s a twist: these diagnostics fail identically under Daniel’s emulator. After some further discussion with Daniel it turns out these diagnostics do not apply to the memory boards I have installed in the system (or that the emulator simulates). The memory boards that were available at the time of the Lambda’s introduction were tiny in capacity: half-megabyte boards were standard, and it was only later that larger (1, 2, 4, 8, and 16mb) boards were developed. The only memory boards I have are the later 4 and 16mb boards, and these use different control registers; as a result the available diagnostics don’t work properly. If there ever was a diagnostic written for these newer, larger RAM boards, it has been lost to the ages.

This means that I won’t be able to do a thorough check of the memory boards, at least not yet. But maybe I can test the Lisp CPU? I slotted the RG, CM, MI and DP boards into the first four slots of the backplane and started up the lam diagnostic program:

SDU Monitor version 102
>> reset
>> /tar/lam -v
/tar/lam version 6
compiled by wer on Wed Mar 28 15:24:02 1984 from machine capricorn
setting up maps
initializing lambda
starting conreg = 344
PMR
passed ones test
passed zeros test
TRAM-ADR
passed ones test
passed zeros test
TRAM
passed ones test
passed zeros test
loading tram; double-double
disk timed out; unit=0x0 cmd=0x8F stat=0x0 err=0x0
disk unit 0 not ready
can't open c.tram-d-d
SPY:
passed ones test
passed zeros test
HPTR:
Previous uinst destination sequence
was non-zero after force-source-code-word
during lam-execute-r
Previous uinst destination sequence
was non-zero after force-source-code-word
during lam-execute-r
Previous uinst destination sequence
was non-zero after force-source-code-word
... [and so on and so forth] ...

Testing starts off looking pretty good — the control registers and TRAM (“Timing RAM”) tests pass, and then it tries to load a TRAM file from disk. Aww. I don’t have a disk connected yet, and even if I did it wouldn’t have any files on it. And to add insult to injury, as it turns out even the file it’s trying to load (“double-double”) is unavailable — like the later RAM diagnostics, it is lost to the ages. The TRAM controls the speed of the execution of the lisp processor and the “double-double” TRAM file causes the processor to run slowly enough that the SDU can interrogate it while running diagnostics. Without a running disk containing that file I won’t be able to proceed here.

So, as with the memory I can verify that the processor’s hardware is there and at least responding to the outside world, but I cannot do a complete test. Well, shucks, this is getting kind of disappointing.

The vcmem diagnostic tests the VCMEM board — this board contains the display controller and memory that drives the high-resolution terminals that I restored in a previous writeup. It also contains the serial interfaces for the terminal’s keyboard and mouse. Perhaps it’s finally time to test out the High-Resolution Terminals for real. I made some space on the bench next to the Lambda and set the terminal and keyboard up there, and grabbed one of the two console cables and plugged it in. After powering up the Lambda, I was greeted with a display full of garbage!

Isn’t that the most beautiful garbage you’ve ever seen?

This may not look like much, but this was a good sign: The monitor was syncing to the video signal, and the display (while full of random pixels) is crisp and clear and stable. The garbage being displayed was likely due to the video memory being uninitialized: Nothing had yet cleared the memory or reset the VCMEM registers. There is an SDU command called “ttyset” that assigns the SDU’s console to various devices; currently I’d been starting the Lambda up in a mode that forces it to use the serial port on the back as the console, but by executing

>> ttyset keytty

the SDU will start using the High-Resolution Terminal as the console instead. And, sure enough, executing this caused the display to clear and then:

It lives!

There we are, a valid display on the screen! The keyboard appeared to work properly and I was able to issue commands to the SDU using it. So even without running the vcmem diagnostic, it’s apparent that the VCMEM board is at least minimally functional. But I really wanted to see one of these diagnostics do its job, so I ran it anyway:

SDU Monitor version 102
/tar/vcmem -v -S 8
vcmem: status addr = 0xf8020000
vcmem: status fill addr 0xf8020000
... [elided again for brevity] ...
vcmem: status fill addr 0xf803e000
vcmem: status fill check addr 0xf8020000
vcmem: status fill check addr 0xf8022000
vcmem: status fill check addr 0xf8024000
vcmem: status fill check addr 0xf8026000
...
vcmem: status fill check addr 0xf8036000
vcmem: status fill check addr 0xf8038000
vcmem: status fill check addr 0xf803a000
vcmem: status fill check addr 0xf803c000
vcmem: status fill check addr 0xf803e000

As the test continued, patterns on the screen slowly changed, reflecting the memory being tested. Many different memory patterns are tested over the next 15 minutes.

vcmem: status movi block at 0xf803c000
vcmem: status movi stepsize 2 forward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 2 backward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 4 forward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 4 backward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 8 forward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 8 backward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
... [elided] ...
vcmem: status movi stepsize 4096 forward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 4096 backward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 8192 forward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
vcmem: status movi stepsize 8192 backward
vcmem: status movi checking 0x0000 writing 0xffff
vcmem: status movi checking 0xffff writing 0x0000
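
Those “movi” passes look like the classic moving-inversions memory test: march through the block checking each cell for the old pattern and overwriting it with the complement, at varying strides and in both directions, so stuck bits, shorted address lines, and disturb effects all get a chance to show themselves. A toy version of the idea (the real test presumably works in 16-bit words, judging by the 0x0000/0xffff patterns in the log):

def movi_pass(mem, step, forward, expect, write):
    """Check each visited cell for 'expect', then overwrite with 'write'."""
    cells = range(0, len(mem), step)
    errors = 0
    for i in (cells if forward else reversed(cells)):
        if mem[i] != expect:
            errors += 1          # a real test would log the address and data
        mem[i] = write
    return errors

mem = bytearray(0x4000)          # one 0x4000-byte block, as in the log
for step in (2, 4, 8, 16):       # the real test runs strides up to 8192
    for forward in (True, False):
        assert movi_pass(mem, step, forward, expect=0x00, write=0xFF) == 0
        assert movi_pass(mem, step, forward, expect=0xFF, write=0x00) == 0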

And at last the test finished with no errors reported, leaving a test pattern on the display. How about that, a diagnostic that works with the hardware I have.

Not your optometrist’s eye chart…

Looking crisp, clear, and nice and straight. This monitor is working fine — what about the other one? As you might recall, I got two High-Resolution Terminals with this system and pre-emptively cleaned and replaced all the capacitors in both of them. The second of these would not display anything on the screen when powered up (unlike the first) though I was seeing evidence that it was otherwise working. Now that I’d verified that the VCMEM board was working and producing a valid video signal, I thought I’d see if I could get anything out of the second monitor.

Well, what do you know? Note the cataracts in the corners.

Lo and behold: it works! I soon discovered the reason for the difference in behavior between the two monitors: The potentiometer (aka “knob”) that controls the contrast on this display is non-functional; with it turned up on the first monitor you can see the retrace, with it turned down it disappears. Interestingly the broken contrast control doesn’t seem to have a detrimental effect on the display, as seen above.

So that’s a VCMEM board, two High-Resolution Terminals, and the keyboard tested successfully, with the CPU and Memory boards only partially covered. I have yet to test the Ethernet and Disk controllers. The 3com test runs:

SDU Monitor version 102
>> /tar/3com -v
3com: status Reading station address rom start addr=0xff030600
3com: status Reading station address ram start addr=0xff030400
3com: status Transmit buffer: 0xff030800 to 0xff030fff.
3com: status Receive A buffer: 0xff031000 to 0xff0317ff.
3com: status Receive B buffer: 0xff031800 to 0xff031fff.
3com: status Receive buffer A - 0x1000 to 0x17ff.
3com: status Receive buffer B - 0x1800 to 0x1fff.
>>
Hex editors to the rescue!

No errors reported and the test exits without complaining so it looks like things are OK here. Now onto the disk controller. I don’t have a disk hooked up at the moment, but after a bit of digging into the test’s binary, it looks like the “-C” option should run controller-only tests:

SDU Monitor version 102
>>/tar/2181 -C
Initializing controller
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 10 test 0 no completion (either ok or error) from iopb status
iopb: cyl=0 head=0 sector=0 (TRACK 0)
87 11 00 00 00 00 00 00 00 00 00 00 10 00 c5 62 00 40 00 00 00 00 c5 3a
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 10 test 0 no completion (either ok or error) from iopb status
iopb: cyl=0 head=0 sector=0 (TRACK 0)
87 11 00 00 00 00 00 00 00 00 00 00 10 00 c5 62 00 40 00 00 00 00 c5 3a
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion
2181: error 10 test 0 no completion (either ok or error) from iopb status
iopb: cyl=0 head=0 sector=0 (TRACK 0)
87 11 00 00 00 00 00 00 00 00 00 00 10 00 c5 62 00 40 00 00 00 00 c5 3a
2181: error 3 test 0 Alarm went off - gave up waiting for IO completion

This portends a problem. The output seems to indicate that the test is asking the controller to do something and then report a status (either “OK” or “Error”) and the controller isn’t responding at all within the allotted time, so the diagnostic gives up and reports a problem.

This could be caused by the lack of a disk (perhaps the “-C” option isn’t really doing what it seems like it should), but my hacker sense was tingling, and my thought was that there was a real problem here.

Compounding this problem is a lack of any technical information on the Interphase SMD 2181 controller. Not even a user’s manual. The Lambda came with a huge stack of (very moldy) documentation, including binders covering the hardware: “Hardware 1” and “Hardware 3.” There’s supposed to be a “Hardware 2” binder but it’s missing… and guess which binder contains the 2181 manual? Sigh.

There are two LEDs on the controller itself and at power-up they both come on, one solid, one dim. In many cases LEDs such as these are used to indicate self-test status — but lacking documentation I have no way to interpret this pattern. I put out a call on the Interwebs to see if I could scare up anything, but to no avail.

Looks like my diagnostic pass at the system was a mixed bag: outdated diagnostics, meager documentation, and what looks like a bad disk controller, balanced against working consoles and at least a basic verification of most of the Lambda’s hardware.

In my next installment, I’ll hook up a disk and see if I can’t suss out the problem with the Interphase 2181. Until then, keep on chooglin’.

At Home With Josh Part 5: Tape Drives and EPROMS And Whiskers on Kittens

After working on the Lambda’s monitors as described in my last writeup, my next plan of action was to see if I could get diagnostics loaded into the SDU via 9-track tape.

ROM Upgrade Time

Monitor version 8

But first, I wanted to upgrade the SDU’s Monitor ROM set. The SDU Monitor is a program that runs on the SDU’s 8088 processor. It provides the user’s interface to the SDU, with commands for loading and executing files and for booting the system. It also communicates with devices on the Multibus and the NuBus. As received, my Lambda had Version 8 of the Monitor, which is, as far as I know, the last version released to the public at large. However, the Lambdas that Daniel Seagraves owns came with an internal-only Monitor in their SDUs, designated Version 102. This version adds a few convenient features: it deals more gracefully with loss of CMOS RAM (important since I don’t have a backup battery anymore) and adds a few commands for defining custom hard drive types.

One of the 27128A EPROMs

A week or so prior, Daniel had sent me a copy of the Version 102 ROMs, so all I had to do was write (“burn”) the copy onto real EPROMs and install them in the SDU. I had spare EPROMs (four Intel 27128A’s) at the ready, but the thing about EPROMs is that they need to be erased before they can be programmed with new data. To do that, you need an EPROM eraser — a little box with a UV lamp in it and a timer — and after searching the house for mine, I came to the realization that I’d taken it to work a few months back and had never brought it back home. And due to present circumstances, it was going to be stuck there for a while.

Bah.

So with much wailing and gnashing of teeth I ordered a replacement off the Internet and began waiting patiently for it to arrive in 5-7 days. Meanwhile I decided to start documenting this entire process for some kind of blog thing and so I went off, took pictures, and started writing long-winded prose about Lisp Machines and restorations.

Also during this time I decided to test the old wives’ tale about using sunlight to erase EPROMs. At that time Seattle was experiencing an extremely lovely bout of sunny weather, so I took four 27128’s outside, put them on the windowsill so as to gather as much sun as possible, and left them there for the next four days.

[Four Days Pass…]

They’re still not erased. So much for that idea. Only a few more days until my real EPROM eraser arrives anyway…

[A Few More Days Pass…]

At last my dirt-cheap EPROM eraser arrived on my doorstep bearing dire warnings of UV exposure and also of the overheating of this fine precision instrument. Ignoring the 15 minute time-limit warning, I put four EPROMs into the drawer, cranked the timer up to 30 minutes and turned it on. And once again I found myself waiting.

[Thirty Minutes Pass…]

The Faithful DATA I/O 280 Gang Programmer

I pulled out my trusty DATA I/O 280 programmer and ran its “blank check” routine to ensure that the EPROMs were indeed as blank as they ought to be, and the programmer said “BLANK CHECK OK.”

It was then a simple matter to hook the programmer up to my PC and program the new ROMs, and soon enough all four were ready to be installed in the SDU. But before I did that I wanted to double-check that the Lambda was still operating — it’d been a couple of weeks since I had last powered it up, and things can go wrong sometimes. Best not to introduce a new variable (i.e. new ROMs) into the equation before I could verify the current state.

Uh Oh

And so I hooked things back up to the Lambda and turned it on. And… nothing. No SDU prompt on the terminal and all three LEDs on the front panel are stuck on solid. (As we learned in my second post in this series, this indicates that the SDU is failing its self tests.) I pressed the Reset button a couple of times. Nothing. Power cycled the system just for luck. NIL.

“Well, fiddle-dee-dee!” I said. (I may have used slightly more colorful language than this, but this is a family-friendly blog). “Gosh darn it all to heck.”

I retraced my steps — had I changed anything since the last time I’d powered it on? Yes — I’d installed an Ethernet board that Daniel had graciously sent me (my system apparently never had an Ethernet interface, which is an odd choice for a Lisp Machine). Maybe the Ethernet board was causing some problem here? Pulling the board made no difference in behavior. I checked the power supply voltages at the power supply and at the backplane and everything was dead on. I pulled the SDU out and inspected it, and double-checked socket connections and everything looked OK.

Well, at this point I’m frustrated and my tendency in situations like this is to obsess about whether I broke something and so I run in circles for a bit when what I really need to do is take a step back: OK — it’s broken. How is it broken? How do I go about answering that question? Think, man, think!

Well, I know that the three LEDs are on solid — this would indicate that the SDU’s self-test code either wasn’t running or wasn’t getting very far before finding a failure. So: let’s assume for now that the self-test code isn’t running — how do I confirm that this is the case?

The SDU uses an Intel 8088 16-bit microprocessor to do its business, and it’s a relatively simple matter to take a look at various pins on the chip to see if there’s activity, or lack thereof. The most vital things to any processor (and thus good first investigations while debugging a microprocessor-based system) are power, clock, and reset signals. Power obviously makes the CPU actually, you know, do things. A clock signal is what drives the CPU’s internal logic, one cycle at a time, and the reset signal is what tells the CPU to clear its state and restart execution from step 0. A lack of the first two or an abundance of the latter could cause the symptoms I was seeing.

i8088 pinout, from the datasheet.

Time to get out the oscilloscope; this will let me see the signals on the pins I’m probing. Looking at the Intel 8088 pinout (at right), the pins I want to look at are pin 40 (Vcc), pin 21 (RESET) and pin 19 (CLK). Probing immediately reveals that Vcc and CLK are OK: Vcc is a nice solid 5 volts and CLK shows a 5MHz clock signal. RESET, however, is at 3.5V — a logic “1”, meaning that the CPU is being held in a Reset state, preventing it from running!

So that’s one question answered: the SDU is catatonic because for some reason RESET is being held high. Typically, RESET gets raised at power-up (to initialize the CPU among other things) and might also be attached to a Reset button or other affordance. In the SDU, there is also a power-monitoring signal attached to the RESET line, designated DCOT (DC Out of Tolerance) — if the +5 voltage goes out of range, the CPU is reset:

Power supply status signals, from the “SDU General Description” manual.

It seemed possible (though unlikely) that the Lambda’s Reset switch or the cabling associated with it had failed, causing the symptoms I was seeing, but as expected the cabling tested out OK.

SDU Paddlecard. The cable carrying the DCOT signal is the bundle 2nd from the right.

I then checked the DCOT signal, and even though the power supply voltages were measuring OK, I was reading 8V on the DCOT pin at the paddleboard. 8V is high for a normal TTL signal (which is normally between 0 and 5V) and this started me wondering. When I disconnected the DCOT wire from the paddleboard, the DCOT signal measured at the power supply was 0V while the signal at the paddleboard remained at 8V… suggesting some sort of failure between the power supply and the SDU for this signal. It also explains the odd 8V reading — it’s likely derived from a 12V source with a pull-up resistor, the expectation being that the DCOT signal from the power supply would normally pull the signal down further into valid TTL range.

But what could have failed here? Clearly the power supply itself thinks things are OK (hence the 0V reading there). The difference in reading at one end versus the other can really only point to a problem in the wiring between the power supply and the SDU paddleboard.

Connectors just above the power supply. Connector on the left carries actual power, connector on the right contains the power supply status signals.

There is a small three-conductor cable that runs from the SDU paddlecard down to a connector just above the power supply (pictured at the right). A second three-conductor cable is plugged into this and runs to the power supply itself. Checking these signals for continuity revealed that none of the three wires were continuous from the SDU back to the power supplies. The cable from the connector to the power supply tested fine — so what happened to the cable that runs from the connector to the SDU?

I pulled out the power supply tray to get a look at the cabling, and one glance below the card cage revealed the answer:

Oh.

“Aw, nut bunnies,” I may have been heard to remark to myself. Those three wires had apparently been ripped from the connector (quite neatly, I might add) the last time I had pushed the power supply drawer back in. (Likely while I was taking pictures of the power supplies for my blog writeups…) Quite how it got caught on the tray I’m not sure.

This was easy enough to fix — the wires were reinserted into the pins, and the cable itself rerouted so it would hopefully never get snagged on the power supply tray again. I reconnected everything, held my breath and flipped The Switch one more time.

[Several long seconds pass…]

SDU Monitor version 8
CMOS RAM invalid
>>

greeted me on the terminal. Yay. Whew.

New SDU Monitor, At Last

OK. So at last I’m back to where I’d started this whole exercise, after an evening of panic and frenzied investigation. What was it I was going to do when I’d started out? Oh yeah, I had these new SDU ROMs all ready to go, let’s put ’em in:

SDU Monitor version 102
>>
>> help
r usage: r [-b][-w][-l] addr[,n]
w usage: w [-b][-w][-l] addr[,n] d
x usage: x [-b][-w][-l] addr[,n]
dev usage: dev
reset usage: reset [-m] [-n] [-b]
enable usage: enable [-x] [-m] [-n]
init usage: init
ttyset usage: ttyset dev
setbaud usage: setbaud portnum baudrate
disktype usage: disktype type heads sectors cyls gap1 gap2 interleave skew secsize badtrk
disksetup usage: disksetup
setdr usage: setdr name file [ptr]

>>

Ah, much better. So now the SDU was functional and upgraded, and I was ready to move on to the next phase: running system diagnostics.

9-Track Mind

The SDU has the capability to run programs off of 9-track tape. This is how an operating system is loaded onto a new disk and it’s how diagnostics are loaded into the system to test the various components. The Lambda uses a Ciprico Tapemaster controller, which is normally hooked up to a Cipher F880 tape drive mounted in the top of the Lambda’s chassis.

Qualstar 1052 9-Track Tape Drive

My Lambda’s F880 was missing when I picked it up, but the Tapemaster should in theory be able to talk to any tape drive with a Pertec interface. I’m still trying to track down an actual F880 drive, but in the meantime I have one potentially compatible drive in my collection — a Qualstar 1052. This was a low-cost, no-frills drive when it was introduced in the late 1980s but it’s simple and well documented and best of all: it has no plastic or rubber parts, so no worries about parts of the transport turning into tar or becoming brittle and breaking off.

It’s also really slow. The drive has no internal buffer so it can’t read ahead, which means that depending on how it’s accessed it may have to “shoeshine” (reverse the tape, then read forward again) the tape frequently. But speed isn’t really what I’m after here — will it work with the Lambda or won’t it?

I have a tape containing diagnostics (previously written on a modern Unix system with a SCSI 9-track drive attached) ready to go. So I cabled up the Qualstar to the Lambda’s Pertec cabling (as pictured in the above photograph) and attempted to load a program from the tape using the “tar” program:

>> /tar/load

The tape shoeshined (shoeshone?) once (yay!) and stopped (boo!), and the SDU spat back:

tape IO error 0xD
>>

Well, that’s better than nothing, but only barely. But what does IO error 0xD mean? The unfortunate reality is that there is little to no documentation available on the SDU or the associated diagnostics. But I do have the Ciprico Tapemaster manual, thanks to bitsavers.org:

Relevant snippet from the Ciprico Tapemaster manual

Error 0xD indicates a data parity error: the data being transmitted over the Pertec cabling isn’t making it from the drive to the Tapemaster intact, so the controller is signalling a problem. The SDU stops the transfer and helpfully provides the relevant error code to us.
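
For the record, “parity” here is the simplest possible error check: the Pertec interface carries a ninth parity line alongside the eight data lines, and (as I understand the spec) uses odd parity, so the nine bits should always contain an odd number of ones. Corrosion that intermittently drops a single bit trips it immediately:

def odd_parity_bit(byte: int) -> int:
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 else 1      # make the total count of ones odd

# A single flipped data bit changes the expected parity, so the
# controller can detect it:
assert odd_parity_bit(0x42) != odd_parity_bit(0x42 ^ 0x04)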

So where are the parity errors coming from? It could be a controller fault but given this system’s history I decided to take a closer look at the cabling first. A Pertec tape drive is connected to the controller via two 50-pin ribbon cables designated “P1” and “P2.” While I’d previously checked the cables for damage, I hadn’t actually checked the edge connectors at the ends of the cables, and well, there you go:

Crusty Connectors
It’s cleaner now, trust me.

It’s a bit difficult to discern in the above picture but if you look closely at the gold contacts you can see that there’s greenish-white corrosion on many of them. Dollars to donuts that this is the problem. For cleaning out edge connectors like this, I’ll usually spray the insides with contact cleaner and then, to apply a bit of abrasion to the pins, I wipe a thin piece of cardboard soaked in isopropyl alcohol in and out of the slot. I used this technique here and pulled out a good quantity of crud and dirt, leaving the connector nice and clean. Or at least clean enough to function, I hoped. Rinse and repeat for the second Pertec cable and let’s try this again:

>> /tar/load

And the tape shoeshines once… and shoeshines again… and again… hm. Is it actually reading anything or is there some other problem and it’s just reading the same block over and over? Let’s let it run for a bit…

A graphic portrayal of tape shoeshining!
>> /tar/load
no memory in main bus
Initializing SDU
 
SDU Monitor version 102
>>

No more parity errors, and the “load” program did eventually load. It then complained about a lack of memory. It looks like the tape drive, the cable, and the controller all work! (Thanks to the Qualstar’s slowness, it took about five minutes between the “/tar/load” and the “no memory in main bus” error, so this is going to be a time-consuming diagnostic process going forward.)

The “no memory in main bus” error is not unexpected since at that moment the only boards installed in the Lambda’s backplane were the SDU and the tape controller. I have a few memory boards at my disposal, and I opted to re-install the 4mb memory board that normally resides in slot 9. Let’s run that again:

>> /tar/load
no memory in main bus
Initializing SDU

SDU Monitor version 102
>>

Well, hm. Maybe that memory board doesn’t work — let’s try the 16mb board normally in slot 12:

>> /tar/load
using 220K in slot 12
load version 307
Disk unit 0 is not ready.

/tar/loadbin exiting
Initializing SDU

SDU Monitor version 102
>>

Huzzah! The LMI has memory that works well enough for the SDU to use, and it has a functional tape subsystem. It’s going to be a while before I have a functioning disk, and as the error message above indicates, /tar/load expects one to be present. This is completely rational, since “load” is the program used to copy Lisp load bands from tape onto the disk.

That’s enough for now — in the next installment, since the Lambda is now capable of loading diagnostics from tape, we’ll actually run some diagnostics! Thrills! Chills! Indecipherable hexadecimal sludge! See you next time!

At Home With Josh Part 4: High-Resolution Terminal Restoration

In my previous installment I tested the Lambda’s fans and the power supply and powered things up for the first time. A few of the fans were non-functional even after cleaning and lubrication, so an eBay order was placed. While waiting for those fans to arrive, I started taking a look at the Lambda’s monitors, referred to in the documentation variously as “High-Resolution Terminals” or “High-Resolution Monitors.” Whatever they’re called, they were in need of a bit of sprucing up:

LMI Lambda monitors, mid-cleaning
If you look closely you can see the scarring on the picture tube’s face.

I cleaned the exterior with a bit of Simple Green and some liberally applied Magic Eraser to get the grungier parts off. Exposure to the elements had left some interesting etchings on the anti-glare coating on the CRT; I’m not sure if the elements ate the coating away or just deposited a thin layer of something on the surface. Either way, light scrubbing with the Magic Eraser either removed the deposits or removed the rest of the anti-glare coating to match; it’s difficult to say. Eventually the external dirt and grime were gone and the monitors looked much better.

Shiny Happy Monitor

One of the two monitors has a CRT with “cataracts” (also referred to as “CRT Rot”) in the corners. This is a problem that plagues older televisions and monitors and is caused by degradation of the thin PVA glue layer between the front of the CRT glass and the implosion-protection lens. Over time, the PVA breaks down causing small spots to appear. The cataracts here are relatively minor; on an ADM-3A terminal I recently repaired the PVA breakdown was so extreme it had started leaking out onto the circuit boards and was an absolute bear to clean up (fortunately it’s organic so it washes off with water, but not without a fight.)

Close-up of CRT cataracts

On some CRTs this can be repaired, typically by carefully separating the implosion lens from the rest of the CRT, cleaning all the PVA residue and reassembling. (Here’s an interesting write-up of one such process for old TV picture tubes.) On the Lambda’s CRTs, this is made much more difficult — there is a metal band around the tube with a “lip” that extends around the front of the tube, helping to hold the whole assembly in place. This band is glued in place with a potting compound making removal of this band extremely difficult; and due to the lip the implosion lens cannot be removed without removing this band. Fortunately the cataracts on this tube are not bad enough to warrant attempting to do this — I’m happy to put up with it — and the other monitor’s tube is free of cataracts, so far.

Inspecting the Internals

Much like with the rest of the Lambda system, we have to give the internals a thorough inspection. One of these monitors was left on top of the Lambda in the garage; the other (the one with the cataracts) was on the floor near the door and was exposed to a slightly harsher environment as a result. However, they both cleaned up very nicely on the outside so my expectation was that internally they’d be similar as well.

The interior of the monitor with the back covers removed.

Looking at the interior from the rear (as in the above photos) reveals a relatively clean monitor — though you can see some obvious rust in places like the ground strap going across the bell of the picture tube. The interior of the other monitor is very similar in terms of condition. On the left side is the monitor’s power supply, on the right is the deflection board which scans the CRT’s electron beam across the screen to form a raster, and in the middle is the “neck board”, so called because it plugs into the neck of the CRT. It supplies power to the CRT’s heaters and takes the incoming video signal from the Lambda and feeds it to the tube appropriately.

Safety First, People:

It’s worth pausing here to emphasize safety when working on CRTs: they tend to make use of extremely high voltages (5-10KV in monochrome tubes, up to 25KV in color sets) and you can get zapped if you’re not careful. Picture tubes can build up a charge even while sitting unplugged and unused, so even though this tube hasn’t been powered up in a couple of decades it still has the potential to bite. Discharging the tube before working on it is a good idea, as is working with one hand behind your back (to avoid current flow across your heart, should you inadvertently grab ground with one hand and 20KV with the other.)

The CRT envelope is made of glass and contains a powerful vacuum; if the glass breaks the tube can potentially implode — sending glass shards everywhere. While modern tubes (like the ones in the Lambda) have implosion protection measures in place, it never hurts to be careful around large tubes like this: watch your hands, watch your tools and make sure they don’t strike the neck of the tube where the glass is thinnest and the most likely to take damage.

The Inspection Continues:

Looking closer at the power supply you can get a better idea of the cleanup necessary here — everything is covered in a layer of dirt and shingle detritus from when the garage’s roof was replaced. Just as with the Lambda’s chassis and power supplies, I’m looking for out-of-place things and broken or damaged components. All three of these boards contain socketed chips, so checking the sockets and the ICs in them for corrosion is important. I’m also keeping my eyes open for damaged capacitors. Monitors can be hard on capacitors, especially high-resolution ones like these: they don’t typically have fans, so they run hot, and heat shortens the lifespan of internal components.

And sure enough I found my first victims on the power supply board.

RIFA film capacitors, top view.
Exploded RIFA, from the side

These are film capacitors, used as AC line filters in the power supply. Or at least they were film capacitors — as you can see, the casings have cracked and split and have turned a deep brown in places (they’re normally golden-yellow). These were manufactured by RIFA and are absolutely notorious for failing in this way; when they do fail they emit an unforgettable odor, though not an entirely bad one (we’ll get to those smells later). Kinda like burning paper. Which is not a coincidence, because they’re made of metallized paper: as they age, moisture seeps in and eventually causes a short circuit, resulting in smoke but not usually fire. (There was this one time at the museum when one of these died in action and set off the smoke detectors and the fire department came. That was a fun day…)

Even if they haven’t already clearly failed as these have, they should be replaced as a matter of course, because they will fail if you don’t. Probably within the first thirty minutes of being powered up.

Original RIFA next to its brand-new replacement.

Moving along to the deflection board: there are a few socketed chips, and the sockets don’t look so hot. These sockets have deeply recessed pins, and my suspicion is that as a result they hold onto moisture longer, increasing the chances of corrosion. As you can see in the picture below, some of the pins show the original gold plating, while others are green or grey. A socket in this state is likely to make poor contact with the IC, so I replaced it with a spare I had on hand, a nice turned-pin socket from Mill-Max:

On this same board I found the first instance in this restoration of a visibly-bad electrolytic capacitor:

That capacitor is supposed to be a uniform silver in color. It is browned and blackened, likely from heat generated in operation by the neighboring transformer, and it might have been a slightly under-specced part as well. Instant candidate for replacement, no questions asked.

On the neck board we find another kind of capacitor that can often cause issues; look closely at the four blue raindrop-shaped components in the below picture:

One of these things is not like the others.

Well, they’re all supposed to be blue, but the second one from the left is black, and sure enough it’s a dead short rather than a capacitor. These are tantalum capacitors and they have a tendency to explode in a tiny little fireball when they go bad — and they can scorch other components when they do so. And the smell they make is decidedly unpleasant. Given the state of the black one it seemed prudent to replace all four just to be on the safe side. It takes a long time to get that odor out of an already stuffy basement, and I’m not taking any chances.

There is one further board in these monitors, called the “headboard” — it lives in the monitor stand and breaks out the signals on the cable from the Lambda into keyboard, mouse, and video. It also includes a tiny speaker and three controls for brightness, contrast, and volume:

Ugh. Just, ugh.

The one in the monitor that had been sitting on top of the Lambda was just a bit dusty, but the one that’d been on the floor… yow. Some serious insect activity in here over the years, and everything was pretty well covered in… insect stuff. I took the board out of the housing and scrubbed the base-plate down in the utility sink. I went over the PCB with a soapy toothbrush and Q-Tips to get as much gunk off as possible. It cleaned up pretty well!

Ahh, much better.

Having assessed the condition of the boards (and having gone through and cleaned everything as thoroughly as possible), I made the decision to do a complete “re-cap” of the three main boards in both monitors: a replacement of all of the electrolytic capacitors and the problematic-looking tantalums. I placed an order for replacement parts (I tend to use Mouser or Digi-Key for this sort of thing) and 3-5 days later a box of capacitors arrived on my doorstep.

Replaced tantalum capacitors on the neck board

At this point it’s a straightforward matter: desolder the old components, and solder in the new ones, one at a time. I have a Hakko desoldering iron (just like the ones we use at work) and a Weller soldering station that have served me well over the years. I didn’t take any pictures of the actual desoldering/resoldering process because I only have two hands and I don’t own a tripod… I’m lame.

All the replaced capacitors from the power supply board, next to the re-capped supply. On a really ugly benchtop.
Woo-hoo!

With everything reassembled in the first monitor, the only thing left to do was to put it on the bench, plug it in, cross my fingers, and turn it on. I wasn’t entirely sure it would do anything without being hooked up to a running Lambda with functioning video hardware: some monitors of this era won’t light up unless they’re getting sync pulses from their video input (Sun-3 monochrome workstation monitors, for example), while others will display a “free-running” blank raster instead. The Lambda console turns out to be one of the latter:

I got very lucky and things appeared to be working as perfectly as could be determined without a valid video signal to feed it. I let it burn in on the bench for a half an hour and no issues arose. If you’ve accidentally put an electrolytic capacitor in backwards, you’ll know within the first few minutes, if not sooner… (another fun smell you don’t want in your house.)

The next day I took on the second console, going through exactly the same steps — like deja vu all over again. However, I wasn’t as lucky with this one: no smoke or fire, but also no action on the display at all, no faint chatter from the yoke indicating deflection, no static on the face of the tube indicating the presence of high voltage. The neck of the picture tube lit up, however, so at least a few things were functional. The voltages coming out of the power supply (it generates +48V and +32V) were in the right ballpark at +45 and +33. There is a potentiometer on the power supply to adjust these voltages, so I gave it a small tweak to get closer to +48V, and at that point I heard the HV kick into gear. I don’t understand why: the voltages were a little off, but not enough to prevent the deflection board from running, and I’d only tweaked it up to +46V anyway. This seems like a sign of a bad connection: a loose wire, a dirty connector, or maybe a cold solder joint. At that point I had high voltage and could hear evidence of deflection, but there was still nothing on the display, no free-running raster like on the first monitor.

I powered it down and took a closer look at everything; cleaned the various cables and connectors on the power supply and inspected my soldering job — still nothing jumped out at me as being obviously wrong. But I put it back together and it was still working as before, deflection running and high voltages being generated, though I was still getting nothing on the display at all. At this point I needed a break and decided to shelve the second monitor for the time being. One working display was enough to use with the Lambda (assuming I ever did get it to do anything) and at that point I’d return to debugging the other.

In my next write-up I’ll see if I can get the Lambda to load and run diagnostics from the world’s slowest 9-track tape drive, after dealing with a minor setback. The anticipation, you can hardly stand it!

At Home with Josh, Part 3: Power Supply Testing and Initial LMI Lambda Power-up

Last time around I went through the cleaning and inspection of the Lambda. Overall, apart from a few errant screws and a faint musty odor, things looked pretty good. We’re inching closer to the point where we can power this thing on and see what it does, but there are a few things left to go over before we can get there.

Power Supplies!

We haven’t yet looked at the power supplies in the Lambda beyond verifying that mice haven’t eaten all the wires away. LMI made this task pretty easy, and I want to thank the person who designed the chassis: the whole power supply assembly is on rack-mount slides and just pulls right out of the rear of the cabinet like so:

The LMI Lambda Power tray, inside the cabinet.
Disconnect the two cables and slide it right out. Magic!

The fan tray in the rear is normally situated right below the card cage, and serves to keep the logic well-ventilated. There are two power supplies mounted in the front of the tray. The narrow ACDC Electronics supply on the left provides +/-5V and +/-12V to the backplane, and the large blue LH Research supply on the right provides +5V at 150 Amps. That’s a lot of power, and it’s used to run the majority of the logic in the system. The smaller supply provides power to run the ECL components for the high-resolution terminal interface, the RS-232 drivers for the console ports, and other odds and ends.

The offending screw.

As with the card cage inspection, it’s important to go through both of these supplies with a fine-toothed comb looking for damage and for things that don’t belong inside power supplies. For example: this screw, which fell out of the smaller supply as I was opening it up to take a closer look.

And like our misplaced screw from the last entry I have no idea how it ended up in here, but there it was. Had this supply been powered up with that screw in place it could have shorted something out and done serious damage.

What I usually look for in a visual inspection of a power supply are obviously bad parts: bulging electrolytic capacitors, charred tantalum capacitors or transistors, burned traces, things broken off, etc. Passing a visual inspection by no means indicates the supply will work — many parts can (and often do) fail invisibly. But visibly broken parts obviously won’t work and so it’s a good starting point.

Unfortunately, I did my power supply inspection just before I decided to start thoroughly documenting the restoration process, so I don’t have any detailed photos of the insides of these supplies as I was examining and testing them (and it’s enough work to remove them again that I’m not taking them back out now just for pictures; I apologize for my laziness). Suffice it to say, apart from that screw, nothing out of the ordinary was found and everything looked much cleaner than I expected — no corrosion or signs of damage of any kind.

It’s at this point where I usually debate with myself whether to just preemptively replace the electrolytic capacitors in the supplies. Shotgun replacement of caps isn’t always a good idea (and I suspect there are engineers out there who will take umbrage at my even suggesting such an approach), but for a supply of this age, and for one that’s sat in sub-optimal conditions (cold, dry Pennsylvania winters and hot, humid summers for 20+ years), there’s a good chance that the capacitors have dried out and gone out of spec. At LCM+L we typically go one step further than capacitor replacement: since our goal is to run many of our systems 24×7 (or at least during museum hours), we will often bypass the original supplies and retrofit more efficient and reliable modern supplies (this is usually done alongside the originals so that the system can be returned to its original configuration if need be). I don’t have the budget for that option (150A 5V supplies are expensive), so I’m sticking with the original supplies.

I also figured, what the heck, let’s test the supplies with the original capacitors and see what happens. This is done by hooking up a “dummy” load to the power supply — switching supplies don’t like being powered up without a load to power — and measuring voltages and testing the supplies for ripple. Ripple is a deviation from a nice, flat DC voltage, and an excessive amount of it (more than 50-100mV, typically) indicates trouble in the power supply: bad smoothing capacitors, dead rectifiers, or failed transistors. The exact effects and causes differ depending on the type of power supply, but it’s never a good thing to have.
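To put rough numbers on that relationship, here’s a back-of-envelope sketch in Python with purely illustrative values (not measurements from these supplies). The peak-to-peak ripple across a filter capacitor is approximately the load current divided by the product of the recharge frequency and the capacitance, which is why a dried-out cap with diminished capacitance shows up directly as increased ripple:

# Rough peak-to-peak ripple across a filter capacitor:
#   Vpp ~= I_load / (f_recharge * C)
def ripple_vpp(i_load_amps, f_recharge_hz, c_farads):
    return i_load_amps / (f_recharge_hz * c_farads)

# 10 A from a 10,000 uF cap recharged at 120 Hz (full-wave rectified 60 Hz mains):
print(ripple_vpp(10, 120, 0.01))     # ~8.3 V -- why regulation follows the raw filter
# The same cap recharged at a switcher's 50 kHz chopping frequency:
print(ripple_vpp(10, 50_000, 0.01))  # ~0.02 V -- why switchers need far less capacitance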

Again I lament my lack of foresight as far as taking pictures of this portion of the restoration. On the positive side: everything tested out fine. All voltages were present and working under load within specifications. I let the supplies run for an hour or so. No funny smells were emitted, and the magic smoke remained safely ensconced in the supplies. I may still end up replacing the capacitors in the supplies at some point in the future, but for the time being I’m leaving well-enough alone.

Fans!

Dirty, filthy, ugly, naughty fan…

The importance of cooling in your average computer system cannot be overstated, and thus it is vital to ensure that all the fans are spinning freely and actually moving air around. A closeup shot of one of the fans in the fan tray pre-restoration is to the right. You can see how much crud and rust has accumulated on it over the years. Of the six fans in the tray, two of them spin freely, and the others make a noise not unlike a kazoo when given a spin. However, these are well-made fans and the three exposed screws on the underside there indicate that they were probably made to be serviced — it should be possible to disassemble, clean, and lubricate them.

Sure enough, they come right apart. The major thing to keep track of is the circlip that holds the fan blade rotor onto the shaft, as well as the numerous washers involved. Cleaning off the bearing shaft and applying some light machine oil to it and to the felt washers is all that’s required to make one of these spin freely again; I also took the time to clean the fan blades as thoroughly as possible. They’re never going to look like new again, but at least they’re not dirty anymore.

Go on, give it a good spin!
eBay photo of the replacement fan

After reassembly, I applied power to the fans and four out of the six worked just fine — they made no appreciable noise and they spun at the right speed, moving a lot of air. The other two spun up very slowly even with help and never reached the proper speed. I suspect that the windings in these motors have been damaged (possibly while in service years ago). These two fans will need to be replaced. I was able to find an exact replacement on eBay, new-old stock. You can find just about anything on eBay.

First Power-up

The power supplies are tested and seem to be working, and enough of the fans are spinning so as to keep things cool at least for a little while — let’s power this sucker up and see what happens.

As discussed in the last write-up, the System Diagnostic Unit (SDU) is the nexus of the Lambda: it bridges the two buses in the backplane and is responsible for booting the operating system. It also provides a diagnostic console over its RS-232 serial interface, which is what I’ll be talking to a lot in the coming weeks. For the initial power-up the only board I will have installed in the backplane is the SDU. This will confirm the functionality of the power supplies, wiring, the backplane and hopefully the SDU itself.

I pulled the other boards out of the backplane, leaving them in the slots but pulled out so they are disconnected, and wired up my trusty Qume dumb-terminal to the serial port marked “Remote” on the rear bulkhead and configured it to 9600 baud, 8 bits, 1 stop bit, no parity. I plugged the power cable into the wall, crossed my fingers, and flipped The Switch.
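(An aside: if you don’t happen to have a dumb terminal on hand, any modern machine with a USB serial adapter will do the same job. Here’s a minimal sketch using Python and the pyserial library; the device path is a placeholder for whatever your adapter enumerates as.)

import serial  # pyserial

# Same settings as the Qume: 9600 baud, 8 data bits, no parity, 1 stop bit.
# '/dev/ttyUSB0' is a placeholder for whatever your adapter shows up as.
port = serial.Serial('/dev/ttyUSB0', baudrate=9600,
                     bytesize=serial.EIGHTBITS,
                     parity=serial.PARITY_NONE,
                     stopbits=serial.STOPBITS_ONE,
                     timeout=1)
port.write(b'\r')                                      # nudge the SDU monitor
print(port.read(256).decode('ascii', errors='replace'))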

The Switch.
Serial cabling on the rear bulkhead panel
It’s difficult to make out in this picture, but all three LEDs are stuck on. Not what I wanted to see.

Fans spun, and the LEDs on the front panel and the SDU itself came on. No smoke! But also nothing on the terminal, and all three lights on the front panel were on. This indicates a fault — under normal operation the LEDs should progress through a pattern and then end after a few seconds with just the RUN light on (and probably the SET UP light as well; that one indicates that the battery-backed settings have been erased, which was expected since I’d pulled the long-since-dead battery out). After consulting with Daniel Seagraves, I learned that if all three lights are stuck on, the SDU isn’t passing its initial round of self-tests. This could be caused by any number of things: bad RAM or EPROM, a clobbered CPU bus, or the RESET signal to the CPU being stuck on.
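For my own notes I boiled the front-panel states down to something like the following hypothetical decoder. The meanings come from this post and Daniel’s explanation, not from any manual, and the third LED’s identity is a guess:

# Front-panel LED states as I currently understand them (no manual here).
def panel_state(run, set_up, third):
    if run and set_up and third:
        return "all three stuck on: SDU failed its initial self-tests"
    if run and set_up:
        return "normal, but battery-backed settings erased (expected here)"
    if run:
        return "normal"
    return "no activity: check the supplies and SDU seating"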

I rechecked power supply voltages and they measured fine. I pulled the SDU out and re-examined the pins on all of the socketed chips and found that a few pins on the 8088 CPU were still pretty grungy (sloppy work on my part during my earlier cleaning/inspection pass, I suppose) so I went over them again.

Powered up the system with the re-cleaned SDU installed and… hey! After a few seconds, just the RUN and SET UP lights were on. Looks like I got lucky here. Still nothing on the terminal, though. Hm.

Wankel-Rotary Switch

I consulted further with Daniel Seagraves and he suggested checking the rotary selector switch on the rear of the cabinet; this selects one of several actions when the system is powered on or reset. Normally it should be at “0” to force the monitor console onto the serial port, but depending on the revision of the SDU’s ROM monitor, it might want to be at “1” instead. I turned the switch to “1”, turned the system on and:

Success!

Alright! Now we’re cookin’ with gas. The SDU is talking to me at last, the power supplies are working acceptably, and the faint musty odor of the air being wafted at me out of the cabinet by the chassis fans smells like victory.

My plan now is to hunt down a suitable 9-track tape drive so that I can use it to load diagnostics into the system and test the various components in the system. While that’s going on, I’m going to take a look at the Lambda’s High-Resolution Terminals (aka “monitors”) and see what needs to be done to make them work again. Stay tuned for the next exciting installment!

At Home With Josh, Part 2: Lambda Cleaning and Inspection

As mentioned in my last post, the LMI Lambda I acquired spent most of the last two decades in a garage somewhere in Pennsylvania. It wasn’t a climate-controlled garage. It wasn’t a leak-proof or a rodent-proof garage. The door to said garage wouldn’t even close all the way. So this machine was in rough condition when I picked it up. The first thing I did before bringing it in the house was to clean it out, thoroughly.

The Lambda, just outside the garage.

Unfortunately I neglected to take many “before” photos (at the time I hadn’t yet considered documenting this whole process), but they’d only really have shown a cabinet covered in a thin layer of grit and grime.

The outside and inside walls of the cabinet were cleaned off with some Simple Green and some elbow grease, and while this was underway I inspected the interior for rodent-related damage.

The Lambda after cleaning, sitting in its new home in my basement. Still kind of rusty, but smelling less musty.

Fortunately, apart from the hole in the door near the top (which is hard for small animals to reach) the Lambda’s cabinet is free of any holes large enough for a mouse to squeeze into and no damage was observed. This is good: mice love to chew up wires, and their urine and droppings can eat the traces off of printed circuit boards and corrode the legs on ICs and other components. While a lot of this sort of damage can eventually be repaired, it’s difficult and messy (and smelly) work to deal with.

Removal of Batteries

After a thorough wash-down of the cabinet, things were smelling better so I had a good look at the card cage and backplane. Daniel Seagraves alerted me to the fact that the SDU (System Diagnostic Unit) contains a soldered-on nickel-cadmium battery, and that’s where I started my inspection.

Anyone who’s worked on any of a variety of old computers knows that batteries are bad news. Tiny cells soldered to motherboards have been the death of many a PC, Amiga and VAXStation (and they’re not kind to arcade games either.) Over time, the batteries leak or outgas — and what comes out of them eats away at circuit boards and dissolves traces. So as I pulled the SDU out I kept my fingers crossed that the damage would be minimal — the SDU is one board I don’t have spares for. And as luck would have it, even though this thing had been sitting in a rather unforgiving environment for a long time, the battery hadn’t leaked at all:

Intact NiCad battery on the SDU. Got very lucky here.

Whew. I got out a pair of wirecutters and immediately removed it from the board. I’m not taking any chances, and this battery will need to be replaced anyway. The rest of the SDU was in decent shape — some rust on some IC legs, and one rather rusty looking clock crystal but overall not too shabby.

Inspecting the Card Cage

When inspecting a cardcage like this, generally it’s important to make sure that everything’s in its right place: in addition to making sure that nothing’s missing, some slots may have designated purposes (where having the wrong board in the wrong slot can cause catastrophic damage) and some buses don’t like having empty slots in the wrong places. (Some backplanes don’t even have the concept of a bus at all, and each slot is hard-wired for a specific board.)

The Lambda’s cardcage as-received, still sitting on the loading dock.

Additionally, when inspecting a cardcage in a system that’s been in an unfriendly environment it’s a good idea to carefully inspect everything for damage or foreign matter up front so you know what you’re going to be up against. It’s also a good time to do minor cleanup of corroded pins on chips that are socketed.

This is also a good time to start collecting as much information and documentation about the system as possible — this will be required in order to assess what’s missing and what goes where, and it will be essential if repairs need to be made. Bitsavers is always a good place to start.

First things first: let’s see what we have in the system. The Field Service Manual is a good reference to have handy while doing this, although it’s slightly outdated in some areas. The backplane is divided into sections, and slots are numbered from right to left, starting at zero. The first 8 slots are NuBus slots with a processor interconnect for the Lambda CPUs (there are two sets of four boards), and these slots are specifically designated for those processor boards. Slots 8-14 are general-purpose NuBus slots, slot 15 is for the SDU, and the remaining slots are Multibus slots for peripheral controllers (disk, tape, and Ethernet, typically). Here’s a handy table showing what’s where in my system:

Slot  Board
0     RG (CPU 0): “ReGisters Board”
1     CM (CPU 0): “Control Memory”
2     MI (CPU 0): “Memory Interface”
3     DP (CPU 0): “Data Paths”
4     RG (CPU 1): (as above)
5     CM (CPU 1)
6     MI (CPU 1)
7     DP (CPU 1)
8     VCMEM (0): High-resolution terminal (keyboard, mouse, display) interface for CPU 0
9     4MB Memory: Memory for the Lisp processors
10    VCMEM (1): High-resolution terminal (keyboard, mouse, display) interface for CPU 1
11    CPU: 68010 processor for UNIX
12    16MB Memory: Memory for the Lisp processors
13    Empty
14    Empty
15    SDU: System Diagnostic Unit
16    Disk: Multibus SMD Disk Controller
17    Tape: Multibus Pertec Tape Controller
18    Empty
19    Empty
20    Memory: TI-Manufactured 4MB Board
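Because those slot ranges matter later in this post, here’s the same layout condensed into a quick Python sketch (the ranges come straight from the table above):

# Backplane bus type by slot number, per the table above.
def bus_for_slot(slot):
    if 0 <= slot <= 7:
        return "NuBus (dedicated Lambda CPU slots)"
    if 8 <= slot <= 14:
        return "NuBus (general purpose)"
    if slot == 15:
        return "SDU"
    if 16 <= slot <= 20:
        return "Multibus"
    raise ValueError("no such slot")

print(bus_for_slot(20))  # -> Multibus; keep that in mind for later...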

The RG, CM, MI, and DP boards together contain the logic for the Lambda’s Lisp processor, one specially designed to run Lisp code efficiently.

Lambda CPU: RG “Registers” board

This system is outfitted with two sets of these boards. This was a way to make the system more cost (and space) effective: A single Lambda could run two processors and two consoles (and thus service two simultaneous users) while sharing a single disk. Since Lisp Machines were typically single-user machines, reducing the overall cost and footprint in this way would make the LMI more attractive to buyers concerned about investment in very expensive computers. (Symbolics did something similar with their 3653 workstation: it contained three 3650 Lisp CPUs in a single chassis.)

Lambda CPU: MI “Memory Interface” board.

The logic in the Lambda is built entirely from 7400-series TTL (Transistor-Transistor Logic) parts along with a few PALs (Programmable Array Logic) and EPROMs (Erasable-Programmable Read-Only Memory). The most complicated IC involved is the 74181 ALU used on the Data Paths board. The 74181 was a 4-bit ALU used in many processors of the time, though it was a bit outdated by the time of the Lambda (the Xerox Alto used them way back in 1973, for example).
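To give a sense of what an array of 4-bit slices on the DP board is doing, here’s a toy Python sketch of the cascading idea. It models only ripple-carry addition; the real 74181 offers dozens of selectable logic and arithmetic functions, and real designs typically pair it with 74182 carry-lookahead chips rather than rippling the carry:

# Toy model of chaining 4-bit ALU slices into a wider datapath.
def slice_add(a4, b4, carry_in):
    # One 4-bit slice: returns (4-bit result, carry out).
    total = a4 + b4 + carry_in
    return total & 0xF, total >> 4

def wide_add(a, b, nibbles=8):
    # Ripple the carry through `nibbles` cascaded slices.
    result, carry = 0, 0
    for i in range(nibbles):
        s, carry = slice_add((a >> 4 * i) & 0xF, (b >> 4 * i) & 0xF, carry)
        result |= s << 4 * i
    return result, carry

print(hex(wide_add(0x12345678, 0x11111111)[0]))  # -> 0x23456789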

Lambda CPU: DP “Data Paths” board – note the 3×3 array of 74S181 ALU chips.

That’s good news — TTL, PAL and EPROM parts are still readily available so they can be replaced if they fail. The trick, of course, is being able to debug the hardware to the point where the bad chips can be accurately identified. At the moment, circuit schematics are not available so this will likely be a challenge.

Lambda CPU: CM “Control Memory” board
The Motorola 68010-based UNIX CPU Board, for when you want to contaminate the beauty of your Lisp Machine with UNIX.

The “CPU” board in slot 11 contains a Motorola 68010 processor and allowed the Lambda to run UNIX alongside Lisp. This configuration, with two CPUs and the 68010 UNIX board, was referred to as the 2×2/PLUS. (See the exciting product announcement here!)

The VCMEM boards each drive a single high-resolution terminal — a black-and-white monitor that the keyboard and mouse plug into, connected to the Lambda by a very long cable. (The terminals typically lived far away from the Lambda itself, since no one would want to work very close to a machine this loud.) I got two consoles with the machine, but only one keyboard and mouse, alas.

VCMEM board — framebuffer memory and display, keyboard, and mouse controller
16MB of memory

The memory boards provide memory for the Lisp processors to use. Each processor can address up to 16MB of memory.

System Diagnostic Unit

The SDU sits in the middle of the cardcage and acts as a middle-man, so to speak: it negotiates communication between the NuBus boards on its right and the Multibus boards on its left. Additionally, it is responsible for booting the system and can be used to run diagnostics, to format drives, and to reinstall the operating system from tape. It is based around the Intel 8088 CPU.

Multibus Interphase SMD 2181 Drive Controller, in carrier.

The Multibus controllers sit in special carriers that adapt the boards to the form-factor that the Lambda’s card cage uses. The disk, tape, and ethernet controllers were off-the-shelf parts made by other manufacturers: 3Com for the Ethernet controller, Ciprico for the tape controller, and Interphase for the disk controller.

From the looks of things, everything is in the right slot, except for that TI Memory board in slot 20. That’s a Multibus slot and the Lisp memory boards won’t work there at all (and in fact could end up getting damaged if installed there when power is applied). Someone likely put it there to keep the board safe while in storage or in transit. I removed that board and put it in a static-proof bag for safekeeping.

The misplaced Texas Instruments-manufactured memory board, easily distinguishable by its use of surface-mount rather than through-hole parts.

Board Inspection and Cleaning

While going through the card cage I took every board out for inspection and cleaning. Things to look for are severe amounts of dirt, dust, rust or corrosion, physically broken or burned components, damaged circuit board traces, bent pins, and other things that don’t belong on a circuit board.

Socketed chips are good to look at closely: the sockets and the pins on the ICs are more prone to corrosion in my experience (I’m not entirely sure why — some of it is due to being more exposed to the elements than soldered-in chips, and moisture may linger in the sockets longer than elsewhere; it may also be partially due to reactions between the different metals used in the sockets and the IC pins). On these boards it was not at all uncommon to find chips like this:

Fairly corroded legs on an Intel 8089 I/O Coprocessor

Light oxidation manifests as a dull grey patina on IC legs and socket connectors; corrosion as more obvious orange or brown rust or green verdigris. Any of these will cause poor connection with the sockets and must be eliminated. Oxidation and corrosion can usually be removed with some fine-grit sandpaper or gentle scraping with a sharp X-Acto knife blade. Sometimes such cleaning will reveal legs that have corroded away so much that they fall off — which is always disappointing to find. This fortunately has not happened to any of the ICs in the Lambda so far.

Occasionally you run across something like this:

IC with a bent leg found as-is in a socket!

This is more common than you might think — the pin got bent over when placed in the socket when the board was manufactured, but still made good contact with the top of the socket pin so the fault was never discovered. Over time, this can cause intermittent failures. Luckily it’s pretty easy to bend these back without breaking them.

And then sometimes you find something completely inexplicable:

How’d that screw get there?

I have no idea how or when this happened, but that screw was very firmly wedged in-between those two ICs. Given that the boards are vertically oriented in the chassis, it seems really unlikely that a screw fell in the top and managed to get stuck in there. Very strange. Also not good for operational integrity. Note also the grey oxidized pins on the blue socket in the lower right of that picture. When new, those were shiny and chrome. Luckily that’s a spare socket and is unused.

Inspecting the Rear Chassis

The logic boards plug into the front of the chassis. Around back, there’s another set of connectors into which small “paddle cards” plug, breaking out signals into cables that connect to peripherals and run to bulkhead connectors like these:

The back of the CPU card cage. Some light pitting on the aluminum of the rear panel.

That rear panel hinges outward and reveals the rear side of the Lambda’s backplane:

Behind the rear panel

You can see how the backplane is segmented here: on the top, running from right to left, you have the NuBus portion, which ends on the left with the NuBus Terminator paddle card. (There is a similar terminator on the right side as well.) On the lower left you can see two sets of four-slot backplanes for the CPU interconnects. To the right of those are the paddle cards for the VCMEM slots: these bring out the connections for the high-resolution terminals, which are cabled to DB-25 connectors on the bulkhead panels. Further to the right you can just make out the SDU paddle card. Everything to the right of the SDU slot is Multibus — the middle section of these slots carries the Multibus signals, while the top and bottom sections carry device-specific signals for disk, tape, and ethernet cabling.

The important thing to check here is that all the paddle cards are present, in the right places, and are in decent shape. This appears to be the case. All the cabling looks to be intact as well. Good news all around.

Conclusion

After a lot of careful inspection and cleaning, the Lambda’s chassis looks to be in solid shape. In my next installment I’ll go over the power supplies and fans, and do the first power-up of the system. See you then!

At home with Josh

A Tour of my Workspace

Like many of you, I’ve been spending most of my time at home these days. While the museum is closed to the public, the engineers and other awesome staff are hard at work planning future exhibits and tours, polishing up our on-line systems, and working on hardware and software projects from the comfort of their couches. Or in my case, a workbench in the basement.

While I’ve been working from home, I’ve been hacking on a bit of emulation on the Unibone, archiving a huge pile of 8″ floppy disks, and doing some preparatory research for a future emulation project. But I thought some of you might be interested in what I’ve been working on in my spare time, which, as it turns out, is mostly identical to what I would usually be doing in my day-to-day when on site at the museum.

I have what might charitably be called a “hoarding problem” and as such I have ended up with a tiny computer museum in my basement that I’ve been adding to for the past twenty or thirty years. What started as a couple of Commodore 64s and TRS-80s (and so, so, so many TI-99/4As) scavenged from garage sales for $5 a pop when I was eleven has grown to encompass micros, minis and a few supercomputers crammed into an ever-shrinking space in my basement. Over the years I’ve learned to restore and maintain these systems, and since starting work as an engineer at LCM+L, I’ve gained even more knowledge but I still have a lot more to learn. In my spare time I tinker on machines in my collection, getting them to work again and making them do stupid/clever things.

So, here’s a brief tour of my basement museum, after which I’ll introduce you to my current restoration project.

So much computer, so little space. Note the fine wood-grain paneling on the walls. No basement should be without it.

Starting on the left-hand end of the west wall, we have an AT&T 3B2/600G UNIX system (1986) on top of a DEC PDP-11/73 (1983) and MicroVAX-I (1984). A classic LSI ADM-3A terminal adorns the 3B2. The PDP-11/73 was the first DEC system (and first minicomputer, even though DEC referred to it as a micro) I acquired, when I was still in high school. At the time it was running Micro-RSX, now it runs 2.11BSD. The large system to the right is a DEC VAX-11/750 (1980) with TU80 9-track drive. The 750 is restored and when I need to heat up the room, it runs 4.3bsd-Quasijarus or VMS. There’s currently a Data General Dasher D200 terminal on top of the TU80; it’s awaiting the completion of an Eclipse S/230 system.

Even More Computers

Moving further to the right, the DEC and Data General equipment continues: In the left rack I have a 1971 Data General Nova 820 (well, it’s a Canadian “DataGen” system but apart from a faint maple syrup smell…) on top of a DEC PDP-8/I (1968) with an OEM front panel. The Diablo 30 belongs to the Nova and has yet to be restored. In the right rack is a PDP-11/40 system (1973), with two RL02 removable pack drives (capacity 10mb) and one RK05 drive (capacity 2.5mb). The 11/40 runs a variety of systems: RT-11, RSX, Ultrix-11, and I plan to play around with RSTS on it sometime in the not-too-distant future. The PDP-8/I is in working condition but currently lacks peripherals beyond the basic teletype interface.

PDP-8s abound!

Continuing our tour, we are confronted with even more DEC equipment. The Straight-8 on the left is one of my favorite systems and I’m incredibly lucky to have found it. The Straight-8 is the first of the PDP-8 line, introduced in 1965. This particular unit is serial number 14, making it a very early example (about 1500 of this model were made). It is nearly fully restored — a glitch in the teletype interface causes it to hang at random times, and I haven’t yet tracked down the cause (though I suspect an intermittent backplane connection). The next rack contains my workhorse PDP-8: a PDP-8/m with TU56 DECtape drive, RX02 floppy drive, RK05 removable pack drive and PC05 high-speed paper-tape reader and punch (these latter two are currently obscured by the ASR33 Teletype sitting in front of them). At the moment I have TSS/8 running on it.

Computer potpourri

The next rack to the right contains a miscellaneous assortment of minicomputers in various states of repair. From top to bottom: Texas Instruments 990/4 (1975), Data General Nova 800 (1971), Texas Instruments 980B (1974), PDP-8/e (1971), and, mostly hidden, a Bunker Ramo BR-2412 (1971), an obscure 12-bit system originally manufactured by Nuclear Data. Tucked away in the corner is an AMT DAP 610, c. 1990. This is a massively parallel array processor, sporting 4096 1-bit processors in a 64×64 array. Each processor runs at 10Mhz, has its own memory, and can communicate with its immediate neighbors via high-speed links; that works out to the claimed 40 billion boolean operations per second (4096 processors × 10 million cycles per second ≈ 41 billion bit-operations). It’s also technically a SCSI peripheral! The front-end processor for the DAP is a run-of-the-mill Sun Sparcstation LX.

North Wall: Way too much stuff

Progressing along the north wall, we have a smattering of systems and terminals, the most notable being the Inmos ITEM (“Inmos Transputer Evaluation Module”), which contains 40 Transputer processors, and the PDP-11/34 (1977), which I’m using for Unibone development while working from home. The Tektronix 4051 computer and 4014 terminal are really cool examples of Tektronix’s Direct-View Storage Tube technology. The blue system in the corner is what remains of an Imlac PDS-1D (with front panel console box in front), and in front of that is a small rack with an Interdata Model 70 processor in it. Way back in the corner are shelves full of calculators and miscellaneous items too numerous to cover in detail. Also it’s a mess over there, and I try not to look at it too long. Avert your eyes…

Ridge 32 and VAX-11/730

Along the east side of the basement are a couple more systems: a Ridge 32 supermini and a VAX-11/730. The Ridge 32 (1984) is an early 32-bit RISC machine designed to be a VAX killer but hobbled by its operating system, ROS, a UNIX-alike that at the time just wasn’t enough, despite the system’s extremely fast CPU. The VAX-11/730 (1982) is one of my favorite systems — it’s the world’s slowest VAX at slightly less than 1/3 the speed of the original VAX-11/780 (0.3 VUPS), but it’s small, relatively quiet, and clever: the entire VAX processor was compressed into three hex-height boards. A couple of years back, I took this system to the beach:

VAX on the beach

Where it ran for 5 days providing dial-up service at 300 baud and a splendid time was had by all. We only tripped the breakers a few dozen times and it only rained once…

Where was I?

LispM’s, PERQs, and Suns, oh my!

Next up along the east wall is a collection of workstations; from left to right these are: Symbolics XL1200 (1990), Symbolics 3630 (1986), Three Rivers PERQ 1A (1981), and a Sun 2/120 (1986). The Symbolics systems are part of a class of systems known as “Lisp Machines” (LispM’s for short). They are near and dear to my heart — sophisticated systems from a lost era of computing. The PERQ 1A is a computer that came out of Pittsburgh in the early 1980s — a graphical workstation inspired by the Xerox Alto, with local storage, a high-resolution (768×1024) display, ethernet, and a microcoded CPU. It was an immensely clever design and a very powerful machine for the time (and for the price), but Three Rivers never quite figured out how to compete in the market against Sun, Apollo and others. It is quirky and strange and hacked together, and I love it so much I wrote an emulator for it way back in 2006.

The middle row of computer junk

Finally we have the center row of benches where I have a few notable systems set up, including two currently slated for restoration: a Xerox Alto and an LMI Lambda. I picked these two up on a recent trip out east, and they are two systems I never thought I’d own. I’m really looking forward to restoring them both, and my plan is to document the restoration here on this blog over the coming weeks. I’ve chosen the Lambda for the first restoration — I love the Alto more than just about anything else in my collection (I wrote an emulator for it even) but there are already several restored specimens in the world (including at LCM+L, of course) and there are, at the time of this writing, no LMI systems in operating condition in the world. (The Lambda in my basement is one of six known to exist.)

A Bit Of Background on LMI and the Lambda

The history of Lisp Machine Incorporated (LMI) and of lisp machines in general has been written about fairly extensively in various places (see here and here), and this post is running a bit long already, so I’ll provide an abridged version:

In the mid-1970s a group of hackers at MIT’s AI Lab designed a series of computer systems for the specific purpose of running Lisp programs efficiently. The first was known as the CONS, and its successor the CADR was a sufficiently successful design that it was proposed to create a commercial product in the nascent Lisp AI space. Opinions differed on the best course of action and so two competing endeavors ensued: Lisp Machine Inc., and Symbolics. Both companies started off selling commercialized versions of the CADR (LMI’s as the LMI CADR, Symbolics’s as the LM-2) before expanding off into their own designs. Symbolics emerged the victor: their follow-up to the CADR, the 3600 series, was extremely successful while LMI struggled to sell their Lambda workstation — fewer than 200 were sold. LMI went bust in 1987 before it could produce its Lambda successor, the K-Machine.

My Lambda

My Lambda came out of a drafty Pennsylvania garage it had been sitting in for over twenty years. It was covered in a fine grit of mouse droppings, dust, and bits of shingle from when the garage’s roof was replaced several years back. It also has a fine patina of rust and corrosion on nearly every surface.

It’s also missing the tape drive and disk drive. The good news is that both of these are still at least somewhat possible to find: the tape drive was a standard Cipher F880 9-track drive with a Pertec interface, and the hard drive was a Fujitsu Eagle SMD drive. It’s likely that any Pertec-compatible tape drive will work, and it should be possible to find a suitable SMD disk that still functions.

Apart from the missing drives, the system appears to be complete: The backplane is fully populated with boards and in fact is a 2×2-PLUS system (two LISP CPUs and a 68010-based UNIX CPU in one chassis!). Two consoles were included with cabling, but only one keyboard and mouse, alas.

Restoration Plan

So my hope is to restore this system to operating condition, and since I’ll be here at home a lot more than usual I’ll have ample spare time to do it in! As I progress in the restoration in my off hours I’ll post updates here to keep you all in the loop and to give you an idea of the kind of steps we typically undertake when restoring systems at LCM+L.

It’s important to have a plan for this sort of thing; so here’s my current plan of attack, which is of course subject to change depending on how things go:

  1. General cleanup. After twenty years in the garage, this thing smells fairly musty and as mentioned previously, is quite dirty. The chassis, consoles, cables and all assorted ephemera need to be cleaned out and inspected for damage. If mice get into computer hardware they can do serious damage. I haven’t seen any evidence of this but it’s best to be sure.
  2. Inspection, cleaning, and testing of the system power supplies. Just plugging a system like this in after so many years lying dormant is a bad idea. The supplies will be removed from the system, checked out and tested with dummy loads prior to reinstallation. Any necessary repairs to the supplies will be undertaken at this time.
  3. Inspection and cleaning of the boards in the backplane, and the backplane itself. This entails cleaning of corroded pins on socketed ICs and inspecting for serious damage.
  4. Power up of a subset of boards, starting with the SDU (System Diagnostic Unit). The SDU can be used to inspect and test the rest of the boards in the system, once a tape drive has been procured.
  5. Find a working tape drive and disk drive; write out a copy of the LMI Installation and System tapes.
  6. Use the SDU to test the rest of the boards in the system.
  7. Restore one or both of the Lambda consoles; use the SDU to test them.
  8. Install the system to disk.
  9. Boot the system.
  10. Do a dance of some kind.

Daniel Seagraves (author of LambdaDelta, a most-excellent LMI Lambda emulator) is undertaking a similar effort as I type this; he rescued two Lambdas from a much worse garage than the one mine came from and is documenting his restoration efforts here. We’ve been chatting and he’s been extremely helpful in inspecting my Lambda and has sent me some updated SDU ROMs and an Ethernet interface for my system. His help will be instrumental in getting my system going.

Whew. I think that’s enough for one blog post. The next post will bring everything up to date with the current status of the restoration.

Microsoft Software I Have Loved

With Microsoft’s 45th birthday coming up, I thought it’d be appropriate to reflect on the Microsoft tools I use regularly in my day-to-day work at the museum. Well, maybe not day-to-day: some days I’m out there probing hardware or installing an operating system on a PDP-8, and on those days most of the software I’m using either predates Microsoft or isn’t software at all.

But on those days when I’m working on my emulation-project-du-jour or any of a handful of tools in use around the museum, I’m sitting at my desk staring at the latest incarnation of Visual Studio. My first introduction to it was in 1999 in one of my comp. sci classes in college (our curriculum was fairly evenly split between programming on Windows and Solaris systems) and I ended up using it in my classes on graphics and multimedia where I learned to use OpenGL and DirectX, as well as for a course where I needed to do some development for Windows CE (on the original Compaq iPAQ — remember those?)

Visual Studio 2017, editing Darkstar

Microsoft has always treated its developers well and Visual Studio has always been an extremely polished piece of software. It has an excellent debugger and decent integration with a variety of source control systems. I’m a big fan of C# for most of my projects, hence ContrAlto, Darkstar, and sImlac being written in it.

And heck, since I’m feeling nostalgic, allow me to wax rhapsodic about Microsoft QuickBASIC. As a kid I graduated from BASICA to QBasic and abandoned line numbers in favor of structured (or at least slightly-more-structured) programming. QB.EXE was my IDE of choice for many years, until I graduated to Borland Turbo C++ 3.0 for DOS. (Hot take: C++ was, in many ways, a step down from QuickBASIC.) And I will not hear a bad word spoken about GORILLA.BAS, perhaps the finest piece of software Microsoft ever wrote (runner up: NIBBLES.BAS):

What I Did on my Summer Vacation, 1991:

The first computer I can remember using — as in actually sitting down in front of it and interacting with it — was in 1986 in my dad’s office at Lincoln Land Community College. I was 6.

Yours truly, at the seat of a screamin’ machine. If only I’d known what was to come…
IBM Personal Computer PIECHART (Copyright IBM Corp 1981, 1982)

The computer was an off-brand IBM PC clone and I vividly remember playing with a program that drew pie charts and another that played really crude music through the PC’s speaker. (These were two of the demo BASICA programs that came with IBM PC-DOS 3.0.)

IBM Personal Computer MUSIC (Copyright IBM Corp 1981, 1982)

My school at the time had a few Bell & Howell Apple II machines but I don’t recall ever actually getting to use one. A few years later I had friends with Commodore 64s and Ataris and one with a Coleco ADAM (I still feel sorry for that family). The computer lab at Lakeside Elementary (no relation to the one Bill and Paul went to) was stocked with Apple IIs on which we used Appleworks to write stories and to play video games (mostly the latter). I didn’t have a computer of my own until 1991.

The first computer that was all mine was an IBM 5150, the original IBM PC (introduced 1981) that went on to conquer the world. It was given to our family in the spring of 1991 (by a family that had just upgraded to something better) right as I was finishing the 5th grade. It came equipped with two 360K floppy drives, a whopping 640K of RAM (should be enough for anybody), a 4.77Mhz Intel 8088 processor, and an IBM CGA adapter connected to an Amdek Color II display. It also came with a couple of boxes of software and the IBM PC-DOS and IBM BASIC reference manuals.

My copy of the IBM PC BASIC Reference Manual.

While I don’t necessarily recommend it, one can teach oneself BASIC by reading the IBM BASIC Reference Manual, and this is how I got my start with computer programming. The summer following 5th grade I learned how to manipulate the crude graphics of the CGA board (4 colors! Well, three colors really, and none of them ones you’d want to use for anything), producing drawings and animations, and to make the speaker emit crude beeps. My addiction at the time was Ms. Pac-Man, and so what I wanted most in the world was to write my own rendition.

It took me some time to get there and… well, it wasn’t very good but I did eventually write something that kind of resembled Pac-Man if you squinted just right and also didn’t care about gameplay or fun.

PAC-GUY: It’s just like PAC-MAN except in every possible way. Not shown: fun.

In 1991 the 16-bit 8088-based IBM PC was already ten years old — the state of the art was the new 32-bit i80486 at speeds of up to 50Mhz. (My dad had a Northgate 486 system in his office at work for running Maple really quickly). But technology moved slower then: you could still go to your local Electronics Boutique and buy new software that came on 5.25″ 360K disks and ran on the original PC with CGA graphics. Heck, in those days you could still buy a new system based on the 8088 or 8086 — many of the cheaper laptops and portables of the day were still 8086 systems (albeit at 8Mhz) with 640K of memory and CGA graphics. I bought Print Shop Pro and printed out the most awesome banners on the IBM Graphics Printer. I played Monty Python’s Flying Circus: The Computer Game, and indulged myself on scads of older games scrounged from local BBSes and traded with friends.

Me, 1994, with CompuAdd laptop, demonstrating COMPUTER PROGRAMMING at my 8th grade Job Faire. Yes I did get beat up a lot, why do you ask?

And I yearned for more speed and maybe more than four colors on my display. Over the coming years I saved my money to pay for half of a new computer (my parents were very generous): in the summer of 1993 I bought a CompuAdd 425TX laptop (25Mhz 486, 4MB memory, SVGA greyscale display at 640×480 pixels, and an 80 megabyte hard drive that filled up more quickly than I’d expected), which I used and cherished until 1998… but the IBM PC will always hold a special place in my heart.