Time-sharing in 12KW: Running TSS/8 On Real PDP-8 Hardware

But first, a little history

Digital Equipment Corporation’s PDP-8 minicomputer was a small but incredibly flexible computer. Introduced in 1965 at a cost of $18,000, it created a new market for small computers, and soon PDP-8s found themselves used for all sorts of tasks: industrial control, laboratory data capture and analysis, word processing, software development, and education. They controlled San Francisco’s BART subway displays, ran the scoreboard at Fenway Park, and assisted in brain surgery.

They were also used in early forays into time-sharing systems. Time-sharing stood in stark contrast to the batch processing systems that were popular at the time: whereas batch processing systems were generally hands-off (you’d submit a stack of punched cards to an operator and get your results back days later), a time-sharing system allowed multiple users to interact conversationally with a single computer at the same time. These systems did so by giving each user a tiny timeslice of the computer: each user’s program would run for a few hundred milliseconds before another user’s program would get a chance. This switching happened so quickly that it was imperceptible to users, providing the illusion that each user had the entire computer to themselves. Sharing the system in this manner allowed for more efficient use of computing resources in many cases.

TSS/8 was one such time-sharing endeavor, started as a research project at Carnegie-Mellon University in 1967. A PDP-8 system outfitted with 24KW of memory could comfortably support 20 simultaneous users. Each user got what appeared to them as a 4K PDP-8 system with which they were free to do whatever they pleased, and the system was (in theory, at least) impervious to user behavior: a badly behaved user program could not affect the system or other users.

With assistance from DEC, TSS/8 was fleshed out into a stable system and was made available to the world at large in 1968, eventually selling over a hundred copies. It was modestly popular in high schools and universities, where it provided a cost-effective means to provide computing resources for education. While it was never a widespread success and was eventually forgotten and supplanted on the PDP-8 by single-user operating systems like OS/8, TSS/8 was a significant development, as Gordon Bell notes:

“While only a hundred or so systems were sold, TSS/8 was significant because it established the notion that multiprogramming applied even to minicomputers. Until recently, TSS/8 was the lowest cost (per system and per user) and highest performance/cost timesharing system. A major side benefit of TSS/8 was the training of the implementors, who went on to implement the RSTS timesharing system for the PDP-11 based on the BASIC language.”

Gordon Bell, “Computer Engineering: A DEC View of Hardware Systems Design,” 1978

It is quite notable that DEC made such a system possible on a machine as small as the PDP-8: An effective time-sharing system requires assistance from the hardware to allow separation of privileges and isolation of processes — without these there would be no way to stop a user’s program from doing whatever it wanted to with the system: trampling on other users’ programs or wreaking havoc with system devices either maliciously or accidentally. So DEC had to go out of their way to support time-sharing on the PDP-8.

PDP-8 Time-Sharing Hardware

In combination with the MC8/I memory extension (which allowed up to 32KW of memory to be addressed by the PDP-8), the KT8/I was the hardware option that made this possible, and was available on the PDP-8/I as an option at its introduction. The KT8 option was made available for the original PDP-8 around this time as well.

So what does the KT8/I do (in combination with the MC8/I) that makes time-sharing on the PDP-8 feasible? First, it provides two privilege levels for program execution: Executive, and User. The PDP-8 normally runs at the Executive privilege level — at this level all instructions can be executed normally. Under the User privilege level, most instructions execute as normal, but certain instructions are forbidden and cause a trap. On the PDP-8, trappable instructions are:

  • IOTs (I/O Transfer instructions, generally used for controlling hardware and peripherals).
  • The HLT (Halt) instruction, which normally stops the processor.
  • The OSR and LAS instructions, which access the front panel’s switch register.

Under a time-sharing system such as TSS/8, the operating system’s kernel (or “Monitor” in TSS parlance) runs at the Executive privilege level. The Monitor can then control the hardware and deal with scheduling user processes.

User processes (or “Jobs” in TSS) run at the User level (as you might have guessed by the name). At this level, user programs can do whatever they want, but if one of the classes of instructions listed above is executed, the user’s program is suspended (the processor traps the instruction via an interrupt) and the PDP-8’s processor returns to the Monitor in Executive mode to deal with the privileged instruction. If the instruction is indeed one that a user program is not allowed to execute, the Monitor may choose to terminate the user program. In many cases, IOTs are used as a mechanism for user programs to request a service from the Monitor. For example, a user program might execute an IOT to open a file, type a character to the terminal, or to send a message to another user. Executing this IOT causes a trap, the Monitor examines the trapped instruction and translates it into the appropriate action, after which the Monitor resumes execution of user’s program in User mode.
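As a rough illustration of the trap rule, here is a minimal Python sketch (not KT8/I logic and not TSS/8 code) that decides whether a 12-bit instruction word falls into one of the three classes listed above. It assumes the standard PDP-8 encodings: opcode 6 for IOTs, and the Group 2 operate bits 0002 (HLT) and 0004 (OSR), with LAS being CLA combined with OSR (7604). The function name is made up for this example.

```python
def traps_in_user_mode(instruction: int) -> bool:
    """Return True if a 12-bit PDP-8 instruction would trap in User mode."""
    instruction &= 0o7777
    opcode = instruction >> 9              # top three bits select the opcode
    if opcode == 6:                        # 6xxx: every IOT traps
        return True
    if opcode == 7:                        # 7xxx: operate microinstructions
        # Group 2 has bit 0o400 set and the lowest bit clear.
        is_group2 = (instruction & 0o400) != 0 and (instruction & 0o001) == 0
        if is_group2:
            # HLT is bit 0o002, OSR is bit 0o004; LAS (7604) includes OSR.
            return (instruction & 0o006) != 0
    return False                           # everything else runs normally
```

A real Monitor would then look at the trapped word to decide whether it is a service request or something the job is not allowed to do; that dispatch is not modeled here.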

Thus the privileged Executive mode and the unprivileged User mode make it possible to build an operating system that can prevent user processes from interfering with the functioning of the system’s hardware. The MC8/I Memory Extension hardware provided the other piece: Compartmentalizing user processes so they can’t stomp on other user programs or the operating system itself.

A basic PDP-8 system has a 12-bit address space and is thus capable of addressing only 4KW of memory. The MC8/I allowed extending memory up to 32KW in 4KW fields of memory — it did so by providing a three bit wide Extended Memory Address register (which thus provided up to 8 fields.) This did not provide a linear (flat) memory space: The PDP-8 processor could still only directly address 4096 words. But it did allow the processor to access data or execute instructions from any of these 8 fields of memory by executing a special IOT which caused future memory accesses and/or program instructions to come from a new field.
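To make the field arithmetic concrete, here is a small Python sketch of how a 3-bit field number and a 12-bit address combine into a 15-bit physical address. It only illustrates the numbers in the paragraph above; the function name is hypothetical and nothing here models the actual MC8/I registers.

```python
FIELD_WORDS = 4096   # a 12-bit address reaches 4096 words
FIELD_COUNT = 8      # a 3-bit field register selects one of 8 fields


def physical_address(field: int, address: int) -> int:
    """Combine a field number and a 12-bit address into a 15-bit address."""
    if not 0 <= field < FIELD_COUNT:
        raise ValueError("field must fit in 3 bits")
    if not 0 <= address < FIELD_WORDS:
        raise ValueError("address must fit in 12 bits")
    return (field << 12) | address
```

The last word of the last field is `physical_address(7, 0o7777)`, which is 32767, the top of a 32KW machine.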

With this hardware assistance it becomes (relatively) trivial to limit a user program to stay within its own 4KW field: if it attempts to execute a memory management IOT to switch fields the KT8/I will cause a trap and the Monitor can either abort the user’s program or ensure that the field switch was a valid one (swapping in memory or moving things around to ensure that the right field is in the right place). (This latter proves to be significantly more difficult to do, for reasons I will spare you the details on. You’re welcome.)

This article’s supposed to be about running TSS/8 on a real PDP-8, let’s talk about that then shall we?

Where were we. Oh yes, TSS/8.

TSS/8 was initially designed to run on a PDP-8/I (introduced 1968) or the original PDP-8 (1965) equipped with the following hardware at minimum:

  • 12KW of memory
  • KT8/I and MC8/I or equivalent
  • A programmable or line-time clock (KW8/I)
  • An RF08 or DF32 fixed head disc controller with up to four RS08s or DS32 fixed head disks

Optionally supported were the TC08 DECtape controller and a DC08 or PT08 terminal controller for connecting up multiple user terminals. As time went on, TSS/8 was extended to support the newer Omnibus PDP-8 models and peripherals: The PDP-8/e (1970), 8/f, 8/m and the 8/a introduced in 1976.

TSS/8 used an RF08 or DF32 disc for storing the operating system, swapping space, and the user filesystem. Of these the most critical application was swapping: each user on the system got 4KW of swap space on the disk for their current job. As multiple users shared the system and there came to be more users than memory, a user’s program would be swapped out to disk to allow another user’s program to run, then swapped back in at a later time. This called for a fast transfer rate with minimal latency: the RF08, being a fixed-head disk, had very little latency (averaging about 17ms due to rotational delays) and had a transfer rate of about 62KW/second.

Fixed head disks also had the advantage of being word addressable, unlike many later storage mechanisms which read data a sector at a time. This made transfers of small amounts of data (like filesystem structures) more efficient as only the necessary data needed to be transferred.

Our RF08 Controller with two RS08 drives (256KW capacity each)

We’ve wanted to get TSS/8 running at the museum for a long time. The biggest impediment to running TSS/8 on real hardware in this year of 2019 is the requirement for a fixed-head disk. There are not many RF08s or DF32s left in the world these days, and the ones that remain are difficult to keep operational in the long term. We have contemplated restoring a PDP-8/I and the one DF32 controller (with two DS32 discs) in our collection or building an RF08 emulator, but I thought it would be an interesting exercise to get it to run on the PDP-8/e we already have on exhibit on the second floor, with the hardware we already have restored and operational.

LCM+L's PDP-8/e.  RK05 drive on the left.

Our 8/e is outfitted with an RK05 drive, controlled by the usual RK8E Omnibus controller. The RK05 is a removable pack drive with a capacity of approximately 2.5MB and a transfer rate of 102KW/sec. On paper it didn’t seem infeasible to run a time-sharing system with an RK05 instead of an RF08: each user’s 4K swap area maps nicely onto a single track on an RK05 (a single track is 16 sectors of 256 words, yielding 4096 words) and the capacity is larger than the maximum size for an RF08 controller (1.6MW vs 1.0MW). However, the seek time of the RK05 (10ms track-to-track, 50ms average vs. no seek time on the RF08) means performance is going to be lower; the only question is by how much. My theory was that while the system would be slower it would still be usable. Only one way to find out, I figured.
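For the curious, the back-of-the-envelope numbers can be sketched in a few lines of Python. The track size, transfer rates, and latency figures come from the text above; the 203 cylinders and 2 surfaces used for the capacity figure are my assumption about RK05 geometry, "KW" is taken as 1000 words, and the timing function is a deliberately naive model (one fixed delay plus the raw transfer time, ignoring the RK05's own rotational delay and per-sector interrupt overhead).

```python
WORDS_PER_SECTOR = 256
SECTORS_PER_TRACK = 16
TRACK_WORDS = WORDS_PER_SECTOR * SECTORS_PER_TRACK   # one 4KW swap area

# Assumed RK05 geometry (not stated in the article).
CYLINDERS = 203
SURFACES = 2
RK05_WORDS = CYLINDERS * SURFACES * TRACK_WORDS


def naive_swap_ms(delay_ms: float, words_per_second: float) -> float:
    """Estimate the time to move one 4KW job: fixed delay plus transfer time."""
    return delay_ms + TRACK_WORDS / words_per_second * 1000.0


fixed_head_ms = naive_swap_ms(17.0, 62_000)    # rotational latency only
rk05_ms = naive_swap_ms(50.0, 102_000)         # average seek only
```

Under these assumptions the two estimates land within about ten milliseconds of each other, which is at least consistent with the hunch that the system would be slower but usable. Real behavior depends on how often the heads have to move between swap tracks and filesystem data.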

Finding TSS/8 Sources

Of course, in order to modify the system it would be useful to have access to the original source code. Fortunately the heavy lifting here has already been done: John Wilson transcribed a set of source listings way back in the 1980s and made them available on the Internet in the early 2000s. Since then a couple of PDP-8 hackers (Brad Parker and Vincent Slyngstad) combined efforts to make those source listings build again, and made the results available here. Cloning that repository provides the sources and the tools necessary to assemble the TSS/8 source code and build an RF08 disk containing the resultant binaries along with a working TSS/8 filesystem. I began with this as a base and started in to hacking away.

Hacking Away

The first thing one notices when perusing the TSS/8 source is that it has comments. Lots of comments. Useful comments. I would like to extend my heartfelt thanks to the original authors of this code, you are the greatest.

Lookit’ them comments: That’s the way you do it!

There are two modules in TSS/8 that need modifications: INIT and TS8. Everything else builds on top of these. INIT is a combination of bootstrap, diagnostic, backup, and patching tool. Most of the time it’s used to cold boot the TSS/8 system: It reads TS8 into fields 0 and 1 of the PDP-8’s memory and starts executing it. TS8 is the TSS/8 Monitor (analogous to the “kernel” in modern parlance). It manages the hardware, schedules user jobs, and executes user requests.

It made sense to make changes to INIT first, since it brings up the rest of the system. These changes ended up being fairly straightforward as everything involved with booting the system read entire 4K tracks in at a time, nothing complicated. (I still have yet to modify the DECtape dump/restore routines, however.)

The code for TS8, the TSS/8 Monitor, lives in ts8.pal, and this is where the bulk of the code changes live. The Monitor contains the low-level disk I/O routines used by the rest of the system. I spent some time studying the code in ts8.pal to understand better what needed to be changed and it all boiled down to two sets of routines: one used for swapping processes in and out 4KW at a time, and one used for filesystem transfers of arbitrary size.

I started with the former as it seemed the less daunting task. The swapping code is given a 4K block of memory to transfer either to (“swapping out”) or from (“swapping in”) the fixed-head disk. For the RF08 and DF32 controllers this is simple: You just tell the controller “copy 4KW from here and put it over there” (more or less) and it goes off and does it and causes an interrupt to let the processor know when it’s done. Simple:

SWPIN,    0
     DCMA        /TO STOP THE DISC
     TAD SWINA   /RETURN ADDRESS FOR INTURRUPT CHAIN
     DCA I DSWATA    /SAVE IT
     TAD INTRC   /GET THE TRAC # TO BE READ IN
     IFZERO RF08-40 <     
     TAD SQREQ   /FIELD TO BE USED     
     DEAL     
     CLA     
     NOP     /JUST FOR PROPER LENGTH     >
     IFZERO RF08 <     
     DXAL     
     TAD SQREQ   /FIELD TO BE SWAPPED IN     
     TAD C0500   /ENABLE INTERRUPT ON ERROR AND ON COMPLETION     
     DIML     >
     DCA DSWC    /WORD COUNT
     CMA
     DCA DSMA    /CORE ADDRESS
     DMAR
     JMP I SWPIN

SWPTR,    JMP SWPERR      /OOPS
     TAD FINISH      /DID WE JUST SWAP IN OR OUT?
     SMA
     JMP SWPOK       /IN; SO WE'RE FINISHED
     CIA
     DCA FINISH      /SAVE IT
     JMS SWPIO       /START SWAP IN
     DISMIS          /GO BACK TO WHAT WE WERE DOING

For the RK05 things are a bit more complicated: The RK8E controller can only transfer data one sector (256 words) at a time, so my new swapping code would need to run 16 times (and be interrupted 16 times) in order to transfer a full 4KW. And it would have to keep track of the source and destination addresses manually. Obviously this code was going to take up more space, and space was already at a premium in this code (the TSS/8 Monitor gets a mere 8KW to do everything it needs to do). After fighting with the assembler and optimizing and testing things I came up with:

SWPIN, TAD SQREQ                 / GET FIELD TO BE SWAPPED IN
     TAD C0400                   / READ SECTOR, INTERRUPT
     DLDC                        / LOAD COMMAND REGISTER:
                                 / FIELD IS IN BITS 6-8;
                                 / INTERRUPTS ENABLED ON TRANSFER COMPLETE
                                 / OF A 256-WORD READ TO DRIVE ZERO.
     TAD     INTRC               / GET THE TRACK # TO READ FROM
     TAD     RKSWSE              / ADD SECTOR
     DLAG                        / LOAD ADDRESS, GO
     JMP I   SWPIT
     
 / FOR RK05:
 / ON EACH RETURN HERE, CHECK STATUS REG (ERROR OR SUCCESS MODIFIES
 / ENTRY ADDRESS TO SWPTR)
 / ON COMPLETION, INC. SECTOR COUNT, DO NEXT SECTOR.  ON LAST SECTOR
 / FINISH THE SWAP.
 SWPA,    SWPTR                  /RETURN ADDRESS AFTER SWAP
 
 SWPTR, JMP SWPERR      /OOPS
     TAD RKADR
     TAD C0400       /NEXT ADDRESS
     DCA RKADR
     TAD RKSWSE      /NEXT SECTOR
     IAC
     AND C0017   
     SNA             /SECTOR = 16? DONE?
     JMP SWFIN       /YEP, FINISH THINGS UP.
     DCA RKSWSE      /NO - DO NEXT SECTOR
     JMS SWPIO       /START NEXT SECTOR TRANSFER
     DISMIS          /GO BACK TO WHAT WE WERE DOING
 SWFIN, TAD FINISH   /DID WE JUST SWAP IN OR OUT?    
     SMA
     JMP SWPOK       /IN; SO WE'RE FINISHED
     CIA
     DCA FINISH      /SAVE IT
     JMS SWPIR       /START SWAP IN
     DISMIS          /GO BACK TO WHAT WE WERE DOING      
     

The above is only slightly larger than the original code. Like the original, it’s interrupt driven: SWPIN sets up a sector transfer then returns to the Monitor — the RK8E will interrupt the processor when this transfer is done, at which point the Monitor will jump to SWPTR to process it. SWPTR then determines if there are more sectors to transfer, and if so starts the next transfer, calculating the disk and memory addresses needed to do so.
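If PAL-8 isn't your thing, the control flow of the sector-at-a-time swap can be modeled in Python. This is only a sketch of the idea, not a translation of the Monitor code: the disk track and memory are plain lists, the loop stands in for the chain of 16 interrupts, and the function and variable names are invented (the comments point at the PAL-8 names they loosely correspond to).

```python
SECTOR_WORDS = 0o400        # 256 words per RK05 sector
FIELD_WORDS = 4096          # one memory field, also one swap track


def swap_in(track, memory, field):
    """Copy one 4KW track into a memory field, one sector per 'interrupt'."""
    base = field * FIELD_WORDS
    address = 0             # plays the role of RKADR
    sector = 0              # plays the role of RKSWSE
    interrupts = 0
    while True:
        # One sector transfer, as started by DLAG on the real controller.
        start = sector * SECTOR_WORDS
        memory[base + address:base + address + SECTOR_WORDS] = \
            track[start:start + SECTOR_WORDS]
        interrupts += 1                  # the RK8E interrupts on completion
        address += SECTOR_WORDS          # like adding C0400 to RKADR
        sector = (sector + 1) & 0o17     # like IAC followed by AND C0017
        if sector == 0:                  # wrapped past sector 15: all done
            return interrupts
```

Calling `swap_in` with a 4096-word track returns 16, one count for each sector transfer the real code has to wait on.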

After testing this code, TSS/8 would initialize and prompt for a login, and then hang attempting to do a filesystem operation to read the password database. Time to move on to the other routine that needed to be changed: the filesystem transfer code. This ended up being considerably more complicated than the swapping routine. As mentioned earlier, the RF08 and DF32 disks are word-addressable, meaning that any arbitrary word at any address on disk can be accessed directly. And these controllers can transfer any amount of data from a single word to 4096 words in a single request. The RK05 can only transfer a sector’s worth of data (256 words) at once and transfers must start on a sector boundary (a multiple of 256 words). The TSS/8 filesystem code makes heavy use of the flexibility of the RF08/DF32, and user programs can request transfers of arbitrary lengths from arbitrary addresses as well. This means that the RK05 code I’m adding will need to do some heavy lifting in order to meet the needs of its callers.

Like the swapping code, a single request may require multiple sector transfers to complete. Further, the new code will need to have access to a private buffer 256 words in length for the transfer of a single RK05 sector. It cannot copy sector data directly to the caller’s destination as it does with the RF08/DF32, because that destination is not likely to be large enough. (Consider the case where the caller wants to read only one word!) So for a read operation, the steps necessary are:

  1. Given a word address for the data being requested from disk, calculate the RK05 sector S that word lives in. (i.e. divide the address by 256).
  2. Given the same, calculate the offset O in that sector that the data starts at (i.e. calculate the word address modulo 256)
  3. Start a read from the RK05 for sector S into the Monitor’s private sector buffer. Return to the Monitor and wait for an interrupt signalling completion.
  4. On receipt of an interrupt, calculate the length L of the data to be copied from the private sector buffer into the caller’s memory (the data’s final destination): L is 256-O (i.e. copy up to the end of the sector we read), or the number of words still requested if that is smaller.
  5. Copy L words from offset O in the private sector buffer to the caller’s memory.
  6. Decrement the caller’s requested word count by L and see if any words remain to be transferred: If yes, increment the sector S, reset O to 0 (we start at the beginning of the next sector) and go back to step 3.
  7. If no more words to be transferred, we’re done and we can take a break. Whew.
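The steps above can be sketched in Python. This is a simplified model rather than the Monitor's code: the disk is a list of 256-word sectors, the "interrupt" is just the next loop iteration, and the function name and return value are made up for illustration. The copy length is capped at the number of words still wanted, which is what keeps a one-word read from overrunning the caller's buffer.

```python
SECTOR_WORDS = 256


def read_words(disk, word_address, count):
    """Read `count` words starting at `word_address` from a sector-only disk.

    Returns the words read and the number of sector reads it took.
    """
    sector, offset = divmod(word_address, SECTOR_WORDS)   # steps 1 and 2
    result = []
    sector_reads = 0
    while count > 0:
        buffer = list(disk[sector])                       # step 3: private buffer
        sector_reads += 1
        length = min(SECTOR_WORDS - offset, count)        # step 4
        result.extend(buffer[offset:offset + length])     # step 5
        count -= length                                   # step 6
        sector += 1
        offset = 0
    return result, sector_reads                           # step 7
```

A ten-word read starting at word 250 straddles a sector boundary and so costs two sector reads, while a one-word read costs one full sector read for a single word of payload.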

Doing a Write is more complicated: Because the offset O may be in the middle of a sector, we have to do a read-modify-write cycle: Read the sector first into the private buffer, copy in the modified data at offset O in the buffer, and then write the whole buffer back to disk.
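Here is the same kind of Python sketch for the write path, again a simplified model with invented names and a list of 256-word sectors standing in for the disk. Every sector touched goes through the full read-modify-write cycle, with no attempt to skip the read for whole-sector writes.

```python
SECTOR_WORDS = 256


def write_words(disk, word_address, data):
    """Write `data` at `word_address` using read-modify-write per sector.

    Returns the number of sectors that were rewritten.
    """
    sector, offset = divmod(word_address, SECTOR_WORDS)
    position = 0
    sectors_written = 0
    while position < len(data):
        buffer = list(disk[sector])                        # read the sector
        length = min(SECTOR_WORDS - offset, len(data) - position)
        buffer[offset:offset + length] = data[position:position + length]  # modify
        disk[sector] = buffer                              # write it back
        sectors_written += 1
        position += length
        sector += 1
        offset = 0
    return sectors_written
```

Writing four words at word address 254 touches two sectors, and the words around them on disk come back unchanged because they were read in first.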

This code ended up not fitting in Field 0 of TS8 — I had to move it into Field 1 in order to have space for both the code and the private sector buffer. So as not to bore you I won’t paste the final code here (it’s pretty long) but if you’re curious you can see it starting around line 6994 of ts8.pal.

This code while functional has some obvious weaknesses and could be optimized: the read-modify-write cycle for write operations is only necessary for transfers that start at a non-sector boundary or are less than a sector in size. Repeated reads from the same sector could bypass the actual disk transfer (only the first read need actually hit the disk). Similarly, repeated writes to the same sector need only commit the sector to disk when a new sector is requested. I’m waiting to see how the system holds up under heavy use, and what disk usage patterns emerge before undertaking these changes, premature optimization being the root of all evil and whatnot.

The first boot of TSS/8 on our PDP-8/e!

I tested all of these changes as I was writing them under SIMH, an excellent suite of emulators for a variety of systems including the PDP-8. When I was finally ready to try it on real hardware, I used David Gesswein’s dumprest tools to write the disk image out to a real RK05 pack, and toggled in the RK05 TSS/8 bootstrap I wrote to get INIT started. After a couple of weeks of working only under the emulator, it was a real relief when it started right up the first time on the real thing, let me tell you!

TSS/8 is currently running on the floor at the museum, servicing only two terminals. I’m in the process of adding six more KL8E asynchronous serial lines so that we can have eight users on the system — the hope is to make the system available online early next year so that people around the world can play with TSS/8 on real hardware.

I’ve also been working on tracking down more software to run on TSS/8. In addition to what was already available on the RF08 disk image (PALD, BASIC, FOCAL, FORTRAN, EDIT) I’ve dug up ALGOL, and ported CHEKMO II and LISP over. If anyone out there is sitting on TSS/8 code — listings, paper tape, disk pack, or DECtape, do drop me a line!

And if you’re so inclined, and have your own PDP-8 system with an RK05 you can grab the latest copy of my changes on our Github at https://github.com/livingcomputermuseum/cpus-pdp8 and give it a whirl. Comments, questions, and pull requests are always welcome!

Unix Version 0 on the PDP-7 at LCM+L

In February 2016, a wonderful piece of news came to the attention of the international vintage computing community: The source of the original implementation of the Unix operating system, written for the DEC PDP-7 computer, had come to light, in the form of listings for the kernel and several user programs (including the editor and the assembler program). The announcement came from Warren Toomey, founder of The Unix Heritage Society (TUHS) in Australia.

Warren asked if we might be interested in participating in the recovery of this historic software, and perhaps run it on the PDP-7 here.1 We were very interested, and Josh Dersch began thinking about how to do this without a disk drive, since our PDP-7 did not have one.

Two months later, Fred Yearian visited the museum and told us that he had a PDP-7 in the basement of his house. Fred, a retired Boeing engineer, had acquired the system many years earlier from Boeing Surplus, and kept it in semi-running condition. In 2018, Fred made arrangements to donate his PDP-7 to the museum, and it was moved to the computer room on the third floor, where Fred worked with Jeff Kaylin to put it into good running order.

This PDP-7 included a non-DEC interface which was installed by Boeing, and which was apparently a controller for a magnetic tape drive. Jeff and Fred traced the circuitry for the interface, and Jeff created schematics for it.

Fred had a utility program for the PDP-7 which was written as a set of binary numbers (represented in octal = base 8) in an assembly language program for a Varian minicomputer. He translated the binary representations of the PDP-7 instructions into an assembly language program for the PDP-7, which Rich Alderson assembled using a program for Windows originally created for the family of simulation programs for DEC’s 18 bit computers.2 Fred added new features as the restoration of his PDP-7 progressed and new needs were recognized.

Meanwhile, enthusiasts had added features to the SimH PDP-7 simulator such as a simulated disk of the kind used on the PDP-7 at Bell Labs where Unix was created, which allowed the operating system to be run under a PDP-7 simulation. Some programs were missing from the source listings provided to Warren Toomey, including the shell command processor, and these had to be recreated from scratch based on early documentation and programming notes.

As part of our Unix@50 programming, celebrating the 50th anniversary of the Unix operating system, we moved Fred’s PDP-7 from the third floor computer room to the second floor exhibit hall in June 2019. It formed one of the anchors of a private event hosted by https://SDF.org for attendees of the Usenix Technical Conference in July.

Following the move to the exhibit hall, Fred and Jeff continued the restoration in view of the public, answering questions as they worked. Jeff also designed a device, dubbed the JK09,3 which looks to the PDP-7 like a kind of disk connected to the interface installed by Boeing. Once that was debugged, it was time for the software to be revisited.

Josh Dersch took the lead, copying the Unix Version 0 sources and modified SimH PDP-7 simulator from GitHub, then modifying that to use the JK09 device instead of the RB09 simulated by SimH. Once he had Version 0 running under SimH, he created a disk image which was loaded into the JK09 attached to the PDP-7, and Unix was booted on the system.

IMLAC PDS-1 Power Supply

In early July of 2019, the power supply of one of our rarest and most iconic machines started to fail. This is the IMLAC PDS-1, originally produced from 1970 to 1972. Despite the efforts of our staff to troubleshoot and replace components, we were soon left with a completely failed power supply.

As is typical in these situations, we set about doing an engineering evaluation toward designing a form, fit, and function replacement. The photos below show the power supply system in its original form.

IMLAC Power Supply – Power Input, Rectifier, and Filtering (left chassis)

IMLAC Power Supply – Regulator Chassis (right chassis)

Accessing the schematics, we found what voltages and currents the power supply system had to provide. Next, using the power supply components we have available for this purpose, we had to design a system which fits in the chassis and interfaces with the control and power signals the computer needs to run.

Surprise! The Power Supply Generates a Non-DC Timing Signal

The schematic below shows a section which we couldn’t figure out initially. At first glance, the collection of four diodes on the left looks like a bridge rectifier. On closer examination, the anodes and cathodes are not hooked up like a bridge rectifier. What we have here instead is a frequency doubler, which generates a 120 Hz signal from the 60 Hz power line; that signal is used as a periodic interrupt for the video display logic. Not at all expected.
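To see why folding a 60 Hz sine wave over can yield a 120 Hz signal, here is a small numeric check in Python. It is an idealized illustration only: it assumes the doubler behaves like a perfect absolute-value function on the line voltage, which is a simplification and not an analysis of the IMLAC circuit. The function names and sample spacing are arbitrary choices for this sketch.

```python
import math

MAINS_HZ = 60.0


def mains(t):
    """Idealized 60 Hz line voltage, unit amplitude."""
    return math.sin(2 * math.pi * MAINS_HZ * t)


def doubled(t):
    """Idealized doubler output: both half-cycles folded to one polarity."""
    return abs(mains(t))


def repeats_every(signal, period, samples=1000):
    """True if signal(t + period) matches signal(t) at every sampled instant."""
    return all(
        abs(signal(k / 7919.0 + period) - signal(k / 7919.0)) < 1e-9
        for k in range(samples)
    )
```

In this model `doubled` repeats every 1/120 of a second while `mains` does not, so anything triggered once per repeat of the folded waveform fires 120 times a second.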

Original IMLAC schematic showing 120 Hz sync signal generator (frequency doubler)

Below is our replication of this circuit. We used a miniature 120 VAC to 10 VAC transformer.

Schematic for 120 Hz sync signal generator

After the surprise above, we set about removing the original power supply components and installing a new configurable supply along with the necessary modifications to the internal wiring harness.

Below are two images showing the right and left power supply chassis as modified.

IMLAC Power Supply – After Modification the Power Input, Rectifier, and Filtering are no longer needed (left chassis)
The regulator hardware has been replaced by a configurable power supply (on the right) and the 120 Hz signal generator on the left (orange board with caution label).

We completed and tested the above modifications in about a week and a half. A week after the unit was put back in service, a third power supply (located in the control console) also failed. We replaced it with another configurable supply as shown below.

Control Console rear showing replacement supply. This powers the control console LEDs.

The system has now been running since late July without incident.

This is a good example of the type of work we have been performing for the last 15 years.

RS232 In to CMOS

Schematic

We often use RS232 serial communication in our projects. The USB converters are very convenient. For output, these converters can receive from CMOS levels (0 to 3.3V). For input, the ones we use provide about 6 and a half volts positive and negative. To interface to CMOS 3.3 volt logic, I have created a circuit to drop the positive 6.5 volts down to a little under 3 volts, and also completely block any negative voltage. It works well at 9600 baud.

Update: I tried different baud rates. The maximum normal rate is 921600. The waveform received at 460800 did not have sharp rise and fall times, and was not symmetrical. At 230400, the rise and fall times made less difference and the symmetry was closer. But I wondered if it was my circuit introducing the difference in rise and fall times, as 2.2K resistors do not pull that hard. Nope, it is actually the USB converter which has the bit jitter. Also the differing rise and fall times — their method of creating the negative voltage must have less oomph. I measured bit times of 4.48us for the first bit, 4.28, 4.2, and 4.32 for the next bits. I ran a test overnight and had no communication errors at 230400. That is 24 times faster than 9600.
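For reference, the expected bit time and the size of the jitter can be worked out in a few lines of Python. The measured values are the four quoted above; everything else is plain arithmetic, and the variable names are only for this sketch.

```python
def bit_time_us(baud: int) -> float:
    """Nominal duration of one bit, in microseconds."""
    return 1_000_000 / baud


nominal_us = bit_time_us(230400)            # a little over 4.34 us
measured_us = [4.48, 4.28, 4.2, 4.32]       # the bit times measured above

# Deviation of each measured bit from nominal, in percent.
deviation_pct = [(m - nominal_us) / nominal_us * 100 for m in measured_us]
worst_pct = max(abs(d) for d in deviation_pct)

speedup = 230400 // 9600                    # 24
```

The worst measured bit is a bit over 3% away from nominal, which fits with the overnight test showing no errors.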

Add Memory to an FPGA board

DRAM memory board

Xilinx FPGA chips have limited memory available. For those projects requiring more memory, I have created a daughter board. This board may have up to 4 memory chips, each providing 16MBits of memory (4M x 4).

The MESA boards have 24 signals on each of their connectors. Finding memory requiring 24 or fewer signals was the trick. The 4M x 4 configuration has 19 signals: 11 address, 4 data, RAS, CAS, WE, and OE. I select one of the 4 memory chips by providing separate CAS.
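The signal budget can be checked with a little Python. The per-chip counts come from the paragraph above. Two things are my assumptions: that the 11 address lines are multiplexed as row and column (the usual reason a DRAM has RAS and CAS), and that RAS, WE, and OE are shared by all four chips with only CAS being separate.

```python
ADDRESS_LINES = 11
DATA_LINES = 4
CHIPS = 4
CONNECTOR_SIGNALS = 24

# One chip: address + data + RAS, CAS, WE, OE.
signals_per_chip = ADDRESS_LINES + DATA_LINES + 4

# Whole board, assuming shared RAS/WE/OE and one CAS per chip.
signals_for_board = ADDRESS_LINES + DATA_LINES + 3 + CHIPS

# 11 row bits plus 11 column bits address 4M locations of 4 bits each.
bits_per_chip = (2 ** (2 * ADDRESS_LINES)) * DATA_LINES
megabits_per_chip = bits_per_chip // (1024 * 1024)
```

Under those assumptions the full board needs 22 signals, which fits within the 24 available on one connector.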

Apple I Keyboard

We are using an Apple ][ keyboard to control our original Apple I.

Of course the connectors are not compatible, so a simple rewiring might work. However, if one puts some logic in between, one can easily make adjustments when the Strobes are active low instead of active high, when the Reset is active low instead of active high, or when some of the data bits are usually high instead of usually low. Plus, with enough logic, one could pretend to be typing in programs.

That is what I have done. With a BASYS3 FPGA board I listen to the Apple ][ keyboard and pass that data along. But I also listen to a telephone keypad, and if a button is pressed I send my own data. The most useful program is the BASIC interpreter. It takes 6 minutes! That is about 30 characters a second to enter a line of data, and then an appropriate pause at the end of each line so the data can be processed. Then a command to RUN and BASIC is going. Then, another key may be pressed on the keypad to load a BASIC program. That is a slower process, as BASIC takes some time to process each character. But, BASIC programs are usually short anyway.

To blow, or not to blow!

That is the question that one of the blowers in the base of KATIA, our 1967 PDP10-KA was apparently asking itself a couple of weeks ago.

KATIA had been running happily for a couple of months, and when I came in one day, she wasn’t. I tried rebooting her a couple of times, without any luck. With my usual collection of tasty puzzle morsels on my plate, it took me a few days to get around to poor KATIA. It was Friday when I started running diagnostics, and they all passed till it got down to one of the last ones: DAKCB.

DAKCB would fail while trying to test the FDVL instruction: Double precision Floating Point Divide. This was at the end of the day, so since I was scheduled to work half a day Saturday, I could work on it then. That might incite interesting conversations with the Museum Visitors.

After going to my 6:30AM Saturday morning WW meeting, studying for a test for a while, then taking the test for the highest class Amateur Radio Operator license, and passing, I managed to stagger into work a little after 1PM.

I was assisted by Leda, our Post-Grad Ethnography student, who is trying to figure out why Grown Adults would want to spend their time playing with these ancient computers.

While looking at the machine when it failed DAKCB, we had noticed a couple of things: it had stopped, as in the PC wasn’t changing, but the RUN light was still on! This is a symptom of losing a pulse! In Asynchronous machines like the KA and KI, there is no clock that times the instructions. When you poke the start button, a single pulse is launched into a bunch of logic and delay lines. The little pulses and logic fetch the instruction, examine it to figure out what instruction it is, and wiggle all the logic at the correct time to do whatever that instruction is supposed to do. When it is done doing one instruction, and the RUN light is still on, it will take the pulse coming out of the end of the logic, and put it back in the front, starting the next instruction.

In my Zeal and Enthusiasm, I conned Leda into running the oscilloscope probe for me as we tried to follow this tiny pulse all the way through the logic till it disappeared. Normal trouble shooting practice would be to see the pulse at the beginning, look at the end and not see it, then look somewhere in the middle. We call this a binary search, and it is the fastest way to do this.
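For anyone who hasn't met a binary search, here is a minimal sketch of the idea. The `pulse_seen` callback is hypothetical and stands in for putting the probe on a numbered test point; the sketch assumes the pulse is present at the first point, missing at the last, and never comes back once it is lost.

```python
def find_lost_pulse(pulse_seen, n_points):
    """Return the index of the first probe point where the pulse is missing.

    pulse_seen(i) is a hypothetical stand-in for probing point i with the
    scope. Assumes the pulse is present at point 0, absent at point
    n_points - 1, and never reappears once it has been lost.
    """
    lo, hi = 0, n_points - 1          # pulse seen at lo, missing at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if pulse_seen(mid):
            lo = mid                  # still alive here: look later
        else:
            hi = mid                  # already gone here: look earlier
    return hi


probes = []                           # record every point we had to probe

def scope(i, dead_at=37):
    probes.append(i)
    return i < dead_at

print(find_lost_pulse(scope, 64), len(probes))
```

Each probe halves the stretch of logic still under suspicion, which is why it beats walking the chain one step at a time, provided you know where the ends are.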

Unfortunately I am not sure where the end is, and I certainly don’t know where the middle of this process would be, so we started at the beginning and followed each step till we got to a pulse amplifier at 1F19: we could see the pulse go in, but it didn’t come out! We can just change the module, right?

I did that, and then noticed that the module I had taken out was REALLY warm in my hand, not so hot it burned me, but it shouldn’t have been that hot. Why? Being as how this is an updraft machine, I put my hand above the card cage, and it was nice and toasty warm up there. Where is all the air, like what is coming out of Bay 2’s cage, keeping all those cards cool? Going back around to the front of the machine, I noticed that the blower in the bottom of Bay 1 was not turning, and shut the machine off. Now we know why it quit working!

This is not going to be pretty! Leda and I went down to the basement, and stole a blower from one of the other KA’s we have down there, and installed it. Of course the freight elevator had gotten its tail tied in a knot on Friday and was on strike for better working conditions, so I carried the 30lb awkward thing up the stairs, and managed to get it installed around 5:30PM. We turned on the machine, and luckily this blower was not having an existential crisis, and so it was blowing precious cooling air all through the cards in Bay 1: Yay!

Since we figured that it was probably bad bearings, David took the blower apart to change the bearings, and here is a picture of it sitting on a 2 foot by 3 foot cart:

KATIA’s blower, disassembled.

Looking at the stator of the motor a little closer:

Blower stator.

Those wires should not be black! I’m afraid this motor is toast!

I have so far spent about a week trying to get basic instructions to work on KATIA, without a lot of success. What I know so far, is that if I try to run out of fast memory, the ACs that live inside the processor, it REALLY doesn’t work. At a guess, I suspect that the logic receives all zeros for the second instruction. If I disable fast memory, or run in main memory, things work better, but some very simple things don’t work. It will add a constant to an accumulator, but it won’t load a constant into an accumulator. This reduces its usefulness as a computer a little. It is still pretty good as a room heater.

I will try to blog more frequently so you can follow along as we discover what else is wrong with KATIA.

Connecting to the PDP-7

Two 55-pin connectors are available on the PDP-7 SN129. One provides 18 bits of output data, and the other can read 18 bits of input data.

My project was to solder 50-pin ribbon cables to the 55-pin connectors. This will allow us to easily connect to external logic.

55-pin round connector
Pin numbers spiral in, so it is best to begin soldering at the inside and work out. Here are the first four pairs of wire. The green wire is the 25th pair of the ribbon cable.

Each wire has heat-shrink tubing to both insulate and to provide structural strength.

55-pin round connector
This is the other connector, and being of the other gender it spirals the other direction. The final two wires are ready to be soldered.

Computer Maintenance Hell!

Back in October 2018, our PDP10-KI went down, and it didn’t want to come back up. I ran all the normal diagnostics, and they all worked, but TOPS-10 would hang when I tried to boot it. That is the definition of Computer Maintenance Hell: everything works, but the operating system won’t run!

Running the normal diagnostics sounds like an easy thing, but that isn’t always the case! The first bunch of diagnostics run from paper tape, and that is pretty easy. As we continue past DBKAG, the tapes don’t fit well in the reader, so we switch over to getting them off of DECTape. Herein lies the rub: the TD10 DECTape controller on the KI is almost always broken when I need it.

After much gnashing of teeth, and tearing of hair, there was enough blood on the floor for the dust bunnies to leave tracks in that pointed to what was wrong with the TD10, and we were off once again. I ran the rest of the usual diagnostics, and they all passed! Still didn’t boot.

I had plenty of things to keep me occupied, so our poor PDP10-KI didn’t get a lot of my attention. During our group session bringing up the KATIA, we played with the KI some, and found that the KI didn’t like its memory! The KA liked it, but the KI didn’t! It would run the DDMMD memory diagnostic for about 10 or 15 minutes, then fail. The KA would happily run the KI’s memory well past where the KI would fail. Looking at the errors, it appeared as if things were getting confused about which particular bit of memory it was talking to. It would always start failing at location 0374000 where either it hadn’t inverted the contents of those locations, or it had done it twice.

Now it didn’t fail all the time. The part of the test that failed was going through memory incrementing the address by a more significant bit than the LSB, then wrapping around to the LSB. When it started with the LSB, bit 35, everything was fine. It worked when it did bit 34. It had to get up to bit 25 before we had problems: it would fail between bit 25 and bit 21, while bits 20 through 18 worked too.

I spent quite a while trying to write a diagnostic that did what DDMMD was doing, but in a quick and repeatable way. I believe I got pretty close, but nothing I wrote would tickle the problem… bother!

Months have now passed, and I broke down and plugged in the logic analyzer. Most of the time, I use an oscilloscope as my main debug tool. ‘Scopes don’t lie as much as logic analyzers do! If the logic is working, producing 1’s and 0’s as it should, a logic analyzer is a good tool. When things are broken, sometimes you get a half, or a third instead of a one or zero, and this is where the ‘scope is better about telling the truth, and the logic analyzer will lie. Here the machine was pretty much working, at least the diagnostics thought so.

Here is one of the first logic analyzer traces I took, just showing the logic analyzer sample number, the memory operation, and the address:

1581 wr 626415
1601 wr 626435
1621 wr 626455
1641 wr 626475
1661 wr 626515
1681 wr 626535

I did a bunch of work with Perl to go through the 100MB of data that came out of the logic analyzer, and boil it down to what you see here.

Now it turns out that the way this part of DDMMD worked, is that it would fill memory from the bottom to the top stepping by 1, complement each location using the funny addressing pattern, then verify from bottom to top normally. I added the top 8 bits from the CPU’s MA (Memory Address) register to the logic analyzer:

104873 rd 377774, 376
104903 rd 377775, 376
104933 rd 377776, 376
104963 rd 377777, 376
104989 rd 777000, 400 ***
105019 rd 400001, 400
105047 rd 400002, 400
105075 rd 400003, 400
105103 rd 400004, 400

This is where it is doing the final verification, and you can see something funny here: the upper bits from the MA register incremented like I expected them to, but when a whole bunch of them changed, the address going to the actual memory didn’t follow as quickly! Instead of going from 377777 to 400000, it went to 777000! Here we get into a bit of logic called the “Pager”.
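A check like this is easy to automate once the trace is boiled down to text. The sketch below is not the script I used; it assumes the line layout shown above, and it assumes the last column is the top 8 of the 18 address bits printed left-justified in three octal digits (which is what the numbers in the trace look like).

```python
import re

TRACE = """\
104873 rd 377774, 376
104903 rd 377775, 376
104933 rd 377776, 376
104963 rd 377777, 376
104989 rd 777000, 400 ***
105019 rd 400001, 400
105047 rd 400002, 400
"""

# sample number, operation, octal memory address, octal MA probe value
LINE = re.compile(r"(\d+)\s+(rd|wr)\s+([0-7]+),\s*([0-7]+)")


def mismatches(trace):
    """Yield (sample, address, ma_top) where memory address and MA disagree."""
    for line in trace.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        sample = int(m.group(1))
        address = int(m.group(3), 8)
        ma_top = int(m.group(4), 8)
        # Assumption: the probe column is the top 8 of 18 address bits,
        # shifted left one place so it prints as three octal digits.
        expected = (address >> 10) << 1
        if expected != ma_top:
            yield sample, address, ma_top


for sample, address, ma_top in mismatches(TRACE):
    print(f"sample {sample}: memory saw {address:06o}, MA top bits {ma_top:03o}")
```

Run against the excerpt above, the only line it complains about is the one marked with the stars.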

A PDP10 can really only talk to 256K words of memory at one time. How can the KI use 4MWs of memory? That is the Pager’s job! The Pager’s job is to translate the logical address that the CPU provides into a physical address of a hopefully larger memory. While running diagnostics, the Pager should be turned off, resulting in a maximum of 256KW of memory directly addressed from the MA register to the address lines going to memory. Something was going wrong here!

I added another set of 8 probes from the Logic Analyzer, and started moving backwards from the physical address going to the memory, to where the MA register fed into the Pager. When I got to the output of the CAM’s, there was something I didn’t understand.

What is a CAM you ask? CAM stands for “Content Addressable Memory”. What you do is give it the logical address that you want, and it will tell you if it knows about that, with a single line for each location inside itself. All four of them.

I got lucky, the first group of 8 output bits looked like this:

536691 wr 360631, 360, 400
536793 wr 362631, 362, 100
536844 wr 363631, 362, 040
536896 wr 364631, 364, 020
536947 wr 365631, 364, 010
536999 wr 366631, 366, 004
537052 wr 367631, 366, 006

Near as I can tell, there should be only a single 1 in the right column. It is octal, so we can watch which location in the CAM has the data as the addresses change, and when we get to 367631, we get two ones! I believe that output should have been a 002, not 006!
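The "only a single 1" rule is simple to test mechanically. Here is a small sketch using the address and right-hand column from the trace above; treating that column as a 9-bit field (three octal digits) is my assumption.

```python
# (memory address, CAM match output), both octal, copied from the trace.
CAM_TRACE = [
    (0o360631, 0o400),
    (0o362631, 0o100),
    (0o363631, 0o040),
    (0o364631, 0o020),
    (0o365631, 0o010),
    (0o366631, 0o004),
    (0o367631, 0o006),
]


def match_lines(word, width=9):
    """Return the bit positions that are 1 (assumed 9-bit field)."""
    return [i for i in range(width) if (word >> i) & 1]


def multiple_hits(trace):
    """Return (address, bit positions) for outputs that are not one-hot."""
    return [(addr, match_lines(out))
            for addr, out in trace
            if len(match_lines(out)) != 1]


for addr, bits in multiple_hits(CAM_TRACE):
    print(f"{addr:06o}: match lines {bits} are both claiming the address")
```

Only the last entry, 367631, comes back with more than one match line set.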

That output came from board 2PR09, so I swapped it and 2PR08, and I couldn’t run the diagnostic at all due to a “Page Fail Trap Error”! Ah, I think we are very close here! I checked the inventory, and we didn’t have a record for an M260 board, so I stole one from one of the machines that came in in September, and Voila, the memory test passed! It can even run TOPS-10 if we don’t try to initialize its serial ports. This could be correct since we stole a bunch of its serial lines to use on KATIA while the KI was asleep.

OK, since the KI, the KA, and the CDC are all working, I seem to have made it out of Computer Maintenance Hell for now. Give them a little while, one of them will fail.

Bruce Sherry

Adventures in PDP10 land

The PDP10-KI went down sometime in the fall, maybe October. This is the machine just to the right of the CDC6500 as you come in the second floor computer room. I noticed this fairly quickly and tried to reboot it, but it would hang when I tried, several times.

OK, it must be time to run diagnostics, which I proceeded to do. It passed all the diagnostics from DBKAA to DBKAH, which are the ones on paper tape, but the TD10 DECTape controller, just to the right of the console had quit again, as it has done almost every time I need to use it.

This TD10 and I just do not get along very well. It always kicks me around the block several times before it will let me know what is wrong, so I can fix it. Because of this history, it sometimes takes a while for me to generate the gumption to work on it. This time was no exception, since I was busy trying to get the KA working. If you don’t know about gumption and gumption traps, you should read the classic “Zen and the Art of Motorcycle Maintenance”, which doesn’t really talk about motorcycles, or Zen much.

Back to our story: We fixed the KA, moved it down to the second floor computer room, fixed it again, and had a small gathering to boot it the first time. The CDC was being its normal self, but was refusing to be down when I arrived at work, which is my signal to drop everything and work on it.
I was running out of reasons to avoid working on the KI.

Contrary to normal behavior, it only took a few hours to figure out the problem with the TD10, so I could run the diagnostics that are very awkward to load from paper tape.

All the normal diagnostics ran except for DBKAL. I think it took a few days to remember that there is a diagnostic for which the binary we have is wrong, and I have to patch a couple of locations after loading, in order for it to work. Now DBKAL and DBKAM work. On to some more obscure ones. All the CPU ones seem to pass, does it boot now? No, it still hangs after the OS is loaded, and we type “GO” to fire up timesharing.

What else can we test? We ran DDRHA, which tests the RH10 disk controller, but it passed. We then ran DDRPI, which tests the disk drives. Now we don’t really use the disk drives it is expecting to test, we use our MDE (Massbus Disk Emulator). We have been using this MDE for about 5 years now, but there could still be a bug hiding in there somewhere.

DDRPI looked like everything was fine for about 20 minutes, while it was doing register tests, seek tests, all ones and zeros tests. When it got to testing the surface of the disk, things started to go wrong. It would get an error where it looked like the data was misplaced, like it was reading the wrong sector or something like that.

How could that be? This thing has been working fine for over 5 years, in fact when we ran DDRPI from the KA, using the KI’s Memory, RH10s, DAS33, and MDE, EVERYTHING was FINE, even the surface test!

How about the memory? It passes my little MARCH memory test, from the KI or the KA. Dragging out the DECTape again, we loaded up DDMMD, which is one of the memory diagnostics. We fired it up, and it ran fine for about 15 minutes, whereupon it started spewing out errors. We have run this a BUNCH, and the errors usually seem to start at location 374000, and the data seems to be inverted from what it should be. The test complains about address bit 24.

We run the same test from the KA against the KI’s memory, and it works fine! It is handy to have the KA right across the room to enable this kind of testing.

What is really going on here? Let’s look at the console:

OK, TN=2, that means it is doing “Address” test. AS=F24 turns out to mean that it was doing fast addressing on address bit 24. “What does that mean” you may ask? I did! After much grovelling over the DDMMD listing, and consulting with Rich Alderson, I found that they would fill memory with the address and its complement, and then go through reading a location, verifying it was correct, and writing the complement in it, then go back and read and verify that they all had the complement. Ah, but what about that F24 part? When they are reading and complementing the data they start skipping locations, by changing which address bit they increment first. The first time they do this, they use bit 35 as the lsb, but next time through they shift the lsb over to bit 34, then 33 etc. When they get to bit 24, we run into this problem.
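To make the "shifting LSB" idea concrete, here is a sketch of one way to generate that address order. It is a guess reconstructed from the description, not code lifted from the DDMMD listing: it treats the address counter as rotated left so that its low bit lands on the chosen PDP-10 bit, with carries running upward and wrapping back around to bit 35.

```python
def fast_address_order(lsb_bit, addr_bits=18):
    """Yield every address once, counting with PDP-10 bit `lsb_bit` as LSB.

    PDP-10 bits are numbered from the left, so in a 36-bit word bit 35 is
    the least significant and an 18-bit address occupies bits 18 to 35.
    The counter is rotated left so that its low bit lands on `lsb_bit`;
    carries run upward and wrap around to bit 35.
    """
    shift = 35 - lsb_bit                  # how far the LSB has moved up
    mask = (1 << addr_bits) - 1
    for count in range(1 << addr_bits):
        rotated = (count << shift) | (count >> (addr_bits - shift))
        yield rotated & mask


# First few addresses with bit 31 acting as the LSB, printed in octal.
order = fast_address_order(31)
print(" ".join(f"{next(order):06o}" for _ in range(4)))
```

Every location still gets visited exactly once per pass; only the order changes, so the address lines toggle in a different pattern each time the LSB moves.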

How do I figure out what is really going on here? I decided to write a version of my MARCH that does this, MRCHFA. It took a while, but I finally got it to work on the KA, and tried it on the KI: Unfortunately the KI passed it too! What else are they doing differently? OK, more grovelling over the listing: They are stuffing all their inner loops down in the Fast AC’s to speed them up. On to MRCHF3, which pushes the inner loops down to the Fast AC’s. Does the KI fail that one? Nope!

I’m running out of ideas, where do we go from here? I decide to just watch it for a while, and see what happens AFTER it starts to fail. I see it fail bit 24 from 374000 to 377777, both bit 23 and 22 over the same range, then it starts failing location 700000, in the same way. Shortly thereafter, the program gives up in disgust, and stops printing the results, and starts ignoring the errors. Now I just watch the lights on the ARM10 memory.

I got used to the way the lights blink while working on MRCHFA and MRCHF3: as the LSB moves up the address, the slow blinking address follows it.

But wait, that isn’t what I am seeing: I see the lights increment from the bottom, do the FA thing, then increment from the bottom again, and then shift the LSB over. What is going on here? Is there anything on the ARM10 that will tell me anything? Yes, there are the read and write lights. While writing, both lights come on, but a read only lights the read light. After the FA thing, it just reads! Back to grovelling over the listing some more.

Ok, what they do is: fill memory from the bottom to the top, check and complement using the shifting LSB, and then check from the bottom to the top. So I wrote another new test called THEIRS, which of course doesn’t catch the problem either. I am running out of hair to pull out here!
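The three passes can be sketched against a simulated memory. This is not THEIRS or DDMMD: it stores just the address in each word rather than the address and its complement, uses a tiny 10-bit memory, and the `misroute` fault is a made-up stand-in for an address getting mangled on the way to memory.

```python
def rotate_left(value, shift, bits):
    """Rotate `value` left by `shift` places within a `bits`-wide field."""
    mask = (1 << bits) - 1
    return ((value << shift) | (value >> (bits - shift))) & mask


def address_test(read, write, bits, shift, word_mask=(1 << 36) - 1):
    """Run fill, check-and-complement, verify; return failing addresses."""
    size = 1 << bits
    errors = []
    # Pass 1: fill from bottom to top; each word holds its own address.
    for addr in range(size):
        write(addr, addr)
    # Pass 2: visit in shifted-LSB order, check, then write the complement.
    for count in range(size):
        addr = rotate_left(count, shift, bits)
        if read(addr) != addr:
            errors.append(addr)
        write(addr, ~addr & word_mask)
    # Pass 3: verify the complements from bottom to top.
    for addr in range(size):
        if read(addr) != (~addr & word_mask):
            errors.append(addr)
    return errors


def make_memory(misroute=None):
    """Return (read, write) for a simulated memory.

    `misroute` is a hypothetical fault: reads of the listed addresses are
    silently redirected to a different location.
    """
    cells = {}
    misroute = misroute or {}

    def write(addr, value):
        cells[addr] = value

    def read(addr):
        return cells[misroute.get(addr, addr)]

    return read, write


good = address_test(*make_memory(), bits=10, shift=3)
bad = address_test(*make_memory({0o1000: 0o1770}), bits=10, shift=3)
print(len(good), sorted(set(f"{a:04o}" for a in bad)))
```

A fault that is always present is caught easily by this kind of test; the trouble with the real machine is that the failure only showed up under particular conditions that my own tests never reproduced.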

As I write this, both machines are running DDMMD against the other’s memory, happily, no errors. No happy ending… YET!