The XKL Toad-1 System

The XKL Toad-1 System (hereafter “the Toad”) is an extended clone of the DECSYSTEM-20, the third generation of the PDP-10 family of computers from Digital Equipment Corporation. What does that mean? To answer that requires us to step back into the intertwined history of DEC, BBN,1 SAIL,2 and other parts of Stanford University’s computing community and environs.

It’s a long story. Get comfortable. I think it will be worth your time.

The PDP-10

The PDP-10 family (which includes the earlier PDP-6) is a typical mainframe computer of the mid-1960s. Like many science oriented computers prior to IBM’s System/360 line, the PDP-10 architecture addressed binary words which were 36 bits long, rather than individual characters as was common in business oriented systems. In instructions, memory addresses took up half of the 36 bit word; 18 bits is enough to address 262,144 locations, or 256KW, a very large memory in the days when each bit of magnetic core cost about $1.00.3 Typical installations had 64KW or 96KW attached. The KA-10 CPU in the first generation PDP-104 could not handle any more memory than that 256KW maximum.

Another important feature of the PDP-10 was timesharing, a facility by which each of multiple simultaneous users of the computer is given the illusion of being alone in interacting with the system. The PDP-6 was in fact the first commercially available system to feature interactive timesharing as a standard facility rather than as an added cost item.

TENEX

In the late 1960s, virtual memory was an important topic of research: How to make use of the much larger, less expensive capacity of direct access media such as disks and drums to extend the address space of computers, instead of the very expensive option of adding more core.5 One company looking at the issue was Bolt, Beranek, & Newman, who were interested in demand paged virtual memory, that is, in viewing memory as made up of chunks, or pages, accessed independently, and in what an operating system with access to such facilities would look like.

To facilitate this research, BBN created a pager which they attached to a DEC PDP-10, and began writing an operating system which they called TENEX, for “PDP-10 Executive”.6 TENEX was very different from Tops-10, the operating system provided by DEC, but was interactive in the same way as the older OS. The big difference was that more programs could run at the same time, because only the currently executing portions of each program needed to be present in the main (non-virtual) memory of the computer.

TENEX was a popular operating system, especially in university settings, so many PDP-10s had the BBN pager attached. In fact, the BBN pager was also used on a PDP-10 system which ran neither TENEX nor Tops-10, to wit, the WAITS system at SAIL.7

The DECsystem-10

The second generation of the PDP-10 underwent a name change, to the DECsystem-10, as well as gaining a faster new processor, the KI-10. This changed the way memory was handled, by adding a pager which divided memory up into 512 word blocks (“pages”). Programs were still restricted to 18 bits of address like previous generations, but the CPU could now handle 22 bits of address in the pager, so the physical memory could be up to four megawords (4MW), which is 16 times as much as the KA-10.

This pager was not compatible with, and was much less capable than, the BBN device, although DEC provided a version of TENEX modified to work with the KI pager for customers willing to pay extra. Some customers considered this to be too little, too late.

SAIL and the Super FOONLY

In the late 1960s, computer operating systems were an object of study in the broader area of artificial intelligence research. This was true of the Stanford Artificial Intelligence Laboratory, for example, where the PDP-6 timesharing monitor8 had been heavily modified to make it more useful for AI researchers. When the PDP-10 came out three years later, SAIL acquired one, attached a BBN pager, and connected it to the PDP-6, modifying the monitor (now named Tops-10) to run on both CPUs, with the 10 handing jobs off to the 6 if they called for equipment attached to the latter. By 1972, the monitor had diverged so greatly from Tops-10 that it received a new name, WAITS.

But the hardware was old and slow, and a faster system was desired. The KI-10 processor was underpowered from the perspective of the SAIL researchers, so they began designing their own PDP-10 compatible system, the Super FOONLY.9 This design featured a BBN style pager and used very fast semiconductors10 in its circuitry. It also expanded the pager address to 22 bits, like the KI-10, so was capable of addressing up to 4MW of memory. Finally, unlike the DEC systems, this system was built around the use of a fast microcoded processor which implemented the PDP-10 architecture as firmware rather than as special purpose hardware.

DECSYSTEM-20 and TOPS-20

DEC was aware of the discontent with their new system among customers; to remedy the situation, they purchased the design of the Super FOONLY from Stanford, and hired a graduate student from SAIL to install and maintain the SUDS drawing system at DEC’s facilities in Massachusetts. The decision was made to keep the KI-10 pager design in the hardware, and implement the BBN style pager in microcode.

Because of the demand for TENEX from a large part of their customer base, DEC also decided to port the BBN operating system to the new hardware based on the SAIL design. DEC added certain features to the new operating system which had been userland code libraries in TENEX, such as command processing, so that a single style of command handling was available to all programmers.

When DEC announced the new system as the DECSYSTEM-20, with its brand new operating system called TOPS-20, they fully expected that customers who wanted to use the new hardware would flock to it and port all of their applications from Tops-10 to TOPS-20, even though the new OS did not support many older peripherals on which the existing applications relied. The customers rebelled, and DEC was forced to port Tops-10 to the new hardware, offering different microcode to support the older OS on the new KL-10 processor.

Code Name: Jupiter

DEC focused on expanding the capabilities of their flagship minicomputer line, the PDP-11 family, for the next few years, with a planned enhancement to take the line from 16 bit mini to 32 bit supermini. The end result was an entirely new family, the VAX, which offered virtual memory like the PDP-10 mainframes in a new lower cost package.

But DEC did not forget their mainframe customer base. They began designing a new PDP-10 system, intended to include enhanced peripherals, support more memory, and run much faster than the KL-10 in the current Dec-10/DEC-20 systems. As part of the design, codenamed “Jupiter”, the limited 18 bit address space of the older systems was upgraded to 30 bits, that is, a memory size of one gigaword (1GW = 1024MW), which was nearly 2.5 times the size of the equivalent VAX memory, and far larger than the memory sizes available in the IBM offerings of the period.

Based on the promise of the Jupiter systems, customers made do with the KL-10 systems which were available, often running multiple systems to make up for the lack of horsepower. Features were added to the KL, by changes to the microcode as well as by adding new hardware. The KL-10 was enhanced with the ability to address the new 30-bit address space, although the implementation was limited to addressing 23 bits (where the hardware only handled 22); thus, although a system maxed out at 4MW, virtual memory could make it look like 8MW.

DEC also created a minicomputer sized variant of the PDP-10 family, which they called the DECSYSTEM-2020. This was intended to extend the family into department sized entities, rather than the corporation sized mainframe members of the family.11 There was also some interest in creating a desktop variant; one young engineer was well known for pushing the idea of a “10 on a desk”, although his idea was never prototyped at DEC.

DEC canceled the Jupiter project, apparently destined to be named the DECSYSTEM-40, in May 1983, with an announcement to the Large Systems customers at the semiannual DECUS symposia. Customer outrage was so great that DEC agreed to continue hardware development on the KL-10 until 1988, and software development across the family until 1993.

Stanford University Network

In 1980, there were about a dozen sites at Stanford University which housed PDP-10 systems, mostly KL-10 systems running TOPS-20 but also places like SAIL, which had attached a KL-10 to the WAITS dual processor. Three of the TOPS-20 sites were the Computer Science Department (“CSD”), the Graduate School of Business (“GSB”), and the academic computing facility called LOTS.12

At this time, local-area networking was seen as a key element in the future of computing, and the director of LOTS (whom we’ll call “R”) wrote a white paper on the future of Ethernet13 on the campus. R also envisioned a student computer, what today we would call a workstation, which featured a megabyte of memory, a million pixels on the screen, a processor capable of executing a million instructions per second, and an Ethernet connection capable of transferring a million bits of data per second, which he called the “4M machine”.

Networking also excited the director of the CSD computer facility, whom we’ll call “L”.14 L designed an Ethernet interface for the KL-10 processors in the DEC-20s which were ubiquitous at Stanford. This was dubbed the Massbus-Ethernet Interface Subsystem, or MEIS,15 pronounced “maze”.

The director of the GSB computer facility, whom we’ll call “S”, was likewise interested in networking, as well as being a brilliant programmer herself. (Of some importance to the story is the fact that she was eventually married to L.) S assigned one of the programmers working for her to add code to the TOPS-20 operating system to support the MEIS, using the PUP protocols created at PARC for the Alto personal computer.16

The various DEC-20 systems were scattered across the Stanford campus, each one freestanding in a computer room. R, L, and S ran miles of 50ohm coaxial cable, the medium of the original Ethernet, through the so-called steam tunnels under the campus, connecting all the new MEISes together. Now, it was possible to transfer files between DEC-20s from the command line rather than by writing them to a tape and carrying them from one site to another. It was also possible to log in from one DEC-20 to another–but using one mainframe to connect to another seemed wasteful of resources on the source system, so L came up with a solution.

R’s dream of a 4M machine had borne fruit: While still at CSD, he had a graduate student create the design for the Stanford University Network processor board. L repurposed the SUN-1 board17 as the processor in a terminal interface processor (“EtherTIP”), in imitation of the TIPs used by systems connected to the ARPANET and to commercial networks like Tymnet and Telenet. Now, instead of wiring terminals directly to a single mainframe, and using the mainframe to connect from one place to another, the terminals could be wired to an EtherTIP and could freely connect to any system on the Ethernet.

A feature of the PUP protocols invented at PARC was the concept of internetworking, connecting two or more Ethernets together to make a larger network. This is done by using a computer connected to both networks to forward data from each to the other. At PARC, a dedicated Alto acted as the router for this purpose; L designated some of the SUN-1 based systems as routers rather than as EtherTIPs, and the Stanford network was complete.

Stanford University also supported a number of researchers who were given access to the ARPANET as part of their government sponsored research, so several of the PDP-10s on campus were connected to the ARPANET. When the ARPANET converted to using the TCP/IP protocols which had been developed for the purpose of bringing internetworking to wide area networks, our threesome were ready, and assigned programmers from CSD, GSB, and LOTS to make L’s Ethernet routers speak TCP/IP as well as PUP. TOPS-20 was also updated to use TCP/IP, by Stanford programmers as well as by DEC.

S and L saw a business opportunity in all this, and founded a small company to sell the MEIS and the associated routers and TIPs to companies and universities who wanted to add Ethernet to their facilities. They saw this as a way to finance the development of L’s long-cherished dream of a desktop PDP-10. They eventually left Stanford as the company grew, having tapped the exploding networking market at just the right time. The company grew so large, in fact, that the board of directors discarded the plan to build L’s system, and so the founders left Cisco Systems to pursue other opportunities.

XKL

L moved to Redmond in 1990, where he founded XKL Systems Corporation. This company had as its business plan to build the “10 on a desk”. The product was codenamed “TOAD”, which is what L had been calling his idea for a decade and a half because “Ten On A Desktop” is a mouthful. He hired a small team of engineers, including his old friend R from Stanford, to build a system which implemented the full 30-bit address space which DEC had abandoned with the cancelled Jupiter project, and which included modern peripherals and networking capabilities.18 R was assigned as Chief Architect; his job was to ensure that the TOAD was fully compatible with the entire PDP-10 family, without necessarily replicating every bug in the earlier systems.

R also oversaw the port of TOPS-20 to the new hardware, although some boards19 had a pair of engineers assigned: One handled the detailed design and implementation of the board, while the other worked on the changes to the relevant portion of the operating system. R was responsible for the changes which related to the TOAD’s new bus architecture, as well as those relating to the much larger memory which the TOAD supported and the new CPU.20

The TOAD was supposed to come to market with a boring name, the “TD-1”, but ran into trademark issues. By that time, I was working at XKL, officially doing pre- and post-sales customer advocacy, but also working on the TOPS-20 port.21 Part of my customer advocacy duties was some low-key marketing; when we lost the proposed name, I pointed out that people had been hearing about L’s TOAD for years, and we should simply go with it; S, considered the unofficial “Arbiter of Taste” at XKL, agreed with me.22 We officially introduced the XKL Toad-1 System at a DECUS trade show in the spring of 1995.

On COBOL and Legacy Systems

Two recent articles1 (and I’m sure there are more which I haven’t seen yet) responding to a call for help from Gov. Phil Murphy of New Jersey decry the continued use of the COBOL programming language in legacy systems which have been in place for decades.

What’s irritating about these articles is the notion that COBOL is outmoded, outdated, old fashioned, because the language was first defined and implemented more than 60 years ago, and the applications which are having issues currently were written more than 40 years ago. I say, so what?

I spent several years—40 years ago!—programming in COBOL, supporting the financial systems of the University of Chicago. The most important thing to remember in financial systems is the maxim, If it ain’t broke, don’t fix it! The second is When it breaks, fix it right!

COBOL had two standards when I used it in my job, from 1968 and 1974. The differences were minuscule, mostly tightening up some ambiguities and eliminating minor features which had proven not to be useful. Since then, new standards addressing new conditions came out in 1985 (structured programming primitives, freeform source permitted), 2002 (object-oriented paradigm added), and 2014 (whose changes I haven’t seen, yet, but I will, just because). Contrast this stability with the weekly updates from Microsoft, Apple, and the GNU/Linux communities.

The article by Steinberg points to the “pain” suffered in legacy systems when the so-called Y2K bug was a concern in the late 1990s. As those of us in the industry since the late 1960s are aware, that was a great deal of worry over nothing, because we had been aware of the possibility of our systems lasting that long when we wrote them. Yes, there were managers who insisted that none of this code would last long enough for two-digit years to be an issue, but front-line programmer/analysts who were supporting legacy code from the 1950s knew better, and did the right thing.

Those who do not know COBOL buy into the notion that somehow it is less secure because it is old, but data security has always been a concern in financial systems in traditional data processing shops, where the mainframe has been king all along. Who has ever heard of an on-line exploit targeting a weakness of a COBOL program running on a mainframe operating system, whether IBM or Unisys?2 The security systems created for those operating systems have always been strong, and were very rarely (never?) retrofits trying to fix weaknesses discovered by crackers.3

So, in my not so very humble opinion, the call to scrap these legacy systems and redo all the work in the flavor of the moment system is misguided, to say the least. The governor has the right idea: Get some people who know COBOL to fix the problems the right way, perhaps updating the language used where that is called for, and move on smartly. His call for “volunteers” to do the work is a wistful dream, of course, because the people who can do this work command hourly rates in the range of $200/hour, but they’re out there if the State of New Jersey is willing to do things right.

                                                                Of course, that’s just my opinion. I could be wrong.
                                                                                                                             –Dennis Miller

The PDP-10, Macro-10, and Altair 8800 BASIC

When the Altair 8800 microcomputer was announced by MITS in the January 1975 issue of Popular Electronics, it excited the dreams of two young men in Cambridge, Massachusetts, Paul Allen and Bill Gates. These two had long wanted to write a version of the BASIC language in which they had first learned to program several years earlier in high school, and here was a system for which they could do that.

How do you write a program for a computer which you don’t have? You use another computer and write a program to write your programs with. Paul Allen had already done this once, for an earlier project called Traf-o-Data which the two had created. To understand how it worked, we’ll first look at the computer on which they first developed their skills at system design and programming, the Digital Equipment Corporation PDP-101.

The PDP-10

The PDP-102 was the second generation of a line of mainframe computers from DEC, introduced in 1967 as a follow-on to the PDP-6 using the technology invented for the PDP-7 minicomputer. These systems arose in a time before the ubiquitous fixed size 8-bit byte, following an older standard with 36 bits per word in memory or on tape or disk media. Here, bytes were defined by the programmer for specific uses, and could be anywhere from 1 to 36 bits long as needed.

A feature of the PDP-10 which is important for our story has to do with the machine language of the central processor, which divides the 36 bit word3 up into several meaningful portions: a 9 bit field specifying an operation (“opcode”), 4 bits to select a register destination for the operation (“AC”), 1 bit to indicate whether the operation acts on the data addressed by the instruction or instead on data addressed by the contents of that location (“indirect”), 4 bits to select a register holding part of the address on which the instruction will operate (“index”), and 18 bits specifying part or all of the address or data on which the instruction will operate (“address”).
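
For the curious, here is that layout as a quick sketch in C (my illustration, not DEC code: the 36 bit word rides in a uint64_t, and the example word is a made-up MOVE 5,123(4)):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t word = 0200244000123ULL;          /* MOVE 5,123(4), in octal */

        unsigned opcode   = (word >> 27) & 0777;   /* bits 0-8:   9 bit opcode */
        unsigned ac       = (word >> 23) & 017;    /* bits 9-12:  AC field */
        unsigned indirect = (word >> 22) & 01;     /* bit 13:     indirect bit */
        unsigned index    = (word >> 18) & 017;    /* bits 14-17: index register */
        unsigned address  = word & 0777777;        /* bits 18-35: address field */

        printf("op=%03o ac=%02o i=%o x=%02o y=%06o\n",
               opcode, ac, indirect, index, address);
        return 0;
    }

Running it prints op=200 ac=05 i=0 x=04 y=000123: opcode MOVE, accumulator 5, no indirection, indexed by register 4, address 123.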

Now 9 bits can specify up to 512 different possible instructions, but the PDP-10 only defines a few more than 360.4 In most processors, that would be the end of the discussion, but DEC did things differently: They created a way for the otherwise unused instructions to be available to the programmer or the operating system, and called them UUOs.5 There were two classes of UUOs: “Monitor UUOs”, defined as opcodes 040-077,6 were used by the operating system7 to implement complex system calls such as opening files, moving data between peripherals, etc., while “user UUOs” (opcodes 001-037) were available to any programmer as a fast way to call routines in her program in the same way system calls worked.8

The monitor UUOs are assigned names like OPEN, CLOSE, INPUT, and so on, so that the programmer can simply use them like other instruction mnemonics like MOVE or ADD, but the user UUOs do not have names since they are intended to be program specific. The assembler language program, called Macro-10, provides a way to give a name to an operation, and will use that name just like the predefined mnemonics provided by DEC.

Macro-10

Assembler programs have existed since the early days of computing, as a way to eliminate programmer errors introduced by translating instructions into numerical representations by hand. They originated as simple tables of substitutions, but grew in capabilities as experience showed the possibilities for computers to manipulate text files. The most important such capability was the invention of “macroinstructions”, a way for the programmer to define a group of instructions which differed in specific ways which could be defined so that the assembler program could build a program more efficiently. An example will make this more clear.

The idea of a macro is to give a name to a block of text. Then when we mention the name, the corresponding text appears in the program and is treated by the assembler as if it were there all along. Our specific example is9

        MOVEI   2,15
        IDPB    2,1
        MOVEI   2,12
        IDPB    2,1
        MOVEI   2,0
        IDPB    2,1

This code takes a byte pointer in AC 1 which points to the end of a text string, and adds an ASCII carriage return (octal 15 = ^M = CR), line feed (octal 12 = ^J = LF), and a null character (octal 0 = ^@ = NUL) to the end of the string, using AC 2 to hold each byte in turn.

Writing these 6 lines of code repeatedly throughout a program would be tiresome, so we define a macro to do this for us:

    DEFINE ACRLF <
        MOVEI   2,15
        IDPB    2,1
        MOVEI   2,12
        IDPB    2,1
        MOVEI   2,0
        IDPB    2,1 >;End of ACRLF

We can now use ACRLF to “Add a CR and LF” anywhere in our program that we might type an instruction.

Suppose we want to be able to put a byte from any AC, not just 2, into any string pointed to by a byte pointer, not just one in AC 1. In that case, we define our macro to take arguments, which the assembler program will substitute into the macro body when the macro is used:

    DEFINE ACRLF (ACC,BYP) <
        MOVEI   ACC,15
        IDPB    ACC,BYP
        MOVEI   ACC,12
        IDPB    ACC,BYP
        MOVEI   ACC,0
        IDPB    ACC,BYP >;End of ACRLF

Now, to use our macro to pad the string pointed to by PTR-1(Y) using the AC which we have named C in our program:

        ACRLF C,PTR-1(Y)

which will expand to (that is, the following code will appear to the assembler as if we had typed it in directly)

        MOVEI   C,15
        IDPB    C,PTR-1(Y)
        MOVEI   C,12
        IDPB    C,PTR-1(Y)
        MOVEI   C,0
        IDPB    C,PTR-1(Y)

Why is this important?

Paul Allen’s 8008 Simulator Program

Years before Microsoft, Paul Allen and Bill Gates started another company called Traf-o-Data, to build a traffic counter using the Intel 8008 microprocessor chip. While the Traf-o-Data hardware was being designed and built, Paul wanted to begin programming the 8008 but had no computer for the purpose, so he wrote a PDP-10 program to simulate the chip instead.10

The way it worked was simple. Intel published a data sheet which described what each instruction did with every 8 bit byte decoded. Paul wrote routines to perform the same operations on data in the PDP-10, storing the 8 bits of the 8008’s words in the 36 bits of the PDP-10’s. He defined UUOs to call these routines, and wrote macros which looked like the 8008 instructions published by Intel but expanded into his PDP-10 UUOs when assembled with Macro-10. Now Paul and Bill could write 8008 code for their Traf-o-Data hardware and test the resulting program using a PDP-10!
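
Paul’s actual sources live on those DECtapes, so what follows is only my sketch of the shape of such a simulator, in C and with every name invented for the occasion: one host routine per 8008 operation, selected by the fetched opcode, which is the role the UUO dispatch played in the Macro-10 original.

    #include <stdint.h>

    /* 8008 machine state, held in host variables the way the original
       held it in PDP-10 registers and memory. */
    typedef struct {
        uint8_t  a;               /* accumulator */
        uint8_t  carry;           /* carry flag */
        uint16_t pc;              /* 14 bit program counter */
        uint8_t  mem[1 << 14];    /* the 8008's 16K address space */
    } Sim8008;

    typedef void (*OpFn)(Sim8008 *s);

    /* One handler per 8 bit opcode, filled in from the Intel data
       sheet, e.g. dispatch[004] = op_adi; */
    static OpFn dispatch[256];

    /* Example handler: add an immediate byte to the accumulator. */
    static void op_adi(Sim8008 *s)
    {
        unsigned r = s->a + s->mem[s->pc++ & 037777];
        s->carry = r > 0377;      /* carry out of bit 7 */
        s->a = (uint8_t)r;
    }

    /* The fetch-decode loop: fetch a byte, call its routine. */
    void step(Sim8008 *s)
    {
        uint8_t op = s->mem[s->pc++ & 037777];
        if (dispatch[op])         /* unimplemented opcodes ignored here */
            dispatch[op](s);
    }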

Altair BASIC

Paul had always wanted to write a version of BASIC, but Bill pointed out the inadequacy of the 8008 for Paul’s ideas. Not to be put off, Paul kept watching the news of the electronics world, and when Intel brought out the 8080 he noted to Bill that all of the objections to the 8008 had been addressed in the new chip. Bill responded that no one had built a computer which used the 8080. That was the state of things when the January 1975 issue of Popular Electronics hit the news stands in mid-December 1974, announcing the Altair 8800.

Paul and Bill were excited to learn that although the PE article said that BASIC was available for the new computer, no such program existed yet. Paul revised his 8008 simulator for the 8080 instruction set, and they began writing a version of BASIC using the new simulator program to test their code.

The 8080 simulator still exists on DECtapes belonging to Paul Allen’s estate. The contents of these tapes were copied onto the XKL Toad-1 which Paul purchased in 1997, and later onto the DECSYSTEM-2065 on which he debugged the early version of Altair BASIC which also resides on those DECtapes and which he demonstrated for Lesley Stahl on “60 Minutes” in 2011.

The following is a typeout of a short session on the DECsystem-10 at LCM+L with the 8080 simulator running an early version of Altair BASIC:

 *
 * Authorized users only - All activity is monitored.
 * To request access visit https://livingcomputers.org
 * Visit https://wiki.livingcomputers.org for documentation.
 *
 * Use ssh kl2065@tty.livingcomputers.org to connect.
 *
 * CTRL-] quit to return to the menu
 *
 PDPplanet DECsystem-10 #3530 TOPS-10 v7.04 
 .login alderson
 Job 16  PDPplanet DECsystem-10  TTY7
 Password: 
 [LGNLAS Last access to [20,20] succeeded on 24-Mar-120:12:38:55]
 14:09   24-Mar-120   Tuesday

 .run tbasic[20,10]

 * SIM-8080 EXECUTION *
 
 MEMORY SIZE? 8192

 STRING SPACE? 1024

 WANT SIN-COS-TAN-ATN? Y

 1276 BYTES FREE

 ALTAIR BASIC VERSION 1.1
 [EIGHT-K VERSION]

 OK
 PRINT 2+2

 4 

 OK
 10 FOR I=0 TO 10

 20 PRINT I

 30 NEXT I

 40 END

 LIST

 10 FOR I=0 TO 10
 20 PRINT I
 30 NEXT I
 40 END
 OK
 RUN

  0 
  1 
  2 
  3 
  4 
  5 
  6 
  7 
  8 
  9 
  10
 
 OK
 ^C

 .kjob
 Job 16  User ALDERSON [20,20]
 Logged-off TTY7  at 14:11:36  on 24-Mar-120
 Runtime: 0:00:01, KCS:40, Connect time: 0:02:04
 Disk Reads:173, Writes:6, Blocks saved:80570

Descendants of this simulator were used at Microsoft to write a number of products until the end of the 1980s, when microcomputers finally became powerful enough to host sophisticated development environments.





First languages: Rich Alderson

I was introduced to computers in the form of “Computer Math”, a high school class in programming in FORTRAN IV on the IBM 1401. My first exposure was as a guest of friends from the chess club, who were taking CM in the autumn of 1968; I was sorry that I had not known about the 1-semester class when I was signing up for my senior classes the previous spring.

This was the second year the class was taught, and demand was so high that the school district decided to offer it again in the spring. I rearranged my schedule, with the aid of the faculty adviser of the chess club (chair of the English department), and so began my life with computers.

The FORTRAN class was the usual, with lots of math oriented assignments as one might expect, since the teacher was the chair of the Math department. We learned to calculate areas of triangles, parallelograms, and so on, and how to make the printer and card punch do tricks. Exciting stuff (NOT).

Fortunately for me, my friends from the chess club were offered the opportunity to do a second semester of programming, taking classes in COBOL and PL/1 on Saturdays at the Illinois Institute of Technology. They used programmed instruction texts (read a paragraph or two, answer a question, see the answer on the next page, lather, rinse, repeat to end of book). I borrowed the two texts, read them cover to cover over the weekend, and proceeded to do all my assignments in 3 different languages.

I quickly fell in love with PL/1, which combined the mathematical capabilities of FORTRAN IV with the commercial processing capabilities of COBOL, and threw in marvelous string handling. Since I was interested in human languages already, this was a wonder and delight to me, to be able to string characters together on the fly instead of laboriously building a FORMAT statement to print a single line of text over and over.

For our final project in Computer Math, we were allowed to choose from several possibilities, such as “compute pi or e to 1000 places”. One possibility was to calculate the payroll for a mythical company; this is what I chose. I even used the 1968 tax tables, which included formulae for each bracket, to calculate deductions from people’s checks.

When it came time to turn in our projects, I showed the teacher all three versions of my program. He was dumbfounded. I got an A.

That began my lifelong interest in programming languages. Over the years, I have learned a couple of dozen, and have written compilers or interpreters for a few. I was always more interested in what I could do with a computer than in the physical details of how the computer worked, for years and years. That lasted until I went to work for a company building a new generation of one of the systems on which I made a living for decades.

But that’s a topic for another day.

Time-sharing in 12KW: Running TSS/8 On Real PDP-8 Hardware

But first, a little history

Digital Equipment Corporation’s PDP-8 minicomputer was a small but incredibly flexible computer. Introduced in 1965 at a cost of $18,000, it created a new market for small computers, and soon PDP-8s found themselves used for all sorts of tasks: Industrial control, laboratory data capture and analysis, word processing, software development, and education. They controlled San Francisco’s BART subway displays, ran the scoreboard at Fenway Park, and assisted in brain surgery.

They were also used in early forays into time-sharing systems. Time-sharing stood in stark contrast to the batch processing systems that were popular at the time: Whereas batch processing systems were generally hands-off systems (where you’d submit a stack of punched cards to an operator and get your results back days later) a time-sharing system allowed multiple users to interact conversationally with a single computer at the same time. These systems did so by giving each user a tiny timeslice of the computer: each user’s program would run for a few hundred milliseconds before another user’s program would get a chance. This switching happens so quickly that it is imperceptible to users — providing the illusion that each user has the entire computer to themselves. Sharing the system in this manner allowed for more efficient use of computing resources in many cases.
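
The scheduling idea in miniature, as a C sketch (mine, and deliberately oversimplified; run_for_quantum is a hypothetical stand-in for the real dispatch-and-clock-interrupt machinery):

    /* Round-robin timeslicing: run each job for one quantum, move on. */
    typedef struct Job { int id; /* saved registers, memory map, ... */ } Job;

    extern void run_for_quantum(Job *j);   /* run ~100ms, then return */

    void scheduler(Job jobs[], int njobs)
    {
        for (;;)                           /* forever: */
            for (int i = 0; i < njobs; i++)
                run_for_quantum(&jobs[i]); /* each user gets a turn in rotation */
    }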

TSS/8 was one such time-sharing endeavor, started as a research project at Carnegie-Mellon University in 1967. A PDP-8 system outfitted with 24KW of memory could comfortably support 20 simultaneous users. Each user got what appeared to them as a 4K PDP-8 system with which they were free to do whatever they pleased, and the system was (in theory, at least) impervious to user behavior: a badly behaved user program could not affect the system or other users.

With assistance from DEC, TSS/8 was fleshed out into a stable system and was made available to the world at large in 1968, eventually selling over a hundred copies. It was modestly popular in high schools and universities, where it provided a cost-effective means to provide computing resources for education. While it was never a widespread success and was eventually forgotten and supplanted on the PDP-8 by single-user operating systems like OS/8, TSS/8 was a significant development, as Gordon Bell notes:

“While only a hundred or so systems were sold, TSS/8 was significant because it established the notion that multiprogramming applied even to minicomputers. Until recently, TSS/8 was the lowest cost (per system and per user) and highest performance/cost timesharing system. A major side benefit of TSS/8 was the training of the implementors, who went on to implement the RSTS timesharing system for the PDP-11 based on the BASIC language.”

Gordon Bell, “Computer Engineering: A DEC View of Hardware Systems Design,” 1978

It is quite notable that DEC made such a system possible on a machine as small as the PDP-8: An effective time-sharing system requires assistance from the hardware to allow separation of privileges and isolation of processes — without these there would be no way to stop a user’s program from doing whatever it wanted to with the system: trampling on other users’ programs or wreaking havoc with system devices either maliciously or accidentally. So DEC had to go out of their way to support time-sharing on the PDP-8.

PDP-8 Time-Sharing Hardware

In combination with the MC8/I memory extension (which allowed up to 32KW of memory to be addressed by the PDP-8), the KT8/I was the hardware option that made this possible, and was available on the PDP-8/I as an option at its introduction. The KT8 option was made available for the original PDP-8 around this time as well.

So what does the KT8/I do (in combination with the MC8/I) that makes time-sharing on the PDP-8 feasible? First, it provides two privilege levels for program execution: Executive, and User. The PDP-8 normally runs at the Executive privilege level — at this level all instructions can be executed normally. Under the User privilege level, most instructions execute as normal, but certain instructions are forbidden and cause a trap. On the PDP-8, trappable instructions are:

  • IOTs (I/O Transfer instructions, generally used for controlling hardware and peripherals).
  • The HLT (Halt) instruction, which normally stops the processor.
  • The OSR and LAS instructions, which access the front panel’s switch register.

Under a time-sharing system such as TSS/8, the operating system’s kernel (or “Monitor” in TSS parlance) runs at the Executive privilege level. The Monitor can then control the hardware and deal with scheduling user processes.

User processes (or “Jobs” in TSS) run at the User level (as you might have guessed by the name). At this level, user programs can do whatever they want, but if one of the classes of instructions listed above is executed, the user’s program is suspended (the processor traps the instruction via an interrupt) and the PDP-8’s processor returns to the Monitor in Executive mode to deal with the privileged instruction. If the instruction is indeed one that a user program is not allowed to execute, the Monitor may choose to terminate the user program. In many cases, IOTs are used as a mechanism for user programs to request a service from the Monitor. For example, a user program might execute an IOT to open a file, type a character to the terminal, or to send a message to another user. Executing this IOT causes a trap; the Monitor examines the trapped instruction and translates it into the appropriate action, after which it resumes execution of the user’s program in User mode.
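
In modern terms, the flow looks something like this C sketch (schematic only, with hypothetical names; the real Monitor does all of this in PDP-8 assembly, of course):

    #include <stdint.h>

    /* Hypothetical Monitor hooks. */
    extern void service_call(int job, uint16_t iot);  /* open file, type char, ... */
    extern void kill_job(int job);
    extern void resume_user(int job);

    /* Entered in Executive mode when a user-mode instruction traps. */
    void trap_handler(int job, uint16_t instr)
    {
        unsigned opcode = (instr >> 9) & 07;  /* top 3 bits of the 12 bit word */

        if (opcode == 06) {                   /* opcode 6 is an IOT */
            service_call(job, instr);         /* translate it into an action */
            resume_user(job);                 /* and drop back into User mode */
        } else {
            kill_job(job);                    /* HLT, OSR, LAS: not allowed */
        }
    }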

Thus the privileged Executive mode and the unprivileged User mode make it possible to build an operating system that can prevent user processes from interfering with the functioning of the system’s hardware. The MC8/I Memory Extension hardware provided the other piece: Compartmentalizing user processes so they can’t stomp on other user programs or the operating system itself.

A basic PDP-8 system has a 12-bit address space and is thus capable of addressing only 4KW of memory. The MC8/I allowed extending memory up to 32KW in 4KW fields of memory — it did so by providing a three bit wide Extended Memory Address register (which thus provided up to 8 fields.) This did not provide a linear (flat) memory space: The PDP-8 processor could still only directly address 4096 words. But it did allow the processor to access data or execute instructions from any of these 8 fields of memory by executing a special IOT which caused future memory accesses and/or program instructions to come from a new field.
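
The resulting address arithmetic is simple enough to fit in a line of C (my sketch, not DEC’s logic): a 3 bit field number and a 12 bit in-field address combine into a 15 bit physical address.

    /* 4096-word fields: field 0 covers octal addresses 0-7777,
       field 1 covers 10000-17777, and so on up through field 7. */
    unsigned physical_address(unsigned field, unsigned addr)
    {
        return ((field & 07) << 12) | (addr & 07777);
    }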

With this hardware assistance it becomes (relatively) trivial to limit a user program to stay within its own 4KW field: if it attempts to execute a memory management IOT to switch fields, the KT8/I will cause a trap and the Monitor can either abort the user’s program or ensure that the field switch was a valid one (swapping in memory or moving things around to ensure that the right field is in the right place). (This latter proves to be significantly more difficult to do, for reasons I will spare you the details on. You’re welcome.)

This article’s supposed to be about running TSS/8 on a real PDP-8, so let’s talk about that then, shall we?

Where were we? Oh yes, TSS/8.

TSS/8 was initially designed to run on a PDP-8/I (introduced 1968) or the original PDP-8 (1965) equipped with the following hardware at minimum:

  • 12KW of memory
  • KT8/I and MC8/I or equivalent
  • A programmable or line-time clock (KB8/I)
  • An RF08 or DF32 fixed head disc controller with up to four RS08s or DS32 fixed head disks

Optionally supported were the TC08 DECtape controller and a DC08 or PT08 terminal controller for connecting up multiple user terminals. As time went on, TSS/8 was extended to support the newer Omnibus PDP-8 models and peripherals: The PDP-8/e (1970), 8/f, 8/m and the 8/a introduced in 1976.

TSS/8 used an RF08 or DF32 disc for storing the operating system, swapping space, and the user filesystem. Of these the most critical application was swapping: each user on the system got 4KW of swap space on the disk for their current job — as multiple users shared the system and there came to be more users than memory could hold, a user’s program would be swapped out to disk to allow another user’s program to run, then swapped back in at a later time. Thus a fast transfer rate with minimal latency was required: The RF08, being a fixed-head disk, had very little latency (averaging about 17ms due to rotational delays) and had a transfer rate of about 62KW/second.

Fixed head disks also had the advantage of being word addressable, unlike many later storage mechanisms which read data a sector at a time. This made transfers of small amounts of data (like filesystem structures) more efficient as only the necessary data needed to be transferred.

Our RF08 Controller with two RS08 drives (256KW capacity each)

We’ve wanted to get TSS/8 running at the museum for a long time. The biggest impediment to running TSS/8 on real hardware in this year of 2019 is the requirement for a fixed-head disk. There are not many RF08s or DF32s left in the world these days, and the ones that remain are difficult to keep operational in the long term. We have contemplated restoring a PDP-8/I and the one RF08 controller (with two RS08 discs) in our collection or building an RF08 emulator, but I thought it would be an interesting exercise to get it to run on the PDP-8/e we already have on exhibit on the second floor, with the hardware we already have restored and operational.

LCM+L’s PDP-8/e. RK05 drive on the left.

Our 8/e is outfitted with an RK05 drive, controlled by the usual RK8E Omnibus controller. The RK05 is a removable pack drive with a capacity of approximately 2.5MB and a transfer rate of 102KW/sec. On paper it didn’t seem infeasible to run a time-sharing system with an RK05 instead of a RF08 — each user’s 4K swap area transposes nicely to a single track on an RK05 (a single track is 16 sectors of 256 words yielding 4096 words) and the capacity is larger than the maximum size for an RF08 controller (1.6MW vs 1.0MW). However, the seek time of the RK05 (10ms track-to-track, 50ms average vs. no seek time on the RF08) means performance is going to be lower; the only question is by how much. My theory was that while the system would be slower it would still be usable. Only one way to find out, I figured.
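
The geometry behind that transposition, as arithmetic (a sketch; the constants are the RK05’s, the function name is mine):

    #define WORDS_PER_SECTOR  256
    #define SECTORS_PER_TRACK 16    /* 16 * 256 = 4096 words = one 4KW field */

    /* The disk sector holding the nth 256-word chunk of the swap image
       living on the given track. */
    unsigned swap_sector(unsigned track, unsigned n)
    {
        return track * SECTORS_PER_TRACK + (n & 017);
    }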

Finding TSS/8 Sources

Of course, in order to modify the system it would be useful to have access to the original source code. Fortunately the heavy lifting here has already been done: John Wilson transcribed a set of source listings way back in the 1980s and made them available on the Internet in the early 2000s. Since then a couple of PDP-8 hackers (Brad Parker and Vincent Slyngstad) combined efforts to make those source listings build again, and made the results available here. Cloning that repository provides the sources and the tools necessary to assemble the TSS/8 source code and build an RF08 disk containing the resultant binaries along with a working TSS/8 filesystem. I began with this as a base and started in to hacking away.

Hacking Away

The first thing one notices when perusing the TSS/8 source is that it has comments. Lots of comments. Useful comments. I would like to extend my heartfelt thanks to the original authors of this code: you are the greatest.

Lookit’ them comments: That’s the way you do it!

There are two modules in TSS/8 that need modifications: INIT and TS8. Everything else builds on top of these. INIT is a combination of bootstrap, diagnostic, backup, and patching tool. Most of the time it’s used to cold boot the TSS/8 system: It reads TS8 into fields 0 and 1 of the PDP-8’s memory and starts executing it. TS8 is the TSS/8 Monitor (analogous to the “kernel” in modern parlance). It manages the hardware, schedules user jobs, and executes user requests.

It made sense to make changes to INIT first, since it brings up the rest of the system. These changes ended up being fairly straightforward as everything involved with booting the system read entire 4K tracks in at a time, nothing complicated. (I still have yet to modify the DECtape dump/restore routines, however.)

The code for TS8, the TSS/8 Monitor, lives in ts8.pal, and this is where the bulk of the code changes live. The Monitor contains the low-level disk I/O routines used by the rest of the system. I spent some time studying the code in ts8.pal to understand better what needed to be changed and it all boiled down to two sets of routines: one used for swapping processes in and out 4KW at a time, and one used for filesystem transfers of arbitrary size.

I started with the former as it seemed the less daunting task. The swapping code is given a 4K block of memory to transfer either to (“swapping out”) or from (“swapping in”) the fixed-head disk. For the DF32 and RF08 controllers this is simple: You just tell the controller “copy 4KW from here and put it over there” (more or less) and it goes off and does it and causes an interrupt to let the processor know when it’s done. Simple:

SWPIN,    0
     DCMA        /TO STOP THE DISC
     TAD SWINA   /RETURN ADDRESS FOR INTURRUPT CHAIN
     DCA I DSWATA    /SAVE IT
     TAD INTRC   /GET THE TRAC # TO BE READ IN
     IFZERO RF08-40 <     
     TAD SQREQ   /FIELD TO BE USED     
     DEAL     
     CLA     
     NOP     /JUST FOR PROPER LENGTH     >
     IFZERO RF08 <     
     DXAL     
     TAD SQREQ   /FIELD TO BE SWAPPED IN     
     TAD C0500   /ENABLE INTERRUPT ON ERROR AND ON COMPLETION     
     DIML     >
     DCA DSWC    /WORD COUNT
     CMA
     DCA DSMA    /CORE ADDRESS
     DMAR
     JMP I SWPIN

SWPTR,    JMP SWPERR      /OOPS
     TAD FINISH      /DID WE JUST SWAP IN OR OUT?
     SMA
     JMP SWPOK       /IN; SO WE'RE FINISHED
     CIA
     DCA FINISH      /SAVE IT
     JMS SWPIO       /START SWAP IN
     DISMIS          /GO BACK TO WHAT WE WERE DOING

For the RK05 things are a bit more complicated: The RK8E controller can only transfer data one sector (256 words) at a time, so my new swapping code would need to run 16 times (and be interrupted 16 times) in order to transfer a full 4KW. And it would have to keep track of the source and destination addresses manually. Obviously this code was going to take up more space, and space was already at a premium in this code (the TSS/8 Monitor gets a mere 8KW to do everything it needs to do). After fighting with the assembler and optimizing and testing things I came up with:

SWPIN, TAD SQREQ                 / GET FIELD TO BE SWAPPED IN
     TAD C0400                   / READ SECTOR, INTERRUPT
     DLDC                        / LOAD COMMAND REGISTER:
                                 / FIELD IS IN BITS 6-8;
                                 / INTERRUPTS ENABLED ON TRANSFER COMPLETE
                                 / OF A 256-WORD READ TO DRIVE ZERO.
     TAD     INTRC               / GET THE TRACK # TO READ FROM
     TAD     RKSWSE              / ADD SECTOR
     DLAG                        / LOAD ADDRESS, GO
     JMP I   SWPIT
     
 / FOR RK05:
 / ON EACH RETURN HERE, CHECK STATUS REG (ERROR OR SUCCESS MODIFIES
 / ENTRY ADDRESS TO SWPTR)
 / ON COMPLETION, INC. SECTOR COUNT, DO NEXT SECTOR.  ON LAST SECTOR
 / FINISH THE SWAP.
 SWPA,    SWPTR                  /RETURN ADDRESS AFTER SWAP
 
 SWPTR, JMP SWPERR      /OOPS
     TAD RKADR
     TAD C0400       /NEXT ADDRESS
     DCA RKADR
     TAD RKSWSE      /NEXT SECTOR
     IAC
     AND C0017   
     SNA             /SECTOR = 16? DONE?
     JMP SWFIN       /YEP, FINISH THINGS UP.
     DCA RKSWSE      /NO - DO NEXT SECTOR
     JMS SWPIO       /START NEXT SECTOR TRANSFER
     DISMIS          /GO BACK TO WHAT WE WERE DOING
 SWFIN, TAD FINISH   /DID WE JUST SWAP IN OR OUT?    
     SMA
     JMP SWPOK       /IN; SO WE'RE FINISHED
     CIA
     DCA FINISH      /SAVE IT
     JMS SWPIR       /START SWAP IN
     DISMIS          /GO BACK TO WHAT WE WERE DOING      
     

The above is only slightly larger than the original code. Like the original, it’s interrupt driven: SWPIN sets up a sector transfer then returns to the Monitor — the RK8E will interrupt the processor when this transfer is done, at which point the Monitor will jump to SWPTR to process it. SWPTR then determines if there are more sectors to transfer, and if so starts the next transfer, calculating the disk and memory addresses needed to do so.

After testing this code, TSS/8 would initialize and prompt for a login, and then hang attempting to do a filesystem operation to read the password database. Time to move on to the other routine that needed to be changed: the filesystem transfer code. This ended up being considerably more complicated than the swapping routine. As mentioned earlier, the RF08 and DF32 disks are word-addressable, meaning that any arbitrary word at any address on disk can be accessed directly. And these controllers can transfer any amount of data from a single word to 4096 words in a single request. The RK05 can only transfer a sector’s worth of data (256 words) at once and transfers must start on a sector boundary (a multiple of 256 words). The TSS/8 filesystem code makes heavy use of the flexibility of the RF08/DF32, and user programs can request transfers of arbitrary lengths from arbitrary addresses as well. This means that the RK05 code I’m adding will need to do some heavy lifting in order to meet the needs of its callers.

Like the swapping code, a single request may require multiple sector transfers to complete. Further, the new code will need to have access to a private buffer 256 words in length for the transfer of a single RK05 sector — it cannot copy sector data directly to the caller’s destination like it does with the RF08/DF32 because that destination is not likely to be large enough. (Consider the case where the caller wants to read only one word!) So for a read operation, the steps necessary are:

  1. Given a word address for the data being requested from disk, calculate the RK05 sector S that word lives in. (i.e. divide the address by 256).
  2. Given the same, calculate the offset O in that sector that the data starts at (i.e. calculate the word address modulo 256).
  3. Start a read from the RK05 for sector S into the Monitor’s private sector buffer. Return to the Monitor and wait for an interrupt signalling completion.
  4. On receipt of an interrupt, calculate the length L of the data to be copied from the private sector buffer into the caller’s memory (the data’s final destination): L is 256-O, or the number of words still to be transferred, whichever is smaller (i.e. copy up to the end of the sector we read, but no more than the caller asked for.)
  5. Copy L words from offset O in the private sector buffer to the caller’s memory.
  6. Decrement the caller’s requested word count by L and see if any words remain to be transferred: If yes, increment the sector S, reset O to 0 (we start at the beginning of the next sector) and go back to step 3.
  7. If no more words to be transferred, we’re done and we can take a break. Whew.

Doing a Write is more complicated: Because the offset O may be in the middle of a sector, we have to do a read-modify-write cycle: Read the sector first into the private buffer, copy in the modified data at offset O in the buffer, and then write the whole buffer back to disk.
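
Rendered as a synchronous sketch in C (the real code is interrupt driven PDP-8 assembly, and rk05_read_sector/rk05_write_sector here are hypothetical stand-ins for the RK8E transfer), the two paths look like this:

    #include <stdint.h>
    #include <string.h>

    #define SECTOR_WORDS 256

    extern void rk05_read_sector(uint32_t sector, uint16_t buf[SECTOR_WORDS]);
    extern void rk05_write_sector(uint32_t sector, const uint16_t buf[SECTOR_WORDS]);

    static uint16_t sector_buf[SECTOR_WORDS];   /* the Monitor's private buffer */

    /* Read `count` words starting at disk word address `addr`. */
    void fs_read(uint32_t addr, uint16_t *dest, uint32_t count)
    {
        while (count > 0) {
            uint32_t s   = addr / SECTOR_WORDS;   /* step 1: sector S */
            uint32_t o   = addr % SECTOR_WORDS;   /* step 2: offset O */
            uint32_t len = SECTOR_WORDS - o;      /* step 4: length L, */
            if (len > count)
                len = count;                      /* capped at the request */
            rk05_read_sector(s, sector_buf);      /* step 3 */
            memcpy(dest, &sector_buf[o], len * sizeof *dest);   /* step 5 */
            dest  += len;                         /* step 6: the next pass */
            addr  += len;                         /* starts at offset 0 of */
            count -= len;                         /* the following sector */
        }
    }

    /* Write: the read-modify-write cycle described above. */
    void fs_write(uint32_t addr, const uint16_t *src, uint32_t count)
    {
        while (count > 0) {
            uint32_t s   = addr / SECTOR_WORDS;
            uint32_t o   = addr % SECTOR_WORDS;
            uint32_t len = SECTOR_WORDS - o;
            if (len > count)
                len = count;
            rk05_read_sector(s, sector_buf);      /* read the whole sector */
            memcpy(&sector_buf[o], src, len * sizeof *src);     /* patch it */
            rk05_write_sector(s, sector_buf);     /* and put it back */
            src   += len;
            addr  += len;
            count -= len;
        }
    }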

This code ended up not fitting in Field 0 of TS8 — I had to move it into Field 1 in order to have space for both the code and the private sector buffer. So as not to bore you I won’t paste the final code here (it’s pretty long) but if you’re curious you can see it starting around line 6994 of ts8.pal.

This code, while functional, has some obvious weaknesses and could be optimized: the read-modify-write cycle for write operations is only necessary for transfers that start at a non-sector boundary or are less than a sector in size. Repeated reads from the same sector could bypass the actual disk transfer (only the first read need actually hit the disk). Similarly, repeated writes to the same sector need only commit the sector to disk when a new sector is requested. I’m waiting to see how the system holds up under heavy use, and what disk usage patterns emerge, before undertaking these changes, premature optimization being the root of all evil and whatnot.

The first boot of TSS/8 on our PDP-8/e!

I tested all of these changes as I was writing them under SIMH, an excellent suite of emulators for a variety of systems including the PDP-8. When I was finally ready to try it on real hardware, I used David Gesswein’s dumprest tools to write the disk image out to a real RK05 pack, and toggled in the RK05 TSS/8 bootstrap I wrote to get INIT started. After a couple of weeks of working only under the emulator, it was a real relief when it started right up the first time on the real thing, let me tell you!

TSS/8 is currently running on the floor at the museum, servicing only two terminals. I’m in the process of adding six more KL8E asynchronous serial lines so that we can have eight users on the system — the hope is to make the system available online early next year so that people around the world can play with TSS/8 on real hardware.

I’ve also been working on tracking down more software to run on TSS/8. In addition to what was already available on the RF08 disk image (PALD, BASIC, FOCAL, FORTRAN, EDIT) I’ve dug up ALGOL, and ported CHEKMO II and LISP over. If anyone out there is sitting on TSS/8 code — listings, paper tape, disk pack, or DECtape, do drop me a line!

And if you’re so inclined, and have your own PDP-8 system with an RK05 you can grab the latest copy of my changes on our Github at https://github.com/livingcomputermuseum/cpus-pdp8 and give it a whirl. Comments, questions, and pull requests are always welcome!

A Journey Into the Ether: Debugging Star Microcode

Back in January I unleashed my latest emulation project Darkstar upon the world. At that time I knew it still had a few areas that needed more refinement, and a few areas that were very rough around the edges. The Star’s Ethernet controller fell into that latter category: No detailed documentation for the Ethernet controller has been unearthed, so my emulated version of it was based on a reading of the schematics and diagnostic microcode listings, along with a bit of guesswork.

Needless to say, it didn’t really work: The Ethernet controller could transmit packets just fine but it wasn’t very good at receiving them. I opted to release V1.0 of Darkstar despite this deficiency — while networking was an important part of Xerox’s computing legacy, there were still many interesting things that could be done with the emulator without it. I’d get the release out the door, take a short break, and then get back to debugging.

Turns out the break wasn’t exactly short — sometimes you get distracted by other shiny projects — but a couple of weeks back I finally got back to working on Darkstar and I started with an investigation of the Receiver end of the Ethernet interface — where were things going wrong?

The first thing I needed to do was come up with some way to see what was actually being received by the Star, at the macrocode level. While I lack sources for the Interlisp-D Ethernet microcode, I could see it running in Darkstar’s debugger, and it seemed to be picking up incoming packets, reading in the words of data from these packets and then finally shuffling them off to the main memory. From this point things got very opaque — what was the software (in this case the operating system itself) doing with that data, and why was it apparently not happy with it?

The trickiest part here was finding diagnostic software to run on the Star that could show me the raw Ethernet data being received, and after a long search through available Viewpoint, XDE, and Interlisp-D tools and finding nothing that met my needs I decided to write my own in Interlisp-D. The choice to use Interlisp-D was mainly due to the current lack of XDE compilers, but also because the Interlisp-D documentation covered exactly what I needed to accomplish, using the ETHERRECORDS library. I wrote some quick and dirty code to print out the contents of any packets coming in, and got… crickets. Nothing. NIL, as the Lisp folks say.

Hmm.

So I went back and watched the microcode read a packet in and while it was indeed pulling in data, upon closer inspection it was discarding the packet after the first few words. The microcode was checking that the packet’s Destination MAC address (which begins each Ethernet packet’s header) matched that of the Star’s MAC address and it was ascertaining that the packet in question wasn’t addressed to it. This is reasonable behavior, but the packets it was receiving from my test harness were all Broadcast packets, which use a destination address of ff:ff:ff:ff:ff:ff and which are, by definition, destined for all machines on the network — which is when I finally noticed that hey wait a minute… the words the microcode is reading in for the destination address aren’t all FF’s as they should be… and then I slapped my forehead when I saw what I had done:

Whoops.

I’d accidentally used the “PayloadData” field (which contains just the actual data in the packet) rather than the “Data” field (which contains the full packet including the Ethernet header). So the microcode was never seeing Ethernet headers at all, instead it was trying to interpret packet data as the header!

I fixed that and things were looking much, much better. I was able to configure TCP/IP on Interlisp-D and connect to a UNIX host and things were generally working, except when they weren’t. On rare occasions the Star would drop a single word (two bytes) from an incoming packet with no fanfare or errors:

The case of the missing words. Note the occasional loss of two characters in the above directory listing.

This was puzzling to say the least. After some investigation it became clear that the lost word was randomly positioned within the packet; it wasn’t lost at the beginning or end of the packet due to an off-by-one error or something not getting reset between packets. Further investigation indicated that without fail, the microcode was reading in each word from the packet via the ←EIData function (which reads the next incoming word from the Ethernet controller and puts it on the Central Processor’s X Bus). On the surface it looked like the microcode was reading each word in properly… but then why was one random word getting lost?

It was time to take a good close look at the microcode. I lack source code for the Interlisp-D Ethernet microcode but my hunch was that it would be pretty similar to that used in Pilot since no one in their right mind rewrites microcode unless they absolutely have to. I have some snippets of Pilot microcode, fortunately, and as luck would have it the important portions of it matched up with what Interlisp was using, notably the below loop:

{main input loop}
EInLoop: MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong], c1;
MDR ← EIData, DISP4[ERead, 0C], c2;
ERead: EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];
E ← uESize, GOTO[EReadEnd], c3, at[0D,10,ERead];
E ← EIData, uETemp2 ← EE, GOTO[ERCross], c3, at[0E,10,ERead];
E ← EIData, uETemp2 ← EE, L6←L6.ERCrossEnd, GOTO[ERCross], c3, at[0F,10,ERead];

The code starting with the label EInLoop (helpfully labeled “main input loop”) loads the Memory Address Register (MAR) with the address where the next word from the Ethernet packet will be stored; and the following line invokes ←EIData to read the word in and write it to memory via the Memory Data Register (MDR). The next instruction then decrements a word counter in a register named EE and loops back to EInLoop (“GOTO[EInLoop]”). (If this word counter underflows then the packet is too large for the microcode to handle and is abandoned.)

An important diversion is in order to discuss how branches work in Star microcode. By default, each microinstruction has an INIA (Initial Next Instruction Address) field that tells the processor where to find the next instruction to be executed. Microinstructions need not be ordered sequentially in memory, and in fact generally are not (this makes looking at a raw dump of microcode highly entertaining). At the end of every instruction, the processor looks at the INIA field and jumps to that address.

To enable conditional jumps, a microinstruction can specify one of several types of Branches or Dispatches. These cause the processor to modify the INIA of the next instruction by OR’ing in one or more bits based on a condition or status present during the current instruction. (This is then referred to as NIA, for Next Instruction Address). For example, the aforementioned word counter underflow is checked by the line:

ERead:    EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];

The EE register is decremented by 1 and the ZeroBr field specifies a branch if the result of that operation was zero. If that was the case, then the INIA of the next instruction (at EInLoop) is modified — ZeroBr will OR a “1” into it.

EInLoop:    MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong],    c1;

This branch is written with the BRANCH[$,EITooLong] assembler macro, which specifies the two possible destinations of the branch. The dollar sign ($) indicates that in the case of no branch, the next sequential instruction should be executed, and that that instruction needs no special address. In the case of a branch (indicating underflow) the processor will jump to EITooLong instead.

Clear as mud? Good! So how does this loop exit under normal conditions? In the microcode instruction at EInLoop there is the clause EtherDisp. This causes a microcode dispatch — a multi-way jump — based on two status bits from the Ethernet controller. The least-significant bit in this status is the Attn bit, used to indicate that the Ethernet controller has something to report: A completed packet, a hardware error, etc. The other bit is always zero if the Ethernet controller is installed. (If it’s not present, the bit is always 1).

Just like a conditional branch, a dispatch modifies the INIA of the next instruction by ORing those status bits in to form the final NIA. The instruction following EInLoop is:

MDR ← EIData, DISP4[ERead, 0C],    c2;

The important part to us right now is the DISP4 assembler macro: this sets up a dispatch table starting with the label ERead, which it places at address 0x0C (binary: 1100). Note how the lower two bits in this address are clear, to allow branches and dispatches to OR modified bits in. In the case where EtherDisp specifies no special conditions (all bits zero) the INIA of this instruction is unmodified and left as 0x0C and the loop continues. In the case of a normal packet completion, EtherDisp will indicate that the Attn bit is set, ORing in 1, resulting in an NIA of 0x0D (binary: 1101).
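For the concrete-minded, here’s a quick Python sketch of that next-address arithmetic (a toy model of my own, not code from the emulator):

# Branches and dispatches OR condition/status bits into the INIA of the
# following instruction to form its NIA.
def next_address(inia, modifier_bits):
    return inia | modifier_bits

EREAD = 0x0C  # DISP4[ERead, 0C]: low two bits clear so they can be modified

print(hex(next_address(EREAD, 0b00)))  # 0xc: no conditions, keep looping
print(hex(next_address(EREAD, 0b01)))  # 0xd: Attn set, packet complete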

This all looked pretty straightforward and I didn’t see any obvious way a single word could get lost here, so I looked at the other ways this loop could be exited — how do we get to the instruction at 0x0E (binary: 1110) from the dispatch caused by EtherDisp? At first this left me scratching my head — as mentioned earlier, the second bit masked in by EtherDisp is always zero! The clue is in what the instruction at 0x0E does: it jumps to a Page Cross handler for the routine.

This of course requires another brief (not so brief?) diversion into Central Processor minutiae. The Star’s Central Processor contains a simple mechanism for providing virtual memory via a Page Map, which maps virtual addresses to physical addresses. Each page is 256 words in size, and the CP has special safeguards in place to trap memory accesses that might cross a page boundary both to prevent illegal memory accesses and so the map can be maintained. In particular, any microinstruction that loads MAR via an ALU operation that causes a carry out of the low 8 bits (i.e. calculating an address that crosses a 256-word boundary) results in any memory access in the following instruction being aborted and a PageCross branch being taken. This allows the microcode to deal with Page Map-related activities (update access bits or cause a page fault, for example) before resuming the aborted memory access.
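If code reads more easily than prose here, the check boils down to something like this (again a sketch of my own, not actual emulator source):

# A MAR load whose ALU op carries out of the low 8 bits is computing an
# address in the next 256-word page; the following instruction's memory
# access is aborted and a PageCross branch ORs 2 into its NIA.
def crosses_page(e):
    return ((e & 0xFF) + 1) > 0xFF

print(crosses_page(0xFE))  # False: E + 1 stays inside the current page
print(crosses_page(0xFF))  # True: carry out of the low byte, PageCross taken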

Whew. So, returning to the code in question:

{main input loop}
EInLoop: MAR ← E ← [rhE, E + 1], EtherDisp, BRANCH[$,EITooLong], c1;
MDR ← EIData, DISP4[ERead, 0C], c2;
ERead: EE ← EE - 1, ZeroBr, GOTO[EInLoop], c3, at[0C,10,ERead];
E ← uESize, GOTO[EReadEnd], c3, at[0D,10,ERead];
E ← EIData, uETemp2 ← EE, GOTO[ERCross], c3, at[0E,10,ERead];
E ← EIData, uETemp2 ← EE, L6←L6.ERCrossEnd, GOTO[ERCross], c3, at[0F,10,ERead];

Imagine (if you will) that register E (the Ethernet controller microcode gets two whole CPU registers of its very own and their names are E and EE) contains 0xFF (255) and the processor is running the instruction at EInLoop.  The ALU adds 1 to it, resulting in 0x100 — this is a carry out from the low 8-bits and so a PageCross branch is forced during the next instruction.  A PageCross branch will OR a “2” into the INIA of the next instruction.

The next instruction attempts to store the next word from the Ethernet’s input FIFO into memory via the MDR←EIData operation but this store is aborted due to the Page Cross caused during the last instruction.  And at last, a 2 is ORed into INIA, causing a dispatch to 0x0E (binary: 1110).  So in answer to our (now much earlier) question:  The routine at 0x0E is invoked when a Page Cross occurs while reading in an Ethernet packet.  (How the code gets to the routine at 0x0F is left as an exercise to the reader.)

And as it turns out, it’s the instruction at 0x0E that’s triggering the bug in my emulated Ethernet controller. 

E ← EIData, uETemp2 ← EE, GOTO[ERCross],    c3, at[0E,10,ERead];

Note the E←EIData operation being invoked — it’s reading in the word from the Ethernet controller for a second time during this turn through the loop, and remember that the first time it did this, it threw the result away since the MDR← operation was canceled.  This second read is done with the intent to store the abandoned word away (in register E) until the Map operation is completed.

So what’s the issue here?  On the real hardware, those two ←EIData operations return the same data word rather than reading the next word from the input packet.  This is in fact one of the more clearly spelled-out details in the Ethernet schematic — it even explains why it’s happening! — one that I completely, entirely missed when writing the emulation:

Seems pretty clear to me…

Microinstructions in the Star’s Central Processor are grouped into clicks of three instructions each; a click’s worth of instructions execute atomically — they cannot be interrupted.  Each instruction in a click executes in a single cycle, referred to as Cycle 1, Cycle 2, and Cycle 3 (or c1, c2, and c3 for short).  You can see these cycles notated in the microcode snippet above.  Some microcode functions behave differently depending on what cycle they fall on.  ←EIData only loads in the next word from the Ethernet FIFO when executed during a c2; an ←EIData during c1 or c3 returns the last word loaded.  I had missed this detail, and as a result, my emulation caused any invocation of ←EIData to pull the next word from the FIFO.  As demonstrated above this nearly works, but causes a single word to be lost when a packet read crosses a page boundary.
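The fix amounts to making ←EIData cycle-aware. Here’s a simplified Python sketch of the corrected behavior (Darkstar itself is a .NET program; this is just an illustration of the rule):

from collections import deque

class EthernetInput:
    def __init__(self, words):
        self.fifo = deque(words)   # words arriving from the wire
        self.last = 0              # last word actually popped

    def ei_data(self, cycle):
        if cycle == 'c2':          # only a c2 pops the next word from the FIFO
            self.last = self.fifo.popleft()
        return self.last           # c1 and c3 see the last word loaded

e = EthernetInput([0xFFFF, 0x1234])
print(hex(e.ei_data('c2')))  # 0xffff: MDR<-EIData (store aborted on page cross)
print(hex(e.ei_data('c3')))  # 0xffff: E<-EIData recovers the same word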

I fixed the ←EIData issue in Darkstar and at long last, Ethernet is working properly.  I was even able to connect to one of the machines here at the museum:

The release on Github has been updated; grab a copy and let me know how it works for you!

If you’re interested in learning more about how the Star works at the microcode level, the Hardware Reference and Microcode Reference are a good starting point. Or drop me a line!

Introducing Darkstar: A Xerox Star Emulator

Star History and Development

The Xerox 8010 Information System (“Star”)

In 1981, Xerox released the Xerox 8010 Information System (codenamed “Dandelion” during development), commonly referred to as the Star. The Star took what Xerox learned from the research and experimentation done with the Alto at Xerox PARC and attempted to build a commercial product from it.  It was envisioned as the centerpiece of the office of the future, combining high-resolution graphics with the now-familiar mouse, Ethernet networking for sharing and collaborating, and Xerox’s laser printer technology for faithful “WYSIWYG” document reproduction.  The Star’s operating system (called “Star” at the outset, though later renamed “Viewpoint”) introduced the Desktop Metaphor to the world.  In combination with the Star’s unique keyboard it provided a flexible, intuitive environment for creating and collaborating on documents and mail in a networked office environment.

The Star’s Keyboard

Xerox later sold the Star hardware as the “Xerox 1108 Scientific Information Processor” – in this form it competed with Lisp workstations from Symbolics, LMI, and Texas Instruments in the burgeoning AI workstation market, and while it wasn’t quite as powerful as any of their offerings it was considerably more affordable – and sometimes much smaller.  (The Symbolics 3600 workstation, c. 1983, was the size of a refrigerator and cost over $100,000.)

The Star never sold well – it was expensive ($16,500 for a single workstation and most offices would need far more than just one) and despite being flexible and powerful, it was also quite slow. Unlike the IBM PC, which also made its debut in 1981 and would eventually sell millions, Xerox ended up selling somewhere in the neighborhood of 25,000 systems, making the task of finding a working Star a challenge these days.

Given its history and relationship to the Alto, the Star seemed appropriate for my next emulation project. (You can find the Alto emulator, ContrAlto, here). As with the Alto a substantial amount of detailed hardware documentation had been preserved and archived, making it possible to learn about the machine’s inner workings… except in a few rather important places:


From the March 1982 edition of the Dandelion Hardware Manual.  Still waiting for these sections to be written…

Fortunately, Al Kossow at Bitsavers was able to provide extra documentation that filled in most of the holes.  Cross-referencing all of this with the available schematics, it looked like there was enough information to make the project possible.

The Dandelion Hardware

The Star’s Central Processor (CP). Note the ALU (4xAM2901, top middle) and 4KW microcode store (bottom)

Much like the Alto, the Dandelion’s Central Processor (referred to as the “CP”) is microcoded, and, again like the Alto, this microcode is responsible for controlling various peripherals, including the display, Ethernet, and hard drive.  The CP is also responsible for executing bytecode macroinstructions.  These macroinstructions are what the Star’s user programs and operating systems are actually compiled to.  The CP is sometimes referred to as the “Mesa” processor because it was designed to efficiently execute Mesa bytecodes, but it was in no way limited to implementing just the Mesa instruction set: The Interlisp-D and Smalltalk systems defined their own microcode for executing their own bytecodes, custom-tailored and optimized to their environments.

Mesa was a strongly-typed “high-level language.” (Xerox hackers loved their puns…) It originated on the Alto but quickly grew too large for it; a smaller, stripped-down Mesa called “Butte” (i.e. “a small Mesa”) existed for the Alto but was still fairly unwieldy.  The Star’s primary operating system was written in Mesa, which allowed a set of very sophisticated tools to be developed in a relatively short period of time.

The Star architecture offloaded the control of lower-speed devices (the keyboard and mouse, serial ports, and the floppy drive) to an 8-bit Intel 8085-based I/O processor board, referred to as the IOP.  The IOP is responsible for booting the system: it runs basic diagnostics, loads microcode into the Central Processor and starts it running.  Once the CP is running, it takes over and works in tandem with the IOP.

Emulator Development

The Star’s I/O Processor (IOP). Intel 8085 is center-right.

Since the IOP brings the whole system up, it seemed that the IOP was the logical place to begin implementing the emulator.  I started with an emulation of the 8085 processor and hooked up the IOP ROMs and RAMs.  Since the first thing the IOP does at power up or reset is execute a vigorous set of self-tests, the IOP was, in effect, testing my work as I progressed which was extremely helpful.  This is one important lesson Xerox learned from the Alto and applied to the Star: on-board diagnostics are a good thing.  The Alto had no diagnostic facilities built in so if anything failed that prevented the system from running the only way to determine the fault was to get out the oscilloscope and the schematics and start probing.  On the Star, diagnostics and status are reported through a 4-digit LED display, the “Maintenance Panel” (or MP for short).  If the IOP finds a fault during testing, it presents a series of codes on this panel.  During a normal system boot, various codes are displayed to indicate progress.  The MP was the first I/O device I emulated on the IOP, for obvious reasons.

Development on the IOP progressed nicely for several weeks (and the codes reported in the emulated MP kept increasing, reflecting my progress in a quantitative way) and during this time I implemented a source-level debugger for the IOP’s 8085 code to help me along.  This was invaluable in working out what the IOP was trying to do and why it was failing to do so.  It allowed me to step through the original code, place breakpoints, and investigate the contents of the IOP’s registers and memory while the emulated system was running.

The IOP Debugger

Once the IOP self-tests were passing, the IOP emulation was running to the point where it attempted to actually boot the Central Processor!  This meant I had to shift gears and switch over to implementing an emulation of the CP and make it talk to the IOP. This is where the real fun began.

For the next couple of months I hunkered down and implemented a rough emulation of the CP, starting with the system’s 16-bit ALU (implemented with four 4-bit AM2901 ALU chips chained together).  The 2901 (see top portion of the following diagram) forms the nexus of the processor; in addition to providing the processor’s 16 registers and basic arithmetic and logical operations, it is the primary data path between the “X bus” and “Y bus.”  The X Bus provides inputs to the ALU from various sources: I/O devices, main memory, a handful of special-purpose register files, and the Mesa stack and bytecode buffer.  The ALU’s output connects to the Y bus, providing inputs back into these same components.

The Star Central Processor Data Paths

One of the major issues confronting me almost immediately when writing the CP emulation was fidelity: how faithful to the hardware does this emulation need to be? The question arose specifically because of two hardware details related to the ALU and its inputs:

  1. The AM2901 ALU has a set of flags that get raised based on the result of an ALU operation (for example, the “Carry” flag gets raised if the result of an operation causes a carry out from the most-significant bits). For arithmetic operations these flags make sense, but the 2901 also sets these flags as the result of logical operations. The meaning of the flags in these cases is opaque and of no real use to programmers (what does it mean for a “carry” flag to be set as a result of a logical OR?); they exist only as a side-effect of the ALU’s internal logic. But they are documented in the spec sheet (see the picture below).
  2. With a 137ns clock cycle time, the CP pushes the underlying hardware to its limits. As a result, some combinations of input sources requested by a microinstruction will not produce valid results because the data simply cannot all make it to its destination on time. Some combinations will produce garbage in all bits, but some will be correct only in the lower nibble or byte of the result, with the upper bits being undefined. (This is due to the ALU in the CP being comprised of four 4-bit ALUs chained together.)

Logic equations for the “NOT R XOR S” ALU operation’s flags. What it means is an exercise left to the reader.

I spent a good deal of time pondering and experimenting. For #1, I decided to implement my ALU emulation with the assumption that Xerox’s microcode would not make use of the condition flags for non-arithmetic operations, as I could see no reason to make use of them for logical ops and implementing the equations for all of them would be computationally expensive, making the emulation slower. This ended up being a valid assumption for all logical ops except for OR — as it turns out, some microcode assumed that the Carry flag would be set appropriately for this class of operation. When this issue was found, I added the appropriate operations to my ALU implementation.

For #2 I assumed that if Xerox’s microcode made use of any “invalid” combinations of input sources, that it wouldn’t depend on the garbage portion of the results. (That is, if code made use of microinstructions that would only produce valid results in the lower 4 or 8 bits, the microcode would also only depend on the lower 4 or 8 bits generated.) Thus the emulated ALU always produces a complete, correct result across all 16-bits regardless of input source. This assumption appears to have held — I have encountered no real-world microcode that makes assumptions about undefined results thus far.

The above compromises were made for reasons of implementation simplicity and efficiency. The downside is that it is possible to write microcode that will behave differently on the emulation than on the real hardware. However, going through the time, trouble, and expense of a 100% accurate emulation did not seem worth it when no real microcode would ever require this level of accuracy. Emulation is full of trade-offs like this. It would be great to provide an emulation that is perfect in every respect, but sometimes compromises must be made.

I implemented a debugger and disassembler for the CP similar to the one I put together when emulating the IOP.  Emulation of the various X bus-related registers and devices followed, and slowly but surely the CP started passing boot diagnostics as I fixed bugs and implemented missing hardware.  Finally it reached the point where it moved from the diagnostic stage to executing the first Mesa bytecodes of the operating system – the Star was now executing real code!  At that time it seemed appropriate to implement the Star’s display controller so I could see what the Star was trying to tell me – and a few days and much debugging of the central processor later I was greeted with this display from the install floppy (and there was much rejoicing):

The emulated Star says “Hello” for the very first time

Following this I spent two weeks of late nights hacking — implementing the hard disk controller and fixing bugs.  The Star’s hard drive controller doesn’t use an off-the-shelf controller chip as this wasn’t an option at the time the Star was being developed in the late 1970s. It’s a very clever, minimal design with most of the heavy lifting being done in microcode rather than hardware. Thus the emulation has to work at a very low level, simulating (in a sense) the rotation of the platters and providing data from the disk as it moves under the heads, one word at a time (and at just the right time.)
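To give a flavor of what “at just the right time” means, here is a toy Python sketch of rotational timing; the structure and numbers are illustrative only, not taken from the real controller:

# Each emulated tick advances the platter one word; a read returns
# whatever word is currently passing under the head.
class Platter:
    def __init__(self, track_words):
        self.track = track_words
        self.pos = 0                 # word currently under the head

    def tick(self):
        self.pos = (self.pos + 1) % len(self.track)

    def read_word(self):
        return self.track[self.pos]

p = Platter([0o1234, 0o5670, 0o7777])  # a (tiny) hypothetical track
p.tick(); p.tick()
print(oct(p.read_word()))  # 0o7777: the word under the head two ticks in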

During this period I also got to learn how Xerox’s hard disk formatting and diagnostic tools worked.  This involved some reverse engineering:  Xerox didn’t want end-users to be able to do destructive things with their hard disks so these tools were password protected.  If you needed your drive reformatted you called a Xerox service engineer and they came out to take care of it (for a minor service charge).  These days, these service engineers are in short supply for some reason.

Luckily, the passcodes are stored in plaintext on the floppy disk so they were easy to unearth.  For future reference, the password is “wizard” or “elf” (if you’re so inclined):

Having solved The Mystery of the Missing Passwords I was at last able to format a virtual hard disk and install Viewpoint, and after waiting nervously for the installation to finish I was rewarded with:

Viewpoint, at long last!

Everything looked good, until the hard disk immediately corrupted itself and the system crashed!  Still, it was very encouraging to see a real operating system running (or nearly so), and over the following weeks I hammered out the remaining issues and started on a design for a real user interface for the emulator.

I gave it a name: Darkstar.  It starts with a “D” (thus falling in line with the rest of the “D-Machines” produced by Xerox) contains “Star” in the name, and is also a nerdy reference to a cult-classic sci-fi film.  Perfect. 

Getting Darkstar

Darkstar is available for download on our Github site and is open source under the BSD 2-Clause license.  It runs on Windows and on Unix systems using the Mono runtime.  It is still very much a work in progress.  Feedback, bug reports, and contributions are always welcome.

Fun with the Star

You’ve downloaded and installed Darkstar and have perused the documentation – now what?  Darkstar doesn’t come with any Xerox software, but pre-built hard disk images are available on Bitsavers (and for the more adventurous among you, piles of floppy disk images are available if you want to install something yourself).  Grab http://bitsavers.org/bits/Xerox/8010/8010_hd_images.zip — this contains hard disk images for Viewpoint 2.0, XDE 5.0, and the Harmony release of Interlisp-D.

You’ll probably want to start with Viewpoint; it’s the crowning achievement of the Star and it invented the desktop metaphor, with icons representing documents and folders. 

To boot Viewpoint successfully you will need to set the emulated Star’s time and date appropriately – Xerox imposed a very strict licensing scheme (referred to as Product Factoring), with licenses that typically expired monthly.  Without a valid license code, Viewpoint grants users a 6-day grace period, after which all programs are deactivated.

Since this is an emulation, we can control everything about the system so we can tell the emulated Star that it’s always just a few hours after the installation took place, bypassing the grace period expiration and allowing you to play with Viewpoint for as long as you like.  Set the date to Nov. 10, 1990 and start the system running.

Now wait.  The system is running diagnostics.

Keep waiting.  Viewpoint is loading.

Go get a coffee.

Seriously, it takes a while for Viewpoint to start up.  Xerox didn’t intend for users to reboot their Stars very often, apparently.  Once everything is loaded a graphic of a keyboard will start bouncing around the screen:

The Bouncing Keyboard

Press any key or click the mouse to get started and you will be presented with the Viewpoint Logon Option Sheet:

The Logon Option Sheet

You can log in with user name “user” and password “password”.  Hit the “Next” key (mapped to “Home” on your computer’s keyboard) to move between fields, or use the mouse to click in them.  Click on the “Start” button to log in and in a few moments, there you are:

Initial Viewpoint Desktop

The world is your oyster.  Some things work as you expect – click on things to select them, double-click to open them.  Some things work a little differently – you can’t drag icons around with the mouse as you might expect: if you want to move them, use the “Move” key (F6) on your keyboard; if you want to copy them, use the “Copy” key (F4).  These two keys apply to most objects in the system: files, folders, graphical objects, you name it.  The Star made excellent use of the mouse, but it was also very keyboard-centric and employed a keyboard designed to work efficiently with the operating system and tools.  Documentation for the system is available online – check out the PDFs at http://bitsavers.org/pdf/xerox/viewpoint/VP_2.0/, as they’re worth a read to familiarize yourself with the system. 

If you want to write a new document, you can open up the “Blank Document” icon, click the “Edit” button and start writing your magnum opus:

Plagiarism?

One can change text and paragraph properties – font type, size, weight and all other sorts of groovy things – by selecting text with the mouse (use the left mouse button to select the starting point, and the right to define the end) and pressing the “Prop’s” key (F8):

Mad Props

If you’re an artist or just want to pretend that you are one, open up the “Blank Canvas” icon on the desktop:

MSPaint.exe’s Great Grandpappy

Need to do some quick calculations?  Check out the Calculator accessory:

I’m the Operator with my Pocket Calculator

Help!

There are of course many more things that can be done with Viewpoint, far too many to cover here.  Check out the extensive documentation as linked previously, and also look at the online training and help available from within Viewpoint itself (check the “Help” tab in the upper-right corner.)

Viewpoint is only one of a handful of systems that can be run on Darkstar. Stay tuned for future installments, covering XDE and Interlisp-D!

IBM at LCM+L

As anyone familiar with LCM+L knows, the museum initially grew out of Paul Allen’s personal collection of vintage computers. Many of the larger systems in the collection reflected his own experiences with computers beginning when he was still in high school. Among the systems he used then were System/360 mainframes manufactured by IBM, most of them stodgy batch processing systems with little appeal for a young man who had been exposed to interactive computing on systems from General Electric and Digital Equipment Corporation. There was, however, one member of the family which was different: IBM’s entry into the world of timeshared interactive computing, the System/360 Model 67.

The heart of the difference between the 360/671 and other members of the System/360 family is the operating system, composed of two independent parts. CP-67, the control program, provides timeshared access to all of the system’s features in the form of “virtual machines”; CMS, the Cambridge Monitor System, runs in each user’s own virtual machine and provides the interactive facilities for programming, text editing, and everything else the user might want to accomplish. The combination was known as CP/CMS.2

I came to work for Paul Allen in 2003, to improve and expand his collection and eventually to turn it into a museum. The wish list we developed was large, and of course included several models of the System/360, including and especially the 360/67. The quixotically intense search met with minimal success for years because IBM almost never sold their large computers, instead leasing them to customers so as to control the supply: IBM did not want to compete against their own products for market share. This meant that retired systems rarely made their way into the hands of collectors; they were instead sold overseas, leased to new customers, or scrapped. For a while, the best we could do was the lights panel from the console of a 360/91 from which all circuitry (rich in gold) had been removed.

The first major break came with a story on the Australian Broadcasting Company’s web site about the impending demise of systems owned by the Australian Computer Museum Society.3 My colleague Keith Perez contacted the ACMS and learned that they owned a 360/40, which they were not interested in deaccessioning. This conversation continued for a while, then tapered off until 2011, when Keith encountered an acquaintance of Tony Epton, president of the ACMS, while on a business trip to Saint-Nazaire, France. The ensuing renewed discussions resulted in another colleague, Ian King, making a side trip to Perth in February, on his way back from a trip to Adelaide to have a look at an IBM 7090 system.4 Ian visited the barn in which the ACMS was storing two 360/40 systems, and recommended that we purchase one of them. The system arrived in Seattle in September 2011.

Once the 360/40 arrived, we brought in a retired IBM Customer Engineer to assess its prospects for restoration. At this point we learned something important about IBM mainframes of the 1960s and 1970s: No two are exactly alike, and without the system-specific Automated Logic Diagrams (ALDs) which document how it was assembled, the chances of restoring one to operating condition are greatly reduced. The former CE also noted the amount of dust caked on the circuitry – the system had been stored in a barn in a desert – which would decrease the likelihood of a successful restoration. He passed on the opportunity to work on the project.

In 2012, we acquired three IBM systems (a 360/20, a 360/44, and a 360/65) from the American Computer Museum5 in Montana, none in working condition: The internal disk drive in the Model 44 had broken loose from its housing and was held in place by a piece of rope, and the internal console cables of the Model 65 had all been cut. The 360/65 was particularly painful: More than a dozen bundles of 50 to 100 identical wires each were made useless. Neither system could be repaired with our facilities.

Bob Barnett, the museum’s business manager, also located a 360/65 in Virginia which belonged to one of the principals at Sine Nomine Associates, David Boyes. David had contacts within IBM who he believed could be helpful in arranging for LCM+L to obtain licenses for the software we wanted to run, and was eager to help us put up a large System/360.6

The 360/20 is a 16-bit minicomputer only marginally related to the main System/360 line. As a stopgap, to be able to say we had a running System/360, the one we acquired from the American Computer Museum was restored to running condition by an enthusiastic pair of contractors, Glen Hermmannsfeldt and Craig Arno, with help from Keith Hayes and Josh Dersch of LCM+L; it was displayed in the Computer Room from 2015 to 2017, initially while the restoration work was done and then as an example of a small batch system. As is often done for vintage systems at LCM+L, virtual peripherals–a card reader and punch–were created for the 360/20.

By 2015, the desire for an IBM system capable of providing a timesharing experience led to the acquisition of a 4341 system7 from Paul Pierce of Portland, Oregon.8 By this time, we had established an ongoing dialogue with the team who had successfully restored an IBM 1401 at the Computer History Museum (CHM) in California. One of the members of the team introduced us to Fundamental Software Inc.9 Faced with the task of restoring 40-year-old tape and disk drives, or creating our own emulations, we decided that we would instead acquire an FSI FLEX-CUB to provide disks, tapes, and terminal services to the 4341.

Jeff Kaylin was given the task of making the 4341 CPU run. Beginning in July 2015, he spent seven months getting the power system into working condition; first power up was on 12 February 2016.

Once the system was working to this extent, we ordered a FLEX-CUB from FSI and began attaching 3278 terminals to the built-in controller for testing. Also at this time, David Boyes informed us that he had arranged licensing for the VM/SP HPO operating system for us.

The FLEX-CUB arrived at LCM+L on 1 June 2016, with a minimal VM/370 installation in place courtesy of our friends at FSI. After some phone consultations with FSI Support, we were able to IPL10 the system into VM/370. Three weeks of getting additional terminals configured followed, with discussions of the OS configuration between FSI and me, replacements of capacitors and CRTs in terminals, and so on. Progress halted on 20 June, when Jeff arrived on a Monday morning to find the system halted with the words CHECK STOP and an error code on the console.

We obtained an 8in diskette with diagnostics from FSI. Memory tests showed that the memory was working; swapping of boards with spares commenced. The power sequence was a suspect for a long time. Jeff began making schematics for the various boards in order to understand where faults might occur that matched the diagnostic callouts. For two months, Jeff wrestled with the system with no progress.

Our consultant from CHM advised Cynde Moya, our Collections Manager, of the existence of 4341 and 4361 systems housed in a warehouse in Sacramento, California. I spoke with the owner, Daniel de Long, and learned that he had a working 4361 plus spares in the form of another 4361 and two 4331s. I traveled to Sacramento a week later to have a look, seeing the 4361 IPLed and running under DOS/VSE.11 After some discussion, the 4361 equipment began arriving at LCM+L on 2 November 2016.

In December 2016, Jeff began pulling the power supplies out of the 4361, to check the capacitors. All were within tolerance, but since 2004 our policy has always been to replace all aluminum electrolytic capacitors in any device we restore.12 The new capacitors were installed and the power supplies replaced in the chassis in the remaining weeks of 2016.

In mid-January 2017, the newly refurbished 4361 replaced the 4341 in the Computer Room. FSI, who have been very helpful throughout the project, advised us on how to cable the FLEX-CUB to the new system. A different power outlet was installed to accommodate the different plug on the 4361.

When the power button was pushed, the built-in floppy drives’ motor spun, but stopped as soon as the button was released. Jeff tried attaching the operator console, with no change in behavior. A phone call to Dan de Long revealed that the system was wired for 230V rather than 208V, necessitating either a change in the room wiring or a reconfiguration of the system’s power supplies; the latter was a simple matter of changing jumpers on four transformers to provide single-phase 208V, after which the system powered up and stayed up.

Power issues continued to plague Jeff. The first supply in the system would come up, with its test point providing 1.5V as expected, and all the proper voltages supplied; the second and third supplies showed no voltages. Going through the ALDs allowed him to trace through all four supplies with no luck in determining the problem.

After a couple of weeks, I suggested that Jeff contact Dan again, who pointed out that the system requires that a printer be attached in order to complete the power sequence. We ordered capacitors for the printer, and had additional outlets installed under the raised floor. The printer was ready to go a month later, after degraded old foam insulation was replaced along with the power supply rebuild.13

With the printer installed, the system would now power up, but the printer would not stay powered on. A long correspondence, with pictures, commenced between Jeff and Dan. This went on from mid-March to mid-May, when a suggestion to swap the cables on the floppy disk drives led to the replacement of one drive. The system would now perform an Initial Microcode Load (“IML”), after which it suggested running the Problem Finder diagnostic tool. Progress! A few more days of fiddling about (bad breakers in the power supplies, etc.) led to the indicator lights on the console keyboard signalling “Power Complete”.

Jeff cabled the FLEX-CUB to the 4361, and changed some system settings on the console to allow it to run VM/370 instead of DOS/VSE. I sent the FLEX-CUB configuration which had been set up for the 4341 to Fundamental Software; they sent one back which had the proper incantations for the 4361 instead and installed it for us remotely.

After I checked over Jeff’s revised settings on the console, we tried to IPL the system, which could not find the configured IPL device. The Problem Finder tool likewise did not find it. I reviewed the FLEX-CUB configuration, and did not find anything problematic there, so stopped for the evening, asked Jeff to locate the Operating Procedures manual for the 4361, and sent pictures to FSI of the console screen showing the Unit Control Words (UCWs) defined for the devices attached to the system. The next day, I got back suggestions for updated UCWs and updated the settings on the console while Jeff moved the channel cables to their new places. Although the system still did not come up, it did report channel status on the console so we knew the system was alive.

The next day, I revised the UCWs again on advice from FSI, to change the controllers on all disks and tapes to 3880s. Several attempts to IPL the system were unsuccessful, but in the meantime we attached more 3278/3279 terminals and got the correct keyboards on them. A day later, after telling the system that the 3279-2A display was a 1052 Selectric-style printing terminal with no printer attached and another IPL, we were prompted for date and time; FSI advised issuing the command CP ENABLE ALL to make the attached terminals live in the system. FSI did a little more configuration on the FLEX-CUB, and they and I were able to log on to the MAINT account! That was the end of May, 2017.

Now my task of installing a full operating system began. Several weeks of reading manuals ensued, along with the installation of the Hercules emulator14 on a Windows desktop and on a Linux server. By the end of June, 2017, I had the public domain VM/370 running on both, a task made simpler due in equal parts to the existence of turnkey installations and an active Hercules community.15 In particular, the members of the Hercules-VM group have been very helpful over the last year, offering suggestions, advice, software, and general excitement for our project.

I reached out to David Boyes to ask that he put us in touch with his IBM contact for licensing VM/SP, the preferred version of VM/CMS for our hardware. David wrote back to me that his contact was no longer at IBM, but that he would try to find us the proper person to talk to; he also told me that the tapes he had preserved had been shipped off to CHM a while back, and that he was asking that images be made. A week later, I had the name of IBM’s Product Manager for z/VM and Related Products,16 George Madl, and sent him a message outlining LCM+L’s mission and place of the 4361 and VM/SP in the museum’s offerings. He forwarded the request to Glenda Ford in IBM’s licensing department. Glenda shepherded the request through IBM’s processes for four months and by mid-November had worked out a very favorable license with reasonable restrictions (no support, no commercial use of the system, no fees).

While waiting for an answer to the license question, I moved on with planning for VM/SP, starting with a review of the differences between VM/370 and VM/SP installation. As the weeks went by, I proposed a backup plan in which we would begin by installing VM/370, and upgrade to VM/SP when the licensing came through. This took us to the end of 2017.

In January 2018, with help from FSI, I configured eight 3350 disk drives on the 4361. As we worked together to finalize the new setup, they set up a production VM/370 system on three drives, along with an emulated card reader and punch and an emulated printer. (We even uncovered a bug in the FLEX-CUB software, so the benefit was not all in one direction!) I set up guest accounts for two users who had been asking since the 4341 restoration began, and collected their impressions.

For further planning, I returned to the Hercules emulator, looking at access to language processors and other utilities. I planned to provision our new VM/370 from the prebuilt Hercules disk images, so had to learn the ins and outs of DDR (the DASD Dump/Restore program).17 I added three more 3350 disks to the system, in order to hold the desired contents from the Hercules ready-built VM/370 system. I had to remember to re-IPL the system in order to make the new drives available; the 1970s had no concept of “plug-and-play” peripherals.

It became clear that the integration of the Hercules “6-pack” (made up of six 3350 disk images) was very tight, and the simplest way forward might be to install these images onto our FLEX-CUB disks via DDR. I consulted with the H390-VM mailing list, who concurred with that idea. However, at this point two people came forward with offers of assistance.

One of the architects of the Hercules “6-pack VM” system had available the installation tapes for VM/SP Release 5, which was our original target for the 4361. He provided us with images of the tapes and images of 3350 disks onto which the installation files had been placed, and gave us a hand from the UK in getting things set up under Hercules.

The other is Drew Derbyshire, one of the VM/370 beta testers. Drew is a contract programmer with 10 years’ experience in the VM/CMS world, including a long stint working on the CP nucleus for IBM. He is also local to Seattle, and a member of LCM+L, so was well placed to help us move forward with the installation and configuration of VM/SP for our particular purpose.

On 1 March 2018, I was able to IPL the 4361 under VM/SP, having copied the installation disk images over to the FLEX-CUB with help from FSI and our helpers. These were still 3350s, so I created sixteen new 3380-K disk images on the FLEX-CUB, a total of just under 20GB of storage space,18 as the first step in making the system available to the public by 1 April.

At this point Drew, as a contractor, and I began a fruitful working relationship, trading configuration notes, ideas for further work, and so on. Drew set up a Hercules mimic of the 4361’s exact configuration in order to experiment when the museum was not open. This was helpful when the 4361’s disks were clobbered due to errors in configurations, and Drew did the artwork for the VM/SP splash page on display terminals connecting to the system.

Over the next 10 weeks, Drew and I built CP nucleuses19 with different parameter settings, different numbers of terminals defined, 3380 disks instead of 3350, and so on. In mid-May, the 4361 had a machine check, which Jeff and I traced down over the next week to a memory issue.20 Jeff pulled memory modules from the 4341 to replace those called out by the IBM diagnostics; I began backing up all the disks to tapes, taking the system down every night and bringing it up the next morning.

The interruption was annoying because the developer/maintainer of the Stanford Pascal Compiler was installing his program on the system when the memory fault occurred. Once that was repaired, Drew and he completed testing of the installation and declared it good.

I booted the 4361 on Friday evening, 18 May 2018, for a test run over the weekend. Drew accidentally crashed it from a remote location on Saturday morning, but brought it back up during open hours at LCM+L. The system ran for a week without incident, so I posted an invitation to the H390-VM list for anyone interested to apply for a beta account. This was as much to test the account management software Drew had written as to shine a light on any blind spots we had with regard to software for the casual user.

Since 1 June 2018, Drew has installed the PL/I Optimizing Compiler, Fortran/VS, and other pieces of software to make the system more hospitable. In addition, one of the beta test users installed a version of the IND$FILE file transfer program by cutting and pasting a hexadecimal dump of the binary program into his directory, then let us know about it to install for general use. Drew has made great use of it to make updates from his Hercules testbed to the running 4361.

Future possibilities include installing RSCS and NJE, the remote-job entry subsystem for VM, to create a BITNET-style network site,21 and creating subsidiary virtual machines running other interactive operating systems such as the Michigan Terminal System or the McGill University MUSIC timesharing system, so stay tuned for further developments!

What I Learned Mapping Minecraft Worlds

Minecraft is getting a little stale for me now.  I’ve done my exploring, and exploiting.  Nothing left but… to look at the database!

Each Minecraft world has its own folder in the save directory, with other subfolders and a lot of data files.  I noticed that each map created in the world is a separate file, that the file is in GZip format, and that there is a Python library for looking at such files.  So, I would tediously explore the world, creating maps, and then look at the database.  The worlds we have on our server are limited in size to plus and minus 1024 blocks.  Each map displays 128 x 128 pixels; a pixel is one block at the closest zoom, and can cover 16 x 16 blocks at the furthest zoom.  Plus, the maps have a funny center offset of 64.  That means at highest resolution, to cover our explorable world, 17 x 17 maps are needed.  First, I would go to -1024, -1024 and create my first map (being Map_0).  Then I would move 128 east and do it 16 more times (to location 1024, -1024).  Then I’d go back to -1024 and go south 128.  That would be Map_17, and so it goes until Map_288.  How tedious (more about that later).  But then I have 289 maps, all of which I know how to unpack, and I could then generate a map!
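The grid arithmetic is easy to double-check with a few lines of Python:

# 128 x 128 blocks per map, world clamped to +/-1024: a 17 x 17 grid.
coords = [(x, z) for z in range(-1024, 1025, 128)
                 for x in range(-1024, 1025, 128)]

print(len(coords))  # 289 maps in all
print(coords[0])    # (-1024, -1024) -> Map_0
print(coords[16])   # (1024, -1024)  -> Map_16, end of the first row
print(coords[17])   # (-1024, -896)  -> Map_17, one step south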

But I don’t stop there, oh no!  I can also decode other files in the database to find height information and region information, and locations of interesting structures — although adding labels and making points visible is still a hand-done process.  *** Something Learned ***  PNG files support transparent areas.  Alas, Microsoft products seem not to support them.  However, one may drag a PNG picture into PowerPoint, use the Picture Tools Format tab, go to Color and choose Set Transparent Color.  Then click on the background color of your picture and it will disappear!  Then you may right-click on the picture and save it.  Having transparent areas will let me overlay features on my map.

*** Something Learned ***  How to make pictures with overlays?  Welcome to CSS.  The key is to make all images placed in the same absolute position.

img
{
    /* place every image at the same absolute position so they stack */
    position: absolute;
    top: 36px;
    left: 0px;
}

#base
{
    /* the base map sits at the bottom of the stack */
    z-index: 10;
}

I also assigned z-index to my photos — just to be sure.

In HTML I installed buttons, stacked my maps: Topological, Regional, Altitude; stacked my overlays: Mine Rails, Spawners, Features, and Grid.  Here is the link to a world I want to become our new Amplified world: http://wilkinsfreeman.info/mc/layers.htm

Back to tedious map making.  The reason I stay with Office 2010 products is that later versions reduce functionality for security reasons.  With Excel 2010, I am still able to make system calls and use AppActivate and SendKeys.  Let me tell you how I automated map making.

The first task is to get into the Minecraft world.  Then one must be able to go into creative mode.  My program transports at altitude, so it is wise to start out flying, looking in an interesting direction, with nothing in your hands or inventory.  Running the program requires pausing Minecraft, and I use chat for that.  So, the first thing my program needs to do is to get out of chat.  I put in a lot of time delays using Now and While loops.  Now returns the time to the second, so this will wait one second.

t = Now
While t = Now
    DoEvents
Wend
AppActivate "Minecraft 1.12.2", False
SendKeys "~"

The ~ is the way to send Enter.  AppActivate selects Minecraft as the active window, and then SendKeys sends the Enter.

Next I set up loops to go from -1024 to 1024 in steps of 128 in both directions.  I found that sending the complete command to Minecraft sometimes doesn’t work, so I send the / to start the command, wait, then send the rest of the command.  The first command teleports me to the chosen location, the next command gives me an empty map, then I right click to use the map, then I send Q to throw the map away (since there will be 289 maps, I can’t hold them all).

I have found that a one second wait after the transport may not be enough.  My current program (not quite perfect) goes “/” one second “tp -1023 160 -1023~” two seconds “/” one second “give sigma9 map~” two seconds, right-mouse click, four seconds, “Q” and wait two seconds.  That’s 53 minutes.

' Windows API declaration and constants needed for mouse_event
' (these must be at the top of the VBA module for the code to compile)
Private Declare Sub mouse_event Lib "user32" (ByVal dwFlags As Long, _
    ByVal dx As Long, ByVal dy As Long, ByVal cButtons As Long, _
    ByVal dwExtraInfo As Long)
Private Const MOUSEEVENTF_RIGHTDOWN As Long = &H8
Private Const MOUSEEVENTF_RIGHTUP As Long = &H10

' Wait for the wall clock to tick over n times; Now has one-second
' resolution, so each tick is one second
Private Sub WaitSeconds(n As Integer)
    Dim i As Integer, t As Date
    For i = 1 To n
        t = Now
        While t = Now
            DoEvents
        Wend
    Next i
End Sub

Sub MakeMaps()
    Dim x As Long, z As Long, q As String
    For z = -1024 To 1024 Step 128
        For x = -1024 To 1024 Step 128
            ' teleport to the next map location
            AppActivate "Minecraft 1.12.2", False
            q = Trim(Str(x)) & " 160 " & Trim(Str(z))
            SendKeys "/"
            WaitSeconds 1
            SendKeys "tp " & q & "~"
            WaitSeconds 2
            ' ask for an empty map
            AppActivate "Minecraft 1.12.2", False
            SendKeys "/"
            WaitSeconds 1
            SendKeys "give sigma9 map~"
            WaitSeconds 2
            ' right-click to use (fill in) the map:
            ' send a down event, and an up
            AppActivate "Minecraft 1.12.2", False
            mouse_event MOUSEEVENTF_RIGHTDOWN, 0&, 0&, 0&, 0&
            mouse_event MOUSEEVENTF_RIGHTUP, 0&, 0&, 0&, 0&
            WaitSeconds 4
            ' throw the map away; 289 maps won't fit in the inventory
            AppActivate "Minecraft 1.12.2", False
            SendKeys "q"
            WaitSeconds 2
        Next x
    Next z
End Sub

I have found that it is wise to do the AppActivate before the mouse events.  One would think one would only need one AppActivate, but no.  Of course, one must then NOT TOUCH anything for the whole hour.  Do not have e-mail open, do not plug in USB, do not lose power.  It is okay to touch the mouse slightly if you want to be sure your computer doesn’t go to sleep.

The point of this exercise is to find interesting worlds.  I prefer a world with multiple region types, several villages, a temple and witch’s hut in the explorable part.  I also like to have water near spawn so I can make a quick getaway.  The world in the map example only has 3 villages, which is disappointing, but it has a lovely ocean (not too much) and two witch huts just right there.  The desert Temple is there in the upper right corner, but it is buried under the amplified terrain.

I will be exploring more world possibilities.  Try LCM+L for the seed and you will get Antarctica.

Fixing 40-year-old Software Bugs, Part One

The museum had a big event a few weeks ago, celebrating the 45th anniversary of the 1st “Intergalactic Spacewar Olympics.”  Just a couple of weeks before said event, the museum acquired a beautiful Digital Equipment Corporation Lab-8/e minicomputer and I thought it would be an interesting challenge to get the system restored and running Spacewar in time for the event.

As is fairly obvious to you DEC-heads out there, the Lab-8/e was a PDP-8/e minicomputer in a snazzy green outfit.  It came equipped with scads of analog hardware for capturing and replaying laboratory data, and a small Tektronix scope for displaying information.  What makes this machine perfect for the PDP-8 version of Spacewar is the inclusion of the VC8E Point Plotting controller and the KE8E Extended Arithmetic Element (or EAE).  The VC8E is used by Spacewar to draw the game’s graphics on a display; the EAE is used to make the various rotations and translations done by the game’s code fast enough to be fun.

The restoration was an incredibly painless process.  I started with the power supply, which worked wonderfully after its 40-plus-year-old capacitors were replaced, and from there it was a matter of testing and debugging the CPU and analog hardware.  There were a few minor faults, but in a few days everything was looking good, so I moved on to getting Spacewar running.

But which version to choose?  There are a number of Spacewar variants for the PDP-8, but I decided upon this version, helpfully archived on David Gesswein’s lovely PDP-8 site.  It has the advantage of being fairly advanced with lots of interesting options, and the source code is adaptable for a variety of different configurations — it’ll run on everything from a PDP-12 with a VR12 to a PDP-8/e with a VC8E.

I was able to assemble the source file into a binary tape image suited for our Lab-8/e’s hardware using the Palbart assembler.  The Lab-8/e has a VC8E display and the DK8-EP programmable clock installed.  (The clock is used to keep the game running at a constant frame rate; without it the game speed would vary depending on how much stuff was onscreen and how much work the CPU has to do.)  These are selected by defining VC8E=1 and DKEP=1 in the source file.

Loading and running the program yielded an empty display, though the CPU was running *something*.  This was disappointing, but did I really think it’d be that easy?  After some futzing about I noticed that if I hit a key on the Lab-8/e’s terminal, the Tektronix screen would light up briefly for a single frame of the game, and then go dark again.  Very puzzling.

My immediate suspicion was that the DK8-EP programmable clock wasn’t interrupting the CPU. The DK8-EP’s clock can be set to interrupt after a specified interval has elapsed, and Spacewar uses this functionality to keep the game running at a steady speed — every time the clock interrupts, the screen is redrawn and the game’s state is updated.  (Technically, due to the way interrupts are handled by the Spacewar code, an interrupt from any device will cause the screen to be redrawn — which is why input from the terminal was causing the screen flash.)
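In other words, one frame per interrupt, whatever the source. A trivial Python abstraction of what the text above describes (my model, not the actual PDP-8 code):

# Spacewar's interrupt handler redraws the screen on every interrupt,
# regardless of which device raised it.
def frames_drawn(interrupts):
    return sum(1 for _source in interrupts)

print(frames_drawn(['keyboard']))    # 1: a keystroke flashes a single frame
print(frames_drawn(['clock'] * 30))  # 30: a working clock gives steady frames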

I dug out the DK8-EP diagnostics and loaded them onto the Lab-8/e.  The DK8-EP passed with flying colors, but Spacewar was still a no go.  I decided to take a closer look at the Spacewar code, specifically the code that sets up the DK8-EP.  That code looks like this (with PDP-12 conditional code elided):

/SUBROUTINE TO START UP CLOCK
/MAY BE HARDWARE DEPENDENT
/THIS IS FOR KW12A CLOCK - PDP12
/OR PROGRAMABLE PDP8E CLOCK DK8EP
   CLSK=6131      /SKIP IF CLOCK
   CLLR=6132      /LOAD CONTROL
   CLAB=6133      /AC TO BUFFER PRESET
   CLEN=6134      /LOAD ENABLE
   CLSA=6135      /BIT RESET FLAGS

STCLK, 0
   CLA CLL        /JUST IN CASE   
   TAD (-40       /ABOUT 30CPS
   CLAB           /LOAD PRSET
   CLA CLL
   TAD (5300      /INTR ON CLOCK - 1KC
   CLLR
   CLA CLL
   JMP I STCLK

The part relevant to our issue is the CLLR IOT instruction near the end of the routine above; it loads the DK8-EP’s clock control register with the contents of the 8’s Accumulator register (in this case, loaded with the value 5300 octal by the previous instruction).  The comments suggest that this sets a 1 kHz clock rate, with an interrupt every time the clock overflows.

I dug out a copy of the programming manual for the DK8-EP from the 1972 edition of the “PDP-8 Small Computer Handbook” (which you can find here if you’re so inclined).  Pages 7-28 and 7-29 reveal the following information:

DK8-EP Nitty Gritty

The instruction we’re interested in is CLDE (octal 6132), which the Spacewar code defines as CLLR: “Set Clock Enable Register Per AC.”  The value set in the AC by the Spacewar code (from the octal value 5300) decodes as:

  • Bit 0 set: Enables clock overflow to cause an interrupt.
  • Bits 1 & 2 set to 01: Counter runs at selected rate.
  • Bits 3, 4 & 5 set to 011: 1 kHz clock rate.

(Keep in mind that the PDP-8, like many minicomputers of the era, numbers its bits in the opposite order of today’s convention: the MSB is bit 0, and the LSB is bit 11.)  So the comments in the code appear to be correct: the clock is set up to interrupt, and it should be enabled and running at a 1 kHz rate.  Why wasn’t it interrupting?  I wrote a simple test program (sketched below) to verify the behavior outside of Spacewar, just in case Spacewar itself was doing something unexpected that was affecting the clock.  It behaved identically.  At this point I was beyond confused.
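
For what it’s worth, the test looked roughly like this (a reconstruction-from-memory sketch: the IOT definitions and setup values match the Spacewar source, but the handler is mine, and I’m assuming CLSA clears the status flags selected by the AC):

/MINIMAL DK8-EP INTERRUPT TEST (SKETCH)
/SET UP THE CLOCK EXACTLY AS SPACEWAR DOES, THEN SPIN
/WITH INTERRUPTS ON, COUNTING TICKS IN CNT
   CLAB=6133      /AC TO BUFFER PRESET
   CLLR=6132      /LOAD ENABLE REGISTER
   CLSA=6135      /BIT RESET FLAGS

   *0
   0              /PC IS SAVED HERE ON AN INTERRUPT
   JMP I .+1      /VECTOR TO THE HANDLER...
   TICK           /...THROUGH THIS POINTER

   *200
START, CLA CLL
   TAD (-40       /PRESET COUNT, AS IN SPACEWAR
   CLAB
   CLA CLL
   TAD (5300      /THE SAME ENABLE BITS SPACEWAR USES
   CLLR
   CLA CLL
   ION            /INTERRUPT SYSTEM ON
WAIT,  JMP WAIT   /CNT SHOULD TICK UP... IT DIDN'T

TICK,  CLA CLL    /NO AC TO SAVE; THE WAIT LOOP USES NONE
   TAD (4000      /SELECT THE OVERFLOW FLAG (BIT 0)
   CLSA           /CLEAR IT (ASSUMED SEMANTICS)
   CLA CLL
   ISZ CNT        /COUNT THIS INTERRUPT
   NOP            /EAT THE SKIP IF CNT WRAPS TO ZERO
   ION
   JMP I 0        /BACK TO THE WAIT LOOP
CNT,   0
   $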

But wait: The diagnostic was passing — what was it doing to make interrupts happen?

DK8E-EP Diagnostic Listing

The above is a snippet of code from the DK8E family diagnostic listing, used to test whether a clock overflow causes an interrupt as expected.  The JMS I XIOTF instruction at location 2431 jumps to a subroutine that executes a CLOE IOT, setting the Clock Enable Register from the AC value built up by the preceding instructions.  (Wait, CLOE?  I thought the mnemonic was supposed to be CLDE?)  The three TAD instructions at locations 2426-2430 assemble the Clock Enable Register bits; their sum is 4610 octal, which means (again referring to the 1972 Small Computer Handbook; a sketch of the sequence follows the list):

  • Bit 0 set: Enables clock overflow to cause an interrupt.
  • Bits 1 & 2 unset: Counter runs at selected rate and overflows every 4096 counts.
  • Bits 3, 4 & 5 set to 110: 1 MHz clock rate.
  • Bit 8 set: Events in Channels 1, 2, or 3 cause an interrupt request and overflow.
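
Assembled, the diagnostic’s setup plausibly reads something like this (my reconstruction from the decoded value, not the verbatim listing; XIOTF is the pointer named in the listing, and I’m assuming the AC is clear going in):

   TAD (4000      /BIT 0: ENABLE CLOCK OVERFLOW
   TAD (600       /BITS 3-5 = 110: 1 MHZ RATE
   TAD (10        /BIT 8: ALLOW INTERRUPT REQUESTS
   JMS I XIOTF    /ISSUE CLOE WITH AC = 4610 OCTAL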

So this is pretty similar to what the Spacewar code does (at a different clock rate), with one major difference: bit 8 is set.  Based on the description in the Small Computer Handbook, having bit 8 set doesn’t make a lot of sense: this test isn’t exercising channels 1, 2, or 3, and the code doesn’t configure those channels either.  Also, the CLOE vs. CLDE mnemonic difference is odd.

All the same, the bit is set and the diagnostic does pass.  What happens if I set that Clock Enable Register bit in the Spacewar code?  Changing the TAD (5300 instruction to TAD (5310 is simple enough (why, I don’t even need to reassemble; I can just toggle the new bits in via the front panel!) and lo and behold… it works.
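
In source form the fix is a one-word change; bit 8 is the low bit of the third octal digit, so 5300 becomes 5310:

   CLA CLL
   TAD (5310      /WAS 5300; WITH BIT 8 SET, AN ENABLED
   CLLR           /OVERFLOW ACTUALLY REQUESTS AN INTERRUPT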

But why doesn’t the code make any sense?  I thought perhaps there had been a different revision of the hardware, or a different set of documentation, so I took a look around and finally found the following at the end of the DK8-EP engineering drawings:

The Real Instructions

Oh hey, look at that, why don’t you.  Bit 8’s description is a bit more elaborate here: “Enabled events in channels 1, 2, or 3 or an enabled overflow (bit 0) cause an interrupt request when bit 0 is set to a one.”  And per this manual, setting bit 0 by itself doesn’t enable interrupts at all!  To add insult to injury, on the very next page we have this:

More Real Info

That’s definitely CLOE, not CLDE.  The engineering drawings date from January 1972 (first revision in 1971), while the 1972 edition of the PDP-8 Small Computer Handbook has a copyright of 1971, so the two are from approximately the same time period.  I suspect the programming information in the Small Computer Handbook was simply transcribed poorly from the engineering documentation… and then Spacewar was written using the handbook as a reference.  Given that this version of Spacewar supports a multitude of different hardware (including four different kinds of programmable clocks), there’s a good chance it was never actually tested with a DK8-EP.  Or perhaps there really was a hardware change that removed the requirement for bit 8, though I can find no evidence of one.

So with that bug fixed, all’s well, and our hero can ride off into the sunset in the general direction of the 2017 Intergalactic Spacewar Olympics, playing Spacewar all the way.  Right?  Not so fast: we’re not out of the woods yet.  Stay tuned for PART TWO!