hardware, restoration

Surface-Mount Prototyping

I thought I’d glue a surface-mount 7414 onto a resistor pack, just to make it easier to prototype. Then I thought, hmm… a resistor pack would be useful to pull down signals when the driving logic hasn’t be initialized.
So, here is the picture:

I soldered wires onto the resistor pack first, then submerged them in water so they wouldn’t come undone when I soldered the other end.
Works like a charm (0.1″ prototyping spacing).

hardware, restoration, software

Fixing 40-year-old Software Bugs, Part One

The museum had a big event a few weeks ago, celebrating the 45th anniversary of the 1st “Intergalactic Spacewar Olympics.”  Just a couple of weeks before said event, the museum acquired a beautiful Digital Equipment Corporation Lab-8/e minicomputer and I thought it would be an interesting challenge to get the system restored and running Spacewar in time for the event.

As is fairly obvious to you DEC-heads out there, the Lab-8/e was a PDP-8/e minicomputer in a snazzy green outfit.  It came equipped with scads of analog hardware for capturing and replaying laboratory data, and a small Tektronix scope for displaying information.  What makes this machine perfect for the PDP-8 version of Spacewar is the inclusion of the VC8E Point Plotting controller and the KE8E Extended Arithmetic Element (or EAE).  The VC8E is used by Spacewar to draw the game’s graphics on a display; the EAE is used to make the various rotations and translations done by the game’s code fast enough to be fun.

The restoration was an incredibly painless process.  I started with the power supply which worked wonderfully after replacing 40+ year old capacitors, and from there it was a matter of testing and debugging the CPU and analog hardware.  There were a few minor faults but in a few days everything was looking good, so I moved on to getting Spacewar running.

But which version to choose?  There are a number of Spacewar variants for the PDP-8, but I decided upon this version, helpfully archived on David Gesswein’s lovely PDP-8 site.  It has the advantage of being fairly advanced with lots of interesting options, and the source code is adaptable for a variety of different configurations — it’ll run on everything from a PDP-12 with a VR12 to a PDP-8/e with a VC8E.

I was able to assemble the source file into a binary tape image suited for our Lab-8/e’s hardware using the Palbart assembler.  The Lab-8/e has a VC8E display and the DK8-EP programmable clock installed.  (The clock is used to keep the game running at a constant frame-rate, without it the game speed would vary depending on how much stuff was onscreen and how much work the CPU has to do.)  These are selected by defining VC8E=1 and DKEP=1 in the source file

Loading and running the program yielded an empty display, though the CPU was running *something*.  This was disappointing, but did I really think it’d be that easy?  After some futzing about I noticed that if I hit a key on the Lab-8/e’s terminal, the Tektronix screen would light up briefly for a single frame of the game, and then go dark again.  Very puzzling.

My immediate suspicion was that the DK8-EP programmable clock wasn’t interrupting the CPU. The DK8-EP’s clock can be set to interrupt after a specified interval has elapsed, and Spacewar uses this functionality to keep the game running at a steady speed — every time the clock interrupts, the screen is redrawn and the game’s state is updated.  (Technically, due to the way interrupts are handled by the Spacewar code, an interrupt from any device will cause the screen to be redrawn — which is why input from the terminal was causing the screen flash.)

I dug out the DK8-EP diagnostics and loaded them onto the Lab-8/e.  The DK8-EP passed with flying colors, but Spacewar was still a no go.  I decided to take a closer look at the Spacewar code, specifically the code that sets up the DK8-EP.  That code looks like this (with PDP-12 conditional code elided):

/SUBROUTINE TO START UP CLOCK
/MAY BE HARDWARE DEPENDENT
/THIS IS FOR KW12A CLOCK - PDP12
/OR PROGRAMABLE PDP8E CLOCK DK8EP
   CLSK=6131      /SKIP IF CLOCK
   CLLR=6132      /LOAD CONTROL
   CLAB=6133      /AC TO BUFFER PRESET
   CLEN=6134      /LOAD ENABLE
   CLSA=6135      /BIT RESET FLAGS

STCLK, 0
   CLA CLL        /JUST IN CASE   
   TAD (-40       /ABOUT 30CPS
   CLAB           /LOAD PRSET
   CLA CLL
   TAD (5300      /INTR ON CLOCK - 1KC
   CLLR
   CLA CLL
   JMP I STCLK

The bit relevant to our issue is in bold above; the CLLR IOT instruction is used to load the DK8-EP’s clock control register with the contents of the 8’s Accumulator register (in this case, loaded with the value 5300 octal by the previous instruction).  The comments suggest that this sets a 1 Khz clock rate, with an interrupt every time the clock overflows.

I dug out the a copy of the programming manual for the DK8-EP from the 1972 edition of the “PDP-8 Small Computer Handbook” (which you can find here if you’re so  inclined).  Pages 7-28 and 7-29 reveal the following information:

DK8-EP Nitty Gritty

 

The instruction we’re interested in is the CLDE (octal 6132) instruction: (the Spacewar code defines this as CLLR) “Set Clock Enable Register Per AC.”  The value set in the AC by the Spacewar code (from the octal value 5300) decodes as:

  • Bit 0 set: Enables clock overflow to cause an interrupt.
  • Bits 1&2 set to 01: Counter runs at selected rate.
  • Bits 3,4&5 set to 001: 1Khz clock rate.

(Keep in mind that the PDP-8, like many minicomputers from the era, numbers its bits in the opposite order of today’s convention, so the MSB is bit 0, and the LSB is bit 11.)  So the comments in the code appear to be correct: the code sets up the clock to interrupt, and it should be enabled and running at a 1Khz rate.  Why wasn’t it interrupting?  I wrote a simple test program to verify the behavior outside of Spacewar, just in case it was doing something unexpected that was affecting the clock.  It behaved identically.  At this point I was beyond confused.

But wait: The diagnostic was passing — what was it doing to make interrupts happen?

DK8E-EP Diagnostic Listing

The above is a snippet of code from the DK8E family diagnostic listing, used to test whether a clock overflow causes an interrupt as expected.  The JMS I XIOTF instruction at location 2431 jumps to a subroutine that executes a CLOE IOT to set the Clock Enable Register with the contents in AC calculated in the preceding instruction.  (Wait, CLOE?  I thought the mnemonic was supposed to be CLDE?)  The three TAD instructions at locations 2426-2430 define the Clock Enable Register bits.  The total sum is 4610 octal, which means (again referring to the 1972 Small Computer Handbook):

  • Bit 0 set: Enables clock overflow to cause an interrupt
  • Bits 1+2 unset: Counter runs at selected rate, and overflows every 4096 counts.
  • Bit 3, 4+5 set to 110: 1Mhz clock rate
  • Bit 8 set:  Events in Channels 1, 2, or 3 cause an interrupt request and overflow.

So this seems pretty similar to what the Spacewar code does (at a different clock rate) with one major difference:  Bit 8 is set.  Based on the description in the Small Computer Handbook having bit 8 set doesn’t make a lot of sense — this test isn’t testing channels 1, 2, or 3 and this code doesn’t configure these channels either.  Also, the CLOE vs CLDE mnemonic difference is odd.

All the same, the bit is set and the diagnostic does pass.  What happens if I set that Clock Enable Register bit in the Spacewar code?  Changing the TAD (5300 instruction to TAD (5310 is a simple enough matter (why, I don’t even need to reassemble it, I can just toggle the new bits in via the front panel!) and lo and behold… it works.

But why doesn’t the code make any sense?  I thought perhaps there might have been a different revision of the hardware or a different set of documentation so I took a look around and finally found the following at the end of the DK8-EP engineering drawings:

The Real Instructions

Oh hey look at that why don’t you.  Bit 8’s description is a bit more elaborate here: “Enabled events in channels 1, 2, or 3 or an enabled overflow (bit 0) cause an interrupt request when bit 0 is set to a one.” And per this manual, setting bit 0 doesn’t enable interrupts at all! To add insult to injury, on the very next page we have this:

More Real Info

That’s definitely CLOE, not CLDE.  The engineering drawings date from January 1972 (first revision in 1971), while the 1972 edition of the PDP-8 Small Computer Handbook has a copyright of 1971, so they’re from approximately the same time period.  I suspect that the programming information given in the Small Computer Handbook was simply poorly transcribed from the engineering documentation…  and then Spacewar was written using it as a reference.  There is a good chance given that this version of Spacewar supports a multitude of different hardware (including four different kinds of programmable clocks) that it was never actually tested with a DK8-EP.  Or perhaps there actually was a hardware change removing the requirement for bit 8 being set, though I can find no evidence of one.

So with that bug fixed, all’s well and our hero can ride off into the sunset in the general direction of the 2017 Intergalactic Spacewar Olympics, playing Spacewar all the way.  Right?  Not so fast, we’re not out of the woods yet.  Stay tuned for PART TWO!

restoration

Minecraft World Boundary

The worlds created on the LivingComputers minecraft server have boundaries. One may go from -1024 to 1023 in both directions.
If one gets outside the boundary, one may not move. And if one is far enough outside the boundary, one suffocates in a wall.
There are 2 methods to get outside the boundary: transport there (if you have that permission), and dismount from transportation. Getting off a boat or a minecart may put you outside the boundary. If the boat or minecart are still in position, you may remount it, and reenter the world.
Villagers may not pass through the boundary. However, it is possible to trade across it.
Water passes through the boundary.
Arrows pass through the boundary.
Trees grow through the boundary.
I believe when blocks are destroyed, their remains may end up across the boundary. I believe if you get close enough you can capture it.
I have yet to observe monsters across a boundary, and I suspect I can be shot, but I don’t know if a creeper can blow me up. (I’m not sure I want to find out).

restoration

Minecraft Python coordinates

Using setBlock() to place blocks requires some attention.
Blocks are placed at Integer locations. You may pass a floating point value to the function, but it is converted to integer.
It is the converting to integer which may aggravate you:
The function int(x + dx) does not necessarily return the same value as int(x) + int(dx). If you are placing blocks around the origin (0,y,0), any -1 > dx < 1 will be truncated to 0. However, away from the origin, say x = 14, x – 0.5 will go to 13, where x + 0.5 will go to 14.
I assume your x and z will be integers already, so make dx and dz integers before you add them to x and z.
setBlock(x+int(dx), y, z+int(dz), 0)

hardware, restoration

The Hunt!

The CDC 6500 has been down since last Friday, so that will be a week in 3 hours. What have I been doing during that time? Let me tell you:

The first thing I noticed was that my PP memory test, called March, wasn’t working. The first real thing it does, after getting loaded into PP0, is copy itself to the next PP in line. In order to do that, it increments 3 instructions to point to the next channel from the one that got loaded in from the deadstart system. After it has self-modified its program properly, it runs those instructions to do the actual copy. The very first OAN instruction it tried to execute hung, this is not supposed to happen.

I spent 3 days looking at this problem before I started drawing timing diagrams of the channel address being selected by the various PPs. The PPs each have their own memory, but they all share the same execution hardware in chassis 1. This makes it a little hard to look at, as a PP is running 1uS cycles, the hardware is running 100nS cycles, and each PP gets a 100nS Slot to do his thing. As I was looking at PP0s slot time, and what channel he was trying to push some data to, it looked like it was getting done at the wrong time. When I plotted out 1uS of all the channel address bits, I finally noticed that PP0 was addressing channel 0, PP1 was addressing channel 1… and PP11 was addressing channel 11, and back to PP0.

The strange thing about that was that that is the way the system starts at deadstart time. Every PP sucks on the channel with his number. The deadstart panel lives off of the end of channel 0, PP0 sucks up everything the deadstart panel put on channel 0, stores it into memory, and when the panel disconnects, because he has run out of program to send, the PP starts executing the program.

Wait a minute here: the program was supposed to have incremented the 3 channel instructions, so they would be pointing to channel 1, why is PP0 still looking at channel 0? Rats: the channel hardware is doing fine, but the increment isn’t working! 3 days to prove something wasn’t the problem!

OK, so the increment isn’t working, what is it doing? I spent a while writing little bits of code to test various ways of incrementing a location of memory, and then Daiyu Hurst reminded me about a program she had generated for me that was a stand-alone version of the PP verification program that runs on the beginning of most deadstart tapes. OK, what does that do?

It hangs at location 6. It did that because it failed a ZJN (jump on zero) instruction. Why is that? The accumulator wasn’t zero. Hmm, instruction 1 was LDN 0, which loads the accumulator with 0! Why doesn’t that work? After another day, or so, I prove to myself that it actually does work, and 0 gets loaded into the accumulator at the end of instruction 1. Another thing that isn’t the problem!

What’s next? The next instruction is UJN 2, (unconditional jump 2 locations forward) which being at location 2, should jump to 4, which it does. It is not supposed to change the contents of the accumulator, but it does!

There are 2 inputs to the “A” adder, the A input is selected to be A, and the B input is zeros. All 12 of the inputs to the A side are zero. Wait: aren’t there 18 bit in the accumulator, what about those other 6 bits? Ah: bit 14 is a 1!

It will not sit still! I chase bit 14 for a while, and it starts working, but a different bit is failing now! I chased different bits around the loop for a while, put module K01 on the extender to look, and the test started passing! This worked for a while. I had the PPs test memory, and that worked, but if I had CP0 test memory, it didn’t like it. When I got back from lunch, it had gone back to failing my LDN 0 test. I put some secret sauce on the pins of module K01, and we are back to trying to run other diagnostics.

I remembered I was having trouble with the imaginary tape drives, to I tried booting from real tape, and I get to the part where it tests memory, and that fails. OK, we have some progress.

That was then, this is now, and we are back to failing to LDN 0. I found that bit 0 for the “B” input of the A adder was not correct. It seems that a via rivet was not conducting between the collector of Q30 and Q32 to the base of Q19 on my friend the QA module in K01. I resoldered all the via rivets, and the edge pins, just for good measure.

Central Memory still doesn’t work, but I can run some diagnostics again!

To paraphrase Sherlock Holmes: When you eliminate all the things the problem isn’t, you are left with what the problem is!

Bruce Sherry

hardware, restoration

Bendix G15 Germanium Diodes

In restoring the Bendix G-15 vacuum tube computer, I have uncovered a phenomena which is requiring us to replace 576 germanium diodes. These diodes appear to have lost their hermetic seal and the atmospheric contamination has caused their leakage current to rise to very high levels as they reach a normal operating ambient temperature of approx. 40 degrees C. Because these diodes are used in the clamp circuits that generate the 20 volt logic swing of the computer, the combined low impedance of the approx. 500 diodes ends up shorting out the -20 volt power supply after 5 to 10 minutes of power-on time.
We have replacement diodes on order, and this should resolve the power supply issue.

Interestingly though, the failed diodes exhibit another interesting phenomena which this engineer hasn’t seen before.  Hooking up a diode to an ohmmeter to measure its leakage current, and heating the diode to about 40 degrees C, causes the diode leakage, measured as resistance, to go from a few thousand ohms to a few tens of ohms.  If the ohmmeter remains connected and the diode is allowed to cool to normal ambient, the low resistance measurement persists.  If the ohmmeter is disconnected briefly and then reconnected, the diode leakage current returns to its nominal few thousand ohms.

restoration

DEC PDP10 Model 1095 Repair

A few months ago, our PDP10 Model 1095 ( pictured ) had just successfully booted the WAITS operating system and was running an early version of Ethernet.  One afternoon, the PDP11-40 front-end computer ( unit with chassis extended on left ) stopped working and I was tasked to find out what had happened and repair it.  What followed was almost three months of difficult troubleshooting and repair.

What had happened was, one of the peripheral devices ( a TC-11 DECTAPE Controller at the left end of the machine )  attached to the PDP11’s Unibus had had a power supply failure, causing the regulated 15 volt supply to rise to 28 volts.  These supplies have an over-voltage crowbar circuit which is designed to shutdown the supply by blowing a fuse if the power supply ever goes into an over-voltage condition.  This crowbar circuit failed and this resulted in a number of circuit boards in the PDP11 frying.

Once I replaced and/or repaired the failed circuit boards, I upgraded the TC-11 power supply to a modern switcher which doesn’t have the failure mode described above.

With the hardware sorted out ( this is a couple of weeks into troubleshooting ),  I set about trying to boot the WAITS operating system once again.  A further snag cropped up at this point.  WAITS wouldn’t fully start and would complain about a “pointer mismatch”.  This points to the DTE-20 10-11 interface, but no combination of replacement boards would succeed in bringing up WAITS ( except for a number of random times ).  The solution to this problem turned out to be an old bugaboo of the KL-10 processor.  A number of the control devices do not fully initialize at power-on as their reset lines do not go to all of the parts in a particular device.  We have seen this phenomena on the RH-20, but apparently the DTE-20 also has some hardware that doesn’t get initialized.  I determined this by running the -10 side diagnostic for the DTE-20, and then booting WAITS successfully.  This was after weeks of eliminating all other possibilities as myself and others were not aware that the DTE-20 had components that came up in an unknown state at power-on.

One side note: In upgrading the TC-11 power supply, it was found that the power controller that feed line voltage to it, had failed some time ago and been hacked to make it work without it’s contactor.  A new contactor was ordered and installed.

restoration

IBM 360/30 lost its memory?

Well, not quite.

It appears the memory can be read, but it also looks like it is not being restored/written.

See how the Yellow trace and the Blue trace ~almost~ are high together?  Well, IF they were both high together then the memory is supposed to write.

restoration

It Verks, it Verks!

It has been about 3 months, but we seem to have 128KW of memory on the CDC 6500 now.

From the picture you can see “CM = 303700”, which is just about 100KW free!

We have built 35 new Storage Modules, all of which work, and 32 of them are installed in the machine in place of Core Modules which were not happy. There are 10 more new Storage Modules in process, and the mostly assembled boards should be in next week. They will need their connector pins and pulse transformers installed. I will have to build more chassis sides and fronts for them, but that may take a while, as I still have 3 good modules, itching to go to work, sitting on my bench.

Something I find interesting in this photo, is the memory access patterns. It is hard to see in the picture, but bank 30 is at the top of chassis 11, and bank 34 is at the top of chassis 12. The left two LEDs in the new storage modules are the Read and Write indicators. The ones on chassis 12 are bright, and the ones in chassis 11 are off. The top rows are separated by 4 locations! The machine isn’t real busy, it just has two instances of a prime number program running, but still… Can’t see that with Core Modules.

Anyway, let’s see, both CPUs working: check! All of memory working: check! Real card reader working: check! Real tape drives working: oops, not at the moment. I guess I’m not done yet.

Bruce