Asianometry: "The Remarkable Computers Built Not to Fail"
Host: Jon Y
Date: January 11, 2026
Episode Overview
In this episode, Jon Y chronicles the rise and fall of Tandem Computers, the company that defined "nonstop" computing. The story explores how Tandem’s focus on fault tolerance revolutionized critical computing for banks, stock markets, and other industries demanding continuous uptime. Jon details not just the technical triumphs, but the business strategies, cultural quirks, and industry shifts that shaped Tandem’s legacy—from its origins in the 1970s through its absorption into HP.
Key Discussion Points & Insights
1. The Origins and Need for Nonstop Computing
- Jim (Jimmy) Treybig’s Vision
- At HP in the early 1970s, Treybig recognized that standard back-office computers struggled with real-time transaction processing, leading to issues like ATM fraud and uncharged hotel meals.
- Treybig’s pitch:
“A computer built from the ground up, not to fail.” (07:56)
- Market Forces
- IBM’s dominance made competition near-impossible unless you found a niche—like high-availability banking systems.
- “It is hard to convey just how difficult that was back then. With top to bottom vertical integration [...] IBM felt unassailable.” (02:30)
- First Major Customers
- Early pain points: ATMs not updating central ledgers in real time; banks desperate for reliability; batch-job processes causing both money loss and customer frustration.
2. Designing Against Failure: The Tandem Architecture
- Hardware Redundancy Reimagined
- Not just buying two mainframes as failover. The Tandem 16 (later “NonStop One”) featured:
- 2 to 16 fully independent processing modules (CPU, memory, I/O, operating system, and power supply per module)
- Hot-swappable, linearly expandable modularity
- Dual “Dynabus” interprocessor buses for communication redundancy
- Maintenance-caused outages nearly eliminated
- Software as Secret Sauce
- Fault tolerance implemented above hardware, not just duplicating hardware.
- “Tandem used software to orchestrate fault tolerance built on top of a custom operating system called Guardian.” (13:59)
- Key concepts:
- Message Passing: 16-bit messages between modules, avoiding shared memory faults.
- Fail Fast: Modules shut down immediately when a problem is detected.
- Process Pairs: Every application process runs primary and backup copies, with state checkpointing.
- Cool Features
- “If the computer fails, then they would take their paper tape and they'd walk over to the other computer and put it in. That made it a fault tolerant system. Buy two big IBM mainframes instead of one. If one fails, use the other. Fault tolerance as designed by IBM salesmen. And it works. Except for the part where you pay for two systems to get the performance of one.” (09:01)
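The fail-fast and process-pair ideas listed under "Software as Secret Sauce" can be sketched in a few lines. This is only an illustrative toy, not Guardian's actual mechanism: the names `Process`, `ProcessPair`, `FailFast`, and `receive_checkpoint` are hypothetical, and the "message" is simply a copied dictionary standing in for a checkpoint sent between modules.

```python
from dataclasses import dataclass, field


class FailFast(Exception):
    """Raised when a module detects a fault and stops instead of carrying on."""


@dataclass
class Process:
    """Hypothetical stand-in for one half of a process pair."""
    name: str
    state: dict = field(default_factory=dict)
    alive: bool = True

    def receive_checkpoint(self, snapshot: dict) -> None:
        # State arrives as a copied message; nothing is shared in memory.
        self.state = dict(snapshot)

    def apply(self, account: str, amount: int) -> None:
        if not self.alive:
            # Fail fast: refuse to do any work once a fault is detected.
            raise FailFast(f"{self.name} is down")
        self.state[account] = self.state.get(account, 0) + amount


class ProcessPair:
    """Primary does the work; backup holds the last checkpointed state."""

    def __init__(self) -> None:
        self.primary = Process("primary")
        self.backup = Process("backup")

    def handle(self, account: str, amount: int) -> int:
        try:
            self.primary.apply(account, amount)
        except FailFast:
            # Takeover: the backup resumes from the last checkpoint and
            # a fresh (empty) backup is started in its place.
            self.primary, self.backup = self.backup, Process("replacement")
            self.primary.apply(account, amount)
        # Checkpoint the new state to the backup before replying.
        self.backup.receive_checkpoint(self.primary.state)
        return self.primary.state[account]


pair = ProcessPair()
pair.handle("acct-1", 100)
pair.primary.alive = False           # simulate a fault in the primary
print(pair.handle("acct-1", 50))     # the backup takes over with the balance intact
```

The point of the sketch is that the caller never sees the failure: the request that hits the dead primary is retried on the backup, which already holds the state from the last checkpoint.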
3. Market Success: Banking, Finance, and Beyond
- First Sales and ATM Explosion
- First customer: Citibank (bought early just to “be up to date”). (27:10)
- Real-world adoption fueled by the ATM boom—by 1990, ATMs grew from 10,000 to 80,000 in the US, processing 450 million monthly transactions.
- “The Tandem Nonstop systems helped too. Their reliability and 24/7 availability helped people gradually trust that the machine would not eat the cards mid transaction.” (30:27)
- Dominating Electronic Fund Transfer (EFT) Networks
- Cirrus, Visa, MasterCard, and even the US Treasury standardized on Tandem for their switching systems.
- “People chose Tandem because they needed computers that did not die.” (40:35)
4. Corporate Culture and Growth
- “The HP Way” at Tandem
- Stock options for all employees—dozens became millionaires.
- Radical for the time: “Flexible work hours, open door policies and sabbaticals. Every four years...beer and popcorn get together on Friday afternoon...” (43:16)
- Turnover <8%, ¼ that of rivals.
- But, as Jon notes: “There were murmurs at the time that maybe things were a little too insular, cultish even, and that there were maybe too few meetings. But it worked for a while.” (44:07)
5. Competition and Industry Shifts
- Stratus vs. Tandem: The Hardware/Software Divide
- Stratus offered cheaper, “hardware only” fault tolerance (lockstep processors), bypassing the complexity of Tandem’s process-pair software approach.
- “Tandem software architect Jim Gray argued that Lockstep handles hardware failures fine, but does nothing for software failures. Both processors run the bug the same way.” (53:52)
- Good Enough Hardware, Software Inflation
- As off-the-shelf components improved and salaries for programmers rose, software became costlier than hardware.
- Tandem’s proprietary Guardian ecosystem became a liability; customers preferred portable applications on Unix.
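Jim Gray's objection to lockstep, quoted earlier in this section, can be illustrated with a toy comparator. This is a minimal sketch under stated assumptions, not a model of Stratus hardware: `lockstep`, `buggy_fee`, and the `flip_bit` switch are hypothetical names, and the "hardware fault" is simulated by flipping one bit of the second result.

```python
def lockstep(fn, x, flip_bit=False):
    """Run fn twice, as two lockstepped processors would, and compare results.

    flip_bit simulates a hardware fault corrupting the second processor's output.
    """
    a = fn(x)
    b = fn(x)
    if flip_bit:
        b ^= 1  # simulated single-bit hardware error
    if a != b:
        raise RuntimeError("lockstep mismatch: hardware fault detected")
    return a


def buggy_fee(cents):
    # Deliberate software bug for illustration: a 1% fee should be
    # cents // 100, but the code divides by 10 instead.
    return cents // 10


# Hardware fault: the two results disagree, so the comparator catches it.
try:
    lockstep(buggy_fee, 1000, flip_bit=True)
except RuntimeError as err:
    print(err)

# Software bug: both processors compute the same wrong answer, so the
# comparator sees agreement and passes it through.
print(lockstep(buggy_fee, 1000))
```

Both copies execute the same faulty code on the same input, so agreement between them says nothing about whether the answer is right.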
6. Product Diversification and Struggles
- New Markets: Mixed Results
- Experimented with integrating Unix (Integrity S2), low-end modular computers (CLX, LXN), and jabbed at the PC market (failed Dynamite workstation).
- “Dynamite was incompatible with the IBM ecosystem and failed on arrival.” (01:15:47)
- Technical Accomplishments
- NonStop SQL, introduced in 1987, was renowned for high availability and linear scalability, and beat IBM’s DB2 in a DMV bake-off.
- “NonStop SQL beat IBM's iconic DB2 database in five of the seven technical criteria. And Tandem bid less than half of IBM. The big win convinced Tandem that they were finally ready to take the beast head on.” (01:19:48)
- Mainframe Showdown
- Launched Cyclone (superscalar, fault tolerant, looked rad), but real-world setbacks followed, including failed DMV contract (due to DMV, not Tandem).
7. Downturn, Adaptation, and Endgame
- Economic Woes & Missed Pivots
- Wall Street downturn (Drexel collapse), Gulf War, and tech commoditization hammered Tandem’s customer base in the early 1990s.
- “You could probably say it's a disaster in the UK and to some degree in the southeastern US.” (01:37:08)
- Delayed Response: The Himalaya Lesson
- Slow to release new models (Himalaya), loss of customer confidence, and catastrophic quarterly losses:
- “He tried to frame the launch as evolving the company towards a future of client server architectures…What customers heard however, was that the Himalaya had twice the performance for a third to a sixth the price...but it wasn't ready yet, so sales collapsed.” (01:43:51)
- “In July 1993 Treybig announced the company's worst quarterly loss in history, a jaw dropping $550 million. About $450 million of that were write offs and costs associated with plant closures, consolidations and a 1,800 person layoff...” (01:45:21)
- Leadership and Strategy Shifts
- Jon is blunt about Treybig’s tenure: “He sparked their early success and crafted their unique people centric culture. But he also failed to position the company for success in the new world of commodity hardware, open systems and distributed clusters.” (01:51:35)
- New CEO Roel Pieper attempts to pivot Tandem to a software-first, platform-agnostic model (e.g., ServerNet) and strikes partnership deals with Microsoft and Compaq.
8. Acquisition, Legacy, and Lasting Impact
- Acquisition Cascade
- Tandem was acquired by Compaq (1997), which then acquired DEC and was itself absorbed by HP: “There is something poetic about that, created as an idea inside HP, ended up inside HP after the acquisitions...” (01:59:48)
- Remaining DNA
- Today, Tandem-designed “NonStop” systems, now sold by HPE and running on Xeons, still operate in banking and finance. As Jon puts it: “Google and Amazon, with their big computer clusters, might be bigger and more scalable, but those seem to crash all the time nowadays. The Tandems powering Visa, MasterCard and our ATMs on the other hand, are still chugging along.” (02:03:44)
Notable Quotes & Memorable Moments
- On Fault Tolerance
“Most systems had about 99.6% availability, which sounds high, but that means a failure once every two weeks, which nobody will accept from a computer on the front lines. Imagine the NYSE stopping for an hour and a half every two weeks.” (10:39)
- On the Cultural Edge
“Now none of this feels very special nowadays, but for the late 1970s and early 1980s it was pretty radical.” (43:52)
- On Open Systems
“Unix commoditizes Tandem, but the customer insisted and they were just too large to ignore. So Tandem decided to saw the baby in half and add a newer lower end SKU…” (01:24:09)
- Candid Assessment of Management
“He tried to frame the launch as evolving the company…What customers heard however, was that the Himalaya had twice the performance for a third to a sixth the price...but it wasn't ready yet, so sales collapsed.” (01:43:51)
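The arithmetic behind the 99.6% availability figure in the fault-tolerance quote above is easy to check. The sketch below assumes the downtime is spread over a two-week window, as the quote implies; the function name `downtime_minutes` is hypothetical.

```python
def downtime_minutes(availability: float, window_hours: float) -> float:
    """Expected downtime, in minutes, over a window at a given availability."""
    return (1.0 - availability) * window_hours * 60.0


TWO_WEEKS_HOURS = 14 * 24  # 336 hours in two weeks

# 0.4% of 336 hours is about 1.34 hours, i.e. roughly 80 minutes
print(round(downtime_minutes(0.996, TWO_WEEKS_HOURS), 1))  # → 80.6
```

Roughly 80 minutes of downtime per two weeks matches the "hour and a half every two weeks" in the quote.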
Timestamps for Major Segments
- 00:02 – Tandem’s origins & the niche for reliable computers
- 07:56 – Treybig pitches “a computer built not to fail”
- 13:59 – Tandem’s technical approach (hardware & software innovations)
- 30:27 – ATMs go mainstream, Tandem’s systems build trust
- 43:16 – Corporate culture at Tandem
- 53:52 – Stratus enters, hardware vs. software debate
- 01:15:47 – Misses and pivots (Dynamite, CLX, LXN, NonStop SQL)
- 01:19:48 – NonStop SQL beats IBM in the DMV contract
- 01:37:08 – Economic downturn impacts Tandem
- 01:43:51 – Himalaya, strategic error, company crisis
- 01:59:48 – Acquisition and poetic “full circle” into HP
- 02:03:44 – Legacy: NonStop still driving critical infrastructure
Final Thoughts
Jon Y delivers a nuanced, energetic retelling of Tandem’s journey—a tale that’s as much about technological ingenuity and startup culture as it is about the relentless change of the computer industry. While Tandem’s proprietary, reliability-obsessed approach faced headwinds from commodity hardware and open software, its DNA remains in today’s world, quietly ensuring debits, credits, and trading sessions don’t fail.
