Artificial Intelligence and Human Judgement: A Cold War Cautionary Tale

Shortly after midnight Moscow time on the 27th1 of September 1983, Stanislav Yevgrafovich Petrov saved the world. This incident, unknown outside of the Soviet air defense forces until 1998, is a highly relevant case study as we accelerate the pace of automation and artificial intelligence in complex military systems.

Cold War backdrop

Cold War tensions between the US/NATO and USSR/Warsaw Pact were higher than they’d been since the Cuban Missile Crisis. Both sides had intermediate-range nuclear missiles on alert, treaty negotiations were breaking down, and NATO was preparing to deploy upgraded Pershing II and additional Ground Launched Cruise Missile systems to counter the Warsaw Pact’s buildup of long-range theatre nuclear forces.

For years the US had been conducting “psychological operations” designed to test defenses, demonstrate nuclear capabilities, and, most importantly, scare Soviet forces. Soviet technology was inferior and the Warsaw Pact Organization was weaker than NATO2. In March 1983, President Reagan called the USSR “the evil empire”. Joint NATO exercises demonstrated strength, often including overflying Soviet airspace. USSR leaders expected that an attack was forthcoming, putting KGB spies on constant watch for even small signs of the US preparing for imminent action, such as stocking up on blood reserves and readying fallout shelters.

Soviet forces were on a hair-trigger alert. With no defense against nuclear weapons, USSR leaders knew that their only hope was to preempt or retaliate. On the 1st of September 1983, the crew of Korean Air Lines Flight 007, en route from New York to Seoul after refueling in Anchorage, made a navigation error and entered Soviet airspace. Believing it to be a US spy plane, and upset about earlier airspace violations from US exercises in the Pacific, Soviet commanders scrambled fighter jets. It was nighttime so the Soviet pilot could not see the commercial livery, but he did identify the aircraft as a Boeing from the rows of passenger windows, later stating “I did not tell the ground that it was a Boeing-type plane; they did not ask me”. On orders from superiors on the ground, the pilot fired an air-to-air missile at the 747, fatally damaging it. The plane continued to fly for 12 minutes with the pilots having some control for the first few minutes and most likely with all of the passengers alive, until all control was lost and the aircraft eventually broke apart, killing all 269 aboard3.

The end of the world?

As you might imagine, this incident did not help ease the tension in the region. Just a few weeks later, the Soviet Oko (“Eye”) early warning satellite system detected an incoming ICBM, and then a flight of four ICBMs. Lt. Col. Petrov was the duty officer at the Serpukhov-15 control center. His job was to immediately report this threat to higher headquarters; very likely a retaliatory strike would have been ordered in line with Soviet doctrine.

Left: At the time of the incident, Petrov was a lieutenant colonel in the Soviet Air Defence Forces; right: the Serpukhov-15 facility that was, and continues to be, the Oko control center

Yet something didn’t feel right to Petrov. For one, a first strike would have been much larger; just five ICBMs was not a logical attack. Also, the system was brand new and not fully validated; as one of the developers of the early warning software, he was an expert in its capabilities and imperfections. Instead of reporting immediately, he waited a few minutes for corroborating detections by ground-based radar, which never came. The end of the story is anticlimactic. He didn’t sound the alarm, no actions were taken, and Petrov logged and reported the incident to improve the system.

Russian Ministry of Defense video showing modern early warning technology at Serpukhov-15. TWZ has a great analysis

Petrov’s actions were privately praised but not officially recognized, as it would have revealed the shortcomings of the missile warning system and put Soviet leadership even further on edge. In an interview years later, after the incident was revealed to the public, Petrov stated that he was intensely questioned, reassigned to a less sensitive post, took an early retirement, and suffered a nervous breakdown. One can imagine the stress he was under, both in the immediate situation and dealing with the Soviet political state in the aftermath.

It was later determined that the false detection was caused by high-altitude clouds over US missile fields strongly reflecting the sun. Oko satellites are in highly elliptical Molniya orbits, which look at the earth from the side, seeing missiles after they are already several miles in the air and against the blackness of space. This has the benefit of a strong signal-to-noise ratio and minimizes false detections from reflections; it just happened that the sun, the Oko satellite, and the clouds lined up in such a way as to cause the false detection. PBS Nova has a good write-up with more technical details.

Modern lessons for trust in automation

Had Petrov followed protocol and reported the detection, the Soviets most likely would have launched an attack against the US. That would have prompted an in-kind response from the US and an all-out nuclear war resulting in the deaths of hundreds of millions directly and billions—if not the entire human species—from the resulting nuclear winter.

Petrov notes that his civilian background contributed to his decision, and that someone from a Soviet military background conditioned to only follow orders would not have hesitated to immediately sound the alarm. We saw this with the Korean Air Lines incident, where the pilot executed orders as given without caring if the aircraft was of military or civilian nature. This highlights the critical role of the human-in-the-loop: AI may be faster and more analytical, but only has the data provided to work with, lacking a full context that is especially critical in shifting sociopolitical environments.

Will our future, AI-enabled missile alert systems be as careful as a human? Complex systems can easily become brittle when the situation exceeds the limits of the technology. In these cases, we rely on the human to fill the gap. Yet automation also removes the human from the loop, reducing their situational awareness, technical understanding, and authority to make and execute the right decision. Conversely, systems that effectively combine technology performance and human judgement are resilient and robust.

Human judgement was the difference between a normal September day and Armageddon. Midnight Moscow Summer Time was 2 pm Central Daylight Time in St. Louis, where Bob Forsch would take the mound to pitch a no-hitter against the Montreal Expos. It was 3 pm in Rhode Island where the Royal Perth Yacht Club was successfully challenging the New York Yacht Club’s 132-year defense of the America’s Cup. The day in Moscow would be typically cold and overcast, dreary and unremarkable except that it happened to be the day the USSR would return items recovered from the wreckage of Korean Air Lines Flight 007.

“A House of Dynamite” Movie Review

⚠️ Contains spoilers, if you’re the type of person who cares. ⚠️

Netflix’s thriller A House of Dynamite raises important, modern-day questions about our nation’s nuclear deterrence strategy and missile defense capabilities, but it ultimately fell flat for me. With an all-star cast and the Oscar-winning director of The Hurt Locker, it’s still worth a watch for a realistic fictionalized glimpse into the workings of our nation’s ICBM response. Just don’t expect deep insights4.

The Cold War never truly ended

Wikipedia will tell you that the Cold War ended in 1991 as the US-Soviet Union relationship improved, new treaties on nuclear and chemical weapons were signed, and several proxy wars were brought to a close. It certainly began a more peaceful period in world history with serious cooperation between the US and Russia, the fall of socialism and communism, and a wave of democratization and move towards capitalism around the world.

In the intervening decades, both countries have significantly reduced their nuclear arsenals:

Our World In Data: “Nuclear Weapons”

Yet both countries still have around 1,500 nuclear warheads on strategic alert, in missile silos across their respective plains, on submarines, and ready to load on bombers. These form the three legs of the “nuclear triad”: the land leg provides sheer volume and is difficult to eliminate, the air leg provides power projection and flexibility in deployment/retargeting, and the undersea leg provides stealth and survivability5. Current nuclear warheads, while limited in power by treaty, are still many times more destructive than those dropped on Hiroshima and Nagasaki6; these arsenals are enough to end the world multiple times over 7.

Arms Control Association: “Nuclear Weapons: Who Has What at a Glance”

New threats

The concepts of “nuclear deterrence” and “mutually assured destruction” result in an equilibrium; neither nation dares employ nuclear weapons because they know the response will be proportional. Driven by treaty agreements and deterrence strategy, details of US and Russian nuclear arsenals are surprisingly public. Coordinates for silos and launch control centers are published on Wikipedia 8 and you can buy chunks of the hardened communications cables on eBay.

Yet, many other nations have also acquired nuclear weapons and that certainly makes the calculus more complex, as Tom Lehrer eloquently describes:

The concept of deterrence is predicated on a high degree of certainty that the adversaries have reliable weapons, effective control systems, and resilience to launch a counterattack. None of that can be assumed for countries like North Korea, wildcards with the potential to destabilize global security with little regard for anything other than their own priorities.

To deal with this threat, the US created Ground-Based Midcourse Defense (GMD), which launches interceptors at incoming ICBMs (really, their reentry vehicles during the suborbital phase) in an attempt to destroy them. As the characters in the film point out, it’s like trying to hit a bullet with a bullet and each interceptor is only slightly better than a coin flip; for this reason, multiple interceptors are fired at each threat to increase the odds. The US has 44 interceptors; this capability is meant to defend against small rogue states, not a large-scale attack.
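To make the "fire several to increase the odds" arithmetic concrete, here is a minimal sketch. It assumes each interceptor succeeds independently with the same probability; the 0.55 figure is only an illustrative stand-in for "slightly better than a coin flip", not a published number, and the function names are hypothetical.

```python
def p_at_least_one_hit(p_single: float, n_interceptors: int) -> float:
    """Chance that at least one of n independent interceptors hits.

    Assumes every shot has the same success probability and that
    shots are independent, which is a simplification.
    """
    return 1.0 - (1.0 - p_single) ** n_interceptors


def threats_covered(inventory: int, shots_per_threat: int) -> int:
    """How many incoming threats an inventory can engage at a given salvo size."""
    return inventory // shots_per_threat


if __name__ == "__main__":
    P_SINGLE = 0.55   # illustrative assumption only
    INVENTORY = 44    # interceptor count cited in the text
    for shots in (1, 2, 3, 4):
        odds = p_at_least_one_hit(P_SINGLE, shots)
        covered = threats_covered(INVENTORY, shots)
        print(f"{shots} shot(s) per threat: {odds:.1%} intercept odds, "
              f"{covered} threats covered")
```

Under these assumptions, each extra interceptor per threat raises the odds but shrinks the number of threats the inventory can cover, which is why the system suits a small attack rather than a large one.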

The film

A House of Dynamite gestures at this complexity and gets the details right, but fails to come to a meaningful conclusion. In the movie, a single ICBM of unknown origin is detected by the Sea-Based X-Band Radar. The initial launch wasn’t detected by Space-Based Infrared System (SBIRS) satellites, so the launch location is unknown. No mention is made of determining the type of threat based on signatures. A single ICBM doesn’t match the expected attack profile of any known power. It’s a contrived situation, which is reflected in the bafflement of the characters in the movie as they theorize possible sources, the implications of each, and potential responses. I don’t fault the movie for it, though; an unlikely scenario was perhaps necessary to create an interesting and dramatic story.

In fact, the movie uses a unique narrative device, playing the same few crucial minutes over several times from different perspectives: the White House Situation Room, the Missile Defense Complex at Fort Greely, senior military and government leaders, and the President and First Lady. It’s compelling at first and is used in interesting ways, such as resolving unknowns and apparent inconsistencies while providing additional character development and backstory as the movie unfolds. But about halfway through it started feeling repetitive and a little boring.

Predictably, each major character has someone they’re extremely concerned about in a doomsday scenario: a young child, a pregnant wife, a hopeful fiancée. These exist in the story mainly to add a little bit of dimension to the characters. With one exception9, they don’t drive any meaningful behavior or motivation. I routinely found myself wondering what purpose these family characters served.

In an early vignette, two GMD Ground-Based Interceptors (GBIs) are launched to attempt to destroy the missile; neither succeeded. The portrayal of GMD’s limitations was realistic and could’ve been an interesting angle, but ultimately its contribution to an overall nuclear defense strategy was dismissed as an expensive boondoggle.

With an incoming ICBM of unknown origin, senior military leaders urge the president to authorize a nuclear counterstrike. Against who? It doesn’t matter, only that it’s immediate and decisive or we may all die! This was a major issue in the plot for me. The pressure for a speedy counterstrike seemed entirely artificial when so much was unknown, including, critically, who was responsible. The film tries to manufacture tension by having military advisors insist on immediate retaliation, but their arguments don’t hold up to scrutiny.

In a real scenario like this, the strategic calculus would demand patience, not haste. There was no indication of a larger-scale attack: no additional launch indications, communication jamming, or large-scale force mobilization. The warhead’s size and yield were unknown—it could be a relatively small weapon, a dud, or a hijacked test. Nations around the world shifted to more alert—but not aggressive—postures, exactly the expected response when an unknown actor launches an ICBM at the United States. The single incoming ICBM wasn’t even heading towards any strategically important target10, which might have suggested an attempt at a decapitation strike.

Launching a nuclear counterstrike without knowing the responsible party is strategically incoherent. The entire premise of deterrence theory is that nuclear powers won’t attack because they know retaliation is certain. The threat of retaliation only works as deterrence when it’s directed at the actual aggressor. A panicked, misdirected counterstrike doesn’t restore deterrence; it shatters it by proving the system is unstable and prone to catastrophic miscalculation. A hasty counterstrike risks deteriorating into all-out nuclear war, not preventing our destruction but precipitating it. There’s simply no strategic justification for launching world-ending retaliation against such an ambiguous threat. In fact, doing so seems far more dangerous than the incoming missile itself, with the potential to trigger the exact all-out nuclear exchange that deterrence was designed to prevent.

This false urgency undermines what could have been the film’s most interesting question: In an era of nuclear proliferation beyond the Cold War superpowers, how do traditional deterrence strategies break down? Instead, we get contrived drama that sacrifices strategic realism for manufactured tension. We continually see the countdown to impact while STRATCOM pressures the President to make a retaliatory decision, as if that timer drives the timeline, when it simply does not.

In the minutes before the warhead’s predicted impact, the President struggles with this decision, lamenting that deterrence as a strategy was meant to keep a stable peace and prevent any actual attack. It’s a surface-level discussion at best. The message of the film seems to be “nuclear weapons bad”, underscored by “missile defense technology expensive and useless”, without acknowledgement of the complexities of the world order with unstable regimes in control of nuclear weapons.

Other than the President, there doesn’t seem to be a voice of reason. Throughout the situation, the Secretary of Defense abandons his post and his responsibility to the American people. The stark contrast across SECDEF (cowardly), STRATCOM (gung ho), and the President (cautious) highlights the importance of putting effective, trustworthy individuals into key advisory and decision-making positions. We just start to explore this when the movie ends.

It left me shrugging my shoulders11. I tried to figure out if the movie had a deeper message and decided it didn’t.

Despite these flaws, A House of Dynamite is worth watching. The performances and cinematography are excellent. The technical details are spot-on. Just don’t expect the tight plotting or sharp commentary the premise deserves.

For those who’ve seen it: Am I wrong about the counterstrike urgency being contrived, or did that plot hole bother you too? Was the Secretary of Defense’s abandonment of duty the point, or just bad characterization?

A fighter jet isn’t a smartphone… but it could be

I can’t tell you how many times I’ve seen a senior DoD leader hold up their smartphone12 and wonder aloud why their military systems can’t work as seamlessly.

The answer is simple: There is no market that incentivizes companies to build seamless products for the military. Androids and iPhones work so well because there is competition. If Facebook Messenger13 starts releasing buggy versions, users will uninstall it and switch to Signal or Telegram or Snapchat or dozens of other messaging apps with various capabilities. Conversely, if a developer creates a fantastic new app that disrupts the incumbents, everyone will quickly switch to it. This forces the entire industry to continually innovate14.

Apple and Google are also in competition, thus it’s in their interests to foster ecosystems of hardware and software developers that in turn build and maintain market share for their products. The market results in the success or failure of the companies in that ecosystem and that competition results in excellent consumer technologies.

Good enough for government work

The US defense industry is not a competitive market, at least not in the same way15. Incentives across the military-industrial complex are misaligned and our nation’s security suffers for it. Even when everyone involved has the best of intentions, military prime contractors only win projects when they’re just cheap enough, just fast enough, and just good enough.

We joke that it’s “good enough for government work”, but the warfighter and the taxpayer deserve better.

Open architectures

The solution is relatively simple, at least in theory: the government needs to support the creation and enforcement of modular open system architecture (MOSA) standards for every aspect of the battlefield. We have a model for this already: Future Airborne Capability Environment (FACE) is an open software standard and certification process for military helicopters developed as a consortium between government acquisition agencies and major prime contractors. FACE has many benefits:

  • Software reuse across platforms: Solutions developed for one platform can be reused on all compliant platforms, with no or few changes
  • Plug-and-play: Systems can be easily reconfigured for different mission sets
  • Speed and reliability: Developers can easily understand the interfaces and capabilities and automated compliance checking ensures the delivered solutions will work
  • Competition: Anyone can develop to the published standards and offer competing products
  • Sustainment: If a supplier goes out of business, their components can be replaced easily without being hampered by proprietary interfaces
  • Upgradability: Software updates can be released faster and with less risk, as long as compliance checks are passed

All of this adds up to cost and schedule savings as well as the potential for more capable solutions. In addition to being an effective approach, FACE serves as a case study for other acquisition organizations on how to develop their own open standards and enforcement, which is helpful now that federal law requires the DoD to use MOSAs in systems development.
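The plug-and-play and compliance-checking benefits listed above can be sketched in a few lines. This is not the FACE interface or its conformance process; `PositionSource`, the vendor classes, and `install` are hypothetical names invented purely to illustrate how a published interface lets competing components be swapped and checked automatically.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PositionSource(Protocol):
    """Hypothetical published interface that any vendor may implement."""

    def position(self) -> tuple[float, float]:
        """Return (latitude, longitude) in degrees."""
        ...


class VendorAGps:
    """One supplier's component, written against the published interface."""

    def position(self) -> tuple[float, float]:
        return (38.63, -90.20)


class VendorBInertial:
    """A competing supplier's drop-in replacement."""

    def position(self) -> tuple[float, float]:
        return (38.64, -90.19)


class ProprietaryTracker:
    """A component with a proprietary interface; it does not comply."""

    def locate(self) -> str:
        return "38.63N 90.20W"


def install(component: object) -> PositionSource:
    """Accept a component only if it passes the automated interface check.

    The runtime check only confirms the required method exists; a real
    conformance suite would verify much more.
    """
    if not isinstance(component, PositionSource):
        raise TypeError(f"{type(component).__name__} is not compliant")
    return component


if __name__ == "__main__":
    for candidate in (VendorAGps(), VendorBInertial(), ProprietaryTracker()):
        try:
            source = install(candidate)
            print(type(candidate).__name__, "installed at", source.position())
        except TypeError as err:
            print("rejected:", err)
```

The platform code depends only on the interface, so replacing one vendor with another requires no changes on the platform side.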

Future vision: There’s an app for that

I’m excited for the ecosystem that this surge will create. I imagine a future where warfighters choose what apps to use from an available library, just like an app store. Instead of program offices acquiring specific technologies, MOSAs will enable them to open up the competition and allow multiple vendors to make approved apps available, and then pay them proportionally by hours of use. This is better for the warfighter as they’ll be able to choose the solution that works best for their needs and mission. This is better for the government as they’ll offload development risk and funding. And this is better for innovative developers who truly care about delivering the best solutions, as they will be financially rewarded for doing so.
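As a rough illustration of paying vendors "proportionally by hours of use", here is a minimal sketch. The budget, vendor names, and usage figures are made up, and `allocate_payments` is a hypothetical helper; a real program would need rules for minimums, auditing, and so on.

```python
def allocate_payments(budget: float, hours_by_vendor: dict[str, float]) -> dict[str, float]:
    """Split a fixed budget across vendors in proportion to logged usage hours."""
    total_hours = sum(hours_by_vendor.values())
    if total_hours == 0:
        # Nobody used anything, so nothing is paid out.
        return {vendor: 0.0 for vendor in hours_by_vendor}
    return {
        vendor: budget * hours / total_hours
        for vendor, hours in hours_by_vendor.items()
    }


if __name__ == "__main__":
    # Illustrative numbers only.
    usage = {"vendor_a": 600.0, "vendor_b": 300.0, "vendor_c": 100.0}
    for vendor, payment in allocate_payments(1_000_000.0, usage).items():
        print(f"{vendor}: ${payment:,.0f}")
```

With these made-up inputs, the app that warfighters chose most often earns the largest share, which is the incentive the app-store model relies on.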

That’s a big vision, and a lot has to change before we can get there, but it’s just one of the possibilities opening up as we push toward developing and adopting MOSAs. If you’re interested in learning more and becoming part of the conversation, a new community called MOSA Network was recently launched. Start here with a brief analysis of the Tri-Services Memo based on the new law:

What’s your vision for a MOSA-enabled future? How else can consumer technologies inspire better battlefield solutions? How will you engage in the MOSA network?

Postal vehicles: Function over form

One of my favorite items in my small model collection is a 1:34 scale Grumman Long Life Vehicle (LLV)16 with sliding side doors, a roll-up rear hatch, and pull-back propulsion. The iconic vehicle has been plying our city streets for nearly 40 years, reliably delivering critical communiques, bills, checks, advertisements, Dear John letters, junk mail, magazines, catalogs, post cards from afar, chain letters17, and Amazon packages.

What makes a good human factors engineer? Five critical skills

Recently, the head of a college human factors program asked for my perspective on the human factors (and user experience) skills valued in industry. Here are five critical qualities that emerged from our discussion, in no particular order:

Systems thinking

Making sense of complexity requires identifying relationships, patterns, feedback loops, and causality. Systems thinkers excel at identifying emergent properties of systems and are thus suited to analyses such as safety, cybersecurity, and process, where outcomes may not be obvious from simply looking at the sum of the parts.

Military-industrial complex

The phrase “military-industrial complex” was coined by President Eisenhower in his farewell address to the nation in 196118. In this address, Eisenhower spoke of the deterrence value of military strength:

A vital element in keeping the peace is our military establishment. Our arms must be mighty, ready for instant action, so that no potential aggressor may be tempted to risk his own destruction.

Simultaneously, he warned of the potential danger in the growing relationship between the military establishment and the defense industry:

College interviewing tips

For several years I’ve been volunteering as an alumni interviewer for my alma mater. It’s enjoyable to spend a bit of time interacting with a younger generation and exploring their interests; my optimism is buoyed by their potential.

Agile isn’t faster

A common misconception is that Agile development processes are faster. I’ve heard this from leaders as a justification for adopting Agile processes and read it in proposals as a supposed differentiator. It’s not true. Nothing about Agile magically enables teams to architect, engineer, design, test, or validate any faster.

In fact, many parts of Agile are actually slower. Time spent on PI planning, backlog refinement, sprint planning, daily stand-ups19, and retrospectives is time the team isn’t developing. Much of that overhead is avoided in a Waterfall style where the development follows a set plan.

“Diversity of thought” is the “all lives matter” of corporate inclusion efforts

For at least the last decade, engineering companies have talked a great deal about “diversity and inclusion”. Inevitably, many people20 have the takeaway that this means “diversity of thought”. This is like telling a Black Lives Matter supporter that “all lives matter”; of course all lives matter, but that’s completely missing the point21. Diversity of thought is important to avoid groupthink and promote innovation; but that’s not the point of diversity and inclusion efforts22.

Diversity and inclusion means making sure that teams are actually diverse, across a range of visible and not-visible features. Why does that matter?

Agile SE Part Zero: Overview

“Agile” is the latest buzzword in systems engineering. It has a fair share of both adherents and detractors, not to mention a long list of companies offering to sell tools, training, and coaching. What has been lacking is a thoughtful discussion about when agile provides value, when it doesn’t, and how to adapt agile practices to be effective in complex systems engineering projects.

I don’t claim this to be the end-all guide on agile systems engineering, but I hope it will at least spark some discussion. Please comment on the articles with details from your own experiences. If you’re interested in contributing or collaborating, please contact me at benjamin@engineeringforhumans.com; I’d love to add your voice to the site.
