Author: Benjamin

Learn from the mistakes of others

The problem with being too busy to read is that you learn by experience… i.e. the hard way. By reading, you learn through others’ experiences, generally a better way to do business…

General James Mattis

The most successful people in any profession learn from the experiences of others. You can learn from their successes, sure. But don’t focus on doing things exactly the way they did; you’ll stifle your own innovation. Instead, understand their successes, extract relevant lessons, and forge your own path.

More importantly, learn from others’ failures and mistakes.

That’s why I publish a Reading / Listening List. As of this article’s publication, 5 of the 6 recommendations are about poor engineering and design1. I find these stories fascinating, enlightening, and valuable. By avoiding the pitfalls of the past, we improve the likelihood of success in our own projects.

It’s okay to make mistakes, but strive to at least make original mistakes.

A Functional Team is NOT an Integrated Product Team

“My name is Inigo Montoya. You won a government contract. Prepare to deliver CDRLs.”

TL;DR: An Integrated Product Team (IPT) is a cross-functional group. If everyone on the team has the same background, that’s a functional or discipline team. There’s a difference.

Read More

Board man gets paid

For years I’ve been advocating for the effective inclusion of human systems integration (HSI) in the systems engineering (SE) process. I had to address a persistent misunderstanding of what HSI is and how it relates to human factors; while that can be frustrating, I recognized that it wasn’t going to change overnight. Instead, I worked diligently to share my message with anyone who would listen.

Recently, my diligence paid off. I was contacted by a group putting together a proposal for a defense contract. The government’s request outlined their expectations for HSI as part of the systems engineering effort in a way that the proposal team hadn’t seen before. Someone on the team had heard me speak before, knew I had the expertise they needed, and reached out to request my support.

It will be a while before we find out who won the contract, but I am certain that our proposal is much stronger for the inclusion of HSI. The HSI piece of the work is small but essential, and any competitors without the requisite expertise may not have understood its impact or importance to the customer.

This experience reminded me of basketball star Kawhi Leonard’s most popular catchphrase: “The board man gets paid.” See, Leonard is known for his skill at grabbing his team’s rebounds1. This is a key differentiator on the basketball court. The team has done all that work to get the ball up the court, yet failed to score. Grabbing the rebound before the opponent does gives the team another chance. Most of the time, the defensive team is in a better position to grab the rebound; Kawhi Leonard has made a career of getting to those balls first.

Leonard identified an underexploited opportunity and worked hard to develop the skill to take advantage of it. Throughout high school and college, he called himself “The Board Man”. He shaped his career around this unique skill and has been extraordinarily successful because of it.

That’s not to say you have to find a niche to be successful. Obviously there are superstars in every field. But, it’s a heck of a lot easier if you can identify those opportunities nobody else is taking advantage of2.

Bonus read: The top 5%. Share your own tips, inspiration, and niche in the comments below.

Diversity in engineering careers

I had the privilege to attend the Society of Women Engineers conference WE19 in Anaheim, CA last week. I left inspired and optimistic.

Speakers and panelists relayed their experiences over the previous decades. These women had been denied entrance into engineering schools, marginalized in the workplace, and forced to become ‘one of the guys’ to be accepted among their peers.

We’ve come a long way. It’s never been a better time to enter the workforce as a woman/person of color/LGBTQ/etc. Diversity in the workforce and leadership of engineering companies is on the rise, barriers are falling, and the value of diversity is being recognized. And yet, we still have so far to go.

We recognize that diversity is good for business1 and companies are actively recruiting more diverse talent. Our organizational cultures are still adapting to this diversity. In many ways, we still expect all employees to conform to the existing culture, rather than proactively shape the inclusive culture we desire.

A great example is the “confidence gap” theory for why men are more successful in the workplace. Writing in The Atlantic in 2014, Katty Kay and Claire Shipman explain that “compared with men, women don’t consider themselves as ready for promotions, they predict they’ll do worse on tests, and they generally underestimate their abilities. This disparity stems from factors ranging from upbringing to biology.”

Jayshree Seth’s WE19 closing keynote combated the confidence gap with a catchy “confidence rap”. I was excited to share it with you in a gender-neutral post about combating imposter syndrome. In researching this post, I learned that the “confidence gap” is a symptom, not a cause. Telling women to be more confident won’t close the gap because our workplace cultures are often biased against women who display confidence.

Jayshree Seth countered the “confidence gap” with the “confidence rap” in an excellent keynote.

Research demonstrates that an insidious double standard2 is what’s holding women back. Women who talk up their accomplishments the same way men do are perceived as less likeable. Women who are modest are more likeable, but nobody learns of their accomplishments and they appear to lack confidence. Women can be just as confident as men, but the cultural expectations of the workplace do not allow it.

That’s not to totally dismiss the confidence gap theory. This double standard stems partly (primarily?) from continuing societal expectations. Though gender equality has advanced significantly in recent decades, many parents continue to raise girls and boys differently3. A girl raised to be modest and display less confidence will join the workforce with the same attitude.

That’s not the whole story, of course. Our behaviors and habits continue to be shaped by the workplace culture, especially for younger employees just learning to fit in at the office. Currently most office cultures encourage confidence in men and discourage it in women.

I think this is changing slowly over time along with other aspects of gender equality. I also think that a gradual change is not good enough. We owe it to ourselves, to our female peers, and to the advancement of the profession to consciously bring about gender equality in engineering more swiftly.

We should define what a gender-equal workplace looks like, identify where our cultures diverge from this ideal, and create strategies for closing that gap. As a starting point, Harvard Business Review shared some management and organizational strategies. And all of us can contribute by recognizing our own biases and by finding ways to highlight others’ accomplishments.

What does workplace gender equality mean to you? How does the culture of your office support (or not) gender equality? What strategies would you recommend for addressing bias on an individual, team, or organizational level? Post in the comments below.

The Swiss cheese model: Designing to reduce catastrophic losses

Failures and errors happen frequently. A part breaks, an instruction is misunderstood, a rodent chews through a power cord. The issue gets noticed, we respond to correct it, we clean up any impacts, and we’re back in business.

Occasionally, a catastrophic loss occurs. A plane crashes, a patient dies during an operation, an attacker installs ransomware on the network. We often look for a single cause or freak occurrence to explain the incident. Rarely, if ever, are these accurate.

The vast majority of catastrophes are created by a series of factors that line up in just the wrong way, allowing seemingly small details to add up to a major incident.

The Swiss cheese model is a great way to visualize this and is fully compatible with systems thinking. Understanding it will help you design systems which are more resilient to failures, errors, and even security threats.

Holy cheese

A version of the Swiss cheese model: an arrow passing through aligned holes in four slices represents a failure penetrating multiple layers of defense. An image search will turn up a number of alternatives.
© Davidmack, used under a CC BY-SA 3.0 license.

The Swiss Cheese Model was created by Dr. James Reason, a highly regarded expert in the field of aviation safety and human error. In this model, hazards are on one side, losses are on another, and in between are slices of Swiss cheese.

Each slice is a line of defense, something that can catch or prevent a hazard from becoming a catastrophic loss. This could be anything: backup components, monitoring devices, damage control systems, personnel training, organizational policies, etc.

Of course, Swiss cheese is famous for its holes. In the model, each hole is a gap in that layer that allows the hazard condition to progress. A hole could be anything: a broken monitoring device or backup system, an outdated regulation or policy, a misunderstanding between a pilot and air traffic control, a receptionist vulnerable to social engineering, a culture of ‘not my job’.

If you stack a bunch of random slices of Swiss, the holes don’t usually line up all the way through. A failure in one aspect of the system isn’t catastrophic because other aspects of the system will catch it. This is a “defense in depth” strategy; many layers means many opportunities to prevent a small issue from becoming a major issue.

As shown in the diagram, sometimes the holes do line up. This is the trajectory of an accident, allowing an issue to propagate all the way through each layer until the catastrophic loss.
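To make that intuition concrete, here’s a minimal sketch in Python (the per-layer probabilities are invented for illustration, not drawn from any real system) that estimates how often independent holes line up all the way through:

```python
import random

# Illustrative probabilities that each layer fails to stop the hazard.
# These numbers are made up; real values come from hazard and reliability analysis.
layer_failure_probs = [0.1, 0.05, 0.2, 0.1]

def hazard_penetrates(probs):
    """True if the hazard slips through every layer, i.e. all the holes line up."""
    return all(random.random() < p for p in probs)

trials = 100_000
losses = sum(hazard_penetrates(layer_failure_probs) for _ in range(trials))
print(f"Estimated probability of catastrophic loss: {losses / trials:.5f}")

# With independent layers, the analytic answer is just the product:
# 0.1 * 0.05 * 0.2 * 0.1 = 0.0001. Each added layer multiplies the risk down.
```

The multiplication only works if the layers are independent; a single factor that opens holes in several layers at once (a common-cause failure) is far more dangerous than the product suggests.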

UPS 747 Cargo Plane
© Frank Kovalchek, used under a CC BY 2.0 license

A great example is UPS Flight 6, which crashed in Dubai in 2010. A fire broke out on board, started by lithium ion batteries which were being improperly shipped. Other planes, including UPS planes, have experienced similar fires but were able to land safely. This fire had to proceed through many layers before the crash happened:

  1. hazardous cargo policy — failed to properly identify the batteries and control where they were loaded in the plane
  2. smoke detection system — inhibited because rain covers on the shipping pallets contained the smoke until the fire was very large
  3. fire suppression system — not intended for the type of fire caused by batteries, thus less effective
  4. flight control systems — unable to withstand the heat and made controlling the plane increasingly difficult
  5. air conditioning unit — failed for reasons apparently unrelated to the fire, allowing the cockpit to fill with smoke
  6. cockpit checklists and crew training — didn’t have sufficient guidance for this type of situation, leading the crew to make several mistakes which exacerbated the situation
  7. pilot’s oxygen mask — damaged by the heat; he became incapacitated and likely died while still in the air
  8. copilot oxygen mask — he had on a mixed-atmosphere setting instead of 100% oxygen, allowing some smoke into his mask and reducing his effectiveness
  9. air traffic control wasn’t monitoring the emergency radio frequency — copilot tried to use this (international-standard) frequency, but air traffic controllers were not; he couldn’t find the airport without directions from the controllers

Ultimately, the flight control systems failed completely and the copilot could no longer control the aircraft. As you can see, this incident could have ended very differently if any single one of those nine layers had not allowed the accident to progress1.

Applying the model in design

This model is most often used to describe accidents after the fact. But it’s just as applicable for describing the resiliency of the system during the design phase, where applying it can have the greatest impact on the safety and security of the system.

Add layers deliberately and with care

The layers of cheese in the model suggest that the easy solution to many holes is to add more layers of cheese. Another inspection step, component redundancy, a review gate, etc.

This is a defense-in-depth strategy and it’s essential for the first few layers. However, it quickly becomes onerous and costly. It can also backfire by providing a false sense of security (‘I don’t have to catch 100% of problems because the next step will’).

Before adding a slice, carefully analyze the system to determine if there may be a better way to address the concern.

Fill the holes

Most often, the best solution is to minimize the holes in each layer by making that layer more robust, or to replace a layer with one that better addresses all of the risks.

An easy example might be the copilot’s oxygen mask setting from the UPS example above. The copilot had chosen a setting which varied the amount of oxygen based on altitude. In one sense, this setting makes sense; at a lower altitude the need for supplemental oxygen is lower. In another sense, there’s no risk in providing too much oxygen, so why not provide a simpler system which only delivers 100% oxygen and has less risk of error?

The best designs add minimal friction while providing value to the users. For example, a maintenance system which pre-fills documentation when the user scans a part. The user prefers it because it simplifies their job, documentation is more complete/accurate, and the system can automatically double-check that the part is compatible.
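As a rough illustration of that kind of value-adding layer, here’s a hypothetical sketch (the part numbers, fields, and catalog are invented for this example) of a maintenance record that pre-fills itself from a part scan and flags compatibility problems:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical parts catalog; in a real system this would come from a configuration database.
PARTS_CATALOG = {
    "PUMP-1234": {"description": "Hydraulic pump", "compatible_assemblies": {"A-100", "A-200"}},
}

@dataclass
class MaintenanceRecord:
    part_number: str
    description: str
    assembly: str
    performed_on: date
    compatible: bool

def record_from_scan(part_number: str, assembly: str) -> MaintenanceRecord:
    """Pre-fill the documentation from the scanned part and cross-check compatibility."""
    part = PARTS_CATALOG.get(part_number,
                             {"description": "UNKNOWN PART", "compatible_assemblies": set()})
    return MaintenanceRecord(
        part_number=part_number,
        description=part["description"],
        assembly=assembly,
        performed_on=date.today(),
        compatible=assembly in part["compatible_assemblies"],
    )

print(record_from_scan("PUMP-1234", "A-300"))  # compatible=False, so prompt the technician
```

The point is not the specific fields but the friction trade: the technician does less typing, and the documentation layer ends up with fewer holes.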

This is always going to be easiest to accomplish during initial design rather than after the fact. Software engineering has made a big step forward in this regard with DevSecOps, baking security right in rather than trying (usually with little success) to force it onto the system later. Resiliency should be incorporated into every step of an engineering project.

Analyze and accept risk

Finally, there’s never going to be a 100% safe and secure system. We must quantify the risks as best as possible, control them as much as practicable, and eventually accept the residual risk.

How have you applied the Swiss cheese model in your work? Do you have any criticisms of it or perhaps an alternative perspective? Share your thoughts in the comments.

Thoughts on “A Message to Garcia”

“A Message to Garcia” is a brief essay on the value of initiative and hard work written by Elbert Hubbard in 1898. It is often assigned in leadership courses, particularly in the military. Less often assigned but providing essential context is Col. Andrew Rowan’s first-person account of the mission, “How I Carried the Message to Garcia”.

There are also a number of opinion pieces archived in newspapers and posted on the internet both heralding and decrying the essay. There are a number of interpretations and potential lessons to be extracted from this story. It’s important that developing leaders find the valuable ideas.

Work ethic

Hubbard’s original essay is something of a rant on the perceived scarcity of work ethic and initiative in the ranks of employees. He holds Rowan up as an example of the rare person who is dedicated to achieving his task unquestioningly and no matter the cost.

Of course, this complaint is not unique to Hubbard1 nor is it shared universally. Your view on this theme probably depends on whether you are a manager or worker and your views on the value of work2. Nevertheless, Hubbard’s point is clear: Strong work ethic is valuable and will be rewarded.

No questions asked

If that were the extent of the message, it would be an interesting read but not particularly compelling. One reason the essay gained so much traction is Hubbard’s waxing about how Rowan supposedly carried out his task: with little information, significant ingenuity, and no questions asked. This message appeals to a certain type of ‘leader’ who doesn’t think highly of their subordinates.

It’s also totally bogus.

Lt. Rowan was a well-trained Army intelligence officer and he was sufficiently briefed on the mission. Relying on his intelligence background, he understood the political climate and implications. Additionally, preparations were made for allied forces to transport him to Garcia. He did not have to find his own way and blindly search Cuba to accomplish his objective.

I don’t intend to minimize Rowan’s significant effort and achievement, only to point out Hubbard’s misguided message. Hubbard would have us believe that Rowan succeeded through sheer determination, when the truth is that critical thinking and understanding were his means.

There may be a time and place for blind execution, but the majority of modern work calls for specialized skills and critical thinking. Hubbard seems to conflate any question with a stupid question, which is misguided. We should encourage intelligent questions and clarifications to ensure that people can carry out their tasks effectively. After all, if Rowan didn’t have the resources to reach Garcia he may still be wandering Cuba and Spain may still be an empire.

The commander who dismisses all questions breeds distrust and dissatisfaction. Worse, they send their troops out underprepared.

Leadership

On the topic of work ethic, Hubbard is preaching to the choir. Those with work ethic already have it, while those without it won’t be swayed by the message. Of course, managers always desire employees who demonstrate work ethic.

“A Message to Garcia” would be more effectively viewed as a treatise on leadership. After all, Army leadership effectively identified, developed, and utilized Rowan’s potential.

Perhaps the most important lesson, understated in the essay, is choosing the right person for the job. Rowan had the right combination of determination, brains, and knowledge to get the job done. In another situation, he might have been the worst choice. How did Col. Wagner know about Rowan and decide he was the right person for the job? How do we optimize personnel allocation in our own organizations?

That’s my two pesetas, now you chime in below. What lessons do you take from Hubbard’s essay? Feel free to link to an interpretation, criticism, or praise which resonates with you.

It’s time to get rid of specialty engineering: A criticism of the INCOSE Handbook

Chapter 10 of the INCOSE Systems Engineering Handbook covers “Specialty Engineering”. Take a look at the table of contents below. It’s a hodge-podge of roles and skillsets with varying scope.

Table of contents for the Specialty Engineering section of the INCOSE handbook.

There doesn’t seem to be rhyme or reason to this list of items. Training Needs Analysis is a perfect example. There’s no doubt that it’s important, but it’s one rather specific task and not a field unto itself. If you’re going to include this activity, why not its siblings Manpower Analysis and Personnel Analysis?

On the other hand, some of the items in this chapter are supposedly “integral” to the engineering process. This is belied by the fact that they’re shunted into this separate chapter at the end of the handbook. In practice, too, they’re often organized into a separate specialty engineering group within a project.

This isn’t very effective.

Many of these roles really are integral to systems engineering. Their involvement early on in each relevant process ensures proper planning, awareness, and execution. They can’t make this impact if they’re overlooked, which often happens when they’re organizationally separated from the rest of the systems engineering team. By including them in the specialty engineering section along with genuinely tangential tasks, INCOSE has basically stated that these roles are less important to the success of the project.

The solution

The solution is simple: re-evaluate and remove, or at least re-organize, this section of the handbook.

The actual systems engineering roles should be integrated into the rest of the handbook. Most of them already are mentioned throughout the document. The descriptions of each role currently in the specialty engineering section can be moved to the appropriate process section. Human systems integration, for example, might fit into “Technical Management Processes” or “Cross-Cutting Systems Engineering Methods”.

The tangential tasks, such as Training Needs Analysis, should be removed from the handbook altogether. These would be more appropriate as a list of tools and techniques maintained separately online, where it can be updated frequently and cross-referenced with other sources.

Of course, the real impact comes when leaders internalize these changes and organize their programs to effectively integrate these functions. That will come with time and demonstrated success.

The Boeing 737 Max crashes represent a failure of systems engineering

The 737 is an excellent airplane with a long history of safe, efficient service. Boeing’s cockpit philosophy of direct pilot control and positive mechanical feedback represents excellent human factors1. In the latest generation, the 737 Max, Boeing added a new component to the flight control system which deviated from this philosophy, resulting in two fatal crashes. This is a case study in the failure of human factors engineering and systems engineering.

The 737 Max and MCAS

You’ve certainly heard of the 737 Max, the fatal crashes in October 2018 and March 2019, and the Maneuvering Characteristics Augmentation System (MCAS) which has been cited as the culprit. Even if you’re already familiar, I highly recommend these two thorough and fascinating articles:

  • Darryl Campbell at The Verge traces the market pressures and regulatory environment which led to the design of the Max, describes the cockpit activities leading up to each crash, and analyzes the information Boeing provided to pilots.
  • Gregory Travis at IEEE Spectrum provides a thorough analysis of the technical design failures from the perspective of a software engineer along with an appropriately glib analysis of the business and regulatory environment.

Typically I’d caution against armchair analysis of an aviation incident until the final crash investigation report is in. However, given the availability of information on the design of the 737 Max, I think the engineering failures are clear even as the crash investigations continue.

Hazard analysis

The most glaring, obvious, and completely inexplicable design choice was a lack of redundancy in the MCAS sensor inputs. Gregory Travis blames “inexperience, hubris, or lack of cultural understanding” on the part of the software team. That certainly seems to be the case, but it’s nowhere near the whole story.

There’s a team whose job it is to understand how the various aspects of the system work together: systems engineering2. One essential job of the systems engineer is to understand all of the possible interactions among system components, how they interact under various conditions, and what happens if any part (or combination of parts) fails. That last part is addressed by hazard analysis techniques such as failure modes, effects, and criticality analysis (FMECA).

The details of risk management may vary among organizations, but the general principles are the same: (1) Identify hazards, (2) categorize by severity and probability, (3) mitigate/control risk as much as practical and to an acceptable level, (4) monitor for any issues. These techniques give the engineering team confidence that the system will be reasonably safe.

FAA Safety Risk Management Process and Risk Categorization Matrix from FAA Order 8040.4B, Safety Risk Management Policy.
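As a toy illustration of step (2), here’s a sketch of categorizing a hazard by severity and likelihood. The category names echo the FAA’s, but the scoring and thresholds below are invented for illustration, not the actual matrix from FAA Order 8040.4B:

```python
# Illustrative risk matrix. Real programs define their own severity/likelihood
# categories and decide which combinations are acceptable.
SEVERITY = ["minimal", "minor", "major", "hazardous", "catastrophic"]
LIKELIHOOD = ["extremely improbable", "extremely remote", "remote", "probable", "frequent"]

def risk_level(severity: str, likelihood: str) -> str:
    score = SEVERITY.index(severity) + LIKELIHOOD.index(likelihood)  # crude combination
    if score >= 6:
        return "high: unacceptable without mitigation"
    if score >= 4:
        return "medium: mitigate as far as practicable, then accept"
    return "low: acceptable with monitoring"

print(risk_level("catastrophic", "remote"))  # high
print(risk_level("minor", "remote"))         # low
```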

On its own, the angle of attack (AoA) sensor is an important but not critical component. The pilots can fly the plane without it, though stall-protection, automatic trim, and autopilot functions won’t work normally, increasing pilot workload. The interaction between the sensor and flight control augmentation system, MCAS in the case of the Max, can be critical. If MCAS uses incorrect AoA information from a faulty sensor, it can push the nose down and cause the plane to lose altitude. If this happens, the pilots must be able to diagnose the situation and respond appropriately. Thus the probability of a crash caused by an AoA failure can be notionally figured as follows:

P(AoA sensor failure) × P(system unable to recognize failure) × P(system unable to adapt to failure) × P(pilots unable to diagnose failure) × P(pilots unable to disable MCAS) × P(pilots unable to safely fly without MCAS)

AoA sensors can fail, but that shouldn’t be much of an issue because the plane has at least two of them and it’s pretty easy for the computers to notice a mismatch between them and also with other sources of attitude data such as inertial navigation systems. Except, of course, that the MCAS didn’t bother to cross-check; the probability of the Max failing to recognize and adapt to a potential AoA sensor failure was 100%. You can see where I’m going with this: the AoA sensor is a single point of failure with a direct path through the MCAS to the flight controls. Single point of failure and flight controls in the same sentence ought to give any engineer chills.
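To see how that single guaranteed link dominates, here is the notional chain from above as arithmetic. The numbers are made up purely for illustration; a real FMECA would estimate them from test data, reliability models, and operational history:

```python
# Made-up probabilities for each link in the notional chain above.
chain = {
    "AoA sensor failure": 1e-3,                 # per flight, illustrative
    "system unable to recognize failure": 1.0,  # no cross-check was performed
    "system unable to adapt to failure": 1.0,
    "pilots unable to diagnose failure": 0.5,   # poor documentation/training (illustrative)
    "pilots unable to disable MCAS": 0.5,
    "pilots unable to safely fly without MCAS": 0.5,
}

p_crash = 1.0
for link, p in chain.items():
    p_crash *= p

print(f"Notional P(crash from AoA failure) = {p_crash:.2e}")
# Restoring the cross-check pushes the two 1.0 terms back toward zero,
# and the whole product collapses with them.
```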

The next link in our failure chain is the pilots and their ability to recognize, diagnose, and respond to the issue. This implies proper training, procedures, and understanding of the system. From the news coverage, it seems that pilots were not provided sufficient information on the existence of MCAS and how to respond to its failure. Systems and human factors engineers, armed with a hazard analysis, should have known about and addressed this potential contributing factor to reduce the overall risk.

Finally, there’s the ability of the pilots to disable and fly without MCAS. The Ethiopian Airlines crew correctly diagnosed and responded to the issue but the aerodynamic forces apparently prevented them from manually correcting it. The ability to override those forces, plus the time it takes to correct the flight path, should have been part of the FMECA analysis.

I have no specific knowledge of the hazard analyses performed on the 737 Max. Based on recent events, it seems that the risk of this type of failure was severely underestimated or went unaddressed. Either one is equally poor systems engineering.

Cockpit human factors

An inaccurate hazard analysis, though inexcusable, could be an oversight. Compounding that, Boeing made a clear design decision in the cockpit controls which is hard to defend.

In previous 737 models, pilots could quickly override automatic trim control by yanking back on the yoke, similar to disabling cruise control in a car by hitting the brake. This is great human factors and it fit right in with Boeing’s cockpit philosophy of ensuring that the human was always in ultimate control. This function was removed in the Max.

As both the Lion Air and Ethiopian Airlines crew experienced, the aerodynamic forces being fed into the yoke are too strong for the human pilots to overcome. When MCAS directs the nose to go down, the nose goes down. Rather than simply control the airplane, Max pilots first have to disable the automated systems. Comparisons to HAL are not unwarranted.

In summary

Boeing is developing a fix for MCAS. It will include redundancy in AoA sensor inputs, inhibiting MCAS when the sensors disagree, limiting MCAS to a single activation per high-angle indication (i.e. not continuously activating after the pilots have given contrary commands), and limiting the feedback forces into the control yoke so that they aren’t stronger than the pilots. This functionality should have been part of the system to begin with.
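To illustrate what that looks like in logic (a notional sketch under my own assumptions, not Boeing’s actual implementation or thresholds), the core of the described fix is a sensor cross-check, a one-shot activation, and bounded authority:

```python
from dataclasses import dataclass

DISAGREEMENT_LIMIT_DEG = 5.5   # illustrative threshold, not Boeing's actual value
MAX_NOSE_DOWN_TRIM_DEG = 2.5   # illustrative authority limit

@dataclass
class McasState:
    activated_this_event: bool = False

def mcas_command(aoa_left: float, aoa_right: float, high_aoa_threshold: float,
                 state: McasState) -> float:
    """Return a nose-down trim command in degrees, or 0.0 for no action."""
    # Cross-check: if the two AoA sensors disagree, trust neither and stand down.
    if abs(aoa_left - aoa_right) > DISAGREEMENT_LIMIT_DEG:
        return 0.0
    aoa = (aoa_left + aoa_right) / 2
    # Activate at most once per high-AoA event, with bounded authority that the
    # pilots can override through the trim switches and control column.
    if aoa > high_aoa_threshold and not state.activated_this_event:
        state.activated_this_event = True
        return MAX_NOSE_DOWN_TRIM_DEG
    if aoa <= high_aoa_threshold:
        state.activated_this_event = False  # event over; re-arm for the next one
    return 0.0
```

None of this is exotic; it is the kind of behavior a hazard analysis of a single-sensor input to the flight controls should have demanded from the start.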

Along with these fixes, Boeing is likely3 also re-conducting a complete hazard analysis of MCAS and other flight control systems. Boeing and the FAA should not clear the type until the hazards are completely understood, controlled, quantified, and deemed acceptable.

Many news stories frame the 737 Max crashes in terms of the market and regulatory pressures which resulted in the design. While I don’t disagree, these are not an excuse for the systems engineering failures. The 737 Max is a valuable case study for engineers of all types in any industry, and for systems engineers in high-risk industries in particular.

Visiting an operational missile cruiser

I was recently offered an incredible opportunity to spend a day aboard an operational U.S. Navy ship, meeting the crew and observing their work as they conducted a live fire exercise. The experience blew me away1. I came away with new appreciation for our surface forces as well as observations relevant for defense acquisition policy and systems engineering.

Naval Base San Diego

Americans are more disconnected than ever before from their military. To help develop awareness of the Navy’s role, Naval Surface Force Pacific occasionally invites community leaders to visit the fleet in San Diego. I felt very fortunate to be included as one of eight participants in an impressive group including business leaders, community leaders, and a district court judge who created a successful veterans treatment court.

The day began with a tour of San Diego Bay and the many ships docked at Naval Base San Diego. Our guide was Captain Christopher Engdahl, Chief of Staff of the Naval Surface Force Pacific.

The hospital ship USNS Mercy (T-AH-19) and two other ships docked at Naval Base San Diego.

As we cruised around the bay, Captain Engdahl described the role of the surface force (surface being distinct from the aviation and submarine forces). Like most of our military forces, the surface force has a diverse mission set. Naval warfare has changed significantly from the large-scale, fleet vs. fleet battles of centuries past.

Recently, the surface force has been emphasizing projection of power, freedom of navigation operations, anti-piracy missions, and humanitarian aid. The surface force also supports air and land operations with forward deployment platforms, fire support, and direct enemy engagement. I haven’t even touched on mine warfare, anti-submarine warfare, electronic warfare, intelligence gathering, and countless other roles.

USS Independence (LCS-2) and USS Comstock (LSD-45) docked at Naval Base San Diego.

As we cruised by the piers, Captain Engdahl exhibited an encyclopedic knowledge of each of the ships we passed. As he spoke, he sprinkled in stories and details gained from an impressive Navy career. He was recently nominated to Rear Admiral and expects to be assigned to the Board of Inspection and Survey, which assesses the condition of Navy ships and reports to Congress.

Speaking of Congress and readiness, acquisition challenges seem to plague the Navy. These issues have been well publicized. Take the woefully over-budget and under-performing Littoral Combat Ships (pictured above is the USS Independence, featuring a trimaran hull). Another example is the futuristic-looking Zumwalt class (photo below), which saw its initial 32-unit plan cut to just three amid ballooning costs and watered-down capabilities. The Ticonderoga-class cruisers are reaching the end of their service life and the replacement program is both vaguely defined and on an aggressive timeline; the results remain to be seen. Given this recent track record, it’s hard to imagine the Navy fulfilling its plan to expand the fleet from 285 to 355 ships.

USS Michael Monsoor (DDG-1001), USS Cape St. George (CG-71), and USS Sterett (DDG-104) docked and undergoing maintenance at Naval Base San Diego. The white tarps covering parts of the ships help prevent environmental contamination.

Captain Engdahl touched on these concerns but didn’t dwell on them. He did express concern regarding the military’s struggle with recruitment and retention. Many young Americans don’t meet the physical requirements or don’t view the Navy as a viable career. Top personnel often leave after a few tours to work in industry, which can offer more lucrative compensation and work-life balance. Though the Navy has been making significant strides on retention, personnel will likely remain a perennial issue.

USS Bunker Hill (CG-52)

After our tour of the bay, we headed to Naval Air Station North Island to catch a flight to the USS Bunker Hill (CG-52). Bunker Hill is a Ticonderoga-class guided-missile cruiser. She is assigned to Carrier Strike Group Nine, which includes the aircraft carrier USS Theodore Roosevelt (CVN 71). Though she has reached the end of her service life, the Navy has committed to maintaining her for several more years.

This was evident in the condition of the ship. Bunker Hill had just completed a maintenance and refurbishment period and was in the process of re-certifying the equipment and crew. The flight deck, just barely large enough for the MH-60 Seahawk which was our ride, had been certified the day before. On the day of our visit, the crew was conducting a live fire exercise to certify the ship’s two five-inch guns.

Our chariot at NAS North Island.

I almost fell over with the rocking of the ship in the ocean swells. Captain Kurt Sellerberg and his executive officer Commander David Sandomir welcomed us aboard. It was lunchtime and we were invited to enjoy burgers2 with several of the ship’s officers in the wardroom. The food was good, the coffee was strong, and the officers were proud of their ship.

The wardroom contains a cabinet with artifacts related to the ship. This is a section of flight deck from the carrier USS Bunker Hill (CV-17), decommissioned in 1947.

After lunch, we were each assigned a petty officer to guide us around. With the recruitment discussion fresh on my mind, I asked my guide why she chose to enlist in the Navy. She told me that she had been interested in becoming an engineer, but for personal reasons college had not been an option. We didn’t get the opportunity to discuss her future plans (the big guns started firing), but she clearly was the type of dedicated, knowledgeable sailor the Navy wants to retain; if she decides to pursue an engineering degree, she’d be heavily recruited by defense contractors.

In fact, every sailor and officer I met aboard the ship was a model professional. We had the opportunity to tour the medical facilities, the engine and generator rooms, the engineering plant central control station, berthing and hygiene facilities, the torpedo room, helicopter hangar, and firefighting facilities. By interacting with the crew, I gained a sense of the culture onboard, which I would describe as strong camaraderie and trust. I later learned that we were witnessing the first implementation of a new training strategy which allows them to complete their basic certification early and utilize the remaining time for more advanced exercises.

That day, the crew was demonstrating the 5-inch guns by firing at imaginary targets on San Clemente Island3. We had free range of the upper-level deck and bridge during the live-fire exercise, an almost-unbelievable amount of access. We listened to the radio calls as the forward observer on the island called in firing coordinates and we watched the gun aim and fire in response. The evaluation team on the island recorded data on each round to score the exercise. We witnessed illumination rounds4, spotting rounds, and the rapid “fire for effect”.

Between the forward and rear guns and multiple test scenarios, about 150 rounds were fired. You can see a few of them in the video below. All of the exercise objectives were satisfied and Captain Sellerberg came on the 1MC (PA system5) to congratulate the crew on a successful exercise.

Heading home

Finally, it was time to head home. The captain and executive officer spent a few more minutes chatting with us while the helicopter landed and refueled. We said farewell and were off6.

Coronado and San Diego from the air.

The flight back provided an opportunity for reflection on the day. Beyond being seriously impressed by the exercise and the crew, there are concrete lessons to be learned:

Engineers need field trips

You can read about the [Air Force/Army/Coast Guard/Marines/Navy/Space Force7] ’til the cows come home. But it can’t compare to observing and participating in the culture first-hand. In-person visits to an operational facility build unmatched user empathy and mission understanding. On each project, engineering teams need to take the time to visit their users and spend a few days observing their work; leaders in both the contractor and customer organizations need to support these visits.

Traditions matter

The Navy is proud of their traditions. The ship’s brass bell is rung every half hour and water fountains are called scuttlebutts. Traditions provide continuity, reminding us of our history even as we adapt to the future.

As far as I am aware, there is no psychology research on the effects of tradition on performance. I would venture to guess that tradition is highly correlated with culture and organizational learning in high-risk and high-performing organizations8. In this sense, tradition substitutes for shared experience.

In the military, traditions are intentionally instilled doctrine. In engineering, tradition varies significantly by domain and organization. Engineering is evolving more rapidly than ever, and I think it’s important that we carry forward traditions and institutional knowledge even as we innovate.

Innovate with intention

Bunker Hill may be 35 years old, but you’d be hard-pressed to see signs of her age. Her crew may be young, but you’d be hard-pressed to see signs of immaturity. The Navy relies on centuries of experience with maintaining ships and training new sailors. They know what works and what doesn’t.

Meanwhile, industry gets excited about every hot new buzzword. We breathlessly promote blockchains, machine learning, artificial intelligence, and electromagnetically-launched projectiles. We shoehorn technologies into projects for the sake of innovation and not because they’re what the system really needs. Innovation is essential, but it should be done with care and intention, not for novelty’s sake.

Bravo Zulu

I can’t say enough about the men and women I interacted with during this experience. They represent the Navy and our country with dedication, skill, and professionalism. This experience gave me a renewed sense of pride in the work we do in the defense industry. Thanks again to everyone who took the time to share their world with us: you made an indelible impression on me and the entire group.

System lexicons and why your project needs one

A system lexicon is a simple tool which can have a big impact on the success of the system. It aligns terminology among technical teams, the customer, subcontractors, support personnel, and end users. This creates shared understanding and improves consistency. Read on to learn how to implement this powerful tool on your program.
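As a purely hypothetical illustration (the fields and the example entry are mine, not a prescribed format), a lexicon entry can be as simple as a structured record that every team references:

```python
from dataclasses import dataclass, field

@dataclass
class LexiconEntry:
    term: str
    definition: str
    synonyms: list[str] = field(default_factory=list)  # discouraged alternatives to map back
    source: str = ""                                   # controlling document, if any

LEXICON = [
    LexiconEntry(
        term="operator",
        definition="Person controlling the system during normal operations.",
        synonyms=["user", "crew member"],
        source="System Specification 3.1 (hypothetical reference)",
    ),
]

def lookup(word: str) -> LexiconEntry | None:
    """Resolve a term or a discouraged synonym to the preferred lexicon entry."""
    w = word.lower()
    return next((e for e in LEXICON
                 if w == e.term.lower() or w in (s.lower() for s in e.synonyms)), None)

print(lookup("user"))  # resolves the discouraged synonym to the preferred term "operator"
```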

Read More