When Security Architecture Depends on Tribal Knowledge


There is a moment in almost every organization when someone says a phrase that sounds reassuring on the surface but, I hope, should make security leaders just a little uncomfortable: “Don’t worry, Mike knows how that works.” (No real Mikes are used in today’s example.) Mike, I’m sure, is a great guy. He’s been with the company for years. In fact, like the building itself, he is part of the infrastructure. Mike remembers why things were built a certain way. Mike knows which system talks to which, where the dependencies live, and what will break if you touch the wrong thing.

Don’t get me wrong: Mike is, in many ways, invaluable to your organization, but… Mike is also a single point of failure. This is what security architecture looks like when it depends on tribal knowledge: nothing is documented, modeled, or even consistently understood across teams. Instead, it lives in conversations, experience, and memory. It lives in the heads of the people who were there when decisions were made. That works… until it doesn’t.

Most organizations do not intentionally design their architecture this way. No one sets out to build a system that only a handful of people understand. It happens gradually, growing organically over time. A project gets implemented quickly to meet a deadline. The documentation is created but not updated. Teams change, people move roles, new systems are added, and old systems remain because they still work. All this is to say that, over time, the architecture becomes less of a blueprint and more of a story that has to be told… and stories can change depending on who is telling them.

Ask three different people how a critical system works and you might get three slightly different answers. Not because anyone is wrong, but because everyone sees their own slice of the environment. The network team understands connectivity, the application team understands functionality, and the security team understands controls… but very few people understand (or even see) the whole picture.

That gap matters more than most organizations realize, because security architecture is not just about how systems are built… it’s about how they behave under stress, and about understanding the dependencies, trust relationships, data flows, and failure points. When that understanding exists only in fragments, the ability to respond effectively during an incident becomes limited or chaotic… this is where the problem becomes visible.

During “normal” daily operations, that tribal knowledge feels efficient. You need an answer, so you ask the person who knows. You need to make a change to the system, so you check with someone who has seen it before. Things move quickly, decisions get made, and the system continues to function. (You know what’s coming next by now.)

During an incident, that same model becomes a liability, because incidents do not wait for the right person to be available. Incidents do not respect vacation schedules, and they do not pause while someone tracks down the one engineer who understands a legacy integration… they unfold in real time, and they demand decisions based on accurate, complete information. If your architecture depends on tribal knowledge, your incident response depends on finding the right person at the right time. Sometimes that isn’t an issue… sometimes they’re hiking the Appalachian Trail without cell service.

That is not a strategy (well, at least not a good one)… that is a gamble. Popular culture has given us plenty of examples of this dynamic. There is always that one character in a movie who understands how the system works. You know, the person who built it, or who has been around long enough to know its quirks. When everything is going smoothly, they are just part of the background. Yet when something breaks, suddenly they are the most important person in the room. That makes for great (if sometimes predictable) storytelling, but it does not make for resilient systems.

In real life, that person is not always available. If you are lucky, they might just be on vacation; worst case, they might have left the organization. Or enough time may have passed that they simply do not remember every detail of a system that has evolved over years. When that happens, the organization is left trying to reconstruct its own architecture and knowledge in the middle of a crisis.

This is where the risk becomes tangible.

Imagine trying to isolate a compromised system without fully understanding its dependencies. You might stop the attacker (yay!), but you might also take down a critical business function (ouch!). Imagine trying to determine data exposure without clear data flow mapping. You may underestimate the impact (or overestimate it), and either mistake can create problems for leadership and communication.
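To make the isolation scenario concrete, here is a minimal sketch of what a documented dependency map lets a responder ask before pulling the plug: "if I isolate this system, what else goes down with it?" The system names and the `DEPENDS_ON` map are hypothetical, invented purely for illustration; a real environment would load this from whatever inventory or architecture tooling it already has.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the systems in its list.
DEPENDS_ON = {
    "web-portal": ["auth-service", "orders-api"],
    "orders-api": ["orders-db", "auth-service"],
    "billing-batch": ["orders-db"],
    "auth-service": ["directory"],
    "orders-db": [],
    "directory": [],
}


def impacted_by_isolation(depends_on, target):
    """Return every system that directly or transitively depends on target."""
    # Invert the map so we can walk from a system to the systems that need it.
    dependents = {name: set() for name in depends_on}
    for system, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(system)

    # Breadth-first walk outward from the isolated system.
    impacted = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)

    # In a cyclic map the target could reach itself; it is not "collateral".
    impacted.discard(target)
    return impacted


print(sorted(impacted_by_isolation(DEPENDS_ON, "orders-db")))
# prints ['billing-batch', 'orders-api', 'web-portal']
```

The point is not the code itself but the fact that the answer comes from a shared, queryable record rather than from whoever happens to remember that the billing job reads from the orders database.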

Security decisions depend on context, and tribal knowledge often leaves that context incomplete. This issue also extends beyond “just” incident response into everyday security operations. Vulnerability management, for example, relies heavily on understanding system criticality and dependencies. If that knowledge is inconsistent or undocumented, your prioritization becomes guesswork. A high-severity vulnerability on a non-critical system might get more attention than a moderate issue on a system that actually supports core business functions.
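The prioritization problem can be sketched in a few lines. The example below assumes each finding carries a technical severity score and that each system has a documented business criticality; the weights in `CRITICALITY_WEIGHT`, the system names, and the scores are all hypothetical values chosen for illustration, not a recommended scoring model.

```python
from dataclasses import dataclass

# Hypothetical weights: how much a system's business criticality scales a finding.
CRITICALITY_WEIGHT = {"low": 0.5, "medium": 1.0, "high": 2.0}


@dataclass
class Finding:
    system: str
    severity: float   # technical severity on a 0.0 to 10.0 scale
    criticality: str  # documented business criticality of the system


def priority(finding):
    """Scale technical severity by documented business criticality."""
    return finding.severity * CRITICALITY_WEIGHT[finding.criticality]


findings = [
    Finding("test-lab-server", severity=9.1, criticality="low"),
    Finding("payments-gateway", severity=6.5, criticality="high"),
]

# Highest priority first.
ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f.system, priority(f))
```

With these made-up numbers, the moderate finding on the payments system outranks the high-severity finding on the lab server. Without a documented criticality value to feed in, the ranking falls back to raw severity, which is exactly the guesswork described above.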

The same applies to access control, network segmentation, and monitoring. Controls are often designed based on assumptions about how systems interact. If those assumptions are not validated, updated or consistently understood, controls can be misapplied or bypassed entirely.

The challenge is that tribal knowledge does not feel like a risk, so it is not on most of our radars. Instead, we treat it like expertise… and expertise is valuable. Organizations depend on experienced individuals all the time, and the goal is not to eliminate that expertise. It is to ensure it is not the only place where critical knowledge exists.

As with every article I write, I think this requires a shift in mindset.

Documentation is often seen as a low-priority task (or a task that most of us loathe). It is something teams intend to do “when there is time”… but here’s the truth: there is never time. There is always another project, another deadline, another issue to resolve or fire to put out. So documentation becomes outdated, incomplete, or nonexistent. The result is an environment where the most accurate architecture diagram exists in someone’s head… and that is not sustainable.

Mature organizations treat architectural knowledge as a shared asset, not an individual capability. They keep diagrams current, map data flows, document dependencies, and ensure that knowledge is accessible, understandable, and maintained over time. Documentation isn’t a silver bullet, though. It is not enough on its own, because documentation can become outdated just as easily as it can be created… the real goal is continuous understanding.
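One way to picture continuous understanding is a recurring check that compares what the documentation says against what the environment actually does. The sketch below assumes two inputs that are hypothetical here: a set of documented connections and a set of connections observed in practice (for example, derived from flow logs). The system names are invented for illustration.

```python
def find_drift(documented, observed):
    """Compare documented connections against observed ones.

    Both arguments are sets of (source, destination) pairs.
    """
    return {
        # Real traffic that nobody wrote down.
        "undocumented": observed - documented,
        # Documented connections that were not seen in practice.
        "stale": documented - observed,
    }


# Hypothetical data: what the diagrams say versus what was actually seen.
DOCUMENTED = {
    ("web-portal", "orders-api"),
    ("orders-api", "orders-db"),
    ("orders-api", "legacy-ftp"),
}
OBSERVED = {
    ("web-portal", "orders-api"),
    ("orders-api", "orders-db"),
    ("reporting-tool", "orders-db"),
}

drift = find_drift(DOCUMENTED, OBSERVED)
print("Undocumented:", sorted(drift["undocumented"]))
print("Stale:", sorted(drift["stale"]))
```

Either list being non-empty is a prompt for a conversation: is the reporting tool's database access intentional, and does the legacy FTP integration still exist? Those are the questions that otherwise only get asked in the middle of an incident.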

Teams should regularly validate that their understanding of the environment matches reality. Tabletop exercises should include architectural questions, not just response steps. Changes to systems should trigger updates to documentation. New integrations should be reviewed not just for functionality, but for how they fit into the broader architecture…and this is where leadership plays a critical role.

Leaders set priorities, and if documentation and knowledge sharing are treated as optional, they will be ignored. If they are treated as essential components of security and resilience, they will be maintained (even begrudgingly). We leaders also need to challenge the idea that having “that one person who knows everything” is a strength. It is not. It is a hidden risk that only becomes visible when something goes wrong.

Not everyone needs to know everything, but critical knowledge should not depend on a single individual. Systems should be understandable by teams, not just by the people who built them. This also has implications for talent and team development. When knowledge is shared, teams become more capable. When knowledge is isolated, teams become dependent. Over time, that dependency limits growth, increases risk, and creates bottlenecks.

There is also a cultural element to this. In some work environments, knowledge becomes a form of job security. The more you know that others do not, the more valuable you feel. In security, though, that mindset can be dangerous. The goal is not to be the only person who understands the system. The goal is to build a system that can be understood, defended, and managed by the organization as a whole.

Attackers do not rely on tribal knowledge; they rely on discovery. They aren’t asking Mike how something works. They map your environment, identify relationships, test assumptions, and build their own understanding of your architecture, and they often end up knowing more about your systems than some of your teams do… that should be a wake-up call. If an attacker can understand your environment faster than your own teams, the problem is not the attacker; it is the lack of visibility on your end. Security architecture should not be a mystery that needs to be explained. It should be a foundation that can be referenced.

As leaders, we need to move from storytelling to clarity… from individual expertise to shared understanding… from “ask Mike” to “check the architecture documentation.” Because when the moment comes, and it always does, the question will not be who knows the system best. The question will be whether the organization understands it well enough to respond.
