A Model of Performance

Upgrading performance¹, as in the successful accomplishment of something, can be elusive. In some cases, it can seem downright impossible to meaningfully shift what people do, and in the case of teams, getting changes to stick can be frustrating to the point of head-scratching or hair-pulling.

“We all agreed, why now are we slipping back into the old way of doing things?”

Often, we spend a tremendous amount of energy pushing, bribing, or giving pep talks, but still don’t make a permanent change in what we’re doing. Somehow, it just makes more sense to do something in the old way.

All this pushing and self-cajoling can lead to burnout, and a lot of beating yourself up: “I’m not good enough, I am a procrastinator, I’m lazy, it’s no wonder this is hard for me.”

And all of that is beside the point when it comes to the only question that matters:

Did I or did I not take the necessary action?

When you do not, you seek an explanation, and you assess.

But how do you get access to change, to transforming the way you act?

What is Performance?

Performance, for the purpose of this consideration, is simply the completion of the necessary actions that lead to a result.

If something is performed (or you performed) it means that you did what’s needed to get the result you wanted.

Action is what matters.

Why then do some people seem to do the same actions as others, but get different results?

This is where things get more nuanced.

The Doer is in the Deed

Action has the effect. But the action carries something of whoever is doing it.

The way that Michael Jordan plays basketball comes from being Michael Jordan. If someone else could be exactly like Michael Jordan, they would play basketball the same way.

What exactly does it mean to be Michael Jordan? Outside of the physical, it would mean being present in the game in the way he is present. The games would present themselves to your perception in the same way, and so the same options would present themselves and be similarly chosen.

One kind of readily intelligible example of this is intention. My intention changes something about how I do what I do, and the nature of the action itself.

Throwing a ball to someone as a playful gesture in good fun is different from throwing it in a playful gesture meant to embarrass someone. It is simply not the same action.

We intuitively sense this when working with an honest versus a dishonest salesperson. With the honest one, we answer their questions to give them information to serve our needs and help us find what we want. With the dishonest one, we avoid answering the same questions because we do not trust that they’ll use the answers in the same way. The intention comes through.

A lever you can press

Why do actions happen?

Actions happen because a person wants something to happen. Therefore, they map out what they think will get them there, and take those actions.

The key to knowing what a person will do is to know what they want and what their map to that outcome looks like.

If you can give a person a new map, they will have a new way of getting to their destination, and thus entirely new actions at their disposal.

If you want to change the actions, you have to do one of three things. Change the goal, change the map to the goal, or change the doer.

It turns out that all three of these are accessible through language.

Our goals are things we choose with words. Learning a distinction that helps us clarify what we want often opens up slightly different goals (and thus paths) to us.

The situation is also described with words. Our description of it helps filter what shows up for us, and when we play around with other descriptions, sometimes new possibilities for action emerge.

And the same is true of ourselves. We have perceptual filters, which can be thought of as embodied stories we tell about ourselves. These stories live in our subconscious, and in ourselves physically as habits of perceiving and feeling. When you start to become aware of these stories, you start to get access to changing them with language.

What’s in a word?

It turns out that how things occur to us happens in language. It happens in the way we talk about things, and then the talking ends up in our subconscious in ways that are surprising. Language alters perception. Reevaluating experiences can alter patterns in our brain to change responses.

Lera Boroditsky (also discovered from the above talk) set out to prove that language does not affect our perception. She found the opposite. One powerful example is a tribe that uses cardinal directions exclusively to denote their orientation and the orientation of objects (i.e. not right and left, but North, South, East, West). They always knew where they were (in a way previously thought not possible), and, interestingly, they also arranged imagined sequences of time not right to left or left to right, but East to West (regardless of what direction they were facing).

The habits of describing changed the way that their situations occurred to them, and also changed the way their mind perceived more abstract things.

The words used to describe things affect our patterns of seeing. You pay attention to different things based on the words you use to describe them. The language you habitually use changes how things show up to you and how you show up to yourself, because your filters shift.

If this is true, then the question that next shows up is “How do I access that?” which I will cover in my next post.

  1. A lot of the thoughts in this article about performance come from ideas taken from this talk by Werner Erhard: https://www.youtube.com/watch?v=DwQr_BJrHJ8. I do not believe his model is a complete or appropriate anthropology, and it carries some particular psychological and moral dangers if applied radically, but it is certainly powerful and useful and appears to be true on some level.

The Costs of Creating Constant Crises — and an alternative

I’ve been on teams that are constantly in crisis mode.

At first, it’s exhilarating! There’s urgency, clarity, challenge – a chance to shine!

But what happens when the crisis is over?

People loll about. Nothing much happens. This is of course the natural comedown from a stressful big push. You have some dopamine withdrawal, and you need some quiet activity to recharge.

But what happens when you add another crisis? Or the crisis extends?

People get fatigued. And they get tired.

And then something worse happens.

When they do get a break, they’re so broken down that they don’t get much done on their own. They start to occur to themselves as ineffective and powerless, which leads to a lot of ineffective powerless actions (or lack of action).

“What difference does it make anyway? Even if we start owning our work again, there will just be another crisis.”

Recreating Ownership

The truth is, there will always be crises. Out there.

Someone else will always be panicking and freaking out.

But what about your team? How does your team respond? Does your team create a crisis?

Or does your team accept the situation as it is, evaluate it, and take calm and deliberate action?

The team needs to find a way to still make progress even when crises are happening everywhere else. To use the pressure to build systems to keep things under control even when it’s hard, and to focus on the outcomes not just for the organization, but for the team itself.

The team is building not just the software, but the team’s own capacity as individuals, as a team, as part of a delivery pipeline. And the team should build something it loves being a part of.

If you want amazing results, you have to stop creating crises all the time, and create ownership instead.

If a crisis doesn’t result in increased ownership, and a stronger team on the other side of the crisis, you have misused the crisis you have created.

Every crisis should result in a team that needs fewer crises to perform, and therefore does not create them for itself, but instead grows and improves in a way that doesn’t lead to disempowering stories for itself.

If you are able to use the crisis you create for your team to make a team that doesn’t need the crisis, then you can create a team that drives instead of being driven.

“I always work on more than one thing so I’m never blocked”

I have seen engineers, with some pride, tell me about how many things they like to work on in parallel so that they’re never blocked. “Oh don’t you work on more than one thing? You like being blocked?”

I’ve even seen managers suggest that team members do this.

On the surface, this makes perfect sense. Keep yourself busy, don’t waste time, stay optimally productive.

But being blocked is usually caused by a problem. When you treat that problem as something to ignore, or work around, what happens?

Naturally, the problem looks like something less important, and it gets less attention, and it continues to block you.

And this means that you’ll be blocked again. Or delayed. Or need a fancy workaround.

If you keep yourself busy with 10 things at a time because you can’t easily get a single one done quickly, you are never dealing with the problem that keeps you from getting one thing done quickly.

You are hiding the real work that needs to be done.

The (Negative) Value of Work in Progress

Work in progress is undelivered work. It is work that is not yet ready to be shipped. This includes incomplete features, bug fixes, half-baked designs, or anything else that cannot yet be delivered to the customer.

Work in progress is what you have when time and money have been invested into creating something, but that time and money cannot yet be recovered. It is unrecovered investment.

The longer work is in progress, the larger the overall payoff of the project must be in order to justify the investment.

If I spend 1 month of engineer work (let’s say it costs me ~$16k fully burdened for an engineer) for a feature, then that feature needs to be worth at least $1.6k/year in addition to the original investment just to meet the risk-adjusted rate of return (i.e. to break even with doing relatively boring things with your money).
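The arithmetic here can be sketched in a few lines. This is only an illustration, assuming a 10% annual hurdle rate (which is what the $16k and $1.6k figures imply) and simple, non-compounding interest. The function name `carrying_cost` is made up for this example.

```python
def carrying_cost(investment: int, annual_rate_pct: int, months_in_progress: int) -> int:
    """Return what the invested money would have earned elsewhere.

    Assumes simple (non-compounding) interest at a hypothetical hurdle rate.
    Amounts are whole dollars, so integer math is used throughout.
    """
    return investment * annual_rate_pct * months_in_progress // (100 * 12)


# One engineer-month (~$16k) tied up for a full year at a 10% hurdle rate.
print(carrying_cost(16_000, 10, 12))  # 1600

# The same investment tied up for only three months.
print(carrying_cost(16_000, 10, 3))   # 400
```

The longer the same investment sits undelivered, the higher the bar the finished feature has to clear.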

A major function of money in a business is to create action. It creates action that allows more action to be taken. If a business fails, it’s because the actions taken are not sensible enough to create conditions for more action to be taken. This is what it looks like when you go bankrupt.

If your delivered feature is going to create 20k of sales immediately and it costs 16k to develop and one month to deliver, you’re getting a phenomenal return (25% in a single month), and you immediately have 20k to invest into another project. If we assume we have a steady supply of projects with rates of return of 20% or more in short time frames, then we could in theory simply grow by delivering project after project.

Let’s assume our company resources are exactly 16k, and it takes 16k to deliver, and we get 20k when we deliver, which we can feed back into our next project. If each following project returns 20%, that project delivers 24k, and the next project delivers 28.8k. After 3 months, we have 28.8k, which means we have captured 12.8k additional dollars, simply by finishing things when they need to be finished.

If we wait for 3 months to deliver the first project, at the end of the 3 months, we have 20k. That’s 4k additional.

Why? The money in the system (as inventory) was not available to create more action, which would deliver more value. The same work was done, but because it was not finished, it did not deliver any value, and therefore amounted to less impact in the end.
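The comparison can be checked with a small sketch. It uses the figures from the example: a 25% return on the first project (16k to 20k) and 20% on each of the next two. The function name `reinvest` is made up for illustration, and putting all proceeds into the next project is a simplifying assumption.

```python
def reinvest(capital: int, returns_pct: list[int]) -> int:
    """Deliver projects one after another, feeding all proceeds into the next.

    Each entry in returns_pct is the percentage return of one delivered project.
    """
    for pct in returns_pct:
        capital = capital * (100 + pct) // 100
    return capital


START = 16_000

# Deliver every month: three deliveries in three months.
monthly = reinvest(START, [25, 20, 20])

# Sit on the first project for three months: a single delivery.
delayed = reinvest(START, [25])

print(monthly - START)  # 12800
print(delayed - START)  # 4000
```

The engineering effort is the same in both cases. The only difference is how often the invested money comes back out and becomes available for the next project.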

Money that is not recovered at the end of a process cannot do more work. But work is where value is created. Therefore WIP is the enemy of value creation.

But it gets worse for WIP!

Often inventory becomes out of date. A great feature this month becomes outdated in a year as other competitors release better features. If it takes a year to complete, it’s worse than capturing the value late. The value is lost forever.

And the longer it takes to capture the value, the longer the team is not available to switch gears without taking the risk of losing the value forever. Which means that the burden on new opportunities is higher, as it not only must take into account the expected value of the new opportunity, but the value represented by completing the current work based on how much work is left to finish.

How Constraining WIP helps

By simply constraining how much WIP is allowed, you are forced to deal with the problems that actually block delivery.

This means you can now deliver smaller batches of work with less delay.

You can capture the value, and free up the money to do more work in your business.

Dealing with these constraints means you are increasing the possible flow of work through the system, which means the system is overall more profitable. (Because there is less delay between dollars to do work and dollars captured as value).

Additionally, morale improves because people are able to build an environment where work is done well, and they understand the context of their work and how to deliver. Seeing the fruits of one’s labor is an edifying and motivating experience.

Surfacing the problems leads to solving the problems, which leads to faster throughput and higher rates of return.
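One way to put numbers on the link between WIP and delay is Little’s Law from queueing theory. This post does not name it; it is brought in here as an assumption. For a stable system, it says that average lead time equals average WIP divided by average throughput. The sketch below uses made-up numbers and a hypothetical function name.

```python
from fractions import Fraction


def average_lead_time(wip_items: int, throughput_per_week: Fraction) -> Fraction:
    """Little's Law: average lead time = average WIP / average throughput.

    Only holds for a stable system measured over a long enough period.
    """
    return Fraction(wip_items) / throughput_per_week


# A team finishing 2 items per week, juggling 10 items at once.
print(average_lead_time(10, Fraction(2)))  # 5   (weeks per item)

# The same team, limited to 3 items in progress.
print(average_lead_time(3, Fraction(2)))   # 3/2 (weeks per item)
```

This assumes throughput stays the same when WIP is reduced. If constraining WIP also forces the blocking problems to get solved, throughput should rise as well, and lead time drops further.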

Can your team perform at a high level? Not without this one thing

If I ask you “How do you make a project happen?”, you might tell me a number of things.

You might say “I just get started and keep going until it’s done”, or “I plan it all out on paper and then break it down into parts”, or “I carry it around in my head and make progress when I can”, or “I keep track of my next actions in a system, and make sure I keep doing them”.

These are all valid answers, but they all presuppose something about you.

These answers all presuppose that you are someone who can maintain an intention for long enough to finish a project. That you have a way of remembering what you’re doing, and continuing to do it, beyond the present moment.

The critical point is carrying an intention through time, and continuing to hold it, at least until you complete the actions the original intention requires.

What would you call it to carry an intention through time?

One way of describing this is: Integrity, or being complete with your word.¹

How do you “be complete with your word”? You either do what you say, or you acknowledge when you don’t and clean up the mess.

In that way, you have “completed” what you said, and now it no longer carries forward with you into the future.

When we do not act with integrity, we cannot make our word generate beyond the moment. Performance deteriorates into whatever we feel like doing right now.

There is a hidden problem with integrity, which Jensen calls a “Veil of Invisibility”: “Not having your word in existence when it comes time to keep your word”.

If you can’t remember what you said, and nobody else can, then there is no way to honor your word. You simply cannot be complete with it.

Being Complete With Your Word Requires Tools

Have you ever noticed a great kid excuse for why they didn’t do a chore is “Oh, I forgot”?

It’s because it’s a great excuse. You simply cannot honor your word if you don’t remember.

As you become older, you perhaps make more of an effort to remember what you said you’ll do.

But then, sometimes you still forget, because all the things you’ve said, and all the complexity of those obligations starts to become more than you can handle.

Which leads to the necessity of writing things down.

And when simply writing things down becomes unwieldy, we increasingly need tools to keep track of things we said, the dates we gave, etc, simply to be complete with our word.
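As a rough illustration of what such a tool has to keep track of, here is a minimal sketch of a commitment log. All names (`Commitment`, `Status`, `needs_attention`) are hypothetical. The rule it encodes is the one described earlier: a commitment is complete once it is either kept, or acknowledged as not kept and cleaned up.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    OPEN = "open"
    KEPT = "kept"
    CLEANED_UP = "cleaned up"  # acknowledged as not kept, and the mess dealt with


@dataclass
class Commitment:
    what: str
    to_whom: str
    due: date
    status: Status = Status.OPEN

    def is_complete(self) -> bool:
        """Complete means it no longer carries forward into the future."""
        return self.status is not Status.OPEN


def needs_attention(log: list[Commitment], today: date) -> list[Commitment]:
    """Return open commitments whose date has arrived or passed."""
    return [c for c in log if c.status is Status.OPEN and c.due <= today]


# Made-up example entries.
log = [
    Commitment("review the pull request", "Sam", date(2024, 1, 10)),
    Commitment("write the report", "Lee", date(2024, 1, 20)),
    Commitment("fix the build", "Ana", date(2024, 1, 5), Status.KEPT),
]

for c in needs_attention(log, today=date(2024, 1, 15)):
    print(c.what)  # review the pull request
```

The point of writing it down is that your word still exists when the date arrives, whether or not you remember giving it.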

The Essence of Integrity

The essential characteristic of Integrity is carrying your word forward in time so that you can honor it.

Now is the only time you can act. Any action requiring more than the barest now requires some form of integrity.

Any sizable undertaking will require tools, internal or external, to truly make it happen over time.

And that requires a commitment to being someone who honors their word.

If you are unable to make and keep commitments, you will simply be ineffective, unless someone else has made a system to manage your actions.

If you want to generate a new reality, you need to commit to doing what you say, and continuing to build on this capability.

High Performing Teams Need Integrity

For any team to perform at a high level, they need to have integrity.

Every team member needs a way of knowing what the intention is from the beginning, where they are in the process, what’s left, what the other team members are doing at a high level, and what their personal commitment is.

There needs to be a way of keeping track of the intention so that I know when I should drop my current task to assist another team member, or when I can make the request to interrupt someone because what I am doing is on the critical path and is blocking progress.

The intention and the plan must all be carried forward, and they must be carried forward in a way that truly facilitates action.

When language on a team breaks down, and people individually are not complete with their word, the whole team cannot reliably be complete with its word unless some team members commit in a special way to carrying the slack.

For a team to create a new reality, the team must honor its word, and make space for accountability.

If a team routinely does not honor its word, the use of language loses its power, and the ability of the team to create disappears.

It is only when a team’s word means something that the team can even have a meaningful discussion where they can plan.

How do you create space for accountability? How can you challenge your team to own this for themselves, so that they can create something they’re proud of?

Integrity Requires Owning The Vision

It all starts with a vision from the team. If they know what they want, they will want to hold themselves accountable to that vision. If instead, it’s happening to them, they may want to hide non-performance, as there’s “nothing in it for them”.

Ultimately, for the team to want integrity with their word, they’ll have to want the word to create the reality. That means they have to be bought into the vision.

Which means that for a team to embrace integrity, they have to embrace the vision. The team has to speak the vision themselves, and make it their own.

Until they own the vision as their own word, they are not going to want to be in integrity with it.

Creating integrity, therefore, requires leadership. It requires listening for the future of the team, and building it with them.

If you want to create a high-performing team, help the team write a new future, a future they love and will fight for. And then be ready to fight for it with them, even harder than they will.

At that point, everything will start to work on your team.

  1. See “Integrity: Without it, Nothing Works” for a lot more about this. I have read it multiple times, and keep coming away with new insights.

How do you get engaged employees? Wrong Question

“How do I get them to care?” “How do I get them to take ownership?” “Why aren’t they interested in the success of the business?”

As a leader of teams, these questions used to be ones that occurred to me frequently. Why weren’t my team members showing up with the same enthusiasm and sense of ownership I had (sometimes)?

While engagement is a great result, these are the wrong questions to ask.

But wait – don’t we want our employees to be owning the results of their work? Won’t they be happier if they care? Why wouldn’t we ask how?

The “How to get them” questions all presuppose that the problem is over there and not over here.

Your employees have a relationship with the company and with management.

It’s a two way street.

And you only get to control one direction.

Wanting something from someone, or wanting something for someone?

Who do you trust more than anyone else, and who would you do anything for?

You might answer a parent, your spouse, your best friend, or someone like that. Maybe you’ve been lucky enough to have a great mentor, or a great boss as well.

What made you trust them, and go the extra mile?

It’s because they genuinely had your interests in mind. They cared about you, and you knew that they cared about you, and you knew they weren’t going to lie to you to get what they wanted from you.

Do you know what your team wants? The individual members? The team as a team?

If you knew that, and let them know what you wanted, maybe there would be a path forward, where you can build that trust and relationship so that they’re engaged.

It is really hard to be 100% engaged doing a job just for the money, when there’s no path to what really matters to you.

Giving lip service to this is easy. Having some meetings, sharing some feelings.

But when do the gears really shift for people?

When it costs you something. When you have to make a decision between doing what you said you’d do for them and doing what’s easiest for you.

If you break your agreement with your team when management pushes a little, they’re going to know you’re not for real.

When you fight tooth and nail for what you’re creating together, even when it gets a little dicey, and they see that?

They’re going to show up for you in a completely new way.

Changing Organizational Habits

Have you ever sat down to work and emerged two hours later wondering why you’ve been scrolling social media?

You go to your email, but then you start processing your emails. You want to respond to a question on a Jira ticket, but need to find a Pull Request for context. You jump on GitHub to find the PR, but then start reviewing PRs, and need to look something up. You Google it, find an amazing article, and decide to quickly share it on Twitter before you forget, and 2 hours later wonder how you got there.

Why didn’t you stop yourself?!

Because most of that behavior was entirely unconscious – our conscious mind mostly turns off when we’re in a habit routine.

This is great most of the time. For example, driving would be extremely stressful if most of it didn’t become automatic at some point. Do you remember your first time behind the wheel, and how much thinking was involved?

Interestingly, memory seems to be mostly uninvolved in the process. Just repetition. In The Power of Habit, a patient known as Eugene is able to learn new behaviors in a new house. The twist is he has complete and total amnesia. He doesn’t know where he is. But somehow he’s able to learn. When asked where the kitchen is, he doesn’t know. Or how to get to the bathroom. But he does it anyway.

Habits aren’t learned like facts. They are rehearsed, until they can be done without thinking.

Organizational Habits

Organizations have habits as well. And they are also largely unconscious.

There are routines of behavior that come up over and over again between various team members, where particular interactions tend to go wrong in predictable ways, which often leads to an avoidance of real communication in other subjects.

These habits not only affect behaviors, but they affect the way we perceive ourselves and other teams as well.

When we routinely do not do what we said we would do as a result of our meetings, we tend to perceive meetings as a waste of time. They’re not where the real decisions happen. As a result, we don’t engage, and don’t attempt to make decisions there, thus reinforcing the perception (i.e. living into our notion) that meetings aren’t where real decisions are made.

These perceptions are contagious.

In a famous experiment, Solomon Asch subjected male college students to a vision test. The test was simple. Each student would say which of 3 lines on the right matched a line on the left. And the answer was obvious. The only catch was that before the experimental subject gave an answer, 11 actors confidently gave the wrong answer.

In about a third of the critical trials, the subject of the experiment gave the same wrong answer as the actors, and most subjects went along with the group at least once.

What’s striking about this is that while some of the participants lied (they wanted to fit in) and others simply doubted their judgement and assumed they saw wrongly, one group actually started perceiving the situation differently based on the answers of the preceding 11 actors.

So it turns out that our underlying assumptions that cause our habits can also be communicated by our actions and actually affect the basic perceptions of others about the nature of a situation.

Habits form situationally; situations occur to us in language

Habits all require triggers that produce cravings for something we want, which largely happen in a particular context. It is not the things themselves, but rather how we relate to them, that causes the habits to form.

If you tell someone who simply finds sugar or smoking disgusting that the cookie looks good or the cigarette is relaxing, they will disagree. Those things have no appeal to them.

But why?

Because the trigger of the habit, as well as the reward that is craved, both exist in how things occur to us, and how things occur to us arises in language.

In Three Laws of Performance, the authors assert that “how people perform correlates to how situations occur to them”, and “how a situation occurs arises entirely in language.”

It turns out that underlying our habits is a basic occurrence about the world. The cookie looks good. The cigarette is relaxing.

What we say about the cigarette and the cookie is usually more of a statement about ourselves.

When we can’t see that we’re talking about ourselves, we can’t see what is in our power to change.

It is only through awareness that we become capable of making a choice. We must surface the things we were unaware of, or the things that go without saying, in order to evaluate them and decide if they still serve us.

Changing the language can change the habits

If our habits about the cookies and the cigarettes emerge from our relationship to those things, and our relationship to those things can be expressed in language, then it would follow that organizational habits also have to do with relationships that can be expressed in language.

And these relationships are largely conversations that happen between people, often in the form of behaviors. And those behaviors, as we showed above, change what we think about the things in our environment.

Conversations are not simply verbal or written. They also consist of gestures, facial expressions, posture, clothing, and anything else that is expressive and communicative.

All of these conveyances can be expressed, though often with some effort, in words.

In the same way as you can change your mind about smoking by simply changing your associations with it, you can change your mind about anything by surfacing and becoming aware of what’s going without saying.

In order to change the conversations, we have to put into words the underlying communications that happen in behaviors, and then evaluate the truth of those things, and see if they serve us.

It is not something you can do without creating trust.

Many leaders make themselves incapable of this kind of transformational work because they look at their team as people they want something from instead of people they want something for.

Until we deeply care about the people we work with as people, many of our change management techniques and programs will fail. We will never get to the root causes, because the conversations that are happening don’t fundamentally change.

Improving Software Delivery Transforms Culture

According to the research in Accelerate, focusing on improving three metrics can be a catalyst for organizational transformation. In my last article, I laid out the research showing that we can use Mean Time to Restore, Lead Time, and Deployment Frequency as a way to gauge progress on much more impactful transformations that ultimately improve Organizational Performance.

This naturally leads to the question – Why?

What about focusing on those three metrics would cause both a shift in culture and a shift in organizational performance?

Let’s refresh our memory of the chart we looked at last time showing predictive correlative relationships.

Image Source: https://medium.com/@steve.alves2/accelerate-book-cheatsheet-d065e482f8a0

What are the necessary conditions to continuously improve Software Delivery Performance?

Could other systems of management or delivery also improve Software Delivery Performance?

Probably so. But what are the essential elements of this predictive relationship that demand upstream changes of one kind or the other?

  1. Awareness of the system as a system
  2. A system-level goal (as outlined in The Goal)
  3. A mechanism for finding bottlenecks

The magic of focusing on Software Delivery Performance happens because it focuses on an output of the system as a system. While delivering software doesn’t mean you’ll positively impact the organization, focusing on improving it will change how people relate to one another positively.

A mechanism to find bottlenecks (such as limiting work in progress) can help identify what needs to be improved and what people are necessary to improve it.

To fix the problems, they must be communicated, and that means that information needs to be discovered and encouraged to be shared.

That requires transformational leadership to make space for new behaviors.

New leadership patterns are going to have wide-ranging effects on many different aspects of how people perform.

As people start to be given more latitude and freedom and are encouraged to own the results and collaborate, patterns of organization and practice will be developed that remove impediments to flow.

A culture that values the overall team performance instead of individuals will help collaboration to happen at the right levels, and if that is also connected with individual mentoring and growth, who knows what can happen?

This is an overly simplified map of the process, but a valuable one.

If we have a picture of what levers will facilitate change, we can stay the course when things get rocky.

Effectiveness: What can you measure?

It is sometimes difficult to directly measure, in the context of a single business, if something you are doing is working well or not working. The impact often is downstream, and hard to isolate from all the other changes and activities going on.

Fortunately, there is research that supports predictive correlations between upstream changes that are hard to measure and downstream effects that are directly measurable.

Accelerate lays out these relationships, which are based on multiple years of survey research.

This excellent summary chart (shown again here) lays out the predictive relationships that were uncovered by the research.

Understanding The Diagram

Each box in this diagram is a construct, which means a phenomenon that is hard to measure directly. However, constructs can be measured by measuring parts that we believe make up the construct. (Here’s a relatively simple primer on constructs.)

The constructs were shown to be reliable and valid, in that survey respondents understood survey items similarly, and that the survey items correlated with one another in cluster analysis, and that they were not unintentionally related to some other construct. (Interestingly, Change Failure Rate did not cluster with Lead Time, Deployment Frequency, or Mean Time To Restore, so it was left out of the construct Software Delivery Performance).

Arrows in this diagram indicate a predictive correlative relationship. For example, Westrum Organizational Culture is predictively correlated with both Software Delivery Performance and Organizational Performance.

Okay, so what does this tell us?

There is a clear predictive relationship between Continuous Delivery -> Culture -> Software Delivery Performance -> Organizational Performance.

Therefore, if we do not have good software delivery performance, our culture is likely not optimal for organizational performance.

Software Delivery Performance, which is easy to measure, can be a proxy to see what is working upstream in our diagram of relationships.

By focusing on improving Software Delivery Performance we can impact the entire organization.

And also, even if we do not affect anything upstream, Software Delivery Performance still predicts Organizational Performance.

That means that we know what to measure that has an extremely high likelihood of impact, and everything else can be monitored as lagging indicators to ensure that we are having the impact we think we are having.
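As a rough illustration of why Software Delivery Performance is easy to measure, the sketch below computes the three measures named above (Lead Time, Deployment Frequency, and Mean Time To Restore) from timestamps most teams already have. The `Deploy` and `Incident` records, their field names, and the sample data are assumptions made up for this example; your own tooling will expose this data differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deploy:  # hypothetical record shape
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it reached production


@dataclass
class Incident:  # hypothetical record shape
    started_at: datetime
    restored_at: datetime


def lead_time(deploys: list[Deploy]) -> timedelta:
    """Median time from commit to running in production."""
    return median(d.deployed_at - d.committed_at for d in deploys)


def deployment_frequency(deploys: list[Deploy], window_days: int) -> float:
    """Average number of deploys per day over the observation window."""
    return len(deploys) / window_days


def mean_time_to_restore(incidents: list[Incident]) -> timedelta:
    """Average time from the start of an incident to restored service."""
    total = sum((i.restored_at - i.started_at for i in incidents), timedelta())
    return total / len(incidents)


# Made-up sample data for one week.
deploys = [
    Deploy(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),  # 2 hours
    Deploy(datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 15)),  # 6 hours
    Deploy(datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 13)),  # 4 hours
]
incidents = [
    Incident(datetime(2024, 1, 2, 16), datetime(2024, 1, 2, 17)),      # 60 min
    Incident(datetime(2024, 1, 3, 14), datetime(2024, 1, 3, 14, 30)),  # 30 min
]

print(lead_time(deploys))                 # 4:00:00
print(deployment_frequency(deploys, 7))   # about 0.43 deploys per day
print(mean_time_to_restore(incidents))    # 0:45:00
```

The median is used for lead time here so that one unusually slow change does not dominate the number; that is a design choice of this sketch, not something the research prescribes.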

The Hidden Costs of Slow Merge Pipelines

The obvious costs of adding a lot of slow processes before merging to prevent Bad Things (typically e2e or system tests, but your flavor may be different) are generally known to everyone:

  1. Slow merge times
  2. Developer frustration
  3. Large PRs + slow reviews

But there are some other costs that don’t always get noticed.

  1. Fear of Mistakes
  2. Less Resilient Systems
  3. Longer Outages
  4. Lost Focus
  5. More Tech Debt
  6. Slowing Feature Delivery

Let’s look into the way these happen.

Fear of Mistakes

If your team believes that the company values spending all this extra time to prevent Bad Things, what does that say about how costly we think Bad Things are? How should we position production problems in our minds?

As an organization you are signaling that it is worth a lot of effort to prevent mistakes.

So culturally, your people learn that the way to succeed there is to avoid mistakes. Prevention becomes the focus.

Less Resilient Systems

If we focus on something, we are by definition not focusing on something else.

If we are focused on preventing mistakes, we are paying less attention to our story for dealing with mistakes.

And because we think mistakes are very bad, nobody wants to look like they’re not taking this mistake very seriously, so you often end up with All Hands On Deck as an emergency response, which means your disruption disrupts more people.

By magnifying the avoidance, we have also somehow made the occurrence worse when it does happen. That reinforces the idea that mistakes need to be more strenuously avoided.

Longer Outages

Because our pipelines are slow and we now want to be extra sure not to break things further, our response is often cumbersome.

Because we made the first mistake, we now don’t trust ourselves not to make another.

So we test even more, and do extra manual steps to make extra sure we have not broken something in addition to what was originally broken.

Notice that we still haven’t talked about building a resilient system where mistakes and outages are easily recoverable.

We are trying to fix problems and prevent problems, but never build something that allows for problems.

Lost Focus

Because of the impact of incidents, the team is frequently pulled away from planned work.

When this stressful situation is a regular occurrence, it comes with a refractory period, during which motivation for non-stress situations starts to wane. Non-emergency work is not nearly as interesting to our brains, so our focus naturally slackens.

And remember, it’s slow to merge, so now we add multi-tasking so that we avoid the downtime that happens when waiting for our pipelines to tell us if we are good to go (and these pipelines run multiple times during a single pull request when changes are requested during reviews).

This means that every story will have to be re-contextualized. There’s no way to quickly burn through a single thread of work except for making massive pull requests that take even longer to review and have more rounds of requested changes (and more possibility of merge conflicts and rework).

More Tech Debt

Because merging is slow, we won’t do single-line refactors or fix typos, except in the course of other changes.

Small changes cost too much to merge by themselves, especially if they’d be in an area we are about to work on more. So we make our PRs larger rather than pay this price.

We now learn from experience that refactoring makes reviews slower. But we want to be productive, so we do fewer refactors. (Merge conflicts also happen more with long-lived refactor branches.)

Additionally, since our focus is fractured by incidents and the refractory periods that follow them, we are unlikely to engage with our domain as deeply or as often as we might, so we miss opportunities to improve our code.

So we don’t do the little refactors or the big refactors. Small problems become entangled with more of the system, until their entanglements make them big problems. At this point, things are hard to unstick.

Slowing Feature Delivery

A team that never updates its model in code to match its developing understanding of the domain will have to constantly translate between what it discusses in conversations and what exists in the codebase. Features that are easy to describe seem hard to implement.

When the refactors don’t happen, this doesn’t get a lot better.

Eventually, these problems bubble up and become “Cleanup Tech Debt” initiatives, which either seem too expensive, or get partial traction until the team or management loses appetite for all this work which “adds no value”.

It is very hard to justify 6 months “cleaning up tech debt” from a business point of view. The costs of the tech debt are usually hidden, and the cleanup benefits are usually hidden, but all the other costs are not.

Now cleaning up tech debt starts to occur to the team as impossible, and they don’t really make a regular effort to do it in the course of their work.

Building A Different Future

What changes all of this?

Simply speaking, it is to shift our focus.

If our problem was that we focus on preventing mistakes, what could we focus on instead?

Resilient systems allow for the reality of mistakes. We stop fighting mistakes, but instead start to tolerate their existence and deal with them effectively.

If you admit your system will have to handle a hurricane, you’ll have data center redundancy.

If you admit your system is built by people (or worse, AI) you’ll have to accept that bugs will get into production eventually, and work out ways to limit impact and recover quickly.

Our desire to prevent all danger usually has undesired secondary effects, which often lead to more danger of a different sort.

Would you rather avoid all pathogens, or strengthen your immune system? If you keep your immune system weak, the eventual pathogen is devastating.

The same is true in our systems. Avoidance can only do so much.

Netflix’s Chaos Monkey is a great re-envisioning of how to think about your systems design.

By embracing chaos, they create systems that are strengthened by adversity. When they can’t handle something that happens from their automated chaos, they know what problem they need to address.

Adversity makes those systems stronger.
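The idea can be sketched with a toy, in-process simulation. This is not Chaos Monkey’s code or API; the `Instance`, `serve`, and `chaos_round` names are hypothetical and exist only to show the shape of the practice: deliberately kill something, then check whether the service still answers.

```python
import random


class Instance:
    """A stand-in for one server in a pool (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


def serve(pool, request):
    """Try each instance in turn; fail only if every instance is down."""
    for instance in pool:
        try:
            return instance.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("no healthy instance left")


def chaos_round(pool, rng):
    """Kill one random instance, then report whether the service survived."""
    victim = rng.choice(pool)
    victim.alive = False
    try:
        serve(pool, "health-check")
        return True
    except RuntimeError:
        return False


rng = random.Random(0)
redundant = [Instance("a"), Instance("b"), Instance("c")]
single = [Instance("only")]

survived_redundant = chaos_round(redundant, rng)  # pool absorbs the loss
survived_single = chaos_round(single, rng)        # single point of failure
```

The redundant pool keeps serving after losing an instance, while the single-instance pool does not. A failed round is the useful result: it points at the exact weakness to address next.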

Train for harder circumstances than you face

Increase the difficulty of your operations so that you are training to handle harder and harder situations.

Your merge pipeline does not need to prevent every bad thing. The bad things can happen, and then you can deal with them effectively with minimal impact to your customers and your business.

This change of focus from prevention to resiliency allows you to focus on speed and agility. The alternative is a system that is fragile and breeds further fragility, and slow and breeds further slowness.

Fear or Power? What is your team’s experience of their work?

What could change for your team if you weren’t operating out of fear of mistakes, but rather building power to handle them?

What if the team knows it has everything it needs to deal with whatever happens, and focuses on building what’s needed to avoid disasters and handle unplanned issues?

What would that look like?

Effectiveness: How do you know when it’s working?

This is the first in a short series of articles, asking questions about how we measure effectiveness in software development.


One of the biggest problems in software development is knowing when something is working from a business perspective.

Do your users like your features? Are you gaining market share?

What’s the impact of what you’re delivering? Are you solving the right problems? Are you asking the right questions?

If it were obvious, why would you hear stories of new management destroying systems that were performing perfectly well? Who’s right? The current employees, or the new managers? How can you tell?

And what is “Working” anyway?

This comes back to our conversations about the common goal of all organizations: increase throughput while reducing inventory and operational expenses.

Since “working” depends on where software sits in your organization, we can’t answer the question naively with respect to the whole business. The ultimate impact of the software organization depends on where the software is being used and what it is used for.

But what set of questions would we want to ask?

Are there any capabilities that all software organizations need? What is truly proper to a software development / delivery organization that is always (or almost always) true?

So these are the questions I am setting out to answer:

How do we know it’s working? What are the leading and trailing indicators? What model of a software organization helps us identify what needs improvement? How do we validate that model?

And finally – how do we take all of this, and create a system that helps us get better in a way that matters?

If we are able to formulate good questions, and some idea of how to find their answers in any given business, we will have a solid foundation for uncovering what works and what doesn’t work in any business.