14 November 2024

Thoughts on a Pipeline-Driven Organization

This post is inspired by and based on the talk “The Pipeline-Driven Organization” by Roy Osherove at the GOTO 2022 conference [1]. It summarizes the talk and adds comments to help me remember the parts I found most relevant. Roy is a great speaker and this talk struck a chord with me, so I encourage you to watch it.

The talk starts with stating that there is a difference between:

  • continuous delivery
  • true continuous delivery
  • automated builds.

Roy asks the audience to raise a hand if they have:

  • automated pipelines
  • continuous delivery.

On one hand, I’m glad that fewer people raised their hands for the second question. It means people know that automation and continuous delivery are different things, or at least that a difference exists. On the other hand, I’m sad that so many people in the audience haven’t reached Continuous Delivery.

In my experience, many people don’t know what Continuous Delivery really is. The fact that “true continuous delivery” is brought up at all is a sign that the original term is misunderstood. I recommend reading “Continuous Delivery” [2] by Jez Humble and Dave Farley, and “Modern Software Engineering” [3] by Dave Farley.

Theory of constraints

Roy says that “The Goal” series of books by Eliyahu M. Goldratt [4][5][6] helped him move from thinking in terms of agility to thinking in terms of limitations and bottlenecks. He mentions two good examples of bottlenecks: knowledge and permissions. This leads to the process of adopting a new technology. Before adopting one, we have to figure out why and how. Four questions help with that:

  1. What is the power of the technology? What can we do with it that we couldn’t before?
  2. What limitation/restriction/constraint does it decrease?
  3. What rules/processes helped us accommodate the limitation until this technology?
  4. What rules should we use with it instead?

The last two questions are especially interesting. I’ve seen so many teams and organizations figure out the first two but completely forget about the last two. I like an anecdote that captures this situation simply and clearly.

A young daughter asks her mother why she always cuts off the ends of sausages before frying them. The mother pauses and responds that she learned to do it that way from her own mother, but she’s not sure of the exact reason. Curious, the daughter decides to ask her grandmother the same question.

The grandmother explains that she also learned it from her mother and doesn’t have a particular reason except that’s how it’s always been done in their family. Still curious, the daughter then goes to her great-grandmother to find out why she was doing it.

The great-grandmother is surprised to find out that her small frying pan is still in the family. Except it isn’t. Only the practice survived, invented to fit a sausage into a tiny pan.

In this anecdote, the main character is lucky that the source of the practice is still available. Usually, there are no Decision Records, and the original adopter left the company years ago. It is important to understand why we do things the current way. Even the most obnoxious solution might have been the best fit, but only for a while. Context changes, and we need to know how to adapt.

Adopting pipelines the wrong way

When we do everything manually, we don’t have reliable metrics for failure rates, how long a release step takes, how often it happens, and more. We’re also blocked by the people who have to set up, review, test, update, approve, and perform many other activities. At 8:26, Roy goes through a list of rules and processes that help us manage manual delivery.

All of it boils down to the fact that we don’t know at any given moment whether our software is releasable. To find out before a release, we perform a lot of costly, complex, and lengthy rituals. At this point, we are limited by people: QA, Sec, Ops, Compliance, and others. They act as gates for a release. With such an expensive release process, we tend to cram every possible feature into a release, because the next one won’t happen soon.

The QA lead, or whoever signs off on such a release, is stressed. They are responsible for the work of dozens or hundreds of people over the last two weeks or even months. It’s a huge burden to approve such a release, an even bigger one to deny it, and a sure way to burn out.

These rules and rituals are the answer to the third question. But Roy’s experience and mine show that many of these practices migrate into workflows built around a pipeline. I’d like to take a moment to blame CI/CD tools for being just automation tools. GitLab is definitely guilty of slapping a CI/CD label on everything. At the end of 2023, I ran a small experiment with 50 applicants for various leading tech positions at the EdTech company I worked for. Only 3 managed to give correct answers to:

  • “What are we integrating in CI?”
  • “What does ‘continuously’ mean?”

For some, CI/CD, Continuous Delivery, and DevOps are just fancy synonyms for automation. In that view, a single build pipeline counts as having achieved all of it, with nothing more to strive for.

Waterfall pipelines

I like this term; it’s well named. Waterfall pipelines don’t go through the whole release process in one go. They might have multiple gates: QA, Sec, Ops, and so on. There might even be more than one pipeline. The QA team might have its own pipeline that it starts manually for a special environment. Ops might have a separate infrastructure repository with its own pipeline.

Such pipelines are usually flaky and unreliable. Roy mentions the term “build whisperers”: experts in a particular pipeline who can tell that a build is good even when it’s red. They are also a bottleneck.

Cooperative Pipelines, CoOps

It’s a better approach to pipelines. There is no human gatekeeping in it. A Cooperative Pipeline goes from the start to the delivery in one go. It can reliably decide whether software is releasable.

You can’t just start trusting an unreliable pipeline and pray for the best. It is a journey. Roy describes the destination with the following rules:

  • The pipeline decides when a stage is approved all the way to production.
  • Trust the pipeline results (or make the pipeline more trustworthy).
  • Run the pipeline as often as the machine allows.
  • Run the tests on as many environments as possible.
  • Any manual test is potentially entered into the pipeline as automated.
  • Everyone writes tests - everyone is a tester.
  • Work in small batches to merge incremental work.
  • No code freeze, no branching - use feature toggles.
  • Spend 50% of your time teaching the pipeline to make a good decision.
  • Spend time coaching others about your expertise.
  • Don’t disable a part of the pipeline to pass the build.
  • Pipeline is part of daily discussions and large-screen monitors.
  • Measure how long a pipeline stays red.
  • Test Recipes QA+DEV.
  • A red pipeline means the task is not done. No need to create a bug item.
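
The rule “Measure how long a pipeline stays red” is easy to turn into a number. Below is a minimal Python sketch of one way to do it, assuming the CI system can export a time-ordered list of status changes. The `red_time` function and the shape of `events` are my own illustration, not something from the talk or from any particular CI tool.

```python
from datetime import datetime, timedelta


def red_time(events, now):
    """Return the total time the pipeline spent red.

    `events` is a list of (timestamp, status) tuples sorted by time, where
    status is "red" or "green". Each status holds until the next event;
    the last one holds until `now`. (Hypothetical data shape.)
    """
    total = timedelta()
    # Pair every event with the next one; the last event is paired with `now`.
    following = events[1:] + [(now, None)]
    for (start, status), (end, _) in zip(events, following):
        if status == "red":
            total += end - start
    return total


events = [
    (datetime(2024, 11, 14, 9, 0), "green"),
    (datetime(2024, 11, 14, 10, 0), "red"),     # red for 45 minutes
    (datetime(2024, 11, 14, 10, 45), "green"),
    (datetime(2024, 11, 14, 15, 0), "red"),     # still red at 15:30
]
print(red_time(events, now=datetime(2024, 11, 14, 15, 30)))  # 1:15:00
```

Tracking this number over weeks shows whether the team treats a red pipeline as something to fix now or something to live with.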

It’s a long list, and there is no single best way to get there. Fortunately, the north star is well-defined - the pipeline should become the sole judge of releasability. For that, it has to be reliable.

Roy suggests moving unreliable checks, and even steps that can’t be properly automated, to a separate discovery pipeline. This keeps the delivery pipeline clean and reliable. I strongly agree with the idea of moving flaky stuff out of the main pipeline until it is ready. It’s ok to split existing test suites in two and migrate individual tests back once they are refined.
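
One way to decide which tests go where is to look at the run history. The sketch below is my own assumption of how such a split could work, not a method from the talk: a test counts as flaky if the same commit produced both a pass and a fail for it. The `split_suites` function and the record format are hypothetical.

```python
from collections import defaultdict


def split_suites(runs):
    """Split tests into (delivery, discovery) lists.

    `runs` is an iterable of (test_name, commit, passed) tuples.
    A test goes to the discovery suite if any single commit has both
    a passing and a failing run of it, i.e. its result is not deterministic.
    """
    runs = list(runs)
    outcomes = defaultdict(set)
    for test, commit, passed in runs:
        outcomes[(test, commit)].add(passed)

    all_tests = {test for test, _, _ in runs}
    flaky = {test for (test, _), seen in outcomes.items() if len(seen) > 1}
    return sorted(all_tests - flaky), sorted(flaky)


runs = [
    ("test_login", "abc", True),
    ("test_login", "abc", True),
    ("test_checkout", "abc", True),
    ("test_checkout", "abc", False),  # same commit, different result: flaky
    ("test_search", "abc", False),    # failed on one commit...
    ("test_search", "def", True),     # ...passed on another: a real fix, not flakiness
]
delivery, discovery = split_suites(runs)
print(delivery)   # ['test_login', 'test_search']
print(discovery)  # ['test_checkout']
```

A test moves back to the delivery suite once its history stops contradicting itself.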

Roy disapproves of yellow pipelines. They are not actionable, thus confusing. Red - we have to fix it now. Green - DEPLOOOOOOOY JENKINS!!! [7]
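
The “no yellow” rule can be expressed as a tiny decision function. This is only a sketch under my own assumptions: the stage statuses (`UNSTABLE`, `SKIPPED`) are illustrative names, not taken from the talk or from a specific CI tool. The point is that anything short of an explicit pass collapses to red.

```python
from enum import Enum


class Stage(Enum):
    PASSED = "passed"
    FAILED = "failed"
    UNSTABLE = "unstable"  # the classic "yellow" result
    SKIPPED = "skipped"


def pipeline_color(stages):
    """Return "green" only if at least one stage ran and every stage passed.

    `stages` maps a stage name to its Stage result. Unstable or skipped
    stages are not actionable on their own, so they count as red.
    """
    if stages and all(result is Stage.PASSED for result in stages.values()):
        return "green"
    return "red"


print(pipeline_color({"build": Stage.PASSED, "test": Stage.PASSED}))    # green
print(pipeline_color({"build": Stage.PASSED, "test": Stage.UNSTABLE}))  # red
```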

I’d like to dig a bit more into the feature toggle topic. Roy suggests that canaries are a useful tool and can be automated in a pipeline. He gives Kayenta as an example of automated metric analysis. This is part of decoupling deployments from releases. With feature toggles and similar runtime configuration technologies, a feature can be released and rolled back independently of a deployment. This decoupling makes canaries, and thus partial deployments, more accessible.
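
To make the decoupling concrete, here is a minimal Python sketch of a percentage-based feature toggle. It is an illustration under my own assumptions, not how Kayenta or any specific toggle service works: the `FeatureToggles` class, the in-memory storage, and the `new_checkout` feature name are all hypothetical. A real system would read the rollout percentage from runtime configuration.

```python
import hashlib


class FeatureToggles:
    """In-memory toggle store with percentage rollouts (hypothetical sketch)."""

    def __init__(self):
        self._rollout = {}  # feature name -> percentage of users (0..100)

    def set_rollout(self, feature, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[feature] = percent

    def is_enabled(self, feature, user_id):
        # Unknown features are off by default.
        percent = self._rollout.get(feature, 0)
        # Hash feature and user into a stable bucket from 0 to 99, so the
        # same user always gets the same answer for a given percentage.
        digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent


toggles = FeatureToggles()

# The code is already deployed; now release it to roughly 10% of users.
toggles.set_rollout("new_checkout", 10)
canary_users = [u for u in range(1000) if toggles.is_enabled("new_checkout", u)]

# Roll back without a new deployment.
toggles.set_rollout("new_checkout", 0)
print(any(toggles.is_enabled("new_checkout", u) for u in range(1000)))  # False
```

Because the bucket for a user never changes, raising the percentage only adds users. Nobody who already sees the feature loses it during a gradual rollout.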

Full Cycle Developer

Full cycle developers are expected to be knowledgeable and effective in all areas of the software life cycle [8]. The term was introduced in a Netflix Technology Blog post, which I recommend reading too.

Roy brings it up because responsibility for the pipeline should be shared. Many organizations suffer from role segregation, where bugs are bounced back and forth because they don’t seem like “our issue”. Understanding the whole pipeline lets you dig deeper yourself and ask the right expert for help with a specific part of it.

Pipeline Driven Organization

A Pipeline Driven Organization operates around the idea that automated pipelines make the important IT decisions [9]. Such organizations practice Cooperative Pipelines.

Summary

There are no new ideas in Cooperative Pipelines, CoOps or Pipeline Driven Organizations. Nevertheless, I find them useful. While CI/CD is lost to semantic diffusion, these relatively new takes are getting straight to the point.

I definitely enjoyed the talk. Roy mentioned writing a book about Pipeline Driven Organizations, but I haven’t found it. There is pipelinedriven.org, but it doesn’t seem to have been active since 2022.

I suggest reading “Modern Software Engineering” [3]. It covers many ways to help one build processes, practices, and a mental model to be Pipeline Driven.


Footnotes

  1. GOTO Conferences. “The Pipeline-Driven Organization • Roy Osherove • GOTO 2022.” YouTube, 6 Feb. 2023, www.youtube.com/watch?v=zmA5fhV-FGk.

  2. Humble, J., & Farley, D. (2010). Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation. Addison-Wesley Professional.

  3. Farley, D. (2021). Modern Software Engineering: Doing What Works to Build Better Software Faster. Addison-Wesley Professional.

  4. Goldratt, E. M., & Cox, J. (2016). The goal: a process of ongoing improvement. Routledge.

  5. Goldratt, E. M. (2017). Critical chain: A business novel. Routledge.

  6. Goldratt, E. M. (2011). Beyond the Goal: Eliyahu Goldratt Speaks on the Theory of Constraints. Gildan Audio.

  7. Dubs, Jamie. “Leeroy Jenkins.” Know Your Meme, 5 Nov. 2024, knowyourmeme.com/memes/leeroy-jenkins.

  8. Blog, Netflix Technology. “Full Cycle Developers at Netflix — Operate What You Build.” Medium, 21 June 2018, netflixtechblog.com/full-cycle-developers-at-netflix-a08c31f83249.

  9. “Pipeline Driven.” Pipeline Driven, 11 July 2022, pipelinedriven.org.