Tuesday, February 27, 2018

It's all about Problem Management

It occurred to me, recently, that all of the articles I've tweeted from Flipboard while drinking my morning coffee fall into one of two categories.

A new product or service is unveiled.  This could be by a new company that is trying to make a name for itself, but just as often it's from an existing company that is trying to elevate itself, either to stay ahead of the competition or to take a threat head-on to avoid being pushed aside.

An existing product or service had a failure of some sort that was discovered and/or revealed.  Frequently these are security breaches, but they can also be failures in the sense that customer satisfaction was significantly impacted, the brand was damaged, etc.

When a company fails to consider their purpose from the perspective of their customers, failures occur.  For example, Equifax failed to secure their environment, which allowed them to become the victim of one of the largest (reported) security breaches in history.  However, on a bigger scale, they seemed to forget that customers (every citizen in the United States) rely on them to provide a credit score so that those customers can purchase things like houses, cars, etc.  Providing that score requires Personally Identifiable Information (PII), which in turn must be properly secured to prevent misuse by thieves.

The point that I'm trying to make with this long introduction is that organizations are the most effective when they realize that they exist to solve problems and then audit every decision through the magnifying glass of "will this help us solve the problem?"  Doing so will also satisfy the desire for revenue, because problem solving is the proverbial "better mousetrap" that will have the world beating a path to your door.

My question is this:  what's the problem that is being solved with Agile development and, more generally, DevOps?  To answer that, let's examine how companies improve their bottom line.  Limiting this to operational cash flow, it happens through the following three activities:

Increase top line revenue.  This occurs when greater numbers of one's product or service are sold; new products or services are developed and sold; or some combination of the two.

Reduce expenses.  Top line revenue generation has a cost associated with it, whether it's the cost of salaries of your sales team; the cost of new product development; the cost of infrastructure required to run the business; etc.

Mitigate risk.  The impact of this is a bit harder to quantify because most companies don't keep metrics on the impact of a production outage, for example.  Anything that impacts the ability of the company to deliver the products or services is a risk to the company because not only is there an opportunity cost associated with it, there are also impacts to the Net Promoter Score and ultimately the brand. These affect future sales, which then affect top line revenue as described above.

Looking at each of these from a DevOps perspective but starting with the problem statement, we get the following:

Problem:  we aren't releasing new features quickly enough.  New features either increase the desirability of our product, resulting in increased sales, or allow our sales reps to do their job more effectively so that they can sell more in the same amount of time.  This is an "increase top line revenue" play.

Problem:  in order to release new features more quickly, we increased the amount of infrastructure and the number of people used to develop and deploy those features at the cadence that we want.  This is a "reduce expenses" play.

Problem:  we've added more infrastructure and people, which has allowed us to increase the release frequency, but because our processes are not fully automated, our Defects Per Million Opportunities have substantially increased too.  As a result, we have more errors caused by failed change and release process executions, which both costs us revenue and erodes our customer base.  This is a "mitigate risk" play.
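To make the Defects Per Million Opportunities metric concrete, here is a minimal sketch of the standard calculation: defects divided by total opportunities (units times opportunities per unit), scaled to one million.  The function name, the choice of a "release" as the unit, and all of the numbers are hypothetical and purely for illustration.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects Per Million Opportunities for a process.

    units: how many times the process ran (here, assumed to be releases)
    opportunities_per_unit: steps in each run that could fail
    """
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity for a defect")
    return defects * 1_000_000 / total_opportunities


# Hypothetical numbers: the release count triples, but failures quadruple.
before = dpmo(defects=3, units=40, opportunities_per_unit=25)
after = dpmo(defects=12, units=120, opportunities_per_unit=25)
print(f"before: {before:.0f} DPMO, after: {after:.0f} DPMO")
```

With these made-up figures the release frequency goes up, yet the defect rate per opportunity goes up as well, which is the situation the third problem statement describes.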

Keeping these problems up front, instead of allowing them to get lost as solutions start to be discussed, will allow you to select the best solution for each type of problem.  One of the biggest areas where I've seen the problem get lost is automation, which is a key component of any DevOps strategy.  During a discussion I had recently, someone commented that no company has an "automation center of excellence," to which I replied that "automation" by itself is a concept and not a specific discipline per se around which you can establish a center of excellence.  So why, then, do companies think that any automation will work when a specific type of automation is needed?

For example, many companies are flocking to point solutions (e.g. Jenkins, Chef, Ansible, etc.) to solve their need for Continuous Delivery.  They are quick to ignore the fact that each of these point solutions solves one specific, narrowly focused part of the CI/CD problem.  Instead of expanding the scope of their search, they try to force the solution to do more than it was intended to do.

Am I suggesting that these point solutions have no place in CI/CD?  Hardly.  But the right solution has to be chosen for the problem at hand if it is to have its intended effect.  Otherwise, you may eventually solve one problem (e.g. increasing top line revenue) while creating another (e.g. increasing the cost of doing so due to the higher TCO of the selected solution).