<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.9.0">Jekyll</generator><link href="https://wghilliard.github.io//feed.xml" rel="self" type="application/atom+xml" /><link href="https://wghilliard.github.io//" rel="alternate" type="text/html" /><updated>2020-11-21T13:23:02+00:00</updated><id>https://wghilliard.github.io//feed.xml</id><title type="html">Project: {{catchy_title}}</title><subtitle>Welcome to the wonderful world of \{\{catchy_title\}\}! This is a place where I talk about cool stuff I find and projects I'm working on. Topics might range from languages to frameworks to development practices to philosophy and style.</subtitle><entry><title type="html">Opinion - Testing Miniseries - Part One - The (Formulated) Cost of Inadequate Testing</title><link href="https://wghilliard.github.io//opinion/development/2020/11/21/testing-miniseries-part-one.html" rel="alternate" type="text/html" title="Opinion - Testing Miniseries - Part One - The (Formulated) Cost of Inadequate Testing" /><published>2020-11-21T00:00:00+00:00</published><updated>2020-11-21T00:00:00+00:00</updated><id>https://wghilliard.github.io//opinion/development/2020/11/21/testing-miniseries-part-one</id><content type="html" xml:base="https://wghilliard.github.io//opinion/development/2020/11/21/testing-miniseries-part-one.html">&lt;p&gt;WIP Disclaimer - All content is subject to change!&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;I’ve decided to break my initial attempt at discussing &lt;em&gt;The Cost of Inadequate Testing&lt;/em&gt; into a miniseries of articles. Welcome to Part One.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;I often find that the value of testing is difficult to measure; the cost of inadequate testing, however, is quantifiable.&lt;/p&gt;

&lt;p&gt;There are many projects out there, all with constraints and caveats that make test strategies arguably unique. One might claim that common layers of the stack can be addressed in a common manner. For example, modern HTTP web servers often include a testing framework, and special runtime environments like Android or iOS have frameworks that provide conventional patterns to set up the environment and make assertions.&lt;/p&gt;

&lt;p&gt;However, something that often goes unconsidered is the cost of the tests written and the value they provide.&lt;/p&gt;

&lt;h2 id=&quot;formulation-and-terms&quot;&gt;Formulation and Terms&lt;/h2&gt;

&lt;p&gt;In this article I shall attempt to formulate the perceived cost to develop a &lt;strong&gt;Product&lt;/strong&gt; over its entire lifecycle. Terms are in &lt;strong&gt;bold&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For this thought experiment, we’ll express costs in the units of “developer hours”, the trendier version of a “man hour”.&lt;/p&gt;

&lt;p&gt;When we build a &lt;strong&gt;Product&lt;/strong&gt;, we can think of it as a composition of an environment (&lt;strong&gt;Infrastructure&lt;/strong&gt;), some business logic (&lt;strong&gt;Features&lt;/strong&gt;), and undetected bugs + refactoring (&lt;strong&gt;Maintenance&lt;/strong&gt;).&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Product = Infrastructure + Features + Maintenance&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Caveat: &lt;strong&gt;Infrastructure&lt;/strong&gt; is a topic for another day, for this exercise we can pretend that if the code builds in the Continuous Integration pipeline, then it performs (as authored) in the production environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Maintenance&lt;/strong&gt; encapsulates both updating the business logic as the &lt;strong&gt;Product&lt;/strong&gt; (and our understanding of the business value) evolves and addressing defects as they are found. Let’s decompose &lt;strong&gt;Maintenance&lt;/strong&gt; into &lt;strong&gt;Defects&lt;/strong&gt; and &lt;strong&gt;Business Logic Refactoring&lt;/strong&gt;, where we treat &lt;strong&gt;Business Logic Refactoring&lt;/strong&gt; as a constant cost and therefore omit it from the later formulas to reduce complexity. (If you believe this to be heresy, feel free to email me and tell me why you think I’m wrong!)&lt;/p&gt;

&lt;p&gt;Next, let’s zoom in on the &lt;strong&gt;Features&lt;/strong&gt; Term. Each &lt;strong&gt;Feature&lt;/strong&gt; needs to be described, designed, built, and updated as the product evolves. Easy enough!&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Feature = Design + Implementation&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Each of the above terms could be expanded further, but for now let’s unpack the &lt;strong&gt;Implementation&lt;/strong&gt; term. &lt;strong&gt;Implementation&lt;/strong&gt; is best described as writing the code, bringing the design into the world of the living.&lt;/p&gt;

&lt;p&gt;Lastly, let’s assume the &lt;strong&gt;Product&lt;/strong&gt; exists on a timeline, where any given point on the timeline can be referred to as &lt;strong&gt;T&lt;/strong&gt; (or &lt;strong&gt;t&lt;/strong&gt;) and &lt;strong&gt;Product(T)&lt;/strong&gt; is the cost of the &lt;strong&gt;Product&lt;/strong&gt; at time &lt;strong&gt;T&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And we’ll pretend we can deliver a new feature at each increment of &lt;strong&gt;T&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;E&lt;/strong&gt; is the sigma operation, so something like &lt;strong&gt;E(T=0, Now) {Foo(T)}&lt;/strong&gt; describes the sum of &lt;strong&gt;Foo&lt;/strong&gt; for every value of &lt;strong&gt;T&lt;/strong&gt; between &lt;strong&gt;0&lt;/strong&gt; and &lt;strong&gt;Now&lt;/strong&gt;. For example:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;E(T=0, 4) {2 * T} = 0 + 2 + 4 + 6 + 8 = 20&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The left side term will be the &lt;em&gt;total&lt;/em&gt; cost of the Product, with each iteration of the formula named after a letter of the Greek alphabet.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;$Alpha = Product(Now) = E(T=0, Now) {Infrastructure(T) + Feature(T) + Defects(T)}&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Okay, we’ve mostly defined the problem space, but we haven’t considered where &lt;strong&gt;Testing&lt;/strong&gt; should go. Let’s add it to the formula by weighting each Term with a coefficient.&lt;/p&gt;

&lt;p&gt;If we include sufficient &lt;strong&gt;Testing&lt;/strong&gt; with each &lt;strong&gt;Feature&lt;/strong&gt;, we can assume our &lt;strong&gt;Defects&lt;/strong&gt; will be reduced by say 50% and our cost to develop each &lt;strong&gt;Feature&lt;/strong&gt; is increased by 100%. (Some case studies of TDD claim the number of lines of code for tests and business logic is equal!)&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;$Beta = Product(Now) = E(T=0, Now) {Infrastructure(T) + Feature(T) * 2 + Defects(T) / 2 }&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If we assume addressing &lt;strong&gt;Defects&lt;/strong&gt; is significantly cheaper than developing &lt;strong&gt;Features&lt;/strong&gt; and writing tests, &lt;em&gt;$Beta&lt;/em&gt; would be significantly more expensive than &lt;em&gt;$Alpha&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;However, if we assume inadequate &lt;strong&gt;Testing&lt;/strong&gt; is occurring, then we might reduce the &lt;strong&gt;Feature&lt;/strong&gt; cost to the original in &lt;em&gt;$Alpha&lt;/em&gt; and rely on our Customers and Support Tickets to determine when &lt;strong&gt;Defects&lt;/strong&gt; were introduced to the system. We’ll adjust &lt;em&gt;$Alpha&lt;/em&gt; to reflect this and rename it as &lt;em&gt;$Gamma&lt;/em&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;$Gamma = Product(Now) = E(T=0, Now) {Infrastructure(T) + Feature(T) + Defects(T) * 2 }&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So far &lt;em&gt;$Gamma&lt;/em&gt; is the cheapest way to develop software! We should just never test until we find a &lt;strong&gt;Defect&lt;/strong&gt; in the field!&lt;/p&gt;

&lt;p&gt;However, we haven’t accounted for the compounding technical debt and increased development time due to an &lt;em&gt;unstable and non-assertable&lt;/em&gt; codebase.&lt;/p&gt;

&lt;p&gt;If we include the cost for context switching and debugging a misbehaving application that does not have a ground truth assertion to work with, a revised version of the formula might look like:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;$Delta = Product(Now) = E(T=0, Now) {Infrastructure(T) + Feature(T) + Feature(T-1) / 2 + Defects(T) * 2}&lt;/p&gt;
&lt;/blockquote&gt;

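&lt;p&gt;Oh no!!! We’re paying more than full price for a &lt;strong&gt;Feature&lt;/strong&gt;! And we’ve got twice the &lt;strong&gt;Defects&lt;/strong&gt;!! It’s still cheaper than &lt;em&gt;$Beta&lt;/em&gt; though, right? If every &lt;strong&gt;Feature&lt;/strong&gt; costs roughly the same, so that &lt;strong&gt;Feature(T-1)&lt;/strong&gt; is about equal to &lt;strong&gt;Feature(T)&lt;/strong&gt;, the formula simplifies to:&lt;/p&gt;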

&lt;blockquote&gt;
  &lt;p&gt;$Delta = Product(Now) = E(T=0, Now) {Infrastructure(T) + Feature(T) * 1.5 + Defects(T) * 2}&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Lastly, let’s consider the situation where &lt;strong&gt;Features&lt;/strong&gt; rely upon more than the previous &lt;strong&gt;Feature&lt;/strong&gt;, and in the worst case &lt;strong&gt;Feature(T)&lt;/strong&gt; relies upon &lt;strong&gt;Feature(0…T-1)&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;$Eta = Product(Now) = E(T=0, Now) {Infrastructure(T) + Feature(T) + E(t=0, T-1) {Feature(t) / 2} + Defects(T) * 2}&lt;/p&gt;
&lt;/blockquote&gt;

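&lt;p&gt;Now we’re really in trouble: every new &lt;strong&gt;Feature&lt;/strong&gt; pays a share of every &lt;strong&gt;Feature&lt;/strong&gt; before it, so the total cost grows quadratically with the number of &lt;strong&gt;Features&lt;/strong&gt; instead of linearly.&lt;/p&gt;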

&lt;p&gt;We could add more coefficients to account for the &lt;strong&gt;Defects&lt;/strong&gt; introduced while maintaining the previous &lt;strong&gt;Features&lt;/strong&gt;, but I think the point has already been made: cost model &lt;em&gt;$Beta&lt;/em&gt; is cheaper in the long run.&lt;/p&gt;

&lt;p&gt;With cost model &lt;em&gt;$Eta&lt;/em&gt;, the technical debt will quickly overwhelm us and soon we won’t be able to make the minimum payment and our &lt;strong&gt;Feature&lt;/strong&gt; development will grind to a halt.&lt;/p&gt;
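&lt;p&gt;To make the comparison concrete, here is a minimal sketch of the cost models in Python. The per-increment costs (1 developer hour of Infrastructure, 10 of Feature work, 2 of Defects) are hypothetical numbers chosen purely for illustration, and the &lt;em&gt;$Delta&lt;/em&gt; model uses the simplified form that assumes every Feature costs the same.&lt;/p&gt;

```python
# Hypothetical per-increment costs in developer hours (illustrative assumptions only).
INFRASTRUCTURE = 1.0
FEATURE = 10.0
DEFECTS = 2.0


def sigma(start, stop, term):
    """The E operation: sum term(T) for every integer T from start to stop inclusive."""
    return sum(term(t) for t in range(start, stop + 1))


def alpha(now):
    # Baseline: testing is not considered at all.
    return sigma(0, now, lambda t: INFRASTRUCTURE + FEATURE + DEFECTS)


def beta(now):
    # Sufficient testing: Features cost double, Defects are halved.
    return sigma(0, now, lambda t: INFRASTRUCTURE + FEATURE * 2 + DEFECTS / 2)


def gamma(now):
    # Inadequate testing: Features at the original cost, Defects doubled.
    return sigma(0, now, lambda t: INFRASTRUCTURE + FEATURE + DEFECTS * 2)


def delta(now):
    # Gamma plus half of the previous Feature, using the simplified 1.5 coefficient.
    return sigma(0, now, lambda t: INFRASTRUCTURE + FEATURE * 1.5 + DEFECTS * 2)


def eta(now):
    # Worst case: each Feature pays half of every Feature that came before it.
    def term(t):
        rework = sigma(0, t - 1, lambda _: FEATURE / 2)
        return INFRASTRUCTURE + FEATURE + rework + DEFECTS * 2

    return sigma(0, now, term)


for now in (1, 2, 3, 5, 10, 20):
    print(now, alpha(now), beta(now), gamma(now), delta(now), eta(now))
```

&lt;p&gt;With these made-up numbers, &lt;em&gt;$Eta&lt;/em&gt; is cheaper than &lt;em&gt;$Beta&lt;/em&gt; up to Now = 2 and more expensive from Now = 3 onward, because the nested sum grows with the square of Now while &lt;em&gt;$Beta&lt;/em&gt; grows linearly. Different costs move the crossover point, but not the shape of the curves.&lt;/p&gt;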

&lt;p&gt;The crux of the problem is that some teams treat testing as an afterthought. The code is written, blessed, shipped, and tested, in that order, which only seems cost effective if you don’t look at the entire picture of the &lt;strong&gt;Product&lt;/strong&gt; over its lifecycle.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;My claim is that the &lt;em&gt;true&lt;/em&gt; cost to develop the &lt;strong&gt;Product&lt;/strong&gt; is cheaper if &lt;strong&gt;Testing&lt;/strong&gt; is done when the &lt;strong&gt;Feature&lt;/strong&gt; is written. Defects never go away for free, which means you’ll either be catching them while the &lt;strong&gt;Feature&lt;/strong&gt; is in context or after it has been shipped and new &lt;strong&gt;Features&lt;/strong&gt; are added to the dependency graph.&lt;/p&gt;

&lt;p&gt;I’ve witnessed laborious human-based rituals performed on release candidates by teams of developers taking days (and even in some cases weeks) such that it may be proclaimed from the mountain tops to be defect-free. I will neglect to detail the frequency at which I’ve witnessed these rituals of sacred tribal knowledge be frantically repeated because a defect was introduced during the previous iteration of the ritual.&lt;/p&gt;

&lt;p&gt;When one steps back, one must wonder “surely there is a better way, this can’t be &lt;em&gt;the right way&lt;/em&gt;, can it?” All we really want at the end of the day is to have confidence in our code and to avoid being called to address an outage in production on Friday at 7PM after having a beer.&lt;/p&gt;

&lt;p&gt;In the case that my example of rituals and proclamations is too remote to identify with, I would challenge the reader to recall a time when they were writing code (whether it be fixing a bug, adding a new feature, or just messing around) and they had to conduct some arcane process to assert that the characters they just typed didn’t cause the product to implode.&lt;/p&gt;

&lt;p&gt;That feeling of discontent, of inconvenience, is the subtle cost of inadequate testing.&lt;/p&gt;

&lt;p&gt;In &lt;em&gt;Part Two&lt;/em&gt; of this series I’ll discuss ways to determine the value of a test, where to start on both new and not-new projects, and will provide a practical example of TDD in action.&lt;/p&gt;

&lt;p&gt;If you found this post helpful or would like to fuel my caffeine addiction, &lt;a href=&quot;https://ko-fi.com/wghilliard&quot;&gt;consider donating.&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><category term="opinion" /><category term="development" /><summary type="html">WIP Disclaimer - All content is subject to change!</summary></entry><entry><title type="html">Opinion - The Cost of Inadequate Testing</title><link href="https://wghilliard.github.io//opinion/development/2020/11/12/the-cost-of-inadequate-testing.html" rel="alternate" type="text/html" title="Opinion - The Cost of Inadequate Testing" /><published>2020-11-12T00:00:00+00:00</published><updated>2020-11-12T00:00:00+00:00</updated><id>https://wghilliard.github.io//opinion/development/2020/11/12/the-cost-of-inadequate-testing</id><content type="html" xml:base="https://wghilliard.github.io//opinion/development/2020/11/12/the-cost-of-inadequate-testing.html">&lt;p&gt;WIP Disclaimer - All content is subject to change!&lt;/p&gt;

&lt;hr /&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# The Cost of Inadequate Testing

I often find that the value of testing is difficult to measure; the cost of inadequate testing, however, is quite pronounced.

There are many projects out there, all with constraints and caveats that make test strategies arguably unique. One might claim that common layers of the stack can be addressed in a common manner. For example, modern HTTP web servers often include a testing framework, and special runtime environments like Android or iOS have frameworks that provide conventional patterns to set up the environment and make assertions.

However, something that often goes unconsidered is the cost of the tests written and the value they provide.

For instance, one might invest a substantial amount of time writing extensive tests to ensure line (and branch) coverage of a given component, only to find bugs appear in production in adjacent components, or worse, the component is refactored and the tests become obsolete.

In this situation, someone like Kent Beck (author of Test-Driven Development by Example) would argue that the tests were flawed from the start because changes in implementation details should not affect the tests. I'd allege he'd go on to say that the production bugs would have been caught if time was spent adding more contract tests to the adjacent components instead of testing implementation details, or if TDD was used from the start.

However, it seems like the other extreme is more common: testing as an afterthought. The code is written, blessed, shipped, and tested. In that order.

I've witnessed laborious human-based rituals performed on release candidates by teams of developers taking days (and even in some cases weeks) such that it may be proclaimed from the mountain tops to be bug-free. I will neglect to detail the frequency at which I've witnessed these rituals of sacred tribal knowledge be frantically repeated because a bug was introduced during the previous iteration of the ritual.

When one steps back, one must wonder &quot;surely there is a better way, this can't be _the right way_, can it?&quot; All we really want at the end of the day is to have confidence in our code and to avoid being called to address an outage in production on Friday at 7PM after having a beer.

In the case that my example of rituals and proclamations is too remote to identify with, I would challenge the reader to recall a time when they were writing code (whether it be fixing a bug, adding a new feature, or just messing around) and they had to conduct some arcane process to assert that the characters they just typed didn't cause the product to implode.

That feeling of discontent, of inconvenience, is the subtle cost of inadequate testing.

## Risk

An old adage in software goes something like &quot;make the common case fast&quot;, however it's probably worth adding &quot;and stable&quot;. One could interpret this amendment in a few different ways.

### Only test the code that gets called 80% of the time

One way of assessing risk is understanding the common code paths that your product uses and writing tests to cover them. Writing Sock Shop? Then unit / integration / smoke / ui tests should ensure that the user can add socks to their cart.

### Only test the happy path

When writing a new interface, it's the perfect opportunity to add happy path tests to assert that the defined contract is satisfied by the implementing classes. If things like index errors are never really encountered because you're using functional programming techniques, then maybe that IndexOutOfBounds test isn't very valuable.
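
As a rough illustration of a happy path contract test, here is a minimal Python sketch. The `Cart` interface, the `InMemoryCart` class, and the `check_cart_contract` helper are all hypothetical names invented for this example; the idea is only that one check can be reused against every implementation of the interface.

```python
from abc import ABC, abstractmethod


class Cart(ABC):
    """Hypothetical interface for a Sock Shop cart."""

    @abstractmethod
    def add(self, item, quantity):
        ...

    @abstractmethod
    def count(self, item):
        ...


class InMemoryCart(Cart):
    """One possible implementation, backed by a plain dictionary."""

    def __init__(self):
        self._items = {}

    def add(self, item, quantity):
        self._items[item] = self._items.get(item, 0) + quantity

    def count(self, item):
        return self._items.get(item, 0)


def check_cart_contract(make_cart):
    """Happy path contract: items added to a cart can be counted afterwards."""
    cart = make_cart()
    cart.add("socks", 2)
    cart.add("socks", 1)
    assert cart.count("socks") == 3
    assert cart.count("hat") == 0


# Any implementation of Cart can be passed through the same check.
check_cart_contract(InMemoryCart)
```

The check only touches the public interface, so swapping the dictionary for a database-backed cart should not require rewriting it.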

### Keep the dev loop __fast__

I prefer this interpretation to the others because it promotes the notion that confidence in one's product should be assert-able at a moment's notice. Imagine being on a tech support call with a customer and DM'ing a developer to ask &quot;are null values allowed in the XYZ table?&quot; and the developer responds with, &quot;brb I need to deploy the product to staging to find out&quot;. Scenarios like this devalue the time of the developer, degrade the trust of the customer, and are expensive in terms of opportunity cost for the company.

I believe, as a developer, the less time I spend mind-numbingly pulling levers and turning dials, the better.

Some may claim, &quot;But Grayson! My situation is different! My product runs in an environment that is non-conducive to testing and I can't afford to invest in building a custom test harness!&quot; This line of thinking fails to consider the long term maintenance costs of the product and the human nature of developers. If they must conduct a monotonous sequence of actions in order to assert the quality of the features, one will soon find that cost estimates for features now must account for not only the cost of manually testing the old features __but also__ the cost of manually testing the new features. By not investing in a test harness, one exposes themselves to substantial risk of linearly increasing development and maintenance costs. Oh, and developer burnout. Do you _really_ want to type that default password into the login page? Or click through that first-time-experience dialog?

It's worth noting that development costs may not be affected if engineers find the manual testing strategy too laborious and omit it from the dev loop entirely. One may find the total cost has not changed, but rather the development cost has merely shifted to the maintenance column due to the increased rate of production bugs introduced when untested code gets shipped.

Lastly, the risk of shipping test-able bugs is proportional to the number of manual test sequences, meaning the longer one waits - the worse it gets.

## Creating New Things

Though old habits and lethargy sometimes make it challenging, I find myself often attempting to write tests _before_ writing the feature code. Yes, yes, Kent would be proud that I'm drinking the punch. However, I would invite you to consider the situation of &quot;The Unwritten Interface&quot;:

There are moments when building a product that a developer gets to create something new, to conjure something from thin air. In the land of test cases, the canvas is blank, the palette undefined, and the brush in hand. In this realm, the developer is free to dream and to unleash their abundant creativity in order to craft something worthy of sacrificing on the altar of all that is good and beautiful.

There's a small caveat - whatever is good and beautiful may not compile, but for now that's okay.

Techniques like Test-Driven Development (TDD) give the developer room to run and experiment with what feels and looks right. Unconstrained by dependencies, control flow, or the call stack.

In order to get things to compile, whatever hasn't been defined can be stubbed and dependencies can be mocked. This freedom allows one to broaden their perspective of the problem in order to arrive at a solution without being burdened by the blinders of convention.

Once the solution is found and the test result turns green, one can start to replace mocks and stubs with real implementations in an incremental fashion such that the nature of the interface is preserved.

Then once the base case is working, they may move on to exercising the interface further, adding test cases and refactoring the design as appropriate.

In comparison, one could also design the interface _in the codebase_ in the land of _what is_ and _what works_, however they run the risk of over-fitting the solution to the problem and their time spent designing may be doubled if they failed to make the interface testable and therefore in need of a refactor when the tests are eventually written.

## Conclusion

Though the upfront cost of testing a product may seem expensive, it is important to consider the installments paid every time _that one feature_ needs to be worked on or a bug is found in production shortly after releasing. Having confidence in the product and fast dev loops is sometimes worth the overhead of investing in tests and test-driven development.

_If it means there's a better chance one won't get called at 7PM on a Friday after having a beer, it's worth trying right?_

## Further Reading

- Test-Driven Development By Example - Kent Beck

If you found this post helpful or would like to fuel my caffeine addiction, [consider donating.](https://ko-fi.com/wghilliard)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;</content><author><name></name></author><category term="opinion" /><category term="development" /><summary type="html">WIP Disclaimer - All content is subject to change!</summary></entry><entry><title type="html">Book Review - The DevOps Handbook</title><link href="https://wghilliard.github.io//books/development/2020/01/12/the-devops-handbook.html" rel="alternate" type="text/html" title="Book Review - The DevOps Handbook" /><published>2020-01-12T00:00:00+00:00</published><updated>2020-01-12T00:00:00+00:00</updated><id>https://wghilliard.github.io//books/development/2020/01/12/the-devops-handbook</id><content type="html" xml:base="https://wghilliard.github.io//books/development/2020/01/12/the-devops-handbook.html">&lt;p&gt;WIP Disclaimer - All content is subject to change!&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;At one point in time, you could have asked me, “Grayson,
how would you measure quality of a software product?” and I would have
most likely answered with, “By how fast / accurate the code is!”&lt;/p&gt;

&lt;p&gt;After a few years of working in various industry
environments, I have become familiar with ways in which
a team might approach developing software. Sometimes
they focus on getting the MVP completed as fast as
possible, sometimes they want an MVP as cheap as
possible, and sometimes they want an MVP as good as
possible. However, it never occurred to me that “good” was
really a component of cheap AND fast. It wasn’t until I
began reading &lt;a href=&quot;https://www.amazon.com/dp/B01M9ASFQ3&quot;&gt;“The DevOps Handbook”&lt;/a&gt;
that I realized that “good” is often thought of as turning the “quality”
dial up &lt;a href=&quot;https://www.youtube.com/watch?v=s9F5fhJQo34&quot;&gt;to 11&lt;/a&gt;, where some algorithm
performs faster or the product is more feature complete.
However, I seldom remember being asked the
question “How do you know what you’ve built is good?”&lt;/p&gt;

&lt;p&gt;Such a question could be interpreted as an accusation in
some settings, with one’s rapport being challenged. In other settings it might be
simply answered with slides and graphs depicting the product’s performance over
previous iterations or implementations. However another interpretation would
consider how &lt;em&gt;trustworthy&lt;/em&gt; the product is, an incredibly important
attribute that I now believe is &lt;em&gt;wildly&lt;/em&gt; underrated.&lt;/p&gt;

&lt;p&gt;The DevOps Handbook could be potentially summarized with
the following sentence:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;One cannot claim quality unless it is regularly asserted.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These assertions can be executed in multiple ways:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Automated Testing&lt;/li&gt;
  &lt;li&gt;Production Telemetry&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;automated-testing&quot;&gt;Automated Testing&lt;/h3&gt;

&lt;p&gt;Automated testing allows an individual to claim with authority
that the thing they have built is &lt;em&gt;trustworthy&lt;/em&gt; and has &lt;em&gt;quality&lt;/em&gt;.
An argument could be made that automated tests could be written in
such a manner that they are effectively meaningless, but I would
challenge the reader, for the sake of this post, to assume that
an individual that was practicing this methodology
put in the necessary effort to write meaningful and complete &amp;lt; unit | integration | end-to-end &amp;gt; tests. (1)&lt;/p&gt;

&lt;p&gt;The DevOps Handbook claims that such automated testing allows for
value streams to produce meaningful quality-metrics and gives teams
a litmus test to determine if their development process is safe. These
tests would largely depend on the domain and architecture of the
product in question, but if done correctly they decrease risk when
releasing the product by revealing bugs early on, facilitate higher functioning
workflows (measured by lead time), and give external (and internal) customers
&lt;em&gt;faith&lt;/em&gt; that the product will work as intended when it is delivered.&lt;/p&gt;

&lt;p&gt;Some environments see an investment in automated testing as a &lt;em&gt;nice to have&lt;/em&gt;,
almost as if it were a feature to be implemented in the next major release. However,
this perspective trivializes the costs associated with last minute refactoring,
emergency hot fixes, and most importantly - brand degradation. A deficit of
automated testing leads to the product accruing a compounding mountain of technical
debt resulting in a codebase so fragile and mystifying that even senior engineers
might wonder how the product even worked in the first place. Spooky. (2)&lt;/p&gt;

&lt;p&gt;So, enough with the war stories, let’s pretend we’ve got automated testing!
Depending on the VCS flow the team is using, “quality gates” can be constructed
that give the team (and all other downstream consumers) &lt;strong&gt;faith&lt;/strong&gt; that the product
will do everything it claims to do. (3) These gates can be arranged such that the
intensity and scope increase with each level as described below:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Gate Number&lt;/th&gt;
      &lt;th&gt;Examples&lt;/th&gt;
      &lt;th&gt;Run Time&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;type checking, linting&lt;/td&gt;
      &lt;td&gt;milliseconds to seconds&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;unit tests - known IO, light mocking, accurate&lt;/td&gt;
      &lt;td&gt;seconds to minutes&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;integration tests - module composition, heavy mocking, behavioral&lt;/td&gt;
      &lt;td&gt;minutes to hours&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;end-to-end tests - no mocking, “real” services, api or ui driven, small data sets&lt;/td&gt;
      &lt;td&gt;minutes to hours&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt;performance tests - similar to end-to-end but with larger data sets&lt;/td&gt;
      &lt;td&gt;minutes to hours&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;6&lt;/td&gt;
      &lt;td&gt;manual testing - human + checklist driven&lt;/td&gt;
      &lt;td&gt;minutes (4)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The above gates are just a guideline, it may make sense to mix and match for a given product due to the domain, team, technologies, etc.&lt;/p&gt;
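&lt;p&gt;As a rough sketch of how such gates might be chained, the Python below runs the cheapest gate first and stops at the first failure. The &lt;code&gt;run_gates&lt;/code&gt; helper and the stand-in checks are hypothetical; in a real pipeline each check would invoke the team’s actual linter or test runner.&lt;/p&gt;

```python
def run_gates(gates):
    """Run (name, check) pairs in order; return the first failing gate name, or None."""
    for name, check in gates:
        if not check():
            # Fail fast so the slower, more expensive gates are never started.
            return name
    return None


# Hypothetical stand-ins for real tools, ordered from fastest to slowest.
gates = [
    ("type checking, linting", lambda: True),
    ("unit tests", lambda: True),
    ("integration tests", lambda: False),  # pretend this gate fails
    ("end-to-end tests", lambda: True),
]

failed = run_gates(gates)
print("all gates passed" if failed is None else "stopped at: " + failed)
```

&lt;p&gt;Ordering the gates by run time keeps the feedback loop short: a failure that a linter can catch is reported before anything slower is started.&lt;/p&gt;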

&lt;h3 id=&quot;production-telemetry&quot;&gt;Production Telemetry&lt;/h3&gt;

&lt;p&gt;Production telemetry helps answer the question “Is what we delivered providing value to
the customer?” Without a mechanism to determine if a feature is providing value to the
customer in an objective fashion, developers (and UX, QA, etc.) are left with anecdotes
to describe the customer’s satisfaction or dissatisfaction. Naturally, only the extremes of
the feedback are reported and the development is guided by the outliers. Should time and
energy be allocated to rebuilding / improving a certain feature? How do we know that the
investment will provide value if we have no way to determine if our customers find that
feature valuable? Moreover, the team is not empowered to experiment because it is
arguably impossible to determine if a given change will improve or disrupt a customer’s
workflow within the product. Therefore, instrumenting the product will give the
development team visibility into how the product is being used, and will objectively
assert whether or not value is being provided. (5)&lt;/p&gt;

&lt;p&gt;Some examples of frameworks that facilitate telemetry include:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://openmetrics.io/&quot;&gt;OpenMetrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://opentracing.io/&quot;&gt;OpenTracing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.elastic.co/what-is/elk-stack&quot;&gt;ELK&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.microsoft.com/en-us/azure/azure-monitor/app/usage-flows&quot;&gt;Application Insights on Azure&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://developers.google.com/analytics&quot;&gt;Google Analytics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;so many many more!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Footnotes:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Some real-world architectures and brown-field projects make certain types of
testing a seemingly insurmountable task.&lt;/li&gt;
  &lt;li&gt;If there are no tests to assert quality / desired behavior and a “bug” is found, is it really a bug?&lt;/li&gt;
  &lt;li&gt;Or at least what the tests claim the product can do.&lt;/li&gt;
  &lt;li&gt;Ideally human testing is reserved for edge cases and difficult to reproduce scenarios, but this isn’t always the case.&lt;/li&gt;
  &lt;li&gt;Sometimes it’s not possible to derive a metric from aesthetic things, e.g. font or
page layout. In this case A/B testing can be used to help answer questions about qualitative features.&lt;/li&gt;
&lt;/ol&gt;</content><author><name></name></author><category term="books" /><category term="development" /><summary type="html">WIP Disclaimer - All content is subject to change!</summary></entry></feed>