The test automation pyramid. Should we take it seriously?

The current state of the software development industry requires developers to deliver code fast, adapting the product to market needs. But this mission is often hindered by testing when it becomes a lengthy part of the process. As a software product grows in size and complexity, testing can seriously slow down code delivery. The question, therefore, is which tests we should prioritise for automation. A practical guideline offered in the testing industry in the early 2000s is the test automation pyramid.

The test automation pyramid

The test automation pyramid is a framework that summarises which types of tests should be automated and in what proportion, so that developers can more easily identify which change is causing the code to crash.

The purpose behind this framework is to prevent existing, working features from breaking after new code is implemented. To achieve this, the test automation pyramid distributes tests across three layers: unit tests, service tests, and UI tests.

The base: Unit testing

Unit testing checks that isolated software components work as expected. What is a unit? In programming, a unit is, broadly speaking, the smallest testable part of a program. In object-oriented programming, for example, that is usually a method.

Unit test automation belongs to the realm of white-box testing, since the automation engineer has full knowledge of the structure of the code. Because unit testing covers the smallest chunks of the code, the number of tests tends to be largest in this layer. This is why it is depicted as the base of the pyramid and as its biggest layer. These kinds of tests can be easily implemented by developers, since they are usually written in the same language as the software under test.

Unit testing is the foundation of test automation. One reason to approach automation this way is that, by making sure the code is correct at its finest level of granularity, we prevent bugs that would otherwise go unnoticed for a long time because they rarely manifest themselves in dynamic runs.

Moreover, test failures at this level offer developers a more detailed “post-mortem” snapshot of the state of the code. If you are coding a web scraper, you would probably prefer to be alerted to specific bugs in your functions rather than to the data that your code did not manage to retrieve. Automated unit tests tell you exactly where in the code the flaws are located, thereby speeding up bug fixing. Good unit testing saves developers a lot of headaches by the time QA engineers report bugs at the services and UI levels, because we can rest assured that the smallest parts of the code work as expected.
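To make the web scraper example concrete, here is a minimal sketch of what a unit test at this level could look like. The function `parse_price` and the input strings are hypothetical and only serve as an illustration; the tests are written as plain functions with `assert` so that a runner such as pytest could collect them, but that choice is an assumption, not a requirement of the pyramid.

```python
def parse_price(text):
    """Convert a scraped price string such as '$1,299.50' to a float.

    Hypothetical helper: a real scraper would have its own parsing rules.
    """
    cleaned = text.strip().replace("$", "").replace(",", "")
    if not cleaned:
        raise ValueError("empty price string")
    return float(cleaned)


def test_plain_number():
    assert parse_price("19.99") == 19.99


def test_currency_symbol_and_thousands_separator():
    assert parse_price(" $1,299.50 ") == 1299.5


def test_empty_string_is_rejected():
    # An empty cell on the page should fail loudly, not become 0.0.
    try:
        parse_price("   ")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an empty price string")
```

If one of these tests fails, the failure names a single function and a single input, which is exactly the kind of detail a missing row in the scraped data would not give you.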

The second floor: Services testing

Often referred to as the integration testing layer, service tests verify that the small units of code interact with each other correctly. The aim here evolves from checking the correctness of code units to assuring that a feature communicates well with external services. 

Having said that, in my humble opinion, the name “integration testing layer” is prone to cause confusion. Many people in the software industry claim that this layer is concerned with integration testing. As I mentioned before, integration testing verifies the proper interaction between components of the product. However, the way the automation pyramid is laid out seems to suggest that integration testing excludes the UI layer. Yet integration testing is obviously perfectly feasible at the UI layer: we can test the integration between the front end of a web app and its APIs. To my knowledge, there is no 100% accurate term to delimit this layer. So, for the sake of clarity, I will treat it as integration testing for services.

Here we enter the fabulous realm of black-box testing because the object of testing is no longer the lines of code but the interaction between different instances of the product. This kind of testing, therefore, implies longer runs compared to unit tests because more parts of a system are involved. 

Services test automation targets calls to APIs and the business logic. Since we test services, we could say that on this floor of the pyramid we are more concerned with data processing and architectural structures. One note to make is that, although many automation engineers refer to this kind of testing as black-box, we can often consider it “gray-box”, because here we are focused on the integration between the code components that make up the business logic. Therefore, QA engineers dealing with these tests are likely to know something about the internal structure of the product.
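As a rough illustration of a service-level test, the sketch below wires two components together and checks them only through the service's public methods. Everything in it is hypothetical: the class names, the in-memory repository standing in for real storage, and the business rule (10% off orders of 100.00 or more, with amounts kept in cents to avoid rounding issues).

```python
class InMemoryOrderRepository:
    """Hypothetical storage component; a real one might talk to a database."""

    def __init__(self):
        self._orders = {}

    def save(self, order_id, items):
        self._orders[order_id] = list(items)

    def load(self, order_id):
        return self._orders[order_id]


class OrderService:
    """Hypothetical service holding the business logic."""

    DISCOUNT_THRESHOLD_CENTS = 10_000  # assumed rule: 10% off from 100.00 upwards

    def __init__(self, repository):
        self._repository = repository

    def place_order(self, order_id, items):
        self._repository.save(order_id, items)

    def total_cents(self, order_id):
        items = self._repository.load(order_id)
        subtotal = sum(price * quantity for price, quantity in items)
        if subtotal >= self.DISCOUNT_THRESHOLD_CENTS:
            return subtotal * 90 // 100
        return subtotal


def test_discount_applies_to_stored_order():
    service = OrderService(InMemoryOrderRepository())
    service.place_order("A1", [(4000, 2), (3000, 1)])  # subtotal 110.00
    assert service.total_cents("A1") == 9900


def test_small_order_is_not_discounted():
    service = OrderService(InMemoryOrderRepository())
    service.place_order("B2", [(2500, 2)])  # subtotal 50.00
    assert service.total_cents("B2") == 5000
```

The tests never inspect the repository directly, which is the black-box side of this layer. Choosing inputs on either side of the discount threshold requires knowing the rule exists, which is the gray-box side.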

The roof: UI testing

UI (user interface) testing is also referred to as the end-to-end level of the pyramid. Tests performed at this level cover the entire functionality of the product as a whole from the point of view of the end user. 

These kinds of tests are at the top of the pyramid because they take the longest to run and their level of granularity is the coarsest of all the layers. In other words, bugs found while testing the UI require more exhaustive investigation by developers, because the tests do not point to the specific lines of code that make the product crash.

In this case, we are so high in the pyramid that testers can no longer see what is happening on the ground. QA engineers abstract themselves totally from the internal structure of the code. Here, we are dealing with pure black-box testing.

Why do UI tests represent the smallest part of the pyramid? You have already read it. The level of detail offered by test results at this layer is so low-resolution compared to the other layers that bugs take longer to fix and are therefore more expensive.
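The difference in resolution between layers can be sketched in a few lines. The example below is a toy, not a real UI test: `checkout_total` merely stands in for a whole user-facing flow, and all function names are hypothetical. A bug is planted on purpose in `line_total`, and the same bug is then observed from the top and from the bottom of the pyramid.

```python
def parse_quantity(text):
    """Turn user input such as ' 3 ' into an integer."""
    return int(text.strip())


def line_total(unit_price_cents, quantity):
    # Deliberate bug for the demonstration: adds instead of multiplying.
    return unit_price_cents + quantity


def checkout_total(unit_price_cents, quantity_text):
    """Stand-in for the whole flow: parse the user's input, then price it."""
    return line_total(unit_price_cents, parse_quantity(quantity_text))


def failed_checks(checks):
    """Run each named check and return the names of those that fail."""
    return [name for name, check in checks.items() if not check()]


# One coarse check through the whole flow.
end_to_end = {
    "checkout_total": lambda: checkout_total(500, " 3 ") == 1500,
}

# Fine-grained checks, one per function.
unit_level = {
    "parse_quantity": lambda: parse_quantity(" 3 ") == 3,
    "line_total": lambda: line_total(500, 3) == 1500,
}

print(failed_checks(end_to_end))  # ['checkout_total']
print(failed_checks(unit_level))  # ['line_total']
```

The coarse check only reports that the flow as a whole is wrong, while the fine-grained checks clear `parse_quantity` and point straight at `line_total`.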

Again, I have some doubts regarding the terminology and the logic behind the automation pyramid. This framework seems to suggest that the testing of the UI (the front-end of the product) overlaps with the end-to-end testing and requires no knowledge of the code it is made of. 

However, it is certainly the case that front-end developers perform their own unit tests. From my point of view, “end-to-end” is a more accurate term to define what the pyramid framework tries to implement. I understand that some refer to this layer as UI because these tests are performed exclusively through the front-end. But in my humble opinion, this terminology causes confusion and the peak of the pyramid would be more accurately described by E2E testing. Meanwhile, the base of the pyramid could still be called “unit layer” and be broken down into front-end and back-end unit testing.

So, should we take the pyramid seriously?

It is often the case that a business is overloaded with tons of UI tests, predominantly executed by hand, and has barely any coverage at the services and unit layers. The consequences are that testing becomes too lengthy a process and that many bugs go undiscovered because they are not easily reproducible at the UI level. But they are still there, waiting for the end user to execute the edge case of death.

I would say the test automation pyramid offers considerable value as a “soft guide” for testing. As far as the gist of its logic goes, I agree completely. That is, in general it is preferable to focus automation efforts on unit tests, because they provide accurate data for developers, they uncover many faulty code implementations (such as departures from coding standards, unused dangling variables, etc.), and flaws discovered in this layer are cheaper and quicker to fix.

In this regard, the analogy of a pyramid is quite accurate. The further the tests are from the code, the lower their resolution. Unit tests are the first line of defense against errors such as formatting mistakes, null pointers, or wrong data types, which should be quick to fix. Testing software when we can see what happens “behind the scenes” is much more reliable, at least where test automation is concerned. Excessive automation efforts at the UI level can be counterproductive, since test results at this level are more prone to be misleading because the functionality under test interacts with other environmental variables.

That being said, it would be wrong, in my estimation, to take the test automation pyramid as a hard rule. Following testing recommendations blindly can lead to bad practices that might degenerate into an inefficient test automation plan. Imagine for a minute an application with countless unit tests implemented for the sake of reaching arbitrary metrics, while valuable, indispensable end-to-end tests are neglected. Could we consider this application’s quality assured? Not in the least.

So, should we take the testing pyramid seriously? In my opinion, the answer is yes… as long as the testing team and the business are clear about what the needs of their project are, and have clearly defined concepts such as unit testing, integration testing, system testing, white/black-box testing, etc. It is OK to adapt the terminology to match the business needs. At the end of the day, a software project’s ultimate goal is to satisfy a market need, not to blindly hit metrics.

To wrap this up, in my most humble opinion, the test automation pyramid is a highly valuable framework that contributes some coherent principles to the testing industry. Those principles should nevertheless be taken as a general guideline and, if necessary, adapted to our needs.