TLDR: Load testing is the process of putting demand on a system and then measuring how well it performs in various scenarios. An essential part of our major-league methodology is integrating performance testing into your CI/CD Pipeline to test these scenarios early and often, allowing you to catch performance degradation while the impact is still low. True performance testing at various scales must be integrated into the delivery pipeline. This is how!
_________________________________________________________________________
You don’t want your performance to degrade: (page load) slowdowns and failures have the biggest impact on user experience. As with athletes, a winning proposition requires good training. Before deploying top-performing new features with confidence, you want to test whether the necessary changes fit within the bigger picture: not only at a functional level, but also under stress. One user does not equal volume.
Moreover, the faster you get, the more critical it is to constantly test, fine-tune, and evaluate new features and deployments. Our motto isn’t “Get fast, stay fast” for nothing. An essential part of our major-league methodology is integrating (automated) performance and feature load testing into your CI/CD Pipeline. In this blog, we’ll tell you all about it!
What is load testing?
Load testing does not have a “shift left” or “shift right” option; it should be done at every stage of the software development process. Load testing is the process of putting demand on a system and then measuring how well it performs. The term “continuous” means “forming an unbroken whole; without interruption”. Continuous testing, then, is the practice of running automated tests across the entire delivery process. It can be seen as a form of risk management: application risks typically translate into much more serious business concerns, whether financial or in terms of the company’s public image. This is especially true for performance-related risks, which have a significant impact on customer loyalty, as shown in the graph below. The graph correlates downtime with social sentiment and clearly shows that more downtime has a longer-lasting effect on sentiment.
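To make “putting demand on a system and measuring how well it performs” concrete, here is a minimal sketch in Python. It is an illustration only: the target URL, the number of simulated users and the requests per user are made-up values, and a real setup would typically use a dedicated load testing tool rather than a hand-rolled script.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests  # third-party HTTP client

TARGET_URL = "https://example.com/checkout"  # hypothetical critical use case
CONCURRENT_USERS = 10                        # hypothetical load level
REQUESTS_PER_USER = 5

def simulate_user(_):
    """One simulated user: fire a few requests and record their response times."""
    timings = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        requests.get(TARGET_URL, timeout=10).raise_for_status()
        timings.append(time.perf_counter() - start)
    return timings

if __name__ == "__main__":
    # Put demand on the system: N concurrent users, each running their requests.
    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        per_user = list(pool.map(simulate_user, range(CONCURRENT_USERS)))
    timings = [t for user_timings in per_user for t in user_timings]

    # Measure how well it performs under that demand.
    print(f"requests sent:        {len(timings)}")
    print(f"median response time: {statistics.median(timings):.3f}s")
    print(f"95th percentile:      {statistics.quantiles(timings, n=20)[-1]:.3f}s")
```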
The NO-test: Integrating performance testing into your CI/CD Pipeline
It is an indisputable fact: the closer you place your functional testing to “The Source”, the more value it delivers. The earlier issues are caught, the more development, deployment, and operational cost you save, which enables your teams to roll out new features, fixes and so forth to your users faster. Yet no matter how hard companies try to ‘shift left’, in reality we always see at least one important link missing for the perfect user experience: continuous (automated) performance testing.
For performance data, we see many organizations relying on subjective feedback from manual QA testing, or trying to distil feedback from browser plugins about whether the application’s performance is improving, degrading or remaining stable based on the output of their automated QA. However, none of these alternatives gives an accurate picture of the performance of your latest build, or of whether the new code base should be rejected for deployment. For that, true performance testing at various volume scales must be integrated into the delivery pipeline.
To deploy or not?
You are already (hopefully) receiving per-build automated test results telling you whether the application still functions correctly with the latest build. If it doesn’t, the problems are flagged and tagged in the next stand-up and sent back to the dev team to be fixed. Imagine if the same were true for application performance, complete with a preview of the impact on scalability and of the potential feedback from your consumers.
For example, with automated performance testing, we run through your critical use cases with a scheduled load test or nightly script after your QA tests have completed successfully. First with 1 user; then, assuming a single user sees response and load times within the expected KPI limits, we repeat with 10 users, then 50, 100 or even more, depending on the capacity of your development and/or acceptance environments. Each test case is scaled and matched against a specific set of performance KPIs to provide a full comparison of code quality against what’s running in production.
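As a rough sketch of how such a ramp-up could be scripted in a nightly job, the Python below steps through increasing user counts and stops as soon as a level breaches the response-time KPI. The user levels, the 95th-percentile limit and the target URL are illustrative assumptions, not a description of MeasureWorks’ actual tooling.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

TARGET_URL = "https://example.com/checkout"   # hypothetical critical use case
USER_LEVELS = [1, 10, 50, 100]                # scaling steps as described above
P95_KPI_SECONDS = 1.5                         # illustrative KPI limit
REQUESTS_PER_USER = 5

def simulate_user(_):
    """One simulated user running a handful of timed requests."""
    timings = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        requests.get(TARGET_URL, timeout=10).raise_for_status()
        timings.append(time.perf_counter() - start)
    return timings

def p95_at(users: int) -> float:
    """Run one concurrency level and return its 95th-percentile response time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(simulate_user, range(users)))
    timings = [t for user_timings in per_user for t in user_timings]
    return statistics.quantiles(timings, n=20)[-1]

for users in USER_LEVELS:
    p95 = p95_at(users)
    print(f"{users:>4} users -> p95 {p95:.3f}s (limit {P95_KPI_SECONDS}s)")
    if p95 > P95_KPI_SECONDS:
        # Stop ramping up and fail the job so the build is flagged for review.
        raise SystemExit(f"KPI breached at {users} users")
```

In practice you would most likely delegate this to JMeter, Selenium-based scripts or a hosted load testing platform; the point is only that each level is compared against explicit KPIs before the next one runs.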
Why do we do this? Just as with functional bugs, performance problems are far easier to mitigate within a day or so of being introduced into the code than down the road, and surprisingly many performance bugs are already detectable at low to moderate loads. The alternative is that they either turn up in larger-scale stress testing, too late to fix, or worse: as negative feedback from your users once in production.
Shift your performance?
Sounds simple, right? So why doesn’t everyone do it? Well, there are several likely suspects here.
#1: Stress and load testing demand a different skillset and experience
More than what a typical developer accumulates in their daily work. The tooling options are also a bit daunting: do you go with one of the many open-source test tools? You may already use Selenium in QA testing, for example. Do you stick with Selenium for CI/CD load testing so you can reuse your scripts, or do you opt for something like JMeter with protocol-level simulation, so the scripts can also be reused for your periodic large-scale stress tests?
#2: You need to manage test data at scale
And create a test infrastructure with sufficient capacity, so that each test run can be analyzed on an equal footing.
#3: You need to measure code, infrastructure and user experience
Not only in your production environment, but especially in every test environment. Production is your end stage: measuring performance there only shows the result of how well your dev teams have worked. If you really want to have an impact, you need a full view of performance, and you need to build observability monitoring (formerly known as application performance monitoring) into your development environments.
#4: Measure it to draw a line in the sand
Decide whether the performance of new code and features is good enough to deploy to the next environment. With your performance requirements defined, your team will be able to automate release quality gates based on observability monitoring data, just as they do today for correct functionality. Has application performance gone outside the acceptable boundaries? Then the build automatically goes back to the dev team to find out what was broken.
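As a minimal sketch of what such an automated quality gate could look like, assuming your observability tooling can export per-build metrics (the metric names and thresholds below are hypothetical):

```python
import sys

# Illustrative performance requirements; in practice these come from your
# agreed KPIs, and the metric names depend on your observability tooling.
PERFORMANCE_REQUIREMENTS = {
    "p95_response_time_s": 1.5,
    "error_rate_pct": 1.0,
    "cpu_utilisation_pct": 80.0,
}

def quality_gate(build_metrics: dict) -> bool:
    """Return True if the build may be promoted to the next environment."""
    failures = [
        f"{name}: {build_metrics.get(name)} exceeds limit {limit}"
        for name, limit in PERFORMANCE_REQUIREMENTS.items()
        if build_metrics.get(name, float("inf")) > limit
    ]
    for failure in failures:
        print(f"QUALITY GATE FAILED - {failure}")
    return not failures

if __name__ == "__main__":
    # In a pipeline these numbers would be fetched from your monitoring API
    # for the latest build; they are hard-coded here purely for illustration.
    latest_build = {
        "p95_response_time_s": 2.1,
        "error_rate_pct": 0.4,
        "cpu_utilisation_pct": 63.0,
    }
    # A non-zero exit code fails the pipeline stage and sends the build back.
    sys.exit(0 if quality_gate(latest_build) else 1)
```

Whether this check lives in Jenkins, GitLab CI or any other orchestrator matters less than the principle: the boundary is explicit and the decision is automated.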
Now, your organization might well have the skills in-house to make all of this happen, and just need help defining the process and building the tooling. Or maybe your IT director’s eyes begin to glaze over simply because you’ve started to ask questions like this out loud. Either way: start measuring, so you know what’s right and what’s wrong.
Gain control! You cannot not load test
Not doing performance testing makes every deployment a gamble: will it or won’t it work under stress? If any of the above reasons sound familiar, know that many organizations outsource portions of their integrated load testing to an external partner. MeasureWorks sets up the load test framework, including platform, scripting and automation. This allows your own teams to focus on their primary deliverables, enhanced by the load test performance data they receive every morning along with the other automated test results.