Years of experience in the software industry show that there is still much that has not been discovered. Every new approach in the software industry aims to speed up development while increasing customer satisfaction. Hence the new term "TestOps" is also creating different perspectives on the relationship between test activities and operational activities within the DevOps culture. In this post, I want to share my experience with TestOps.
What is TestOps
In short, it stands for tests and operations. Within the DevOps culture, it is a sub-discipline of DevOps. There are two approaches to TestOps, and which one fits depends on your overall test approach. If you are applying `shift-left testing`, then TestOps should be adopted to have more collaboration with the development team; this can be called `TestOps shift-left`. On the other hand, if you are applying `shift-right testing`, then TestOps should be adopted to have more collaboration with the operation team; this can be called `TestOps shift-right`.
Why We Need TestOps
Taking something from an idea to a product needs different kinds of expertise, such as product owner, developer, tester, and DevOps. This shows that one person cannot handle everything alone, and that holding one responsibility does not mean that you are the expert or that you are doing the right thing. In testing, there is always change. We need to adapt ourselves to development processes such as waterfall, V-model, and agile, and also to application architecture patterns such as monolithic and microservices. The tests required for each process and pattern are different, so the capabilities required of testers are also different. The emerging pattern nowadays is microservices, and the most popular development process is agile.
With the rise of the microservice architectural pattern, shift-right testing has grown significantly. One of the pioneers that applied the microservice pattern widely is Netflix, and they suggest testing those hundreds of microservices in the later stages of development as well. Let's look at the real-world example of Amazon and Netflix below:
[Figure: Number of integration points for Amazon and Netflix]
There are hundreds of microservices, so the number of integration points between them is very high. With 2 services there is 1 integration point. With 3 services there are 3. With 4, 5, 6, and 7 services there are 6, 10, 15, and 21 respectively, and so on. The number increases drastically when the service count reaches the hundreds: for example, 100 services have 4950 pairwise integration points and 500 services have 124750. The combination formula is C(n, r) = n! / (r! (n - r)!), where n is the number of microservices and r is the size of the subset/subgroup, which is 2 for the integration calculation, so the count simplifies to n(n - 1) / 2.
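To make the growth concrete, here is a minimal Python sketch of the pairwise calculation. The function name `integration_points` is only an illustration and does not come from any tool mentioned in this post; it assumes every pair of services is a potential integration point.

```python
from math import comb


def integration_points(n_services: int, subset_size: int = 2) -> int:
    """Count the subsets of `subset_size` services among `n_services`.

    With subset_size=2 this is the number of service pairs, n * (n - 1) / 2.
    """
    return comb(n_services, subset_size)


# Print the pair count for a few service counts.
for n in (2, 3, 4, 5, 100, 500):
    print(f"{n} services -> {integration_points(n)} integration points")
```

In practice not every pair of services talks to each other, so this is an upper bound rather than an exact count.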
Netflix, along with Google and Amazon, has also been a pioneer in the testing of microservices. Thankfully, Netflix has shared its transformation experience publicly. Netflix and Spotify have changed the use of the traditional test pyramid. It is reshaped into a test diamond, which says that unit tests are still important, but instead of writing exhaustive unit tests we should focus more on integration tests. For details on how Spotify changed the test pyramid for microservice testing, you can read this post.
[Figure: Traffic in Netflix]
So far I have tried to explain that the systems whose overall quality we need to influence are becoming more and more complicated. The way we handle test activities for those systems should be updated with new approaches. Whether your testing approach is shift-left or shift-right, you should use the discipline of TestOps effectively.
TestOps shift-left
TestOps is needed when you want to leverage the benefits of test automation. TestOps shift-left requires a lot of expertise. These are some basic tasks that you will need to do for a good test automation practice:
- Using containers
- Using container orchestration tools
- Creating test pipelines with automation tools
- Integrating test tools into the pipeline
- Integrating the test pipeline into the main development pipeline
- Creating or using a well-known automation framework
- Writing deterministic tests
- Creating test data
- Removing dependency in the tests
- Creating isolated test environments
- Running tests in parallel
- Creating test reports
- Binding these test reports
- Rerunning failed tests
- Extracting flaky tests and getting them investigated
- Destroying the test environments after tests
- Removing test data
- Creating a monitoring system for the visibility of tests
- and so on ...
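Two items from this list, rerunning failed tests and extracting flaky tests, can be sketched in a few lines of Python. This is a simplified illustration, not how any particular test framework does it: `run_with_reruns` and the sample tests are hypothetical names, and a test is assumed to be a plain callable that raises an exception when it fails.

```python
from typing import Callable


def run_with_reruns(test: Callable[[], None], max_reruns: int = 2) -> str:
    """Run a test once and rerun it only if it fails.

    Returns "passed" if the first run passes, "flaky" if a rerun passes
    after an initial failure, and "failed" if every attempt fails.
    """
    for attempt in range(1 + max_reruns):
        try:
            test()
        except Exception:
            continue  # this attempt failed, try again if reruns remain
        return "passed" if attempt == 0 else "flaky"
    return "failed"


# Hypothetical tests used only to demonstrate the three outcomes.
calls = {"count": 0}


def sometimes_fails() -> None:
    calls["count"] += 1
    assert calls["count"] > 1, "fails on the first call only"


def always_passes() -> None:
    assert 1 + 1 == 2


def always_fails() -> None:
    assert False, "deterministic failure"


results = {
    "always_passes": run_with_reruns(always_passes),
    "sometimes_fails": run_with_reruns(sometimes_fails),
    "always_fails": run_with_reruns(always_fails),
}
flaky_tests = [name for name, outcome in results.items() if outcome == "flaky"]
print(results)
print("flaky tests to investigate:", flaky_tests)
```

Keeping "flaky" as its own outcome, instead of counting it as a pass, is what makes the list of tests to investigate visible in the report.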
TestOps shift-right
Let's look at TestOps shift-right, which mostly focuses on collaboration with the operation team. Even though it includes most of the items for TestOps shift-left, since it requires a continuous testing practice, its requirements are also different and we can list them as:
- Having continuous testing
- Testing on the live environment
- Effective use of a monitoring system
- Leveraging data from the live environment to detect anomalies
- Creating dashboards/charts from the live data
- Chaos engineering, as defined by Netflix: identifying failures before an outage happens by injecting controlled failure scenarios, to improve the reliability of the system.
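As an illustration of leveraging live data to detect anomalies, the sketch below flags values that deviate strongly from a baseline. It is a deliberately simple z-score check, not a description of any monitoring product; the function name `find_anomalies`, the threshold of 3 standard deviations, and the response-time numbers are all assumptions made for the example.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def find_anomalies(
    baseline: Sequence[float], live: Sequence[float], threshold: float = 3.0
) -> List[Tuple[int, float]]:
    """Return (index, value) pairs from `live` that lie more than
    `threshold` sample standard deviations away from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is an anomaly.
        return [(i, v) for i, v in enumerate(live) if v != mu]
    return [(i, v) for i, v in enumerate(live) if abs(v - mu) / sigma > threshold]


# Made-up response times in milliseconds.
baseline_ms = [120, 125, 118, 122, 121, 119, 124, 123]
live_ms = [122, 119, 310, 121]

anomalies = find_anomalies(baseline_ms, live_ms)
print(anomalies)
```

A real setup would read these values from the monitoring system and would probably use a rolling baseline, but the idea of comparing live data against expected behaviour stays the same.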
Where We can Apply TestOps
The benefits of TestOps are more visible in some areas, such as performance testing and test automation. For well-designed performance testing, the strategy should include real user scenarios. To be able to draw up these scenarios, we need to work with real user data collected from the live environment. Let's list the items needed for performance testing:
- Real user flow data should be collected
- Data should be analyzed to create a distribution table for each user flow
- This distribution should be converted to a percentage of usage
- Test scenarios should be created depending on the real users
- Each test scenario should be weighted with the percentage
- The test environment should be created if necessary; most of the time the live environment should be used
- Environments for running the performance scripts should be prepared, using cloud services. If a master-slave configuration is needed, the environment for each slave should be prepared. A better way is to create an infrastructure-as-code (IaC) file to handle this creation and scaling automatically.
- Prepare monitoring tools
- Run the tests
- Check the monitoring tools to see whether the load is received correctly
- If any failure occurs, reconfigure the test/live environment and re-run the script. If everything goes well, stop the performance scripts
- Collect the data
- Stop every environment
- Create a report
- I may have missed something
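The first steps in this list (collecting flow data, turning it into a percentage distribution, and weighting the scenarios) can be sketched as follows. The flow names and counts are invented for the example, and `flow_distribution` and `pick_scenarios` are hypothetical helpers rather than part of any load-testing tool.

```python
import random
from typing import Dict, List, Optional


def flow_distribution(flow_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw user-flow counts into percentages of total usage."""
    total = sum(flow_counts.values())
    return {flow: 100.0 * count / total for flow, count in flow_counts.items()}


def pick_scenarios(
    distribution: Dict[str, float], k: int, seed: Optional[int] = None
) -> List[str]:
    """Pick k scenarios at random, weighted by their usage percentage."""
    rng = random.Random(seed)
    flows = list(distribution)
    weights = [distribution[flow] for flow in flows]
    return rng.choices(flows, weights=weights, k=k)


# Made-up counts standing in for data collected from the live environment.
observed = {"browse": 6000, "search": 3000, "checkout": 1000}

distribution = flow_distribution(observed)
print(distribution)

# Scenarios for 20 virtual users, following the observed distribution.
plan = pick_scenarios(distribution, k=20, seed=42)
print(plan)
```

Most performance tools let you assign such weights to scenarios directly, so in practice only the distribution step may need custom code.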
Since these activities are not done by a single person, the TestOps discipline helps to handle these tasks. In the same way, test automation also needs most of this technical expertise to deliver its benefits.
How We can Align to TestOps
TestOps requires some new qualifications from QA/Test Engineers, such as learning how DevOps tools can accelerate testing and how the QA team can help the operation team deliver a better customer experience. Let's summarize the qualifications needed to align with the TestOps discipline:
- Learn DevOps tools
- Containers
- Container orchestration tools
- Cloud testing ability
- Learn scripting languages
- How to spot defects
- How to debug defects
- More automation, not only test but also process
- Apply continuous testing
- Adapt yourself to new tools and technologies
- Monitoring tool experiences
- Search data
- Create charts/graphs
- Create reports
- Learn how to communicate with the operation team
I have given a talk called What is TestOps in Five Questions; you can watch it (in Turkish).