The Curiosity Blog

Is test data the engineering problem to solve in 2024?

Written by Thomas Pryce | 29 March 2022 14:07:36 Z

It’s 2024 and the risks associated with poor test data practices show no signs of abating.

According to the latest World Quality Report, the use of potentially sensitive production data rose in 2021, despite the compliance risks associated with using raw data in less-secure test environments.

This risk of costly legislative non-compliance is compounded by the fact that testers at 45% of organisations do not always comply with security and privacy regulations for test data, making the distribution of sensitive information to test environments an even more startling concern.

In addition to compliance risks, test data continues to undermine both testing speed and quality: around half of organisations report that they do not have sufficient data for all of their testing, and a similar proportion report that they lack timely access to the right test environments, undermining testing agility as well as coverage [i].

These challenges are not new, and their persistence highlights how test data “best” practices have fallen behind evolutions in “agile” delivery methods, DevOps, automation, and CI/CD:

Original Image: Khanargy, Wikimedia Commons, published under CC BY-SA 4.0 license.

This blog will explore some of the risks to software delivery associated with outdated test data practices. It then identifies current trends that threaten to make test data problems worse if not addressed proactively. Fortunately, these same trends often hold the key to solving test data challenges, offering techniques for automated, on-the-fly and “just in time” test data provisioning: Test Data Automation.

To see how organisations today can transition from existing test data management practices to Test Data Automation, watch Curiosity's and Sogeti's webinar, The state of test data in 2022: New challenges, opportunities, and the role of “AI”.

The risks to software delivery of outdated test data practices

The risks associated with outdated test data practices touch upon delivery speed, costs, and legislative compliance.

These wide-reaching challenges undermine goals that commonly motivate an organisation’s drive to become more “agile”. In other words, test data frequently conflicts with a drive to deliver increasingly complex software, faster, and without incurring prohibitive delivery costs.

Some of the risks associated with test data are seen in the following stats drawn from recently available research:

  1. 44% of testing time is spent waiting for, finding, or making test data [ii]. This risks a detrimental impact on release speed and overall agility.
  2. Gaps in test data coverage risk further costs and time lost during development, given the increasing cost of fixing a bug for every stage it slips through the delivery lifecycle [iii].
  3. This impact of test data on agility and release speed threatens customer retention in an era of high customer expectation regarding digital experience [iv].
  4. Organisations with insecure or non-compliant test data practices risk a further competitive disadvantage. This is reflected, for instance, in the fact that 32% of people surveyed worldwide in 2019 said they had already switched providers over concerns regarding data or data-sharing policies [v].

The bad news: Complexity and demand for data is growing

Test data challenges accordingly risk the efficiency and quality of software delivery, while posing a threat to legislative compliance and an organisation’s bottom line. Test data management (TDM) is clearly a challenge worth solving today. Yet, it’s also a problem that is continuously becoming more complex to “fix”.

A range of trends in recent years have added to the complexity of finding, making, and provisioning fit-for-purpose test data. These same trends have contributed to the demand for data across the SDLC, and in turn risk exacerbating bottlenecks associated with test data. These recent trends include:

  1. The adoption of new technologies alongside existing and legacy tools. This adds to the complexity of the interrelated data needed to test hybrid architectures. Newly adopted tools today include cloud-based technologies, Big Data technologies, “AI”, new database types, and the proliferation of APIs:

    Test data today must span an intricate combination of new and existing technologies.

  2. The pace of agile and iterative delivery, alongside the adoption of CI/CD and DevOps. These trends have increased the speed with which complex systems change, and also the magnitude of the changes made. Each change must be de-risked by rapid and proportionate testing, adding to the demand for new and increasingly complex test data.
  3. Automation in testing, increasing the volume and variety of tests executed. Data-hungry automation frameworks today tear through data faster than ever, adding to the demand for voluminous and varied test data.
  4. Evolving privacy legislation. Changing legislation globally can add complexity to the management of potentially sensitive production data in non-production environments, while adding to the degree of control needed over data. This can increase delays at organisations already struggling to provision sufficient data for testing, often adding to calls to remove sensitive data from testing.
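One common response to these pressures is to mask or pseudonymise sensitive values before data reaches less-secure test environments. The sketch below shows the basic idea; the field names, salt, and helper functions are invented for illustration and do not reflect any specific tool’s API:

```python
# Illustrative sketch: deterministic pseudonymisation of sensitive fields
# before data is copied into a test environment. Field names and the
# salt value are assumptions made for this example.
import hashlib

def pseudonymise(value: str, salt: str = "test-env-salt") -> str:
    """Replace a sensitive value with a stable, irreversible token."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]  # short token; stable across runs for the same input

def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Return a copy of a production record that is safe for test use."""
    return {
        key: pseudonymise(str(val)) if key in sensitive_fields else val
        for key, val in record.items()
    }

prod_row = {"id": 42, "email": "jane@example.com", "plan": "premium"}
masked = mask_record(prod_row, {"email"})
```

Because the tokens are deterministic, referential links between masked records survive, while the original sensitive values cannot be recovered from test environments.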

Unless test data tools and techniques match this growth in the complexity and demand for test data, test data challenges will persist and grow.

The good news

Fortunately, many of the factors that are today adding to the demand for and complexity of test data also offer solutions to perennial test data challenges.

For instance, techniques used today in test automation can be leveraged to make data via UIs, enabling test scripts to self-provision data on-the-fly. Integrating parameterizable test data utilities as part of CI/CD pipelines can likewise reduce the need for manual intervention when allocating data to tests, reducing the risk of bottlenecks during continuous test execution.
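As a rough illustration of this kind of self-provisioning, the sketch below generates a disposable, parameterisable record at the point a test needs it, rather than waiting on a shared, centrally refreshed data set. The make_customer helper and its fields are hypothetical, invented purely for this example:

```python
# Illustrative sketch: a test utility that provisions its own data
# on the fly. make_customer and its field names are assumptions.
import random
import string

def make_customer(plan: str = "basic") -> dict:
    """Generate a unique, disposable customer record for one test run."""
    uid = "".join(random.choices(string.ascii_lowercase + string.digits, k=8))
    return {
        "username": f"test_{uid}",
        "email": f"test_{uid}@example.invalid",
        "plan": plan,
    }

# Each test (or CI/CD pipeline stage) calls the utility with the
# parameters it needs, instead of competing for rows in a shared database.
customer = make_customer(plan="premium")
```

Wired into a CI/CD pipeline as a parameterised step, a utility like this removes the manual hand-off of data to tests, reducing the bottlenecks described above.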

And, of course, there’s AI, with its promise to automate many of the complex tasks associated with test data. As with any discussion of AI and testing, caution must be exercised to identify what truly constitutes “AI”, and whether the value of incorporating it exceeds that of alternative approaches. Nonetheless, techniques available today can support the on-demand creation of complex data, such as using solving techniques to reverse-engineer the data values needed to reach a given expected result.
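A toy illustration of that solving idea: given an expected outcome, search backwards for an input that produces it. The tiered_discount rule and the brute-force search below are invented stand-ins for a real system under test and a real constraint solver:

```python
# Illustrative sketch of "solving" for test data: given an expected
# result, search for input values that reach it. The business rule
# and candidate ranges are assumptions made for this example.
def tiered_discount(order_total: float) -> float:
    """Example business rule: discount rate depends on order size."""
    if order_total >= 500:
        return 0.20
    if order_total >= 100:
        return 0.10
    return 0.0

def find_input_for(expected_rate, candidates):
    """Reverse-engineer an order total that yields the expected outcome."""
    for total in candidates:
        if tiered_discount(total) == expected_rate:
            return total
    return None  # no candidate reaches the expected result

# Find test data that exercises the 10% branch without manual hunting.
total_for_ten_percent = find_input_for(0.10, range(0, 1000, 50))
```

In practice a proper constraint solver would replace the exhaustive search, but the principle is the same: test data is derived from the outcome the tester wants to observe, not hunted for by hand.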

If you’d like to see how testing can leverage valuable techniques found across DevOps, automation, and the emerging world of “AI”, watch Curiosity's and Sogeti's webinar, The state of test data in 2022: New challenges, opportunities, and the role of “AI”.

References:

[i] All the stats in the introduction are taken from Capgemini, Sogeti (2021), The World Quality Report 2021-22. Retrieved from https://www.capgemini.com/gb-en/research/world-quality-report-wqr-2021-22/ on 18/02/2022.

[ii] Capgemini, Sogeti (2020), The Continuous Testing Report 2020, p. 21. Retrieved from https://www.sogeti.com/explore/reports/continuous-testing-report-2020/ on 22/03/2021.

[iii] ScopeMaster, Shift Left Testing. Retrieved from https://www.scopemaster.com/blog/shift-left-testing/ on 18/

[iv] See, for example, Appnovation (2021), The Digital Consumer: Expectations are higher than ever - Can your brand keep up? Retrieved from https://www.globenewswire.com/news-release/2021/02/17/2176967/0/en/The-Digital-Consumer-Expectations-are-higher-than-ever-Can-your-brand-keep-up.html on 18/02/2022.

[v] Cisco (2019), Consumer Privacy Report. Cited in Thomas C. Redman and Robert M. Waitman (2020), Do You Care About Privacy as Much as Your Customers Do?, Harvard Business Review. Accessed 18/02/2022.