GDPR and testing: Are you a sceptic or a gambler?

Last week, we published a blog post making the case for the next generation in TDM “best practice”. We considered why the logistical approach of “mask, subset, clone” provisioning cannot provide the data that parallel test teams need, when they need it.

This week’s blog considers the benefit of “Test Data Automation” from the perspective of one of the core TDM requirements: test data compliance. In particular, this blog sets out the repercussions of the EU General Data Protection Regulation (GDPR) for testing, and how a new TDM paradigm can ensure compliance while also maximising testing speed and quality.

Test Data Compliance: An issue that won’t go away

The proposal for the EU General Data Protection Regulation (GDPR) was made as long ago as 2012, and the Regulation was adopted in 2016. Throughout this time, two broad responses to the tightening legislation have been common among testers:

  1. The sceptic: “Big organisations will simply group together and resist this in the courts. Nothing will change in practice and there’s no way that national data protection agencies will be able to demand so much change so quickly, let alone levy fines this big.”
  2. The gambler: “Fines will still only be levied following high-profile data breaches. There’s no way agencies are going to start performing regular audits, let alone audit my company. Besides, the chances of us suffering a data breach are slim to none – it’s never happened before!”

2019: An issue that can’t be ignored

Fast forward to 2019 and the implementation period is over. The GDPR is now in force, and eye-watering fines cast doubt on the responses of both the sceptic and the gambler. The steep punishments levied recently are a reminder of the real threat of data breaches, but also a serious statement of intent regarding the enforcement of the GDPR.

In July, for example, the UK’s Information Commissioner’s Office (ICO) announced its intention to levy a record fine of £183 million on British Airways, relating to the harvesting of 500,000 customers’ details by attackers. That represents roughly 1.5% of BA’s annual worldwide turnover for the previous year, and dwarfs the ICO’s previous record fine of £500,000. National enforcement agencies appear willing to impose the full force of the GDPR’s deterrents.

The announcement of an intended £99.2 million fine for Marriott International came a day later, relating to the exposure of 339 million guests’ information. Of these guest records, 30 million belong to Europeans, but Marriott is a US company. This dispels further scepticism regarding the ability of national agencies to enforce the GDPR’s global scope.

Authorities in each instance point to a lack of sufficient security measures, and also to the responsibility organisations of every size have for the data they process. So, how does this relate to testing practices?

We need to talk about TDM…

From a QA perspective, one glaring practice screams security risk: the use of production data in test and development environments. This has long been warned against from a data privacy perspective, yet 65% of organisations still use potentially sensitive production data in testing.[i]

Production data does appear an obvious place to source production-like data for testing. The issue is that test and development environments are typically less secure than production, so any sensitive data stored in them increases the risk of a data breach.

Then there are the rights of EU data subjects, which have been strengthened by the GDPR. These rights apply regardless of whether a data breach has occurred, and present further challenges for current QA practices.

The Rights to Data Erasure and Data Portability are good examples. An EU data subject can request that all their data is erased “without undue delay,” and can also ask for a complete copy of their data stored by an organisation.

This presents a logistical nightmare for current Test Data Management (TDM) practices. Many organisations store data across test environments, in unmanaged formats like spreadsheets on testers’ local machines. Such organisations struggle to know where certain data is kept, and will therefore struggle to identify, copy and delete it on demand.
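To illustrate why scattered test data makes these requests hard, here is a minimal sketch of what even a basic search involves. It assumes, purely for illustration, that the unmanaged copies are CSV files under a single directory tree; the function name find_subject_records and the directory layout are hypothetical, and real estates would also include databases, spreadsheets and backups that this does not cover.

```python
import csv
from pathlib import Path


def find_subject_records(root, identifier):
    """Return (file, row_number) pairs for every CSV row that mentions identifier.

    Hypothetical helper: walks every *.csv file under root and does a
    case-insensitive substring match against each cell.
    """
    needle = identifier.strip().lower()
    hits = []
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row_number, row in enumerate(csv.reader(handle), start=1):
                if any(needle in cell.lower() for cell in row):
                    hits.append((str(path), row_number))
    return hits
```

Calling find_subject_records("test_envs", "jane.doe@example.com") would list every file and row that needs to be copied or erased. The sketch only works if every copy is reachable from one place, which is exactly what unmanaged test data does not guarantee.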

Improving data security and test data quality

The good news is that using production data in test environments is frequently avoidable. Synthetic test data generation is today capable of generating realistic test data for even complex systems, rapidly mirroring the data dependencies found in production.

Quality synthetic test data is built from a model of the metadata found in production. It reflects even complex patterns in data like temporal trends, all while remaining wholly fictitious. It therefore supports accurate and stable test execution, without the risk of exposing sensitive information.
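As a rough illustration of the idea, the sketch below generates wholly fictitious customers and orders from a tiny hand-written model. The table layout, column names and value lists are assumptions made up for this example, not taken from any real system or tool; a production-grade generator would derive its model from production metadata rather than hard-coding it.

```python
import random
from datetime import date, timedelta

# Hypothetical value pools: clearly fictitious, so no real person is described.
FIRST_NAMES = ["Alex", "Sam", "Jo", "Robin"]
LAST_NAMES = ["Test", "Sample", "Example", "Placeholder"]


def generate_customers(count, rng):
    """Create fictitious customers with unique IDs and reserved-domain emails."""
    customers = []
    for customer_id in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        customers.append({
            "customer_id": customer_id,
            "name": f"{first} {last}",
            "email": f"{first}.{last}.{customer_id}@example.com".lower(),
        })
    return customers


def generate_orders(customers, orders_per_customer, rng, start=date(2019, 1, 1)):
    """Create orders whose customer_id always refers to a generated customer."""
    orders = []
    order_id = 1
    for customer in customers:
        for _ in range(orders_per_customer):
            orders.append({
                "order_id": order_id,
                "customer_id": customer["customer_id"],  # dependency preserved
                "order_date": start + timedelta(days=rng.randrange(365)),
                "amount": round(rng.uniform(5, 500), 2),
            })
            order_id += 1
    return orders
```

Passing a seeded generator such as random.Random(42) makes the output repeatable, which helps keep test runs stable. Every order points at an existing customer, so the parent-child dependency between the two tables holds without any real record being copied.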

The benefit of increased security is furthermore coupled with a significant quality gain for QA. Synthetic data can be generated for the numerous data combinations not found in existing production data, including the negative scenarios and outliers needed for complete test coverage.
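One simple way to picture this coverage gain is to enumerate every combination of interesting values per field, including invalid and boundary values that rarely appear in production. The field names and values below are hypothetical examples chosen for this sketch; exhaustive combination is also only one possible strategy, and larger models would normally need a reduction technique such as pairwise selection.

```python
from itertools import product

# Hypothetical fields: each list mixes valid values with negative
# scenarios (empty, malformed) and outliers (boundaries, out of range).
FIELD_VALUES = {
    "age": [-1, 0, 17, 18, 120, 121],
    "country": ["NL", "GB", "", "ZZ"],
    "email": ["user@example.com", "not-an-email", ""],
}


def all_combinations(field_values):
    """Return one dict per combination of the listed values for every field."""
    names = list(field_values)
    return [
        dict(zip(names, values))
        for values in product(*(field_values[name] for name in names))
    ]
```

With the values above, all_combinations(FIELD_VALUES) yields 6 x 4 x 3 = 72 records, most of which describe scenarios that would be difficult to find in a copy of production.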

Improving data security in testing is therefore not just a logistical issue: it can drive up test coverage, improving the quality of software and reducing defect remediation efforts.

Organisations will not be able to switch to using wholly synthetic test data overnight. Nonetheless, an effective TDM strategy should aim to replace production data gradually with fictitious test data. This “hybrid approach” continues working with production data where needed, in time replacing all test data sources with fictitious, coverage-enhanced equivalents. Testers and data protection officers (DPOs) can then enjoy peace of mind, all while improving application quality.

Thanks for reading! Please feel free to share your thoughts using my email address below, and look out for next week’s blog on creating high-coverage test data sets. To learn more about how test data compliance can also maximise testing rigour, please join Curiosity and CloseSure at the “Test Data Automation” Meetup in Utrecht on November 28th.

[i] Redgate (2019), State of Database DevOps, 23. Retrieved from http://assets.red-gate.com/solutions/database-devops/state-of-database-devops-2019.pdf on 19 June 2019.