Introduction to Synthetic Test Data Generation

This Knowledge Base section describes in detail how to generate VIP Synthetic Test Data, part of VIP Test Data Automation.

Test Data Automation provides a simple, intuitive and largely automated approach to Test Data Generation. A high-speed workflow engine automatically builds a data model of the target database. This produces an easy-to-use Excel control spreadsheet, in which a comprehensive set of Data Generation functions is defined. Event Hooks, pre-process variables and more allow you to retain the referential integrity of complex data, so that data for complex systems testing can be produced from simple configuration spreadsheets. High-speed automation then populates the data, producing rich synthetic data for manual or automated testing.
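
The general idea of rule-driven generation that preserves referential integrity can be sketched in plain Python. This is an illustration only, not VIP's implementation or API: the table names, the column rules and the `generate_rows` and `generate_dataset` helpers are all hypothetical stand-ins for what a control spreadsheet might define.

```python
import random

# Hypothetical column rules, standing in for the functions a control
# spreadsheet might define. Each rule takes (rng, row_index) and returns a value.
CUSTOMER_RULES = {
    "customer_id": lambda rng, i: i + 1,
    "name": lambda rng, i: f"Customer {i + 1:04d}",
    "country": lambda rng, i: rng.choice(["GB", "DE", "US"]),
}


def generate_rows(rules, count, rng):
    """Apply every column rule once per row and return a list of dicts."""
    return [{col: rule(rng, i) for col, rule in rules.items()} for i in range(count)]


def generate_dataset(n_customers=5, n_orders=12, seed=42):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    customers = generate_rows(CUSTOMER_RULES, n_customers, rng)
    customer_ids = [c["customer_id"] for c in customers]

    # Child rows pick their foreign key from the parent keys already
    # generated, which is what keeps referential integrity intact.
    order_rules = {
        "order_id": lambda rng, i: 1000 + i,
        "customer_id": lambda rng, i: rng.choice(customer_ids),
        "amount": lambda rng, i: round(rng.uniform(5.0, 500.0), 2),
    }
    orders = generate_rows(order_rules, n_orders, rng)
    return customers, orders


customers, orders = generate_dataset()
print(len(customers), "customers,", len(orders), "orders")
```

Generating the parent table first and drawing child keys from it is the simplest way to guarantee that every order points at an existing customer. A fixed seed makes the same data set reproducible across test runs.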

…but first, a few questions

Why do I want to generate Synthetic Test Data?

Primarily to be able to test software systems or applications. Software is created with a set of assumptions and constraints in mind. Once it is developed, we can use it, right? Not so fast. We first need to test the developed system using data not created by the developers. VIP has a feature that allows the developers of the software to create ‘synthetic’ data, which is generated using a set of predefined criteria.

Reasons why Synthetic Test Data is a good idea

  • Testers often spend many hours trying to find the correct test data for testing.
  • Data is often required to be consistent across multiple applications.
  • Many reported bugs are actually caused by incorrect data being used in a test, not by a fault in the application.
  • Test Data often changes and becomes invalid for the specific test.
  • Testers often cannibalise each other's data.
  • Incorrect data destabilises automated testing and creates automated test failures.
  • Each tester hunts for their own data and there is little reuse of previous data finds.
  • Test data requires high volumes of storage and is time-consuming to provision. It is often quicker to synthesise data and provision it in parallel.
  • Last, but not least, Synthetic Data complies with privacy and protection laws, so there’s no need to worry about illegal use of personal data.

What will I gain from generating Synthetic Test Data?

You will gain the peace of mind that the system has been tested and that the synthetic data is representative of production, but also goes well beyond it in terms of test coverage.

How long will it take me to generate Synthetic Test Data?

If the application tables and their relationships have already been created, then generating Synthetic Test Data should take only a short time (an hour or a few hours).

Are there any Prerequisites for generating Synthetic Test Data?

Yes. A Database should already have been created that models the application through a set of Tables and their relationships to one another.
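
As a rough illustration of this prerequisite, the sketch below creates a tiny database with two related tables and then reads the relationships back from the schema. It uses Python's built-in `sqlite3` module and a hypothetical two-table schema. The table names and the `discover_relationships` helper are assumptions made for this example, and they do not reflect VIP or the actual Sample Commerce Database.

```python
import sqlite3

# Hypothetical two-table schema; the real Sample Commerce Database may differ.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
"""


def discover_relationships(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    relationships = []
    for table in tables:
        # PRAGMA foreign_key_list columns: id, seq, table, from, to, ...
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            relationships.append((table, fk[3], fk[2], fk[4]))
    return relationships


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for rel in discover_relationships(conn):
    print("%s.%s -> %s.%s" % rel)
```

When the relationships are declared in the schema like this, a generator can work out which tables are parents and which are children before producing any rows.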

  Note: This document uses example-based walk-throughs to demonstrate Data Generation, using a Sample Commerce Database model.