Process Overview

The Subset is run using a series of pre-defined, automated Actions. These Actions are informed by the Basic Control Spreadsheet.

The actions to run the Basic Subset are:

  1. The TABLES and GETKEYS Actions retrieve metadata from the Source Database.
  2. The PREPENV Action creates tables and indexes in the Staging Database.
  3. The BUILDMODEL Action creates the rules to drive the Subset.
  4. The SUBSET Action writes data to the Staging Database.

These Actions should be run in the order listed. As they run, they auto-populate additional sheets in the Control Spreadsheet, turning it into an Advanced Control Spreadsheet. The added sheets include the Subset Rules needed to produce coherent Subsets of inter-related data. These rules are formulated automatically.
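The ordering requirement can be sketched as a small guard that reports which Action should come next. This is an illustration only, not part of the product: the helper name `next_action` is hypothetical, and the sequence is simply the order in which the Actions are listed on this page.

```python
# Order of the core Actions, as listed in the steps above.
ACTION_ORDER = ["TABLES", "GETKEYS", "PREPENV", "BUILDMODEL", "SUBSET"]


def next_action(completed):
    """Return the next Action to run, or None when all have been run.

    `completed` is the list of Actions already run, in the order they ran.
    Raises ValueError if they were not run in the listed order.
    """
    done = list(completed)
    expected = ACTION_ORDER[:len(done)]
    if done != expected:
        raise ValueError(f"Actions out of order: {done}; expected {expected}")
    if len(done) == len(ACTION_ORDER):
        return None
    return ACTION_ORDER[len(done)]
```

For example, `next_action(["TABLES", "GETKEYS"])` returns `"PREPENV"`, while `next_action(["PREPENV"])` raises an error because the metadata Actions have not been run first.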

The Subset Rules can then be toggled on/off. The relationships that will be fulfilled by the Data Subset can also be toggled. This enables you to define Advanced Subsets, iteratively including tables and relationships until you get a coherent data set of the right size.

Note: Subset Rules are formulated to generate a set of data that fulfils the relationships in the Source Database. The resulting data set therefore reflects the Primary and Foreign Key relationships. However, the core Actions involved in running a Subset (TABLES, GETKEYS, PREPENV, BUILDMODEL, and SUBSET) do not implement these Keys in the Staging Database. To implement them, you must perform Post-Subset actions to add Keys.
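The distinction in the Note, between data that respects Key relationships and Keys that are actually declared, can be illustrated with a generic orphan check. The sketch below is not one of the product's Post-Subset actions. It uses an in-memory SQLite database as a stand-in for a Staging Database, and the table names, column names, and the `count_orphans` helper are all hypothetical.

```python
import sqlite3


def count_orphans(conn, child, fk_col, parent, pk_col):
    """Count child rows whose foreign key value has no matching parent row.

    Table and column names are interpolated directly into the SQL, so only
    pass trusted identifiers.
    """
    sql = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk_col} = p.{pk_col} "
        f"WHERE c.{fk_col} IS NOT NULL AND p.{pk_col} IS NULL"
    )
    return conn.execute(sql).fetchone()[0]


# Stand-in for a Staging Database: the tables exist, but no Keys are declared.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 2)])

# Every order points at an existing customer, so no orphans are found,
# even though the database itself enforces nothing.
orphans = count_orphans(conn, "orders", "customer_id", "customers", "id")
```

A count of zero for each relationship suggests that Keys could be added afterwards without violations. A non-zero count would point to a relationship that the data does not fulfil.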

This subsection of the Knowledge Base provides an overview of running the data subsetting actions. The process for running actions is as follows:

  1. Define any mandatory and optional parameters in a re-usable .cmd script.
  2. Make sure your Control Spreadsheet, the Subset Report, and any Log Files are closed.
  3. Run the script.
  4. Review any Sheets updated or created in the Control Spreadsheet.
  5. Check the Log File and Subset Report.
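Steps 2 and 3 above can be sketched as a small wrapper that refuses to launch the script while any of the relevant files appear to be open. This is a rough illustration under stated assumptions: the helper names and file names are hypothetical, the lock check is best-effort (it relies on Windows typically refusing write access to a file another program holds open), and `cmd /c` is used only because it is the usual way to launch a .cmd script on Windows.

```python
import subprocess
from pathlib import Path


def is_locked(path):
    """Best-effort check for whether another program has the file open.

    On Windows, opening a file for writing while it is held open elsewhere
    typically raises PermissionError. Other platforms may not lock files,
    so a False result is not a guarantee.
    """
    p = Path(path)
    if not p.exists():
        return False
    try:
        # "r+b" opens for read/write without creating or truncating the file.
        with open(p, "r+b"):
            pass
    except PermissionError:
        return True
    return False


def run_action(script, files_to_check, runner=subprocess.run):
    """Run a .cmd script only if none of the given files appear to be open."""
    locked = [str(f) for f in files_to_check if is_locked(f)]
    if locked:
        raise RuntimeError("Close these files first: " + ", ".join(locked))
    result = runner(["cmd", "/c", str(script)])
    return result.returncode
```

A call might look like `run_action("run_subset.cmd", ["control.xlsx", "subset_report.xlsx", "subset.log"])`, where every name is a placeholder for your own script and files. A non-zero return code is a prompt to check the Log File before moving on to the next Action.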