Perform the SUBSET

Search Knowledge Base by Keyword

< Back

At this point, you have run Actions to create the tables and indexes in the Target Database, and have registered the metadata needed to perform the Subset. You have used this metadata to formulate the Subset rules based on the criteria that you defined in your Basic Control Spreadsheet.

You are now ready to perform your Subset, using the SUBSET action.

The SUBSET action inserts data into the staging database. It is dependent on the rules in the ProcessModel sheet that is created when BUILDMODEL is run.

The SUBSET action runs the rules in the ProcessModel sheet are run multiple times until one of the following criteria is fulfilled:

  1. parMaxIterations has been met. This is set in the Configuration sheet.
  2. parMaxRows has been met. This is set in the Configuration sheet.
  3. The last iteration didn’t insert any rows or
  4. All found criteria (if provided) have been met. These criteria are set in the FoundCriteria sheet.

To run SUBSET, first open your re-usable .cmd script in a text editor.

If you have already configured a .cmd script for one Subset Action and wish to edit and re-use that script, you should only need to specify “SUBSET” for  -parAction=””:

Otherwise, configure all mandatory parameters in the .cmd script.

You can also specify optional parameters at this stage.

Make sure your Control Spreadsheet, the Subset Report, and any Log Files are closed.

Run the script to insert data into your staging Database.

Next, check the Log File and Subset Report at this stage, to check that the Actions have performed correctly.

Note: If the SUBSET action detects that any of the SQLCriteria, Tables, ForeignKeys or SoftKeys have changed in any way since the last BUILDMODEL action, then a warning will be given in the log. In this insance, it is recommended that you re-run the BUILDMODEL Action. To run the SUBSET action without re-running BUILDMODEL, you must set parOverrideBuildmodelRequirement to “True”.

Nb.: If processing terminates for any reason other than that the last iteration run didn’t insert any rows then it is unlikely that all the foreign key relationships in the subset will be valid.  You can try to correct this by running DELETEORPHANS.

Review the SubsetResults Sheet

The SUBSET action produces the SubsetResults Sheet:

Data Subsetting

Each row represents a Table and Schema name. The SubsetResults sheet displays the total number of rows in the Staging Table, and the number added by the last SUBSET action.

The columns in the sheet are:

  1. SchemaName: The Staging Schema name.
  2. TableName: The Staging Table name.
  3. RowsInStaging: The total number of rows in the table during staging.
  4. RowsAdded: The number of rows added by the last SUBSET action.

In addition to checking your database, you should check your SubsetResults sheet after running the SUBSET Action.

If you are happy with the Subset Results, you can next perform Post-Subset Actions.

Alternatively, you can update your Advanced Control Spreadsheet and run an updated Subset. These Subsets might be run iteratively.