UBS Hainer
Divide & Conquer: A Field Guide for Test Data Management
The Problem
When implementing processes to provide test data, you often face requirements that seem contradictory:
  • You can never stop production tables
  • The copy process must be as fast as possible
  • You want to retain referential integrity in your data
  • You need to provide a consistent copy
  • You need the option to copy a small subset of rows
  • You want to keep existing data in the target tables
Pre-Production
The key to meeting these demands is to divide the process into smaller, more manageable sub-tasks.

Creating more test environments will require additional space, but that cost is easily offset by the benefits that a proper test data management process offers.

Why Create A Pre-Production?
A pre-production environment provides several advantages, both for database administrators and for QA testers.

For DBAs:
  • Better availability, because production tables no longer need to be stopped to obtain consistent copies
  • Better performance, because other data provisioning processes no longer run utilities or queries against production
  • Better security, because QA testers no longer need authorization to access production

For QA testers:
  • Better stability, because the pre-production is an environment that is not constantly changing
  • Better availability, because other environments can be refreshed at any time
  • Better security, because no test can possibly affect production anymore

Integration Environments
Integration environments serve several purposes:

Better Security
They provide additional security because different business units access only the data that they actually need.

Perfect Test Beds
They are perfect test beds for component and integration tests, which require large amounts of data.

Independent Refresh
They allow groups of testers to refresh data independently of the schedule on which the pre-production is refreshed.

Typically, integration environments only require a subset of the pre-production data.

The best way to create integration environments is to use tools that make fast file system level copies of your DB2 objects.

It is common that multiple copies of the same set of tables are stored in the same DB2 subsystem.

Therefore, these tools must allow you to easily select the set of tables that you need, plus they must offer the option to rename those tables during the copy process.

Additionally, if no pre-production exists, it must be possible to make consistent in-flight file system level copies directly from the production itself.

Unit Testing
Providing data for efficient unit tests

Today’s development processes often involve running unit tests and component tests as part of the build process. Automated build tools require the same database state every time they run; otherwise, the tests might fail. The same is true for local builds.

These kinds of tests don’t require gigabytes of data. Instead, specific rows are required from a set of related tables. The key to successful unit testing is selecting rows that represent meaningful business objects.
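
As a rough illustration of this idea, the Python sketch below collects one "business object" by following foreign-key relationships from a root row, so the extracted rows stay referentially intact. The table names, relationships, and sample data are invented for the example and are not tied to any particular product.

```python
# Illustrative sketch: extract one "business object" (a customer and its
# dependent rows) by following foreign-key relationships, so that the subset
# used for unit tests stays referentially intact.
# All table names, columns, and sample rows are hypothetical.

SAMPLE_DB = {
    "CUSTOMER":    [{"CUST_ID": 42, "NAME": "Acme Corp"}],
    "ACCOUNT":     [{"ACCT_ID": 7, "CUST_ID": 42, "BALANCE": 100.0},
                    {"ACCT_ID": 8, "CUST_ID": 99, "BALANCE": 55.0}],
    "TRANSACTION": [{"TX_ID": 1, "ACCT_ID": 7, "AMOUNT": -25.0}],
}

# parent table -> child table, joined on (parent key column, child FK column)
RELATIONSHIPS = [
    ("CUSTOMER", "ACCOUNT",     "CUST_ID", "CUST_ID"),
    ("ACCOUNT",  "TRANSACTION", "ACCT_ID", "ACCT_ID"),
]

def extract_business_object(root_table, key_col, key_value):
    """Collect the root row plus all dependent rows reachable via foreign keys."""
    subset = {root_table: [r for r in SAMPLE_DB[root_table] if r[key_col] == key_value]}
    frontier = [(root_table, subset[root_table])]
    while frontier:
        parent_table, parent_rows = frontier.pop()
        for p_tab, c_tab, p_col, c_col in RELATIONSHIPS:
            if p_tab != parent_table:
                continue
            parent_keys = {r[p_col] for r in parent_rows}
            children = [r for r in SAMPLE_DB[c_tab] if r[c_col] in parent_keys]
            if children:
                subset.setdefault(c_tab, []).extend(children)
                frontier.append((c_tab, children))
    return subset

for table, rows in extract_business_object("CUSTOMER", "CUST_ID", 42).items():
    print(table, rows)
```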

These smaller unit testing environments often contain edge cases that trigger a certain behavior in a program. When refreshing the tables, it is important to retain these edge cases. A test data management tool must therefore be able to merge new data into existing tables according to your specifications.
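
A minimal sketch of such a merge, assuming a simple primary-key match: new rows are added, while rows that already exist in the target (the handcrafted edge cases) are kept untouched. The key column, the keep_existing switch, and the sample data are hypothetical.

```python
# Illustrative sketch of a merge-style refresh: new rows are inserted into the
# target, rows that only exist in the target (e.g. handcrafted edge cases) are
# retained, and overlapping keys are kept or refreshed as specified.

def merge_rows(target_rows, source_rows, key, keep_existing=True):
    """Merge source rows into target rows by primary key."""
    merged = {row[key]: row for row in target_rows}    # start from existing data
    for row in source_rows:
        if row[key] in merged and keep_existing:
            continue                                   # keep the edge case untouched
        merged[row[key]] = row                         # insert new or refresh existing
    return list(merged.values())

# Row 900001 is a handcrafted edge case that must survive the refresh.
existing = [{"ORDER_ID": 900001, "STATUS": "CANCELLED-WITH-REFUND"}]
fresh    = [{"ORDER_ID": 1,      "STATUS": "OPEN"},
            {"ORDER_ID": 900001, "STATUS": "OPEN"}]

print(merge_rows(existing, fresh, key="ORDER_ID"))
```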

Refreshing the System
How often should you refresh?

Keep the data stable in the larger environments (such as the pre-production) for a longer period, and refresh the smaller environments more often.

Environments are populated from upstream systems. If the data in the upstream system has not changed since the last copy, then repeating the copy is tantamount to “resetting” an environment.

Stable data and the ability to reset a system are important for testers. They allow tests to be run repeatedly under the same preconditions.

A common refresh strategy looks like this:
  • Database / Subsystem Cloning: the original database populates the Release / System Test / Pre-Production environment, refreshed every 1 to 3 months
  • Table Copying: the pre-production populates the Component / Integration Test environments, refreshed weekly
  • Row Level Processing: selected rows populate the Unit Test environments, refreshed daily or on demand
UBS Hainer offers test data management products that allow you to successfully implement this strategy.
BCV4
BCV4 makes fast full clones of entire DB2 subsystems or data sharing groups on volume level without affecting the availability or performance of the source.
BCV5
BCV5 makes fast copies of DB2 table spaces and indexes, including structures and data, on file system level. On average, it only requires 10% of the CPU and elapsed time that an Unload/Load process consumes.

How is BCV5 scheduler-friendly?

Dynamic Selection
  • Automates the creation of new test environments from scratch
  • Enables a scheduler to quickly set up a refresh process for thousands of DB2 tables automatically
Dynamic Renaming

Allows you to rename attributes of every object during the copy, including (see the sketch after this list):

  • Creator and names of tables and indexes
  • Database names and tablespace names
  • Views, aliases and synonyms
  • Storage groups and buffer pools
  • And much more…
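
To illustrate the idea of rule-driven renaming, here is a small Python sketch that maps source qualifiers to target qualifiers using wildcard masks. The rule syntax and attribute names are invented for this example and do not reflect the actual BCV5 syntax.

```python
# Illustrative sketch of rule-based renaming during a copy: each rule maps a
# source value pattern for one attribute (creator, database, ...) to a target
# pattern. The rule syntax is invented here and is not the actual BCV5 syntax.
import fnmatch

RENAME_RULES = [
    ("creator",  "PRODUSR", "TESTUSR"),   # change the creator of tables and indexes
    ("database", "PRDB*",   "TSDB*"),     # map database names by wildcard mask
]

def rename(attribute, value):
    """Return the target name for a given attribute value, or the value unchanged."""
    for attr, src_mask, tgt_mask in RENAME_RULES:
        if attr == attribute and fnmatch.fnmatch(value, src_mask):
            prefix = src_mask.rstrip("*")              # fixed part of the source mask
            return tgt_mask.rstrip("*") + value[len(prefix):]
    return value

print(rename("creator", "PRODUSR"))    # -> TESTUSR
print(rename("database", "PRDB0001"))  # -> TSDB0001
```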
Automatic Space Allocation
  • Pre-allocates space in the target before copying
  • Adds missing target pieces and partitions
  • Determines object sizes based on the ICF catalog
Parallelism & Workload Balancing
  • Distributes the work evenly among parallel threads (as sketched below)
  • Number of threads can be chosen freely
  • Can also access tape image copies in parallel
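
The sketch below illustrates one common way such workload balancing can be done: a greedy assignment of the largest remaining object to the least-loaded thread. It illustrates the general technique only and is not BCV5's actual algorithm; object names and sizes are made up.

```python
# Illustrative sketch of workload balancing: objects are assigned to a fixed
# number of parallel threads so that every thread copies roughly the same
# amount of data (greedy: largest object to the least-loaded thread).
import heapq

def balance(objects, num_threads):
    """Assign (name, size) objects to threads, returning one entry per thread."""
    heap = [(0, t, []) for t in range(num_threads)]    # (total size, thread id, objects)
    heapq.heapify(heap)
    for name, size in sorted(objects, key=lambda o: o[1], reverse=True):
        total, tid, assigned = heapq.heappop(heap)     # least-loaded thread so far
        assigned.append(name)
        heapq.heappush(heap, (total + size, tid, assigned))
    return sorted(heap, key=lambda entry: entry[1])

objects = [("TS_ORDERS", 800), ("TS_AUDIT", 700), ("TS_ITEMS", 500),
           ("TS_CUSTOMERS", 300), ("TS_CODES", 50)]
for total, tid, assigned in balance(objects, num_threads=2):
    print(f"thread {tid}: {total} MB in {assigned}")
```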
Static JCL / Dynamic Process
XDM-RLP
XDM-RLP selectively copies rows from tables that are related via foreign key constraints and provides unmatched flexibility for data modification and masking.

XDM's Critical Data Identifier (CDI) is an analyzer tool that detects data that may need masking and provides the following data masking methods (two of them are sketched after the list):

  • Encryption
    Encrypts the values of elements
  • Nulling Out
    Applies a null value to elements
  • Substitution
    Original values are replaced with substitute values
  • Shuffling
    Data is randomly shuffled within the column
  • Masking Out
    Characters are scrambled or certain fields are masked out
  • Number / Date Variance
    Data is modified using a random percentage but remains meaningful
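
For illustration, the following Python sketch implements two of these methods in simplified form: shuffling values within a column and applying a random number variance. Column names and data are hypothetical; this is not the CDI implementation.

```python
# Illustrative sketch of two masking methods: shuffling values within a column,
# and applying a random variance to numeric values so the data stays plausible.
import random

def shuffle_column(rows, column, seed=None):
    """Shuffling: redistribute the existing values of one column across the rows."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [dict(row, **{column: value}) for row, value in zip(rows, values)]

def number_variance(rows, column, max_pct=10.0, seed=None):
    """Number variance: change each value by a random percentage within +/- max_pct."""
    rng = random.Random(seed)
    return [dict(row, **{column: round(row[column] * (1 + rng.uniform(-max_pct, max_pct) / 100), 2)})
            for row in rows]

rows = [{"NAME": "Alice", "SALARY": 52000.0},
        {"NAME": "Bob",   "SALARY": 61000.0},
        {"NAME": "Carol", "SALARY": 48000.0}]

print(shuffle_column(rows, "NAME", seed=1))
print(number_variance(rows, "SALARY", max_pct=5, seed=1))
```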
Test data management use cases
These use cases show how organizations across industries use the unmatched speed and flexibility of UBS Hainer’s Test Data Management Suite to improve their data refresh processes.
How Big and Small Customers Use BCV4
A modest public utility company with a single z/OS machine and a solitary systems programmer makes excellent use of BCV4, while the leading home improvement company in the United States uses BCV4 to create a network of staged clones.
Merging Multiple DB2 Environments
BCV5 proved to be the right tool following a merger between two of Germany’s biggest banks with two separate IT environments and more than 20 million customers.
Row Level Testing Across Multiple Platforms
KBC, one of Europe’s largest insurance companies, uses XDM-RLP to test applications on various platforms, including DB2 for z/OS and LUW, Oracle, and MS SQL Server.