Today's blog post was written by Sid Thakkar, Senior QA at Sonoma Partners.
The concept of data migration testing is simple: we compare the source data to the migrated data. In other words, we try to discover any discrepancies introduced when moving the data from one database system to another. As simple as it might sound, the testing effort involved in a data migration project is enormous, and it often ends up taking a lot of time.
A well-defined testing strategy is essential for delivering a successful data migration.
An important part of a successful data migration test can be achieved with an automated approach to testing. Automation saves significant time, minimizes the typical iterative testing approach, and gives us the ability to test 100% of the migrated data. The phases of data migration testing include:
- Data Migration Design Review
- Pre-Data Migration Testing
- Post-Data Migration Testing
Data Migration Design Review
It is important for a Quality Analyst to review and understand the design of the migration specification during the early stage of the migration implementation/configuration. The QA should go through a detailed analysis of the Data Mapping requirement document before any testing starts. Ideally, we want to note any columns or fields that match the criteria below.
- Change in data type from source to target (e.g. data stored as a character in the source but as an integer in the target table)
- Modification of existing data (e.g. a requirement to migrate “status = in progress” in the source system as “Status = lost”, or “telephone = 1234567890” as “telephone = 123-456-7890”)
- Option Set values, lookups, and user mappings, all of which should be documented
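The criteria above can be applied mechanically once the mapping document is in a structured form. Below is a minimal Python sketch, assuming a hypothetical `FieldMapping` record per row of the mapping document; the class, the function names, and the sample rows are illustrative and not taken from any real project. It flags fields with a data type change or a value transformation, and shows the telephone rule from the example as an expected-value helper.

```python
from dataclasses import dataclass


@dataclass
class FieldMapping:
    """One row of a (hypothetical) data mapping document."""
    source_field: str
    source_type: str
    target_field: str
    target_type: str
    transform: str = ""  # free-text rule; empty when data is copied as-is


def fields_needing_review(mappings):
    """Return (field, reason) pairs for rows that match the review criteria."""
    flagged = []
    for m in mappings:
        if m.source_type.lower() != m.target_type.lower():
            flagged.append((m.source_field, "data type change"))
        if m.transform:
            flagged.append((m.source_field, "value transformation"))
    return flagged


def format_phone(raw):
    """Expected target value for the telephone example (1234567890 to 123-456-7890)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) != 10:
        return raw  # leave unexpected values untouched so they surface in testing
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


# Illustrative rows only; a real list would come from the mapping document.
mappings = [
    FieldMapping("zip", "char", "zip", "int"),
    FieldMapping("telephone", "varchar", "telephone", "varchar", "format as ###-###-####"),
    FieldMapping("name", "varchar", "name", "varchar"),
]
review_list = fields_needing_review(mappings)
```

The resulting `review_list` is a natural starting point for deciding which fields need dedicated test cases later on.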
Pre-Data Migration Testing
Before we jump into any kind of data testing, we should test the connections to the source and target systems from the migration platform.
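A connection check can be as small as opening a connection and running a trivial query. The sketch below is an assumption-heavy illustration: it uses Python's built-in `sqlite3` module as a stand-in for the real source and target drivers, and `can_connect` is a hypothetical helper name. With another driver, the connection and query calls may need to go through a cursor instead.

```python
import sqlite3


def can_connect(connect):
    """Return True if a connection opens and answers a trivial query."""
    try:
        conn = connect()
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
        return True
    except Exception as exc:  # report any driver error instead of crashing
        print(f"connection failed: {exc}")
        return False


# sqlite3 stands in for the real source and target systems here.
source_ok = can_connect(lambda: sqlite3.connect(":memory:"))
target_ok = can_connect(lambda: sqlite3.connect(":memory:"))
```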
Pre-data migration testing can also be called definition testing. Definition testing is something that doesn’t take place during the data migration testing. During definition testing, we check the data type and length of every field in the source database table against the target. For example, the Address_line1 field in the source has data type Varchar with a length of 50, whereas the Address_line1 field in the target is listed as Varchar(30). This means that any source value longer than 30 characters is a potential issue.
For each entity, run a SQL query similar to the one listed below against both the source and target tables to confirm that the field definitions match between the two tables.
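One way to automate the definition comparison is sketched below in Python. It assumes the column definitions have already been fetched from each database, for instance from the standard `INFORMATION_SCHEMA.COLUMNS` view on SQL Server. The query text, the table name in it, and the `compare_definitions` helper are illustrative assumptions, not the exact query used on the original project.

```python
# Illustrative metadata query (SQL Server style); run it against both databases.
DEFINITION_QUERY = """
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Source_Accounts'
"""


def compare_definitions(source_cols, target_cols):
    """Compare {column: (data_type, max_length)} dicts and list the problems."""
    issues = []
    for name, (src_type, src_len) in source_cols.items():
        if name not in target_cols:
            issues.append(f"{name}: missing in target")
            continue
        tgt_type, tgt_len = target_cols[name]
        if src_type.lower() != tgt_type.lower():
            issues.append(f"{name}: type {src_type} -> {tgt_type}")
        # Assumes positive lengths; None means the type has no length.
        if src_len is not None and tgt_len is not None and tgt_len < src_len:
            issues.append(f"{name}: length {src_len} -> {tgt_len}, data may be truncated")
    return issues


# The Address_line1 example from the text: Varchar(50) in source, Varchar(30) in target.
source_cols = {"Address_line1": ("varchar", 50), "Account_ID": ("int", None)}
target_cols = {"Address_line1": ("varchar", 30), "Account_ID": ("int", None)}
issues = compare_definitions(source_cols, target_cols)
```

Note that SQL Server reports a length of -1 for `varchar(max)` columns, so a real script would need to treat that value as "unlimited" rather than compare it numerically.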
Post-Data Migration Testing
Post-data migration testing is by far the most important phase of migration testing. When we do not have enough time assigned for testing, we can jump directly into this phase. The testing is divided into two parts:
- Record Counts
- Data Mapping
  - Unmapped Record Counts
  - Unmapped Record Values
Record Counts
Record counts are easy to test once you understand the data structure of the migration process. To automate some of the testing, you will need to find out the database names, table names, and primary keys for the entity you are testing. For example, let’s assume that you are testing account migration, the source table name is “Source_Accounts,” the target table name is “Target_Accounts,” and the primary key for both tables is “Account_ID.”
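I prefer using Microsoft Excel to automate some of the testing, but you can write programs to do the same. As you can see in the image, I have listed the source and target table names, columns, and primary keys in “Sheet1” of an Excel file. In the formulas that follow, columns A to C hold the source table name, column, and primary key, and columns D to F hold the same for the target, starting at row 5.
You can create a new Excel sheet and write this formula to auto-generate the record count queries (see image below).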
="select "&Sheet1!B5&" = count ("&Sheet1!B5&") From "&Sheet1!$A$5&" where "&Sheet1!B5 &" is not null"
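For a column named Address1_AddressId, and with the example source table name from above, the formula generates a query like this:
select Address1_AddressId = count (Address1_AddressId)
From Source_Accounts
where Address1_AddressId is not null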
The next step is to run these queries in a SQL window and store the results. Once you repeat the same process for the target table, you can compare the record counts for all fields between the source and target tables.
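For readers who would rather script this step than use Excel, here is a minimal Python sketch of the same idea. It assumes an in-memory SQLite database standing in for both systems, and the helper names and sample rows are hypothetical. The query uses the portable `COUNT(col)` form instead of the T-SQL `col = count(col)` alias syntax, and the table and column names are expected to come from a trusted worksheet, not from user input.

```python
import sqlite3


def count_query(table, column):
    """Build a non-null count query for one column (portable SQL form)."""
    return f"SELECT COUNT({column}) FROM {table} WHERE {column} IS NOT NULL"


def column_counts(conn, table, columns):
    """Run the count query for each column and return {column: count}."""
    return {c: conn.execute(count_query(table, c)).fetchone()[0] for c in columns}


def count_mismatches(source_counts, target_counts):
    """Return {column: (source_count, target_count)} where the counts differ."""
    return {
        c: (n, target_counts.get(c))
        for c, n in source_counts.items()
        if target_counts.get(c) != n
    }


# Demo data: an in-memory SQLite database stands in for both systems.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Source_Accounts (Account_ID INTEGER, Name TEXT, Phone TEXT)")
conn.execute("CREATE TABLE Target_Accounts (Account_ID INTEGER, Name TEXT, Phone TEXT)")
conn.executemany("INSERT INTO Source_Accounts VALUES (?, ?, ?)",
                 [(1, "Acme", "1234567890"), (2, "Globex", None), (3, "Initech", "5550001111")])
conn.executemany("INSERT INTO Target_Accounts VALUES (?, ?, ?)",
                 [(1, "Acme", "123-456-7890"), (2, "Globex", None), (3, "Initech", None)])

columns = ["Account_ID", "Name", "Phone"]
mismatches = count_mismatches(
    column_counts(conn, "Source_Accounts", columns),
    column_counts(conn, "Target_Accounts", columns),
)
```

In this made-up data set only the Phone column differs, because one phone number was lost on the way to the target table.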
Data Mapping
Once we have done the row count testing, we can go one step further and verify that the content matches as well. This phase effectively covers all the testing we have done so far, which is one of the reasons why we jump directly to data mapping testing in time-crunch situations.
Unmapped Record Counts
Using the worksheet shown in the first image, create a new tab in the same Excel file and write the formula listed below to auto-generate the data mapping queries. It’s easier and safer to first count the records that did not match and then dive into finding those records. Counting unmapped records is the first step in this process.
="select count(*) From "&Sheet1!$A$5&" t1 join "&Sheet1!$D$5&" t2 on t1."&Sheet1!C$5&"= t2."&Sheet1!$F$5&" where t1."&Sheet1!B5& " <> t2."&Sheet1!E5& " and t2."&Sheet1!E5& " is not null"
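The query this formula produces can also be generated in code. The following Python sketch builds a query of the same shape and runs it against an in-memory SQLite database; the function name and the sample rows are assumptions made for illustration.

```python
import sqlite3


def unmapped_count_query(src_table, src_col, src_key, tgt_table, tgt_col, tgt_key):
    """Count joined rows whose source and target values differ."""
    return (
        f"SELECT COUNT(*) FROM {src_table} t1 "
        f"JOIN {tgt_table} t2 ON t1.{src_key} = t2.{tgt_key} "
        f"WHERE t1.{src_col} <> t2.{tgt_col} AND t2.{tgt_col} IS NOT NULL"
    )


# Demo data: SQLite stands in for the real source and target databases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Source_Accounts (Account_ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Target_Accounts (Account_ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Source_Accounts VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex"), (3, "Initech")])
conn.executemany("INSERT INTO Target_Accounts VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex Corp"), (3, None)])

query = unmapped_count_query("Source_Accounts", "Name", "Account_ID",
                             "Target_Accounts", "Name", "Account_ID")
unmapped = conn.execute(query).fetchone()[0]
```

One property of this query shape is worth keeping in mind: a row whose target value is NULL while the source value is not (account 3 in the demo data) is not counted here. Those rows show up in the record count comparison instead.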
Unmapped Record Values
If the unmapped record count query above returns zero for all fields, the migration is more likely to have succeeded. But it isn’t wise to stop testing just yet. Regardless of the result of the queries above, I highly recommend going a step further and running the query below to find the exact value mapping between the source and target tables.
Let’s use the first image again and create a new tab in the same Excel file to auto-generate the query for unmapped record values.
="select t1."&Sheet1!B5&" , t2."&Sheet1!E5&" From "&Sheet1!$A$5&" t1 join "&Sheet1!$D$5&" t2 on t1."&Sheet1!C$5&"= t2."&Sheet1!$F$5&" where t1."&Sheet1!B5& " <> t2."&Sheet1!E5& " and t2."&Sheet1!E5& " is not null"
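As a scripted alternative, the sketch below loops over a list of column pairs and collects the mismatched values for each one. It is a variation on the formula, not a copy of it: it also selects the primary key so that each mismatch can be traced back to a record. The helper names, the column pairs, and the sample data are hypothetical, and SQLite again stands in for the real databases.

```python
import sqlite3


def unmapped_values_query(src_table, src_col, src_key, tgt_table, tgt_col, tgt_key):
    """List key, source value, and target value for rows that differ."""
    return (
        f"SELECT t1.{src_key}, t1.{src_col}, t2.{tgt_col} FROM {src_table} t1 "
        f"JOIN {tgt_table} t2 ON t1.{src_key} = t2.{tgt_key} "
        f"WHERE t1.{src_col} <> t2.{tgt_col} AND t2.{tgt_col} IS NOT NULL "
        f"ORDER BY t1.{src_key}"
    )


def mismatched_values(conn, src_table, tgt_table, key, column_pairs):
    """Run the query for every (source_col, target_col) pair; keep non-empty results."""
    report = {}
    for src_col, tgt_col in column_pairs:
        rows = conn.execute(
            unmapped_values_query(src_table, src_col, key, tgt_table, tgt_col, key)
        ).fetchall()
        if rows:
            report[src_col] = rows
    return report


# Demo data with one mismatch in Name and one in City.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Source_Accounts (Account_ID INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.execute("CREATE TABLE Target_Accounts (Account_ID INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.executemany("INSERT INTO Source_Accounts VALUES (?, ?, ?)",
                 [(1, "Acme", "Chicago"), (2, "Globex", "Denver"), (3, "Initech", "Austin")])
conn.executemany("INSERT INTO Target_Accounts VALUES (?, ?, ?)",
                 [(1, "Acme", "Chicago"), (2, "Globex Corp", "Denver"), (3, "Initech", "Dallas")])

report = mismatched_values(conn, "Source_Accounts", "Target_Accounts", "Account_ID",
                           [("Name", "Name"), ("City", "City")])
```

An empty `report` would correspond to every unmapped record count being zero.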
In the next blog post, I will discuss how a QA can get involved in writing SSIS packages to be more self-sufficient on data migration projects.