Data Migration Testing

The distinction between testing the data and testing the application is often difficult for the business to grasp. However, the primary goal of data migration testing is to ensure that data is correctly extracted, transformed, and loaded into the new target system as designed.

Types of Data Migration Testing

Clients typically have established policies and procedures for software testing after data migration. However, relying solely on client-specific requirements might limit the testing process to known scenarios and expectations. Combining generic testing practices with client-specific requirements makes the data migration more resilient.

Ongoing Testing

Ongoing testing in data migration refers to implementing a structured, consistent practice of running tests throughout the development lifecycle.

After each development release, updated or expanded portions of the Extract, Transform, Load (ETL) code are tested with sample datasets to identify issues early on. Depending on the project's scale and risk, this may be a test load rather than a full load.

The emphasis is on catching errors, data inconsistencies, or transformation issues in the data pipeline in advance to prevent them from spreading further.

Data migration projects often change over time due to evolving business requirements or new data sources. Ongoing testing ensures the migration logic remains valid and adapts to these alterations.

A well-designed data migration architecture directly supports ongoing testing.

  • Breaking down ETL processes into smaller, reusable components makes it easier to isolate and test individual segments of the pipeline.
  • The architecture should allow for seamless integration of automated testing tools and scripts, reducing manual effort and increasing test frequency.
  • Data validation and quality checks should be built into the architecture, rather than treated as a separate layer.
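As a minimal sketch of the three points above, assuming a Python-based pipeline (all names and data here are illustrative, not from any specific system), the ETL logic can be split into small, independently testable steps with validation built in:

```python
# Hypothetical sketch: a tiny ETL pipeline built from small, reusable steps,
# with data quality checks baked into the pipeline rather than bolted on.

def extract(rows):
    # In a real pipeline this would read from the legacy system.
    return list(rows)

def transform(row):
    # One isolated transformation: normalise names and parse amounts.
    return {"name": row["name"].strip().title(),
            "amount": round(float(row["amount"]), 2)}

def validate(row):
    # Validation is part of the pipeline, not a separate layer.
    assert row["name"], "name must not be empty"
    assert row["amount"] >= 0, "amount must be non-negative"
    return row

def run_pipeline(source_rows, target):
    for row in extract(source_rows):
        target.append(validate(transform(row)))
    return target

legacy = [{"name": "  ada lovelace ", "amount": "12.5"}]
target = run_pipeline(legacy, [])
print(target)  # [{'name': 'Ada Lovelace', 'amount': 12.5}]
```

Because each step is a plain function, each can be exercised on its own in an automated test run after every release.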

Unit Testing

Unit testing focuses on isolating and testing the smallest possible components of software code (functions, procedures, etc.) to ensure they behave as intended.

In data migration, this means testing individual transformations, data mappings, validation rules, and even pieces of ETL logic.
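For instance, a single mapping rule can be unit-tested in isolation. This is a hedged sketch (the status codes and function name are invented for illustration):

```python
# Hypothetical example: unit-testing one data mapping in isolation.

def map_status(legacy_code):
    # Assumed mapping rule: legacy numeric codes become
    # descriptive statuses in the target system.
    mapping = {1: "ACTIVE", 2: "SUSPENDED", 3: "CLOSED"}
    if legacy_code not in mapping:
        raise ValueError(f"unmapped legacy status code: {legacy_code}")
    return mapping[legacy_code]

# Unit tests: expected behaviour plus an edge case the spec must cover.
assert map_status(1) == "ACTIVE"
assert map_status(3) == "CLOSED"
try:
    map_status(99)  # an unknown code must fail loudly, not load silently
except ValueError:
    pass
```

The edge-case test is the important part: it ensures unmapped values surface as errors during testing rather than as silent data corruption after go-live.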

Visual ETL tools simplify the process of building data pipelines, often reducing the need for custom code and making the process more intuitive. Collaborating directly with data experts lets you define the specification for ETL processes while simultaneously acquiring the skills to build them in the ETL tool.

While visual tools simplify the process, complex transformations or custom logic may still require code-level testing. Unit tests can detect subtle errors in logic or edge cases that broader integration or functional testing might miss.

A clearly defined requirements document outlines the target state of the migrated data. Unit tests, along with other testing types, should always verify that the ETL processes are correctly fulfilling these requirements.

While point-and-click tools simplify building processes, it is essential to intentionally define the underlying data structures and relationships in a requirements document. This prevents ad hoc modifications to the data design, which can compromise long-term maintainability and data integrity.

Integration Testing

Integration testing focuses on ensuring that different components of a system work together correctly when combined. 

The chances of incompatible components rise when teams in different offshore locations and time zones build ETL processes. Moving the ETL process into the live environment introduces potential points of failure due to changes in the target environment, network configurations, or security models.

Integration testing confirms that all components can communicate and pass data properly, even if they were built independently. 

It simulates the entire data migration flow. This verifies that data flows smoothly across all components, transformations are executed correctly, and data is loaded successfully into the target system.

Integration testing helps ensure no data is lost, corrupted, or inadvertently transformed incorrectly during the migration process.

These tests also confirm compatibility between different tools, databases, and file formats involved in the migration.
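A hedged sketch of such a simulated end-to-end flow, using an in-memory SQLite database as a stand-in target (the table, schema, and data are illustrative):

```python
# Hypothetical integration sketch: extract from an in-memory "legacy" source,
# transform, load into a SQLite target, then verify the whole flow.
import sqlite3

legacy_rows = [("A-1", "100.00"), ("A-2", "250.50")]

def transform(row):
    order_id, amount = row
    return (order_id, float(amount))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 (transform(r) for r in legacy_rows))

# Integration check: every record arrived intact in the target.
count, total = conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
assert count == len(legacy_rows)
assert total == 350.5
```

Even a toy flow like this exercises the hand-offs that integration testing cares about: the extract format, the transformation contract, and the target schema all have to agree.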

We maintain data integrity during the seamless transfer of data between systems. Contact us for expert database migration services.

Load Testing

Load testing assesses the target system's readiness to handle the incoming data and processes. 

Load tests focus on replicating the speed and efficiency required to extract data from the legacy system(s) and on identifying any potential bottlenecks in the extraction process.

The goal is to determine whether the target system, such as a data warehouse, can handle the expected data volume and workload. Inefficient loading can leave data improperly indexed, which can significantly slow down the load processes. Load testing ensures that both the loading process and the resulting data structures are optimized after migration.

If load tests reveal slowdowns in either the extraction or loading processes, it may signal the need to fine-tune migration scripts, data transformations, or other aspects of the migration. 

Detailed reports track metrics like load times, bottlenecks, errors, and the success rate of the migration. It is also important to generate a thorough audit trail that documents the migrated data, when it occurred, and the responsible processes. 

Fallback Testing

Fallback testing is the process of verifying that your system can gracefully return to a previous state if a migration or major system upgrade fails. 

If the rollback procedure itself is complex, such as requiring its own intricate data transformations or restorations, it also necessitates comprehensive testing. Even switching back to the old system may require testing to ensure smooth processes and data flows.

It's inherently challenging to simulate the precise conditions that could trigger a disastrous failure requiring a fallback. Technical failures, unexpected data discrepancies, and external factors can all contribute.

Extended downtime is costly for many businesses. Even when core systems are offline, continuous data feeds, like payments or web activity, can complicate the fallback scenario.

Each potential issue during a fallback requires careful consideration.

Business Impact

How critical is the data flow? Would disruption cause financial losses, customer dissatisfaction, or compliance issues? High-risk areas may require mitigation strategies, such as temporarily queuing incoming data.

Communication Channels

Testing how you will alert stakeholders (IT team, management, customers) about the failure and the shift to fallback mode is essential.

Training users on fallback procedures they may never need could burden them during a period focused on migration testing, training, and data fixes. In industries where safety is paramount (e.g., healthcare, aviation), training on fallback may be mandatory, even if it is disruptive. Mock loads offer an excellent opportunity to integrate this.

Decommissioning Testing

Decommissioning testing focuses on safely retiring legacy systems after a successful data migration. 

You need to verify that your new system can successfully interact with any remaining parts of the legacy system.

Often, legacy data needs to be stored in an archive for future reference or compliance purposes. 

Decommissioning testing ensures that the archival process functions correctly and maintains data integrity while adhering to data retention regulations.

When it comes to post-implementation functionality, the focus is on verifying the usability of archived data and the accurate and timely creation of essential business reports.

Data Reconciliation (or Data Audit) 

Data reconciliation testing is specifically aimed at verifying that the overall counts and values of key business items, such as customers, orders, and financial balances, match between the source and target systems after migration. It goes beyond technical correctness, with the goal of ensuring that the data is not only accurate but also relevant to the business.

The legacy system and the new target system might handle calculations and rounding slightly differently. Rounding differences during data transformations may seem insignificant, but they can accumulate and result in significant discrepancies for the business.
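To illustrate how such discrepancies can accumulate (a contrived example, not taken from any specific system), suppose the legacy system rounds each line item before summing, while the target sums first and rounds once:

```python
# Illustration: tiny per-record rounding differences accumulate.
from decimal import Decimal, ROUND_HALF_UP

lines = [Decimal("0.005")] * 1000   # 1000 line items of 0.005

# Legacy behaviour (assumed): round each line to 2 decimals, then sum.
legacy_total = sum(d.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
                   for d in lines)

# Target behaviour (assumed): sum first, round once at the end.
target_total = sum(lines).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(legacy_total)  # 10.00 (each 0.005 rounds up to 0.01)
print(target_total)  # 5.00  (the raw sum is 5.000, rounded once)
```

A per-line difference of half a cent produces a five-dollar discrepancy over a thousand records, which is exactly the kind of gap reconciliation testing is meant to catch.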

When available, legacy reports are the gold standard for data reconciliation. Reports used regularly in the business (like trial balances) already have the trust of stakeholders. If your migrated data matches these reports, there is greater confidence in the migration's success.

However, if new reports are created for reconciliation, it is important to have them prepared by someone less involved in the data migration process to avoid unconscious assumptions and potential confirmation bias. Their fresh perspective can help identify even minor variations that a more familiar person might overlook.

Data Lineage Testing

Data lineage testing provides a verifiable answer to the crucial question: "How do I know my data reached the right place, in the right form?"

Data lineage tracks:

  • where data comes from (source systems, files, etc.)
  • every change the data undergoes along its journey (calculations, aggregations, filtering, format changes, etc.)
  • where the data ultimately lands (tables, reports, etc.)

Data lineage provides an audit trail that allows you to track a specific piece of data, like a customer record, from its original source to its final destination in a new system.

This is helpful in identifying any issues in the migrated data, as data lineage helps isolate where things went wrong in the transformation process.

By understanding the exact transformations that the data undergoes, you can determine the root cause of any problems. This could be a flawed calculation, incorrect mapping, or a data quality issue in the source system.

Additionally, data lineage helps you assess the downstream impact of making changes. For example, if you modify a calculation, the lineage map can show you which reports, analyses, or data feeds will be affected by this change.

User Acceptance Testing

User acceptance testing is the process where real-world business users verify that the migrated data in the new system meets their functional needs.

It's not just about technical correctness - it's also about ensuring that the data is coherent, the reports are reliable, and the system is practical for their daily activities.

User acceptance testing often involves using realistic test data sets that represent real-world scenarios.

Mock Load Testing Challenges

A mock load simulates the data migration process as closely as possible to a real-life cutover event. It's a valuable final rehearsal for finding system bottlenecks and process hiccups.

A successful mock load builds confidence. However, it can create a false sense of security if limitations aren't understood.

Often, real legacy data can't be used for mock loads due to privacy concerns. To comply, data is masked (modified or replaced), which potentially hides genuine data issues that would surface with the real dataset during the live cutover.

Let's delve deeper into the challenges of mock load testing.

Replicating the full production environment for a mock load demands significant hardware resources. This includes having sufficient server capacity to handle the entire legacy dataset, a complete copy of the migration toolset, and the full target system. Compromising on the scale of the mock load limits its effectiveness. Performance bottlenecks or scalability issues might lurk undetected until the real data volume is encountered. Cloud-based infrastructure can help with hardware constraints, especially for the ETL process, but replicating the target environment can still be a challenge.

Mock loads might not fully test necessary changes for customer notifications, updated interfaces with suppliers, or altered online payment processes. Problems with these transitions may not become apparent until the go-live stage.

Each realistic mock load is like a mini-project on its own. ETL processes that run smoothly on small test sets may struggle when dealing with full data volumes. Considering bug fixing and retesting, a single cycle could take weeks or even a month.

Senior management may expect traditional, large-scale mock loads as a final quality check. However, this may not align with the agile process enabled by a good data migration architecture and continuous testing. With a good data migration architecture, it is preferable to perform smaller-scale or targeted mock loads throughout development, rather than just as a final step before go-live.

Data Consistency

Data consistency ensures that data remains uniform and maintains integrity across different systems, databases, or storage locations. For instance, showing the same number of customer records during data migration is not enough to test data consistency. You also need to ensure that each customer record is correctly linked to its corresponding address.
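As a small illustration (all data invented), record counts can match while the links between records are broken, which is why consistency checks must look at relationships, not just totals:

```python
# Sketch: counts alone can match while customer-address links are broken.
target_customers = {1: "Alice", 2: "Bob"}
target_addresses = {10: {"customer_id": 1},
                    11: {"customer_id": 3}}  # customer 3 doesn't exist

# Find addresses whose customer link was lost during migration.
orphans = [addr_id for addr_id, rec in target_addresses.items()
           if rec["customer_id"] not in target_customers]
print(orphans)  # [11]
```

Two customers and two addresses migrated, so the counts look fine, yet one address points at a customer that was never loaded.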

Matching Reports

In some cases, trusted reports already exist to calculate figures like a trial balance for certain types of data, such as financial accounts. Comparing these reports on both the original and the target systems can help confirm data consistency during migration. However, for most data, tailored reports like these may not be available, leading to challenges.

Matching Numeric Values

This technique involves finding a numeric field associated with a business item, such as the total invoice amount for a customer. To identify discrepancies, calculate the sum of this numeric field for each business item in both the legacy and target systems, and then compare the sums. For example, if Customer A has a total invoice amount of $1,250 in the legacy system, Customer A in the target system should have the same total invoice amount.
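A minimal sketch of this comparison in Python (customer IDs and amounts are illustrative):

```python
# Sketch: reconcile per-customer invoice totals between legacy and target.
from collections import defaultdict

legacy_invoices = [("cust_a", 500.0), ("cust_a", 750.0), ("cust_b", 90.0)]
target_invoices = [("cust_a", 1250.0), ("cust_b", 90.0)]  # target merged rows

def totals(invoices):
    # Sum the numeric field per business item (here: per customer).
    sums = defaultdict(float)
    for customer, amount in invoices:
        sums[customer] += amount
    return dict(sums)

legacy_totals = totals(legacy_invoices)
target_totals = totals(target_invoices)
mismatches = {c for c in legacy_totals
              if legacy_totals[c] != target_totals.get(c)}
print(mismatches)  # set(): totals match, so no discrepancies
```

Note that the totals match even though the target stores fewer rows, which is why this check is usually paired with record counts.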

Matching Record Counts

Matching numeric values relies on summing a specific field, making it suitable when such a field exists (invoice totals, quantities, etc.).

On the other hand, matching record counts is more broadly applicable as it simply counts associated records, even if there is no relevant numeric field to sum.

Example with Schools

  1. Legacy system: School A has 500 enrolled students.
  2. Target system: after migration, School A should still display 500 enrolled students.
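The school example above can be sketched as a simple count comparison (the data is invented, with one record deliberately dropped to show what a discrepancy report looks like):

```python
# Sketch: compare associated-record counts between legacy and target.
from collections import Counter

legacy_enrolments = ["school_a"] * 500 + ["school_b"] * 320
target_enrolments = ["school_a"] * 500 + ["school_b"] * 319  # one record lost

legacy_counts = Counter(legacy_enrolments)
target_counts = Counter(target_enrolments)

discrepancies = {s: (legacy_counts[s], target_counts[s])
                 for s in legacy_counts
                 if legacy_counts[s] != target_counts[s]}
print(discrepancies)  # {'school_b': (320, 319)}
```

School A reconciles, while School B immediately flags the missing enrolment record.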

Preserve Legacy Keys

Legacy systems often use unique codes or numbers to identify customers, products, or orders; these are the legacy keys. If you keep the legacy keys while moving data to a new system, you have a way to trace the origins of each element back to the old system. In some cases, both the old and new systems need to run simultaneously, and legacy keys allow related records to be connected across both.

The new system has a dedicated field for old ID numbers. During the migration process, the legacy key of each record is copied to this field. Conversely, any new records that were not present in the previous system will lack a legacy key, leaving the field empty. These unoccupied fields can detract from the database design and waste storage.

Concatenated keys

Sometimes, there is no single field that exists in both the legacy and target systems to guarantee a unique match for every record, like a customer ID. This makes direct comparison difficult. 

One solution is to use concatenated keys, where you choose fields to combine like date of birth, partial surname, and address fragment. You create this combined key in both systems, allowing you to compare records based on their matching concatenated keys. While there may be some duplicates, it is a more focused comparison than just checking record counts. If there are too many false matches, you can refine your field selection and try again.
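A minimal sketch of the approach, assuming the fields mentioned above (the field choices and data are illustrative):

```python
# Sketch: build the same concatenated key in both systems to match records
# when no shared unique ID exists.

def concat_key(record):
    return "|".join([
        record["dob"],                   # date of birth
        record["surname"][:4].upper(),   # partial surname
        record["postcode"],              # address fragment
    ])

legacy = [{"dob": "1985-03-12", "surname": "Robertson", "postcode": "SW1A"}]
target = [{"dob": "1985-03-12", "surname": "ROBERTSON", "postcode": "SW1A"}]

legacy_keys = {concat_key(r) for r in legacy}
target_keys = {concat_key(r) for r in target}
print(legacy_keys == target_keys)  # True: records match on the combined key
```

Normalising the fragments (here, upper-casing the surname) matters, since the two systems may store the same value in different formats.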

User Journey Testing

Let's explore how user journey testing works with an example.   

To ensure a smooth transition to a new online store platform, a user performs a comprehensive journey test. The test entails multiple steps, including creating a new customer account, searching for a particular product, adding it to the cart, navigating through the checkout process, inputting shipping and payment details, and completing the purchase. Screenshots are taken at each step to document the process.

Once the store's data has been moved to the new platform, the user verifies that their account details and order history have been successfully transferred. Additional screenshots are taken for later comparison.

Hire an offshore testing team to save up to 40% on costs and deliver an error-free product while you dedicate your efforts to development and other crucial processes. Seek our expert assistance by contacting us.

Test Execution

During a data migration, if a test fails, it means there is a fault in the migrated data. Each problem is carefully investigated to find the root cause, which could be the original source data, mapping rules used during transfer, or a bug in the new system. Once the cause is identified, the problem is assessed based on its impact on the business. Critical faults are fixed urgently with an estimated date for the fix. Less critical faults may be allocated to upcoming system releases.

Sometimes, there can be disagreements about whether a problem is a true error or a misinterpretation of the mapping requirements. In such cases, a positive working relationship between the internal team and external parties involved in the migration is crucial for effective problem handling.

Cosmetic faults

Cosmetic faults refer to discrepancies or errors in the migrated data that do not directly impede the core functionality of the system or cause major business disruptions. Examples include slightly incorrect formatting in a report. 

Cosmetic issues are often given lower priority compared to other issues.

User Acceptance Failures

When users encounter issues or discrepancies that prevent them from completing tasks or don't match the expected behavior, these are flagged as user acceptance failures.

If the failure is due to a flaw in the new system's design or implementation, it's logged into the system's fault tracking system. This initiates fixing it within the core development team.

If the failure is related to the way the data migration process was designed or executed (for example, errors in moving archived data or incorrect mappings), a data migration analyst will initially examine the issue. They confirm its connection to the migration process and gather information before involving the wider technical team.

Mapping Faults

Mapping faults typically occur when there is a mismatch between the defined mapping rules (how data is supposed to be transferred between systems) and the actual result in the migrated data.

The first step is to consult the mapping team. They meticulously review the documented mapping rules for the specific data element related to the fault, confirming that the rules were followed accurately.

If the mapping team confirms the rules are implemented correctly, their next task is to identify the stage in the Extract, Transform, Load process where the error is happening. 

Process Faults Within the Migration

Unlike data-specific errors, process faults refer to problems within the overall steps and procedures used to move data from the legacy system to the new one.

These faults can cause delays, unexpected disconnects in automated processes, incorrect sequencing of tasks, or errors from manual steps.

Performance Issues

Performance issues during data migration focus on the system's ability to handle the expected workload efficiently. These issues do not involve incorrect data, but the speed and smoothness of the system's operations.  

Here are some common examples of performance problems:

Slow system response times

Users may experience delays when interacting with the migrated system.

Network bottlenecks causing delays in data transfer

The network infrastructure may not have sufficient bandwidth to handle the volume of data being moved.

Insufficient hardware resources leading to sluggish performance

The servers or other hardware powering the system may be underpowered, impacting performance.

Root Cause Analysis

Correctly identifying the root cause ensures the problem gets to the right team for the fastest possible fix. 

Fixing a problem in isolation is not enough. To truly improve reliability, you need to understand why failures are happening repeatedly.

It's important to differentiate between repeated failures caused by flaws in the process itself, such as lack of checks or insufficient guidance, and individual mistakes. Both need to be addressed, but in different ways.

Without uncovering the true source of problems, any fixes implemented will only serve as temporary solutions, and the errors are likely to persist. This can undermine data integrity and trust in the overall project.

During a cutover (the transition to the new system), data problems can arise in three areas:

  1. Load Failure. The data failed to transfer into the target system at all.
  2. Load Success, Production Failure. The data is loaded, but breaks when used in the new system.
  3. Actually a Migration Issue. The problem is due to an error during the migration process itself.

Issues within the Extract, Transform, Load Process

  1. Bad Data Sources. Choosing unreliable or incorrect sources for the migration introduces problems right from the start.
  2. Bugs. Errors in the code that handle extracting, modifying, or inserting the data will cause issues.
  3. Misunderstood Requirements. Even if the code is perfectly written, it won't yield the intended outcome if the ETL was designed with an incorrect understanding of requirements.

Test Success

The data testing phase is considered successful when all tests pass or when the remaining issues are adequately addressed. Evidence of this success is presented to stakeholders in charge of the overall business transformation project. If the stakeholders are satisfied, they give their approval for the data readiness aspect. This officially signals the go-ahead to proceed with the complete data migration process.

We provide professional cloud migration services for a smooth transition. Our focus is on data integrity, and we perform thorough testing to reduce downtime. Whether you choose Azure Cloud Migration services or AWS Cloud migration and modernization services, we make your move easier and faster. Get in touch with us to start your effortless cloud transition with the guidance of our experts.
