
Data Migration: A summary of my posts!

Over the last 3+ months, I have outlined my thoughts on data migration. Data migration is a key element of any successful large-scale implementation of business systems (ERP, PLM, CRM, BPM, etc.).

Data migration is often ignored and not enough attention is paid to this portion of the overall project.

The methodology I have outlined in these posts can be applied to a number of projects including data consolidation, server consolidation, migration from one application to another and the list goes on.

The key is to pay attention to the business needs and to make them successful by taking care of the technology and project management issues!

Good Luck.


1. Data Migration: Challenges & Joy!
http://improveprocess.blogspot.com/2009/07/data-migration-challenges-joy.html

2. Data Migration: Challenges & Joy! Part 2.
http://improveprocess.blogspot.com/2009/07/data-migration-challenges-joy-part-2.html

3. Rules For Successful Data Migration
http://improveprocess.blogspot.com/2009/07/rules-for-successful-data-migration.html

4. Phases of Data migration
http://improveprocess.blogspot.com/2009/07/phases-of-data-migration.html

5. Phases of Data migration
http://improveprocess.blogspot.com/2009/07/phases-of-data-migration.html

6. Phases of Data migration: Analysis
http://improveprocess.blogspot.com/2009/07/phases-of-data-migration-analysis.html

7. Phases of Data migration: Design
http://improveprocess.blogspot.com/2009/07/phases-of-data-migration-design.html

8. Phases of Data migration: Test
http://improveprocess.blogspot.com/2009/08/phases-of-data-migration-test.html

9. Phases of Data migration: Validation
http://improveprocess.blogspot.com/2009/09/phases-of-data-migration-validation.html

10. Data migration: Risks
http://improveprocess.blogspot.com/2009/09/data-migration-risks.html

11. Tips for Successful Data Migration.
http://improveprocess.blogspot.com/2009/10/tips-for-successful-dat-migration.html

"Disclaimer: The views and opinions expressed here are my own only and in no way represent the views, positions or opinions - expressed or implied - of my employer (present and past) "
"Please post your comments - Swati Ranganathan"

Tips for Successful Data Migration.

  1. Maintain your sense of humor.
  2. Expect delays and/or road blocks.
  3. Run the data migration using traditional project principles.
  4. Secure alignment and approval from steering committee and stakeholders as changes occur.
  5. Appreciate the inter-dependencies.
  6. Understand your business process, data, system and application landscape. (Devil is in the details)
  7. Get the right software tools.
  8. Use the right resources.
  9. Plan for down time.
  10. Perform at least two dry runs (Wash Rinse Repeat)
  11. Develop risk mitigation plan.
  12. Communicate your plan early and socialize with all impacted users.


Data migration: Risks

Risk management should be part of every project, especially a large-scale migration project. Risks related to schedule, cost and scope should be clearly identified and documented, and a plan must be put together to mitigate them.

In my experience, risks specific to data migration come from a number of contributors, such as

(1) data quality

(2) extraction, transformation and load tool complexity

(3) performance of systems (extraction and target system persistence)

(4) coordination of project activities as related to testing, UAT, system preparation

(5) resource constraints

Data quality

Without a thorough understanding of the business rules and logic that govern how information is stored in the source and target systems, any migration project is at risk.

Even if you understand the rules from a system perspective, try to get a handle on how the business uses this information. In some cases, even after a thorough mapping exercise, gaps can arise due to data issues. In such an event, additional cycles will be required to re-map and re-run the extraction, transformation and load routines. Don’t underestimate the work effort required to cleanse the data. In some cases, the risk mitigation steps involve leaving the source system as-is and including additional logic in the extraction and transformation routines.

Extraction, transformation and load tool complexity

As business systems add more functionality and modules, data migration becomes an extremely challenging exercise. In the past, migration of metadata could be accomplished easily by writing directly into the database(s), using simple commands and flat files at the database layer. As business systems have matured, the database layer often no longer contains the hierarchy and relationship information; these are stored and managed within the application layer. Extractions and loads have to use the application APIs, and in some cases these are not well suited to mass extraction and loading. Schedules are often adversely impacted by added development cycles to find or build routines and software that support the extraction, transformation and load.

I would highly recommend contacting the software vendor to request reference information, then talking to other customers/users that have been through a similar exercise and learning from their experience. Benchmark exercises like this will help in setting a timeline for such activities. Risk management could involve adding resources to drive closure of risk items. If additional cycles from the software vendor are required, engage your vendor management team and actively manage your contacts at the software vendor. Escalations and jeopardies should be raised so that executives within your organization and at the software vendor are informed about potential impacts and risks.

Performance of systems

Poor performance of business systems can lead to delays and to an inability to complete a load in the allotted time. We have talked about the complexity of the extraction, transformation and load routines. With that added complexity, it is no surprise that most systems are not set up properly to support mass migrations. In fact, migrations require different settings within applications/business systems, and these can be totally different from what is required for day-to-day operations.

Thorough testing with multiple wash-rinse-repeat cycles is essential to clearly identify performance issues. In some cases, because of an n-tiered architecture, you might run into bottlenecks at the application front ends or at the database layer. Ensure your best performance people are monitoring the load process and can tune the systems properly.

From a database perspective (Oracle or others), you might need to increase the allocated memory, update indices frequently, and maybe even turn off archiving. Another option is to turn off search indexing within the business systems to speed up the load, and to run the indexing as a post go-live activity.

The key is to be able to load all the data within the allotted time frame. Sometimes risk mitigation might involve loading over a prolonged period, adding front ends, adding memory and CPUs, or tuning the database/business system.
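A back-of-the-envelope check of whether a load fits its window can be sketched as below. The record count, throughput and window are hypothetical figures; in practice they would come from your own dry runs. The assumption that throughput scales linearly with parallel load streams is optimistic and should itself be tested.

```python
import math

def estimated_load_hours(record_count, records_per_second):
    """Hours needed to load record_count rows at a measured throughput."""
    if records_per_second <= 0:
        raise ValueError("throughput must be positive")
    return record_count / records_per_second / 3600

def parallel_streams_needed(record_count, records_per_second, window_hours):
    """Parallel load streams needed to fit the window.

    Assumes throughput scales linearly with the number of streams,
    which is optimistic; confirm it in a dry run.
    """
    hours = estimated_load_hours(record_count, records_per_second)
    return max(1, math.ceil(hours / window_hours))

# Hypothetical figures: 5 million records, 50 records/s measured in a
# dry run, and a 12-hour cutover window.
hours = estimated_load_hours(5_000_000, 50)
streams = parallel_streams_needed(5_000_000, 50, 12)
print(f"single stream: {hours:.1f} h, streams needed: {streams}")
```

If the single-stream estimate exceeds the window, that is the signal to consider the mitigations listed above before go-live rather than during it.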

Project coordination

Most often, data migration projects are part of a larger project to implement a new business system. These large-scale implementations involve a number of activities which must be completed, e.g. system design, configuration, process design, data mapping, unit tests and user acceptance tests. Data migration tests need to be performed so that all elements of testing can be done. This requires a high degree of coordination, and any slips in code development, design configuration or testing cycles will impact the overall schedule and the time allocated for data migration.

Risk mitigation could involve procuring a separate environment to conduct all data migration tests. In addition, ensure that all project activities which could impact migration are clearly identified and have the necessary dependencies captured within the project plan. Having a project review similar to scrum reviews can be beneficial.

Resource constraints

In an ideal world, all projects would be funded and resourced well with proper resource leveling. In the real world this rarely happens, and the same is true of most of the migration projects I have managed. Typically the roles of project manager, business analyst, data expert, process owner and code developer are merged together, and these resources have to wear multiple hats throughout the implementation of the business system. To mitigate risks to schedule and costs, clearly identify the resource requirements and secure the resources ahead of time. The later you start, the lower your chances of securing resources. Contract resources for executing the extractions and loads can be a viable option, but this requires clear documentation of the process and thorough knowledge of what needs to be done. If you have the right resources and a plan, migrations can be handled with a set of contract resources.

Phases of Data migration: Validation

This is a key phase to ensure success of the overall migration effort! Validation sounds easy but how do you go about it?

I have always set up three-tiered validation criteria:

• First level validation is to ensure that all records, from a count perspective, made it into the target system.
• Second level validation is to ensure that the key data elements made it into the target system. The key data elements I look for are data that are essential to the running of the business. Another condition I have used is to select data that is part of the object’s properties and is a required field within the application layer. For example, a part’s unit of measure and cost are among the object’s key attributes in ERP/PLM systems.
• Third level validation is to check the metadata for attributes and keywords that are nice to have but don’t have a significant impact on the business if they were incorrectly loaded.
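The three tiers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: records are plain dictionaries already extracted from both systems, and the field names (`part`, `uom`, `cost`, `keywords`) are hypothetical.

```python
def validate_load(source_rows, target_rows, key, critical_fields, optional_fields):
    """Compare source and target records in three tiers.

    source_rows / target_rows: lists of dicts; key: name of the unique id field.
    Returns a dict with the findings for each tier.
    """
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}

    # Tier 1: record counts and records that never arrived.
    tier1 = {
        "source_count": len(source),
        "target_count": len(target),
        "missing_in_target": sorted(set(source) - set(target)),
    }

    # Tiers 2 and 3: field-by-field comparison for records present on both sides.
    def mismatches(fields):
        found = []
        for record_id in sorted(set(source) & set(target)):
            for field in fields:
                if source[record_id].get(field) != target[record_id].get(field):
                    found.append((record_id, field))
        return found

    return {
        "tier1": tier1,
        "tier2_critical": mismatches(critical_fields),
        "tier3_optional": mismatches(optional_fields),
    }

# Hypothetical part records.
source_parts = [
    {"part": "P1", "uom": "EA", "cost": 10.0, "keywords": "bolt"},
    {"part": "P2", "uom": "KG", "cost": 2.5, "keywords": "resin"},
    {"part": "P3", "uom": "EA", "cost": 7.0, "keywords": "nut"},
]
target_parts = [
    {"part": "P1", "uom": "EA", "cost": 10.0, "keywords": "bolts"},
    {"part": "P2", "uom": "EA", "cost": 2.5, "keywords": "resin"},
]
report = validate_load(source_parts, target_parts, "part",
                       critical_fields=["uom", "cost"],
                       optional_fields=["keywords"])
print(report)
```

Keeping the tiers separate in the report makes it easier to agree with the business on which findings block go-live and which can be fixed afterwards.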

I approach validation from three perspectives with a focus on

• Ensuring proper extraction from source system
• Ensuring proper data transformation into flat files (CSV, XML etc.)
• Ensuring proper load into target system

If you analyze failures or errors, you have to start by reviewing what you extracted. If you have any doubts at this layer, then the success of the overall project will be in doubt. If the data is properly extracted but incorrectly populated into a flat file, then your load will not be successful. If you have been successful in extraction and transformation and have properly tested the loads then you should have the data loaded successfully into the target system.

One key issue always pops up when it comes to validation: WHO is responsible? In most cases, the business owners point to the IT guys and the IT guys point to the business owners. To be successful, engage both teams: work through the development of validation and success criteria, identify what can be automated, and validate the automation routines so that both sides are satisfied.

Automating this activity is almost a must in most cases. When you are faced with gigabytes or even terabytes of data, manual lookups will not be sufficient. You could get fancy and dabble with sampling theory. In my opinion, go for 100% checking by putting technology to work!
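One common way to make 100% checking affordable is to compare a fingerprint of every record instead of eyeballing fields. The sketch below uses a SHA-256 digest from Python's standard `hashlib`; it is an illustration, not the routine used on the projects described here, and the record layout and field names are hypothetical.

```python
import hashlib

def row_fingerprint(row, fields):
    """Stable SHA-256 digest of the selected fields of one record."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(str(row.get(f, "")) for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def changed_records(source_rows, target_rows, key, fields):
    """Keys of source records whose fingerprint differs or is absent in the target."""
    target_prints = {r[key]: row_fingerprint(r, fields) for r in target_rows}
    return [r[key] for r in source_rows
            if target_prints.get(r[key]) != row_fingerprint(r, fields)]

# Hypothetical customer records.
src = [{"id": 1, "name": "Acme", "city": "Austin"},
       {"id": 2, "name": "Globex", "city": "Boston"},
       {"id": 3, "name": "Initech", "city": "Dallas"}]
tgt = [{"id": 1, "name": "Acme", "city": "Austin"},
       {"id": 2, "name": "Globex", "city": "Bostn"}]
suspect = changed_records(src, tgt, "id", ["name", "city"])
print(suspect)
```

Only the keys that come back need a closer look, so every record is checked while the manual effort stays proportional to the number of problems.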

Phases of Data migration: Test

Before any data is moved, it is important that some portion of the migration plan be tested and validated. Results of the migration test determine whether modification of the migration plan—for example, time line, migration tools used, amount of data migrated per session, and so on—is required. For example, if testing shows that allowable downtime would probably be exceeded, the migration methodology needs to be revisited.

Testing is a key element of the overall lifecycle of the project. Why? It

(1) proves the capability of migrating the data with no impact to the enterprise

(2) provides a good understanding of risks

(3) provides the ability to accurately define the sequence for the final migration including timelines

Start this process by reviewing the outputs from the analysis and planning stages. Engage a cross functional team and assess the capabilities, knowledge and expertise of the team. If the team has the skills, knowledge and expertise and has gone through a similar exercise in the past, then this phase is greatly simplified. Follow the same sequence as identified earlier.

In most cases, you might have to start from scratch. In that case, start by outlining the dependencies for data extraction and sequence them in the proper order. Once the data is extracted from the source system, validate it against the target system to ensure data integrity, and then proceed with a sample load.
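Sequencing by dependency is a topological sort, and Python's standard library (`graphlib`, Python 3.9 or later) can do it directly. The object types and their dependencies below are hypothetical; substitute whatever your own analysis produced.

```python
from graphlib import TopologicalSorter

# Hypothetical object types; each maps to the types that must exist before it.
dependencies = {
    "units_of_measure": set(),
    "suppliers": set(),
    "parts": {"units_of_measure", "suppliers"},
    "bills_of_material": {"parts"},
    "documents": {"parts"},
}

def migration_sequence(deps):
    """Return an order in which every object type follows its prerequisites.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())

sequence = migration_sequence(dependencies)
print(sequence)
```

A circular dependency raises an error here, which is useful in itself: it usually means two object types have to be loaded in more than one pass.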

This is the time to engage your IT administrators to the fullest. Review application, database, network and infrastructure architecture and optimize from a data migration perspective.

For example, most data migration projects involve persistence in databases, but the load needs to be kicked off from the application layer, following a syntax and methodology that involve some structure in flat files (txt, xml etc.). In this case, the application and databases need to be tuned to identify the right parameters which will enable you to accomplish the load in a timely manner.

If you have distributed or federated systems, you will need the assistance of network and infrastructure administrators/architects to tune the network and servers for optimum performance, e.g. by removing bottleneck processes or establishing a dedicated network.

This phase doesn’t conclude with a successful migration and a proper timeline for go-live. It should also include testing of post go-live activities. In most cases, search engines will need to be updated so that the indices are refreshed with the newly loaded data. There are a number of such post go-live activities which are usually overlooked, causing performance nightmares upon start-up after data migration.

Keep at it; you can almost see the light at the end of the tunnel. The next phase is validation. Remember the mantra “I love data”

Phases of Data migration: Design

Now the fun begins! :)

In previous posts, we focused on the phases of data migration and looked at the planning and analysis stages in detail. In this post, we will focus on the design of the tools and methodology for data migration.

What tools do we need? At a minimum, we will need three tools, for

* Extraction & transformation
* Load
* Validation

The first item to complete is the data mapping from source to target system. Based on the scope identified in the planning stage, have your business analysts dig deep, clearly identify all data sources and formats/rules, and map them to the target system. This mapping exercise needs to be comprehensive. For example, simply recording that the customer name stored in column 1 of table XYZ in the source system maps to column 2 of table ABC in the target system is not enough. Why? To be thorough, you need to consider the maximum length of characters in your source system and check whether your target system can handle it.

If your target system cannot handle this length, you have some choices (extend the character size, truncate some customer names, etc.). Regardless of the decision, it will have to be captured in your planning assumptions, socialized and then documented in your design document.
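A length check like this is easy to automate once the mapping exists. The sketch below assumes the source rows are available as dictionaries and that the mapping records a maximum length for each target column; the column names and the limit of 20 characters are hypothetical.

```python
def find_overlength_values(rows, mapping):
    """List values that would not fit in the target column.

    mapping: {source_column: (target_column, target_max_length)}
    Returns tuples of (row_index, source_column, target_column, actual_length).
    """
    problems = []
    for index, row in enumerate(rows):
        for source_col, (target_col, max_len) in mapping.items():
            value = row.get(source_col) or ""
            if len(value) > max_len:
                problems.append((index, source_col, target_col, len(value)))
    return problems

# Hypothetical mapping: source customer_name -> target CUST_NM, 20 characters max.
mapping = {"customer_name": ("CUST_NM", 20)}
customers = [
    {"customer_name": "Acme Ltd"},
    {"customer_name": "International Consolidated Widgets Incorporated"},
]
problems = find_overlength_values(customers, mapping)
print(problems)
```

The count of offending rows is what feeds the decision between extending the target column and truncating in the transformation step.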

Word of caution: Don’t make assumptions! Don’t trust anyone, including your data experts! Have your experts prove their knowledge of the data and their expertise. This seemingly paranoid approach is justified: in a lot of cases “knowing the data” means knowing the database construct with limited knowledge of how the business uses the data, or, the other way around, knowing the business logic but having no clue about persistence at the database layer.

Once your data mapping is done, you will have a clear idea of any data cleanup that needs to be done. Drive your team to think in terms of numbers and impact to schedule. The numbers should indicate (1) the number of records that have to be fixed/cleaned, (2) the resources required to perform this cleanup and (3) the effort/cost impact in hours/dollars.
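The arithmetic behind those three numbers is simple enough to keep in a small helper so that it can be rerun whenever the record count changes. All figures below (records, minutes per record, team size, hourly rate) are made up for illustration.

```python
def cleanup_estimate(records_to_fix, minutes_per_record, people, hourly_rate):
    """Rough effort, elapsed time and cost for a manual data cleanup."""
    effort_hours = records_to_fix * minutes_per_record / 60
    return {
        "effort_hours": effort_hours,
        "elapsed_hours": effort_hours / people,   # assumes work splits evenly
        "cost": effort_hours * hourly_rate,
    }

# Hypothetical: 12,000 records, 3 minutes each, 4 people, 60 dollars per hour.
estimate = cleanup_estimate(records_to_fix=12_000, minutes_per_record=3,
                            people=4, hourly_rate=60)
print(estimate)
```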

If data cleanup is required, consider this as a separate body of work. Don’t scavenge from the design activity to fund or resource the data cleanup. If you need additional help, make sure you raise awareness/jeopardy and ask for it.

Continuing on with design, ETL (extract, transform and load) utilities can be challenging, but you don’t have to reinvent the wheel. Think outside the box: most organizations that focus on KPIs and performance metrics will have well-defined tools to extract data. These tools can form the skeleton for your extraction tools. Review them carefully and document what else you might need. If you are starting from scratch, try to get the experts to document the application layer and database layer. Once you have an idea of how the source system manages and stores data, you can work towards extraction. With the advent of J2EE based systems, pure database-centric extractions have become exceedingly difficult. The reason is that the logic and rules for extraction exist in the application layer and may not be clearly available in the database. In some cases, I have spent hours trying to create relationship diagrams using tools, and finally managed to construct them only by combining application logic with some creative thinking (hacks) against the databases.

This activity of designing and building tools to support data migration is no different from the equivalent standard SDLC tasks.

The key requirements for these tools are scalability and performance. Your tools should be able to perform the tasks in a timely manner and be able to handle the data set identified in the planning stage. While going through design, build and test iterations, I would highly recommend keeping a spreadsheet to record performance.
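The performance spreadsheet can be filled in automatically by wrapping each step with a timer and appending a row to a CSV file. This is a minimal sketch: `fake_extract` is a hypothetical stand-in for a real extraction routine, and the log is written to an in-memory buffer so the example is self-contained.

```python
import csv
import io
import time

def timed_run(writer, iteration, step, func, *args):
    """Run one migration step, then log its duration and record count."""
    start = time.perf_counter()
    record_count = func(*args)   # the step returns how many records it handled
    seconds = time.perf_counter() - start
    rate = record_count / seconds if seconds > 0 else 0.0
    writer.writerow([iteration, step, record_count, f"{seconds:.3f}", f"{rate:.1f}"])
    return record_count

def fake_extract(n):
    """Stand-in for a real extraction routine (hypothetical)."""
    return sum(1 for _ in range(n))

# Swap the buffer for open("performance_log.csv", "a", newline="") in real use.
log = io.StringIO()
writer = csv.writer(log)
writer.writerow(["iteration", "step", "records", "seconds", "records_per_second"])
timed_run(writer, 1, "extract", fake_extract, 100_000)
print(log.getvalue())
```

Logging the iteration number alongside each step makes it easy to see whether tuning between dry runs is actually helping.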

Thoroughly analyze the sequence for data extraction, load and validation. This is the sequence you need to solidify in the test phase and execute during go-live. A common mistake is to identify this sequence only as the last step. Instead of honing the strategy, data migration leads continue to spin their wheels.

Phases of Data migration: Analysis

This is probably the trickiest part of the project. It all depends upon how well you know your data!

In an earlier post, I had outlined the characteristics of good data. Focus on the following items in your analysis.

* Completeness,
* Conformity,
* Consistency,
* Accuracy,
* Duplicates, and
* Integrity

Based on your scope, try to identify all the sources of data (business systems like ERP, CRM, MES, PLM, document management systems etc.). Once you have the sources identified, assess the quality of the data.
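A first pass at data quality can be automated with a small profiling routine. The sketch below covers only three of the characteristics listed above (completeness, conformity and duplicates); consistency, accuracy and integrity generally need comparisons across systems and with the business. The field names and the cost pattern are hypothetical.

```python
import re
from collections import Counter

def profile(rows, key, required, patterns):
    """Count basic quality issues in a list of dict records.

    required: fields that must be non-empty (completeness).
    patterns: {field: regex} that non-empty values must fully match (conformity).
    """
    key_counts = Counter(row.get(key) for row in rows)
    return {
        "incomplete": sum(
            1 for row in rows if any(not row.get(f) for f in required)),
        "nonconforming": sum(
            1 for row in rows
            if any(row.get(f) and not re.fullmatch(p, row[f])
                   for f, p in patterns.items())),
        "duplicate_keys": sorted(k for k, n in key_counts.items() if n > 1),
    }

# Hypothetical part records; cost is expected to look like 12.34.
parts = [
    {"part": "P-001", "uom": "EA", "cost": "10.00"},
    {"part": "P-002", "uom": "", "cost": "2.50"},
    {"part": "P-002", "uom": "KG", "cost": "abc"},
]
summary = profile(parts, "part", required=["uom", "cost"],
                  patterns={"cost": r"\d+\.\d{2}"})
print(summary)
```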

If you have business analysts on your team, put them to work to

(1) document business rules and logic in source and target systems

(2) document gaps in data conformity to existing business rules and business processes

(3) document duplicates and plan of action to address duplicates

(4) document data integrity gaps and plan of action

(5) document plan to map data from source to target systems

Your business users should be assigned to

(1) assess completeness of data

(2) assess impact of data mapping

(3) assess data quality issues reported by business analysts

Based on the two bodies of work, you will have a good idea as to whether you need to clean your data prior to the move! In my experience, you will have some tough choices to make: Clean source data or design your extraction utilities to account for the cleansing actions!

I would recommend focusing on this aspect. “Garbage In is Garbage Out”

Phases of Data migration: Planning

In an earlier post, I had outlined the key elements of a data migration plan. Now let us delve into the details.

(1) Scope: Clearly identify what data needs to be migrated from the source system into the target system. The insights of subject matter experts are invaluable; use them well! Work with your team to identify what needs to be migrated, and base the decision on how the business process will be executed in the new system and what information is essential to ensure success, effectiveness and efficiency. In every migration project I have managed, this has been a crucial building block. At the end of this stage, you should have an estimate of the data assets (number of records, metadata, documents etc.).

(2) Criteria for a successful migration: This deliverable is closely tied to the scope of the data migration. Migrating the data into the target system without any errors doesn’t mean the project was successful; focus on what your customers will need! The criteria should start with error-free migration and also include the impact to customers if additional cycles are involved. This will take a few iterations: engage your subject matter experts and end users, and work with them to refine the scope and define the criteria for success.

(3) Decision making authority of each of the data domains: In my experience, ownership of data is a tricky item. Different groups may own pieces of information that make up usable data to the enterprise! First identify the data elements and then start asking who the owner or steward of this data is; this will lead you to the decision making authority. Ensure that this person is always engaged, and communicate often and well! Without their buy-in, scope and criteria for success will be meaningless.

(4) What data needs to be migrated: In almost all cases, not all of the data will be required. Consider the lifecycle of the information; in most cases the data can be divided into three buckets

a. Currently relevant to business

b. Historic information for archival / research purposes

c. Newly created, which may not have any significant value yet

Consider these buckets well; in most cases you might need to slice the data and truly identify all facets of usage. Dig deep and clearly identify usage patterns. This will indicate the value of your dataset and will provide insight into your final decision on partial or complete migration.
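If the source system records creation and last-use dates, a first cut of the three buckets can be computed rather than debated. The thresholds below (30 days for "new", two years without use for "historic") are illustrative assumptions only; the real cut-offs are a business decision.

```python
from datetime import date, timedelta

def bucket(created, last_used, today, new_days=30, historic_days=730):
    """Classify a record by age and last use (thresholds are illustrative)."""
    if today - created <= timedelta(days=new_days):
        return "new"
    if today - last_used > timedelta(days=historic_days):
        return "historic"
    return "current"

today = date(2009, 9, 1)
print(bucket(date(2009, 8, 20), date(2009, 8, 25), today))
print(bucket(date(2005, 1, 1), date(2006, 1, 1), today))
print(bucket(date(2005, 1, 1), date(2009, 6, 1), today))
```

Counting records per bucket gives the usage-based view of the dataset that the partial-versus-complete decision needs.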

(5) Timing: This is a key element of your plan. You need to clearly identify the timeline for cutover into production. Work backwards from the go-live date and identify slots for key tasks like development of the extraction, loading and validation utilities, test runs (at least 2-3) and stakeholder acceptance tests. You may not have a clear idea of the time needed to load into the target system; work with your software vendor or benchmark with companies/individuals who have worked on similar systems to assess the time required for the final migration.
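Working backwards from go-live is simple date arithmetic. The sketch below schedules tasks back to back in calendar days, ignoring weekends, holidays and overlap; the task names, durations and go-live date are hypothetical.

```python
from datetime import date, timedelta

def backward_plan(go_live, tasks):
    """Schedule tasks back to back, ending at go-live.

    tasks: list of (name, duration_in_days) in execution order.
    Returns a list of (name, start_date, end_date), also in execution order.
    """
    plan = []
    end = go_live
    for name, days in reversed(tasks):
        start = end - timedelta(days=days)
        plan.append((name, start, end))
        end = start
    return list(reversed(plan))

# Hypothetical tasks and durations (calendar days).
tasks = [
    ("build extraction, load and validation utilities", 30),
    ("dry run 1", 7),
    ("dry run 2", 7),
    ("stakeholder acceptance test", 10),
    ("final migration", 3),
]
plan = backward_plan(date(2010, 3, 1), tasks)
for name, start, end in plan:
    print(f"{start} to {end}: {name}")
```

If the first start date has already passed, the plan is telling you that either the scope or the go-live date has to move.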

(6) Requirements: focus on resource, system (hardware/software) and budgetary requirements. Gather as much information as possible from benchmarks and vendors to clearly identify what you might need to ensure success of this project. Start communicating the requirements to program sponsors, your resources and stakeholders, get alignment and then go secure the requirements.

(7) Roles and responsibilities: clearly define the roles and responsibilities for each and every one on your team. At a minimum, your team should include

a. Project manager or lead

b. Business user

c. Business analysts

d. Data architect and/or data administrator

(8) Assumptions: This is a key element of any project. As you define the scope and success criteria, ensure that your assumptions are well documented and communicated. Ensure your stakeholders, program sponsors and decision making authority are aligned. If you ever have to change any of the underlying assumptions, secure alignment again.

(9) Risks and Risk Mitigation: Every migration project is fraught with risk. If you remember, in an earlier post I outlined the success rate of projects, and it paints a dismal picture. For every risk, ensure you have a risk mitigation plan. Document the risk, communicate your plans and secure alignment before proceeding.

Phases of Data migration

Just like SDLC, I would like to propose distinct phases and stage gates that have to be met in order to complete data migration.

(1) Strategy

(2) Analysis

(3) Design (& build)

(4) Test

(5) Validation

In this post, let us focus on the strategy or planning phase. The first step is to put together a plan. The data migration plan should describe, in detail,

(1) Scope of the project

(2) Criteria for a successful migration

(3) Who is the decision making authority of each of the data domains (should be from the business organization)

(4) What data needs to be migrated (full or a subset)

(5) Timing

(6) Requirements from hardware, software perspective

(7) Resource requirements

(8) Budget requirements

(9) Roles and responsibilities

(10) Assumptions

(11) Risks

(12) Risk mitigation / Contingency

The plan also sets expectations up front with customers about the complexity of the migration, timing, and potential issues and concerns. Remember this is the first cut at the plan; it can be refined as you move along your project. If you make any changes, remember to socialize them with your governance and accountability system.

Rules For Successful Data Migration

(1) Clearly define the scope of the project

(2) Actively refine the scope of the project through targeted profiling and auditing

(3) Profile and audit all source data in scope before writing mapping specifications

(4) Define a realistic project budget and timeline, based on knowledge of data issues

(5) Secure sign off on each stage from a senior business representative

(6) Prioritize with a top down, target driven approach

(7) Aim to volume test all data in scope as early as possible at unit level

(8) Allow time for volume testing and resolving issues

(9) Segment the project into manageable, incremental chunks

(10) Keep total focus on the business objectives and cost/benefits throughout.

Data Migration: Challenges & Joy! Part 2.

Forewarned is forearmed

Before we jump into details about migration methodologies, let us step back and understand some of the challenges ahead of us. Whether you are migrating from a legacy system or a spreadsheet/database, you have to understand everything about your “SOURCE” system.

Common misconceptions about migration

• Data migration is an IT job function.
• We know our data!
• Data migration is one of the last steps taken before you go live with the new system.
• We can always change it after we go live.
• Acquiring legacy data is easy.
• Existing data will fit the new system.
• Existing data is of good quality.
• Existing data and business processes are understood.
• Documentation exists on data rules and formatting.
• We don’t need tools or special skills.
• Migration is a separate activity.

What you as the lead of the migration effort need to do is work with your team to dismiss these misconceptions.

Data migration is not a matter of copying data! In order to be successful at migrating data, one has to thoroughly understand
(1) Why is the data being migrated, and what is its significance and value to the organization?
(2) What data is being migrated?
(3) Where does the data reside currently?
(4) What are the rules for the data in the “Source” system and how is the target system set up?
(5) Who are the experts for each of the data domains?
(Hint: do not limit yourself to an IT resource)
(6) What is the success rate of migrating into this application?
(7) Who else in your industry segment has been through this activity?
(Hint: Do a benchmark)
(8) What do you need from a hardware/software perspective to support the data migration?
(Hint: Benchmarking and reference calls will provide this information)

Now that you are armed with some answers which will highlight what you need to focus on, we can step back and think through our methodology.

Don't lose your humor, remember your mantra “I love data migration”

Data Migration: Challenges & Joy!


What is data migration?


Data migration is the process of transferring data between storage types, formats, or computer systems.

Over the last decade, I have led multiple data migration efforts and found each one of these projects challenging and enriching. I keep swearing that I will not take up another, yet I always do. In a series of posts, I am going to share my experiences so that you may benefit from my lessons learnt and insights.

Common Data Migration Scenarios: when would you have a need to migrate data and create a project around this activity?
1. Mergers and acquisitions
2. Legacy system modernization
3. Enterprise application consolidation, implementation, or upgrade, such as an SAP ERP or CRM implementation
4. Master data management implementation
5. Business process outsourcing

Why Data Migration Projects Are Risky: If you have been assigned as the lead for data migration, be aware of the heavy odds against you! Do your research and do it well.
Based on reference documents I have researched over the years (Gartner, Standish Group Study), I have found that
1. 84 percent of data migration projects fail to meet expectations
2. 37 percent experience budget overruns
3. 67 percent are not delivered on time

Why Data Migration Projects Fail: In earlier posts, I have outlined the importance of data management and the pitfalls of bad data management. These contribute to the overall success or failure of a large implementation (and its data migration). Here are some reasons to which failures of data migration have been attributed.

1. Lack of methodology
2. Unrealistic scope
3. Improper understanding and use of tools
4. Inattention to data quality
5. Lack of experience

While data migration is essential to the success of the implementation of a new application or business system, its role in the project is often overlooked and underestimated. The common assumption is that tools exist to extract and move the data into the target application, or that data migration is something a consulting partner will handle. Often project teams tasked with data migration focus solely on the timely conversion and movement of data between systems. But data migration is not just about moving the data into the new application; it’s about making the data work once it is within the new application. This means that the data in the new application must be accurate and trustworthy for business users to readily transition from their legacy applications and adopt this new application.

In upcoming posts, I will outline the methodology I have used and why I have chosen this approach. Most of my team members would fondly remember my mantras of “Wash, Rinse & Repeat” and “I love data migration”. :)