Phases of Data Migration: Design

Now the fun begins! :)

In previous posts, we covered the phases of data migration and looked at the planning and analysis stages in detail. In this post, we will focus on the design of the tools and methodology for data migration.

What tools do we need? At a minimum, we will need tools for three tasks:

* Extraction & transformation
* Load
* Validation

The first item to complete is the data mapping from the source system to the target system. Based on the scope identified in the planning stage, have your business analysts dig deep, clearly identify all data sources, formats and rules, and map them to the target system. This mapping exercise needs to be comprehensive. For example, simply noting that the customer name stored in table XYZ, column 1 of the source system maps to table ABC, column 2 of the target system is not enough. Why? To be thorough, you also need to consider the maximum length of that field in your source system and check whether your target system can handle it.

If your target system cannot handle this length, you have some choices (extend the column size, truncate some customer names, and so on). Whatever you decide, the decision has to be captured in your planning assumptions, socialized, and then documented in your design document.
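The length check described above is easy to automate once the mapping is captured as data. Here is a minimal sketch in Python; the `FieldMapping` structure, the column names and the lengths are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    """One row of a source-to-target mapping (all names are hypothetical)."""
    source_table: str
    source_column: str
    source_max_len: int  # longest value the source column can hold
    target_table: str
    target_column: str
    target_max_len: int  # longest value the target column can hold


def find_length_conflicts(mappings):
    """Return the mappings whose source values may not fit in the target."""
    return [m for m in mappings if m.source_max_len > m.target_max_len]


# Illustrative mapping rows; the lengths are made up for the example.
mappings = [
    FieldMapping("XYZ", "customer_name", 120, "ABC", "cust_name", 80),
    FieldMapping("XYZ", "customer_id", 10, "ABC", "cust_id", 10),
]

for m in find_length_conflicts(mappings):
    print(f"{m.source_table}.{m.source_column} ({m.source_max_len}) "
          f"may not fit {m.target_table}.{m.target_column} ({m.target_max_len})")
```

Every conflict this reports is a decision (extend or truncate) that belongs in the planning assumptions and the design document.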

Word of caution: Don’t make assumptions! Don’t trust anyone, including your data experts! Have your experts prove their knowledge of the data and their expertise. This seemingly paranoid approach is justified: in a lot of cases, “knowing the data” means knowing the database structure with only limited knowledge of how the business uses the data, or, the other way around, knowing the business logic but having no clue how the data is persisted at the database layer.

Once your data mapping is done, you will have a clear idea of any data cleanup that needs to be done. Drive your team to think in terms of numbers and impact to schedule. The numbers should indicate (1) the number of records that have to be fixed or cleaned, (2) the resources required to perform this cleanup, and (3) the effort and cost impact in hours and dollars.
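To make "think in terms of numbers" concrete, here is a rough sketch of the arithmetic in Python. The function name, the per-record effort and the hourly rate are assumptions for illustration, not figures from any real project.

```python
def estimate_cleanup(records_to_fix, minutes_per_record, hourly_rate, people):
    """Rough cleanup estimate: total effort, cost, and elapsed time.

    All inputs are assumptions you would agree with your team.
    """
    effort_hours = records_to_fix * minutes_per_record / 60
    return {
        "records": records_to_fix,
        "effort_hours": effort_hours,
        "cost": effort_hours * hourly_rate,
        # Elapsed time assumes the work splits evenly across people.
        "elapsed_hours": effort_hours / people,
    }


# Hypothetical inputs: 1,200 records, 5 minutes each, $50/hour, 4 people.
estimate = estimate_cleanup(1200, 5, 50, 4)
print(estimate)
```

Even a crude estimate like this gives the team a number to challenge, which is more useful than a vague "some cleanup is needed".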

If data cleanup is required, treat it as a separate body of work. Don’t raid the design activity to fund or resource the data cleanup. If you need additional help, raise awareness (or a jeopardy) early and ask for it.

Continuing with design: ETL (extract, transform and load) utilities can be challenging to build, but you don’t have to reinvent the wheel. Think outside the box. Most organizations that focus on KPIs and performance metrics already have well-defined tools to extract data, and these can form the skeleton of your extraction tools. Review them carefully and document what else you might need.

If you are starting from scratch, try to get the experts to document the application layer and the database layer. Once you have an idea of how the source system manages and stores data, you can work towards extraction. With the advent of J2EE-based systems, purely database-centric extractions have become exceedingly difficult, because the logic and rules needed for extraction live in the application layer and may not be clearly visible in the database. In some cases, I have spent hours trying to create relationship diagrams with tools, and finally managed to construct them only by combining application logic with some creative thinking (hacks) against the databases.

Designing and building tools to support data migration is no different from the equivalent standard SDLC tasks.

The key requirements for these tools are scalability and performance. Your tools should be able to perform their tasks in a timely manner and handle the data set identified in the planning stage. While going through design, build and test iterations, I highly recommend keeping a spreadsheet to record performance.
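If your tools are scripted, the performance spreadsheet can be fed automatically. The following is a minimal Python sketch that times one step and writes a CSV row that any spreadsheet can open; `fake_transform` and the column names are placeholders, not part of any real tool.

```python
import csv
import io
import time


def timed_run(label, func, records):
    """Run func(records) and return one row of performance data."""
    start = time.perf_counter()
    func(records)
    elapsed = time.perf_counter() - start
    return {"step": label, "records": len(records), "seconds": round(elapsed, 4)}


def write_log(rows, out):
    """Write the performance rows as CSV to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=["step", "records", "seconds"])
    writer.writeheader()
    writer.writerows(rows)


def fake_transform(records):
    """Placeholder for a real transformation step."""
    return [r.strip().upper() for r in records]


sample = [" alice ", " bob "] * 1000
rows = [timed_run("transform", fake_transform, sample)]

buffer = io.StringIO()  # swap for open("perf_log.csv", "a", newline="") in practice
write_log(rows, buffer)
print(buffer.getvalue())
```

Recording the record count next to the elapsed time lets you see how run time grows with volume across iterations.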

Thoroughly analyze the sequence for data extraction, load and validation. This is the sequence you need to solidify in the test phase and execute during go-live. A common mistake is to leave identifying this sequence until the last step; instead of honing the strategy, data migration leads then continue to spin their wheels.
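One way to pin the sequence down early is to write it as an ordered list that a small runner executes, stopping at the first failure. This is only a sketch in Python; the step names and the lambdas stand in for whatever your real extraction, load and validation tools do.

```python
def run_sequence(steps):
    """Run (name, callable) pairs in order and stop at the first failure.

    Returns (completed_step_names, failed_step_name_or_None).
    """
    completed = []
    for name, step in steps:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None


# Placeholder steps; real ones would call your migration tools
# and return True on success.
steps = [
    ("extract", lambda: True),
    ("load", lambda: True),
    ("validate", lambda: True),
]

completed, failed = run_sequence(steps)
print("completed:", completed, "failed at:", failed)
```

Rehearsing the same list in every test cycle means the go-live run is the sequence you have already practiced.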

"Disclaimer: The views and opinions expressed here are my own only and in no way represent the views, positions or opinions - expressed or implied - of my employer (present and past) "
"Please post your comments - Swati Ranganathan"

Process Mapping: Part 2

In an earlier post, I outlined how to construct a simple flowchart. In this post, let us see how we can add more detail and enhance this flowchart using different techniques such as SIPOC (Six Sigma), value stream mapping (Lean) and swim lanes.

Swim lane diagrams: These are an extension of flowcharts and include additional details such as
  • Actors: The people, groups, teams, etc. who perform the steps identified within the process.
  • Phases: These might reflect the phases of the project, different areas of the project, or any secondary set of key elements that the process flow needs to traverse to complete successfully.
Sometimes these are also called cross-functional flowcharts. This method allows you to quickly plot and follow processes and, in particular, the handoffs between processes, departments and teams, which makes inefficiencies easy to spot.
For example, if you look at the image shown, the flowchart is extended with additional information (the phases are listed in the columns and the actors are listed in the rows).



SIPOC diagrams: These are an extension of flowcharts and clearly indicate the suppliers, inputs, process, outputs and customers. The process itself can be shown as a simple flowchart or, in some cases, with swim lanes. A SIPOC depiction of the process is very useful because it clearly identifies who supplies the information, who generates the output, what the deliverables are, and which organization is impacted by the process.



Value stream mapping: This is an extension of flowcharts and swim lanes that also identifies the management and information systems supporting the basic process. The methodology started as part of Lean manufacturing, with an emphasis on reducing waste within manufacturing, but it is valuable across all business processes. The primary goal of this depiction is to clearly separate value-added from non-value-added tasks in order to minimize waste. It outlines all tasks and the cycle time for each of them, so that the reviewer or management can identify how the process can be improved.
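The numbers behind a value stream map are simple to work with once each task has a cycle time and a value-added flag. Here is a small Python sketch; the task names and cycle times are invented for illustration.

```python
# Hypothetical tasks: (name, cycle time in minutes, value-added?)
tasks = [
    ("Receive order", 10, True),
    ("Wait for approval", 120, False),
    ("Pick and pack", 30, True),
    ("Rework label", 15, False),
]


def summarize(tasks):
    """Return total cycle time, value-added time, and the value-added ratio."""
    total = sum(minutes for _, minutes, _ in tasks)
    value_added = sum(minutes for _, minutes, is_va in tasks if is_va)
    ratio = value_added / total if total else 0.0
    return total, value_added, ratio


total, value_added, ratio = summarize(tasks)
print(f"total: {total} min, value-added: {value_added} min, ratio: {ratio:.0%}")
```

Sorting the non-value-added tasks by cycle time is a quick way to see where the largest waste sits.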



Process Mapping: Part 1

Process mapping refers to the activities involved in defining a business process (who does what, when, where, how and why). Once this is done, there should be little uncertainty about the requirements of each internal business process.

It is a visual depiction of the sequence of events that occur from the beginning to the end of the business process.

Process maps can be constructed using a number of different techniques, such as flowcharts and swim lane diagrams. Six Sigma methodologies recommend a SIPOC approach. SIPOC stands for suppliers, inputs, process, outputs and customers, and it clearly identifies the handoffs, the inputs and the outputs.

Let us start with the simplest approach, a flowchart.

How does one create a process map with a flowchart?

Step 1: Determine the Boundaries: Identify the start and end of the process. Observe the process in action (if possible).

Step 2: List the Steps in the process. My recommendation is to start with Post-it notes and write one step on each.

Step 3: Sequence the Steps: Now place the Post-it notes in the order in which the steps occur.

Step 4: Draw Appropriate Symbols

  Start with the basic symbols:
  1. Ovals show the input that starts the process or the output at the end of the process.
  2. Boxes or rectangles show a task or activity performed in the process.
  3. Arrows show the direction of process flow.
  4. Diamonds show points in the process where a yes/no question is asked or a decision is required.
  Usually there is only one arrow out of an activity box. If there is more than one arrow, you may need a decision diamond.
Step 5: Finalize the Flowchart
  1. Check for completeness and duplication/redundancy
  2. Ask if this process is being run the way it should be.
  3. Do we have a consensus?
Here is an example of a flowchart.
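Separately from the drawing, the rule from Step 4 (an activity box with more than one outgoing arrow probably needs a decision diamond) can be checked mechanically if the flowchart is written down as data. This is a minimal Python sketch; the process, the node names and the dictionary layout are all hypothetical.

```python
# Hypothetical flowchart: node name -> (symbol, list of next nodes).
flowchart = {
    "Order received": ("oval", ["Check stock"]),
    "Check stock": ("box", ["In stock?"]),
    "In stock?": ("diamond", ["Ship order", "Back-order"]),
    "Ship order": ("box", ["Done"]),
    # Deliberate mistake: two arrows leave an activity box.
    "Back-order": ("box", ["Done", "Notify customer"]),
    "Notify customer": ("box", ["Done"]),
    "Done": ("oval", []),
}


def boxes_needing_decision(chart):
    """Return activity boxes that have more than one outgoing arrow."""
    return [
        name
        for name, (symbol, next_nodes) in chart.items()
        if symbol == "box" and len(next_nodes) > 1
    ]


print(boxes_needing_decision(flowchart))
```

A check like this is a handy aid during Step 5, when you review the chart for completeness.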



Architect!

Over the last few years, there has been an increased focus on architecture within the IT domain. If you spend some time searching for people with “architect” in their title, you will find a multitude of titles like

(1) Enterprise architect

(2) Data architect

(3) Business process architect

(4) Application architect

(5) Solution architect

(6) Infrastructure architect

(7) Security architect

(8) Technology architect

Are all of these roles the same? Or are they complementary?

I like the description posted on Wikipedia for the function/responsibility of different architects.

Enterprise architects are like city planners, providing the roadmaps and regulations that a city uses to manage its growth and provide services to its citizens. In this analogy, it is possible to differentiate the role of the system architect, who plans one or more buildings; software architects, who are responsible for something analogous to the HVAC (Heating, Ventilation and Air Conditioning) within the building; network architects, who are responsible for something like the plumbing within the building, and the water and sewer infrastructure between buildings or parts of a city. The enterprise architect however, like a city planner, both frames the city-wide design, and choreographs other activities into the larger plan.

These roles are different and serve different purposes, but the roles and their functions are complementary. To implement business systems and the underlying infrastructure, specific architecture domains need to be covered (Business, Data, Applications, Technology).

The key to success is engaging all the different facets of architecture to create a technology roadmap and strategy by which your organization can start from the current state and finish in the end state so as to achieve corporate objectives and goals.