Easing the Pain of Data Migration
To, many lessor executives, and CIO’s in particularly, what we are about to discuss equates, possibly, to their worst nightmare – a data migration. However, in the experience of the authors such fears can be confronted, resolved and a successful migration achieved by the application of some forward thinking and methodical planning.
The reality in today’s data intensive world is that most software implementation projects require that a data migration be carried out. In our sector, whether you are deploying a new Contract Management System, or in technical speak an Online Transactional Processing (OLTP) system, or a data warehouse type solution, again in technical speak an Online Analytical Processing (OLAP) system, you will most likely need to perform a data migration.
To the uninitiated one could be forgiven for thinking that any two systems that maintain the same sort of data must be doing very similar things and, therefore, should map from one to another with ease – regrettably this is hardly ever the case. Legacy systems have historically proven to be far too lenient with respect to enforcing integrity at the atomic level of data. Fields that should be populated from a list of valid values tend to require that a value be entered, but seldom validate the actual value entered by the user.
For the reasons above, and more besides, there exists an obvious and absolute need for a sound methodological approach, by which an organization should approach a data migration project. Although there is no way to avoid unpleasant surprises, one can certainly be prepared to confront and resolve them.
The Phases of a Data Migration
The data migration project plan should be broken down into phases, which mirror the overall project. The phases can be defined as Analysis, Data Mapping, Build, Transfer Reconciliation, Revision, Migration Reconciliation and Maintenance.
The analysis phase usually starts with defining the overall project. As has already been observed data transfer projects usually come about through the introduction of a new system, hence the project manager will have many competing priorities, not the least of which will be the detailed requirements that the new system must address. This is where a critical mistake can occur. With focus directed to the new system the project team pays little attention, or completely overlooks the data transfer requirements. In reality most data transfer projects in either the OLTP or OLAP space require a separate detailed project plan, those that simply include a one line entry on the GANTT chart “Data Migration” are kidding themselves.
The second element in the analysis phase is to create the data migration team. The challenge this usually presents is that the same people are expected to undertake a role in the system implementation project as well as be part of the migration team. This leads to an immediate concern over priorities. As there is clearly no right or wrong answer to this conundrum the project manager needs to make the right priority call as appropriate to the business need.
One further point worthy of note here is that the project team needs to be dedicated to the project. The most successful projects are those that have active executive sponsorship and 100% dedication from the project team. It should not be expected that the project team members continue to carry out their “day jobs” as well as perform as part of the project team. Clear communication of project criticality, along with priorities and responsibilities of the project team members, means that the justified gaps created in the day-to-day business need to be backfilled by other colleagues who understand why they are being asked to step up to fill the void.
Getting back on topic of data migration, as an executive of a leasing asset finance business you will know that your business is supported by more than one system. In most data migration projects we find ourselves faced with a myriad of different systems, or I should say more accurately data sources, in which the mix of data to be transferred resides. It is not unusual for business critical data to be stored around the organization in separate desk-top databases, spreadsheets and sometimes in flat text files. Accuracy, visibility, control and security issues immediately are brought home to the business, so inevitably these additional data sources need to be included in the data migration.
The next important part of the analysis phase involves getting acquainted with the actual data you plan to migrate. Remember, at this point of the project, you have only the minimum of information on which you are developing the data migration plan. To get a better sense of the task it is important to get some visibility as to the quality and quantity of the data to be migrated. Obtaining an extract of data gives a good guide as the extent of likely data cleansing and the quantity of data.
So far we have not considered what the options might be for actually migrating the data. In the case of an OLAP data migration, and particularly where there is a need for regular data refreshes, in for example refreshing the content in a data warehouse, there is little option but to follow an automated migration strategy. However, in the case of an OLTP migration, which is typically a one- time exercise the option of a manual approach might be worth considering. Generally the quantity of data would be the guiding indicator. In our industry a data transfer of more than 1,000 agreements points towards an automated transfer, with perhaps a small number of anomalies being handled manually. Less than 1,000 it is typically a risk cost assessment and as a result you may find that the overall cost of electronic migration is prohibitive relative to the quantity of data that needs to be transferred.
A further piece of advice that is worthy of note is to consider phasing the migration project. Large volume data migrations can take several months to complete and therefore carry a significant cost. This often leads to anxiety and risk pressures building up, as well the ongoing expense for seemingly no visible result. A suggestion to overcome this would be to consider building your data migration plan based on a number of smaller discreet transfers. This would demonstrate progress, justify cost and hugely de-risk the “all-in-one” transfer approach. You may also find this significantly relieves the anxiety that your CIO feels!
2. Data Mapping
After you have decided upon the legacy data sources, the next step is to define what data needs to be migrated; then those data elements are mapped into the fields of the new system. This involves going through the list of data elements from each and every data source, and deciding whether to migrate and compare to those in the new system.
During this exercise a further extract sample of the data will be necessary to help explain any terminology anomalies. As an outcome of this exercise you will be able to determine the extent to which the data is missing or needs to be cleansed.
The mapping phase is not intended to thoroughly identify the transformation rules by which historical data will be migrated to the new system; rather, it is essentially the act of making a checklist of the data elements that must be migrated.
The build phase is where the experience, knowledge and skill of your migration partner, NetSol in this case, really comes to the fore. A partner with industry experience from numerous and varied data migrations, will deliver project reassurance with a blend of business and technical expertise that is needed to accomplish this activity.
To provide a quick start to this process NetSol has built a data migration transformation tool into which source data and appropriate mappings temporarily reside. This is in essence a staging database, which is used to verify the data mappings and supports the cleansing and transformation of data to be carried out, all without impact to either the donor or recipient databases.
Whilst a transformation tool is not the only method available, it is by far the most efficient and effective. The alternative tends to be spreadsheets, which invariably fail when you need to map one source data element to one or more target elements. Spreadsheets typically prevent more than one person from making modifications at one time, resulting in a great deal of unnecessary administrative overheads. Spreadsheets are two-dimensional tools, and mapping is without question multi-dimensional.
The staging post data repository becomes particularly critical when undertaking a data warehouse type migration, where the need to transform and rationalize data into new tables will be a requirement to provide rapid access to online analytics in real time.
4. Transfer / Reconciliation
The next two phases of transfer / reconciliation and revise are iterative.
Typically this would involve a number of sub-set data and full data extracts of client month-end data pulled through the staging database with scripts to transform and map as necessary to the recipient database.
This phase would answer the following sort of questions;
- How many records did we expect this script to create?
- Did the correct number of records get created?
- Has the data been loaded into the correct fields?
- Has the data been formatted correctly?
- Does it reconcile?
Invariably this phase highlights to users the historical data elements that must be migrated that were not apparent to them during the analysis sessions.
The truth is that data mapping is not easy to get right first time. Most people do not realize that a data element is missing until they realize it is not there anymore. For this reason the revise phase must follow to correct these anomalies and the transfer / reconciliation phase must be run through again.
It is important to reach this point in the migration project as soon as possible, as it will highlight any issues that may require that the recipient systems\’ data model be tweaked to accommodate any specific client derived requirements.
The reconciliation aspect of this phase is critical to a successful migration. In conjunction with your migration partner you must define the criteria and processes needed to verify a successful reconciliation. Remember this is your business, your data and ultimately your responsibility to sign -off on the migration reconciliation results.
The extent of the revise phase is entirely dependent upon the outcome of the transfer / reconciliation phase previously described. In theory it should be possible to skip this phase altogether with a fully reconciled and successfully scripted transfer and reconciliation phase, but regrettably this is never the case, anomalies do always arise.
Interestingly, experience has shown that the value of the reconciliation imbalance, or perhaps the fact that a large number of agreements with data mismatches are discovered, is not a good guide as to how much extra effort the migration will require to complete. A large portfolio with a large reconciliation imbalance is likely to be corrected, or the majority of the difference found, by just one or two adjustments, and typically these are easy to track, whereas a lower value reconciliation imbalance, or a smaller number of agreements with a data mismatch is harder to find. The message here is don\’t be alarmed by the outcome of the first reconciliation. Two or three iterations is quite the norm.
Once the transfer / reconciliation and revise phases have been satisfactorily completed it is time to implement.
6. Migration / Reconciliation
As a first step in the migration / reconciliation phase it is recommended that a full migration \”dry-run\” is carried out. The dry run must take place on the final code set of the new system. This is critical as any further changes to the underlying data structure and / or rules can have an impact on the data transfer results. The migrated data should be thoroughly user tested and reconciled. Reconciliation at this stage is critical. Your migration partner will carry out a detailed reconciliation adhering to your criteria and processes, but again the final sign-off responsibility rest with you and approval from your auditors.
Following a successfully reconciled dry run the full migration can take place. Typically this would take place on month-end data and be timed to occur over a weekend. This activity quite obviously will involve some downtime of the business systems, so it is important to pick a time that minimizes business disruption and that also gives time to resolve any issues that may arise.
The maintenance phase only really applies in the case of performing a data warehouse (OLAP) system migration.
In the contract management system (OLTP) system migration, you are working in a one-time migration environment. The goal to successfully migrate the legacy data into the new system is complete hence the need to maintain the migration scripts is removed.
In the data warehouse (OLAP) migration, you will most likely be reloading the new system at timely intervals. As new information is recorded in your contract management system, you will want to transfer it to your data warehouse (OLAP). Script performance is a critical issue in data warehouse migrations, as new data is added to the data migration refreshed scripts will need to be reviewed to ensure peak performance is maintained.
To conclude, data migration is, more often than not, a necessary step. The key to success is to prepare early, monitor continuously and ensure that between your project team and your migration partner you have a mutually understood approach with defined responsibilities, and a dedicated team to carry it out. Unfortunately there is no magic formula other than preparation, planning, attention to detail and hard work, and ensuring that you have a partner suitably experienced and skilled in this business critical exercise.
By Tony Langford and GrahamTarrant, NetSol Technologies