The mapping rules that govern the data migration must be specified and maintained over the project lifespan as the target system evolves and is eventually developed, configured, and taken live.
This specification, mapping and implementation activity can be undertaken in many ways.
It is common to see people use some kind of textual specification, such as Word, Excel, or something similar.
In some respects, using familiar and available tools makes good sense, but there are significant drawbacks to such an approach.
What's the challenge?
The business people producing these specifications must have deep knowledge of the data involved and be knowledgeable about the business processes supported by these data. They are not technical people or code developers and should not be!
In many cases, these business experts are more comfortable with, or have no option other than, producing the specifications in desktop applications such as Word or Excel.
The challenge: turning these specifications into code
Based on the written specifications, executables (code) must be constructed (implemented) and maintained.
It is common to task a team of developers with writing a system or set of scripts or programs to execute the data migration according to the specifications.
Historically, some third-generation programming language was used, but recently, it has become increasingly common to leverage the capabilities of one of the many powerful ETL tools out there.
The tools commonly used for the specifications lack support for cross-referencing or validating the mapping specification's consistency. As a result, these documents or spreadsheets grow larger and larger, becoming hundreds of pages and thousands of Excel rows, making them unmanageable.
Inconsistencies and incompleteness can, at best, lead to problematic communication and wasted effort when implementing the specifications.
Worse, deficiencies of this kind may remain undiscovered, or be discovered late in the lifetime of the data migration project, making them complicated, expensive and risky to correct.
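To make the missing cross-referencing concrete, here is a minimal sketch of the kind of consistency check that a Word document or spreadsheet cannot run on itself. It is only an illustration under stated assumptions: the `MappingRule` structure, the `find_problems` function and the field names are all hypothetical and are not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """One row of a mapping specification (hypothetical structure)."""
    target_field: str
    source_field: str


def find_problems(rules, source_fields, target_fields):
    """Return a list of consistency problems found in the mapping rules."""
    problems = []
    mapped_targets = set()
    for rule in rules:
        # A rule must only refer to fields that actually exist.
        if rule.source_field not in source_fields:
            problems.append(
                f"{rule.target_field}: unknown source field '{rule.source_field}'"
            )
        if rule.target_field not in target_fields:
            problems.append(f"{rule.target_field}: unknown target field")
        # Two rules writing to the same target field contradict each other.
        if rule.target_field in mapped_targets:
            problems.append(f"{rule.target_field}: mapped more than once")
        mapped_targets.add(rule.target_field)
    # Every target field needs a rule, otherwise the specification is incomplete.
    for field in sorted(set(target_fields) - mapped_targets):
        problems.append(f"{field}: no mapping rule")
    return problems


# Made-up example data: one duplicate mapping, one unknown source field,
# and one target field without any rule.
rules = [
    MappingRule("CustomerName", "CUST_NM"),
    MappingRule("CustomerName", "NAME"),
    MappingRule("Email", "EMAIL_ADDR"),
]
problems = find_problems(
    rules,
    source_fields={"CUST_NM", "NAME"},
    target_fields={"CustomerName", "Email", "Phone"},
)
for problem in problems:
    print(problem)
```

Checks like these are trivial once the mapping is held as structured data, and practically impossible to run reliably against hundreds of pages of free text.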
Regardless of the toolset chosen for the specification and implementation, these tasks remain a crucial part of any data migration project.
- The success of any data migration is directly linked to the quality of the specifications and how they are translated into the executables performing the actual data migration.
- The specifications often grow in size and complexity, and it is not easy to maintain validity and coherence in the specifications themselves.
- In many cases, it proves impossible over time to keep the specifications and the executables fully consistent. As the migration project progresses and deadlines approach, there is a severe risk that specifications are left behind while the implementations are modified directly.
- Very often, the effort invested in the specifications is rendered valueless as the specifications drift further and further behind the reality implemented in the executables.
Finally, it is a common consequence that, as the complexity of the implementation grows, the team in charge of the code and implementation rejects the requirements and/or wishes put forward by the business experts in charge of the mapping.
What can be done differently?
At Hopp, we approach this challenge differently.
A key component of Hopp is Studio. This is a dedicated, multiuser productivity application that provides a comprehensive and consistent interface to produce the specifications and the actual mapping.
Using Studio, a team of business experts can collaborate to produce the specifications and mapping that are validated for consistency and then ready to be generated automatically into code. No developers needed!
Studio contains rich cross-reference and cross-validation functionality to ensure a very high degree of enduring consistency and coherence in the mapping.
Most importantly, Studio facilitates and enforces extremely structured mapping. In fact, the specifications are so structured that they serve as input to a code generator that generates the migration executables.
The structured specifications and the generated code are a core benefit of using Hopp.
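The general idea of a specification structured enough to drive a code generator can be sketched in a few lines. This is not how Studio or the Hopp generator works internally; it is a toy illustration, and the `TEMPLATES` table, the `generate_mapper` function, the transform names and the field names are all assumptions made up for the example.

```python
# Hypothetical transform names and the Python expression each one expands to.
TEMPLATES = {
    "copy": "source[{field!r}]",
    "trim": "source[{field!r}].strip()",
    "upper": "source[{field!r}].upper()",
}


def generate_mapper(function_name, rules):
    """Return Python source for a function mapping one source row to a target row."""
    lines = [f"def {function_name}(source):", "    target = {}"]
    for target_field, source_field, transform in rules:
        # Reject anything the generator does not know how to translate.
        if transform not in TEMPLATES:
            raise ValueError(f"unknown transform: {transform}")
        expression = TEMPLATES[transform].format(field=source_field)
        lines.append(f"    target[{target_field!r}] = {expression}")
    lines.append("    return target")
    return "\n".join(lines) + "\n"


# The structured specification: (target field, source field, transform).
spec = [
    ("CustomerName", "CUST_NM", "trim"),
    ("Country", "CTRY", "upper"),
]
generated_source = generate_mapper("map_customer", spec)
print(generated_source)

# Load the generated function and run it on one sample row.
namespace = {}
exec(generated_source, namespace)
print(namespace["map_customer"]({"CUST_NM": "  Ada ", "CTRY": "dk"}))
```

Because the executable is derived entirely from `spec`, changing a rule and regenerating is the only way to change behaviour, so the specification and the code cannot drift apart.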
The benefits are significant:
- Complete guarantee of consistency between the mapping and the actual executable at all times
- Ever-increasing quality of the mapping over time as it is reused from project to project
- The specification is the implementation. The quality of the generated code is such that it is never manually retouched. There is no longer any implementation process or indeed any implementation team
- The business expert is empowered and in charge. Modifications to the specifications made by the business experts directly modify the generated executable
Knowing that data migration specifications can be consistently created and turned into code is news to the industry. Too often, practitioners struggle with communicating the specifications and managing endless loops, slow progress and poor results.
Structured specifications - a cornerstone of Hopp
In the end, what we want to achieve is the following:
- A guarantee of unwavering consistency between the mapping and the actual executable components is maintained at all times.
- Over time, the mapping quality improves as it is reused across multiple projects.
- Specifications as implementation: the generated code is of such quality that manual adjustments are unnecessary. This obviates the need for a separate implementation process or team.
- Business experts have direct control and authority over the specifications, as their modifications directly impact the generated executables.