These purposes are closely related to the characteristics of the source system and the
target system. Regarding the infrastructure, it is common to move from an on-premises
database to a cloud database to improve data accessibility or management efficiency. If the
database is already online, organizations might want to migrate to a new software solution
that offers new features.
Since a database is a collection of structured data, migration means not only moving the
data but also loading it into a new structure after a transformation process that encompasses
data cleaning and data mapping (matching fields from different data models).
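As an illustration of data cleaning and data mapping, the following minimal Python sketch
renames the fields of a source record to match a target data model and tidies the values on
the way. The field names, the FIELD_MAP dictionary and the helper functions are hypothetical
and only serve as an example; a real mapping depends on the two data models involved.

```python
# Hypothetical field mapping: source field name -> target field name.
FIELD_MAP = {
    "cust_name": "full_name",
    "tel": "phone",
    "mail": "email",
}


def clean(value):
    """Trim surrounding whitespace and turn empty strings into None."""
    if isinstance(value, str):
        value = value.strip()
        return value or None
    return value


def map_record(source_record, field_map=FIELD_MAP):
    """Return a record shaped for the target model.

    Source fields that are not listed in the mapping are dropped.
    """
    return {
        target: clean(source_record.get(source))
        for source, target in field_map.items()
    }


# Example record from an imaginary legacy system.
legacy_row = {
    "cust_name": "  Ada Lovelace ",
    "tel": "",
    "mail": "ada@example.org",
    "fax": "n/a",
}
target_row = map_record(legacy_row)
```

Keeping the mapping in a single dictionary makes it easy to review with the people who know
both data models before any data is moved.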
This transformation can take place at the source, before the extraction (preparation), or
after the extraction (transformation). Once the process is finished, the data should be fully
located in the target database with its quality preserved; in other words, with the features
that make it usable intact. The source database is often discarded after the migration
process, once the migrated data has been fully verified.
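One possible way to verify migrated data before discarding the source is to compare row
counts and a content checksum for each table. The sketch below uses two in-memory SQLite
databases as stand-ins for the source and the target; the table name, the columns and the
table_fingerprint helper are assumptions made for the example. It only applies directly when
both tables hold the same columns; if the structure changed during migration, the source rows
would need to be mapped first.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key):
    """Return (row_count, SHA-256 digest) for a table, with rows ordered by key.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY {key}").fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()


# Stand-ins for the real source and target databases.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")

# Same rows, inserted in a different order.
source.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])
target.executemany("INSERT INTO people VALUES (?, ?)", [(2, "Alan"), (1, "Ada")])

migration_verified = (
    table_fingerprint(source, "people", "id")
    == table_fingerprint(target, "people", "id")
)
```

Ordering the rows by a key before hashing makes the comparison independent of the order in
which the records were loaded.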
Undertaking this process manually can be time-consuming and tedious. Furthermore, it
can lead to mistakes such as record duplication or data entry inconsistencies, especially
when working with large datasets collected over years. To the extent possible, the
extraction, transformation and loading operations should be automated to ensure
consistency. Automated data migration sometimes requires advanced technical skills,
but depending on the magnitude of the project, basic transformation strategies may be
enough to reformat the data and have it ready for the load.
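A basic automated extraction, transformation and loading sequence can be sketched with the
Python standard library alone. In the example below, the CSV content, the people table and
the three function names are assumptions chosen for illustration: the source is a small CSV
export, the transformation normalizes the values and skips duplicated identifiers, and the
target is an in-memory SQLite database.

```python
import csv
import io
import sqlite3

# Imaginary export from the source system; note the duplicated id 1.
SOURCE_CSV = """id,name,email
1, Ada Lovelace ,ADA@EXAMPLE.ORG
2,Alan Turing,alan@example.org
1,Ada Lovelace,ada@example.org
"""


def extract(text):
    """Read the CSV export into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Normalize values and keep only the first record for each id."""
    seen = set()
    records = []
    for row in rows:
        record_id = int(row["id"])
        if record_id in seen:
            continue  # duplicated record: skip it
        seen.add(record_id)
        records.append(
            (record_id, row["name"].strip(), row["email"].strip().lower())
        )
    return records


def load(conn, records):
    """Create the target table if needed and insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people "
        "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT id, name, email FROM people ORDER BY id").fetchall()
```

Because every record goes through the same transform function, the same cleaning rules are
applied consistently, which is the main advantage over manual editing.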
The steps and tasks shown below apply to both large and small amounts of data, but
big migration projects might require additional planning and control measures. If the data
comes from different sources or the process is carried out sequentially, it is very important
to keep track of every step to avoid duplicating work or losing data.
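One simple way to keep track of the steps is a migration log that records which step has
already been completed for each source, so that a rerun does not repeat it. The following
Python sketch stores the log in a JSON file; the MigrationLog class, the run_step function
and the source and step names are hypothetical, and a larger project may prefer a dedicated
tracking table or tool.

```python
import json
import tempfile
from pathlib import Path


class MigrationLog:
    """Record completed (source, step) pairs in a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.done = set()
        if self.path.exists():
            self.done = {tuple(item) for item in json.loads(self.path.read_text())}

    def is_done(self, source, step):
        return (source, step) in self.done

    def mark_done(self, source, step):
        self.done.add((source, step))
        self.path.write_text(json.dumps(sorted(self.done)))


def run_step(log, source, step, action):
    """Run action once per (source, step); return True if it actually ran."""
    if log.is_done(source, step):
        return False
    action()
    log.mark_done(source, step)
    return True


# Usage: the second run reloads the log from disk and skips the finished step.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "migration_log.json"
    calls = []
    first = run_step(
        MigrationLog(log_path), "crm_export", "extract", lambda: calls.append("extract")
    )
    second = run_step(
        MigrationLog(log_path), "crm_export", "extract", lambda: calls.append("extract")
    )
```

The step is marked as done only after the action returns, so a step that fails with an error
is attempted again on the next run.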
It is also worth noting that the complexity of the project is not only determined by
the amount of data to migrate. As we will see later, the data quality and the differences
between the source and the target databases can add new levels of complexity to the
process.
Integrating data from different sources is often considered independent of data
migration, but both processes share the extraction, transformation and loading procedure,
so they can be undertaken together. The more different the datasets to combine are, the more