The international standard on records management, ISO 15489, defines migration as the:
“Act of moving records from one system to another, while maintaining the records’ authenticity, integrity, reliability, and usability.”
But migration is much broader than just records. Any time a legacy system is decommissioned, it should be reviewed to determine:
- What information is stored there?
- Does this information still need to be actively accessible?
After asking these questions, you’ll be able to determine whether, and when, migration is needed.
Migration Issues to Consider
There are fundamental considerations to be assessed before and during a migration:
- File Formats – Are all the PDFs in a searchable format (OCRed)?
- System Dependencies – Integrations, platform layers, etc.
- Data Quality – Garbage in is still garbage out.
- Process Issues – How something was done 15 years ago may not work today, or may now be done better.
File Format Issues
Any migration involving documents will run into issues migrating certain file formats. These issues include:
- Proprietary Formats: The older these files are, the more likely they are to have issues. Special care should be taken to confirm that they need to be migrated and, afterwards, that they were migrated successfully. A recent example is HEIC, the default photo format on Apple devices, which many older viewers and repositories cannot open.
- Complex Formats: These are similar to proprietary file formats; in fact, most proprietary formats are also complex.
- Linked Formats: Sometimes called ‘compound documents’, these include engineering drawings with linked external reference drawings, spreadsheets or PDFs that link to each other, and any other types of linked files. They often run into issues with the paths to the linked documents, which can stop resolving once the files are moved.
- Unknown Formats: Most repositories can store any kind of digital data, but if you run into unknown formats, there is a question as to whether to even bother migrating them. Anyone remember WK2 files?
- Duplicate Files: Depending on which research you believe, organizations store an average of anywhere from 3 to 10 or more copies of every document. This is often because the legacy system in question is a set of network file shares, and when users finally locate a long-sought resource, they download it to their own personal file stores. One of the outcomes of the migration, and frankly one of the first ones, should be to identify those duplicate files, determine which one should be the official copy, and mark the other copies for deletion.
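As a rough illustration of that first step, here is a minimal Python sketch that groups files by a hash of their contents, so byte-identical copies surface together. The function names and the choice of SHA-256 are assumptions made for this example, not features of any particular migration tool. It only catches exact copies; near-duplicates, such as a re-saved PDF, need other techniques.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; keep groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Deciding which copy in each group is the official one is still a human or rules-based call; the hash only tells you which files are identical.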
System Dependencies
When migrating, there are a number of technology-level considerations to take into account.
- Integrated Systems: Systems that have been tightly integrated may have unanticipated dependencies around data structures and output files.
- Reports: Integrated systems may also be generating reports that rely on data from every system involved. Even single-system repositories may be generating reports in a unique way that can’t be reproduced identically in the new system.
- Process Dependencies: Work processes, both manual and automated, may rely on how a system works, what its reports contain, how its metadata is structured, and more. This is exacerbated in automated processes where workflow rules are very specific as to the conditions for a particular task or step.
- Bandwidth and Processing Issues: This doesn’t seem like it would be a huge issue until you start transmitting 10 TB of data across the network, or from one data center to another thousands of miles away. Moving to a cloud option may require shipping physical media to a location closer to the destination.
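To put a number on the bandwidth point, here is a back-of-the-envelope estimate in Python. The function name and the 0.7 efficiency factor are assumptions for illustration; the factor stands in for protocol overhead and competing traffic, so substitute your own measurements.

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the days needed to move data_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal link speed actually achieved.
    """
    bits = data_tb * 1e12 * 8                    # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * efficiency  # usable bits per second
    return bits / effective_bps / 86_400          # seconds to days
```

Under those assumptions, `transfer_days(10, 100)` puts 10 TB over a 100 Mbps link at roughly 13 days of continuous transfer, which is why shipping physical media is sometimes the faster option.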
Data Quality Issues
Data quality issues are a huge concern for a migration project: what’s the point of doing it if the end result is inaccurate or inconsistent, or if data is actually lost along the way? Some of the issues to consider here:
- Redundant, Outdated, and Trivial Information (ROT): A migration takes long enough without including terabytes of outdated, personal, or irrelevant information. Where that information can be isolated (and it can be), it should not be migrated, and in fact, it should be disposed of in accordance with the records management program.
- Lifecycle Considerations: If some of the data in the system to be migrated has met its retention requirements and there are no other legal, operational, or historical reasons to keep it, it makes little sense to migrate that data only to turn around and delete it.
- Missing Metadata: This often happens because the target system adds new fields, and in many instances these new fields are also mandatory. As the target system and its data structures are being designed, attention should be paid to how best to fill in that missing metadata. Filling it in is also known as metadata enrichment.
- Inconsistent Metadata: This is very common as different systems use different data structures. The way to approach this is generally to map the field in the legacy system to the one in the new system, either through a middleware application or by actually transforming the legacy metadata value into the new one during the migration process.
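The lifecycle and metadata points above can be sketched in a few lines of Python. Everything named here is hypothetical: the field names in `FIELD_MAP`, the placeholder values in `REQUIRED_DEFAULTS`, and the idea that each record carries a closed date and a retention period in years. A real migration would take these from the retention schedule and the target system's schema.

```python
from datetime import date

# Hypothetical mapping from legacy field names to target field names.
FIELD_MAP = {"doc_title": "title", "author_name": "creator", "created": "date_created"}

# Hypothetical mandatory target fields, with placeholders to flag for enrichment.
REQUIRED_DEFAULTS = {"title": "UNTITLED", "creator": "UNKNOWN", "record_class": "UNCLASSIFIED"}


def past_retention(closed_on: date, retention_years: int, today: date,
                   on_hold: bool = False) -> bool:
    """True if the retention period has elapsed and no hold applies."""
    if on_hold:
        return False
    try:
        expires = closed_on.replace(year=closed_on.year + retention_years)
    except ValueError:  # closed on Feb 29 and the expiry year is not a leap year
        expires = closed_on.replace(year=closed_on.year + retention_years, day=28)
    return today >= expires


def map_metadata(legacy: dict) -> tuple[dict, list[str]]:
    """Rename legacy fields; return the target record and the fields defaulted."""
    target = {FIELD_MAP[k]: v for k, v in legacy.items()
              if k in FIELD_MAP and v not in (None, "")}
    missing = [f for f in REQUIRED_DEFAULTS if f not in target]
    for f in missing:
        target[f] = REQUIRED_DEFAULTS[f]
    return target, missing
```

Returning the list of defaulted fields alongside the record gives you a worklist for enrichment, rather than letting placeholder values disappear silently into the new system.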
Process Issues
There are a lot of process-related issues to consider during a content migration, including:
- Accuracy and Quality Control: This is perhaps the issue people think about most: was the migration accurate, and how can you verify it? The migration tool and process can provide some metrics, such as the number of items moved, but in all likelihood you won’t *know* the migration was accurate until your users start to interact with their content in the new system.
- Timing and Duration: How long will the migration take? Longer than you think, but it shouldn’t be a never-ending effort. When does it start? That depends on when you can freeze changes to the legacy system, and on what impact the freeze has on your work processes. And of course, when is the migration, and therefore the migration project, complete? Just as importantly, when do you cut off all access to the legacy system? If you don’t, users will continue to use it.
- Communication: It should be clear that the communication of all the previous points to those affected should be a high priority. In the absence of consistent, regular communication, rumors will fly, and users may take counterproductive steps like saving all their content to a less governed location like their own computer or a flash drive.
- Users: Finally, where you have individual users participating in the migration process, getting them to do it is often a challenge, and when you do get them to do it, they often want to keep everything “just in case.” You should ask the question of whether they really need all that information, and why. One tool that can really be helpful here is a report that can show users that they haven’t accessed a particular document or folder or repository in X number of years, that that document or folder hasn’t been accessed by ANYBODY in X number of years, etc. These are easy to generate for many repositories.
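A report like the one described for users can be very simple. The sketch below assumes you can export a last-accessed date per item from the repository's audit log or file system; the function name and the input shape (a dictionary of path to date) are illustrative only, and a year is approximated as 365 days.

```python
from datetime import date, timedelta


def stale_items(last_access: dict[str, date], years: int, today: date) -> list[str]:
    """Return paths not accessed in the last `years` years, oldest first."""
    cutoff = today - timedelta(days=365 * years)
    stale = [(when, path) for path, when in last_access.items() if when < cutoff]
    return [path for when, path in sorted(stale)]
```

Run per user, it answers "what haven't you touched in X years?"; run across all users' access dates, it answers the stronger question of what nobody has touched.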
What to do? What to do!
The economics and limitations around all this information are holding organizations back. Taking the journey to a better place requires the right experience and tools, which InfoDNA Solutions brings to organizations of any size. Learn more at www.infodnasolutions.com