This is part of Solutions Review’s Premium Content Series, a collection of reviews written by industry experts in maturing software categories. In this submission, Komprise’s Chief Customer Success Architect provides a template for unstructured data migration planning, along with tools to consider.
Data migrations have never been easy, but the need to do them intelligently and painlessly is now urgent: enterprises simply have too much unstructured data sitting on their best-in-class storage technologies in legacy environments. Even though storage prices have come down in recent years, data growth has been exponential. It is imperative to continuously assess what data is stored on your best-performing tiers and whether that data can be migrated to a solution at a better price and/or to meet organizational needs such as provisioning cloud data lakes or complying with ever-changing regulations.
There are many options for storing unstructured data these days, from storage as a service (STaaS) to object storage, cloud network-attached storage (NAS), and deep archives such as AWS Glacier and Azure Archive Storage. These choices mean that IT teams responsible for unstructured data need a detailed understanding of their data and the ability to pivot at any time to adapt to change. And let’s face it: pivoting workloads of any size and scale, on-premises or to the cloud, can be time-consuming and disruptive without a plan.
Creating a plan of what you need to know before migrating will help you avoid the errors, delays, and cost overruns that derail migrations, while ensuring you meet your overall unstructured data management goals; for most organizations, this means moving to the cloud faster and maintaining an agile, hybrid cloud environment.
The plan should encompass the key questions:
- Which storage tier? Which cloud?
- What about rules and regulations?
- What are our common data types and workloads?
- What topology requirements do we have?
- Do I really need to test?
- Free tools or enterprise solution?
- How do you write communications that people will read?
Unstructured data migration plan: steps to follow during development
Map sources and targets
First you will want to get the lay of the land by setting your sources and targets. When you develop your plan, make sure it details the locations of points A and B and that you have a process for identifying and resolving complicating factors and potential conflicts with your source and target storage.
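As a sketch of what a source-to-target mapping might look like, here is a minimal, hypothetical manifest in Python; the share paths, bucket names, and owner addresses are invented for illustration and would come from your own discovery process:

```python
# Hypothetical migration manifest: map each source share to its target.
# All hosts, buckets, and addresses below are illustrative placeholders.
manifest = [
    {"source": "//filer01/engineering", "target": "s3://corp-archive/engineering",
     "protocol": "SMB", "owner": "eng-it@example.com"},
    {"source": "filer02:/exports/research", "target": "s3://corp-archive/research",
     "protocol": "NFS", "owner": "research-it@example.com"},
]

def validate_manifest(entries):
    """Flag entries missing a field or sharing a target (a common collision)."""
    problems = []
    targets = set()
    for e in entries:
        for field in ("source", "target", "protocol", "owner"):
            if not e.get(field):
                problems.append(f"{e.get('source', '?')}: missing {field}")
        if e.get("target") in targets:
            problems.append(f"duplicate target: {e['target']}")
        targets.add(e.get("target"))
    return problems

print(validate_manifest(manifest))  # prints [] when every entry is complete and unique
```

Even a simple check like this catches the two mistakes that surface most often mid-migration: an incomplete entry and two shares pointed at the same target.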
Rules and Regulations
Rules are usually established and governed within your organization, such as retention policy, legal hold, deletion policy, and disaster recovery. Regulations are usually set by a governing body that can impose fines for non-compliance, such as HIPAA, SOX, GDPR, and GxP. It is essential to work in partnership with your HR, security, legal and compliance teams to ensure that everyone is doing their part to meet or exceed applicable rules and regulations. Additionally, consider collaborating with data owners or subject matter experts who can shed light on potential roadblocks and provide feedback while establishing the best unstructured data management strategy.
Perform data discovery
When you perform proper data discovery, you understand your workloads and potential speed bumps. Are you migrating one share or thousands? Millions of small files, terabytes of large files, or a mixture of everything? Can you identify orphaned data and move it to an archive or queue it for deletion? Data discovery tools can help create a central index that enables better decisions through holistic visibility; this, incidentally, tends to grab the attention of legal and compliance teams and opens up partnership opportunities for funding or demonstrating cost avoidance.
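A first pass at this kind of discovery can be scripted with nothing but the standard library. The sketch below walks a directory tree and tallies how much data is cold by last-access time; the 365-day threshold is an arbitrary assumption to tune for your own retention rules:

```python
import os
import time

def profile_tree(root, cold_days=365):
    """Walk a directory tree and summarize file count, total bytes,
    and how much data has not been accessed within `cold_days`."""
    cutoff = time.time() - cold_days * 86400
    stats = {"files": 0, "bytes": 0, "cold_files": 0, "cold_bytes": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file; log it in a real run
            stats["files"] += 1
            stats["bytes"] += st.st_size
            if st.st_atime < cutoff:  # last access older than the threshold
                stats["cold_files"] += 1
                stats["cold_bytes"] += st.st_size
    return stats
```

Note that access times can be unreliable on volumes mounted with `noatime`; an enterprise index is far more robust, but a script like this is enough to start the conversation with legal and compliance teams.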
Simplify and standardize
Just because you did things a certain way in the past doesn’t mean it will be the right way tomorrow. Legacy standards that haven’t evolved over time, or those adopted through mergers, can wreak havoc on migrations, including cloud adoption strategies. You’ll need to consider whether to carry over the old permissions or standardize them in the new target, for example. Another example is choosing the protocol for a share: SMB or NFS, where one protocol takes precedence, versus a mixed-protocol architecture. In the latter case, both protocols can set permissions and conflict, which usually creates support issues.
A huge benefit of data visibility is that it allows you to make layered decisions about your data rather than taking a one-size-fits-all approach. Instead of moving 2PB of unstructured data directly to another platform, you might consider archiving or tiering cold data to cost-effective object storage, which can save significant money year over year. Organizations implementing data visibility strategies can identify 60-80 percent of their data as cold. Reducing hot storage capacity directly reduces data protection and replication costs, which can represent a significant percentage of your overall storage budget.
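Back-of-the-envelope math makes the case concrete. The function below estimates the monthly savings from tiering cold data; the per-GB prices are placeholders, not quotes from any vendor, so plug in your own pricing:

```python
def tiering_savings(total_tb, cold_fraction, hot_cost_gb_mo, cold_cost_gb_mo):
    """Estimate monthly savings from tiering cold data to cheaper storage.
    Costs are per GB per month; all figures here are illustrative assumptions."""
    total_gb = total_tb * 1024
    cold_gb = total_gb * cold_fraction
    before = total_gb * hot_cost_gb_mo
    after = (total_gb - cold_gb) * hot_cost_gb_mo + cold_gb * cold_cost_gb_mo
    return before - after

# Example: 2 PB (2048 TB) at 70% cold, $0.02/GB hot vs $0.004/GB object storage.
print(round(tiering_savings(2048, 0.7, 0.02, 0.004)))  # -> 23488 (dollars per month)
```

And that figure understates the benefit, since it excludes the data protection and replication costs that shrink along with the hot tier.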
Understand your topology
Network and security configurations can have a huge impact on migrations. Are you moving data between sites or regions, cloud to cloud, or even out of the cloud? Define your path and understand your round-trip latency, total versus consumable bandwidth, and security requirements. Security technologies, especially antivirus and IDS/IPS, are known to negatively impact migrations when not configured to compensate for increased workloads. The purpose of understanding the topology is to identify bottlenecks in advance that could slow down or even completely stop the migration.
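One useful pre-flight calculation is a rough transfer-time estimate based on your consumable (not total) bandwidth. The 60 percent efficiency factor below is an assumption standing in for protocol overhead, latency, and competing traffic; replace it with what your own iPerf measurements show:

```python
def transfer_time_hours(data_tb, link_gbps, efficiency=0.6):
    """Rough wall-clock estimate for moving data over a link.
    `efficiency` hedges for protocol overhead, latency, and competing traffic."""
    data_bits = data_tb * 1024**4 * 8          # TB -> bits
    usable_bps = link_gbps * 1e9 * efficiency  # consumable, not total, bandwidth
    return data_bits / usable_bps / 3600

# Example: 500 TB over a 10 Gbps link at 60% effective throughput.
print(round(transfer_time_hours(500, 10), 1))  # -> 203.6 hours, roughly 8.5 days
```

If the answer does not fit your cutover window, you know before the migration starts, not during it.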
Test, test, test
Pre-migration testing is equally critical. Some of the most common issues include nodes or clusters already under heavy load, misconfigurations, and vendor-specific technology limitations such as shares containing a million or more files, short filenames (8.3 format), long pathnames, and Unicode versus non-Unicode names (which affect data storage due to differences in character standards). Oversubscribed or saturated networks, asymmetric routing, and security systems can also cause problems: packet loss, out-of-order packets, and retransmissions usually trace back to one of the above. Starting with the basic tools included in most operating systems is a good idea. Nothing is more basic than ping, traceroute, and nslookup, which test network connectivity, network path, and DNS configuration respectively. iPerf can measure bandwidth, while Wireshark is excellent for showing blocked, dropped, failed, or retransmitted packets.
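Some of the file-level gotchas can be screened for before the first byte moves. This sketch flags over-long paths and non-ASCII filenames; the 260-character limit (the classic Windows MAX_PATH) and the ASCII check are crude stand-ins for whatever limits your actual target platform enforces:

```python
import os

MAX_PATH = 260  # classic Windows MAX_PATH; adjust for your target platform

def premigration_checks(root):
    """Scan for files likely to trip a migration: over-long paths and
    names that are not cleanly ASCII-encodable (a rough proxy for
    Unicode/non-Unicode mismatches between source and target)."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if len(path) > MAX_PATH:
                findings.append(("long_path", path))
            try:
                name.encode("ascii")
            except UnicodeEncodeError:
                findings.append(("non_ascii_name", path))
    return findings
```

Run a scan like this against a representative share during the test phase and you turn a mid-migration surprise into a remediation list.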
Free Copy Tools vs Enterprise Migration Software
Robocopy and rsync are common free copy tools; they are designed only to copy data and lack the functionality of enterprise migration software. Look for a solution that can efficiently run, monitor, and manage hundreds of data migrations to hybrid cloud storage; identify the right files to migrate to maximize efficiency and reduce expense; minimize network usage; retry automatically when the network or storage is unavailable; migrate with or without full file permissions and access controls; and preserve data integrity by performing MD5 checksums not only on parts of files, but on entire files.
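That last property, verifying every byte rather than a sample, is straightforward to sketch with the standard library; this is an illustration of the technique, not a replacement for a migration product's built-in verification:

```python
import hashlib

def md5_file(path, chunk_size=1 << 20):
    """Stream the whole file through MD5 so the digest covers every byte,
    not just a sampled portion."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_copy(source, target):
    """True only when source and target hash identically end to end."""
    return md5_file(source) == md5_file(target)
```

Streaming in 1 MB chunks keeps memory flat regardless of file size, which matters when the share holds terabyte-scale files.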
Communicate early and often
The most effective way to keep people informed of the migration plan and milestones is to send an email update to all stakeholders at a predetermined interval during the migration. Keep the subject and summary brief and include detailed specifics near the end. Less is sometimes more; consider applying color-coded status updates of red, yellow, and green. The subject line can be used strategically, such as leading with a green status. People want to hear the positives. Avoid the blame game and celebrate wins.
Cloud migrations are a team sport. While there are many tools and metrics that help, bringing teams together across IT and business sectors fosters shared responsibility, which is imperative to achieving a satisfying outcome that meets the objectives of the organization, the business, and the end user. The more you plan and test in advance, the less likely you are to run into problems later that erode confidence in your cloud data management strategy.