Data is everywhere today. It is constantly being created, moved, and refined. Experts at the World Economic Forum speculate that by the year 2020, the digital universe will contain 44 zettabytes of data, or 44 x 10^21 bytes. To put this in perspective, a zettabyte is 1 billion (1,000,000,000) terabytes, or one trillion (1,000,000,000,000) gigabytes. This is a huge amount of data! Because of this large volume, it has become increasingly important for companies to manage and organize data as it flows in from multiple sources. The more information an organization accumulates, the greater the risk of being pulled in by data gravity.
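To make the unit arithmetic above concrete, here is a small Python sketch. It assumes decimal (SI) units, where each prefix is a power of ten; the constant names are ours, chosen for illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21

# The 44-zettabyte estimate quoted above, in bytes.
digital_universe_bytes = 44 * ZETTABYTE

print(f"Terabytes per zettabyte: {ZETTABYTE // TERABYTE:,}")
print(f"Gigabytes per zettabyte: {ZETTABYTE // GIGABYTE:,}")
print(f"44 zettabytes in bytes:  {digital_universe_bytes:,}")
```

Running it confirms that one zettabyte is a billion terabytes, or a trillion gigabytes.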
What is Data Gravity?
Data gravity, a term coined in 2010, refers to the way that applications and data are attracted to each other. The more data that exists within an organization, the greater the attractive forces that pull applications and services together.
Similar to the way that objects are attracted to each other due to gravity, data likes to be with other data. Just as Newton’s Law of Gravitation explains the effects physical objects have on one another, applications and processing power are attracted by the pull of large, complex datasets. Consequently, as a dataset grows, it builds “mass,” making it harder and harder to move.
As a result of their sizable “masses,” some datasets are cumbersome and tricky to move. Think of a heavy object, like a box full of books. That box has a lot of mass, making it difficult to carry. The box is pulled toward the ground by the force of gravity, meaning that picking it up and moving it will require a significant amount of effort.
The same idea applies to sizable aggregations of data. A smartphone, for example, might have 32 gigabytes of data stored on it. Those 32 gigabytes can easily be transferred to another device with little to no effort. A terabyte is a significant quantity of data, one trillion (1,000,000,000,000) bytes, but with the right skills, it can be moved with relative ease. Anything much bigger than a terabyte becomes a challenge to move because it’s “heavier.”
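One rough way to see why bigger datasets feel “heavier” is to estimate how long they take to move over a network. The sketch below is only a back-of-the-envelope calculation: the 1 Gbps link speed, the 80% efficiency factor, and the function name transfer_hours are all assumptions made for illustration, and real migrations involve far more than raw transfer time.

```python
def transfer_hours(size_bytes: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move size_bytes over a link of link_mbps megabits per second.

    efficiency is an assumed fraction of the link that is usable in practice.
    """
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    seconds = size_bytes * 8 / usable_bits_per_second
    return seconds / 3600


GB = 10**9
TB = 10**12

# Illustrative sizes: a phone, a one-terabyte dataset, and a much larger store.
examples = [
    ("32 GB phone", 32 * GB),
    ("1 TB dataset", 1 * TB),
    ("500 TB data store", 500 * TB),
]

for label, size in examples:
    hours = transfer_hours(size, link_mbps=1000)  # assumed 1 Gbps link
    print(f"{label}: about {hours:.1f} hours")
```

Under these assumptions the phone's data moves in minutes and a terabyte in a few hours, while hundreds of terabytes would take weeks of continuous transfer.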
Data Gravity Around Us
One example of data gravity can be seen in the once wildly popular music platform, iTunes. This music and video storage service was useful if you wanted to upload music from external sources, like CDs, or purchase new content from its store. However, the system became difficult to work with as soon as a user tried to export any of their files. Songs didn’t save correctly or were given the wrong titles, making it a hassle to keep music and video organized.
Another area where data gravity frequently comes into play is when interacting with data silos. These silos are large groupings of information that exist within a company but are only used and accessed by a select few departments. This is harmful because it prevents information from being shared company-wide. Data silos by themselves can be tricky to merge or migrate to one database, but the pull of data gravity can make the process even more difficult. The larger the amount of data that resides in the various silos or systems, the more effort it will take to merge or move them.
An additional place we frequently see data gravity at work is in cloud-centered databases. With 90% of the world’s data having been generated in the last 2.5 years, many companies have rapidly moved from on-site or private cloud-based databases to either a mixed or public cloud system, such as those offered by Microsoft, Google, or Amazon. Many of these options are able to store large amounts of data with security, versatility, scalability, and ease of access, making these systems very attractive.
While these providers offer many perks, the trouble often comes when companies want to move data from one type of storage system to another, but feel trapped within the current system framework. Migrating data to a new platform can be challenging due to formatting problems, or because of the sheer size of the platform in its current state. Cloud-based databases often gain a great deal of mass because of the large volume of information they can hold. As more information is moved to these services, it will naturally become more difficult to migrate to a new system.
One of the reasons that cloud-centered storage options become so “heavy” is the unique features and customizable options available. For example, those who use Google Cloud can use the Google Assistant AI feature to ask questions about the data, gain quick overviews of information, or even integrate the AI software directly into a specific project. While beneficial, a project stored in Google Cloud that has enabled Google Assistant AI might not migrate easily to a new database.
A cloud database can be further customized through the selection of: the type of database (relational, non-relational, warehouse, etc.), amount of storage (as much as 500 terabytes), choice of onsite/cloud-native/hybrid structure, and who across your company has permission to view or manipulate the data. While these customizations differ from platform to platform, they make it very easy to store datasets in a variety of forms. Cloud solutions offer the ability to easily and frequently input new data, but they don’t always offer a seamless way to transfer those features and information to other solution offerings should you choose to change platforms.
If you are searching for a new system or application, keep in mind that, while ease of data input is important, you must also consider how easily you will be able to pull your data out later. You will save yourself a major headache if you select a service that allows you to both easily access your information and transition it to a new database if needed. A good place to start is simply to research. What have other users of a particular platform said about their experience? Are there businesses similar to yours that have found this solution successful? An additional step you can take is to talk with a business intelligence consulting company (like us!) that knows what to look for and can make suggestions based on your company’s database needs. This way, you can stop data gravity before it begins.
Defying (Data) Gravity?
If you feel overwhelmed by the task of transporting your business’s data from one warehouse or system to a new one, don’t despair! It is indeed possible to defy data gravity. It takes a great deal of patience and effort, but it can be done!
There are a variety of strategies that can be used to draw your data out of the gravitational pull of other systems, but it’s key to take it slow. Data migration won’t happen overnight. Data might need to be restructured or reformatted before it will fit into a new system. Taking your time helps to ensure that your data is clean and accurate, which saves work in the long run and gives a higher return on investment (ROI) on the entire process.
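As a simple illustration of what “restructured or reformatted” can mean in practice, here is a minimal Python sketch. Everything in it is hypothetical: the legacy field names (CustName, SignupDate), the target field names, the date format, and the sample rows are invented for the example and do not describe any particular platform.

```python
from datetime import datetime

# Hypothetical rows exported from an old system.
legacy_rows = [
    {"CustName": " Ada Lovelace ", "SignupDate": "12/10/2015"},
    {"CustName": "Alan Turing", "SignupDate": "not recorded"},
]


def to_new_schema(row):
    """Return the row in the assumed target layout, or None if it cannot be converted."""
    try:
        # Assumed legacy date format: month/day/year.
        signup = datetime.strptime(row["SignupDate"], "%m/%d/%Y").date()
    except ValueError:
        return None
    return {
        "customer_name": row["CustName"].strip(),
        "signup_date": signup.isoformat(),
    }


converted, rejected = [], []
for row in legacy_rows:
    new_row = to_new_schema(row)
    if new_row is None:
        rejected.append(row)  # set aside for manual review instead of guessing
    else:
        converted.append(new_row)

print(f"{len(converted)} converted, {len(rejected)} set aside for review")
```

Setting aside rows that fail conversion, rather than forcing them through, is one way to keep the migrated data clean, which is the reason for taking the process slowly.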
If you find yourself frustrated by data gravity or are struggling with a migration project, VanData is here to help. Our team of experts can make this transition as smooth and efficient as possible. We know that the process of data migration can be intimidating. No matter what system you have now, we can advise as you choose a new platform or guide your ongoing shift. Shoot us a message and set up a free consultation to see how VanData can help your company defy data gravity and tailor a solution that’s as unique as your business.