In today’s data center, an inefficient archive storage system is more than a headache; it can be a full-blown nightmare. Most archive storage systems date from a time when organizations were simply looking for an inexpensive place to “park” their data. But in today’s enterprise, people depend on their data – all of it – to do their jobs. Their data is their business, and even archived information needs to be easily and quickly accessible. As a result, the storage solution you thought would save you money can pile on unanticipated costs that legacy vendors often don’t mention. And without good analytics, many businesses don’t even know how much more their archive solution is costing them.
Archive Storage Drives Up Data Center Costs
Since archives need to be powered and supported for very long periods of time, an archive’s data footprint in the data center is a key cost factor. Most archive systems are massive, proprietary appliances that are hard to rack and service, often requiring two or three people to install and replace. Because they predate the era of cloud computing, they were built on hardware that demands inefficient racks and power supplies.
Even worse, the structural inefficiency of many archive storage systems leaves about 25% of a system’s capacity unusable. By contrast, cloud providers, who are always looking to optimize cost, long ago moved to 1U and 1.5U servers: inexpensive systems that one person can service and replace without special hardware. Cloud providers deal with extreme amounts of data today, but as companies continue to generate and retain their own, the same issues of scale will be felt across many data centers.
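To see what that structural loss does in dollars, here’s a back-of-the-envelope sketch. All the numbers are purely illustrative, but the arithmetic is the point:

```python
# Back-of-the-envelope math with hypothetical numbers: how structural
# inefficiency inflates the real cost per usable terabyte.
raw_tb = 1000             # raw capacity purchased, in TB (hypothetical)
unusable_fraction = 0.25  # capacity lost to structural inefficiency
system_cost = 200_000     # purchase price in dollars (hypothetical)

usable_tb = raw_tb * (1 - unusable_fraction)
print(f"Quoted: ${system_cost / raw_tb:.2f} per TB")     # $200.00 per TB
print(f"Actual: ${system_cost / usable_tb:.2f} per TB")  # $266.67 per TB
```

That’s a 33% premium on the quoted price before power, cooling, and support are even counted.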
Out of Sight, Out of Mind
An archive is often the place where data goes to sit and wait, only to be dug up when there’s a critical need. That means most administrators look at their archives only when they have to, and rarely consider whether they can be run more efficiently. So when someone asks, “How much space is group X using?” the administrator has to walk through every file and folder, a process that can take an incredibly long time, especially given the slow performance of many cheaper archive systems.
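To make the pain concrete, here is a minimal sketch of that brute-force walk (the group-ownership check via POSIX gid is an assumption for illustration). Every time the question is asked, the entire tree has to be re-scanned:

```python
import os

def space_used_by_group(root: str, group_id: int) -> int:
    """Answer "How much space is group X using?" the hard way:
    stat every file under the archive root and sum the matches."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_gid == group_id:  # POSIX group ownership
                total += st.st_size
    return total
```

The cost grows with the number of files, and on a slow archive tier every stat call adds latency, so on an archive holding hundreds of millions of files this one question can run for hours.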
Real-time analytics would solve this problem, giving administrators insight into how archive data is being used so they can identify opportunities to improve efficiency.
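One general technique for delivering that kind of real-time answer (a sketch of the idea, not any specific vendor’s implementation) is to keep per-directory usage aggregates up to date as data changes, so a query reads a precomputed number instead of re-walking the tree:

```python
class DirNode:
    """A directory that keeps a running total of the bytes beneath it,
    updated incrementally on writes rather than recomputed on demand."""
    def __init__(self, parent: "DirNode | None" = None):
        self.parent = parent
        self.total_bytes = 0

    def add_bytes(self, delta: int) -> None:
        # Propagate the change up the tree: O(tree depth), not O(file count).
        node = self
        while node is not None:
            node.total_bytes += delta
            node = node.parent

# A 4 GB file landing in /archive/groupX/2023 touches three counters;
# "How much is group X using?" becomes a single lookup, not a scan.
root = DirNode()
group_x = DirNode(parent=root)
year_2023 = DirNode(parent=group_x)
year_2023.add_bytes(4 * 2**30)
print(group_x.total_bytes)  # 4294967296
```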
Archive Storage Data Protection Uses Even More Space
Archive technologies vary greatly in their approach to data protection. Many newer products use object storage and typically protect data by making two or three full copies. This is not efficient and leads to a huge amount of waste. Other products protect data at the file level, which is more efficient, but the efficiency varies greatly with file size. That makes capacity very difficult to manage: you cannot give a direct answer to “How much more data can I add to the archive?” and are forced to say, “It depends on the files you are storing.” And since none of these vendors want you filling the system to 100%, even more proactive planning is required.
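Some quick math, using illustrative protection schemes, shows how much the approach matters and why the answer shifts with file size:

```python
raw_tb = 1000  # raw disk capacity, in TB (hypothetical)

# Object stores keeping three full copies: 3x overhead.
print(raw_tb / 3.0)       # ~333 TB usable

# File-level erasure coding, e.g. 8 data + 2 parity blocks: 1.25x overhead.
print(raw_tb / (10 / 8))  # 800 TB usable

# Small files may not fill a full erasure-coded stripe and can fall back
# to mirroring, so the effective overhead -- and the honest answer to
# "How much more can I add?" -- depends on your file-size mix.
```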
Archive Storage Shouldn’t Be a Nightmare
Almost half of today’s data center expenses come from power and cooling, which leaves data center managers looking for any way to make their systems more efficient. Because archive storage has been thought of as “cold storage,” those systems might not be the first place we look for improvements. But with outdated hardware configurations and no real way to get actionable information about your data, the true cost of your “cheap” archive storage could be much higher than the original quote. Fortunately, there are things system administrators can do right away to get the most value from their archive data.
Check back with the Qumulo blog for more on why archive data is so valuable and how to get the most out of it.