Flash is here to stay. And it couldn’t have come at a better time. As we have talked about before, we are in a new age of data where more and more of our information is being created by machines. This swell of data requires a performant storage environment, causing many companies to look for ways to leverage flash technology to give them the boost they need without breaking the bank.
Enterprise flash storage has plenty of advantages, ranging from lower power consumption to a higher number of I/O operations per second. But while all-flash systems are a great solution when performance is a requirement, the reality is that the economics still do not make sense for mass adoption. That doesn’t mean there aren’t ways to leverage enterprise flash storage technologies. In hybrid systems composed of both SSDs and hard disk drives (HDDs), flash can help get more mileage out of HDDs by serving as a metadata, hot data, and write-back cache. This allows flash to take on some of the tasks that would normally cause performance penalties on HDDs.
Flash has already proved to be the medium that will take enterprise storage systems into the new era of data. But before we talk about how we’re using flash, let’s take a moment to explore why spinning disk alone isn’t enough for today’s workloads.
HDDs have two key advantages that make them an appropriate storage mechanism for scale-out NAS. First, they are a reliable technology that has withstood the test of time. Second, HDD-based systems continue to carry lower up-front costs than enterprise flash storage. This makes them ideal in environments where not all data is “hot” and in need of immediate access.
For instance, a production studio working on a movie will want data for that project readily available. But once the project is done, that same data can be moved to a different tier to make room for the next project. However, increasingly demanding workflows and explosive growth in unstructured data have led to several shortcomings of traditional HDD-based scale-out NAS systems:
Insufficient NVRAM capacity results in non-contiguous writes to the HDDs, causing poor I/O and throughput performance. All incoming data in a traditional scale-out NAS system arrives via the Server Message Block (SMB) or Network File System (NFS) protocols and is written to an NVRAM card (a write-back cache that reduces latencies). From the NVRAM card, data is randomly written to one of the HDDs. The NVRAM cache layer has limited space, so data can stay there only briefly before it has to be pushed to the HDDs. That small window of time is not enough to combine the writes of a file into contiguous blocks, which adds latency from disk seek time. And this slowdown only gets worse as storage utilization and file size increase.
These systems suffer from very poor metadata and random I/O performance. HDDs typically deliver only ~120 input/output operations per second (IOPS). To maximize their read/write performance, the data access pattern needs to be long blocks of contiguous data. Without this, systems suffer from high latencies on metadata operations (small, random reads) and other random I/O operations that require access to small non-contiguous blocks of data.
Bolt-on flash systems don’t address these issues. Systems that use an external flash “cache” layer to hold “hot” data for clients have several drawbacks. First, because the tiering logic is not built into the filesystem, they suffer from data movement latencies between the cache and the underlying storage. Second, these systems cannot leverage the flash layer to optimize data layout on the HDDs.
Flash gives better I/O and throughput performance compared to HDDs. Some SSDs can support 200,000 to 300,000 IOPS, making them an attractive storage option for the most demanding workflows. But as we discussed above, not all data is “hot” in a given environment. In fact, a material portion of an entire namespace will not require immediate access and can be stored in a lower tier. So at a time when enterprise flash storage might be cost-prohibitive, we are looking for ways to get the most out of both SSDs and HDDs.
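To put the IOPS figures quoted above side by side, here is a back-of-the-envelope sketch. It uses only the numbers from this post (~120 IOPS per HDD, 200,000 to 300,000 IOPS for some SSDs); the one-million-request workload and the function names are hypothetical, and the model is deliberately idealized: it ignores queue depth, caching, and request size.

```python
# Back-of-the-envelope comparison using the IOPS figures quoted in the post.
# Idealized: ignores queue depth, caching, and request size.

HDD_IOPS = 120           # approximate random IOPS for one HDD (from the post)
SSD_IOPS_LOW = 200_000   # lower end of the quoted SSD range
SSD_IOPS_HIGH = 300_000  # upper end of the quoted SSD range


def seconds_to_serve(requests: int, iops: int) -> float:
    """Time for one device to serve `requests` random operations."""
    return requests / iops


def hdds_to_match(ssd_iops: int, hdd_iops: int = HDD_IOPS) -> int:
    """Number of HDDs whose combined random IOPS reach `ssd_iops`, rounded up."""
    return -(-ssd_iops // hdd_iops)  # ceiling division


if __name__ == "__main__":
    requests = 1_000_000  # hypothetical metadata-heavy workload
    print(f"One HDD: {seconds_to_serve(requests, HDD_IOPS) / 3600:.1f} hours")
    print(f"One SSD: {seconds_to_serve(requests, SSD_IOPS_LOW):.1f} seconds")
    print(
        "HDDs needed to match one SSD: "
        f"{hdds_to_match(SSD_IOPS_LOW)} to {hdds_to_match(SSD_IOPS_HIGH)}"
    )
```

Under these assumptions, a single HDD needs a little over two hours for a workload that a single SSD at the low end of the range finishes in about five seconds. That gap is why small, random operations are the first thing worth moving to flash.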
We saw the trend towards flash when we started developing Qumulo Core and designed it to deliver maximum performance and capacity for your investment. Our software is capable of leveraging an arbitrary mix of flash and HDDs. In the future, that means that we can leverage more flash as it becomes more cost-effective. But today, that means using a hybrid architecture of both spinning disk and flash. So let’s look at how flash works in a hybrid system:
Flash optimizes HDD data layout, maximizing performance. Qumulo Core uses SSDs instead of NVRAM as a write-back cache. Unlike NVRAM-based systems, each node in a Qumulo cluster has several terabytes of SSD space to cache inbound writes. This ensures the system can optimize the physical data layout before the writes are committed to the HDDs. Furthermore, unlike NVRAM, the SSDs are a persistent storage mechanism that does not depend on a charged battery to keep data safe during power failures.
SSD space is used intelligently to deliver millisecond responses to metadata operations. All file metadata is cached in the SSD layer even after the data is flushed to the HDDs. This provides lightning-fast responses to metadata requests that comprise 90+% of most storage operations.
Qumulo Core is hardware-agnostic. Our filesystem runs on a Linux-based OS, operating 100% in user space. We architected our software to run on commodity hardware and be easily extensible to support an arbitrary mix of SSDs and HDDs. We currently support nodes that have twice as many HDDs as SSDs, mainly because of the relative economics of the two media today. However, Qumulo Core can be extended easily to support all-flash storage media if and when the economics make sense.
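The first point above, that a larger write-back cache gives the system room to optimize data layout before committing to HDDs, can be illustrated with a toy model. This is a minimal sketch of generic write coalescing, not Qumulo Core's actual implementation; `Extent`, `coalesce`, `flush_count`, and the cache sizes are all hypothetical names and values chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical representation of a buffered write: (start_offset, length) in bytes.
Extent = Tuple[int, int]


def coalesce(writes: List[Extent]) -> List[Extent]:
    """Merge buffered writes into the fewest contiguous extents.

    Sorts by offset, then joins any write that touches or overlaps
    the extent before it.
    """
    merged: List[Extent] = []
    for start, length in sorted(writes):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, length))
    return merged


def flush_count(writes: List[Extent], buffer_slots: int) -> int:
    """Count extents sent to disk when the cache flushes every `buffer_slots` writes."""
    total = 0
    for i in range(0, len(writes), buffer_slots):
        total += len(coalesce(writes[i:i + buffer_slots]))
    return total


if __name__ == "__main__":
    block = 4096
    # Eight 4 KiB writes covering one 32 KiB region, arriving out of order.
    arrivals = [(n * block, block) for n in (0, 4, 1, 5, 2, 6, 3, 7)]
    print(flush_count(arrivals, buffer_slots=2))  # small cache: 8 separate extents
    print(flush_count(arrivals, buffer_slots=8))  # large cache: 1 contiguous extent
```

With room for only two writes at a time, none of the buffered writes are adjacent, so every block goes to disk as its own extent and costs its own seek. With room for all eight, the same data lands as one contiguous write.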
Enterprise flash storage lets us do things that seemed impossible only a few years ago. But we are still at a point where you have to consider many things when deciding on the best storage technology for your organization. As with any big IT purchase, cost is a major concern. That is why we use a flash-first hybrid approach that also employs HDDs to deliver the maximum capacity and performance possible for your investment. With Qumulo Core’s flash-first design, we are not only taking advantage of what flash can offer today, but are positioned to fully utilize its potential tomorrow. If all-flash is the future, hybrid systems are how we get there.
Biren has 18+ years of progressive experience in corporate strategy, product/program management, supply chain & operations across multiple industries. At Qumulo, he is a product leader with end-to-end responsibility in the creation and delivery of hardware and software strategies.