Examining Qumulo’s Read Prefetch System and How It’s So Effective

We recently kicked off the new fiscal year here at Qumulo. During the company kickoff, this chart was presented, showing the average read latency for Qumulo customers around the world for the last year.

[Figure: NFS and SMB data read latency in milliseconds]

As you can see, NFS and SMB protocol read latencies are down substantially. As customers upgraded their Qumulo software during 2018, they watched their read performance get faster and faster.

The speed of memory for the cost of hard disk

While Qumulo’s cost-effective hybrid cloud file storage system stores most data on relatively slow hard disk drives, the chart below shows that most of it is served to customers at the speed of SSD and memory. The majority of read operations for Qumulo’s customers have sub-millisecond latency.

[Figure: Typical latency of a 64 KB read from various media vs. Qumulo customer read latency, in milliseconds]

How do our customers get low NFS read latency and low SMB read latency? Much of this fast performance is the result of our innovative read prefetch system. But before we get into our software, let’s take a look at the hardware to understand what the software has to work with.

Storage media hardware characteristics

A typical Qumulo cluster, say 4 nodes of QC208, would have 104 hard disk drives (HDDs), 56 solid-state drives (SSDs), and 600 gigabytes of memory (RAM). Our customers have a lot of data, and they need it safely protected on disk. They also want fast access to that data.

[Figure: HDD and SSD storage media hardware characteristics]

Below is a diagram showing the time it takes to read data from the various storage media that are found in the typical Qumulo cluster.

[Figure: Reading 1 MB from a single HDD takes 10 to 20 milliseconds]

The fastest way to read data from Qumulo storage is to read it from RAM. But we can only keep a tiny fraction of the overall cluster’s data in memory at any given moment. How can we get smart about what we put in RAM, and when we put it there? How can we anticipate what a user is going to read and get it into RAM before they even ask for it?

A simple read pattern with no prefetch

To answer the questions above, let’s take a look at a simple read pattern from a single client for a single file that has no optimization.

[Figure: Read pattern with no prefetch]

Each read is relatively slow. This illustrates a scenario where every read comes from HDD and takes many milliseconds. In this scenario you might read a 50 MB file in 1 second, a throughput of 50 MB/s.
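The 50 MB/s figure can be reproduced with a back-of-the-envelope model. The sketch below assumes 1 MB requests and 20 ms per HDD read (the upper end of the 10 to 20 millisecond range quoted above for a single HDD). The function name and the request size are illustrative assumptions, not measurements from a Qumulo cluster.

```python
import math


def serial_read_seconds(file_mb: float, request_mb: float, latency_ms: float) -> float:
    """Time to read a file when each request waits for the previous one to finish."""
    requests = math.ceil(file_mb / request_mb)
    return requests * latency_ms / 1000.0


# Assumed values: 50 MB file, 1 MB per request, 20 ms per request from HDD.
seconds = serial_read_seconds(file_mb=50, request_mb=1, latency_ms=20)
throughput_mb_s = 50 / seconds
print(f"{seconds:.2f} s, {throughput_mb_s:.0f} MB/s")  # prints "1.00 s, 50 MB/s"
```

With no overlap between requests, the only ways to go faster are to cut the per-request latency or to have several requests in flight at once. The following sections cover both.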

Prefetching for a serial, slow read pattern

This next diagram shows what would happen if you read with this same pattern, but Qumulo’s software figures out what you’re going to read next and puts it into RAM, so that when you ask for it you get it back very quickly. Now you read the file in 0.5 seconds and you’re getting 100 MB/s. That’s decent.

[Figure: Prefetching for a serial, slow read pattern]
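To make the idea concrete, here is a minimal sketch of sequential read-ahead. It is not Qumulo’s implementation: the `ReadAheadCache` class, the block size, and the read-ahead depth are all hypothetical. It only illustrates the general technique of noticing that a read follows the previous one and loading the next blocks before they are requested.

```python
class ReadAheadCache:
    """Toy read-ahead cache over an in-memory 'slow' backing store (illustrative only)."""

    def __init__(self, backing: bytes, block_size: int = 4, depth: int = 2):
        self.backing = backing
        self.block_size = block_size
        self.depth = depth          # how many blocks to load ahead of the reader
        self.cache = {}             # block index -> bytes already "in RAM"
        self.last_block = None
        self.hits = 0
        self.misses = 0

    def _load(self, index: int) -> bytes:
        start = index * self.block_size
        return self.backing[start:start + self.block_size]

    def read_block(self, index: int) -> bytes:
        if index in self.cache:
            self.hits += 1
            data = self.cache.pop(index)
        else:
            self.misses += 1
            data = self._load(index)
        # A read that directly follows the previous one is treated as sequential.
        if self.last_block is not None and index == self.last_block + 1:
            for ahead in range(index + 1, index + 1 + self.depth):
                in_range = ahead * self.block_size < len(self.backing)
                if in_range and ahead not in self.cache:
                    self.cache[ahead] = self._load(ahead)
        self.last_block = index
        return data


backing = bytes(range(40))  # 10 blocks of 4 bytes
cache = ReadAheadCache(backing, block_size=4, depth=2)
data = b"".join(cache.read_block(i) for i in range(10))
print(cache.misses, cache.hits)  # prints "2 8"
```

In this toy run only the first two reads go to the slow store. Once the pattern is recognized, every later read is served from the cache.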

Prefetching client-optimized read patterns

But Windows, Mac, and Linux clients don’t tend to read files serially. They tend to read files in large, parallel batches. This next diagram shows that scenario. Three reads are kicked off at nearly the same time. The third read is successfully prefetched, though it still takes some time. The next seven reads are all served from memory: they arrive roughly in order, and each takes less than a millisecond. Now you’re able to read that 50 MB file in a tenth of a second. That’s 500 MB/s. It’s getting fast.

[Figure: Prefetching client-optimized read patterns]
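As an illustration of what “large, parallel batches” looks like from the client side, the sketch below keeps three reads in flight at once using `os.pread` (available on Unix-like systems) and a thread pool. The `parallel_read` helper, the chunk size, and the worker count are assumptions chosen to mirror the diagram. Real NFS and SMB clients do their batching inside the operating system, not in application code like this.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def parallel_read(path: str, chunk_size: int = 1 << 20, workers: int = 3) -> bytes:
    """Read a whole file with several positional reads in flight at once."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() returns results in offset order even if reads finish out of order.
            # Note: pread may return fewer bytes than requested; for a regular local
            # file this normally only happens at end of file.
            chunks = pool.map(lambda offset: os.pread(fd, chunk_size, offset), offsets)
            return b"".join(chunks)
    finally:
        os.close(fd)


# Small demo on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"abcdefghij" * 1000)
    demo_path = handle.name
try:
    demo_data = parallel_read(demo_path, chunk_size=4096, workers=3)
finally:
    os.remove(demo_path)
print(len(demo_data))  # prints "10000"
```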

Prefetching and file prediction

This next diagram shows what happens when Qumulo anticipates the next file you will read. In that case, we can pull all the data for file 2 from our 104 disk drives into RAM, and you can read that second file in 0.03 seconds. That means you’re now reading the file (and any future files) at a rate of about 1.6 GB/s. That’s blazing fast!

[Figure: Prefetching and file prediction]
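The post does not describe how the next file is chosen, so the following is only a guess at one simple heuristic: if a client has just read a file whose name ends in a number, assume the file with the next number comes next. The `predict_next_name` function is hypothetical and is not taken from Qumulo’s software.

```python
import os
import re
from typing import Optional

# Last run of digits in the file name stem (the part before the extension).
_LAST_NUMBER = re.compile(r"(\d+)(?!.*\d)")


def predict_next_name(name: str) -> Optional[str]:
    """Guess the next file name by incrementing the last number in the stem."""
    stem, extension = os.path.splitext(name)
    match = _LAST_NUMBER.search(stem)
    if match is None:
        return None  # no number to increment, so make no prediction
    digits = match.group(1)
    # Preserve zero padding, e.g. 0099 -> 0100.
    next_digits = str(int(digits) + 1).zfill(len(digits))
    return stem[:match.start(1)] + next_digits + stem[match.end(1):] + extension


print(predict_next_name("file1.txt"))       # prints "file2.txt"
print(predict_next_name("frame_0099.exr"))  # prints "frame_0100.exr"
print(predict_next_name("notes.txt"))       # prints "None"
```

A prediction like this only pays off when it is usually right, which is one reason a predictable read order matters for the recommendations later in this post.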

Things aren’t always perfect like the example above. Sometimes a read pattern is completely random. In that case, nothing is prefetched.

[Figure: File offset vs. time for a random read pattern]

Our prefetcher shuts off and doesn’t waste valuable RAM. Don’t worry, though: you can still benefit from our “read promotion to SSD cache”, but that is a topic for another day.
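One way to picture the “shut off” decision is a check on how many recent reads picked up where the previous one ended. The sketch below is an assumption-laden illustration, not Qumulo’s logic: the `should_prefetch` name, the window of 8 reads, and the 50% threshold are all made up for the example.

```python
from typing import List, Tuple


def should_prefetch(reads: List[Tuple[int, int]], window: int = 8,
                    min_fraction: float = 0.5) -> bool:
    """Decide whether recent reads look sequential enough to justify prefetching.

    reads is a list of (offset, length) pairs in arrival order.
    """
    recent = reads[-window:]
    if len(recent) < 2:
        return False  # not enough history to call it a pattern
    contiguous = sum(
        1
        for (offset, length), (next_offset, _) in zip(recent, recent[1:])
        if next_offset == offset + length
    )
    return contiguous / (len(recent) - 1) >= min_fraction


sequential = [(i * 4096, 4096) for i in range(8)]
scattered = [(40960, 4096), (0, 4096), (81920, 4096), (12288, 4096)]
print(should_prefetch(sequential))  # prints "True"
print(should_prefetch(scattered))   # prints "False"
```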

How to get great read performance from Qumulo

  • When reading inside of a file, read it in order, starting at the beginning
  • Within a directory:
    • Read the files in their numeric order (file1.txt, file2.txt, etc.)
    • Or, read files in the order they are returned via “ls -U”
  • Read in large chunks, when possible
  • When working with small files (especially files smaller than a few megabytes), read the whole file
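The recommendations above can be sketched in a few lines of Python. The helper names and the 1 MB chunk size are illustrative assumptions. Note that a plain `sorted()` would put file10.txt before file2.txt, so the sketch sorts on the numbers in the name. With `numeric=False` it keeps the order `os.scandir` returns, which Python documents as arbitrary. On Linux it generally matches the unsorted directory order printed by “ls -U”, but treat that as an assumption.

```python
import os
import re
import tempfile


def natural_key(name: str) -> list:
    """Sort key that compares digit runs as numbers (file2 before file10)."""
    parts = re.split(r"(\d+)", name)
    # With one capture group, odd positions are always the digit runs.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


def read_files_in_order(directory: str, chunk_size: int = 1 << 20, numeric: bool = True):
    """Read every file in a directory start to finish, in a predictable order."""
    with os.scandir(directory) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    if numeric:
        names.sort(key=natural_key)
    total = 0
    for name in names:
        with open(os.path.join(directory, name), "rb") as handle:
            # Read from the beginning, in large chunks, through to the end.
            while chunk := handle.read(chunk_size):
                total += len(chunk)
    return names, total


with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("file10.txt", "file2.txt", "file1.txt"):
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(b"x" * 100)
    order, total_bytes = read_files_in_order(demo_dir)
print(order, total_bytes)  # prints "['file1.txt', 'file2.txt', 'file10.txt'] 300"
```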

