We recently kicked off the new fiscal year here at Qumulo. During the company kickoff, this chart was presented, showing the average read latency for Qumulo customers around the world for the last year.

As you can see, NFS and SMB protocol read latencies are down substantially. As customers upgraded their Qumulo software during 2018, they watched their read performance get faster and faster.

The speed of memory for the cost of hard disk

While Qumulo’s cost-effective hybrid cloud file storage system stores the majority of data on relatively slow hard disk drives, you can see below that the majority of data is served to customers at the speed of SSD and memory. The majority of read operations for Qumulo’s customers have sub-millisecond latency.

How do our customers get low NFS read latency and low SMB read latency? Much of this fast performance is the result of our innovative read prefetch system. But, before we get into our software, let’s take a look at the hardware to understand what we’re going to do.

Storage media hardware characteristics

A typical Qumulo cluster, say four nodes of QC208, would have 104 hard disk drives (HDDs), 56 solid state drives (SSDs), and 600 gigabytes of memory (RAM). Our customers have a lot of data and they need that data safely protected on disk. They also want fast access to that data.

Below is a diagram showing the time it takes to read data from the various storage media that are found in the typical Qumulo cluster.

The fastest way to read data from Qumulo storage is to read data from RAM. But, we can only keep a tiny fraction of the overall cluster’s data in memory at any given moment. How can we get smart about what we put in RAM, and when we put it there? How can we anticipate what a user is going to read and get it there before they even ask for it?

A simple read pattern with no prefetch

To answer the questions above, let’s take a look at a simple read pattern from a single client for a single file that has no optimization.

Each read is relatively slow. This illustrates a scenario where every read comes from HDD and takes many milliseconds. You might read a 50MB file in 1 second in this scenario, or 50 MB/s.
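The arithmetic behind that baseline can be sketched with a couple of lines of Python. The per-read latency here is an illustrative assumption, not a measured Qumulo number:

```python
# No-prefetch serial reads: every 1 MB read waits on the HDD.
# HDD_READ_MS is an assumed, illustrative latency per read.
FILE_MB = 50
HDD_READ_MS = 20  # assumed per-read HDD latency (seek + transfer)

total_s = FILE_MB * HDD_READ_MS / 1000   # 50 reads x 20 ms = 1.0 s
throughput = FILE_MB / total_s           # 50 MB/s
print(f"{total_s:.1f} s, {throughput:.0f} MB/s")  # → 1.0 s, 50 MB/s
```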

Prefetching for a serial, slow read pattern

This next diagram shows what would happen if you read with this same pattern, but Qumulo’s software figures out what you’re going to read next and puts it into RAM, so that when you ask for it you get it back very quickly. Now you read the file in 0.5 seconds and you’re getting 100 MB/s. That’s decent.
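A toy model of one-block-ahead prefetching shows where the speedup comes from. All figures here are illustrative assumptions, not Qumulo internals: the prefetcher fetches block N+1 while the client consumes block N, hiding some fraction of each HDD wait:

```python
# One-block-ahead prefetch, toy arithmetic (illustrative figures only).
FILE_MB = 50
HDD_READ_MS = 20      # assumed HDD latency per 1 MB block
RAM_READ_MS = 0.1     # assumed RAM latency per block
OVERLAP = 0.5         # assumed fraction of each HDD wait hidden by prefetch

# The first block pays full HDD latency; later blocks pay the un-hidden
# remainder of the disk wait plus a RAM hit.
total_ms = HDD_READ_MS + (FILE_MB - 1) * (HDD_READ_MS * (1 - OVERLAP) + RAM_READ_MS)
print(f"{total_ms/1000:.2f} s, {FILE_MB/(total_ms/1000):.0f} MB/s")  # → 0.51 s, 97 MB/s
```

With these assumed numbers the model lands close to the 0.5 s / 100 MB/s figure in the text; the real gain depends on how much disk latency the prefetcher can hide.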

Prefetching client-optimized read patterns

But, Windows, Mac, and Linux clients don’t tend to read files serially. They tend to read files in large, parallel batches. This next diagram shows that scenario. Three reads are kicked off at nearly the same time. The third read is successfully prefetched, though it still took some time. But, the next seven reads are now all in memory and they are read roughly in order and the reads all take less than a millisecond. Now you’re able to read that 50MB file in a tenth of a second. That’s 500 MB/s. It’s getting fast.
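One way to see why the parallel-batch case is faster, using assumed numbers rather than Qumulo measurements: a single spindle serializes the first batch of in-flight reads, and by the time it finishes, the prefetcher has the rest staged in RAM:

```python
# Ten 5 MB reads of a 50 MB file, first three issued in parallel.
# All latencies are illustrative assumptions.
HDD_MS_PER_READ = 33   # assumed: 5 MB at ~150 MB/s from one HDD
RAM_MS_PER_READ = 0.5  # assumed sub-millisecond RAM hit

# The single disk serializes the first three reads; the remaining
# seven are prefetch hits served from RAM.
total_ms = 3 * HDD_MS_PER_READ + 7 * RAM_MS_PER_READ
print(f"{total_ms:.1f} ms -> {50 / (total_ms / 1000):.0f} MB/s")  # → 102.5 ms -> 488 MB/s
```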

Prefetching and file prediction

This next diagram shows what happens when Qumulo anticipates the next file you read. In that case we can pull all the data for file 2 from our 104 disk drives into RAM and you can read that second file in 0.03 seconds. That means you’re now reading the file (and any future files) at a rate of 1.6GB/s. That’s blazing fast!
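The cross-file case can be sketched the same way. The key is parallelism across spindles: staging the next file over 104 drives at once takes only a few milliseconds, so it is hidden entirely behind the reads of the current file. The per-drive rate and client-side ceiling below are illustrative assumptions:

```python
# Cross-file prefetch, toy arithmetic (assumed figures, not measurements).
FILE_MB = 50
DRIVES = 104
HDD_MBPS = 150        # assumed per-drive streaming rate
CLIENT_MBPS = 1600    # assumed client-side ceiling (RAM/network bound)

stage_s = FILE_MB / (DRIVES * HDD_MBPS)  # parallel staging, hidden from the client
read_s = FILE_MB / CLIENT_MBPS           # client reads file 2 entirely from RAM
print(f"staged in {stage_s*1000:.1f} ms; read in {read_s:.3f} s")  # → staged in 3.2 ms; read in 0.031 s
```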

Things aren’t always perfect like the example above. Sometimes a read pattern is completely random. In that case, nothing is prefetched: our prefetcher shuts off so it doesn’t waste valuable RAM. Don’t worry though, you can still benefit from our “read promotion to SSD cache”, but that topic is for another day.

How to get great read performance from Qumulo

  • When reading inside of a file, read it in order, starting at the beginning
  • Within a directory:
    • Read the files in their numeric order (file1.txt, file2.txt, etc)
    • Or, read files in the order they are returned via “ls -U”
  • Read in large chunks, when possible
  • When working with small files (especially files less than a few megabytes), read the whole file
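As a sketch, the tips above might look like this in a client-side read loop. The file names, chunk size, and “small file” cutoff are illustrative assumptions, not Qumulo recommendations of specific values:

```python
import os

CHUNK = 4 * 1024 * 1024   # read large files in big sequential chunks (4 MiB here)
SMALL = 2 * 1024 * 1024   # assumed "small file" cutoff: a couple of megabytes

def read_in_order(paths):
    """Read files front-to-back, in sorted (e.g. numeric) order.

    Returns the total number of bytes read.
    """
    total = 0
    for path in sorted(paths):            # file1.txt, file2.txt, ...
        with open(path, "rb") as f:
            if os.path.getsize(path) <= SMALL:
                total += len(f.read())    # small file: read it whole
            else:
                while chunk := f.read(CHUNK):  # large file: in-order chunks
                    total += len(chunk)
    return total
```

Reading this way gives the prefetcher an easy pattern to recognize, which is what keeps the reads landing in RAM.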
