Qumulo’s Distributed File System
Real-time visibility and control
Qumulo’s file system is designed to do much more than store file data. It also lets you manage your data and users in real time. Administrators of legacy storage appliances are often hampered by “data blindness,” meaning they can’t get an accurate picture of what’s happening in their file system. Qumulo’s file system is designed to provide that visibility, no matter how many files and directories there are. You can, for example, get immediate insight into throughput trends and hotspots. You can also set real-time capacity quotas, which avoid the time-consuming quota provisioning overhead of legacy storage. This information is available through a graphical user interface (GUI) and, programmatically, through a REST API.
The integrated analytics features of the Qumulo file system are provided by a component called QumuloDB.
How it’s possible
When people are introduced to Qumulo’s real-time analytics and watch them perform at scale, their first question is usually, “How can it be so fast?” The breakthrough performance of Qumulo’s real-time analytics is possible for three reasons:
- QumuloDB analytics are built-in and fully integrated with the file system itself. In legacy systems, metadata queries are answered outside of the core file system by an unrelated software component.
- Because the file system relies on B-trees, QumuloDB analytics can use an innovative system of real-time aggregates (more on this below).
- The file system has a streamlined design, which comes from its use of B-tree indexes and of the virtualized protected blocks and transactions of the Qumulo Scalable Block Store (SBS).
Real-time aggregation of metadata
In the Qumulo file system, metadata such as bytes used and file counts are aggregated as files and directories are created or modified. This means that the information is available for timely processing without expensive file system tree walks.
QumuloDB maintains up-to-date metadata summaries. It uses the file system’s B-trees to collect information about the file system as changes occur. Various metadata fields are summarized inside the file system to create a virtual index.
The performance analytics that you see in the GUI and can pull out with the REST API are based on sampling mechanisms that are built into the file system. Statistically valid sampling techniques are possible because of the availability of up-to-date metadata summaries that allow sampling algorithms to give more weight to larger directories and files.
Aggregating metadata in Qumulo’s file system uses a bottom-up and top-down approach.
As each file (or directory) is updated with new aggregated metadata, its parent directory is marked “dirty” and another update event is queued for the parent directory. In this way, file system information is gathered and aggregated while being passed up the tree. The metadata propagates up from the individual node, at the lowest level, to the root of the file system as data is accessed in real time. Each file and directory operation is accounted for, and this information eventually propagates all the way up to the root of the file system. Here is an example.
The tree on the left is aggregating file and directory information and incorporating it into the metadata. An update is then queued for the parent directory. The information moves up, from the leaves to the root.
In parallel to the bottom-up propagation of metadata events, a periodic traversal starts at the top of the file system and reads the aggregate information present in the metadata. When the traversal finds recently updated aggregate information, it prunes its search and moves on to the next branch. It assumes that aggregated information is up to date in the file system tree from this point down towards the leaves (including all contained files and directories) and does not have to go any deeper for additional analytics. Most of the metadata summary has already been calculated, and, ideally, the traversal only needs to summarize a small subset of the metadata for the entire file system. In effect, the two parts of the aggregation process meet in the middle with neither having to explore the complete file system tree from top to bottom.
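The two halves of the aggregation process described above can be sketched in a few lines of Python. This is a minimal illustration, not Qumulo's implementation: the `Node` class, the `propagate` and `traverse` functions, the logical clock (`now`), and the `max_age` freshness threshold are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """Hypothetical file or directory record carrying an aggregate summary."""
    name: str
    own_bytes: int = 0
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list = field(default_factory=list)
    agg_bytes: int = 0   # bytes used at or below this node
    agg_time: int = -1   # logical time of the last aggregate update (-1 = never)
    dirty: bool = False  # True while an update event is queued for this node


def add_child(parent: Node, child: Node) -> Node:
    child.parent = parent
    parent.children.append(child)
    return child


def mark_dirty(node: Node, queue: deque) -> None:
    """Queue an update event for a node unless one is already pending."""
    if not node.dirty:
        node.dirty = True
        queue.append(node)


def propagate(queue: deque, now: int, max_events: int) -> int:
    """Bottom-up: process queued updates; each one marks the parent dirty."""
    done = 0
    while queue and done < max_events:
        node = queue.popleft()
        node.dirty = False
        node.agg_bytes = node.own_bytes + sum(c.agg_bytes for c in node.children)
        node.agg_time = now
        done += 1
        if node.parent is not None:
            mark_dirty(node.parent, queue)
    return done


def traverse(node: Node, now: int, max_age: int, visited: list) -> int:
    """Top-down: refresh stale aggregates, pruning at recently updated nodes."""
    visited.append(node.name)
    if node.agg_time >= 0 and now - node.agg_time <= max_age:
        return node.agg_bytes  # assumed current for the whole subtree: prune
    node.agg_bytes = node.own_bytes + sum(
        traverse(c, now, max_age, visited) for c in node.children
    )
    node.agg_time = now
    return node.agg_bytes


# Example tree: /projects/render/frame.exr and /scratch/tmp.bin
root = Node("/")
projects = add_child(root, Node("projects"))
render = add_child(projects, Node("render"))
frame = add_child(render, Node("frame.exr", own_bytes=900))
scratch = add_child(root, Node("scratch"))
tmp = add_child(scratch, Node("tmp.bin", own_bytes=40))

queue = deque()
mark_dirty(frame, queue)                             # a write to frame.exr
processed = propagate(queue, now=10, max_events=2)   # reaches "render" only
visited = []
total = traverse(root, now=11, max_age=5, visited=visited)
print(total, visited)
```

In this example the bottom-up pass stops after two events, so only `frame.exr` and `render` carry fresh summaries. The top-down pass then prunes at `render` and never visits `frame.exr`, which is the "meet in the middle" behavior described above. The sketch inherits the same assumption as the text: a recently updated aggregate is trusted for everything beneath it.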
Sampling and metadata queries
One example of Qumulo’s real-time analytics is its performance hotspots report. Here is an example from the GUI:
Representing every individual operation, with its throughput and IOPS, within the GUI would be infeasible in large file systems. Instead, QumuloDB queries use probabilistic sampling to provide a statistically valid approximation of this information. Totals for read and write IOPS, as well as for read and write throughput, are generated from samples gathered from an in-memory buffer of more than 4,000 entries that is updated every few seconds.
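As a rough illustration of how totals can be estimated from a bounded buffer of samples, consider the following sketch. It is not Qumulo's algorithm: the `OpSampler` class, the fixed `sample_rate`, and the capacity of 4,096 entries are assumptions chosen for the example (the text above only says the buffer holds more than 4,000 entries).

```python
import random
from collections import Counter, deque


class OpSampler:
    """Hypothetical sampler: keeps a bounded buffer of sampled operations."""

    def __init__(self, sample_rate=0.01, capacity=4096, rng=None):
        self.sample_rate = sample_rate          # probability of keeping an op
        self.buffer = deque(maxlen=capacity)    # oldest samples fall off
        self.rng = rng or random.Random()

    def record(self, path, op, nbytes):
        """Called for every operation; only a random fraction is stored."""
        if self.rng.random() < self.sample_rate:
            self.buffer.append((path, op, nbytes))

    def estimate(self):
        """Scale each sample by 1/sample_rate to estimate totals for the
        window of operations that the buffer currently covers."""
        scale = 1.0 / self.sample_rate
        iops, throughput = Counter(), Counter()
        for path, op, nbytes in self.buffer:
            iops[(path, op)] += scale
            throughput[(path, op)] += nbytes * scale
        return iops, throughput

    def hotspots(self, top=3):
        """Paths with the largest estimated throughput, reads plus writes."""
        _, throughput = self.estimate()
        by_path = Counter()
        for (path, _op), est in throughput.items():
            by_path[path] += est
        return by_path.most_common(top)


sampler = OpSampler(sample_rate=0.05, rng=random.Random(42))
for _ in range(20_000):
    sampler.record("/projects/render", "write", 65_536)
    sampler.record("/home/alice", "read", 4_096)
print(sampler.hotspots(top=2))
```

Because each stored sample stands in for roughly `1 / sample_rate` real operations, the estimates are approximate, but the memory cost stays fixed regardless of how busy the cluster is.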
The report shown above displays the operations that are having the largest impact on the cluster. These are represented as hotspots in the GUI.
Qumulo’s ability to use statistically valid probabilistic sampling is only possible because of the summarized metadata for each directory (bytes used, file counts) that is continually kept up-to-date by QumuloDB. It is a unique benefit of Qumulo’s advanced software techniques that are found in no other file storage system.
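To see why per-directory summaries make weighted sampling cheap, consider the sketch below. It picks a file with probability proportional to its size by descending from the root and choosing each child in proportion to its aggregated byte count, so no full tree walk is needed. The `Entry` class and `sample_file` function are hypothetical names for illustration only, and the sketch assumes the aggregates are already up to date.

```python
import random
from dataclasses import dataclass, field


@dataclass(eq=False)
class Entry:
    """Hypothetical directory entry with a precomputed byte aggregate."""
    name: str
    agg_bytes: int                      # bytes used at or below this entry
    children: list = field(default_factory=list)


def sample_file(root: Entry, rng: random.Random) -> str:
    """Walk one root-to-leaf path, weighting each step by agg_bytes."""
    node, path = root, [root.name]
    while node.children:
        weights = [c.agg_bytes for c in node.children]
        if sum(weights) == 0:
            break  # nothing below here uses any space; stop at this directory
        node = rng.choices(node.children, weights=weights, k=1)[0]
        path.append(node.name)
    return "/".join(path)


media = Entry("media", 900, [Entry("movie.mov", 800), Entry("clip.mp4", 100)])
fs = Entry("fs", 1000, [media, Entry("notes.txt", 100)])

rng = random.Random(7)
counts = {}
for _ in range(10_000):
    picked = sample_file(fs, rng)
    counts[picked] = counts.get(picked, 0) + 1
print(counts)  # movie.mov should account for roughly 80% of the picks
```

Each sample costs one step per directory level rather than one step per file, which is what lets larger directories and files receive proportionally more weight at scale.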