Here’s how a truly universal-scale storage system works
Go under the hood and see QF2’s unique abilities
Qumulo is a new kind of storage company, based entirely on advanced software. Commodity hardware running advanced, distributed software is the unchallenged basis of modern low-cost, scalable computing. This is just as true for large-scale file storage as it is for search engines and social media platforms.
How can it be so fast?
When people see QF2 in action for the first time, they often ask: “How can it do that so fast?” Get the answers with our technical white paper.
Clusters that work together
In QF2, cloud instances or computing nodes built from standard hardware combine to form a cluster with scalable performance and a single, unified file system. Multiple QF2 clusters can then be joined into a globally distributed but highly connected storage fabric, tied together with continuous replication.
QF2 is unique in how it approaches the problem of scalability. Its design incorporates principles used by modern, large-scale, distributed databases. The result is a file system with unmatched scale characteristics.
Billions of files
For massively scalable files and directories, the QF2 file system makes extensive use of index data structures known as B-trees. B-trees minimize the amount of I/O required for each operation as the amount of data increases. With B-trees as a foundation, the computational cost of reading or inserting data blocks grows very slowly as the amount of data increases.
A highly distributed scalable block store persists the B-trees across the QF2 cluster.
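The logarithmic growth is easy to see with a rough calculation. Assuming an illustrative fanout of 256 entries per B-tree node (not a published QF2 figure), the number of node reads needed per lookup grows only with the logarithm of the item count:

```python
import math

def btree_height(num_items: int, fanout: int = 256) -> int:
    """Levels in a B-tree: each level multiplies capacity by the fanout,
    so height (and I/O per lookup) grows logarithmically with item count."""
    if num_items <= 1:
        return 1
    return math.ceil(math.log(num_items, fanout))

# A thousand-fold increase in data adds only a single level of I/O.
for n in (10**6, 10**9, 10**12):
    print(f"{n:>15,} items -> {btree_height(n)} levels")
```

With these assumptions, going from a million to a trillion items raises the per-lookup cost from three node reads to five.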
QF2 provides real-time visibility and control for file systems of all sizes, even with file counts in the tens of billions. Up-to-the-minute analytics let administrators pinpoint problems and control precisely how storage is used, and queries about usage, activity and throughput at any level of the unified directory structure are answered instantly.
In the QF2 file system, metadata such as bytes used and file counts are aggregated as files and directories are created or modified. This means that the information is available for timely processing without expensive file system tree walks.
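As a sketch of the idea (not Qumulo's actual implementation), aggregates stay current if every change propagates its delta up the directory tree at write time, so reading a total never requires a tree walk:

```python
class Dir:
    """Toy directory node whose aggregate metadata is updated on every change."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.bytes_used = 0   # aggregate over this directory's entire subtree
        self.file_count = 0

    def add_file(self, size):
        # Push the delta up to the root at write time; reads then need no tree walk.
        node = self
        while node is not None:
            node.bytes_used += size
            node.file_count += 1
            node = node.parent

root = Dir("/")
home = Dir("home", parent=root)
alice = Dir("alice", parent=home)
alice.add_file(4096)
alice.add_file(1024)
# root's totals are already current: 5120 bytes, 2 files
```

Reading any directory's totals becomes a constant-time lookup, at the cost of an update per ancestor on each write.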
View videos of QF2 analytics in action.
Real-time quotas
Just as real-time aggregation of metadata enables QF2’s real-time analytics, it also enables real-time capacity quotas. Quotas let administrators specify how much capacity a given directory is allowed to use for files.
Unlike quotas in legacy systems, QF2 quotas take effect immediately and do not have to be provisioned. They are enforced in real time, changes to their limits apply instantly, and they can be specified at any level of the directory tree.
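A minimal sketch of how enforcement can work when per-directory byte totals are aggregated in real time: a write is admitted only if it fits under every quota on its path. The paths and limits below are invented for illustration:

```python
# Invented example: per-directory byte totals ("usage") are assumed to be
# aggregated in real time; quota limits are set on arbitrary directories.
quotas = {"/home": 10 * 2**30}                   # 10 GiB limit on /home
usage = {"/": 6 * 2**30, "/home": 4 * 2**30}     # current real-time aggregates

def ancestors(path):
    """Every path prefix on the way down from the root."""
    parts = path.rstrip("/").split("/")
    return ["/"] + ["/".join(parts[:i]) for i in range(2, len(parts) + 1)]

def write_allowed(path, nbytes):
    """Admit a write only if it fits under every quota covering the path."""
    return all(usage.get(d, 0) + nbytes <= quotas[d]
               for d in ancestors(path) if d in quotas)

write_allowed("/home/alice/a.dat", 5 * 2**30)   # fits: 4 GiB + 5 GiB <= 10 GiB
write_allowed("/home/alice/b.dat", 7 * 2**30)   # denied: would exceed /home's quota
```

Because the aggregates are already current, each check is a constant-time comparison per ancestor directory rather than a scan of the subtree.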
Try QF2’s real-time quotas on a test cluster.
Snapshots
Snapshots let system administrators capture the state of a file system or directory at a given point in time. If a file or directory is modified or deleted unintentionally, users or administrators can revert it to its saved state.
Snapshots in QF2 have an extremely efficient and scalable implementation. A single QF2 cluster can have a virtually unlimited number of concurrent snapshots without performance or capacity degradation.
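One common way to make snapshots this cheap is copy-on-write, where a snapshot shares every unchanged block with the live file system and new writes go to fresh blocks. The toy model below illustrates the principle; it is not QF2’s actual on-disk format:

```python
class CowBlockStore:
    """Toy copy-on-write store: a snapshot is just a copy of the block map,
    so it shares all unchanged data with the live file system."""

    def __init__(self):
        self.live = {}        # block number -> contents
        self.snapshots = []   # saved block maps

    def write(self, blockno, data):
        self.live[blockno] = data  # old contents remain reachable via snapshots

    def snapshot(self):
        self.snapshots.append(dict(self.live))  # copies references, not data
        return len(self.snapshots) - 1

    def read_snapshot(self, snap_id, blockno):
        return self.snapshots[snap_id].get(blockno)

store = CowBlockStore()
store.write(0, b"version 1")
snap = store.snapshot()
store.write(0, b"version 2")        # live data changes...
old = store.read_snapshot(snap, 0)  # ...but the snapshot still sees version 1
```

A snapshot costs only the blocks written after it was taken, which is why holding many of them need not degrade performance or capacity.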
Try QF2’s snapshots using a test cluster.
Continuous replication
QF2 provides continuous replication across storage clusters, whether on premises or in the public cloud. Once a replication relationship between a source cluster and a target cluster has been established and synchronized, QF2 automatically keeps data consistent. There’s no need to manage the complex replication job queues associated with legacy storage appliances.
Continuous replication in QF2 leverages QF2’s advanced snapshot capabilities to ensure consistent data replicas. With QF2 snapshots, a replica on the target cluster reproduces the state of the source directory at exact moments in time. QF2 replication relationships can be established on a per-directory basis for maximum flexibility.
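The approach can be sketched as diffing two snapshots and shipping only the delta to the target. This toy model treats a directory as a flat path-to-contents mapping and is illustrative only:

```python
def snapshot(fs):
    """Point-in-time copy of a directory's state (here, a flat path -> data map)."""
    return dict(fs)

def diff(old_snap, new_snap):
    """Everything created, modified or deleted between two snapshots."""
    changed = {p: v for p, v in new_snap.items() if old_snap.get(p) != v}
    deleted = [p for p in old_snap if p not in new_snap]
    return changed, deleted

def replicate(target, old_snap, new_snap):
    """Apply only the delta, bringing the target to the source snapshot's state."""
    changed, deleted = diff(old_snap, new_snap)
    target.update(changed)
    for p in deleted:
        target.pop(p, None)

source = {"a.txt": "one"}
base = snapshot(source)
target = dict(base)          # target starts in sync with the base snapshot
source["b.txt"] = "two"      # changes accumulate on the source...
del source["a.txt"]
replicate(target, base, snapshot(source))  # ...and only the delta is shipped
```

Because each transfer moves the target from one consistent snapshot to the next, the replica always reflects the source at an exact moment in time.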
Try continuous replication yourself using test clusters.
QumuloDB
When people are introduced to QF2’s real-time analytics and watch them perform at scale, their first question is usually, “How can it be that fast?” The breakthrough performance of QF2’s analytics is possible because of a component called QumuloDB.
QumuloDB continually maintains up-to-date metadata summaries for each directory. It uses the file system’s B-trees to collect information about the file system as changes occur. Various metadata fields are summarized inside the file system to create a virtual index. The performance analytics that you see in the GUI and can pull out with the REST API are based on sampling mechanisms that are enabled by QumuloDB’s metadata aggregation. QumuloDB is built-in and fully integrated with the file system itself.
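A sketch of how aggregation enables sampling: with an up-to-date aggregate file count for every directory, a representative sample can be drawn by descending the tree and choosing each child with probability proportional to its aggregate, without visiting most of the tree. The directory layout and counts below are invented:

```python
import random

# Invented tree: each directory records its own direct file count; subtree
# aggregates are what a system like QumuloDB would keep continuously current.
tree = {
    "/":     {"children": ["/home", "/data"], "files": 0},
    "/home": {"children": [], "files": 10},
    "/data": {"children": [], "files": 990},
}

def aggregate(path):
    """Total files in the subtree rooted at path."""
    node = tree[path]
    return node["files"] + sum(aggregate(child) for child in node["children"])

def sample_dir(path="/"):
    """Pick a directory with probability proportional to its share of all files."""
    node = tree[path]
    options = [path] + node["children"]
    weights = [node["files"]] + [aggregate(child) for child in node["children"]]
    choice = random.choices(options, weights=weights)[0]
    return path if choice == path else sample_dir(choice)
```

Directories holding 99% of the files are sampled about 99% of the time, so estimates of where capacity and activity concentrate converge quickly.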
Scalable Block Store (SBS)
The QF2 file system sits on top of a transactional virtual layer of protected storage blocks called the Scalable Block Store (SBS). Rather than requiring each file to manage its own protection, SBS applies data protection beneath the file system, at the block level. QF2’s block-based protection provides outstanding performance in environments with petabytes of data and workloads of mixed file sizes.
SBS has many benefits, including:
- Fast rebuild times in case of a failed disk drive
- The ability to continue normal file operations during rebuild operations
- No performance degradation due to contention between normal file writes and rebuild writes
- The same storage efficiency for small files as for large files
- Accurate reporting of usable space
- Efficient transactions that allow QF2 clusters to scale to many hundreds of nodes
- Built-in tiering of hot/cold data that gives flash performance at archive prices
- Built-in support for all-flash configurations for workloads that require the highest performance
The virtualized protected block functionality of SBS is a huge advantage for the QF2 file system. Legacy storage systems without SBS protect data on a file-by-file basis or with fixed RAID groups, approaches that introduce difficult problems such as long rebuild times, inefficient storage of small files and costly management of disk layouts.
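To see why block-level protection decouples rebuilds from files, consider simple XOR parity over fixed-size blocks. This is a deliberately simplified stand-in for real protection schemes, not QF2’s actual implementation; the point is that the rebuild operates on blocks regardless of which files they belong to:

```python
def parity(blocks):
    """XOR parity across equal-sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def rebuild(survivors, parity_block):
    """Recover a single lost block: XOR the survivors with the parity."""
    return parity(list(survivors) + [parity_block])

# Blocks may hold pieces of many different files; the rebuild neither knows nor cares.
data = [b"aaaa", b"bbbb", b"cccc"]
p = parity(data)
recovered = rebuild([data[0], data[2]], p)  # pretend block 1's drive failed
```

Since a rebuild reads and writes only protected blocks, it proceeds the same way whether the cluster holds a few huge files or billions of small ones.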