How can I budget for a high-performing data strategy? originally appeared on Quora, the place to gain and share knowledge, empowering people to learn from others and better understand the world.
Welcome to our first Qumulo on Quora insights roundup!
The Quora Experts in Technology Space is a place where thought leaders in the field of technology collaborate to answer your most pressing questions. In October 2021, Quora welcomed Ben Gitenstein, Vice President of Product Management at Qumulo, as one of its newest technology experts, joining a community of thought leaders sharing their knowledge, experiences, actionable steps, and advice. And he's eager to answer your most pressing questions!
Qumulo on Quora
Today we are kicking off a new running series on our blog — Qumulo on Quora — in which we'll be sharing Ben's insights with some additional actionable steps and analysis to help you succeed in your own technology journeys.
Our first story covers a question Ben answered about how to budget for a high-performing data strategy. We’re going to drill down on his answer with some additional insights and analysis.
However, before we get to the answer to this question, it’s important for our readers to understand that controlling costs must begin with a data storage budget based on predictable expenses — and that can be difficult without an efficient data architecture. Let’s analyze this further.
Storage efficiency must be built into the core data architecture
Storage efficiency can't be bolted on to your file system; instead, it must be built into the core software architecture to enable efficient storage of both large and small files. For example, Qumulo's file system supports very large storage capacity with low computational overhead and 100% usable capacity.
These attributes of efficient storage combine to form a high-performance system that consumes less power, resulting in energy savings. Advanced, sustainable software (higher performance with lower power consumption) begins with an efficient software architecture that stores more in less space.
The role of storage density in efficiency at scale
Storage density plays an important role in efficiency at scale. Qumulo’s software-defined approach to file data storage enables hardware flexibility and supports the densest media from manufacturers and OEMs as soon as it is available.
Qumulo Core features the capability to scale dynamically, so you can add new node types to existing clusters to take advantage of increased density. For example:
- Denser nodes deliver more capacity in the same (or often smaller) energy envelope, yielding efficiency and energy savings
- Generous DRAM and SSD/HDD ratios in hybrids reduce energy consumption further through intelligent machine learning (ML) caching
“Ultimately, it came down to consolidation,” said Jim Mercurio, executive vice president and general manager of Levi’s Stadium for the 49ers. “We were rolling about 44 TB a day. IT was just clunky. […] Qumulo allowed us to consolidate everything and made things much simpler.”
How can I budget for a high-performing data strategy?
When enterprises are managing massive amounts of unstructured data with high-performance demands, one of their greatest challenges is how easily the bill can rack up. Data continues to grow, yet, generally speaking, your budget isn’t growing with it. “You might find yourself falling into the trap of overpaying for a vendor — and even worse, you could be locked into that vendor for years,” noted Ben on Quora.
His advice? Ben suggested the following:
You can reduce costs by creating a strategy that optimizes for efficiency and streamlines your data practices. This requires breaking down data silos and consolidating into a single data platform so that every scientist, researcher, developer, and artist has access to the data they need to build products and answer questions. From a storage standpoint, that means you need to look for products that can create massively scalable namespaces with no limit on the capacity or performance they can offer, and a single converged identity model so that your users can access any data regardless of the client applications that were used to write it.
Lastly, be sure to use real-time analytics to help understand where your organization is using bandwidth, and to monitor how your infrastructure is performing on both capacity and performance. You can’t budget if you don’t have visibility.
At the end of the day, the most cost-effective solutions should allow you to scale your workloads to billions of files, big and small, without overcharging you for functionality you’re not using.
Related story: What to Consider When Evaluating Enterprise Data Storage Solutions
Now, let’s take a look at his answer with some additional insights and analysis — beginning with balancing your data strategy budget for more than storage capacity.
Budget for more than storage capacity
Your data strategy budget should span everything from the cost of labor to manage and run your data workloads to the cost of the hardware capacity and maintenance to store it. However, don’t overlook dollars per throughput, dollars per IOPS, startup cost, or the technical debt that accumulates with an ever-expanding data center.
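As a rough illustration of weighing these dimensions together, the sketch below normalizes a storage quote into dollars per usable TB and dollars per IOPS so that competing options can be compared on more than raw capacity. All figures and field names are hypothetical, not vendor pricing:

```python
# Hypothetical cost model: normalize a storage quote into comparable
# unit costs. All numbers and field names are illustrative.

def unit_costs(quote):
    """Return total cost plus dollars per usable TB and dollars per IOPS."""
    total = (
        quote["hardware"]                               # up-front capacity + nodes
        + quote["startup"]                              # installation / migration labor
        + quote["annual_labor"] * quote["years"]        # staff to run the workloads
        + quote["annual_maintenance"] * quote["years"]  # support contracts
    )
    return {
        "total": total,
        "dollars_per_tb": total / quote["usable_tb"],
        "dollars_per_iops": total / quote["iops"],
    }

quote = {
    "hardware": 400_000, "startup": 25_000,
    "annual_labor": 60_000, "annual_maintenance": 30_000,
    "years": 5, "usable_tb": 2_000, "iops": 250_000,
}
costs = unit_costs(quote)
print(f"${costs['dollars_per_tb']:.0f}/TB, ${costs['dollars_per_iops']:.2f}/IOPS")
```

Running the same model over several quotes makes it obvious when a low dollars-per-TB price hides a high dollars-per-IOPS price, or vice versa.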
For strategies to manage the budget of data storage as part of your high-performance data strategy, read: How To Resolve the Storage Pains of Legacy Software, Availability, and Budget.
Optimize your compute resources with a hybrid-cloud file system architecture
A hybrid cloud file system architecture helps you reduce the costs mentioned above from data center ‘sprawl’ while leveraging cloud efficiencies of scale for high-performance workloads. While it makes sense to run some workloads on-premises, others require the capacity of the cloud. Enterprises can burst to the cloud, or simply move a subset of their data there to leverage cloud-native services, apps and storage.
“Our team has been able to sustain burst scaling at a rate of 1.3 million IOPS for upwards of 5 hours at a time, with peaks as high as 2 million IOPS,” said Jeremy Brousseau, Head of IT, Cinesite Vancouver. “This is a level unheard of in the past, and it highlights how much Qumulo has helped us to condense our production timelines when required and allow artists to have more iterations in less time, overall resulting in higher-quality final work.”
Better resource sharing by way of cloud computing can help you reduce expenses and run more efficiently. In addition to streamlining costs, it will help you reduce your carbon footprint. To support a hybrid cloud strategy your data needs to be stored in a single namespace with multi-protocol file access from infrastructure on-premises and in public clouds. This enables different groups of users to collaborate on the same datasets whether they are using Linux, Windows, or Mac applications.
Qumulo’s file system was built for hybrid-cloud environments to help reduce the costs of buying and maintaining hardware in your own data center while enabling you to:
- Migrate to the cloud without needing to refactor applications or workflows
- Mobilize data to and from Amazon S3 using built-in data services that come standard in the Qumulo Core file data platform
- Right-size cloud workloads for high performance with the ability to add storage capacity if needed, versus provisioning for peak
- Use Qumulo on Azure as a fully managed SaaS to control expenses and pay only for what you use
Control data access and usage in real time via data visualization
The ability to budget effectively depends on knowing what’s happening with your data, so you can manage and control it proactively. For example, data visualization of real-time analytics helps you understand where you’re using bandwidth to prevent bottlenecks and overruns. It enables you to monitor your infrastructure’s capacity and performance to spot trends and adjust accordingly.
Data visualization is particularly helpful when you’re running in the cloud. For example, in the cloud or on premises, Qumulo’s real-time analytics dashboard enables you to see how many clients are connected, who is using the most bandwidth, and where the system is growing quickly.
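As a sketch of the kind of proactive check this visibility enables, the snippet below projects when a cluster will fill up from recent used-capacity samples. The numbers and the sampling approach are hypothetical, not output from any real analytics API:

```python
# Hypothetical capacity-trend check: given daily used-capacity samples (TB),
# estimate growth rate and days until a cluster fills. In a real deployment
# these samples would come from the storage platform's analytics.

def days_until_full(samples_tb, total_tb):
    """Linear projection of days until used capacity reaches total_tb."""
    if len(samples_tb) < 2:
        return None
    daily_growth = (samples_tb[-1] - samples_tb[0]) / (len(samples_tb) - 1)
    if daily_growth <= 0:
        return None  # flat or shrinking usage: no projected fill date
    return (total_tb - samples_tb[-1]) / daily_growth

# Seven days of samples growing ~4 TB/day on a 1,000 TB cluster.
samples = [700, 704, 709, 712, 716, 721, 724]
remaining = days_until_full(samples, 1_000)
print(f"Projected full in {remaining:.0f} days")
```

A check like this turns raw capacity telemetry into a budget signal: if the projected fill date lands inside the current budget cycle, you know an expansion line item is coming.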
Gain greater yield from hardware via software optimization
Back to the original tenet that efficiency must be “built in” versus “bolted on”: Qumulo Core software is optimized to reduce wasteful over-provisioning. To deliver more performance within the same power envelope, our software reads large chunks from HDD and small chunks from SSD (and even smaller from RAM). Being software-defined (not dependent on hardware sales) means our customers get the most out of their fixed asset life cycles, with 100% of hardware models still supported.
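The idea of serving reads from the fastest tier that holds the data can be sketched as a simple tier lookup. This is an illustrative model only; the tier names and structures are not Qumulo's internal implementation:

```python
# Illustrative tiered-read sketch: serve a block from the fastest tier
# that holds it (RAM, then SSD, then HDD). Not a real storage engine.

class TieredStore:
    def __init__(self):
        self.ram, self.ssd, self.hdd = {}, {}, {}

    def read(self, block_id):
        """Return (tier_name, data) from the fastest tier holding the block."""
        for tier_name, tier in (("ram", self.ram), ("ssd", self.ssd), ("hdd", self.hdd)):
            if block_id in tier:
                return tier_name, tier[block_id]
        raise KeyError(block_id)

store = TieredStore()
store.hdd["b1"] = b"cold bulk data"
store.ssd["b2"] = b"warm data"
store.ram["b2"] = b"warm data"  # promoted copy; reads now hit RAM first
print(store.read("b1")[0], store.read("b2")[0])
```

The efficiency win comes from the ordering: frequently accessed blocks get promoted upward, so most reads never touch the power-hungry spinning disks.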
Join the discussion with thought leaders in technology
Head over to the Quora Experts in Technology Space where you can join in with your own thoughtful discussion, ask questions, and read more of Ben Gitenstein’s expert answers about data strategy trends and the dynamic forces shaping the data management and storage industry today.
And, of course, keep an eye out for our next Qumulo on Quora roundup!