Due to the rapid, enormous growth of unstructured data on a multi-petabyte scale, the classic Network Attached Storage (NAS) array model has been reaching both technical and economic limits for some time. This applies to performance and to the cost of scaling capacity relative to the required availability (locality environments, data replication, recovery times with RAID 6, limited scalability, license costs, administration costs).
There is a growing need to give software developers, DevOps, IT and industry teams the freedom to use file data in ways that make more sense for the needs of modern businesses. Should unstructured data management be on premises vs cloud, or as a hybrid cloud deployment?
Today enterprises are sitting on a veritable data goldmine: Unstructured digital assets that are the key to innovation and create a direct path to rapidly increasing business value. Worldwide we generate, use and consume more data than ever. According to IDC, there will be a data explosion on the order of hundreds of zettabytes in the next few years (for comparison: 1 zettabyte = 1024 exabytes = ~10^21 bytes), with 80 percent of this growth being due to unstructured data alone. New applications and services continue to appear on the stage. In addition, corporate customers are increasingly producing extensive digital content.
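The unit comparison above mixes two conventions: a factor of 1024 belongs to the binary (IEC) prefixes, while roughly 10 to the 21st power bytes is the decimal (SI) zettabyte. A minimal sketch of the arithmetic follows; the constant and function names are illustrative and not taken from any vendor tool.

```python
# Decimal (SI) vs. binary (IEC) readings of the storage prefixes.
# All names here are illustrative assumptions, not part of any product API.
SI_ZETTABYTE = 10**21   # 1 ZB  = 1000 EB  (decimal convention)
IEC_ZEBIBYTE = 2**70    # 1 ZiB = 1024 EiB (binary convention)
IEC_EXBIBYTE = 2**60


def binary_to_decimal_ratio() -> float:
    """How much larger a zebibyte is than a decimal zettabyte."""
    return IEC_ZEBIBYTE / SI_ZETTABYTE


if __name__ == "__main__":
    # Exbibytes per zebibyte under the binary convention.
    print(IEC_ZEBIBYTE // IEC_EXBIBYTE)          # prints 1024
    # The two conventions differ by about 18 percent at this scale.
    print(f"{binary_to_decimal_ratio():.3f}")    # prints 1.181
```

At the zettabyte scale the gap between the two conventions is about 18 percent, which is why the figure in the text is marked as approximate.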
Unstructured data growth brings cloud adoption challenges
Unstructured data is massively valuable. Innovative enterprises can only extract this value if their teams have unrestricted access to their data and can process and use it securely. Seamless data exchange with other team members and partners is just as important. The data must be usable with essential application services that usually only exist in the cloud. Legacy and isolated enterprise data storage solutions based on hardware-first models create performance barriers and problems that severely limit a company’s growth options. Seamless access to a company’s resources is essential to adapt digital experiences and innovate quickly.
Who benefits from unstructured data?
Unstructured data affects a wide variety of industries. This type of data includes large-format digital content such as movies, animations and games, high-resolution images that have to be analyzed and recognized, as well as social media, geospatial data for resource analysis, commercial system log files, CAD files, genome research, and more. Unstructured data is also the key to uncovering errors in manufacturing processes, and in healthcare, images provide results in milliseconds that affect patient service and general diagnostic capabilities.
When highly skilled, business-critical employees who rely on the added value of unstructured data (such as physicians, financial analysts, researchers, engineers, or media & entertainment (M&E) creatives) struggle to find and use data with applications, frustration and project slowdown are inevitable. The intended result can’t be achieved without the massive support of a systems expert. This is a major hurdle for enterprises that want to innovate beyond the IT level. The last thing modern enterprises want is for their experts to be slowed down in their own field by storage systems.
Enterprises are facing infrastructure challenges
The storage and processing of unstructured data often take place in real-time. But all too often unstructured data is still in silos, which in turn are located on legacy systems. The greatest concern for enterprises is whether their current infrastructure has the capacity to meet the demands of high-performance, data-intensive workloads. Additionally, IT teams often spend a lot of time managing poor quality data rather than providing their business units with the tools they need, such as application services.
Even if teams can properly identify the bottleneck and want to use cloud services to solve it, their data is often trapped in proprietary systems and on-premises storage hardware. Although cloud innovations can benefit teams for certain workflows, those teams are unsure whether or when they can move their applications, data, and workloads to the cloud without an extensive new architecture and a time horizon of months or years.
On premises vs cloud: misunderstandings persist
A company’s data belongs where it makes the most sense for the business and the teams. What is clear, however, is that a common misconception persists today: many organizations believe that migrating to the public cloud requires refactoring apps for file data. But the reality is you don’t have to rewrite and replace the original applications to move the business forward. A redevelopment of the architecture is also not necessary.
Instead, it is important to rethink how file data (in the sense of unstructured data) is used in enterprises and which applications and data specifically need to be moved to private or public clouds. It is even possible to bring workflows and applications to the cloud if they are not cloud-native, because data can be connected to applications via file protocols and APIs. This can be achieved with a file data platform that enables data to be moved effortlessly between cloud and on-premises storage environments, and even between public cloud environments.
How enterprises benefit from using unstructured data in the cloud vs on premises
The answer to gaining more value from and control over corporate data lies in a cloud file data platform that drives innovation across the enterprise by enabling teams to work together seamlessly and break down existing data storage limitations. The right platform should give DevOps, IT, and information security teams access to a single file data lake or centralized file system to help the company innovate smoothly.
Can enterprises similarly use unstructured data without the cloud?
The short answer: Yes.
Not everything can be implemented in the cloud, because applications are sometimes missing or workflows are not established. However, Qumulo’s breakthrough technology was developed for hybrid cloud environments as well as cloud first. The Qumulo approach is radically simple: software-defined and built to run on standard hardware. This creates flexibility and reduces costs. Qumulo’s on-premises customers use the system locally on Apollo or Quanta systems from HPE, as well as Fujitsu and Supermicro and other platforms on HPE. Software updates include all updates that are a prerequisite for the respective storage hardware.
The result? A file data management and storage platform that has all the advantages of an integrated device.
With the ability to use unstructured data in the cloud or on premises come new market opportunities
When unstructured data can be managed in the cloud or on premises, the possibilities for innovation are game-changing. For example:
- A genome research lab catalogs and processes millions of genome sequences through an artificial intelligence (AI) service in the cloud and adds petabytes of information to its file data system.
- A mortgage processor uses data to streamline the digital mortgage application and approval process within minutes instead of months.
- An automaker scans each component for defects as it moves down the production line: high-resolution images are captured in real time, and a public cloud service returns the analysis result in milliseconds.
- A retailer collects data from in-store purchases every second in order to advertise the most wanted products by region in near real time.
Analyzing unstructured data in the cloud vs on premises empowers innovation
The simple provision of data in the cloud opens up completely new data for enterprises and enables them to drive innovative business models and develop additional sources of income. It also allows enterprises such as M&E production studios or life sciences research firms to hire remote talent and give them access to resources anywhere in the world.
When unstructured data is in the cloud, it is easier to use cloud-native functions such as AI and machine learning (ML) from cloud providers or other independent software providers, even with decades of data and IP stored in silos on older storage systems. With the right use of their data, enterprises can revive old assets and consider completely new business opportunities, such as IoT edge use cases or a digital supply chain.
Cloud or on premises: To innovate, say goodbye to outdated storage solutions
With a file data platform that runs the same enterprise file system in the cloud and on premises, data can be replicated natively and seamlessly across locations or regions. Teams can work with data at maximum capacity and performance, and can therefore share large amounts of information in a short time, with unlimited storage and computing power in the cloud and without compromising manageability.
A version of this post was originally published in German by Qumulo on Storage Consortium.