Progenity conducts biotechnology research to improve patient care: genetic sequencing operations produce billions of small files used for diagnostic tests, and all of them are housed on Qumulo’s file data management platform.
Progenity, Inc. is a biotechnology company that provides clinicians with complex molecular and specialized diagnostic tests for women’s health, reproductive medicine, and oncology. Progenity conducts genetic sequencing research in areas like genetic carrier testing, cell-free DNA (cfDNA) screening, and cancer testing to encourage the spread of information and, ultimately, to provide the industry with data that leads to better patient care.
The company was founded in 2012, currently has more than 500 employees, and is growing rapidly. In addition to its headquarters in San Diego, it has laboratory and operations facilities in Ann Arbor, Michigan and Dallas, Texas. As the company continues to expand, its data storage needs continue to grow exponentially.
Growing data-intensive genetic sequencing workloads require efficient, scalable storage for billions of files
Over the years, Progenity’s work in genetic sequencing has generated more than a billion files. According to David Meiser, Solutions Architect for Linux and Windows applications at Progenity, “That pace is accelerating. Within two years, we might have another billion files.”
With its rapid growth and data-intensive workflows, Progenity knew that its previous storage vendor would be unable to meet its future needs. “After a few years with our original storage system, we realized that the way the company worked wasn’t a good model for us,” said Meiser, referring to both high costs and storage inefficiencies.
“One problem that was always present was that there was significant file overhead. The files we write are very small, and the block size of our old storage system was very large,” Meiser explained. “We found that we couldn’t do analysis in-place because the access times were super high.”
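The overhead Meiser describes can be illustrated with simple arithmetic. The sketch below is not Progenity’s or Qumulo’s actual accounting; it assumes a simplified model in which every file occupies a whole number of fixed-size blocks, and the file size (2 KiB) and block sizes (4 KiB and 64 KiB) are hypothetical values chosen only for illustration. Real storage systems also add metadata and data-protection overhead that this model ignores.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Space consumed when a file is stored in whole blocks (simplified model)."""
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def overhead_ratio(file_sizes, block_size: int) -> float:
    """Fraction of allocated space that holds no file data."""
    logical = sum(file_sizes)
    allocated = sum(allocated_bytes(size, block_size) for size in file_sizes)
    return (allocated - logical) / allocated


# Hypothetical workload: 1,000 files of 2 KiB each (not Progenity's real data).
sizes = [2 * 1024] * 1000

for block in (4 * 1024, 64 * 1024):
    ratio = overhead_ratio(sizes, block)
    print(f"block size {block // 1024} KiB: {ratio:.1%} of allocated space is padding")
```

Under these assumptions, half of the allocated space is padding with 4 KiB blocks, and nearly 97 percent is padding with 64 KiB blocks, which is why a large block size is costly when most files are very small.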
Better cost-density value for genetic sequencing operations with rapid access on-premises or in the cloud
Progenity selected the Qumulo File Data Platform for hybrid cloud environments to replace its legacy vendor in 2016. Qumulo was able to offer an affordable solution for Progenity, without sacrificing performance or scalability. Qumulo storage also runs both on-premises and in the public cloud, another key advantage for Progenity’s planned migration to the cloud.
According to Meiser, “Qumulo handles small files very well, and that again goes back to cost because we’re getting better cost density. Even if we were paying the same for a gigabyte of data on Qumulo as we were for our legacy system, we’d still be getting a better value for our money.”
Meiser estimated that, conservatively, Progenity saves 13 percent on storage space with Qumulo compared to its previous vendor, but he noted that the actual savings are probably closer to between 17 and 20 percent.
Success factors: real-time visibility, support for multiple file access protocols and an easy path to the cloud
Qumulo’s ability to provide real-time visibility into data usage, including hotspots and inefficiencies, has proven to be an important benefit. “Now that we have it, we very much appreciate it. As we were migrating from the legacy system to Qumulo, we were able to figure out which files the research and development folks could delete so we weren’t wasting storage.”
Qumulo’s support of both the Network File System (NFS) and Server Message Block (SMB) protocols is another key advantage for Progenity. According to Meiser, “We’ve got to be able to write data over SMB and read that data back over NFS. That’s our big story for how we analyze data.”
Progenity’s data inputs come from Illumina sequencers, which deliver results directly to the Qumulo cluster over SMB. The data is also copied, using the NFS protocol, to the next-generation sequencing (NGS) compute cluster, which includes local storage used for the actual computational analysis.
The company is planning to move its NGS compute cluster to the cloud and to use Qumulo to manage its public cloud storage on AWS. Qumulo’s continuous replication feature ensures that local shares will be kept up to date. The Customer Care team at Qumulo will be on hand to ensure success.
“We’ve been exceedingly happy with the answers we’ve gotten from Qumulo at all levels,” said Meiser. “One of the biggest selling points is having a Slack channel where I can immediately get in touch with Qumulo Care for any sort of question, whether it’s big or small. We use that feature a lot and we really love it.”
Read the full case study: Helping Progenity Plan for Its Future: Service, Value and Scale