Claire Giordano, senior director for Emerging Storage Markets at Quantum, says storage infrastructure can make a difference in scientific research.
Access Made Easy
Reanalyze, Replicate, Reproduce
Easy to Grow and Scale
Cost-Effective Resource Allocation
Interoperable & Integrated
- ACCESS MADE EASY
Storage isn’t just about storing bits on a disk—the objective of a storage solution is to ensure that people have access to the information they need, when they need it, how they need it.
Most researchers need shared access, self-service access, and for some things, high-speed access.
Shared access enables more efficient workflows when individuals and teams are collaborating. And yet, not all storage solutions optimize for sharing. Some optimize for the high IOPS of local storage at the expense of sharing. An inability to share data across systems and between users can lead to inefficient, serialized workflows, where data first has to be moved from local storage to some other repository before other teams can work on it. Not good!
Self-service access means that scientists don’t have to wait. Most researchers do not want to file an IT ticket to request archived data—and then wait a few hours (or a few days) to get it back. When data has been archived for long-term storage on different infrastructure, it’s ideal if researchers can still access the files they need themselves, from the location where they expect to find the data, without intermediation or delay.
High-speed access matters to data-intensive applications and workloads, particularly in high-performance computing. And storage that delivers the speeds required by applications and HPC clusters—while also letting customers spread data across multiple tiers, so they’re not forced to put all of their data on expensive disk—well, that is yet another way that storage infrastructure can help research teams today.
- REANALYZE, REPLICATE, REPRODUCE
When I talk to customers in other industries, such as media and entertainment, I sometimes refer to this requirement as “Reuse, Repurpose, and Remonetize.” In the M&E world, video that was first shot years ago may be reused and repurposed as part of a new movie, documentary, or television show. Sometimes old sports video from decades ago can be remastered and remonetized in a new form.
Content producers recognize the value of remixing older content with new video—and that it’s important to archive content effectively, because “your archive is only as valuable as your ability to retrieve it quickly.”
In the scientific world, of course, people’s goals and objectives are different. Their focus is on curing cancer, rather than entertaining customers. But if anything, that makes the need to reference older, archived data even more critical.
Sometimes research projects last for years. Before publishing a genomics paper, for example, researchers might need to reanalyze some of the original raw sequencing results using newer bioinformatics techniques, in order to augment the original analysis.
- EASY TO GROW AND SCALE
The need to reanalyze older data leads to another key storage attribute in the realm of scientific research. Teams need to be able to scale—and to scale big. It’s not uncommon for our customers to have 15PB (yes, petabytes) of data today and to know that it will grow to, say, 25 or 30PB in the next several years.
So a storage solution that makes it easy to grow a file system on the fly, without downtime, without stopping people from doing their jobs—well, that can be useful.
And a storage solution that enables organizations to scale capacity with different types of storage, so they can balance the tradeoffs between cost and risk—well, that is another way that storage infrastructure can make it easy to grow and scale.
And there’s more. A storage solution that makes it easy to archive on ingest, to create near-immediate copies of data without any backup headaches—well, that too becomes incredibly useful, especially as the size of the dataset gets to a point where backup simply isn’t an option.
Of course, the need to grow and scale isn’t always about capacity. Many institutes and departments are securing more grants and increasing the number of projects they need to manage, which drives up performance requirements and increases the number of users who need access to the data repository.
So yet another way storage infrastructure can support scientific workflows is to make it easy to scale—to scale capacity, to scale users, and to scale performance.
- COST-EFFECTIVE RESOURCE ALLOCATION
Everyone has to live within some kind of budget, right?
In the world of science, grant money and charitable gifts create opportunity and also create limitations. So even in the noble disciplines of science, bills have to be paid and tradeoffs must be made.
Storage infrastructure that provides the ability to combine different types of storage such as flash, disk, object storage, tape, and cloud—each with different cost attributes—can give organizations the flexibility to deploy the type of storage that best balances their needs for performance, scale, access, and budget.
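To make that tradeoff a little more concrete, here is a minimal, purely illustrative sketch in Python. The tier names, the thresholds, and the choose_tier function are hypothetical assumptions for the sake of the example, not any particular product’s policy engine; the idea is simply that a policy can route each file to flash, disk, object storage, or an archive tier based on how recently it was accessed.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical, cost-ordered tiers (fastest and most expensive first).
# Thresholds are illustrative only; a real policy would weigh many more
# factors: project activity, file size, grant lifecycle, compliance, etc.
TIER_RULES = [
    ("flash", timedelta(days=7)),      # hot data for active analysis and HPC jobs
    ("disk", timedelta(days=90)),      # warm data still referenced regularly
    ("object", timedelta(days=365)),   # cool data kept online for self-service access
]
COLD_TIER = "tape_or_cloud_archive"    # everything older lands on the cheapest tier


def choose_tier(last_accessed: datetime, now: Optional[datetime] = None) -> str:
    """Pick a storage tier based on how long ago a file was last accessed."""
    now = now or datetime.now()
    age = now - last_accessed
    for tier, max_age in TIER_RULES:
        if age <= max_age:
            return tier
    return COLD_TIER


if __name__ == "__main__":
    print(choose_tier(datetime.now() - timedelta(days=3)))    # flash
    print(choose_tier(datetime.now() - timedelta(days=400)))  # tape_or_cloud_archive
```

The exact thresholds would differ for every institution and workload; the point is that the cost-versus-performance tradeoff can be expressed as policy and automated, so researchers never have to think about which tier their data lives on.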
- INTEROPERABLE & INTEGRATED
Finally, a rip-and-replace approach doesn’t work in environments where resources need to be allocated wisely.
So a storage solution that avoids rip-and-replace can make a big difference. Research teams can benefit from storage that integrates easily into existing infrastructure and with existing applications.
Claire Giordano is senior director for Emerging Storage Markets at Quantum, focused on demanding big data use cases, including life sciences, geospatial, and scientific research.