File and Object on Flash

This paper was sponsored by Pure Storage.

Flash has made major inroads into workloads that would historically have been classed as ‘second-tier’. The rapidly dropping cost of flash over the past ten years, combined with its vastly superior throughput and stable, low-latency performance, has made flash an obvious choice.

Disk-based storage is no longer the first choice for systems where performance is valued over raw capacity, and not just on ‘first-tier’ systems built on traditional block-based SANs. Most data lives outside these systems, and it exists mostly as files and objects. This is why FlashBlade was an obvious product to introduce back in 2016, but the way customers use file and object data has changed since then.

Customers no longer have to justify using flash; rather, they have to rule flash out as a viable option before considering disk-based systems. Due to the major improvements in capacity available on flash-based systems like Pure Storage’s FlashBlade, flash now makes the most sense for many file and object workloads that used to end up on disk-based systems.

This is a subtle, but profound, change in outlook. It means that workloads as diverse as high-performance computing (HPC), modern application development, analytics, and data protection have all seen success on FlashBlade.

We take a closer look at a couple of these workloads—analytics and data protection—to see why this has happened, and the direction this trend is likely to take in the near future.

Analytics

Modern analytics systems are dealing with very large datasets. One example is McArthur Labs at Canada’s McMaster University. McArthur Labs developed a global database that curates data, models, and algorithms associated with superbugs. The data from these genomic datasets was doubling every three months, and analyzing data took up to two days. These kinds of datasets used to be stored on large arrays full of spinning disk.

But traditional storage cannot provide that capacity with the necessary performance. “There’s no point in playing with traditional storage, because it’s just not fast enough,” said Andrew McArthur, Ph.D., a genomics professor and researcher. Enter FlashBlade. Large amounts of file- and object-accessible storage provided the performance to generate results in three hours rather than two days. Pure’s flexible model helps the system keep growing with the data, and Evergreen upgrades keep the system current as McArthur’s needs change.

This flexibility of modern flash systems helps customers treat storage infrastructure as a set of malleable options to bring to bear on a problem. The ability to make different choices as the world changes around us is a subtler change to storage infrastructure than raw speed, but a far more profound one.

Data Protection

Flash is becoming a more common choice for protecting data that needs to be recovered rapidly, particularly given the rise in ransomware.

There’s not a lot of motivation to pay a ransom of $10 million for a single Excel file unless there’s something incredibly special about that one file. But when a ransomware attack means losing your internal email systems, the entire payroll processing system and Active Directory all at once, the ability to quickly recover a lot of data makes flash a compelling choice.

Pure Storage customers like Sinai Medical in Chicago are particularly sensitive to having systems offline. Statistics from the Office of the Australian Information Commissioner (OAIC) show that healthcare organizations are consistently the most likely to suffer a data breach. With the health of patients potentially at risk, healthcare organizations need to know they can recover quickly if they’re attacked.

Again, the ability to react quickly to changing circumstances is, PivotNine believes, an under-appreciated benefit of modern infrastructure systems.

Products not Projects

But beyond the fairly obvious reasons for moving to flash for pure performance lies a change in IT from doing projects to managing products.

In some ways, very little has changed in using flash for file and object data compared to using spinning disks. The basic function of the storage, storing data, is the same. But the speed and flexibility of flash have led to a qualitative change in how storage systems are used.

Flash offers far more flexibility than we could achieve with previous technologies, and this adds benefits beyond mere performance. It’s much easier to simply upgrade storage now. Operating system upgrades can happen online and largely automatically. Full hardware replacements are also now done essentially online, with negligible impact on active work.

This frees infrastructure teams from worrying as much about low-level details and allows them more time to work on delivering a more abstract service to the systems that need storage. Rather than delivering one-off projects, infrastructure teams can now manage storage as a product that constantly adapts to the needs of its customers.

AutoNation moved to Pure Storage for high-performance storage that could handle large datasets as it continues to improve its customer experience. CatholicTV moved to FlashBlade because 15,000 RPM disks couldn’t keep up with modern UltraHD video content and streaming.

Note how the fundamental function of storing data hasn’t changed. It is our ability to provide that function in new, more appropriate ways that has changed. The flexibility of modern storage systems like FlashBlade means storage is no longer an unnecessary barrier to desirable change. It allows infrastructure teams to move with their customers to where those customers want to go, instead of being a dead weight.

Infrastructure like FlashBlade allows us to respond to change, rather than to resist it. The infrastructure can get out of the way, and IT can think in terms of products, not projects; systems of infrastructure, not mere disks.

This is what PivotNine sees as the most interesting aspect of FlashBlade, and modern infrastructure in general.