The PivotNine Blog

VAST Data Shows What Stateless Containers Can Do

19 January 2022
Justin Warren

VAST Data uses containers as a building block inside its storage systems to create a very high performance scale-out file and object flash storage array. It's a really interesting approach: software abstracts the hardware, so standard commodity components can be combined with proprietary special sauce.

“We were lucky to start VAST at a point in time where there were underlying technologies—hardware, software, protocol and networking—that allowed us to build a new architecture on top,” said founder and CEO Renen Hallak. “We could build a lot of new algorithms and metadata structures on top of that and break free from the fundamental trade-offs that have existed for so many years.”

Renen Hallak, VAST Data co-founder and CEO (Source: Supplied)

The approach is clearly working, as Hallak boasts of a $100 million annual run-rate and net dollar retention of 300%. With a healthy $200 million on the balance sheet, VAST is also cashflow positive so it doesn't need more funding, but with a valuation of around $4 billion and strong growth, raising more is unlikely to be a problem should they decide to do so.

VAST Data have focused on a software-only storage approach that combines commodity hardware from major vendors and standard interfaces with their own proprietary software smarts to create something special.

“Out of around 200 developers, only two know something about hardware,” says Hallak.

These two are focused on qualifying the software to ensure it will run on various vendors' hardware, with the interface standards doing the majority of the heavy lifting. Customers buy their own hardware from their preferred suppliers, just as they buy server hardware from whoever they prefer, and then load software onto the devices. This is unremarkable in servers, but is still somewhat novel in storage arrays (though less than it once was).

VAST uses NVMe over Fabrics (NVMe-oF) as the backbone of its direct-attached, shared-everything architecture, with standard components from Mellanox and Intel. Metadata is kept on storage-class memory for fast persistence, but the flash storage itself can be quite low-grade in terms of write endurance, thanks to VAST's special software.

“In terms of the actual software, it's all containerized,” says Hallak. “It can run on any standard server that has access to a network, so that part is the easiest. Customers run this software on many different types of server from many different vendors. Anywhere you can run a container, basically.”
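To make the stateless idea concrete, here is a toy sketch in Python. It is not VAST's code, and every name in it (SharedStore, StatelessWorker, the keys) is made up for illustration. It assumes only what is described above: the workers hold no data of their own, and all of them can reach the same persistent layer.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SharedStore:
    """Hypothetical stand-in for the shared persistent layer (in-memory here)."""
    blocks: Dict[str, bytes] = field(default_factory=dict)


class StatelessWorker:
    """A worker that keeps no data of its own between requests."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self._store = store  # a reference to shared state, not a private copy

    def put(self, key: str, value: bytes) -> None:
        self._store.blocks[key] = value

    def get(self, key: str) -> bytes:
        return self._store.blocks[key]


store = SharedStore()
worker_a = StatelessWorker("worker-a", store)
worker_b = StatelessWorker("worker-b", store)

worker_a.put("report.txt", b"hello")
del worker_a  # losing a worker loses no data

# Any surviving or newly started worker can serve the same data.
worker_c = StatelessWorker("worker-c", store)
assert worker_b.get("report.txt") == b"hello"
assert worker_c.get("report.txt") == b"hello"
```

Because nothing lives in the workers, they can be added, removed, or restarted freely. All of the hard problems move into the shared layer, which is where the real engineering effort goes.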

I find this use of containers for backend persistence fascinating because of how little anyone needs to care about this when using VAST Data for storage. The access protocol is file and object, so NFS, SMB, or S3. What happens after that is essentially irrelevant, except for the performance and operational characteristics that it allows VAST to provide.

“You needed this type of very, very different architecture in order to be able to use containers in the way they were supposed to be for storage,” says Hallak. “But once you do that, you get so many side effects, so many nice benefits from it.”

“It allows you to go to that next level of hyper-converged. You no longer need all of your resources tightly coupled one to the other, and all of your services running on the same box coming from the same vendor. You can have best-of-breed storage containers, best-of-breed networking containers, best-of-breed database containers, all with access to all of the information over a standard network.”

VAST is a great example of the way to do persistent containers well: as a storage layer, not as a mongrel hybrid of application and persistence like so many attempts currently are. There is a kind of purity, a degree of focus, in this approach that provides tremendous benefits to performance, reliability, and operational simplicity. I hope a lot more people pay attention to this kind of architecture and start to use it for similar scenarios.

Containers are wonderful when they are used to do container-specific things and not as pseudo-virtual-machines.

I highly recommend diving more deeply into how VAST does what it does. There's a lot to be learned.