Open source v commercial parallel file systems – truth and fiction in an HPC world

High-Performance Computing (HPC) and its ability to store, process and analyse vast amounts of data in record time is driving innovation all around us. By Jim Donovan, Chief Marketing Officer, Panasas.

Friday, 9th August 2019 | Posted by Phil Alsop

As enterprises increase their use of emerging technologies such as Artificial Intelligence (AI), machine learning and augmented reality to improve productivity and efficiency, they are looking for the best high-performance data storage infrastructure to support business operations and make automated decisions in real time. HPC data storage systems rely on parallel file systems to deliver maximum performance, and there are two options to choose from: open source or commercial parallel file systems. Opinions abound on both, so it’s worth examining what’s hype and what’s real.


Cost of acquisition – What’s better than free?


An inherent part of any open source product is that it is free to acquire, and open source parallel file systems such as Lustre and BeeGFS are no different. While there are highly proficient Lustre and BeeGFS architects and developers in HPC labs around the world ready to tackle each system’s complex set-up, tuning and maintenance requirements, enterprise users can become overwhelmed by a system that lacks the manageability and ease of use they have grown accustomed to in their existing IT environment. By the time Chief Information Officers (CIOs) factor in the cost of the additional staff required to implement and manage an open source parallel file system, there’s quite a price tag attached to the ‘free’ acquisition.
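For readers unfamiliar with what that tuning looks like in practice, the short sketch below shows the kind of per-directory striping decision a Lustre administrator typically makes and revisits by hand. It assumes a Lustre client with the standard lfs utility installed; the directory paths and stripe settings are hypothetical examples, not recommendations.

import subprocess

def set_lustre_striping(directory, stripe_count, stripe_size):
    # Spread files created under `directory` across `stripe_count` storage
    # targets (OSTs), writing `stripe_size` chunks to each. The right choice
    # depends on the workload and has to be revisited when the I/O pattern changes.
    subprocess.run(
        ["lfs", "setstripe", "-c", str(stripe_count), "-S", stripe_size, directory],
        check=True,
    )

# Large sequential writes (e.g. simulation checkpoints) tend to favour wide striping...
set_lustre_striping("/lustre/project/checkpoints", stripe_count=8, stripe_size="4M")
# ...while directories holding many small files are usually kept on a single target.
set_lustre_striping("/lustre/project/small_files", stripe_count=1, stripe_size="1M")

Each decision is trivial in isolation; spread across hundreds of directories and changing workloads, it becomes the standing administrative task the ‘free’ price tag conceals.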


Here’s where commercial parallel file systems have a competitive edge over open source offerings. Commercial parallel file systems are delivered as plug-and-play systems that offer some of the lowest total cost of operations and ownership in the business, thanks to ease of deployment and simple manageability that keep administrative overhead negligible. In addition, commercial file systems can tune and retune themselves automatically as workloads change, reducing the opportunity cost of downtime.


Confusing customisation with flexibility


Open source file systems give each site its own implementation of the code, allowing skilled users to modify, customise and extend its functionality to fit their organisation’s unique workflows. But are users really looking for customisation, or do they actually want more flexibility?


If true customisation is needed, enterprise users should assess the skill set and the number of staff required to successfully modify and support the open source code. If flexibility is the ultimate goal, today’s commercial file systems adapt dynamically to changing workflows without any changes to code.


Built on industry-standard hardware that allows for the rapid adoption of new technology, commercial parallel file systems are self-tuning solutions, purpose-built for the adaptability and flexibility to handle a wide range of use cases. Users can configure the system to their exact workload needs without overprovisioning any single component. Systems scale without limitation, and bandwidth, capacity and metadata performance can each be set independently with granular control.


Elimination of the ‘performance’ gap


Commercial parallel file systems have closed what used to be the performance gap with open source. Today’s portable, commercial parallel file systems, which leverage the latest hardware and storage media technology, now perform on par with their open source counterparts. The ability to scale quickly, in increments, without interruption and without retuning is crucial for commercial applications to stay on track and meet demanding time-to-market schedules. Processing large and complex data sets with high precision while handling thousands of I/O operations simultaneously is a must for high-end computing deployments in the commercial space, such as computer-aided engineering (CAE) simulation and analysis, energy exploration and drug development, as well as emerging workloads such as AI and autonomous driving.
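As a rough illustration of the access pattern those workloads generate, the sketch below uses a pool of worker processes writing independent files against a shared mount in parallel. The mount point, file sizes and worker count are hypothetical and only meant to show the shape of the concurrent load a parallel file system is expected to absorb.

import os
from multiprocessing import Pool

MOUNT = "/mnt/parallel_fs"          # hypothetical mount point of the parallel file system
CHUNK = b"x" * (4 * 1024 * 1024)    # 4 MiB per write

def write_worker(rank):
    # Each worker writes its own file; the file system's job is to sustain
    # aggregate bandwidth while many of these run at the same time.
    path = os.path.join(MOUNT, "worker_{:04d}.dat".format(rank))
    with open(path, "wb") as f:
        for _ in range(64):         # 64 x 4 MiB = 256 MiB per worker
            f.write(CHUNK)
    return os.path.getsize(path)

if __name__ == "__main__":
    with Pool(processes=32) as pool:  # scale the worker count to the size of the machine
        sizes = pool.map(write_worker, range(32))
    print("wrote {:.1f} GiB across {} files".format(sum(sizes) / 2**30, len(sizes)))

In production the clients are whole compute nodes rather than local processes, but the requirement is the same: aggregate throughput must hold steady as the number of simultaneous writers grows.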


Performance is optimised and reliably consistent when the software and hardware are pre-tuned, allowing the system to adjust automatically to increasing complexity. This is the case with portable, commercial parallel file systems that have been optimised for, and are in tune with, pre-qualified commodity hardware components. Open source file systems don’t benefit from the same level of seamless integration: they often require deep knowledge of how the storage system works in order to tune and re-tune it for the performance and bandwidth utilisation that different workloads demand.


System Maintenance – What does it take to keep things running reliably?


In the fast-paced world of HPC, users are tackling new and complex projects all the time. Data storage is an essential component in guaranteeing business-critical deliverables, and solutions that are easy to deploy, manage and scale have an immediate impact on a company’s bottom line. Simplicity across the board translates not only into low administrative overhead, but also into a finely tuned, self-managing system in which the common maintenance workflows, as well as data reliability, have been automated. This means enterprise users need not worry about downtime, lost data or late-night emergency calls.


Commercial file systems have mastered this ‘lights-out’ operational approach, while administrators of many open source counterparts still spend a considerable amount of time on day-to-day storage management and maintenance, dealing with the time-consuming, complex and error-prone work of tuning the interaction of software and hardware.


Bringing it all together


Today, the need for high-performance data storage infrastructure in the commercial enterprise cannot be overstated. The massive volumes of data generated by emerging technologies such as AI and machine learning are growing exponentially as these applications integrate ever more easily with enterprise business, across industries from manufacturing to life sciences. Fuelled by hardware innovations and software-driven services, HPC data storage systems are allowing enterprises to use new technology to achieve greater levels of productivity and operational efficiency than ever before, and it’s the outstanding performance of parallel file systems that is servicing that demand.


When all the evidence is considered, enterprise CIOs who want to avoid the operational and reputational risk of failure will see that the benefits of choosing a commercial parallel file system strongly outweigh the exposure of financing the in-house resources and building the infrastructure required to implement an open source solution.