A look at how flash storage and AI have impacted each other, and how flash prices are set to change

By Federica Monsone, founder and CEO, A3 Communications, the data storage industry PR agency.

Tuesday, 7th November 2023. Posted by Phil Alsop.

Flash memory was, without doubt, a groundbreaking technology when it first entered enterprise datacenters around two decades ago and immediately began transforming the performance of a wide range of applications. We wanted to understand the relationship between flash and the revolutionary developments that are now happening in artificial intelligence (AI), so we assembled a group of experts and asked them how much impact flash has had on AI and on the related fields of analytics, IoT and edge computing. Conversely, we asked how much those technologies will change the adoption rate of flash. Because cost is a driving factor in the implementation of any technology, and flash prices have tumbled over the last twenty years, we also asked our experts whether they expect flash prices to continue falling over the next five years.

The relationship between AI and flash is hard to define

The more data that an application needs to access, the greater the performance boost delivered by storing data in flash rather than on spinning disk. Because AI is a highly data-intensive application, it might seem reasonable to presume that without flash, there would be no modern AI. However, more than one member of our panel questioned that notion.

“Some in this industry argue that we would not have today’s AI without the past decade’s shift to solid state storage. While that may be true, it’s enormously difficult to prove. AI training consumes enormous resources, and SSDs [solid-state drives] have accelerated the advancement of computing performance across the board, so AI will have benefited from this,” said Jim Handy, general director at analyst firm Objective Analysis. He added: “The same holds true of any discipline based on advanced computing technology, whether it’s analytics, nuclear physics, or meteorology.”

David Norfolk, practice leader for development and government at analyst firm Bloor Research, said: “Insofar as flash makes storage faster, cheaper, and more reliable, it enables data-intensive innovations such as AI/ML, analytics, IoT, and edge processing. Conversely, these innovations need more fast, cheap, reliable storage and I'd expect flash take-up to track the take-up of these innovations.”

Leander Yu, president and CEO of Graid Technology, said: “Flash memory and all-flash array storage solutions are all about performance. The killer apps of AI/ML and analytics are where customers are investing in their IT infrastructure, and these workloads demand the performance delivered by all-flash storage.”

A wider view is taken by Peter Donnelly, director of products at storage networking vendor ATTO, who said: “I believe that we’re in the middle of a dramatic change in how and where data is collected and consumed. This is driving the need for the disaggregation of the data center. It’s not to say that data centers will cease to exist, but they are becoming less structured and more flexible. This is an important dynamic that is driving the need for flash memory and flash storage. How do we access and use data that is across the country, or even around the world, in a way that makes it seem like it’s located locally? Flash helps answer that challenge, and it enables emerging technologies like AI and data analytics at a scale that was impossible until now.”

AI and analytics are changing the architecture of flash-powered storage systems

But even if the impact of flash on AI, analytics, IoT, and edge processing is difficult to quantify, flash is certainly a key element in the IT infrastructure built to handle those workloads. When it comes to implementing AI, that infrastructure is about to receive more attention than it has to date, according to Randy Kerns, senior strategist at analyst firm the Futurum Group.

“I think we are just beginning to see the importance of the underlying device technology used for AI/ML. Currently the focus has been on the algorithms and data conditioning from multiple sources to operate on and build the training and test data. Rightfully so, getting the functional aspects working has been where the attention has been placed. Now, as this is maturing, the importance of improving the technology and getting results faster will bring the technologies for storage into greater consideration. Some implementations may be further along than others, but we will see more importance in AI/ML and use of flash storage as a given,” said Kerns.

The ability of flash to handle small, random data accesses or IO operations fits the needs of AI/ML and analytics. “Hard disk drives are steampunk devices. SSDs have, as a result of their enormous IOPs [IO operations per second] advantages, taken over all workloads that involve small and/or random transfers. AI/ML training and analytics involve randomness in their I/O workloads, and IoT is dominated by extremely small transfers, making both early success stories for all-flash storage systems,” said Curtis Anderson, software architect at Panasas, a supplier of storage software for workloads needing high performance.

As well as contributing to the take-up of all-flash storage systems, the performance needs of AI/ML are also driving architectural changes within those systems. “Architectural considerations around how data enters and leaves the storage are also important. This is why traditional HPC storage is well suited to AI workloads, and there are many new storage companies entering the marketplace who are leveraging flash and NVMe [the storage protocol used to access flash] to deliver low latency across the board and eradicate any potential bottlenecks at the storage layer,” said Amos Ankrah, solutions specialist at Boston, a provider of high-performance servers and storage systems.

From TB to PB - the scale of flash usage varies hugely

AI applications such as autonomous driving and large language models (LLMs) are, in Anderson’s words, “poster children” for the use of huge datasets to train AI models. As an example, he cites Tesla’s use of a staggering 200PB or more of what the car maker calls “hot-tier cache capacity.” However, Anderson says most organizations are using far smaller datasets for AI development. “The vast majority (by count) of AI/ML projects have (significantly) less than 100TB of capacity needs,” he said. That is less than one two-thousandth of the capacity of Tesla’s hot tier.

Anderson and his colleagues at Panasas expect that these more typical AI datasets will grow, but only slowly. That is just as well, because flash is significantly more expensive than disk, but its usage is often essential for AI training. The gap between disk and flash performance is even wider for AI than for other applications, because of the generally random nature of AI data access. For decades, storage vendors have compensated for the relatively low speed at which disk drives handle semi-random requests to access data by identifying hot or frequently-accessed data and storing it in very fast DRAM-memory read caches. “Read caching helps a lot when a small percentage of the data is being accessed multiple times. AI/ML doesn't fit often with those traditional I/O access patterns which forces organizations to take a largely flash-based approach for many AI/ML workloads,” said Steven Umbehocker, founder and CEO at OSNexus, a vendor of scale-out, software-defined storage systems.
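Umbehocker's point about read caching can be illustrated with a toy simulation. The sketch below is not drawn from any panelist's system: the dataset size, the cache size (5% of the dataset), the 90/5 access skew, and the choice of a least-recently-used (LRU) eviction policy are all assumptions picked for illustration. It compares a skewed workload, where a small hot set is read repeatedly, with a training-style workload that reads every block once per epoch in shuffled order.

```python
import random
from collections import OrderedDict


def hit_rate(accesses, cache_size):
    """Fraction of reads served from an LRU cache holding cache_size blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used block
    return hits / len(accesses)


rng = random.Random(0)      # fixed seed so runs are repeatable
NUM_BLOCKS = 10_000         # assumed dataset size, in blocks
CACHE_BLOCKS = 500          # assumed cache size: 5% of the dataset
HOT_BLOCKS = 500            # assumed hot set for the skewed workload
NUM_READS = 100_000

# Skewed workload: 90% of reads go to the hot 5% of blocks.
skewed = [
    rng.randrange(HOT_BLOCKS) if rng.random() < 0.9
    else rng.randrange(HOT_BLOCKS, NUM_BLOCKS)
    for _ in range(NUM_READS)
]

# Training-style workload: every epoch reads each block once, shuffled.
epochs = []
for _ in range(NUM_READS // NUM_BLOCKS):
    order = list(range(NUM_BLOCKS))
    rng.shuffle(order)
    epochs.extend(order)

skewed_rate = hit_rate(skewed, CACHE_BLOCKS)
epoch_rate = hit_rate(epochs, CACHE_BLOCKS)
print(f"skewed workload hit rate:   {skewed_rate:.1%}")
print(f"training-style hit rate:    {epoch_rate:.1%}")
```

Under these assumptions the skewed workload is served mostly from cache, while the training-style workload almost never finds a block in cache, so nearly every read falls through to the underlying media. That is the situation in which the raw random-read speed of the media, rather than the cache, determines performance.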

Performance is not the only flash virtue, especially for IoT and the edge

Performance is not the only advantage that flash offers compared to disk, as SSDs consume less power while also potentially being more reliable and able to withstand challenging environments. “In applications like IoT, edge processing, and TinyML (machine learning at the edge) one of the top design priorities is the ever-increasing drive to decrease power consumption – both dynamic and standby power – while ensuring the highest possible performance. On top of this, for any IoT design, keeping costs down is another huge priority,” said Coby Hanoch, CEO and founder of Weebit Nano, a developer of next-generation solid-state memories.

The ability of flash to survive harsh environments is another advantage. “If we mean at the edge, infrastructure at cell towers and other local infrastructure, then solid state storage, particularly SSDs, are a definite enabling technology since they perform better at extremes of temperature than other storage technology, such as hard disk drives, which would find these difficult,” said Tom Coughlin, president of analyst firm Coughlin Associates, and member of the Compute, Memory and Storage Initiative at industry body Storage Networking Industry Association (SNIA). Roy Illsley, chief analyst at research firm Omdia, highlighted another physical characteristic of flash when he said: “A second aspect worthy of note is that for edge use-cases the ability to operate from a small footprint so the AI inferencing workloads can be deployed in remote locations means flash is the storage of choice when physical space is a restraining factor.”

Dennis Hahn, principal analyst at Omdia, said that flash storage at the edge is often within hyperconverged infrastructure (HCI). “In use-cases like edge, processing real-time results is often the case, so fast flash storage local to the processing servers is necessary. In its research, Omdia has found that these edge systems are frequently HCI systems using SSD devices.” But this does not mean that IoT data is always stored in flash. “Data collection like that of IoT often focuses more on cost, and the data frequently travels over the relatively slow internet. [As a result] bulk storage solutions like HDD are more frequently used. But, ultimately, flash comes into play for its speed in enabling IoT data processing.”

Referring to the NOR variant of flash that is embedded in system-on-a-chip processors, Weebit Nano’s Hanoch said: “In devices performing AI or ML at the edge, flash is used not only for code / firmware storage and boot, but importantly flash, and even more so newer types of NVM like ReRAM, is also used to store the neural network weights needed for AI calculations. To support this functionality while keeping cost and power to a minimum, we’re seeing designs pushing to more advanced nodes such as 28nm and 22nm, currently the sweet spot for IoT and edge devices. This requires NVM that is embedded in an SoC monolithically, but embedded flash can’t scale to 28nm and below, so designers can’t integrate it with other functionality on a single die. This is a huge challenge in designing these small, inexpensive, often battery powered devices.”

The gap between disk and flash prices will not change

The variant of flash memory that hugely dominates flash usage is NAND flash. Until the late 90s, NAND flash was a very expensive and rarely used technology. This situation changed in the late 90s when makers of battery-powered devices such as MP3 players and mobile phones were searching for a data storage medium that consumed less power than miniature disk drives. NAND flash fit the bill, production soared, and prices plummeted. Surprisingly, however, it was not until around 2004 that NAND flash became cheaper than DRAM memory.

However, the important price comparison has always been between flash and disk. Although the price of flash has been falling for the last twenty years, so has the price of disk drives, when both are measured in terms of dollars per unit of storage capacity. For the last decade the gap between the two has been relatively steady. “SSD $/TB have maintained roughly the same 5x-7x multiplier over HDD $/TB over the last ten years,” said Anderson. That estimate of the price difference was echoed by Umbehocker and by Giorgio Regni, CTO at Scality, who both put the per-TB price difference at five-fold.

“We don’t think the market is pushing flash vendors very hard to change that in the future,” said Anderson. Referring to so-called fabs - the fabrication plants that make flash and other semiconductor chips - Anderson added: “There are only a handful of flash fabs around the world and new ones aren’t being built at a rate that will outstrip the growth in the demand for flash.” Again, this view was shared by other experts, who pointed to the need to build new fabs to increase global output, and the enormous expense of doing so, which ranges from hundreds of millions to billions of dollars per fabrication plant, and the years of planning and construction required.

Experts expect flash prices to continue falling

On a short-term basis, flash prices have a history of dramatic variations. For Objective Analysis, Handy said: “During shortages prices typically flatten, but sometimes they increase a little. In very rare cases they increase substantially, like they did in 2018. When the shortage flips to an oversupply there’s always an alarmingly rapid price collapse. We had that collapse in the second half of 2022, when prices fell by up to 70%.”

Boyan Krosnov, CTO and co-founder at StorPool, a vendor of software-defined, distributed storage systems, outlined the factors that influence long term price trends for flash. “Future price of flash would depend on cost of capital, energy costs, supply and demand, which is heavily dependent on overall growth of IT infrastructure. So, if you believe that the world economy is going to grow and IT infrastructure will grow even faster, then in the next 1-2 years the price of flash will be increasing. Then fab capacity will catch up and in a few years price will go back to the slow downward trend.”

Shawn Meyers, field CTO at Tintri, agrees: “The worldwide economy will be the largest driving factor, outside of new revolutionary breakthroughs in flash manufacturing. Supply chain ripples will follow the bullwhip effect for the foreseeable future.” However, between price collapse and price surges, per-TB prices slowly fall, according to Objective Analysis’ Handy, who said the price trends are surprisingly predictable and that his company produces the industry’s most consistently accurate price forecasts. So how fast does Objective Analysis believe flash prices will fall over the next five years? “From now until mid-2028, the average price decline will be about 15% per annum,” Handy said, adding that a possible shortage in mid-to-late 2024 would be followed by oversupply and price collapse in 2026.

However, Regni at Scality predicted an even faster decline in price for the lowest-cost QLC variant of flash. “Based on roadmaps from hardware and disk manufacturers, we see a decline in the cost (measured as $ per Terabyte) of high-density (QLC) flash SSDs to decrease dramatically. Data shared with us shows a 60%+ decline between 2022 and 2025,” said Regni.

That 60% decline in price cited by Regni for QLC flash equates to a compound annual reduction of about 26% from 2022 to 2025, which would be significantly faster than Handy’s 15% prediction for overall flash prices over the longer time period of 2023 to 2028. Regni added: “While this is a faster decrease than equivalent high-density HDDs, we still see HDDs maintaining a 5x cost advantage over SSDs in the same time frame.”
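For readers who want to check the comparison between the two forecasts, the arithmetic can be sketched in a few lines of Python. The function names are ours, and the calculation assumes a constant rate of decline compounding once per year, which is a simplification of the uneven price movements Handy describes.

```python
def annual_rate(total_decline: float, years: float) -> float:
    """Constant yearly decline that compounds to total_decline over years."""
    return 1.0 - (1.0 - total_decline) ** (1.0 / years)


def cumulative_decline(rate_per_year: float, years: float) -> float:
    """Total decline after compounding rate_per_year for the given years."""
    return 1.0 - (1.0 - rate_per_year) ** years


# Regni's figure: a 60% decline in QLC $/TB over the three years 2022 to 2025.
qlc_annual = annual_rate(0.60, 3)

# Handy's figure: about 15% per annum over the five years to mid-2028.
overall_total = cumulative_decline(0.15, 5)

print(f"QLC implied annual decline:       {qlc_annual:.1%}")    # 26.3%
print(f"Overall five-year total decline:  {overall_total:.1%}")  # 55.6%
```

On the same simplifying assumption, Handy's 15% per annum compounds to a total decline of roughly 56% over five years, compared with the 60% that Regni's data shows over three.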