Meghan McClelland

Innovative Storage Solutions: Versity’s Flash Cache

April 11, 2024

At Versity, we understand the critical importance of efficient and scalable mass storage solutions. As leaders in large-scale storage, we have pioneered a groundbreaking innovation that significantly enhances the performance of the cache element within our solution. Versity’s Flash Cache enables users to harness the advantages of flash technology without sacrificing capacity or exceeding budgets, by simultaneously leveraging the cost advantages of disk storage.

The Role of Primary Cache in Mass Storage Platforms

In a mass storage system, the primary cache, or data cache, serves four key functions: acting as a temporary data repository, streaming data in parallel to mass storage, providing online storage redundancy, and fulfilling read requests.

A data cache functions as a staging area for incoming data, facilitating the ingestion of large volumes within short timeframes. This is especially beneficial because data workloads are unpredictable: systems might experience surges in data processing followed by stretches of lower activity. Given this bursty nature, the performance of the data cache is key to meeting peak demand during short-term data ingestion.

The data cache facilitates parallel streaming of data into cloud resources and tape libraries. With the advent of high-speed drives like LTO-10, the ability to support multiple streams at rates exceeding 1 GB/s per drive necessitates a high-performing primary cache. Traditional SANs built on large NL-SAS drives struggle to match this level of streaming performance, particularly when real-world data access patterns deviate from the idealized conditions on which published streaming specifications are based.
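To make the streaming requirement concrete, the short sketch below estimates the aggregate read bandwidth a cache would need to keep a set of drives streaming at once. The function name, the 16-drive count, and the headroom factor are illustrative assumptions, not Versity sizing guidance; only the 1 GB/s per-drive figure comes from the text above.

```python
def required_cache_read_gbps(num_drives: int,
                             per_drive_gbps: float,
                             headroom: float = 1.0) -> float:
    """Aggregate read bandwidth (GB/s) the cache must sustain.

    headroom is a hypothetical safety multiplier (>= 1.0) to cover
    non-ideal access patterns; it is an assumption of this sketch.
    """
    if num_drives < 0 or per_drive_gbps < 0 or headroom < 1.0:
        raise ValueError("invalid sizing input")
    return num_drives * per_drive_gbps * headroom


# Hypothetical library with 16 drives, each streaming at 1 GB/s.
baseline = required_cache_read_gbps(16, 1.0)
with_margin = required_cache_read_gbps(16, 1.0, headroom=1.25)
print(f"baseline: {baseline} GB/s, with 25% headroom: {with_margin} GB/s")
```

Note that this covers only the read side. A cache that is ingesting new data at the same time must sustain that write load on top of the figure computed here.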

Another crucial service of the data cache is its ability to provide online storage redundancy. Imagine a scenario where long-term storage devices or cloud services are temporarily offline due to an outage or network failure. Here, the data cache acts as a safety net. Systems are often configured to retain a week or more of incoming data within the cache. This ensures uninterrupted operation even in the absence of long-term storage resources, safeguarding the continuity of the system.

Furthermore, the data cache acts as a performance booster. The data cache can quickly fulfill read requests, serving data to users without the need to retrieve it from slower or more distant resources such as tape drives or cloud services. By keeping the most recently ingested data online, the primary cache enhances system responsiveness and reduces reliance on long-term storage resources.

The Push for Larger Cache Capacities

As the volume of data collections continues to grow, there is an increasing demand from customers for larger primary data cache capacities. Today, it’s not uncommon to see caches reaching five petabytes in size.

Cached reads offer significant performance benefits, which drives the need for even larger caches as datasets expand. Because data served from the cache is retrieved notably faster than data recalled from tape or cloud resources, users see quicker retrieval times, and this provides a strong rationale for building larger primary data caches. The trend toward larger caches is expected to persist as adoption of AI and ML workloads increases.

However, one significant drawback of large primary caches is their impact on streaming performance. Because of how incoming data is laid out on the underlying disks, large SAN devices often struggle to sustain numerous concurrent high-bandwidth data streams.

The New Flash Cache Paradigm: Flash and Object Storage Integration

Versity’s Flash Cache architecture maintains the core functions of the primary cache while leveraging flash technology to boost performance. Our solution employs a tiered architecture in the backend system to strike a balance between performance and capacity. The system partitions the primary cache into two distinct components: a portion utilizes flash storage while the remainder utilizes object or file storage. For instance, a 2 PB primary cache may be configured with 200 TB of flash and 1.8 PB of object/file storage.
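The 2 PB example above works out to a 10% flash share. The sketch below shows that arithmetic for an arbitrary cache size. The function name, the use of decimal terabytes, and the idea of expressing the split as a percentage are assumptions made for illustration; the appropriate ratio for a real deployment depends on the workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheLayout:
    flash_tb: int   # flash landing zone
    object_tb: int  # object/file storage portion


def split_cache(total_tb: int, flash_percent: int) -> CacheLayout:
    """Split a primary cache into flash and object/file portions.

    Sizes are whole decimal terabytes (1 PB = 1000 TB), an assumption
    of this sketch, so the arithmetic stays exact.
    """
    if total_tb <= 0 or not 0 < flash_percent < 100:
        raise ValueError("invalid sizing input")
    flash_tb = total_tb * flash_percent // 100
    return CacheLayout(flash_tb=flash_tb, object_tb=total_tb - flash_tb)


# The example from the text: a 2 PB cache with a 10% flash share.
layout = split_cache(2000, 10)
print(layout)
```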

Flash storage serves as the initial landing zone for incoming data, facilitating rapid ingestion, while object storage acts as a reservoir for data retention and cached reads. Flash drives excel in streaming performance despite the randomness of data patterns, enabling systems to keep pace with modern tape infrastructure.

Capacity management in the Flash Cache system mirrors that of traditional disk-based caches. When object storage reaches capacity, older, less frequently accessed files are automatically evicted to create space for newer data. To safeguard against data loss, archival copies of these files are created before they are removed. These copies are typically stored in a separate, more cost-effective storage tier.

Addressing Capacity Concerns in Cache Size

One potential concern with the Flash Cache architecture is the disparity between incoming data size and flash storage capacity. For example, a customer that frequently ingests 1 PB data sets and needs that data to be readily available for an extended period of time might feel that this architecture will not be viable. However, it is important to remember that the flash storage is always active in this architecture. Unlike a traditional monolithic cache, the data does not arrive and then remain on the flash disk.

As data flows into the system, it is quickly and automatically moved over to the object storage component so that flash capacity is available to accept more data. This is a fluid process in which space is constantly freed to accept incoming data. We refer to the object storage component in this architecture as the Extended Cache. Data will reside in the Extended Cache as long as possible and will only be evicted when all archival copies have been made and a high-water capacity level is reached. Unless the data is permanently archived in the object store, the oldest data will be released to make room for newer data. As long as the Extended Cache is sized to accommodate the quantity of data that needs to be cached (1 PB in our example), the system will provide the same functionality as a traditional disk cache by dynamically moving data between the flash disks and the Extended Cache.
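The eviction rules described above can be sketched in a few lines. This is a minimal illustration and not ScoutAM's implementation: the class and function names, the `pinned` flag standing in for data permanently archived in the object store, and the low-water mark at which eviction stops are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size_tb: int
    ingest_time: int       # larger value = newer file
    archive_copies: int    # archival copies already written
    pinned: bool = False   # hypothetical: permanently kept in the object store


def select_evictions(files, capacity_tb, high_water=0.90, low_water=0.80,
                     required_copies=1):
    """Return names of files to release from the Extended Cache.

    Nothing is evicted until usage reaches the high-water mark. Then the
    oldest files go first, skipping any file that is pinned or still lacks
    its archival copies, until usage drops to the assumed low-water mark.
    """
    used = sum(f.size_tb for f in files)
    if used < capacity_tb * high_water:
        return []
    target = capacity_tb * low_water
    evicted = []
    for f in sorted(files, key=lambda f: f.ingest_time):
        if used <= target:
            break
        if f.pinned or f.archive_copies < required_copies:
            continue  # not safe or not allowed to release yet
        evicted.append(f.name)
        used -= f.size_tb
    return evicted


files = [
    CachedFile("a", 200, ingest_time=1, archive_copies=0),               # not archived yet
    CachedFile("b", 100, ingest_time=2, archive_copies=2, pinned=True),  # kept permanently
    CachedFile("c", 300, ingest_time=3, archive_copies=2),
    CachedFile("d", 350, ingest_time=4, archive_copies=2),
]
print(select_evictions(files, capacity_tb=1000, required_copies=2))
```

In this example the cache holds 950 TB against a 1000 TB capacity, so the high-water mark is crossed. File "a" is older but is skipped because it has no archival copies, and "b" is skipped because it is pinned, so "c" is the first file eligible for release.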

Rapid Recovery & Isolated Workflows

An additional, noteworthy benefit of the Extended Cache system is the expedited rebuild process in the event of a device issue within the primary cache. Since the primary flash-based cache is typically smaller and focused mainly on high-speed access and frequently accessed data, any necessary rebuilds or maintenance tasks can be completed more rapidly than on larger, more cumbersome storage arrays. This not only minimizes downtime but also ensures that critical operations remain largely unaffected.

Furthermore, separating the primary cache from the Extended Cache means that any work or rebuilds taking place in the Extended Cache have no adverse impact on operations within the primary cache. This separation lets the primary cache keep providing fast, uninterrupted access to key data even while maintenance or expansion activities are underway in the Extended Cache. The dual-structure approach enhances overall system reliability and efficiency, allowing the storage infrastructure to adapt quickly to hardware issues without compromising performance or accessibility.

Adoption of Flash Cache for ScoutAM

While adoption of the Flash Cache is optional, it is highly recommended for environments with numerous tape drives requiring concurrent data streams. By caching frequently accessed data, the Flash Cache reduces the reliance on slower tape drives, resulting in a noticeable improvement in data throughput. This translates to faster completion times for data processing tasks. Overall, the Flash Cache significantly increases performance and delivers higher throughput for heavier workloads, optimizing your ScoutAM solution.
