Los Alamos National Laboratory stands at the forefront of national security research. Since its inception in 1943, LANL has consistently pushed the boundaries of scientific understanding, playing a pivotal role in shaping the modern world. From its historic contributions to the Manhattan Project to its ongoing leadership in cutting-edge fields like computational science, LANL has established itself as a cornerstone of scientific progress.

Today, LANL’s High-Performance Computing (HPC) division exemplifies this commitment to innovation. The division continuously pushes the boundaries of extreme-scale supercomputing, enabling researchers to tackle the exabytes of data generated by cutting-edge scientific research.

This drive to leverage new technologies led to the adoption of the Versity S3 Gateway. This solution bridges the gap between object protocols and various storage backends, including computational storage. It allows researchers to directly access and query simulation data on NVMe storage devices using S3 commands and workflows. Pushing data reduction functions closer to the storage devices saves power and time, and it allows analytics to run on a much smaller analytics cluster instead of the traditional ‘big iron’ HPC machines.
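One way to picture access "using S3 commands" is a ranged GET, which asks an S3-compatible endpoint for a slice of an object rather than the whole thing. The sketch below is illustrative only: the helper names `range_header` and `fetch_slice` are hypothetical, the bucket, key, and endpoint in the usage comment are placeholders, and nothing here is taken from LANL's actual workflow. It works with any client object that offers a boto3-style `get_object` call.

```python
from typing import Any


def range_header(offset: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


def fetch_slice(client: Any, bucket: str, key: str, offset: int, length: int) -> bytes:
    """Read one slice of an object through an S3-compatible client.

    `client` is assumed to expose get_object(Bucket=..., Key=..., Range=...),
    as a boto3 S3 client does.
    """
    response = client.get_object(
        Bucket=bucket, Key=key, Range=range_header(offset, length)
    )
    return response["Body"].read()


# Usage sketch (placeholder values, not a real deployment):
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="http://gateway.example:7070")
#   first_4k = fetch_slice(s3, "sim-output", "run-001/timestep-0001", 0, 4096)
```

Because the gateway speaks the S3 protocol, the same call shape applies regardless of which storage backend sits behind it.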

Challenges with Scientific Data Analytics

Scientific research routinely generates massive datasets. A single time step of one simulation can reach petabytes in size, and that simulation might capture thousands of time steps. This sheer volume of data presents significant challenges for scientific data analytics.

Moving these datasets to analytics applications is time-consuming and expensive, especially since scientific queries typically focus on small portions of the data. Legacy data analysis workflows make the problem worse. They transfer all of the raw scientific data associated with a query to the application, which must then run its analysis code over the entire dataset. This leads to unnecessary overhead and undue strain on computational resources.

Using the Versity S3 Gateway for OCS

To address these limitations, LANL developed a novel approach. They envisioned a system where, upon query initiation, data processing occurs directly on a dedicated computational storage device. This device would then transmit only the relevant results to the host application, thereby significantly reducing unnecessary data movement.
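The difference between the legacy workflow and query pushdown can be shown with a toy model. The sketch below is not LANL's or SK Hynix's implementation: `ToyStorageDevice`, the 12-byte record layout, and the threshold query are all invented for illustration. It only counts how many bytes cross from the "device" to the "host" in each approach.

```python
import struct
from typing import Callable, Iterable, List, Tuple

# Hypothetical record layout: int32 cell id + float64 value, no padding (12 bytes).
RECORD = struct.Struct("<id")


class ToyStorageDevice:
    """Stand-in for a computational storage device holding packed records."""

    def __init__(self, rows: Iterable[Tuple[int, float]]) -> None:
        self._blob = b"".join(RECORD.pack(i, v) for i, v in rows)

    def read_all(self) -> bytes:
        """Legacy path: ship every raw byte to the host."""
        return self._blob

    def query(self, predicate: Callable[[int, float], bool]) -> bytes:
        """Pushdown path: filter on the device, ship only matching records."""
        hits = [rec for rec in RECORD.iter_unpack(self._blob) if predicate(*rec)]
        return b"".join(RECORD.pack(*rec) for rec in hits)


def hot_cells_legacy(device: ToyStorageDevice, threshold: float) -> Tuple[List[int], int]:
    """Pull everything to the host, then filter there."""
    raw = device.read_all()
    ids = [i for i, v in RECORD.iter_unpack(raw) if v > threshold]
    return ids, len(raw)


def hot_cells_pushdown(device: ToyStorageDevice, threshold: float) -> Tuple[List[int], int]:
    """Ask the device to filter, then decode only the results."""
    result = device.query(lambda i, v: v > threshold)
    ids = [i for i, _ in RECORD.iter_unpack(result)]
    return ids, len(result)


# 10,000 synthetic records; only values above 98.0 are of interest.
device = ToyStorageDevice((i, float(i % 100)) for i in range(10_000))
legacy_ids, legacy_bytes = hot_cells_legacy(device, 98.0)
pushed_ids, pushed_bytes = hot_cells_pushdown(device, 98.0)
print(legacy_ids == pushed_ids, legacy_bytes, pushed_bytes)
```

Both paths return the same answer. The only difference is how much data had to move to produce it, which is the cost the pushdown design targets.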

LANL uses an object-based computational storage (OCS) infrastructure, which allows NVMe devices to directly access and interpret data blocks, a prerequisite for query pushdown. Compared to traditional file systems, this approach simplifies the mapping between data and NVMe blocks. LANL partnered with SK Hynix, drawing on its memory solutions, to develop a computational storage device capable of handling query pushdown and data analytics.

However, users see their data as logical objects, while an NVMe device works in blocks, so pushing analytic functions down to the device requires a translation between the two views. The Versity S3 Gateway facilitates communication between disparate storage systems and enhances query pushdown capabilities. Combined with Apache columnar analytics tools, it bridges the gap between storage technologies, enabling efficient data analysis on massive datasets.
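To make the object-to-block translation concrete, here is a minimal sketch of mapping a byte range within an object to runs of logical block addresses (LBAs). Everything in it is an assumption made for illustration: the 4096-byte block size, the `Extent` layout, and the `blocks_for_range` helper are hypothetical and do not describe how the Gateway or the OCS device actually record object placement.

```python
from dataclasses import dataclass
from typing import List, Tuple

BLOCK_SIZE = 4096  # assumed logical block size in bytes


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks holding part of one object (hypothetical layout)."""
    start_lba: int    # first logical block address of the run
    block_count: int  # number of blocks in the run


def blocks_for_range(extents: List[Extent], offset: int, length: int) -> List[Tuple[int, int]]:
    """Translate an object byte range into (start_lba, block_count) runs."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // BLOCK_SIZE                # first object-relative block touched
    last = (offset + length - 1) // BLOCK_SIZE  # last object-relative block touched
    runs: List[Tuple[int, int]] = []
    base = 0  # object-relative index of the current extent's first block
    for ext in extents:
        lo = max(first, base)
        hi = min(last, base + ext.block_count - 1)
        if lo <= hi:  # the requested range overlaps this extent
            runs.append((ext.start_lba + (lo - base), hi - lo + 1))
        base += ext.block_count
    if last >= base:
        raise ValueError("byte range extends past the end of the object")
    return runs


# An object stored in two 4-block extents; read 12 KiB starting at byte 8192.
layout = [Extent(start_lba=1000, block_count=4), Extent(start_lba=5000, block_count=4)]
print(blocks_for_range(layout, offset=8192, length=12288))
```

With a mapping of this kind, a query phrased against an object can be turned into block-level reads that the device can execute and filter locally.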

The Versity S3 Gateway streamlines scientific workflows by eliminating data transfers between object storage and NVMe. It removes server input/output (I/O) bottlenecks and improves data access times, allowing a single host to manage petabyte-scale data volumes efficiently. This marks a significant advancement in object data processing capabilities, resulting in faster analysis times, improved research productivity, and deeper scientific insights.

“We are thankful that Versity engaged to produce a flexible and performant S3 gateway that enabled our exploration of push-down analytics at scale,” said Dominic Manno, lead of hot storage research at LANL. “Versity’s open community gateway technology has and will play a part in our journey toward providing next-generation at-scale analytics that leverage the Apache ecosystem.”

Conclusion

Scientific research, particularly at institutions like LANL, often grapples with managing and analyzing massive datasets. These exabyte-sized datasets can be prohibitively expensive to move and analyze, hindering the pace of scientific discovery.

The Versity S3 Gateway bridges the traditional gaps between disparate storage technologies, significantly enhancing the efficiency and scalability of LANL’s HPC applications. By streamlining the integration of object storage with NVMe-based computational storage devices, the Gateway accelerates data access, reduces bottlenecks, and empowers researchers to handle large data volumes more effectively.

As LANL continues to lead in computational science and national security research, the Versity S3 Gateway stands out as a critical component in its technological arsenal, driving faster research outcomes and enabling deeper, more insightful scientific discoveries. This advancement underscores LANL’s commitment to maintaining its status as a cornerstone of global scientific progress and innovation.

Read more about the Versity S3 Gateway

Looking Back, Reaching Forward: The Journey Behind the Versity S3 Gateway

Born from the need for seamless integration across diverse storage systems, the Versity S3 Gateway ensures high performance and scalability for large-scale data operations. Dive into the journey behind its development, from overcoming compatibility challenges to leveraging high-performance frameworks like Fiber. Read the full article to explore the Versity S3 Gateway’s innovative features and real-world impact.

Unlocking the Power of Scalability: Analyzing the Versity S3 Gateway’s Scale-Out Performance

The Versity team ran comprehensive tests to evaluate the performance of accessing an object storage system through the Versity S3 Gateway, comparing results as gateway instances were added to the system. Read the full study to see how the Gateway’s zero-communication, stateless design allows nearly perfect scalability!

Evaluating Efficiency: An Analysis of the Versity S3 Gateway’s Performance and Overhead

The Versity team conducted a series of tests to evaluate the efficiency and scalability of the Versity S3 Gateway. These tests included concurrent uploads of 1 MB objects, concurrent multipart uploads of 10 GiB objects, and various request rate loads. The objective was to assess the Gateway’s performance across different workloads and to compare direct S3 service access with access proxied through the Gateway. The team also examined CPU and memory usage, showing that the Gateway can manage substantial workloads with minimal resource overhead.
