Unlocking the Power of Scalability: Analyzing the Versity S3 Gateway’s Scale-Out Performance
In a digital landscape defined by exponential data growth, scalable and efficient solutions have never been more critical. Part of that is ensuring reliable communication between diverse applications and mass storage systems. The Versity S3 Gateway is an S3-to-file translation tool that makes S3-based applications compatible with file-based storage while scaling to handle high-performance workloads.
A previous analysis delved deeply into the performance of a single Gateway instance, offering a look at its foundational capabilities.
This article builds on that analysis by exploring how the Versity S3 Gateway performs when scaled out across multiple instances. Multiple Gateway instances can be deployed within a cluster to boost overall throughput. Because the design is stateless, any gateway can process any request independently, so the workload can be distributed efficiently across users and applications, and adding Gateway instances increases overall performance. This article examines the performance of multiple gateways working together and the aggregate performance achieved at various thread counts.
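Because any gateway can serve any request, a client or load balancer is free to spread requests across instances however it likes. The article does not describe how requests were distributed in the lab, so the following is only a minimal sketch of one common approach, round-robin rotation. The endpoint hostnames, the port, and the `assign_round_robin` helper are hypothetical placeholders, not part of the Gateway.

```python
from collections import Counter
from itertools import cycle

# Hypothetical gateway endpoints; hostnames and port are placeholders.
GATEWAYS = [
    "http://gw1.example.internal:7070",
    "http://gw2.example.internal:7070",
    "http://gw3.example.internal:7070",
    "http://gw4.example.internal:7070",
    "http://gw5.example.internal:7070",
]


def assign_round_robin(requests, gateways):
    """Pair each request with the next gateway in a fixed rotation.

    No request depends on which gateway handled an earlier one, which is
    only safe because the gateways are assumed to hold no per-client state.
    """
    rotation = cycle(gateways)
    return [(request, next(rotation)) for request in requests]


# Ten illustrative object keys spread over five gateways.
keys = [f"object-{i}" for i in range(10)]
assignments = assign_round_robin(keys, GATEWAYS)
per_gateway = Counter(gateway for _, gateway in assignments)

for gateway, count in sorted(per_gateway.items()):
    print(gateway, count)  # each gateway receives 2 of the 10 requests
```

With stateful servers this kind of blind rotation would need session affinity; the stateless design is what makes it unnecessary here.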
Test Setup
In the Versity lab, the team executed comprehensive tests to evaluate the performance of accessing an object storage system through the Versity S3 Gateway, comparing the performance as gateway instances were added to the system.
Test node specs
Dell R750
Dual socket Xeon(R) Silver 4314 16C/32T CPU @ 2.40GHz
256GB memory
CentOS 7.9, kernel 3.10.0-1160.105.1.el7.x86_64
ScoutFS v1.19
Test environment
The testing environment consisted of a five-system cluster running a Versity S3 Gateway instance on each system. The setup was designed to test the Gateway’s performance against the ScoutFS backend filesystem, which runs on shared block storage.
For these tests, the raw performance numbers depend heavily on the ScoutFS backend filesystem, and primarily on the storage hardware connected to the test cluster. The key characteristic the team was looking for is a linear increase in aggregate performance as the total number of gateways increases. In the best case, each added gateway contributes 100% of the performance of a single gateway.
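The linear-scaling criterion can be written as a ratio of measured aggregate performance to the ideal of N times the single-gateway result. The sketch below is one way to express it; the function name `scaling_efficiency` and the sample values are illustrative assumptions, not taken from the team's tooling or measurements.

```python
def scaling_efficiency(single_gateway, aggregate, gateways):
    """Return aggregate performance as a fraction of ideal linear scaling.

    1.0 means every gateway contributes 100% of the single-gateway result;
    values below 1.0 mean something (for example the backend storage)
    is limiting the aggregate.
    """
    if gateways < 1:
        raise ValueError("gateways must be at least 1")
    if single_gateway <= 0:
        raise ValueError("single_gateway must be positive")
    return aggregate / (gateways * single_gateway)


# Illustrative values only (not measurements): one gateway delivers 100
# units, five gateways together deliver 450 units.
example = scaling_efficiency(single_gateway=100.0, aggregate=450.0, gateways=5)
print(f"{example:.0%}")  # 90%
```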
10G Large Object PUT
First, the Gateway’s large object upload performance was tested using a multipart upload of a 10-gigabyte object, segmented into 64MB chunks, with varying thread counts. Tests were conducted using 2, 4, 8, 16, and 32 threads to measure how these configurations affected upload speeds. Scaling was measured with 1, 2, 3, 4, and 5 gateways, each deployed on a separate node.
At smaller thread counts, performance increases consistently as gateways are added. The larger thread counts scale linearly up to 3 gateways, then begin to saturate the backend storage system. So while gateway performance would be expected to keep scaling linearly, the backend storage could not keep up.
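To make the multipart layout concrete, the sketch below splits an object into fixed-size parts, which is the unit of work that upload threads can send independently. It assumes binary units (10 GiB and 64 MiB), which the article does not state explicitly, and the `plan_parts` helper is hypothetical rather than part of any benchmark tool.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def plan_parts(object_size, part_size):
    """Return (part_number, offset, length) tuples covering the object."""
    parts = []
    offset = 0
    number = 1  # S3 multipart part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


# Assumption: "10-gigabyte" and "64MB" are read as 10 GiB and 64 MiB.
parts = plan_parts(10 * GIB, 64 * MIB)
print(len(parts))  # 160
print(parts[0])    # (1, 0, 67108864)
```

Under these assumptions the object divides evenly into 160 parts, so even the 32-thread runs have several parts per thread to work through.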
10G Large Object GET
After uploading, the next test measured how well the Gateway could retrieve a large 10-gigabyte object, downloaded in 64MB chunks, across varying thread counts and gateway counts. The team used 1, 2, 4, 8, and 16 threads and, as in the previous test, 1, 2, 3, 4, and 5 gateways to understand how download speeds scale.
The results showed consistent scaling as more gateway systems were added. The larger thread counts achieved higher overall performance, and both the smaller and larger thread counts achieved close to linear scaling with the addition of more gateway systems. Scaling from 4 to 5 gateways was slightly less than perfect, once again primarily due to backend storage saturation.
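One plausible way to download an object in fixed-size chunks is to issue a ranged GET per chunk using the standard HTTP `Range` header. The article does not say how the test client requested its chunks, so treat the following as an assumption-laden sketch; the `range_headers` helper is hypothetical and the sizes again assume binary units.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def range_headers(object_size, chunk_size):
    """Build one HTTP Range header value per download chunk.

    HTTP byte ranges are inclusive on both ends, hence the "- 1".
    """
    headers = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1
        headers.append(f"bytes={start}-{end}")
    return headers


# Assumption: 10 GiB object fetched in 64 MiB ranges.
headers = range_headers(10 * GIB, 64 * MIB)
print(headers[0])   # bytes=0-67108863
print(headers[-1])  # bytes=10670309376-10737418239
```

Each range is independent of the others, so download threads can fetch them in parallel and through different gateways.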
1 MB Small Object PUTs
Next, the team examined the upload performance of the Versity Gateway for 1MB small objects, with each client thread continuously uploading a single 1MB object. The tests evaluated how the performance varied with the addition of gateways, ranging from one to five.
Results showed that upload speeds increased significantly with each additional gateway. Starting at 1238 MB/s with a single gateway, throughput scaled nearly linearly with each additional gateway to reach 6152 MB/s with five gateways, against an ideal of 5 × 1238 = 6190 MB/s. This demonstrates the system’s strong scalability: adding more gateways substantially increases throughput, and the Gateway handles many concurrent uploads efficiently.
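As a quick check of the linearity claim, the two reported figures can be compared against ideal five-way scaling. Only the 1238 MB/s and 6152 MB/s values come from the results above; the variable names are illustrative.

```python
# Reported small-object PUT throughput, in MB/s.
SINGLE_GATEWAY_MBPS = 1238
FIVE_GATEWAY_MBPS = 6152
GATEWAYS = 5

# Ideal linear scaling: five times the single-gateway throughput.
ideal_mbps = GATEWAYS * SINGLE_GATEWAY_MBPS
put_efficiency = FIVE_GATEWAY_MBPS / ideal_mbps

print(ideal_mbps)               # 6190
print(f"{put_efficiency:.1%}")  # 99.4%
```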
HEAD Object Requests
Finally, the team assessed the request rate the Versity Gateway can sustain, in requests per second. The request was a simple HEAD object call, which returns a given object’s metadata. The team measured requests per second across gateway configurations ranging from one to five gateways.
The results showed a substantial increase in request handling capacity as more gateways were added. With a single gateway, the system managed 9704 requests per second. This capacity scaled nearly linearly with each additional gateway, reaching 48048 requests per second with five gateways, against an ideal of 5 × 9704 = 48520. The test demonstrates the Gateway’s ability to scale out: it handled a growing number of requests without a meaningful loss in per-gateway performance, which makes the overall system better able to adapt to increasing demand.
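The gap between the measured HEAD rate and perfect scaling can be quantified from the two reported figures. The sketch below uses only the 9704 and 48048 requests-per-second values from the results; the variable names are illustrative.

```python
# Reported HEAD object rates, in requests per second.
SINGLE_GATEWAY_RPS = 9704
FIVE_GATEWAY_RPS = 48048
GATEWAYS = 5

ideal_rps = GATEWAYS * SINGLE_GATEWAY_RPS          # perfect linear scaling
shortfall_rps = ideal_rps - FIVE_GATEWAY_RPS       # requests/s below ideal
shortfall_fraction = shortfall_rps / ideal_rps
average_per_gateway = FIVE_GATEWAY_RPS / GATEWAYS  # mean rate per gateway

print(ideal_rps)                    # 48520
print(shortfall_rps)                # 472
print(f"{shortfall_fraction:.2%}")  # 0.97%
print(average_per_gateway)          # 9609.6
```

In other words, each of the five gateways averaged about 9610 requests per second, within 1% of the single-gateway rate.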
Summary
In conclusion, the zero-communication, stateless design of the Gateway allows nearly perfect scalability up to the limits of the backend storage system. The series of tests conducted on the Versity S3 Gateway highlights its scalability and efficiency in processing large-scale data transfers. These evaluations demonstrated that performance improves significantly as additional gateways are deployed and as the number of concurrent threads increases. The ability to scale out through multiple gateway instances and to boost throughput by increasing thread counts makes the design well suited to fast, large data operations.
This dual scalability is crucial for environments that require high throughput and quick data management, where adding gateways and adjusting thread counts can lead to significant increases in performance. The test results confirm that the Versity S3 Gateway’s architecture scales well, making it a strong fit for organizations anticipating growth and demanding data transfer needs. This study showcases not only the Gateway’s core strengths but also its adaptability to complex and expanding operational requirements.