Resilient architectures part 1: Optimizing AWS test workloads for performance and cost
This blog is the first in a series of cloud testing use cases focused on helping organizations that deploy workloads in the public cloud to optimize architectures for resiliency, cost, and performance. We'll leverage Ixia application performance test tools to first establish a baseline for test workloads and then move through several use cases focused on performance, high availability, and security validation.
As with any test and measurement tool, before diving into the depths of testing we need to take the proper first steps to “tune” and “calibrate” our tool against the variables and constraints of the environment. Performance and cost do not necessarily scale linearly, and this is compounded in a public cloud setting, where performance varies and users pay more as traffic crosses usage thresholds.
Although the following strategy is applicable to the broader topic of baselining and optimizing application performance test tools in public clouds, let’s walk through a specific use case for IxLoad Virtual Edition (VE) on Amazon Web Services (AWS).
In AWS, selecting a right-sized instance type can have a significant impact not only on network-level performance but also on the cost of running traffic scenarios for long durations. There is a cost to launching our virtual test appliance instances in the cloud and a cost to generating the traffic. For each new version of IxLoad on AWS, we recommend baselining performance using two Ixia virtual test appliances with one test port each, generating plain HTTP traffic between them within the same virtual private cloud (VPC). Network traffic tests are highly compute-intensive, so we recommend using an instance from a high-compute family like C4. Amazon rates the network performance of C4 instance types as “Moderate”, “High”, or “10 Gbps”.
Table 1. IxLoad 8.50 HTTP performance baseline using Amazon AWS C4 family instances
The instance type you have selected may sit on physical infrastructure, such as Single Root I/O Virtualization (SR-IOV) network interface cards (NICs), that increases the performance of instances deployed on that cloud. However, this does not guarantee that the Ixia virtual test ports will be capable of generating full line-rate traffic, as the infrastructure may rate-limit the bandwidth or pass traffic through additional internetworking or security controls that reduce performance along the path.
Additionally, the public cloud is first and foremost a shared usage model and can be subject to transient environmental issues that affect a particular test run at a particular time. Therefore, it is key to understand the plain HTTP throughput achievable between two VMs, where one virtual test appliance acts as client(s) and the other as server(s).
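Because a single run can be skewed by transient conditions, one way to make a baseline more trustworthy is to repeat the same test several times and summarize the results. The following Python sketch is only an illustration: the function name `summarize_runs`, the 10% tolerance, and the sample values are hypothetical, and none of them are IxLoad output or measured data.

```python
import statistics


def summarize_runs(samples_mbps, tolerance=0.10):
    """Summarize repeated throughput runs and flag runs far from the median.

    samples_mbps: throughput readings from repeated runs of the same test.
    tolerance:    fraction of the median beyond which a run is flagged
                  (0.10 is an arbitrary, assumed threshold).
    """
    median = statistics.median(samples_mbps)
    outliers = [s for s in samples_mbps if abs(s - median) > tolerance * median]
    return {
        "median": median,
        "min": min(samples_mbps),
        "max": max(samples_mbps),
        "outliers": outliers,
    }


# Hypothetical readings in Mbps, for illustration only.
runs = [850, 1000, 1010, 990, 1005]
summary = summarize_runs(runs)
print(summary)
```

Using the median rather than the mean keeps one bad run from dragging the baseline down, and the flagged runs are candidates for a rerun at a different time.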
Understanding what type of instance will be needed for different workloads
After running a series of tests, we recorded the observed plain HTTP throughput for each instance type in the C4 family. We collected those results into a table along with the on-demand cost of running that instance type. Generating this type of table for our specific environment and IxLoad configuration lets us better forecast the cost of running our test tool instances for longer durations of traffic. With this table, we can make decisions such as selecting two instances of a smaller type when a single instance of a larger type is too costly for our needs. Perhaps a smaller instance type covers 80%+ of our primarily functional use cases and we simply need to gang several together to generate enough client traffic for the occasional large capacity test. Do we need to increase our instance size because we will be processing video?
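The kind of table described above can also be kept as data and used for quick forecasts. Here is a minimal Python sketch of that idea. The `Baseline` class, the instance labels, and every number are placeholders made up for illustration. They are not measured IxLoad results or AWS prices, and the sketch models instance-hours only, not any charges for the traffic itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    instance_type: str      # label for the instance type (placeholder names below)
    throughput_gbps: float  # throughput observed in our own test runs
    hourly_usd: float       # on-demand price per instance-hour


def cost_per_gbps_hour(b: Baseline) -> float:
    """On-demand dollars paid per hour for each Gbps of observed throughput."""
    return b.hourly_usd / b.throughput_gbps


def forecast_cost(b: Baseline, instances: int, hours: float) -> float:
    """Instance cost of running `instances` appliances for `hours` hours."""
    return b.hourly_usd * instances * hours


# Placeholder rows: replace with your own measurements and current pricing.
table = [
    Baseline("example-small", throughput_gbps=1.0, hourly_usd=0.25),
    Baseline("example-large", throughput_gbps=4.0, hourly_usd=1.50),
]

for row in table:
    print(row.instance_type,
          round(cost_per_gbps_hour(row), 3),
          forecast_cost(row, instances=2, hours=24))
```

Comparing cost per Gbps-hour across rows is what shows whether several smaller instances or one larger instance is the better buy for a given test duration.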
Table 2. IxLoad 8.50 HTTPS performance baseline using Amazon AWS C4 family instances
In the public cloud, security is paramount and the increasing push towards end-to-end encryption marches ever onward and ever faster. What type of performance hit will be taken when traffic is encrypted as opposed to plain text? What could the public cloud infrastructure do to offload or otherwise accelerate our traffic?
Baselining Ixia + AWS infrastructure performance before adding SUTs
Our next step was to generate a second table for our public cloud environment using a test tool configuration with HTTPS encrypted using AES128-SHA:SSLv3 between client and server test appliances. Again, the ratings of “Moderate”, “High”, and “10 Gbps” do not speak specifically to the type of traffic we are looking to benchmark, nor to the cipher we use to encrypt, the hashing mechanisms, and so on. Only by executing tests for each instance type and recording our observations can we make informed choices in our environment for our unique constraints. What we observed is that with encrypted traffic we were unable to obtain higher throughput using the largest instance type in the C4 family. In this case, should we want to achieve 10 Gbps of aggregate throughput, we can use four test appliances of the next size down. That is far more economical than the largest type, as there is no gain here from the mantra of “go big or go home.”
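The sizing decision above comes down to simple arithmetic: divide the target aggregate throughput by the per-appliance throughput, round up, and compare the resulting hourly cost for each instance type. A hedged Python sketch follows. The function `fleet_options`, the instance labels, and all throughput and price figures are assumptions for illustration. The 2.5 Gbps figure is chosen only so that four appliances reach 10 Gbps, and it is not a published measurement.

```python
import math


def fleet_options(target_gbps, baselines):
    """Return (hourly_cost, instance_type, count) tuples, cheapest first.

    baselines maps an instance label to a tuple of
    (observed per-appliance Gbps, on-demand USD per hour).
    """
    options = []
    for name, (gbps, hourly_usd) in baselines.items():
        if gbps <= 0:
            raise ValueError(f"throughput for {name} must be positive")
        count = math.ceil(target_gbps / gbps)  # appliances needed to hit target
        options.append((count * hourly_usd, name, count))
    return sorted(options)


# Hypothetical encrypted-traffic baselines: the larger type is assumed to be
# no faster than the next size down, mirroring the observation in the text.
https_baselines = {
    "next-size-down": (2.5, 0.80),
    "largest": (2.5, 1.60),
}

best_cost, best_type, best_count = fleet_options(10, https_baselines)[0]
print(best_type, best_count, best_cost)
```

When the larger type adds no throughput, the count is the same for both rows and the cheaper hourly price wins outright.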
Table 3. IxLoad 8.50 Update 1 HTTPS Throughput baseline comparison against prior release IxLoad 8.50
Great. Armed with empirical results from our tests of plain text and SSLv3, we have some key insight into the guidelines for selecting an instance in those cases. But what about other cipher types and key lengths? Surely a complete test series executed against a system under test (SUT) will include a variety of encryption mechanisms that need to be evaluated. Some of these mechanisms have inherent advantages and disadvantages of their own. On top of that comes the added complexity that security vendors have moved to a more agile product development model and are pushing out updates faster than ever, thanks to continuous integration and continuous delivery (CI/CD) practices.
Even test tools benefit from the relentless improvement of public cloud infrastructure and from increased capabilities with each new tool release. One such intersection between platform support and tool functionality is support for Intel Advanced Encryption Standard New Instructions (AES-NI), added in IxLoad 8.50 Update 1 for Galois/Counter Mode (GCM) ciphers and in IxLoad 8.50 Update 2 for non-GCM ciphers.
Executing a series of throughput configurations with differing ciphers shows a significant improvement for those ciphers that are accelerated by the AES-NI instruction set supported by the C4 instance family in AWS, in some cases by a factor of 10. Similarly, connections per second (CPS) tests reveal improvements across the board.
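A release-to-release comparison like this can be reduced to one improvement factor per cipher. The Python sketch below shows one way to compute it. The function `compare_releases` is hypothetical, and the cipher names and throughput values are placeholders chosen for illustration, not results from IxLoad 8.50 or its updates.

```python
def compare_releases(old_mbps, new_mbps):
    """Return the improvement factor (new / old) for ciphers present in both runs."""
    factors = {}
    for cipher, old_value in old_mbps.items():
        if cipher in new_mbps and old_value > 0:
            factors[cipher] = new_mbps[cipher] / old_value
    return factors


# Placeholder throughput readings in Mbps for two tool releases.
prior_release = {"AES128-GCM-SHA256": 100.0, "AES128-SHA": 400.0}
newer_release = {"AES128-GCM-SHA256": 1000.0, "AES128-SHA": 400.0}

for cipher, factor in compare_releases(prior_release, newer_release).items():
    print(f"{cipher}: {factor:.1f}x")
```

The same function works for connections-per-second readings, since it only compares like-for-like values keyed by cipher.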
Table 4. IxLoad 8.50 Update 1 HTTPS connections per second baseline comparison against IxLoad 8.50
The wide reach of the public cloud enables many exciting application performance test scenarios including hybrid cloud and multi-cloud traversal. Understanding the performance characteristics of the underlying public cloud infrastructure and calibrating the application performance test tool to our environment’s particular constraints and capabilities is needed prior to adding complexity to the system.
The multi-tenant nature of the cloud makes it easy to iterate quickly, to compare and contrast performance results between versions of the SUT, and to introduce new iterations of the test tool with new capabilities or performance improvements without disrupting prior environments. Your extra effort to baseline the test tool's performance and cost will pay off in the immediate term as well as in the future.
For more information, watch the companion video to this blog:
For more details on the IxLoad on AWS solution, consult the data sheet.