
Resilient architectures part 2: Benchmarking performance of webserver-based applications

October 17, 2019 by A. Joseph Dupre III

This blog entry is the second in a series of cloud testing use cases focused on deploying and using a world-class application performance test tool within the larger scope and elasticity of the public cloud. Read Resilient architectures part 1  

We'll start this discussion armed with the empirical data we collected in the previous use case while baselining the performance of Ixia's IxLoad Virtual Edition (VE) Layer 7 application performance test tool in the Amazon Web Services (AWS) public cloud. We selected a c4.4xlarge instance for the Ixia Virtual Test Appliance (VMone) to generate real-world client HTTP and HTTPS traffic against a web server deployed on AWS. This gives us the flexibility to spin up one or more VMones to generate the level of client traffic we need to push the performance threshold for our target web server and its workload. The c4.4xlarge strikes a useful balance between a low cost to operate in the cloud and the multiple Gbps of traffic it can generate for application performance testing.

Our goal with this exercise was to understand how we could use IxLoad to obtain the key application performance metrics of connections per second (CPS), concurrent connections (CC), and throughput for a web server deployed using different AWS instance types. We generated a table of measurements to enable data-driven decisions about the number of web server instances we might need and the aggregate cost tradeoffs between scaling horizontally with many smaller instances and scaling vertically with fewer, larger instance types. To start, we needed to understand how a single instance performs across a variety of sizes.

Understanding how web servers perform on different instance types

We decided to first compare instances in the T2 family. This instance family is interesting because it is low cost and targeted at general-purpose computing. A t2.micro instance is how a lot of us get started with the AWS Free Tier when learning with our first "Hello World" exercises. The T2 family is typically rated in the "Low" to "Moderate" performance tier by Amazon. What's even more interesting about this particular instance family is that these instances can temporarily burst their performance above their rating by utilizing CPU credits.
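
One way to see this credit mechanism in action is to watch the instance's CPUCreditBalance metric in CloudWatch while load is applied. The short boto3 sketch below is illustrative only; the instance ID and region are placeholders, and it assumes the metric is being published for a T2 instance.

```python
from datetime import datetime, timedelta, timezone

import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder ID of the T2 web server under test

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # placeholder region

end = datetime.now(timezone.utc)
start = end - timedelta(hours=6)

# CPUCreditBalance shows how much burst headroom the instance has left; watching it
# drain during a sustained load run marks the transition from burst performance to
# the baseline ("floor") performance.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUCreditBalance",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    StartTime=start,
    EndTime=end,
    Period=300,  # T2 credit metrics are published at 5-minute granularity
    Statistics=["Average"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```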

When collecting our data, we wanted to use IxLoad to run repeated, longer-duration tests so we could not only observe the burst behaviors but also associate some numbers with the notion that these instances have a "consistent baseline performance". Essentially, it is helpful to understand what the steady-state behavior of these instance types looks like under a sustained workload, so we know where the floor of our everyday performance sits. To flexibly change between instance types while maintaining consistency in the web server configuration, we used AWS CloudFormation templates to automate the deployment of a single web server device under test (DUT).
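
The following is a minimal boto3 sketch of the kind of automation we mean; it assumes a CloudFormation template (not shown here, and the file name is hypothetical) that builds the web server DUT and exposes the EC2 instance type as a single InstanceType parameter, so the only thing that changes between runs is the size.

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")  # placeholder region

# Assumes a template that installs and configures the web server and exposes the
# EC2 instance type as an "InstanceType" parameter.
with open("webserver-dut.yaml") as f:
    template_body = f.read()

for size in ["t2.large", "t2.xlarge", "t2.2xlarge"]:
    stack_name = "webserver-dut-" + size.replace(".", "-")
    cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=[{"ParameterKey": "InstanceType", "ParameterValue": size}],
    )
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    print(stack_name, "is ready for an IxLoad test run")
```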


Table 1. IxLoad 8.50 HTTPS client metrics against AWS T2 family web server instances

For this test series, we only needed a single VMone virtual test appliance, as the steady-state performance was observed to be under 1 Gbps. For example, we found the steady-state performance of the t2.xlarge, which is rated as "Moderate" and is one of the largest in the family, to clock in around 820 Mbps. Again, in cases where the CPU has credits available, this instance type is capable of higher throughput levels. Still, this gives us a quick table by which to compare when horizontal scaling may be a better fit than vertical scaling when we go to increase the number of instances in future test series. While the increase in steady-state throughput is significant between t2.large and t2.xlarge, the further step up to t2.2xlarge may not be worth roughly doubling the cost per hour in the majority of cases, especially when we know the CPU can burst to handle transient periods of higher activity than we typically expect.
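
To make that horizontal-versus-vertical comparison concrete, a few lines of Python can estimate how many instances of each size a target aggregate throughput would require and what that would cost. The throughput and price figures below are illustrative placeholders, not our measured table values; plug in your own measurements and current on-demand pricing.

```python
import math

# Illustrative placeholders only: substitute measured steady-state throughput and
# current on-demand pricing for the sizes you care about.
instances = {
    # name: (steady-state Mbps, $ per hour)
    "t2.large":   (400, 0.0928),
    "t2.xlarge":  (820, 0.1856),
    "t2.2xlarge": (900, 0.3712),
}

target_mbps = 3000  # hypothetical aggregate workload we need to serve

for name, (mbps, price) in instances.items():
    count = math.ceil(target_mbps / mbps)
    print(f"{count} x {name}: ~${count * price:.2f}/hour for {count * mbps} Mbps")
```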

Understanding how to use AES-NI to optimize the web server's performance

We saw in the previous use case blog entry that Intel® Advanced Encryption Standard (AES) New Instructions (AES-NI) had a significant impact on the performance baseline we were able to achieve with our IxLoad application performance test tool when the underlying instance type supports AES-NI. The same is of course true when a web server can use AES-NI to accelerate encrypted traffic processing and achieve higher results. According to Intel, AES-NI can accelerate the performance of AES up to 10x over software alone, and it does so by implementing the complex and performance-intensive portions of the AES algorithm in hardware. (SOURCE: https://software.intel.com/en-us/articles/intel-advanced-encryption-standard-instructions-aes-ni/)
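
If you want to verify that a given instance type actually exposes AES-NI to the guest before running an encrypted traffic test, a quick check of the CPU flags on the Linux web server will do it. The snippet below assumes a Linux instance; comparing openssl speed results with and without the acceleration is another common sanity check.

```python
def has_aes_ni(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the 'aes' CPU flag is visible to this (Linux) instance."""
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                return "aes" in line.split(":", 1)[1].split()
    return False


if __name__ == "__main__":
    print("AES-NI available" if has_aes_ni() else "AES-NI not exposed to this instance")
```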

Understanding how vertical scaling of different instances impacts cost

For this test series, we selected the C4 family, which supports AES-NI acceleration among other features focused on compute-intensive tasks. We wanted to understand the behavior of instances rated "Moderate", "High", and "10 Gbps" by Amazon. We executed a series of tests in which we recorded the observable HTTPS throughput for each instance type in the C4 family, and we collected those results into a table along with the on-demand cost to run each instance type. Similar to vertical scaling within the T2 family, we found that for the C4 family the increase in performance between c4.2xlarge and c4.4xlarge was roughly double for a doubling in cost per hour, whereas the further increase to c4.8xlarge did not yield a significant enough gain for our encrypted traffic workload. This is one of the observations that led us to note that the c4.4xlarge sits in a noticeable sweet spot on the cost-per-hour versus throughput curve.


Table 2. IxLoad 8.50 HTTPS client metrics against AWS C4 family web server instances

Understanding cost per KPI for HTTP vs. HTTPS

Throughput is often one of the most, if not the most, critical metrics customers want to understand about their DUTs or systems under test (SUTs), but it is certainly not the complete story. IxLoad’s automatic objective goal seeking feature makes it easy to switch between other key performance indicators (KPIs) such as CC or CPS and let the test tool adjust the other dimensions accordingly. 

We collected CPS metrics for each of the same C4 family compute instance sizes that were measured in the previous test series. In this case, we again observed the effects of the acceleration provided by AES-NI: when the traffic was plain HTTP, the results were very flat, but when the traffic was encrypted, we saw an almost doubling of the overall performance, even though here too the results were fairly flat. Again, if CPS were a more critical KPI for our particular workload, the benefit of repeatable and automatable application performance testing ahead of a larger-scale deployment is that it lets us perform a data-driven cost-savings analysis without sacrificing performance.
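
As a sketch of what "cost per KPI" means in practice, the snippet below normalizes an hourly instance price by a measured CPS figure. All of the numbers are hypothetical placeholders standing in for measured IxLoad results and current on-demand pricing; the point is the normalization, not the values.

```python
# Hypothetical placeholders for measured IxLoad CPS results and on-demand pricing.
measurements = {
    # instance: ($ per hour, HTTP CPS, HTTPS CPS)
    "c4.2xlarge": (0.398, 50_000, 20_000),
    "c4.4xlarge": (0.796, 52_000, 38_000),
    "c4.8xlarge": (1.591, 53_000, 42_000),
}

for name, (price, http_cps, https_cps) in measurements.items():
    print(
        f"{name}: ${price / (http_cps / 1000):.4f} per 1k HTTP CPS, "
        f"${price / (https_cps / 1000):.4f} per 1k HTTPS CPS"
    )
```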


Table 3. IxLoad 8.50 HTTP vs. HTTPS client CPS metrics against AWS C4 family web server instances

Understanding how to conduct PoCs to select the "best" web server

Throughout this process we have used AWS CloudFormation templates to structure not only our IxLoad test tool configuration in the public cloud but also our web server instance so that we could rapidly and iteratively change instance size while maintaining consistent web server configuration. 

Our tests so far were executed against an Apache-based web server, but what if another web server code base has features we need for our particular workload, or is better optimized for our expected traffic patterns? I think Ryan George would agree with me that instead of this being difficult, it is actually "Super Easy, Barely an Inconvenience". We already had a repeatable IxLoad deployment template, and it was easy to modify our existing Apache deployment template with the Linux commands needed to spin up an Nginx web server and repeat our IxLoad configurations against that setup as well. IxLoad gives us a consistent yardstick with which to measure the effects of our tradeoffs and fairly compare alternate possible implementations.
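
For illustration, the swap can be as small as changing the commands that bootstrap the instance. In our case that meant editing the UserData in the CloudFormation template; the boto3 sketch below shows the same idea in simplified form, with a placeholder AMI ID and install commands that assume an Ubuntu image.

```python
import boto3

# Simplified stand-in for editing the CloudFormation template's UserData: launch the
# same instance size with bootstrap commands that install Nginx instead of Apache.
user_data = """#!/bin/bash
apt-get update -y
apt-get install -y nginx
systemctl enable --now nginx
"""

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder Ubuntu AMI
    InstanceType="c4.4xlarge",
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)
```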

Takeaway

Prior to deploying a large-scale web service in the public cloud, it is critical to use an industry-leading application performance test tool to validate your assumptions about how your cloud architecture will behave by collecting real-world metrics in a controlled environment. Armed with this empirical data, we can right-size our web service both vertically and, as will be discussed in our next blog, horizontally through the use of Auto Scaling Groups (ASGs).

For more details on the IxLoad on Amazon Web Services (AWS) solution, please consult the data sheet. For more information, watch the companion video to the following blog.