Comments:"Frameworks Round 2 - TechEmpower Blog"
URL:http://www.techempower.com/blog/2013/04/05/frameworks-round-2/
Last week, we posted the results of benchmarking several web application development platforms and frameworks. The response was tremendous. We received comments, recommendations, advice, criticism, questions, and, most importantly, pull requests from dozens of readers and developers.
On Tuesday of this week, we kicked off a pair of EC2 instances and a pair of our i7 workstations to produce updated data. That is what we're sharing here today. We dive right in with the EC2 JSON test results, but please read to the end where we include important notes about what has changed since last week.
JSON serialization test
In this test, each HTTP response is a JSON serialization of a freshly-instantiated object, resulting in {"message" : "Hello, World!"}. First up is data from the EC2 m1.large instances.
Repeating its performance from last week, the Netty platform holds a commanding lead for JSON serialization on EC2. Vert.x, which is built on Netty, retains second place. Third place is held by plain Java Servlets running on Caucho's Resin Servlet container.
In this round, we added latency data (available using the rightmost tab at the top of this panel). The latency data is captured at 256 concurrency. Plain Go delivers the lowest latency, with a remarkable 7.8 millisecond average and tight standard deviation on EC2.
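For readers who have not looked at the test code, the JSON test can be sketched roughly as follows. This is a minimal illustration in plain Python WSGI, not the code used for any framework in the results; the `Message` class and `json_app` name are hypothetical and exist only for this example.

```python
import json


class Message:
    """Hypothetical response object; each real test uses its framework's own types."""

    def __init__(self, message):
        self.message = message


def json_app(environ, start_response):
    """WSGI handler: build a fresh object per request and serialize it to JSON."""
    payload = Message("Hello, World!")  # freshly instantiated on every request
    body = json.dumps(vars(payload)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the handler could be served with the standard library, for example `wsgiref.simple_server.make_server("", 8080, json_app).serve_forever()`.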
Dedicated hardware
Here is the same test on our Sandy Bridge i7 hardware.
On our dedicated hardware, plain Servlets lead with over 220,000 requests per second. Tapestry sees a marked improvement versus last week, in part thanks to a pull request that updated our test.
In this week's tests, we have added a latency tab (the rightmost tab at the top of this panel). On i7, we see that several frameworks are able to provide a response in under 10 milliseconds. Only Cake PHP requires more than 100 milliseconds.
Database access test (single query)
How many requests can be handled per second if each request is fetching a random record from a data store? Starting again with EC2.
We received pull requests that have improved the performance of several frameworks in this database access test. Wicket and Spring have seen notable improvements. Plain Servlets are paired with the standard connection pool provided by MySQL, and Gemini is using its built-in connection pool and lightweight ORM.
Other minor improvements versus last week may be attributed to our use of Wrk as a test tool in this round versus the first round use of WeigHTTP.
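The single-query test boils down to "pick a random id, fetch that row." The sketch below illustrates the idea with an in-memory SQLite table as a stand-in for the MySQL data store used in the benchmarks. The table name, column names, and row count are assumptions made for this example only.

```python
import random
import sqlite3

ROW_COUNT = 10_000  # assumed table size, chosen only for this sketch


def make_store():
    """Create an in-memory table of (id, randomNumber) rows as a stand-in data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE world (id INTEGER PRIMARY KEY, randomNumber INTEGER)")
    conn.executemany(
        "INSERT INTO world (id, randomNumber) VALUES (?, ?)",
        ((i, random.randint(1, ROW_COUNT)) for i in range(1, ROW_COUNT + 1)),
    )
    conn.commit()
    return conn


def fetch_random_row(conn):
    """Handle one 'request': choose a random id and run exactly one query."""
    row_id = random.randint(1, ROW_COUNT)
    row = conn.execute(
        "SELECT id, randomNumber FROM world WHERE id = ?", (row_id,)
    ).fetchone()
    return {"id": row[0], "randomNumber": row[1]}
```

A real test would reuse pooled connections to a networked database rather than a single in-process connection, which is where much of the measured cost comes from.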
Dedicated hardware
The dedicated hardware processes nearly 100,000 requests per second with one query per request. JVM frameworks are especially strong here thanks to JDBC and efficient connection pools.
The latency data (rightmost tab at the top of this panel) shows, unsurprisingly, that processing a query requires more work than the JSON test. However, several frameworks are still capable of providing a database-sourced response in less than 20 milliseconds. Sinatra on JRuby struggles dramatically in this test, with an alarming average of 583 milliseconds. Meanwhile, Django has the widest standard deviation, probably in large part because Django does not provide a MySQL connection pool (Postgres tests are planned).
Database access test (multiple queries)
The following tests are all run at 256 concurrency and vary the number of database queries per request. The tests are 1, 5, 10, 15, and 20 queries per request.
Looking at the 20-queries bar chart, roughly the same ranked order we've seen elsewhere is still in play, demonstrating the headroom afforded by higher-performance frameworks.
The latency data (available using the rightmost tab at the top of this panel) shows that ten frameworks, predominantly running on the JVM, are capable of executing twenty queries per request on EC2 in under 1 second on average. As before, Raw PHP is also very strong in this test. The Flask and Django results are impacted heavily by the lack of a connection pool. Later rounds will either test on Postgres or use a third-party MySQL connection pool.
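To make the multiple-query test concrete, here is a rough sketch of a handler that issues a configurable number of separate single-row lookups per request. As an assumption for illustration, it uses an in-memory SQLite table rather than MySQL, and the table layout and function names are hypothetical.

```python
import random
import sqlite3

ROW_COUNT = 1_000                  # assumed table size for this sketch
QUERY_COUNTS = (1, 5, 10, 15, 20)  # queries per request used in this test


def make_store():
    """Create an in-memory stand-in table of (id, randomNumber) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE world (id INTEGER PRIMARY KEY, randomNumber INTEGER)")
    conn.executemany(
        "INSERT INTO world (id, randomNumber) VALUES (?, ?)",
        ((i, random.randint(1, ROW_COUNT)) for i in range(1, ROW_COUNT + 1)),
    )
    conn.commit()
    return conn


def handle_request(conn, queries):
    """Run `queries` separate lookups; each one is its own statement, not a batch."""
    rows = []
    for _ in range(queries):
        row_id = random.randint(1, ROW_COUNT)
        row = conn.execute(
            "SELECT id, randomNumber FROM world WHERE id = ?", (row_id,)
        ).fetchone()
        rows.append({"id": row[0], "randomNumber": row[1]})
    return rows


if __name__ == "__main__":
    store = make_store()
    for n in QUERY_COUNTS:
        print(n, "queries ->", len(handle_request(store, n)), "rows")
```

Against a networked database, each pass through the loop is a separate round trip, so per-request latency grows with the query count.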
Dedicated hardware
The dedicated hardware produces numbers nearly ten times greater than EC2 with the punishing 20 queries per request. Again, Raw PHP makes an extremely strong showing. PHP with an ORM and Cake improved dramatically from last week's test thanks to configuration changes recommended by the community.
An impressive demonstration of modern hardware and networks: seven frameworks are able to provide a response with 20 individually database-sourced rows (that's twenty round-trip conversations with a database server regardless of how you slice it) in less than 100 milliseconds on average.
At the advice of readers, this round of data was collected using Wrk (https://github.com/wg/wrk). In the first round from last week, we used WeigHTTP (https://github.com/lighttpd/weighttp). This change accounts for the very slight increase in rps seen in several frameworks, including those that saw no change to their benchmark or library code. Our conjecture is that Wrk is just slightly quicker at processing requests.
We didn't switch tools to improve the rps numbers, though. Some readers wanted to see data points that WeigHTTP wasn't providing us. Wrk gives latency data including average, standard deviation, and maximum. For example:
    Making 100000 requests to http://10.253.42.235:8080/
      8 threads and 256 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    10.07ms    7.80ms   73.59ms   77.37%
        Req/Sec     2.99k     1.07k     8.00k    88.42%
      100002 requests in 3.68s, 59.89MB read
    Requests/sec:  27202.70
    Transfer/sec:     16.29MB

The latency information is now available in the results panels above (the rightmost tab in each panel).
The raw Wrk output from the latest run is in the Github repository.
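As a rough guide to reading the latency columns, the sketch below computes the same kinds of summary figures from a list of per-request latencies. It is not Wrk's implementation. It assumes that the "+/- Stdev" column is the percentage of samples falling within one standard deviation of the mean, and it uses the population standard deviation, which may differ slightly from what Wrk computes.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return avg, stdev, max, and the share of samples within one stdev of the mean."""
    mean = statistics.mean(samples_ms)
    stdev = statistics.pstdev(samples_ms)  # population stdev; an assumption, Wrk may differ
    within = sum(1 for s in samples_ms if abs(s - mean) <= stdev)
    return {
        "avg": mean,
        "stdev": stdev,
        "max": max(samples_ms),
        "within_stdev_pct": 100.0 * within / len(samples_ms),
    }


if __name__ == "__main__":
    # Made-up sample latencies in milliseconds, for illustration only.
    print(summarize_latencies([5.0, 7.0, 9.0, 11.0, 13.0]))
```

A tight standard deviation combined with a high "+/- Stdev" percentage suggests that most requests complete close to the average time.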
Additional “stripped” tests
We received community contributions for Rails and Django that removed unused "middleware" components to fine-tune the configuration of these two frameworks to the particular use-case of these benchmarks. We've accepted these contributions but identified them as Django Stripped and Rails Stripped.
We have also retained the original Django and Rails tests (with some other tweaks).
To reiterate the intent of this benchmark exercise: we want to identify the high-water mark of performance one can expect from each framework for real-world applications. Real-world applications will do much more than serialize "Hello, World" and random rows from a simple database table. But we use these simple tests as stand-ins for an application. For that reason, we intentionally did not turn off features that are enabled by default (such as support of HTTP sessions) in our first-round tests.
Still, there is value in demonstrating the degree of increased performance that can be realized by fine-tuning a framework to your application's specific needs. Don't need sessions? What kind of savings can you expect if you turn session support off?
We are not yet certain how best to differentiate tests that exercise the framework mostly as provided versus those that fine-tune the configuration for the particular use-case of these benchmarks. For now, we use the "stripped" name suffix.
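The difference between a default and a "stripped" configuration can be pictured with a toy middleware chain. This sketch is framework-neutral and does not reflect how Django or Rails implement middleware; every name in it is hypothetical.

```python
def app(request):
    """Innermost handler: the only work these benchmarks strictly need."""
    return {"message": "Hello, World!"}


def session_middleware(next_handler):
    """Hypothetical session layer: per-request work the benchmark never uses."""
    def handler(request):
        request = dict(request, session={})  # stand-in for loading a session
        return next_handler(request)
    return handler


def logging_middleware(next_handler):
    """Hypothetical logging layer that records each request path."""
    def handler(request):
        request.setdefault("log", []).append(request.get("path", "/"))
        return next_handler(request)
    return handler


def build(handler, middleware):
    """Wrap the handler so that the first middleware in the list runs first."""
    for layer in reversed(middleware):
        handler = layer(handler)
    return handler


default_stack = build(app, [logging_middleware, session_middleware])
stripped_stack = build(app, [])  # "stripped": unused layers removed
```

Both stacks return the same response; the stripped one simply skips the per-request work of the layers it no longer includes, which is the saving the "stripped" tests attempt to measure.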
Revised Environment Details
Images for sharing
We are grateful to have received Github pull requests and comments from dozens of users: Licenser, th0br0, davidmoreno, Skamander, jasonhinkle, pk11, vsg, knappador, RaphaelJ, chrisvest, dominikgrygiel, jpiasetz, mliberty, nraychaudhuri, bjornstar, shenfeng, bitemyapp, jmgao, larkin, ryantenney, normanmaurer, hlship, burtbeckwith, sashahart, abevoelker, tarndt, skelterjohn, myfreeweb, gleber, sidorares, philsturgeon, patoi, dcousineau, asadkn, BeCreative-Germany, rrevi, goshakkk, tarekziade, julienrf, mitsuhiko, jerem, huntc, alexbilbie, AlReece45, jameswyse, CHH, hassankhan, Nazariy, and onigoetz. A big thank you to all of you!
We have indicated any frameworks that received community review or for which the tests were wholly contributed by the community with a flag after their name in the results tables. For example: play-scala.
Another update is planned for next week. We already have additional frameworks to include thanks to ongoing pull requests.