.NET and Node.JS - Performance Comparison | Salman Quazi
URL: http://www.salmanq.com/blog/net-and-node-js-performance-comparison/2013/03/
If you talk to any Silicon Valley startup today, chances are you will hear about Node.js. One of the key reasons, most argue, is that Node.js is fast and scalable because of its forced non-blocking IO and its efficient use of a single-threaded model. I personally love JavaScript, so being able to use JavaScript on the server side seemed like a key gain. But I was never really sold on the notion that Node.js is supremely fast simply because there aren't any context switches or thread synchronizations. We all know these practices should be avoided wherever possible in any multi-threaded program, but giving them up entirely seemed extreme. Still, if that meant consistently higher performance, then sure, it would make sense. So I wanted to test this theory. I wanted to find out exactly how fast Node.js was compared to .NET, as empirically as possible.
So I wanted to come up with a problem that involved IO (ideally not involving a database) and some computation, and I wanted to run it under load so that I could see how each system behaves under pressure. I came up with the following problem: I have approximately 200 files, each containing somewhere between 10 and 30 thousand random decimals. Each request to the server contains a number, such as /1 or /120; the service then opens the corresponding file, reads the contents, sorts them in memory, and outputs the median value. That's it. Our goal is to reach a maximum of 200 simultaneous requests, so the idea is that each request has a corresponding file without ever overlapping.
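Stripped of the HTTP and file-system plumbing, the per-request work is just "parse, sort, take the middle element." A minimal sketch of that core in plain JavaScript (the helper name `medianOfLines` is mine, not from the benchmark code; both services use the same middle-index convention, `length / 2` truncated, rather than averaging the two middle values of an even-length input):

```javascript
// Parse newline-separated decimals, sort numerically,
// and return the element at index floor(length / 2).
function medianOfLines(text) {
  var values = text
    .split("\r\n")
    .map(function (line) { return parseFloat(line); })
    .sort(function (a, b) { return a - b; }); // numeric, not lexical, sort
  return values[Math.floor(values.length / 2)];
}
```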
I also wanted to align the two platforms (.NET and Node.js). For instance, I didn't want to host the .NET service on IIS because it seemed unfair to pay the cost of everything IIS comes with (caching, routing, performance counters) only to never use it. I also avoided the entire ASP.NET pipeline, including MVC, for the same reason: they all come with features we don't care about in this case.
Okay, so both .NET and Node.js will create a basic HTTP listener. What about the client? The plan here is to create a simple .NET console app that drives load to the service. While the client is written in .NET, the key point is that we test both the .NET and Node.js services with the same client, so how the client is written is a negligible concern. Before we delve into the details, let's look at the graph that shows us the results:
[Graph: Performance of sorting numbers, .NET vs. Node.js]
As you can see .NET wins, hands down. No question about it. Let’s look at each aspect of this test more carefully.
We'll start with the client. The client uses an HttpClient to drive requests to the service. The response times are measured on the client side so that implementation differences between the two services can't skew our numbers. Notice that I avoided doing any Console.WriteLine (which blocks) until the very end.
public void Start()
{
    Task[] tasks = new Task[this.tasks]; // this.tasks: configured number of concurrent requests
    for (int i = 0; i < this.tasks; ++i)
    {
        tasks[i] = this.Perform(i);
    }
    Task.WaitAll(tasks);
    result.ToList().ForEach(Console.WriteLine);
}

public async Task Perform(int state)
{
    string url = String.Format("{0}{1}", this.baseUrl, state.ToString().PadLeft(3, '0'));
    Stopwatch timer = new Stopwatch();
    timer.Start();
    string result = await this.client.GetStringAsync(url);
    timer.Stop();
    this.result.Enqueue(String.Format("{0,4}\t{1,5}\t{2}", url, timer.ElapsedMilliseconds, result));
}
With that client in place, we can start looking at the services. First, the Node.js implementation. The key point I want to make is that I am using the async NPM package instead of the default, blocking Array.prototype.sort. Now, one could argue that the sorting algorithm in the async package isn't as good as the one used by .NET. My first answer would be that I didn't use any magical sort algorithm on the .NET side either (even though they are available); for instance, I could have used a parallel quicksort right out of the TPL extensions, but I didn't. I used what's available right out of the BCL against an array. My second answer would be that it really doesn't matter: the inputs are fairly bounded in size (between 10K and 30K values), and yet the overall trend of growing disparity between .NET and Node.js is the concern. In other words, if you look at the graph, notice that the area between the two series is almost a right triangle: narrow in the beginning, and wider and wider over time.
var http = require('http');
var fs = require('fs');
var async = require('./async.js');

http.createServer(function (request, response) {
    var file = parseInt(request.url.substring(1));
    file = file % 190; // 190 is the exact number of files we have
    file = String("000" + file).slice(-3);

    // read the file
    fs.readFile('../data/input' + file + '.txt', 'ascii', function (err, data) {
        if (err) {
            response.writeHead(400, {'Content-Type': 'text/plain'});
            response.end();
        } else {
            var array = data.toString().split("\r\n");
            async.sortBy(array, function (obj, callback) {
                callback(null, parseFloat(obj));
            }, function (err, results) {
                if (err) {
                    console.log('err');
                    response.writeHead(400, {'Content-Type': 'text/plain'});
                    response.end();
                } else {
                    response.writeHead(200, {'Content-Type': 'text/plain'});
                    response.end('input' + file + '.txt\t' + results[parseInt(results.length / 2)]);
                }
            });
        }
    });
}).listen(8080, '127.0.0.1');
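For readers unfamiliar with the async package: `async.sortBy(coll, iteratee, callback)` runs every element through an asynchronous iteratee to compute a sort key, sorts by those keys, and hands the sorted collection to the final callback. A rough, dependency-free sketch of that contract (my own simplification, not the library's implementation) helps show why the service above passes `parseFloat` results as keys:

```javascript
// Simplified stand-in for async.sortBy: compute a key for each element via
// the (possibly asynchronous) iteratee, then sort the elements by key.
function sortBySketch(items, iteratee, done) {
  var keyed = [];
  var pending = items.length;
  var failed = false;
  if (pending === 0) return done(null, []);
  items.forEach(function (item, i) {
    iteratee(item, function (err, key) {
      if (failed) return;
      if (err) { failed = true; return done(err); }
      keyed[i] = { item: item, key: key };
      if (--pending === 0) {
        keyed.sort(function (a, b) { return a.key - b.key; });
        done(null, keyed.map(function (e) { return e.item; }));
      }
    });
  });
}
```

This is why the strings end up numerically ordered even though they are never converted in place: the keys drive the comparison, and the original strings are returned.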
And lastly, let's look at the .NET service implementation. Needless to say, we are using .NET 4.5, with all the glories of async/await. As I mentioned earlier, I wanted to compare pure .NET without IIS or ASP.NET, so I started off with a simple HTTP listener:
public async Task Start()
{
    while (true)
    {
        var context = await this.listener.GetContextAsync();
        this.ProcessRequest(context);
    }
}
With that, I am able to start processing each request. As requests come in, I read the file stream asynchronously so I am not blocking my thread-pool thread, then perform the in-memory sort as a simple Task that wraps Array.Sort. With .NET I could have significantly improved performance in this area by using the parallel sorting algorithms that come right out of the parallel extensions, but I chose not to.
private async void ProcessRequest(HttpListenerContext context)
{
    try
    {
        var filename = this.GetFileFromUrl(context.Request.Url.PathAndQuery.Substring(1));
        byte[] rawData = null;
        using (FileStream stream = File.Open(Path.Combine(dataDirectory, filename),
            FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            rawData = new byte[stream.Length];
            await stream.ReadAsync(rawData, 0, (int)stream.Length);
        }

        var sorted = await this.SortAsync(context, rawData);
        var response = encoding.GetBytes(String.Format("{0}\t{1}", filename, sorted[sorted.Count / 2]));

        // Set the status code before writing: headers go out on the first write.
        context.Response.StatusCode = (int)HttpStatusCode.OK;
        await context.Response.OutputStream.WriteAsync(response, 0, response.Length);
    }
    catch (Exception e)
    {
        context.Response.StatusCode = (int)HttpStatusCode.BadRequest;
        Console.WriteLine(e.Message);
    }
    finally
    {
        context.Response.Close();
    }
}

private async Task<List<string>> SortAsync(HttpListenerContext context, byte[] rawData)
{
    return await Task.Run(() =>
    {
        var input = this.encoding.GetString(rawData);
        var array = input.Split(new string[] { "\r\n" }, StringSplitOptions.None).ToList();
        array.Sort();
        return array;
    });
}
You can download the entire source code; the zip file includes the client and the service sources for both .NET and Node.js. It also includes a tool to generate the random number files, so that you can run the tests on your own machine. You will also find the raw numbers in the zip file.
I want to end with a realization I had while working on this prototype. As I was building the Node.js service, I looked around the web for packages that did asynchronous sorting. Sorting! A CS101 request, and I came up with only one solution. That solution didn't say what sorting algorithm it used; in this case I didn't care, but what if I did? What if I wanted to extend the behavior of the sort algorithm, could I? .NET has such a highly organized and cohesive framework that nothing else comes close. Imagine the power of having type safety, the ability to write queries right within the language (LINQ), and the ability to effectively parallelize them (PLINQ). And then what about simple parallelism?

As I was reading through the Node.js forums, I started to hear resentment about the natural issues that come with Node.js control flow; people everywhere seemed to have invented techniques to manage it, things that turn tree-like control flows into flat waterfalls. But those are the happy scenarios, when everything works. What happens when one of these steps throws an exception? How do you synchronize these events? And so on. That is when you realize that you have a powerful, flexible framework like .NET at your fingertips, with the amazing performance we've shown today. I strongly believe that you should always choose the tool that's best for the job, but oftentimes it's these small projects that turn big, and if you don't think carefully about your choices, you'll have no choice but to start over. Flexibility, therefore, should be the tipping point of your decision making.
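The "flat waterfall" technique mentioned above looks roughly like this: each step passes its result (or an error) to the next via a callback, and a single terminal callback handles any failure. A minimal dependency-free sketch of the idea (my own illustration, simpler than the real async.waterfall, whose first task takes only a callback):

```javascript
// Minimal waterfall: run steps in order, feeding each step's result to the
// next; the first error (passed or thrown) short-circuits to the final callback.
function waterfallSketch(steps, done) {
  function next(err, result) {
    if (err) return done(err);
    var step = steps.shift();
    if (!step) return done(null, result);
    try {
      step(result, next);
    } catch (e) {
      done(e); // a thrown (rather than passed) exception still reaches the handler
    }
  }
  next(null, undefined);
}
```

The try/catch is the part most hand-rolled versions forget, and it is exactly the "what happens when one of these throws an exception?" question: without it, a synchronous throw inside a step escapes the chain entirely and the final callback never fires.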