Channel: Hacker News 50

John Resig - Asm.js: The JavaScript Compile Target



URL:http://ejohn.org/blog/asmjs-javascript-compile-target/


Like many developers I’ve been excited by the promise of Asm.js. Reading the recent news that Asm.js is now in Firefox nightly is what got my interest going. There’s also been a massive surge in interest after Mozilla and Epic announced (mirror) that they had ported Unreal Engine 3 to Asm.js – and that it ran really well.

Getting a C++ game engine running in JavaScript, using WebGL for rendering, is a massive feat and is largely due to the toolchain that Mozilla has developed to make it all possible.

Since the release of the Unreal Engine 3 port to Asm.js I’ve been watching the response on Twitter, blogs, and elsewhere. While some developers are grasping the interesting confluence of open technologies that made this advancement happen, I’ve also seen a lot of confusion: Is Asm.js a plugin? Does Asm.js make my regular JavaScript fast? Does this work in all browsers? I feel that Asm.js, and related technologies, are incredibly important and I want to try to explain the technology so that developers know what’s happened and how they will benefit. In addition to my brief exploration into this subject I’ve also asked David Herman (Senior Researcher at Mozilla Research) a number of questions regarding Asm.js and how all the pieces fit together.

What is Asm.js?

In order to understand Asm.js and where it fits into the browser you need to know where it came from and why it exists.

Asm.js comes from a new category of JavaScript application: C/C++ applications that’ve been compiled into JavaScript. It’s a whole new genre of JavaScript application that’s been spawned by Mozilla’s Emscripten project.

Emscripten takes in C/C++ code, passes it through LLVM, and converts the LLVM-generated bitcode into JavaScript (specifically, Asm.js, a subset of JavaScript).

If the compiled Asm.js code is doing some rendering then it is most likely being handled by WebGL (and rendered using OpenGL). In this way the entire pipeline is technically making use of JavaScript and the browser but is almost entirely skirting the actual, normal, code execution and rendering path that JavaScript-in-a-webpage takes.

Asm.js is a subset of JavaScript that is heavily restricted in what it can do and how it can operate. This is done so that the compiled Asm.js code can run as fast as possible making as few assumptions as it can, converting the Asm.js code directly into assembly. It’s important to note that Asm.js is just JavaScript – there is no special browser plugin or feature needed in order to make it work (although a browser that is able to detect and optimize Asm.js code will certainly run faster). It’s a specialized subset of JavaScript that’s optimized for performance, especially for this use case of applications compiled to JavaScript.

The best way to understand how Asm.js works, and its limitations, is to look at some Asm.js-compiled code. Let’s look at a function extracted from a real-world Asm.js-compiled module (from the BananaBread demo). I formatted this code so that it’d be a little bit saner to digest – it’s normally just a giant blob of heavily-minimized JavaScript:

Technically this is JavaScript code but we can already see that this looks nothing like most DOM-using JavaScript that we normally see. A few things we can notice just by looking at the code:

  • This particular code only deals with numbers. In fact this is the case for all Asm.js code. Asm.js is only capable of handling a selection of different number types and no other kind of data (no strings, booleans, or objects).
  • All external data is stored and referenced from a single object, called the heap. Essentially this heap is a massive array (intended to be a typed array, which is highly optimized for performance). All data is stored within this array – effectively replacing global variables, data structures, closures, and any other forms of data storage.
  • When accessing and setting variables the results are consistently coerced into a specific type. For example f = e | 0; sets the variable f to equal the value of e but it also ensures that the result will be an integer (| 0 does this, converting a value into an integer). We also see this happening with floats – note the use of 0.0 and g[...] = +(...);.
  • Looking at the values coming in and out of the data structures it appears as if the data structure represented by the variable c is an Int32Array (storing 32-bit integers, the values always converted from or to an integer using | 0) and g is a Float32Array (storing 32-bit floats, the values always converted to a float by wrapping the value with +(...)).
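The coercion patterns in the list above can be sketched in a small, self-contained module. This is not the BananaBread code: the module name CoercionSketch and its four functions are hypothetical, and only the view names c and g are borrowed from the description above. It is written to follow the Asm.js conventions, but whether a particular engine accepts it for special compilation has not been verified here; either way it runs as ordinary JavaScript.

```javascript
// Hypothetical module for illustration only (not taken from BananaBread).
function CoercionSketch(stdlib, foreign, heap) {
  "use asm";
  // Two typed-array views over the same underlying heap buffer.
  var c = new stdlib.Int32Array(heap);   // 32-bit integers
  var g = new stdlib.Float32Array(heap); // 32-bit floats

  function storeInt(ptr, value) {
    ptr = ptr | 0;       // coerce the byte offset to an integer
    value = value | 0;   // coerce the value to an integer
    c[ptr >> 2] = value; // byte offset -> 4-byte element index
  }
  function loadInt(ptr) {
    ptr = ptr | 0;
    return c[ptr >> 2] | 0;
  }
  function storeFloat(ptr, value) {
    ptr = ptr | 0;
    value = +value;      // coerce the value to a floating-point number
    g[ptr >> 2] = value;
  }
  function loadFloat(ptr) {
    ptr = ptr | 0;
    return +g[ptr >> 2];
  }
  return {
    storeInt: storeInt, loadInt: loadInt,
    storeFloat: storeFloat, loadFloat: loadFloat
  };
}

var heap = new ArrayBuffer(0x10000); // 64 KiB heap
var m = CoercionSketch(
  { Int32Array: Int32Array, Float32Array: Float32Array }, {}, heap);
m.storeInt(0, 7.9);   // 7.9 | 0 truncates to 7
m.storeFloat(4, 1.5); // stored in the next 4-byte slot
```

Reading the values back with m.loadInt(0) and m.loadFloat(4) gives 7 and 1.5: the integer was truncated on the way in, and both values live in the same buffer at different byte offsets.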

By doing this the result is highly optimized and can be converted from this Asm.js syntax directly into assembly without having to interpret it, as one would normally have to do with JavaScript. It effectively shaves off a whole bunch of things that can make a dynamic language, like JavaScript, slow, such as the need for garbage collection and dynamic types.

For some more-explanatory Asm.js code let’s take a look at an example from the Asm.js specification:

function DiagModule(stdlib, foreign, heap) {
    "use asm";

    // Variable Declarations
    var sqrt = stdlib.Math.sqrt;

    // Function Declarations
    function square(x) {
        x = +x;
        return +(x*x);
    }

    function diag(x, y) {
        x = +x;
        y = +y;
        return +sqrt(square(x) + square(y));
    }

    return { diag: diag };
}

Looking at this module it seems downright understandable! From this code we can better understand the structure of an Asm.js module. A module is contained within a function and starts with the "use asm"; directive at the top. This gives the interpreter the hint that everything inside the function should be handled as Asm.js and be compiled to assembly directly.

Note, at the top of the function, the three arguments: stdlib, foreign, and heap. The stdlib object contains references to a number of built-in math functions. foreign provides access to custom user-defined functionality (such as drawing a shape in WebGL). And finally heap gives you an ArrayBuffer which can be viewed through a number of different lenses, such as Int32Array and Float32Array.
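To make the three arguments more concrete, here is a minimal sketch of a module that uses all of them. The names LinkSketch, rootOfFirst, and report are hypothetical and chosen only for illustration; report stands in for whatever user-defined function the page passes in through foreign.

```javascript
// Hypothetical module: every name here is illustrative, not from the spec.
function LinkSketch(stdlib, foreign, heap) {
  "use asm";
  var sqrt = stdlib.Math.sqrt;                 // built-in math function
  var report = foreign.report;                 // user-supplied callback (assumed)
  var floats = new stdlib.Float32Array(heap);  // one "lens" onto the heap

  function rootOfFirst() {
    var r = 0.0;
    r = +sqrt(+floats[0]); // read the first float from the heap
    report(+r);            // hand the result to ordinary JavaScript
    return +r;
  }
  return { rootOfFirst: rootOfFirst };
}

// Ordinary JavaScript sets up the heap and the foreign functions.
var heap = new ArrayBuffer(0x10000);  // 64 KiB ArrayBuffer
new Float32Array(heap)[0] = 16;       // write a value the module will read
var seen = [];
var mod = LinkSketch(
  { Math: Math, Float32Array: Float32Array },
  { report: function (v) { seen.push(v); } },
  heap
);
var result = mod.rootOfFirst(); // sqrt(16) is 4
```

The module never touches anything outside what it was handed: the math function arrives via stdlib, the callback via foreign, and all data via heap.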

The rest of the module is broken up into three parts: variable declarations, function declarations, and finally an object exporting the functions to expose to the user.

The export is an especially important point to understand as it allows all of the code within the module to be handled as Asm.js but still be made usable to other, normal, JavaScript code. Thus you could, theoretically, have some code that looks like the following, using the above DiagModule code:

document.body.onclick = function() {
    function DiagModule(stdlib){"use asm"; ... return { ... };}
    // The module returns an object of exports, so pull out the diag function
    var diag = DiagModule({ Math: Math }).diag;
    alert(diag(10, 100));
};

This would result in an Asm.js DiagModule that’s handled specially by the JavaScript interpreter but still made available to other JavaScript code (thus we could still access it and use it within a click handler, for example).

What is the performance like?

Right now the only implementation that exists is in nightly versions of Firefox (and even then, for only a couple of platforms). That being said, early numbers show the performance being really, really good. For complex applications (such as the above games) performance is only around 2x slower than normally-compiled C++ (which is comparable to other languages like Java or C#). This is substantially faster than current browser runtimes, yielding performance that’s about 4-10x faster than the latest Firefox and Chrome builds.

This is a substantial improvement over the current best case. Considering how early in its development Asm.js is, it’s very likely that there could be even greater performance improvements coming.

It is interesting to see such a large performance chasm appearing between Asm.js and the current engines in Firefox and Chrome. A 4-10x performance difference is substantial (this is in the realm of comparing these browsers to the performance of IE 6). Interestingly even with this performance difference many of these Asm.js demos are still usable on Chrome and Firefox, which is a good indicator for the current state of JavaScript engines. That being said their performance is simply not as good as the performance offered by a browser that is capable of optimizing Asm.js code.

Use Cases

It should be noted that almost all of the applications that are targeting Asm.js right now are C/C++ applications compiled to Asm.js using Emscripten. With that in mind the kind of applications that are going to target Asm.js, in the near future, are those that will benefit from the portability of running in a browser but which have a level of complexity in which a direct port to JavaScript would be infeasible.

So far most of the use cases have centered around code bases where performance is of the utmost importance, such as games, graphics, programming language interpreters, and libraries. A quick look through the Emscripten project list shows many projects which will be of instant use to many developers.

Asm.js Support

As mentioned before the nightly version of Firefox is currently the only browser that supports optimizing Asm.js code.

However it’s important to emphasize that Asm.js-formatted JavaScript code is still just JavaScript code, albeit with an important set of restrictions. For this reason Asm.js-compiled code can still run in other browsers as normal JavaScript code, even if that browser doesn’t support it.
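As a small illustration of that point, the specification’s DiagModule can be run in any JavaScript engine, with or without Asm.js support. The sketch below restates the module so that it is self-contained; the variable names exportsObj and hyp are my own, and the inputs 3 and 4 are chosen just to keep the arithmetic easy to check.

```javascript
// DiagModule restated from the Asm.js specification example.
function DiagModule(stdlib, foreign, heap) {
  "use asm";
  var sqrt = stdlib.Math.sqrt;

  function square(x) {
    x = +x;
    return +(x * x);
  }
  function diag(x, y) {
    x = +x;
    y = +y;
    return +sqrt(square(x) + square(y));
  }
  return { diag: diag };
}

// An engine that ignores "use asm" treats it as an ordinary string
// expression and runs the function body as plain JavaScript.
var exportsObj = DiagModule({ Math: Math });
var hyp = exportsObj.diag(3, 4); // sqrt(9 + 16) is 5
```

The result is the same either way; only the speed at which the engine gets there differs.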

The critical puzzle piece is the performance of that code: If a browser doesn’t support typed arrays or doesn’t specially-compile the Asm.js code then the performance is going to be much worse off. Of course this isn’t special to Asm.js: any browser that doesn’t have those features is likely also suffering in other ways.
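One way a page might react to this is with a plain feature check, sketched below. The helper names supportsTypedArrays and chooseMode, and the mode labels "full" and "reduced", are hypothetical. Note that this only checks for typed arrays; it makes no attempt to detect whether the engine specially compiles Asm.js, and I am not assuming any API for that.

```javascript
// Hypothetical helpers; neither name comes from the Asm.js spec.
function supportsTypedArrays(globalObject) {
  // Check for the constructors an Asm.js heap relies on.
  return typeof globalObject.ArrayBuffer !== "undefined" &&
         typeof globalObject.Int32Array !== "undefined" &&
         typeof globalObject.Float32Array !== "undefined";
}

function chooseMode(globalObject) {
  // "full" and "reduced" are illustrative labels for whatever the
  // application decides to do in each case.
  return supportsTypedArrays(globalObject) ? "full" : "reduced";
}

var mode = chooseMode(typeof globalThis !== "undefined" ? globalThis : this);
```

An application could use the returned mode to pick between its compute-heavy path and some lighter fallback.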

Asm.js and Web Development

As you can probably see from the code above Asm.js isn’t designed to be written by hand. It’s going to require some sort of tooling to write, and using it is going to require some rather drastic changes from how one would normally write JavaScript. The most common use case for Asm.js right now is in applications compiled from C/C++ to JavaScript. Almost none of these applications interact with the DOM in a meaningful way, beyond using WebGL and the like.

In order for it to be usable by regular developers there are going to have to be some intermediary languages that are more user-accessible and that can compile to Asm.js. The best candidate, at the moment, is LLJS, for which work is starting on compiling to Asm.js. It should be noted that a language like LLJS is still going to be quite different from regular JavaScript and will likely confuse many JavaScript users. Even with a nice more-user-accessible language like LLJS it’s likely that it’ll still only be used by hardcore developers who want to optimize extremely complex pieces of code.

Even with LLJS, or some other language, that could allow for more hand-written Asm.js code we still wouldn’t have an equally-optimized DOM to work with. The ideal environment would be one where we could compile LLJS code and the DOM together to create a single Asm.js blob which could be executed simultaneously. It’s not clear to me what the performance of that would look like but I would love to find out!

Q&A with David Herman

I sent some questions to David Herman (Senior Researcher at Mozilla Research) to try and get some clarification on how all the pieces of Asm.js fit together and how they expect users to benefit from it. He graciously took the time to answer the questions in-depth and provided some excellent responses. I hope you find them to be as illuminating as I did.

What is the goal of Asm.js? Who do you see as the target audience for the project?

Our goal is to make the open web a compelling virtual machine, a target for compiling other languages and platforms. In this first release, we’re focused on compiling low-level code like C and C++. In the longer run we hope to add support for higher-level constructs like structured objects and garbage collection. So eventually we’d like to support applications from platforms like the JVM and .NET. Since asm.js is really about expanding the foundations of the web, there’s a wide range of potential audiences. One of the audiences we feel we can reach now is game programmers who want access to as much raw computational power as they can. But web developers are inventive and they always find ways to use all the tools at their disposal in ways no one predicts, so I have high hopes that asm.js will become an enabling technology for all sorts of innovative applications I can’t even imagine.

Does it make sense to create a more user-accessible version of Asm.js, like an updated version of LLJS? What about expanding the scope of the project beyond just a compiler target?

Absolutely. In fact, my colleague James Long recently announced that he’s done an initial fork of LLJS that compiles to asm.js. My team at Mozilla Research intends to incorporate James’s work and officially evolve LLJS to support asm.js. In my opinion, you generally only want to write asm.js by hand in a very narrow set of instances, like any assembly language. More often, you want to use more expressive languages that compile efficiently to it. Of course, when languages get extremely expressive like JavaScript, you lose predictability of performance. (My friend Slava Egorov wrote a nice post describing the challenges of writing high-performance code in high-level languages.) LLJS aims for a middle ground — like a C to asm.js’s assembly — that’s easier to write than raw asm.js but has more predictable performance than regular JS. But unlike C, it still has smooth interoperability with regular JS. That way you can write most of your app in dynamic, flexible JS, and focus on only writing the hottest parts of your code in LLJS.

There is talk of a renewed performance divide between browsers that support Asm.js and browsers that don’t, similar to what happened during the last JavaScript performance race in 2008/2009. Even though technically Asm.js code can run everywhere in reality the performance difference will simply be too crippling for many cases. Given this divide, and the highly restricted subset of JavaScript, why did you choose JavaScript as a compilation target? Why JavaScript instead of a custom language or plugin?

First of all, I don’t think the divide is as stark as you’re characterizing it: we’ve built impressive demos that work well in existing browsers but will benefit from killer performance with asm.js. It’s certainly true that you can create applications that will depend on the increased performance of asm.js to be usable. At the same time, just like any new web platform capability, applications can decide whether to degrade gracefully with some less compute-intensive fallback behavior. There’s a difference in kind between an application that works with degraded performance and an application that doesn’t work at all. More broadly, keep in mind the browser performance race that started in the late 2000s was great for the web, and applications have evolved along with the browsers. I believe the same thing can and will happen with asm.js.

How would you compare Asm.js with Google’s Native Client? They appear to have similar goals while Asm.js has the advantage of “just working” everywhere that has JavaScript. Have there been any performance comparisons?

Well, Native Client is a bit different, since it involves shipping platform-specific assembly code; I don’t believe Google has advocated for that as a web content technology (as opposed to making it available to Chrome Web Store content or Chrome extensions), or at least not recently. Portable Native Client (PNaCl) has a closer goal, using platform-independent LLVM bitcode instead of raw assembly. As you say, the first advantage of asm.js is compatibility with existing browsers. We also avoid having to create a system interface and repeat the full surface area of the web APIs as the Pepper API does, since asm.js gets access to the existing APIs by calling directly into JavaScript. Finally, there’s the benefit of ease of implementability: Luke Wagner got our first implementation of OdinMonkey written and landed in Firefox in just a few months, working primarily by himself. Because asm.js doesn’t have a big set of syscalls and APIs, and because it’s built off of the JavaScript syntax, you can reuse a whole bunch of the machinery of an existing JavaScript engine and web runtime. We could do performance comparisons to PNaCl but it would take some work, and we’re more focused on closing the gap to raw native performance. We plan to set up some automated benchmarks so we can chart our progress compared with native C/C++ compilers.

Emscripten, another Mozilla project, appears to be the primary producer of Asm.js-compatible code. How much of Asm.js is being dictated by the needs of the Emscripten project? What benefits has Emscripten received now that improvements are being made at the engine level?

We used Emscripten as our first test case for asm.js as a way to ensure that it’s got the right facilities to accommodate the needs of real native applications. And of course benefiting Emscripten benefits everyone who has native applications they want to port — such as Epic Games, who we teamed up with to port the Unreal Engine 3 to the web in just a few days using Emscripten and asm.js. But asm.js can benefit anyone who wants to target a low-level subset of JavaScript. For example, we’ve spoken with the folks who build the Mandreel compiler, which works similarly to Emscripten. We believe they could benefit from targeting asm.js just as Emscripten has started doing. Alon Zakai has been compiling benchmarks that generally run around 2x slower than native, where we were previously seeing results anywhere from 5x to 10x or 20x of native. This is just in our initial release of OdinMonkey, the asm.js backend for Mozilla’s SpiderMonkey JavaScript engine. I expect to see more improvements in coming months.

How fluid is the Asm.js specification? Are you open to adding in additional features (such as more-advanced data structures) as more compiler authors begin to target it?

You bet. Luke Wagner has written up an asm.js and OdinMonkey roadmap on the Mozilla wiki, which discusses some of our future plans — I should note that none of these are set in stone but they give you a sense of what we’re working on. I’m really excited about adding support for ES6 structured objects. This would provide garbage-collected but well-typed data structures, which would help compilers like JSIL that compile managed languages like C# and Java to JavaScript. We’re also hoping to use something like the proposed ES7 value types to provide support for 32-bit floats, 64-bit integers, and hopefully even fixed-length vectors for SIMD support.

Is it possible, or even practical, to have a JavaScript-to-Asm.js transpiler?

Possible, yes, but practical? Unclear. Remember in Inception how every time you nested another dream-within-a-dream, time would slow down? The same will almost certainly happen every time you try to run a JS engine within itself. As a back-of-the-envelope calculation, if asm.js runs native code at half native speed, then running a JS engine in asm.js will execute JS code at half that engine’s normal speed. Of course, you could always try running one JS engine in a different engine, and who knows? Performance in reality is never as clear-cut as it is in theory. I welcome some enterprising hacker to try it! In fact, Stanford student Alex Tatiyants has already compiled Mozilla’s SpiderMonkey engine to JS via Emscripten — all you’d have to do is use Emscripten’s compiler flags to generate asm.js. Someone with more time on their hands than me should give it a try…

At the moment all DOM/browser-specific code is handled outside of Asm.js-land. What about creating an Emscripten-to-Asm.js-compiled version of the DOM (akin to DOM.js)?

This is a neat idea. It may be a little tricky with the preliminary version of asm.js, which doesn’t have any support for objects at all. As we grow asm.js to include support for ES6 typed objects, something like this could become feasible and quite efficient! A cool application of this would be to see how much of the web platform could be self-hosted with good performance. One of the motivations behind DOM.js was to see if a pure JS implementation of the DOM could beat the traditional, expensive marshaling/unmarshaling and cross-heap memory management between the JS heap and the reference-counted C++ DOM objects. With asm.js support, DOM.js might get those performance wins plus the benefits of highly optimized data structures. It’s worth investigating.

Given that it’s fairly difficult to write Asm.js, compared with normal JavaScript, what sorts of tools would you like to have to help both developers and compiler authors?

First and foremost we’ll need languages like LLJS, as you mentioned, to compile to asm.js. And we’ll have some of the usual challenges of compiling to the web, such as mapping generated code back to the original source in the browser developer tools, using technologies like source maps. I’d love to see source maps developed further to be able to incorporate richer debugging information, although there’s probably a cost/benefit balance to be struck between the pretty minimal source location information of source maps and super-complex debugging metadata formats like DWARF. For asm.js, I think we’ll focus on LLJS in the near term, but I always welcome ideas from developers about how we can improve their experience.

I assume that you are open to working with other browser vendors, what has collaboration or discussion been like thus far?

Definitely. We’ve had a few informal discussions and they’ve been encouraging so far, and I’m sure we’ll have more. I’m optimistic that we can work with multiple vendors to get asm.js somewhere that we all feel we can realistically implement without too much effort or architectural changes. As I say, the fact that Luke was able to implement OdinMonkey in a matter of just a few months is very encouraging. And I’m happy to see a bug on file for asm.js support in V8. More importantly, I hope that developers will check out asm.js and see what they think, and provide their feedback both to us and other browser vendors.
