The following post is cross-posted from AltDevBlogADay.
For the past few months, I’ve been working on a tool to compile .NET applications into JavaScript. While there’s a lot of work involved, a large chunk of that is caused by static language idioms that cannot easily be expressed in JavaScript. Even when coding by hand, some of these patterns are difficult to turn into JS that performs well, works correctly, and is easy to understand. In this post, I’ll begin describing some of the challenges involved.
Value Type Semantics
Value type semantics are a common feature in static languages. Most static languages allow you to define custom types that inherit the semantics of the language’s built in primitive value types, so that your struct (or whatever your language happens to call it) behaves much like a single int or float would when being manipulated. This feature makes it possible for custom numeric types (like a fixed-point value, for example) or even composite types like vectors and matrices to follow the same rules.
At first glance, the lack of this feature might not seem significant – especially if you’re used to object oriented programming, it would seem like translating code dependent on these semantics is not an intractable problem. While it’s certainly not impossible, it has a number of downsides that surprised me, especially once I started profiling real-world code in modern JavaScript implementations:
Algorithms designed for value types are slower without them
Those of you who’ve worked with Java are probably familiar with this particular problem already: If you take an algorithm designed around the presence of value types and port it to a language without them, while you can easily preserve behavior by inserting copies in the correct places, you may find that your performance has gone through the floor. The reason for this is simple: In many runtime environments, the cost of an instance of a reference type is significantly larger than the cost of an instance of a value type.
The reasons for this cost tend to vary, but one common trait is that these algorithms produce tremendous amounts of garbage when run in a garbage-collected environment. Some of the more sophisticated garbage collectors provide mechanisms to improve performance for this kind of code, like generational collection and escape analysis, but the sad fact is that even the most sophisticated collectors available still suffer from visible pauses when the collector runs. These pauses not only impair the overall performance of your application, but in games, they cause painful framerate hitching and can even impair the playability of your title.
Another common source of higher costs for instances of reference types is inheritance. In particular, virtual inheritance. Virtual inheritance as implemented in most languages requires every object instance to have a few bytes devoted to identifying its type, via vtable pointers or the like. In some cases compilers can optimize this overhead out, but the overhead is usually there regardless of whether you use it, and in some languages, the default is to make everything virtual – so suddenly even your Vector3 object has a vtable pointer.
A third source of overhead is indirection – reference types almost always live in the application heap, which means they are allocated in a chunk of space dedicated to that particular instance. Even if you allocate an array of your Vector3s, reference type semantics mean that it is probably necessary for the heap to contain an allocation for each one, so that if half of them are destroyed, those resources can be freed and reused by some other data types. Not only does this mean that the simple act of creating that array has become more expensive, but it means that an array of values is no longer guaranteed to be contiguous in memory. Sure, your Vector3[] is contiguous, but all it contains is object references. Hm, what’s that cache locality thing people were talking about again?
(Somewhat obvious pre-emptive nitpick: This last bit here doesn’t apply to C++. But you’re using C++, so you know this already, and you should probably go make sure MSVC hasn’t run out of memory while compiling that boost template instantiation. You know the one.)
Algorithms based on value types are fragile using reference types
Despite the above problems, Java developers are able to produce quality software (depending on who you ask, anyway). The same is true for JavaScript: These algorithms can, with effort, be made to work, and even work well. Unfortunately, there is a significant cost involved. You have to rework your code in order to reduce the amount of garbage it creates, and avoid patterns that naturally require the creation of garbage. For example, in a language like C++, your game probably relies on a math library, featuring structured value types with names like Vector3 and Matrix. If you’re one of those people who likes operator overloading, you’ve probably overloaded various operators for those structures, so that you can write things like this:
The above code is simple and self-explanatory to someone familiar with the basics of algebra (and those of you who abstain from operator overloading have probably written similar code using functions instead of operators). As mentioned above, the performance of such algorithms will suffer badly in a garbage collected environment, but that’s not the worst part. First of all, to correctly translate these algorithms, you must carefully locate all the implied copies contained within the code. As written, the above algorithm is correct even without value types. Let’s convert it to valid JavaScript:
Seems fine, right? But, the keen-eyed reader realizes that a copy is implied in the Add and Multiply operations, so for them to be correct, they must create and return new Vector3 instances. Let’s fix it so it doesn’t produce garbage.
Hm, we need a copy here. That sucks. We’ll just store an extra vector in the object so we don’t need to create garbage. All good, right?
A little ugly, perhaps, but not bad, right? The clarity suffers a little, but we could address that by naming variables more carefully, etc. Not horrible. But wait a second. We pulled the values position and velocity out of an object. If we just modified position and velocity, does that mean we’ve modified the state of the object it came from? Yes, yes it does. Well, we just need to emulate the semantics of value types, right?
Okay, there. It’s correct again now. But wait, garbage collection! Uh, crap. Okay, let’s remove the copies we don’t actually need… we’ll assume that the value of NextPosition isn’t being used, so we don’t need to copy it. And we’re not modifying Position so we don’t need that copy either…
Okay, life is good again. But a week later, you’re reading over commits (sneaky interns keep adding features behind your back, the nerve) and you see something that makes you spit up your coffee:
Luckily, as noted in the comment, the game isn’t using ZoomFactor yet, so it’s always 1. Nothing horrible has happened yet. But now you’re worried, and you’re starting to sweat a little: What other bugs has JDoe introduced that are just waiting to break your code once it hits production? Didn’t this guy develop payment processing systems at a bank? Are your savings in danger? Oh man, what if a bug in code he wrote allows Albanian hackers to steal that money you’ve been saving up for your little startup venture? You might be stuck working here for another 10 years!
Quality algorithms designed without value types are hard to find
So, let’s say you’ve calmed down a little bit after that panic attack earlier. JDoe’s not a bad guy, he’s just new at this. You’ll go through and review his code with him, make sure he understands what he did wrong, and then you’ll add another 5 pages to the company’s ever-growing coding standards document, right after you find a new bank. Life moves on, right?
While you’re doing the code review, though, you run across something that makes you pause. You notice that JDoe modified some code that was written by Chris Beardy, a wizardly programmer type who had already been working here for 8 years when you started out. Nobody really knows what happened to Beardy; one day he just stopped showing up. Until now, his code hadn’t been a problem, since most of it worked great, and the bits that stopped working you just threw out and rewrote (with variable names longer than two characters).
JDoe has done his best to modify the code, and while it seems to work, you realize you don’t know whether it’s correct. You could assume that JDoe messed up, but what if you’re wrong? How can you be sure the code is correct? You ask around the office to see if anyone has Beardy’s phone number, but nobody is even certain if he owned a phone. You try to contact his next of kin but all of their phone numbers bear international dialing codes and it’s 3am in Singapore. This is a problem.
After consulting with a respected colleague, you hit upon a solution: It may be true that neither of you know whether the code is correct, but you know what it was supposed to do. It turns out it’s based on some old academic paper, and the paper’s been cited a bunch of times, and other games use this same algorithm. You figure you can just read a couple of those papers and implement the algorithm yourself – hell, maybe you can find a good implementation out there on the internet that’s fit to use!
Soon after you begin reading the papers, panic begins to set in once again. You’re starting to wonder if that literature major was such a good idea, and trying to remember the stuff you learned in the few math classes you did take. You’re not entirely convinced that these symbols used for variable names exist in any alphabet, and not even your coworkers know what it means when the author writes them upside down. You abandon your paper-reading expedition, reminding yourself that code reuse is better anyway, and as a good engineer you should avoid Not Invented Here syndrome by adopting someone else’s battle-tested implementation…
Now the horror begins. You narrow down the few available implementations for your programming language, rule out the ones that depend on libraries you can’t possibly bundle in, and check with Legal to figure out which specific open-source licenses don’t cause them to flee in terror. You scan over some documentation and things don’t seem too bad. This one is enterprise ready, and you can get a support contract! This one is written using Inversion of Control and Dependency Injection, whatever those are – you think you might have heard someone say something good about them once. You’re feeling pretty okay about things until you pop open the source code. Why are half of the comments in Japanese? This class name is 50 characters long and you’re pretty sure all it’s responsible for is adding numbers. The constructor for one of the key objects creates no less than 8 abstract factories, and two of those only exist to create other abstract factories…
While you start updating your resume, you start to wonder whether you should consider another line of work…
(Now, let’s be completely fair: This one applies to pretty much any programming language, value types or not. But I’ve never been the type of person to let reality interrupt a good rant, so the disclaimer sits here at the end.)
In Conclusion
Now, if you’ve read all this you might be thinking, ‘hey, I’m really offended by that comment about Albanians. What’s wrong with this guy?’. And you’re right, that was kind of offensive. I’m sorry.
You also might be thinking: ‘Wow, this is a really negative perspective on things. Can life really be that bad writing JavaScript? I wrote a webpage once, and it was pretty okay. Google writes whole applications in it and they seem to be making tons of money’. You’re also right! Maybe not as right as that other guy, but hey. I said I’m sorry.
These problems are not insurmountable. In my particular case, since I’m writing a compiler, instead of porting by hand or writing JS from scratch, I can solve these problems using verifiable code, instead of trying to get it right in my head. Despite this, fixing some of these problems involves writing some pretty hairy code – and if you’re like me and your Computer Science background is a little – let’s be generous here and call it ‘weak’ – you may find things like performing escape analysis on functions containing goto statements a tiny bit daunting.
When I finally got my translator working on some real-world game demos after a month or two of hacking, I was really excited; the framerate was bad, but I hadn’t done any optimization, so that was to be expected. When I finally ran a demo through a profiler, though, my heart sank a little: Wait, I’m spending 35% of my CPU time in the garbage collector? And another 20% running the constructor for Vector3? I realized that while I had anticipated the challenges in translating working code into another language, I had not anticipated the challenges involved in translating performant code from one language into performant code in another language. So, I hope that these few examples I’ve provided might give you a little insight.



#1 by Mike Shaver on June 18, 2011 - 5:02 am
Quote
Great post, Kevin. Happily we don’t have to worry about goto in JS (except maybe in an Opera extension?), but the GC and value-type issues are definitely key to some of these performance issues.
The bump allocator and generation collector that are under development should help with the former, and you might test with Chrome to see how that aspect varies, since they have something similar to what we’re building.
For value types, it looks like this ES.next (ES6 or whatever we end up calling it) proposal will help especially in these cases: http://wiki.ecmascript.org/doku.php?id=harmony:binary_data
I think optimizers should be doing a lot of the escape analysis and so forth for you, and they’ll be able to see really well into various binary data/value type users. You might send a profile to bhackett, or just talk him through your findings on Monday, if you can stand the walk to his desk.
#2 by Kael on June 18, 2011 - 6:06 am
Quote
Thanks for the comment, Mike.
I didn’t know about the bump collector, so I’ll have to dig into that. The ’35% in garbage collector’ statistic is actually from Chrome’s generational collector, but I believe there’s a lot of improvement to be made there from simply generating better JavaScript (by eliminating unnecessary copies, etc.)
Binary data kicks ass. Can’t wait to be able to use it. In particular, its support for contiguous arrays of binary data means that the ‘data oriented design’ approaches often used to improve locality in C++ will finally be applicable in JavaScript, which should help a lot for performance sensitive code.
On the topic of escape analysis I’m actually using this compiler as a testing ground to figure out how best to apply it to this problem. I want to try and apply those lessons to optimize reference type semantics in SpiderMonkey once I have a handle on the problem.
Pingback: Kevin Gadd: Performance Challenges in Compiling Code to JavaScript | Firefox Latest News
#3 by azakai on June 18, 2011 - 9:31 am
Quote
Most of these issues can be solved if you compile something lower-level into JavaScript. If you compile something like LLVM assembly, you end up with no GC at all in your generated code. There is also no suffering from constructors, non-contiguous allocation, inheritance, etc. And you’re not passing around references all the time, you’re passing around plain integers (an early version of Emscripten passed around references – performance was pretty bad).
The downside of course is that the generated code is complete gibberish. And connecting it to normal handwritten code is not trivial.
On the benchmarks I’ve tested so far, LLVM compiled to JS using Emscripten is in the range of 3-10x slower than gcc -O3 (on JM TI and V8). (That’s without typed arrays, which in theory JS engines should be able to optimize for even better performance.)
What performance are you seeing with your compiler? I’d be interested to see some numbers comparing it to .NET and Mono. Would also be interesting to do a comparison of it to a lower-level approach (compiling .NET to JS using Mono’s LLVM backend+Emscripten).
#4 by Kael on June 18, 2011 - 9:45 am
Quote
I haven’t done extensive comparative benchmarking, but a few of my test cases are pulled from that old alioth benchmark shootout, and on the ones that aren’t heavily dependent on value type semantics, Spidermonkey (with -j and -m) performs on par with debug-mode C#. In some cases, it actually performs slightly better. I’m not sure how much of the difference is due to startup overhead for the C# jit or startup overhead for SM’s tracer/jit, though.
Compiling .NET using the Mono LLVM backend + Emscripten would be an interesting experiment. I think struct based code would probably perform a lot better, but on the other hand, I think you’d end up with the Mono Boehm GC running inside the Spidermonkey GC, and that seems like it would be suboptimal to say the least…
My translation approach could probably also be used to generate Emscripten style code as well, if you got serious about going with a similar ‘array representing memory, no reference types’ model. It would be interesting to see if that would produce better performance, but the need to preserve C# class garbage collection semantics seems like it would prevent it from being truly useful.
My personal perspective is that if you’re producing javascript, readability and debugger friendliness need to be a primary priority – at least until tools like Firebug and the Chrome debugger understand the concept of JavaScript as an intermediate language, and can map compiled JavaScript back to the original source code. It’s already so hard to debug cross-browser javascript that the idea of asking people to debug complete gibberish seems likely to send them screaming into the arms of Flash or Silverlight.
#5 by azakai on June 18, 2011 - 2:49 pm
Quote
> My personal perspective is that if you’re producing javascript, readability and debugger friendliness need to be a primary priority – at least until tools like Firebug and the Chrome debugger understand the concept of JavaScript as an intermediate language, and can map compiled JavaScript back to the original source code.
SpiderMonkey should be able to do this already, with syntax like
//@line 7 “srcFile.c”
I haven’t tried it out myself though.