The following post is cross-posted from AltDevBlogADay.
Value Type Semantics
Value type semantics are a common feature in static languages. Most static languages allow you to define custom types that inherit the semantics of the language’s built in primitive value types, so that your struct (or whatever your language happens to call it) behaves much like a single int or float would when being manipulated. This feature makes it possible for custom numeric types (like a fixed-point value, for example) or even composite types like vectors and matrices to follow the same rules.
Algorithms designed for value types are slower without them
Those of you who’ve worked with Java are probably familiar with this particular problem already: If you take an algorithm designed around the presence of value types and port it to a language without them, while you can easily preserve behavior by inserting copies in the correct places, you may find that your performance has gone through the floor. The reason for this is simple: In many runtime environments, the cost of an instance of a reference type is significantly larger than the cost of an instance of a value type.
The reasons for this cost tend to vary, but one common trait is that these algorithms produce tremendous amounts of garbage when run in a garbage-collected environment. Some of the more sophisticated garbage collectors provide mechanisms to improve performance for this kind of code, like generational collection and escape analysis, but the sad fact is that even the most sophisticated collectors available still suffer from visible pauses when the collector runs. These pauses not only impair the overall performance of your application, but in games, they cause painful framerate hitching and can even impair the playability of your title.
Another common source of higher costs for instances of reference types is inheritance. In particular, virtual inheritance. Virtual inheritance as implemented in most languages requires every object instance to have a few bytes devoted to identifying its type, via vtable pointers or the like. In some cases compilers can optimize this overhead out, but the overhead is usually there regardless of whether you use it, and in some languages, the default is to make everything virtual – so suddenly even your Vector3 object has a vtable pointer.
A third source of overhead is indirection – reference types almost always live in the application heap, which means they are allocated in a chunk of space dedicated to that particular instance. Even if you allocate an array of your Vector3s, reference type semantics mean that it is probably necessary for the heap to contain an allocation for each one, so that if half of them are destroyed, those resources can be freed and reused by some other data types. Not only does this mean that the simple act of creating that array has become more expensive, but it means that an array of values is no longer guaranteed to be contiguous in memory. Sure, your Vector3 is contiguous, but all it contains is object references. Hm, what’s that cache locality thing people were talking about again?
(Somewhat obvious pre-emptive nitpick: This last bit here doesn’t apply to C++. But you’re using C++, so you know this already, and you should probably go make sure MSVC hasn’t run out of memory while compiling that boost template instantiation. You know the one.)
Algorithms based on value types are fragile using reference types
Seems fine, right? But, the keen-eyed reader realizes that a copy is implied in the Add and Multiply operations, so for them to be correct, they must create and return new Vector3 instances. Let’s fix it so it doesn’t produce garbage.
Hm, we need a copy here. That sucks. We’ll just store an extra vector in the object so we don’t need to create garbage. All good, right?
A little ugly, perhaps, but not bad, right? The clarity suffers a little, but we could address that by naming variables more carefully, etc. Not horrible. But wait a second. We pulled the values position and velocity out of an object. If we just modified position and velocity, does that mean we’ve modified the state of the object it came from? Yes, yes it does. Well, we just need to emulate the semantics of value types, right?
Okay, there. It’s correct again now. But wait, garbage collection! Uh, crap. Okay, let’s remove the copies we don’t actually need… we’ll assume that the value of NextPosition isn’t being used, so we don’t need to copy it. And we’re not modifying Position so we don’t need that copy either…
Okay, life is good again. But a week later, you’re reading over commits (sneaky interns keep adding features behind your back, the nerve) and you see something that makes you spit up your coffee:
Luckily, as noted in the comment, the game isn’t using ZoomFactor yet, so it’s always 1. Nothing horrible has happened yet. But now you’re worried, and you’re starting to sweat a little: What other bugs has JDoe introduced that are just waiting to break your code once it hits production? Didn’t this guy develop payment processing systems at a bank? Are your savings in danger? Oh man, what if a bug in code he wrote allows Albanian hackers to steal that money you’ve been saving up for your little startup venture? You might be stuck working here for another 10 years!
Quality algorithms designed without value types are hard to find
So, let’s say you’ve calmed down a little bit after that panic attack earlier. JDoe’s not a bad guy, he’s just new at this. You’ll go through and review his code with him, make sure he understands what he did wrong, and then you’ll add another 5 pages to the company’s ever-growing coding standards document, right after you find a new bank. Life moves on, right?
While you’re doing the code review, though, you run across something that makes you pause. You notice that JDoe modified some code that was written by Chris Beardy, a wizardly programmer type who had already been working here for 8 years when you started out. Nobody really knows what happened to Beardy; one day he just stopped showing up. Until now, his code hadn’t been a problem, since most of it worked great, and the bits that stopped working you just threw out and rewrote (with variable names longer than two characters).
JDoe has done his best to modify the code, and while it seems to work, you realize you don’t know whether it’s correct. You could assume that JDoe messed up, but what if you’re wrong? How can you be sure the code is correct? You ask around the office to see if anyone has Beardy’s phone number, but nobody is even certain if he owned a phone. You try to contact his next of kin but all of their phone numbers bear international dialing codes and it’s 3am in Singapore. This is a problem.
After consulting with a respected colleague, you hit upon a solution: It may be true that neither of you know whether the code is correct, but you know what it was supposed to do. It turns out it’s based on some old academic paper, and the paper’s been cited a bunch of times, and other games use this same algorithm. You figure you can just read a couple of those papers and implement the algorithm yourself – hell, maybe you can find a good implementation out there on the internet that’s fit to use!
Soon after you begin reading the papers, panic begins to set in once again. You’re starting to wonder if that literature major was such a good idea, and trying to remember the stuff you learned in the few math classes you did take. You’re not entirely convinced that these symbols used for variable names exist in any alphabet, and not even your coworkers know what it means when the author writes them upside down. You abandon your paper-reading expedition, reminding yourself that code reuse is better anyway, and as a good engineer you should avoid Not Invented Here syndrome by adopting someone else’s battle-tested implementation…
Now the horror begins. You narrow down the few available implementations for your programming language, rule out the ones that depend on libraries you can’t possibly bundle in, and check with Legal to figure out which specific open-source licenses don’t cause them to flee in terror. You scan over some documentation and things don’t seem too bad. This one is enterprise ready, and you can get a support contract! This one is written using Inversion of Control and Dependency Injection, whatever those are – you think you might have heard someone say something good about them once. You’re feeling pretty okay about things until you pop open the source code. Why are half of the comments in Japanese? This class name is 50 characters long and you’re pretty sure all it’s responsible for is adding numbers. The constructor for one of the key objects creates no less than 8 abstract factories, and two of those only exist to create other abstract factories…
While you start updating your resume, you start to wonder whether you should consider another line of work…
(Now, let’s be completely fair: This one applies to pretty much any programming language, value types or not. But I’ve never been the type of person to let reality interrupt a good rant, so the disclaimer sits here at the end.)
Now, if you’ve read all this you might be thinking, ‘hey, I’m really offended by that comment about Albanians. What’s wrong with this guy?’. And you’re right, that was kind of offensive. I’m sorry.
These problems are not insurmountable. In my particular case, since I’m writing a compiler, instead of porting by hand or writing JS from scratch, I can solve these problems using verifiable code, instead of trying to get it right in my head. Despite this, fixing some of these problems involves writing some pretty hairy code – and if you’re like me and your Computer Science background is a little – let’s be generous here and call it ‘weak’ – you may find things like performing escape analysis on functions containing goto statements a tiny bit daunting.
When I finally got my translator working on some real-world game demos after a month or two of hacking, I was really excited; the framerate was bad, but I hadn’t done any optimization, so that was to be expected. When I finally ran a demo through a profiler, though, my heart sank a little: Wait, I’m spending 35% of my CPU time in the garbage collector? And another 20% running the constructor for Vector3? I realized that while I had anticipated the challenges in translating working code into another language, I had not anticipated the challenges involved in translating performant code from one language into performant code in another language. So, I hope that these few examples I’ve provided might give you a little insight.