Archive for June, 2009

Water

Among the various things I worked on this week, I spent a good portion of time implementing a basic fluid system in the game engine. The fluid system lets me create volumes of moving fluid that alter the player’s physics and can flow down off ledges and interact with objects, so that levels like the aqueducts look and feel more convincing.

The two major pieces of the fluid system are physics and rendering. The physics system is arguably more important, but ended up mostly being a matter of fine-tuning. Here’s how it works:
Previously, every frame I did a sweep of the area below onscreen entities to locate the nearest surface to stand on, and used that to perform some motion and collision detection calculations so that the player can properly move up/down slopes, stand on moving objects, and drop when standing over a gap.

To implement the water system, I extended the standing sweep code to also locate the nearest volume of fluid that intersects the entity, and store it. What this allows me to do is detect whether or not the player character is currently inside a fluid, regardless of whether or not he’s standing on a surface at the time (so it even works while jumping/falling).
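The fluid lookup described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the game's own C#/XNA code; the `Box` type, the `find_fluid` helper, and the rule used to pick the "nearest" volume (closest fluid surface to the entity's feet) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle; y grows downward, as in most 2D engines."""
    left: float
    top: float
    right: float
    bottom: float

    def intersects(self, other: "Box") -> bool:
        return (self.left < other.right and other.left < self.right and
                self.top < other.bottom and other.top < self.bottom)


def find_fluid(entity: Box, fluids: Sequence[Box]) -> Optional[Box]:
    """Return the intersecting fluid volume whose surface is nearest the
    entity's feet, or None when the entity is dry."""
    touching = [f for f in fluids if f.intersects(entity)]
    if not touching:
        return None
    return min(touching, key=lambda f: abs(f.top - entity.bottom))


player = Box(10, 40, 20, 60)          # mid-jump, not standing on anything
pool = Box(0, 50, 100, 80)            # overlaps the player's legs
far_pool = Box(200, 50, 300, 80)      # nowhere near the player

current_fluid = find_fluid(player, [far_pool, pool])
```

Because the test is a plain overlap check, it gives the same answer whether or not the entity is standing on a surface at the time.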

Once I had this functioning, I added the ability to attach ‘Material’ information to any piece of geometry in the game, including fluids. These two pieces of functionality combined allow me to control the physical attributes of a surface, and then further alter those attributes when the player is standing inside water or other fluids. This not only allows me to implement slippery surfaces like ice, but also lets me make the player move more slowly and have different control characteristics when in the water.
In the future, this should also allow me to alter the characteristics of surfaces in response to gameplay events – enemies that spray sticky goo on the ground, spider webs that slow you down, or a potion that freezes water, turning it into slippery ice you can stand on.
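One way to picture the combination of surface and fluid materials is sketched below in Python. The field names (`friction`, `speed_scale`, `accel_scale`), the example values, and the choice to multiply attributes together are assumptions for illustration, not the engine's actual design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Material:
    """Hypothetical physics attributes; the real engine's fields are unknown."""
    friction: float = 1.0        # 1.0 = normal grip, lower = more slippery
    speed_scale: float = 1.0     # multiplier on the player's top speed
    accel_scale: float = 1.0     # multiplier on how fast the player speeds up


STONE = Material()
ICE = Material(friction=0.1)
WATER = Material(speed_scale=0.6, accel_scale=0.5)


def effective_material(surface: Optional[Material],
                       fluid: Optional[Material]) -> Material:
    """Start from the surface underfoot (or defaults when airborne), then
    let the surrounding fluid scale those attributes further."""
    base = surface or Material()
    if fluid is None:
        return base
    return Material(
        friction=base.friction * fluid.friction,
        speed_scale=base.speed_scale * fluid.speed_scale,
        accel_scale=base.accel_scale * fluid.accel_scale,
    )


wading_on_ice = effective_material(ICE, WATER)
jumping_in_water = effective_material(None, WATER)
```

Keeping materials as plain data also suits the gameplay-driven changes mentioned above: a freezing potion would only need to swap the material attached to a piece of geometry.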

After a bit of fine-tuning, I came up with some physics parameters for the basic material types that I’m happy with, that make the differences between various materials obvious without breaking gameplay mechanics. The jumping mechanics were a particularly tough detail – even minor changes to the acceleration and speed characteristics of the player render some of my jumping puzzle prototypes completely impossible to solve, so I had to carefully test against those puzzle prototypes each time I adjusted the physics values for a given material. In the final game, I’m going to have to address this by only using a specific set of materials in each level, so that it’s not necessary to retest the entire game after physics changes.
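To see why small physics changes can break a jumping puzzle, consider a simple projectile model: a jump with initial upward speed v and gravity g peaks at v²/(2g) and stays airborne for 2v/g. The Python sketch below checks a set of puzzles against those limits; the model, the puzzle names, and all numbers are hypothetical and much simpler than a real platformer's jump.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JumpPhysics:
    jump_speed: float   # initial upward speed, units/s
    gravity: float      # downward acceleration, units/s^2
    run_speed: float    # horizontal speed while airborne, units/s

    def max_height(self) -> float:
        return self.jump_speed ** 2 / (2 * self.gravity)

    def max_distance(self) -> float:
        # Time to go up and come back down to the take-off height.
        air_time = 2 * self.jump_speed / self.gravity
        return self.run_speed * air_time


def unsolvable(physics, puzzles):
    """Names of puzzles whose required jump exceeds what the physics allow."""
    return [name for name, (gap, ledge) in puzzles.items()
            if gap > physics.max_distance() or ledge > physics.max_height()]


# Hypothetical puzzle requirements: (gap width, ledge height) in world units.
PUZZLES = {"aqueduct_gap": (190.0, 0.0), "tower_ledge": (40.0, 75.0)}

normal = JumpPhysics(jump_speed=400.0, gravity=1000.0, run_speed=250.0)
tweaked = JumpPhysics(jump_speed=400.0, gravity=1000.0, run_speed=230.0)

broken_before = unsolvable(normal, PUZZLES)
broken_after = unsolvable(tweaked, PUZZLES)
```

Here an 8% drop in run speed shortens the longest jump from 200 to 184 units, which is enough to make the 190-unit gap impossible.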

Of course, water isn’t particularly interesting if you can’t see it, so the other thing I did was build a fairly simple bit of rendering that allows fluids to flow across surfaces and down off ledges.



Content Pipeline integration and deployment

During this past week, one of the problems I tackled was finding a way to deploy builds of my game. When dealing with deployment, an important principle is that the process should be as close to a single click as possible – a complex deployment process discourages you from getting customer feedback, and increases the likelihood that you’ll make a mistake and end up with a failed deployment, which wastes valuable time.

For most of my previous projects, I’ve tended to take either a hands-on approach to deployment, or a completely hands-off one, either building an installer by hand using tools like NSIS, or deploying the project as a ZIP file full of binaries and expecting the end user to install the necessary dependencies and figure out how to get the program running. Neither extreme is ideal, really (though hands-off deployment can be amazing if you manage to set things up such that all the end user has to do is run your game – no install necessary).



Tiled map loader for XNA

Update: Stephen Belanger of Nerd Culture has made some great improvements to the following code, so I encourage you to check out the post about it on his blog.

Early in the development of my game, I used the free and open-source Tiled Map Editor to create levels. It was a big time-saver since it let me worry about more important things instead of investing effort into being able to place tiles down on a map. Later on I decided that the traditional approach to map construction wasn’t right for my project – but I was still glad I’d used Tiled.

Recently I realized that there aren’t very many easy ways for newbie XNA developers to get maps into their games, so I decided it was worth packaging up my Tiled map loader and sharing it with the world. So, I’ve created a simple example that shows how to load Tiled maps in your XNA game on Windows PCs and the XBox 360, and included the loader with it. It’s open-source and free for your use, no strings attached. I hope you find it helpful.


Download source code and binaries

Note that it doesn’t have support for isometric tiles or embedded tilesets, because I had no use for either feature. Tiled’s file format is relatively simple, so if you need those features, it should be straightforward to add them.
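To give a feel for how simple the format is, here is a minimal Python sketch that reads tile layers from a small hand-written TMX document. It is not the loader from the download (which is C#), and it assumes the plain XML tile encoding with one `<tile gid="..."/>` element per cell; maps saved with encoded or compressed layer data would need extra decoding that this sketch deliberately rejects.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written map in Tiled's TMX format, using the plain XML
# tile encoding (one <tile gid="..."/> element per cell).
TMX = """<map version="1.0" orientation="orthogonal" width="3" height="2"
     tilewidth="32" tileheight="32">
  <tileset firstgid="1" name="terrain" tilewidth="32" tileheight="32">
    <image source="terrain.png" width="64" height="32"/>
  </tileset>
  <layer name="ground" width="3" height="2">
    <data>
      <tile gid="1"/><tile gid="2"/><tile gid="1"/>
      <tile gid="0"/><tile gid="2"/><tile gid="0"/>
    </data>
  </layer>
</map>
"""


def load_layers(tmx_text):
    """Return {layer name: rows of global tile IDs}; gid 0 means empty."""
    root = ET.fromstring(tmx_text)
    if root.get("orientation") != "orthogonal":
        raise ValueError("only orthogonal maps are handled in this sketch")
    layers = {}
    for layer in root.findall("layer"):
        width = int(layer.get("width"))
        data = layer.find("data")
        if data.get("encoding") is not None:
            raise ValueError("encoded/compressed layer data is not handled")
        gids = [int(tile.get("gid")) for tile in data.findall("tile")]
        layers[layer.get("name")] = [gids[i:i + width]
                                     for i in range(0, len(gids), width)]
    return layers


layers = load_layers(TMX)
```

Each global tile ID maps back to a tileset through that tileset's `firstgid`, so drawing a layer comes down to walking the rows and looking up source rectangles.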

And of course, this wouldn’t be possible without the generous contributions of the developers of Tiled, Adam Turk and Bjørn Lindeijer. If you’d like to try it out, you can download it from their website (note: requires Java).


Asynchronous programming for XNA games with Squared.Task

A recent post on Shawn Hargreaves’ blog reminded me that I never got around to sharing my technique for interacting with the XNA Guide API and doing other asynchronous operations during my game’s execution. It’s a simple application of some of the cooperative-threading techniques I’ve blogged about previously, and makes what would otherwise be a somewhat painful exercise relatively simple in practice. So, for those who are working on XNA games or just curious, I’ll go into detail on the technique here and share the source code with you!

The basic goal here is to be able to write game code in an imperative, sequential manner, without having to deal with locks, callbacks, polling, or race conditions. You rarely achieve perfection when dealing with concurrency, but even getting within sight of perfection can be worth the effort if it means you don’t have to spend time pulling your hair out trying to reproduce a rare threading bug.

In my game, any operation that can be performed off the main thread is done without blocking the main thread: loading content, loading files, saving games, loading games, interacting with the guide, etc. This means that I can seamlessly load things in the background while the game is running without any noticeable stuttering. Normally, you’d need to do lots of juggling to handle this correctly – locking, threads, explicit synchronization – but thanks to cooperative threading, I don’t have to deal with any of that. The techniques I describe in this post are the same basic techniques I’m using for my game. Hopefully, that means they should work for yours too (though unfortunately that means if there are bugs in them, you’ve probably just found a bug in my game… whoops.)

Unless you’re working on an XNA game that uses networking (which, sadly, I am not, so I can’t speak to the difficulties there), the biggest concurrency-related issues you’re likely to deal with are twofold:

  • Interacting with the XNA Guide APIs
  • Performing I/O

By the end of this article I’ll have shown you how you can tackle these problems in your XNA games without having to deal directly with threads or synchronization, in a simple, imperative manner that works equally well on the PC and XBox 360.


The Guide APIs are almost all designed in a manner that requires use of either threads, callbacks/continuations, or polling, because they come as Begin/End pairs that require management of ordering and state beyond a single frame. This is for the most part a necessity, as detailed in the blog post I mentioned above. So instead of trying to find a workaround, the best course of action is to simplify the process of working with those APIs – if possible, without introducing new issues or restricting functionality.

I/O is a more interesting problem; even when writing desktop applications, you often perform large I/O operations that have the potential to block, and in those cases, a well-written application uses concurrency techniques – threading, callbacks, etc – to avoid ‘hanging’ and frustrating the user. However, on almost any modern desktop machine, you can typically get away with tiny I/O operations if done correctly – a good example is Firefox 3, which still regularly performs short disk operations on its UI thread, which can cause it to hang if your machine is under extreme I/O load, but works fine in almost every other situation.

The XBox 360 makes I/O particularly challenging because you can’t rely on latency and throughput characteristics that you might be used to on a desktop machine: Not only do you have to deal with the downsides inherent in hard disks, but you also have to deal with the possibility of a player using a memory card, and even worse, the possibility of the available storage devices changing while your game is running (due to memory card insertion/removal, etc). This means that doing I/O in the main thread is pretty much a non-starter. You’re stuck: It’s concurrency time.

Luckily, cooperative threading provides a great solution for tackling both of these issues: It lets you reason about your problems in a manner you’re familiar with, and solve them by writing imperative code that still behaves correctly in tough situations, at the expense of some slight overhead and minor changes to your game code. You don’t have to build a tremendously complex state machine (since the C# compiler can be convinced to do the heavy lifting for us), and you don’t have to worry about locking and synchronization since all your code is guaranteed to run sequentially in the same thread, at the appropriate time.
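The general shape of the idea can be sketched in Python, whose generators play the same role as the C# iterator methods alluded to above. This is not Squared.Task or its API; the `Scheduler` class, `slow_read`, and `load_game` are hypothetical names invented for the illustration. A task yields a future to pause itself, the slow work runs on a worker thread, and the task is resumed on the main thread in a later frame.

```python
import concurrent.futures
import time


class Scheduler:
    """Runs generator-based tasks on the calling (main) thread. A task may
    yield a Future to sleep until it completes, or None to wait one frame."""

    def __init__(self):
        self._ready = []      # (task, value to send into it)
        self._waiting = []    # (task, future it is blocked on)

    def start(self, task):
        self._ready.append((task, None))

    def step(self):
        """Call once per frame, e.g. from the game's update function."""
        # Wake tasks whose futures finished since the previous frame.
        still_waiting = []
        for task, future in self._waiting:
            if future.done():
                self._ready.append((task, future.result()))
            else:
                still_waiting.append((task, future))
        self._waiting = still_waiting

        ready, self._ready = self._ready, []
        for task, value in ready:
            try:
                yielded = task.send(value)
            except StopIteration:
                continue                      # task ran to completion
            if isinstance(yielded, concurrent.futures.Future):
                self._waiting.append((task, yielded))
            else:
                self._ready.append((task, None))

    @property
    def idle(self):
        return not self._ready and not self._waiting


def slow_read(name):
    time.sleep(0.05)              # stand-in for a slow storage device
    return f"contents of {name}"


log = []
pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)


def load_game():
    log.append("loading")
    data = yield pool.submit(slow_read, "save.dat")   # main thread keeps going
    log.append(data)


scheduler = Scheduler()
scheduler.start(load_game())
frames = 0
while not scheduler.idle:          # stand-in for the game loop
    scheduler.step()
    frames += 1
    time.sleep(0.001)
pool.shutdown()
```

`load_game` reads as straight-line code, yet the loop keeps producing frames while the read is in flight. Since every task body runs on the main thread, the task code itself needs no locks.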



Cutting and tuning

As far as gameplay goes, the only major addition since last week’s post was a relatively complete implementation of player death, along with the ‘reunion’ teleport that goes with it. Fairly simple at present, with some bugs to work out (including one related to the teleport location that you’ll see in the video below). Definitely helps get a better feeling for whether a given puzzle is too hard or too easy.

Other than that, the only thing of note code-wise is the time I spent doing some performance tuning. My framerate had crept down over the past month or so, to the point that I wasn’t able to maintain a stable 60fps on the 360 anymore and my CPU utilization was approaching 30% on my desktop PC. Some of the optimizations were relatively obvious – for example, I was calling SpriteBatch.Begin/SpriteBatch.End for every on-screen object, for the sake of simplicity, which resulted in a lot of unnecessary draw calls.

Some simple changes to automatically begin/end batches when changing rendering settings reduced the number of draw calls per frame in most cases to around 30, which is perfectly acceptable, and reduced my CPU utilization by around 50%. After that, the rest of the optimization was pretty trivial – finding other hotspots in my profiler data and reducing the cost.

Well, it was trivial until I ran the game on the 360 again and noticed that the framerate hadn’t improved very much. Huh? I doubled my framerate on the PC, but on the 360, it barely moved an inch. What’s the deal?

Turns out, the 360’s pitiful floating-point performance was kneecapping me. Believe it or not, the primary culprit was the geometric shapes in the game’s HUD, for the circular health displays you may have seen in previous screenshots/videos. I knew the 360’s FPU was weak, but not THAT bad. Unfortunately, the only way to detect this is by manually measuring the performance cost of your code on the 360, by commenting out/toggling individual sections of your game code.

Tedious, at best. For now, I ended up just reducing the complexity of the geometry for the HUD elements and reducing the number of shapes I was drawing, which brought the 360 framerate much closer to what it used to be. As it happens, some design changes later in the week helped here too…


The majority of my work ended up focusing on the game’s design: My schedule for this project is extremely aggressive (insanely so, really) and as such, I need to have a relatively complete playable demo within mere months for submission to a couple major game competitions. Being able to hit that deadline in my free time requires me to aggressively control the scope of the project, avoid feature creep, and do as little work as possible to get game mechanics implemented and content built. 
 
Towards this end, after getting one of my main mechanics prototyped and testing it out in content I’d built, I made the hard decision to cut the mechanic. The second controllable character you’ve seen in some of my previous videos is effectively gone, though I’m going to attempt to make use of the design and code effort for the revised design. 
 
Making a choice like this is always painful, especially when you don’t have an unchangeable deadline or overbearing boss pushing you towards it. But ultimately, I think I’ll benefit from making these cuts sooner rather than later. I wish I had started thinking hard about it a few weeks earlier, when my first prototypes were working, instead of waiting until the issues were obvious, but I’m still relatively happy with the turnaround. I was able to prototype a relatively unusual game mechanic in a matter of a couple dozen hours of programming time, and decide that it wasn’t worth pursuing, and cut it. Definitely an improvement over traditional Waterfall with long cycles, but not quite true Agile yet. :)  
 
For me, this underscores the importance of aggressive, early prototyping of almost every possible game mechanic and design, instead of focusing on a single section of game content or gameplay until it’s done. Previously I was used to having a rigid focus on a section of a game or application, working on it day in and day out until it was done and ready to hand off – but in many cases, this meant that I could sink days or weeks of my time into something that ultimately had to be thrown out. 
 
Just like a lot of Agile proponents will tell you, it turns out that failing fast means wasting less time. The chaotic feeling and loss of productivity to context switches can be painful, and I think being successful requires setting things up to avoid having to pay those costs too many times a week, but ultimately, it’s a great decision. 
 


In-Depth Python/CPython crash debugging with Visual C++

If you do a lot of work with Python’s standard C implementation, CPython, there’s a chance that you’ll run into issues that will actually crash python.exe. If you’re writing extensions for python in C/C++ (.pyd files), it’s likely that some of those crashes are the result of your code. Naturally, if your C++ caused python.exe to crash, you’re going to attach a debugger and take a look.

If the problem is in your code, often it’s simple to see it – just look at your argument values and data structures. But what if the problem exists outside your code, and somewhere in python? All you have in the debugger are a bunch of PyObject * values; Visual C++ or WinDBG won’t do much to help you work with them. Faced with this, your best options are usually to add debugging code to your python – print statements, assertions, etc. Eventually, you might be able to puzzle out the problem and a solution.

But as it turns out, PyObject doesn’t have to be a black box. If you’re using Visual C++, you have all the tools you need to dig around inside of CPython – even if you just caught an unhandled exception.

For this to work, you’ll need a recent version of Visual C++, along with debug symbols for python25.dll. You also need debug symbols for your C/C++ extension (the .pyd). Once you’ve got all this, attach the Visual C++ debugger to your application, and either set a breakpoint in your code, or wait for an unhandled exception.

Once you’re paused in the debugger, inside your code, you’re ready to get to work. Let’s use a hypothetical example. We’ve got a simple object, written in C, with a method exposed to python called ‘print’. The function we wrote in C to handle the method is called MyPrint:

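PyObject * MyPrint(PyObject * self, PyObject * args) {
    const char *text;
    int result;

    /* Expect exactly one string argument; on failure an exception is set. */
    if (!PyArg_ParseTuple(args, "s", &text))
        return NULL;

    result = printf("%s", text);

    /* Hand printf's return value back to python as an int. */
    return Py_BuildValue("i", result);
}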

The function is simple – we take a single argument from the python argument tuple (args), convert it to a C string, and call printf to print it to standard output. Then we return printf’s return value (the number of characters written) as a python int.

So, let’s suppose that MyPrint crashes. We’re now sitting in the debugger, and staring at a couple opaque PyObject * pointers. If we’re lucky, we got past the ParseTuple call and the ‘text’ pointer is valid too. Unfortunately, the debugger isn’t going to be very helpful. If you look at ‘self’ and ‘args’ in the Locals or Watch windows, you’re going to see something like this:

args     0x1179a650 {ob_refcnt=3 ob_type=0x023706a0}

We can see a reference count and a pointer to the object’s type structure, but that’s about it. There are a few useful things in the type structure, like the name of the type, but not much else. The same goes for ‘self’ – we can’t see the values inside it. What we really want is to be able to look at MyPrint’s arguments, to see what was passed in.

If you look closely, you’ll notice that the type structure has a field named ‘tp_repr’. If you dig into the CPython documentation, you’ll realize that tp_repr is a pointer to the repr() function for that object. In python, repr() converts any given object into a string, usable for debugging purposes.

Go ahead and open up Visual C++’s Immediate window. You’ll find it under the Debug menu’s Windows submenu. After it’s open, type this in:

args->ob_type->tp_repr(args)

If everything’s in good shape, you’ll see output like this appear in the Immediate window:

0x1196fd88 { ob_refcnt=2 ob_type=0x02370600 }

You just called repr() on your argument list, and it returned a Python string object. Now we’re in a bit of a tough situation – we still don’t have any way to look at our results. All we have is a Python object. We do, however, know that it’s a string, and as it turns out, in CPython, this means that the text of the string is directly adjacent to the PyObject structure. Knowing that, we can read the contents of memory to see the string.

Let’s take the PyObject * we got from repr(), and have Visual Studio show us the contents of the memory at that address. Type this into the Immediate window:

0x1196fd88,ma

The ‘,ma‘ is a hint to the Visual Studio Expression Evaluator (used by the Watch and Immediate windows) that it should treat the specified value as a Memory address pointing to ASCII text. It’ll spit out something like this:

0x1196fd88  ..... ..6...........(<MyClass object at 0xbaadf00d>)............

The debugger wrote out the first 64 bytes of memory at that location. The first few bytes are the contents of the PyObject structure, and immediately following them, the contents of the string. From the result of repr(), we can now see that an instance of MyClass was passed to MyPrint, instead of a string. And from the address shown, it looks like it’s a bad pointer.
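The "header first, characters right after" layout can also be observed from inside a running interpreter. The sketch below is an assumption-heavy illustration: it uses a current CPython 3 `bytes` object (the closest relative of the Python 2.5 string discussed here) and relies on CPython implementation details, namely that `id()` returns the object's address and that `bytes.__basicsize__` covers the header plus one byte of character storage.

```python
import ctypes

text = b"<MyClass object at 0xbaadf00d>"

# In CPython, bytes.__basicsize__ covers the object header plus one byte of
# character storage, so the characters begin one byte before that offset.
header_size = bytes.__basicsize__ - 1

# id() returns the object's address in CPython (an implementation detail).
raw = ctypes.string_at(id(text) + header_size, len(text))
```

Reading `raw` back gives the same characters as `text`, which is the reason a raw memory dump of a string object shows its contents a few bytes in.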

This is already useful debugging information we might not have had access to otherwise. But we can go further!

Let’s suppose that the MyClass object that was passed to us is actually valid. In that case, you’ll see a repr() that looks innocuous, and have an idea of what the contents of your argument list are. But if your arguments are all objects, repr() won’t tell you that much about them.

If MyClass looks like this:

class MyClass:
  def __init__(self, number, name):
    self.__number = number
    self.name = name

Calling repr() on it won’t tell us the things we actually care about – its name and number. We could add a repr() function to MyClass to find this out, if we wanted. But if you’re trying to debug a rare crash that’s hard to reproduce, restarting your app to add a repr() function is a dangerous step to take – you might not see the crash again for hours, days, or weeks.
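For reference, the repr() method in question might look like the sketch below. It is written for a current Python 3 interpreter rather than the Python 2.5 used in this post, and the output format is just one reasonable choice.

```python
class MyClass:
    def __init__(self, number, name):
        self.__number = number
        self.name = name

    def __repr__(self):
        # The double-underscore attribute is directly usable inside the class.
        return "MyClass(number=%r, name=%r)" % (self.__number, self.name)


steve = MyClass(42, "Steve")
args = (steve,)
```

With this in place, calling repr() on an argument tuple containing the instance shows the name and number directly.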

So, let’s find out the name and number of the MyClass instance that was passed in.

The first step is to pull the instance out of our argument list. If we wanted, we could just grab the address straight out of the repr() – but let’s be SLIGHTLY less evil and do it using the CPython API. Type this into the immediate window:

{,,python25.dll}PySequence_GetItem(args, 0)

If everything went well, you’ll see output like this:

0xbaadf00d { ob_refcnt=4 ob_type=0x023791010 }

The expression we just evaluated bears a little explanation: The ‘{,,python25.dll}‘ part of the expression tells the debugger that you want to resolve the following symbol using the module ‘python25.dll‘. Otherwise, the debugger might not know the identity of PySequence_GetItem, depending on the quality of your loaded symbols. The two commas are there because the debugger’s context operator takes the form {function,source,module}, and we’re leaving the function and source fields empty.

So, now that we’ve successfully called GetItem on our argument list, we have the PyObject * for our MyClass instance. If things worked correctly, it will match what you saw in the result of repr().

Now that we have a MyClass instance, we want to get the name and number. Let’s get the value of the name attribute:

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("name") )

There’s a lot going on here, so let’s take it apart. First, we tell the debugger to treat the pointer we were given as a PyObject *:

((PyObject *)0xbaadf00d)

Then we invoke its ‘tp_getattro’ handler, which is the generic getattr() handler for the object. Note that we have to pass in the pointer again, since it takes self as a parameter:

->ob_type->tp_getattro( (PyObject *)0xbaadf00d,

Then finally, we pass in the name of the attribute we wish to retrieve, as a PyObject. To do this, we create a PyString, using the CPython API:

{,,python25.dll}PyString_FromString("name") )

As before, we had to help the debugger find the name.

The getattr call should succeed, and you should get a result:

0xabcd0123 { ob_refcnt=7 ob_type=0x023792222 }

Using the tricks we’ve already learned, we can peek at the name:

0xabcd0123, ma
0xabcd0123  ..... ..6...........Steve..............................

Great. Given this, we may already have enough information to hunt through our source code, looking for the code that constructs Steve. But let’s pretend we need the number:

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("number") )
0x00000000

Hm. That didn’t work. If we remember carefully, MyClass stores the number in an attribute named ‘__number’, not ‘number’. That’s easy to fix, right?

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("__number") )
0x00000000

No dice. What’s the problem? If you read the Python documentation, you’ll find that a double underscore at the beginning of an attribute’s name gets treated specially by the language, and ‘mangled’ in order to protect it from external access. As a result, when we wrote:

    self.__number = number

What actually got run was:

    self._MyClass__number = number

This results in __number being accessible under that name from inside MyClass, but not from outside. Knowing this, we can fix our getattr call:

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("_MyClass__number") )
0xdcba3210 { ob_refcnt=5 ob_type=0x023794444 }
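The mangling rule is easy to confirm from python itself. The sketch below repeats the three lookups with getattr(), using a current Python 3 interpreter (an assumption; the rule is the same one described above for 2.5). A default of None stands in for the NULL the debugger showed on a failed lookup.

```python
class MyClass:
    def __init__(self, number, name):
        self.__number = number   # stored as _MyClass__number
        self.name = name


steve = MyClass(42, "Steve")

# The same three lookups attempted in the debugger, done with getattr().
# A default of None mirrors the NULL the debugger showed for a failed lookup.
by_plain_name = getattr(steve, "number", None)
by_source_name = getattr(steve, "__number", None)
by_mangled_name = getattr(steve, "_MyClass__number", None)
attribute_names = sorted(vars(steve))
```

Only the mangled name finds the value, and `attribute_names` shows that the instance never had an attribute literally called `__number`.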

Since number isn’t likely to be a string, let’s call repr and see what it is, and for convenience, display the result as a string directly:

((PyObject *)0xdcba3210)->ob_type->tp_repr( (PyObject *)0xdcba3210),ma
0xddddaaaa  ..... ..8...........42..............................

Now we know that MyPrint was called with an instance of MyClass, that was constructed with the name ‘Steve’ and the number ‘42’, and we never had to leave the debugger. Our app is still running, so we can keep putting the debugger to use to investigate. If MyClass held references to other objects, we could follow those references until we found the information we were looking for.

If you find yourself working with large python strings in the debugger (for example, the repr() of a complex object), you can use this snippet to see the entire contents of an object’s repr(), as a C string:

{,,python25.dll}PyString_AsString(obj->ob_type->tp_repr(obj))

Hopefully the techniques I’ve shown you will prove useful the next time you’re trying to figure out a nasty python crash!


Prototyping week

The main thrust of this week’s work was gameplay prototyping: building new mechanics, tuning existing ones, and constructing small prototypes in the editor.

Some of the more interesting things I built:

I added support for attaching pieces of geometry to each other, so that I could make composite objects for use in puzzle designs. In the following video, I’ve attached a Crank (a new variation on my existing Switch object, which I’ll describe a bit later) to a moving platform, and tied the crank to the platform’s velocity so that you can use it to move the platform:

I also spent some time on the Assist mechanic, which allows the player to have abstract control over the inactive character by hitting a special button on the controller at appropriate times. Right now, it merely allows interaction with objects, but in the future, it will allow high-level orders (like ‘come here’ or ‘stay there’) that will help guide the (currently unfinished) AI that controls the inactive character in order to keep it from getting left behind during platforming segments. In the following video, I demonstrate using the assist button to operate a crank with the inactive character while the active character makes his way past some spikes:

The culmination of my work this week was constructing a larger puzzle prototype that combines many of the smaller elements I’ve built previously. It’s a puzzle that utilizes the Assist mechanic along with many of the mechanics and technologies I’ve built so far and stresses many of the limits of my current engine. In the process of building it I had to fix numerous bugs and performance issues in my game code and ended up discovering some serious flaws in my collision detection code, which I’ll no doubt have to address in the coming weeks. Nonetheless, it was quite satisfying to see it take shape:

Finally trying it out made me realize that it was far more difficult than I anticipated, and that my UX needed significant improvement. I spent some time refining the puzzle and improving the game’s UI and was finally able to beat it myself:

Naturally, a lot of work still remains in the art, design, and technology departments alike. But it’s nice to finally be able to build a piece of content this large without running into any showstopper issues – all said, it took me about 45 minutes to construct and tune this puzzle in my editing tools. Not as fast as I’d like, but more than good enough to get started on enough content to fill a playable demo.

One of the more surprising technical hurdles this week was the discovery that SpriteBatch.Begin was accounting for over 10% of the CPU usage in my game. This turned out to be due to the fact that each game object was responsible for setting up its own render state, so I was beginning and ending an individual batch for every object in the level – over a hundred spikes, dozens of moving platforms and doors, and hundreds of tiles. I invested some time into writing code to automatically manage beginning and ending batches, so that objects would no longer need to explicitly call End in their Draw methods, only Begin.
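The automatic batch management can be pictured with a small Python sketch. It is not XNA's SpriteBatch API or the game's actual code; the `AutoBatcher` class, its method names, and the string render settings are invented for the illustration. Objects keep asking for the settings they need, and a new batch is opened only when those settings differ from the batch that is already open.

```python
class AutoBatcher:
    """Starts a new batch only when the requested render settings differ
    from the batch that is already open."""

    def __init__(self):
        self._current = None     # settings of the open batch, if any
        self.batches = []        # recorded as [settings, [sprites...]]

    def begin(self, settings):
        if settings != self._current:
            self._current = settings
            self.batches.append([settings, []])

    def draw(self, sprite):
        self.batches[-1][1].append(sprite)

    def end_frame(self):
        """Called once by the renderer; objects never end batches."""
        finished, self.batches, self._current = self.batches, [], None
        return finished


batcher = AutoBatcher()
for sprite, settings in [("spike1", "opaque"), ("spike2", "opaque"),
                         ("door", "opaque"), ("glow", "additive"),
                         ("tile", "opaque")]:
    batcher.begin(settings)      # each object still asks for its settings
    batcher.draw(sprite)
frame = batcher.end_frame()
```

Five objects end up in three batches here, because consecutive objects with identical settings share one.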

Once I had that working it dropped off my profiles entirely, and my framerate climbed back up to where it was a few weeks ago. On one hand, it’s nice to be able to see such large performance gains from little effort, but on the other hand, I’m a little embarrassed that I didn’t think about it earlier. :) It’s likely that in the future I’ll end up writing a RenderManager class of some sort to automatically handle sorting objects based on their material and other parameters so that I don’t end up having to send thousands of batches to the video card every frame.
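The payoff from sorting by material can be shown with a short Python sketch. The settings names and sprite list are made up, and a real RenderManager would also have to respect draw order for blended sprites, which this sketch ignores.

```python
from itertools import groupby
from operator import itemgetter

# (render settings, sprite) pairs in arbitrary submission order.
submitted = [("opaque", "spike1"), ("additive", "glow1"),
             ("opaque", "tile1"), ("additive", "glow2"),
             ("opaque", "door1")]


def count_batches(draws):
    """Each run of consecutive draws with equal settings is one batch."""
    return sum(1 for _ in groupby(draws, key=itemgetter(0)))


# sorted() is stable, so sprites that share settings keep their order.
by_settings = sorted(submitted, key=itemgetter(0))

unsorted_batches = count_batches(submitted)
sorted_batches = count_batches(by_settings)
```

The same five sprites take five batches in submission order and two once grouped by settings.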

I ran into numerous minor issues with my collision detection code this week, which has me considering ways to rework and improve my current approach. The worst was a bug that I discovered where if I placed an object at negative coordinates (on either the X or Y axes), it behaved differently in collision detection than if it was placed at positive coordinates. Instead of diving into the math to try and figure it out, I ended up just relocating all the objects in my test level so that they had positive coordinates, but I can’t put solving this one off for too long.

Later in the week a friend stopped by and I had him test the game out on my machine. Not only did he have some good feedback, but he found two bugs within a matter of minutes! Always nice to get fresh eyes on your design and see someone else try out what you’ve built. I’m looking forward to having things at the point where I can start handing out test versions for people to bang on.

Finally, a question for anybody who’s reading on a regular basis: The Gruedorf weekly progress format is becoming rather difficult to work with lately, so I’m considering changing my approach for this development log. I’d greatly appreciate it if you let me know via the comments section which topics interest you the most, and whether you’d be interested in seeing multiple shorter posts during the week focusing on single topics.

