Using PerfHUD with XNA games

July 4th, 2009

If you want to use PerfHUD with an XNA game, you’ll need to make some changes to your source code. Since the PerfHUD documentation only explains how to do this when using the native DirectX interfaces from C++, it can be a bit confusing to figure out how to make it work with XNA, especially since the XNA Framework goes to some effort to prevent your games from starting on graphics devices that might not support its feature set (for example, reference rasterizers). Since PerfHUD presents itself as a reference rasterizer, you have to jump through some hoops.

The solution is to write a custom GraphicsDeviceManager that replaces the default XNA one. Luckily, you don’t have to write very much code to make this work - you just have to do things in a particular way so the framework remains happy. The source code can be found here. All you need to do is add this class to your project (or use the library it comes in), and modify your game’s initialization code to look something like this:

#if !XBOX
            Graphics = new PerfHUDDeviceManager(this);
#else
            Graphics = new GraphicsDeviceManager(this);
#endif

After this, you should be able to start your game under PerfHUD on Windows.

Please note that since the PerfHUD device presents itself as a reference rasterizer, I believe this disables use of some types of hardware acceleration within the framework, so your framerate may suffer as a result when running under PerfHUD, and you might get slightly misleading results when profiling. I consider this to be the unfortunate intersection of two bad design decisions, but not really a bug. If you felt so inclined you could probably work around this issue by hacking up the XNA Framework classes a bit so that they believe the PerfHUD device is a real hardware device.

Code

VirtualBox is pretty awesome

July 2nd, 2009

If you haven’t gotten around to trying out VirtualBox 3 yet, you should. I finally made some time to install the Windows 7 RC inside of it, and I’m stunned. It’s a much better experience than Virtual PC or VMware and it’s totally free. Who’d’ve thought that Sun would release something so useful for free? I didn’t even have to install Java. :D

After a little fiddling, I even managed to get hardware Direct3D and OpenGL working inside of Windows 7. Here’s a short clip of me fiddling with a hardware-accelerated Windows 7 game, and Paint, on top of my Windows XP desktop (thanks to VirtualBox’s neat ‘Seamless’ mode, which appears to be similar to VMware’s Unity feature).

A quick how-to guide if you’re interested in trying out the setup in the video:

  • Install VirtualBox 3
  • Create a virtual machine configured for Windows 7. Enable all the shiny goodies, like hardware 3D.
  • Install Windows 7, preferably using the downloadable Windows 7 RC ISO. You can grab it from Microsoft’s site and set the ISO as your CD/DVD-ROM drive in the VirtualBox settings.
  • Once Windows 7 is installed, install the VirtualBox Guest Additions. This enables pointer integration and makes graphics work better, along with enabling Seamless mode.
  • Download the WineD3D installer onto your Windows 7 virtual machine, preferably to the desktop. You should be able to just use IE to do it inside Windows 7, if you turned on NAT when installing VirtualBox and configuring your virtual machine.
  • Boot Windows 7 into safe mode (you may need to spam the F5/F8 keys when the virtual machine is booting to get the safe mode menu).
  • Open Windows Explorer as Administrator (Start menu, Accessories, Windows Explorer, right click it and pick Run as Administrator)
  • Navigate to C:\Windows\System32
  • Right click the downloaded WineD3D installer, and select Run as Administrator.
  • Continue with the WineD3D install process. Each time you get an ‘unable to write to file foo.dll’ error message:
    • Go to that Explorer window and find the respective file (like d3d8.dll)
    • Right click the file and select Properties
    • Click over to the Security tab and click the Edit button
    • Grant ‘Full Control’ to the Administrators group.
  • After following those steps, you can click Retry on the error message and the install will continue to the next file.
  • After changing permissions on a few DirectX files, the install will complete. Reboot your virtual machine into normal mode and you should now have hardware 3D in both Direct3D and OpenGL apps inside your Windows 7 virtual machine.

Uncategorized

‘This application is not configured to use PerfHUD’ - how to fix it

July 1st, 2009

For anyone trying to use NVPerfHud, if you get stuck trying to figure out this error message:

This application is not configured to use PerfHUD.

Consult the User’s Guide for more information.

You may have noticed that there is no reference to this error message, or to ‘configuring your application’, in the User’s Guide. The solution is actually documented in the NVPerfHud Quick Tutorial, and involves changing the parameters you pass to CreateDevice.

Hopefully posting this here will help other people find this information, since googling the error message turned up zero results for me. Good work, NVidia!

P.S. Preventing ‘unauthorized profiling by third parties’? What the hell is this, the Cold War? This is computer graphics, not espionage. Can you imagine if other software worked this way?

To prevent unauthorized pixel modification by third parties, any images you wish to view with Adobe Photoshop must contain a hand-written sample of dialogue from the second act of A Midsummer Night’s Dream, by William Shakespeare. Please note that dialogue from modern adaptations like the 1999 version featuring Kevin Kline is not sufficient. Also please note that handwriting in sufficiently obscure cursive scripts is unlikely to be recognized.

Code

Threaded EndDraw in XNA (Crouching WaitOne, Hidden Lock)

July 1st, 2009

After finishing up the materials for my Level Up 2009 entry, today I spent a little while trying out an idea I had recently:

One of the problems with using vertical sync in a video game is that it eats into the available CPU time for performing game updates. The way vsync is implemented in most graphics APIs, it causes your Present/EndDraw/SwapBuffers call to block until the card enters vertical blank and the frame is shown to the user. While this is ideal from a correctness perspective, it’s a tremendous waste since it means you can end up sitting there for up to 16 milliseconds, waiting for vertical blank. If your game spends lots of time doing both updating and drawing, all that time could be spent performing updates instead. Ouch.

Currently, my game spends about as much time drawing as it does updating. A significant portion of the time spent drawing (20-30%) is within the EndDraw function. Turning off vertical sync drops the amount of time spent in EndDraw considerably, but introduces tearing. So, as a potential solution, why not call EndDraw on a background thread? While the thread waits for vertical blank, I can begin performing the next frame’s Update, and in the event that I finish updating before the previous frame is visible, I simply wait for that previous EndDraw call before beginning to paint the next frame. In the optimal case, this means I can come much closer to the best possible framerate without introducing tearing, and in the worst case, the cost of rendering an individual frame is only *slightly* increased by the use of thread synchronization. The fact that I’m only doing EndDraw on another thread means that I don’t have to worry about protecting my game data with locks and other synchronization techniques, since the GraphicsDevice doesn’t use any of my game data when performing the EndDraw operation.

So, to test this out, I overrode my Game class’s BeginDraw and EndDraw methods. This turns out to be all we have to do to change the way drawing is performed, because the XNA Framework developers were kind enough to make both of these methods virtual.

        protected override bool BeginDraw () {
            _DrawCompleteEvent.WaitOne();
            _DrawCompleteEvent.Reset();
            return base.BeginDraw();
        }

        protected override void EndDraw () {
            _DrawRequiredEvent.Set();
        }

Of course, at this point, the two events used here are never set, so this code won’t work. Thus, we add a background thread to perform our painting:

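        // Requires: using System.Threading;

        // Signalled by EndDraw when a finished frame is ready to be presented.
        AutoResetEvent _DrawRequiredEvent = new AutoResetEvent(false);
        // Set while no present is in flight; starts set so the first BeginDraw doesn't block.
        ManualResetEvent _DrawCompleteEvent = new ManualResetEvent(true);
        Thread _DrawThread;

        public Game () {
            ...

            _DrawThread = new Thread(DrawThreadFunc);
            // Background threads don't keep the process alive once the main thread exits.
            _DrawThread.IsBackground = true;
            _DrawThread.Start();
        }

        protected void DrawThreadFunc () {
            while (true) {
                // Sleep until the main thread has finished drawing a frame.
                _DrawRequiredEvent.WaitOne();
                // The (possibly blocking) present now happens off the main thread.
                base.EndDraw();
                _DrawCompleteEvent.Set();
            }
        }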

Fairly simple thread programming here: We create a thread, and set IsBackground to true so that it will stop as soon as the main thread exits. The thread spends all of its time waiting for a ‘required draw’ signal, and then performs an EndDraw call. Once the call is complete, it sets another signal to inform the Game class that the previous draw has finished and it’s safe to perform a BeginDraw call (this lets us make sure that we never use the GraphicsDevice on the main thread while the background thread is performing an EndDraw).
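
If you want to poke at the handoff pattern outside of XNA, here’s a rough Python analogue of the two-event dance. Treat it as a sketch under assumptions, not the actual game code: ThreadedPresenter, run_frames and the present callback are all made-up names, present stands in for the blocking base.EndDraw() call, and since Python’s threading.Event is manual-reset only, the auto-reset behaviour is imitated by clearing the event by hand.

```python
import threading
import time


class ThreadedPresenter:
    """Runs a blocking 'present' call on a background thread (hypothetical helper)."""

    def __init__(self, present):
        self._present = present                    # stand-in for base.EndDraw()
        self._draw_required = threading.Event()    # plays the AutoResetEvent role
        self._draw_complete = threading.Event()    # plays the ManualResetEvent role
        self._draw_complete.set()                  # nothing is in flight at startup
        self.presented = 0
        thread = threading.Thread(target=self._draw_thread, daemon=True)
        thread.start()

    def begin_draw(self):
        # Block until the previous present has finished, then claim the device.
        self._draw_complete.wait()
        self._draw_complete.clear()

    def end_draw(self):
        # Hand the blocking call to the background thread and return at once.
        self._draw_required.set()

    def _draw_thread(self):
        while True:
            self._draw_required.wait()
            self._draw_required.clear()            # imitate auto-reset by hand
            self._present()
            self.presented += 1
            self._draw_complete.set()


def run_frames(frame_count, present_seconds=0.01):
    """Simulates frame_count frames and returns how many presents completed."""
    presenter = ThreadedPresenter(lambda: time.sleep(present_seconds))
    for _ in range(frame_count):
        presenter.begin_draw()
        # ... Draw would go here ...
        presenter.end_draw()
        # ... the next Update overlaps with the present that is in flight ...
    presenter.begin_draw()  # wait for the final present before reading the count
    return presenter.presented
```

The main thread only ever blocks in begin_draw, so whatever runs between end_draw and the next begin_draw overlaps with the present that is still in flight.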

Once this is all done, I start up my game, and… the framerate isn’t any different. Huh? What’s more, my frame profiler indicates that Update is now taking ten times as long as it used to, while Draw isn’t any faster. Huh???

In situations like this, it’s always good to consult a profiler to see if you’re missing something important:

[Image: profile_01]

So, we can see that the DrawThread is definitely doing its job - it calls EndDraw, then waits for a signal asking it to draw again. Both are taking about as much time as we’d expect. But why is Update taking so long…?

[Image: profile_02]

… oh. Oops.

So it turns out that I was using GraphicsDevice.Viewport.Width and GraphicsDevice.Viewport.Height in my camera code. Accessing the Viewport property caused the XNA framework to call into Direct3D to retrieve the viewport, which acquired the exact same lock being used by EndDraw, causing my main thread to stall until the draw completed. WHOOPS.

This is especially embarrassing since the viewport size never changes anyway, so I could have just stored the width/height in constants. After doing just that and starting the game again, the profile looks more like you’d expect:

[Image: profile_03]

What’s more, this is actually an improvement: With vertical sync enabled, this results in a significant reduction in the amount of time spent inside the BeginDraw/Draw/EndDraw functions on the main thread, which means there’s more time left to perform Updates. This means that I can more easily maintain a solid, smooth 60fps on dual-core/hyperthreaded machines.

Even with vertical sync disabled, this is still an improvement, though not as significant - apparently other things are happening inside EndDraw (not a big surprise), so by shifting that work off onto a second thread, I’m still gaining some time to spend performing the next update. When I disable the built-in framerate balancer, this brings my framerate from ~350fps up to ~380fps. Not bad for a couple dozen lines of code!

Of course, it’s worth pointing out that the XNA Framework documentation doesn’t make any promises here, so it’s possible that this technique is unsafe. When it comes to concurrency, it’s very easy to do the wrong thing and get away with it - as you might have noticed here, I was doing something utterly stupid and unsafe in my Update function, and I got away with it because the DirectX developers had the foresight to put a lock in the right place. If they hadn’t, my game might have corrupted state from accessing the GraphicsDevice on two threads, and crashed intermittently.

Regardless, this is a handy technique - once I’ve had the chance to do lots of testing on various PC configurations (and the Xbox 360), I’ll probably be using it in my game when I ship.

Gruedorf

Water

June 26th, 2009

Among the various things I worked on this week, I spent a good portion of time implementing a basic fluid system in the game engine. The fluid system lets me create volumes of moving fluid that alter the player’s physics and can flow down off ledges and interact with objects, so that levels like the aqueducts look and feel more convincing.

The two major pieces of the fluid system are physics and rendering. The physics system is arguably more important, but ended up mostly being a matter of fine-tuning. Here’s how it works:

Previously, every frame I did a sweep of the area below onscreen entities to locate the nearest surface to stand on, and used that to perform some motion and collision detection calculations so that the player can properly move up/down slopes, stand on moving objects, and drop when standing over a gap.

To implement the water system, I extended the standing sweep code to also locate the nearest volume of fluid that intersects the entity, and store it. What this allows me to do is detect whether or not the player character is currently inside a fluid, regardless of whether or not he’s standing on a surface at the time (so it even works while jumping/falling).

Once I had this functioning, I added the ability to attach ‘Material’ information to any piece of geometry in the game, including fluids. These two pieces of functionality combined allow me to control the physical attributes of a surface, and then further alter those attributes when the player is standing inside water or other fluids. This not only allows me to implement slippery surfaces like ice, but also lets me make the player move more slowly and have different control characteristics when in the water.

In the future, this should also allow me to alter the characteristics of surfaces in response to gameplay events - enemies that spray sticky goo on the ground, spider webs that slow you down, or a potion that freezes water, turning it into slippery ice you can stand on.
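
To make the surface-plus-fluid idea a bit more concrete, here’s a minimal Python sketch of one way the combination could work. Everything in it is an assumption made for illustration: the Material fields, the numbers, and the choice to multiply the surface’s attributes by the fluid’s are invented, and the game’s real code may look nothing like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """Hypothetical physics attributes attached to a piece of geometry."""
    friction: float = 1.0    # multiplier on ground friction
    max_speed: float = 1.0   # multiplier on top movement speed


# Made-up tuning values, purely for illustration.
STONE = Material()
ICE = Material(friction=0.1)
WATER = Material(friction=0.8, max_speed=0.5)


def effective_material(surface, fluid=None):
    """Combines the surface underfoot with the fluid the entity is inside, if any."""
    if fluid is None:
        return surface
    return Material(
        friction=surface.friction * fluid.friction,
        max_speed=surface.max_speed * fluid.max_speed,
    )
```

With these numbers, standing on ice inside water gives a surface that is both slippery and slow, while dry stone is left untouched.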

After a bit of fine-tuning, I came up with some physics parameters for the basic material types that I’m happy with, that make the differences between various materials obvious without breaking gameplay mechanics. The jumping mechanics were a particularly tough detail - even minor changes to the acceleration and speed characteristics of the player render some of my jumping puzzle prototypes completely impossible to solve, so I had to carefully test against those puzzle prototypes each time I adjusted the physics values for a given material. In the final game, I’m going to have to address this by only using a specific set of materials in each level, so that it’s not necessary to retest the entire game after physics changes.

Of course, water isn’t particularly interesting if you can’t see it, so the other thing I did was build a fairly simple bit of rendering that allows fluids to flow across surfaces and down off ledges.

Gruedorf

Content Pipeline integration and deployment

June 19th, 2009

During this past week, one of the problems I tackled was finding a way to deploy builds of my game. When dealing with deployment, an important principle is that the process should be as close to a single click as possible - a complex deployment process discourages you from getting customer feedback, and increases the likelihood that you’ll make a mistake and end up with a failed deployment, which wastes valuable time.

For most of my previous projects, I’ve tended to take either a hands-on approach to deployment, or a completely hands-off one, either building an installer by hand using tools like NSIS, or deploying the project as a ZIP file full of binaries and expecting the end user to install the necessary dependencies and figure out how to get the program running. Neither extreme is ideal, really (though hands-off deployment can be amazing if you manage to set things up such that all the end user has to do is run your game - no install necessary).

Gruedorf

Tiled map loader for XNA

June 17th, 2009

Early in the development of my game, I used the free and open-source Tiled Map Editor to create levels. It was a big time-saver since it let me worry about more important things instead of investing effort into being able to place tiles down on a map. Later on I decided that the traditional approach to map construction wasn’t right for my project - but I was still glad I’d used Tiled.

Recently I realized that there aren’t very many easy ways for newbie XNA developers to get maps into their games, so I decided it was worth packaging up my Tiled map loader and sharing it with the world. So, I’ve created a simple example that shows how to load Tiled maps in your XNA game on Windows PCs and the Xbox 360, and included the loader with it. It’s open-source and free for your use, no strings attached. I hope you find it helpful.

[Image: screenshot]

Download source code and binaries

Note that it doesn’t have support for isometric tiles or embedded tilesets, because I had no use for either feature. Tiled’s file format is relatively simple, so if you need those features, it should be simple to add them.

And of course, this wouldn’t be possible without the generous contributions of the developers of Tiled, Adam Turk and Bjørn Lindeijer. If you’d like to try Tiled out, you can download it from their website (note: requires Java).

Code

Asynchronous programming for XNA games with Squared.Task

June 13th, 2009

A recent post on Shawn Hargreaves’ blog reminded me that I never got around to sharing my technique for interacting with the XNA Guide API and doing other asynchronous operations during my game’s execution. It’s a simple application of some of the cooperative-threading techniques I’ve blogged about previously, and makes what would otherwise be a somewhat painful exercise relatively simple in practice. So, for those who are working on XNA games or just curious, I’ll go into detail on the technique here and share the source code with you!

The basic goal here is to be able to write game code in an imperative, sequential manner, without having to deal with locks, callbacks, polling, or race conditions. You rarely achieve perfection when dealing with concurrency, but even getting within sight of perfection can be worth the effort if it means you don’t have to spend time pulling your hair out trying to reproduce a rare threading bug.

In my game, any operation that can be performed off the main thread is done without blocking the main thread: loading content, loading files, saving games, loading games, interacting with the guide, and so on. This means that I can seamlessly load things in the background while the game is running without any noticeable stuttering. Normally, you’d need to do lots of juggling to handle this correctly - locking, threads, explicit synchronization - but thanks to cooperative threading, I don’t have to deal with any of that. The techniques I describe in this post are the same basic techniques I’m using for my game. Hopefully, that means they should work for yours too (though unfortunately that means if there are bugs in them, you’ve probably just found a bug in my game… whoops.)

Unless you’re working on an XNA game that uses networking (which sadly, I am not, so I can’t speak to the difficulties there), the biggest concurrency-related issues you’re likely to deal with are twofold:

  • Interacting with the XNA Guide APIs
  • Performing I/O

By the end of this article I’ll have shown you how you can tackle these problems in your XNA games without having to deal directly with threads or synchronization, in a simple, imperative manner that works equally well on the PC and Xbox 360.

[Image: ss_02]

The Guide APIs are almost all designed in a manner that requires use of either threads, callbacks/continuations, or polling, because they come as Begin/End pairs that require management of ordering and state beyond a single frame. This is for the most part a necessity, as detailed in the blog post I mentioned above. So instead of trying to find a workaround, the best course of action is to simplify the process of working with those APIs - if possible, without introducing new issues or restricting functionality.

I/O is a more interesting problem; even when writing desktop applications, you often perform large I/O operations that have the potential to block, and in those cases, a well-written application uses concurrency techniques - threading, callbacks, etc - to avoid ‘hanging’ and frustrating the user. However, on almost any modern desktop machine, you can typically get away with tiny I/O operations if done correctly - a good example is Firefox 3, which still regularly performs short disk operations on its UI thread, which can cause it to hang if your machine is under extreme I/O load, but works fine in almost every other situation.

The Xbox 360 makes I/O particularly challenging because you can’t rely on latency and throughput characteristics that you might be used to on a desktop machine: Not only do you have to deal with the downsides inherent in hard disks, but you also have to deal with the possibility of a player using a memory card, and even worse, the possibility of the available storage devices changing while your game is running (due to memory card insertion/removal, etc). This means that doing I/O in the main thread is pretty much a non-starter. You’re stuck: It’s concurrency time.

Luckily, cooperative threading provides a great solution for tackling both of these issues: It lets you reason about your problems in a manner you’re familiar with, and solve them by writing imperative code that still behaves correctly in tough situations, at the expense of some slight overhead and minor changes to your game code. You don’t have to build a tremendously complex state machine (since the C# compiler can be convinced to do the heavy lifting for us), and you don’t have to worry about locking and synchronization since all your code is guaranteed to run sequentially in the same thread, at the appropriate time.
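
To give a feel for the idea, here’s a small sketch using Python generators, which play roughly the same role here that compiler-generated iterators play in C#. This is not Squared.Task’s API: Scheduler, PendingResult and load_save_game are hypothetical names invented for the example, and a real scheduler would do a lot more (error handling, sleeping, waiting on other tasks).

```python
from collections import deque


class PendingResult:
    """A value that some asynchronous operation will fill in later (hypothetical)."""

    def __init__(self):
        self.done = False
        self.value = None

    def complete(self, value):
        self.value = value
        self.done = True


class Scheduler:
    """Steps every task once per call to step(), all on the calling thread."""

    def __init__(self):
        self._tasks = deque()

    def start(self, generator):
        self._tasks.append(generator)

    def step(self):
        for _ in range(len(self._tasks)):
            task = self._tasks.popleft()
            try:
                next(task)                 # run until the task's next yield
                self._tasks.append(task)   # still has work; revisit next frame
            except StopIteration:
                pass                       # task finished; drop it


def load_save_game(pending, log):
    """Reads like sequential code, but never blocks the frame loop."""
    log.append("requesting save data")
    while not pending.done:
        yield                              # give the frame back to the game
    log.append("loaded " + pending.value)


def demo():
    log = []
    pending = PendingResult()
    scheduler = Scheduler()
    scheduler.start(load_save_game(pending, log))
    scheduler.step()                       # frame 1: request issued, task waits
    scheduler.step()                       # frame 2: still waiting
    pending.complete("slot 1")             # e.g. an I/O callback fires
    scheduler.step()                       # frame 3: task resumes and finishes
    return log
```

The task body is plain top-to-bottom code; the generator machinery is the state machine, and because step() runs on the game’s own thread, the task never races with the rest of the game code.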

Code

Cutting and tuning

June 12th, 2009

As far as gameplay goes, the only major addition since last week’s post was a relatively complete implementation of player death, along with the ‘reunion’ teleport that goes with it. Fairly simple at present, with some bugs to work out (including one related to the teleport location that you’ll see in the video below). Definitely helps get a better feeling for whether a given puzzle is too hard or too easy.

Other than that, the only thing of note code-wise is the time I spent doing some performance tuning. My framerate had crept down over the past month or so, to the point that I wasn’t able to maintain a stable 60fps on the 360 anymore and my CPU utilization was approaching 30% on my desktop PC. Some of the optimizations were relatively obvious - for example, I was calling SpriteBatch.Begin/SpriteBatch.End for every on-screen object, for the sake of simplicity, which resulted in a lot of unnecessary draw calls.

Some simple changes to automatically begin/end batches when changing rendering settings reduced the number of draw calls per frame in most cases to around 30, which is perfectly acceptable, and reduced my CPU utilization by around 50%. After that, the rest of the optimization was pretty trivial - finding other hotspots in my profiler data and reducing the cost.
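
As a rough illustration of why this helps, here’s a Python sketch that just counts Begin/End pairs under the two approaches. It doesn’t touch the real SpriteBatch; the sprite list and the settings tuples are invented, and the grouping rule (start a new batch only when the settings differ from the previous sprite’s) is my simplified assumption of what ‘automatically begin/end batches’ amounts to.

```python
def count_batches_naive(sprites):
    """One Begin/End pair per sprite, as in the original, simple approach."""
    return len(sprites)


def count_batches_grouped(sprites):
    """Starts a new batch only when the render settings differ from the last sprite's."""
    batches = 0
    current_settings = None
    for settings, _name in sprites:
        if batches == 0 or settings != current_settings:
            batches += 1                 # End the old batch, Begin a new one
            current_settings = settings
    return batches


# Each sprite is (render_settings, name); the settings tuples are made up.
SPRITES = [
    (("alpha", "tiles.png"), "floor"),
    (("alpha", "tiles.png"), "wall"),
    (("alpha", "tiles.png"), "ledge"),
    (("additive", "fx.png"), "spark"),
    (("additive", "fx.png"), "glow"),
    (("alpha", "hud.png"), "health"),
]
```

For this list the naive approach needs six Begin/End pairs and the grouped one needs three; the saving grows with the number of consecutive sprites that share settings, so draw order matters.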

Well, it was trivial until I ran the game on the 360 again and noticed that the framerate hadn’t improved very much. Huh? I doubled my framerate on the PC, but on the 360, it barely moved an inch. What’s the deal?

Turns out, the 360’s pitiful floating-point performance was kneecapping me. Believe it or not, the primary culprit was the geometric shapes in the game’s HUD, for the circular health displays you may have seen in previous screenshots/videos. I knew the 360’s FPU was weak, but not THAT bad. Unfortunately, the only way to detect this is by manually measuring the performance cost of your code on the 360, by commenting out/toggling individual sections of your game code.

Tedious, at best. For now, I ended up just reducing the complexity of the geometry for the HUD elements and reducing the number of shapes I was drawing, which brought the 360 framerate much closer to what it used to be. As it happens, some design changes later in the week helped here too…


The majority of my work ended up focusing on the game’s design: My schedule for this project is extremely aggressive (insanely so, really) and as such, I need to have a relatively complete playable demo within mere months for submission to a couple major game competitions. Being able to hit that deadline in my free time requires me to aggressively control the scope of the project, avoid feature creep, and do as little work as possible to get game mechanics implemented and content built. 
 
Towards this end, after getting one of my main mechanics prototyped and testing it out in content I’d built, I made the hard decision to cut the mechanic. The second controllable character you’ve seen in some of my previous videos is effectively gone, though I’m going to attempt to make use of the design and code effort for the revised design. 
 
Making a choice like this is always painful, especially when you don’t have an unchangeable deadline or overbearing boss pushing you towards it. But ultimately, I think I’ll benefit from making these cuts sooner rather than later. I wish I had started thinking hard about it a few weeks earlier, when my first prototypes were working, instead of waiting until the issues were obvious, but I’m still relatively happy with the turnaround. I was able to prototype a relatively unusual game mechanic in a matter of a couple dozen hours of programming time, and decide that it wasn’t worth pursuing, and cut it. Definitely an improvement over traditional Waterfall with long cycles, but not quite true Agile yet. :) 
 
For me, this underscores the importance of aggressive, early prototyping of almost every possible game mechanic and design, instead of focusing on a single section of game content or gameplay until it’s done. Previously I was used to having a rigid focus on a section of a game or application, working on it day in and day out until it was done and ready to hand off - but in many cases, this meant that I could sink days or weeks of my time into something that ultimately had to be thrown out. 
 
Just like a lot of Agile proponents will tell you, it turns out that failing fast means wasting less time. The chaotic feeling and loss of productivity to context switches can be painful, and I think being successful requires setting things up to avoid having to pay those costs too many times a week, but ultimately, it’s a great decision. 
 

Gruedorf

In-Depth Python/CPython crash debugging with Visual C++

June 7th, 2009

If you do a lot of work with Python’s standard C implementation, CPython, there’s a chance that you’ll run into issues that will actually crash python.exe. If you’re writing extensions for Python in C/C++ (.pyd files), it’s likely that some of those crashes are the result of your code. Naturally, if your C++ caused python.exe to crash, you’re going to attach a debugger and take a look.

If the problem is in your code, often it’s simple to see it - just look at your argument values and data structures. But what if the problem exists outside your code, and somewhere in Python? All you have in the debugger are a bunch of PyObject * values; Visual C++ or WinDBG won’t do much to help you work with them. Faced with this, your best options are usually to add debugging code to your Python - print statements, assertions, etc. Eventually, you might be able to puzzle out the problem and a solution.

But as it turns out, PyObject doesn’t have to be a black box. If you’re using Visual C++, you have all the tools you need to dig around inside of CPython - even if you just caught an unhandled exception.

For this to work, you’ll need a recent version of Visual C++, along with debug symbols for python25.dll. You also need debug symbols for your C/C++ extension (the .pyd). Once you’ve got all this, attach the Visual C++ debugger to your application, and either set a breakpoint in your code, or wait for an unhandled exception.

Once you’re paused in the debugger, inside your code, you’re ready to get to work. Let’s use a hypothetical example. We’ve got a simple object, written in C, with a method exposed to Python called ‘print’. The function we wrote in C to handle the method is called MyPrint:

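#include <Python.h>
#include <stdio.h>

/* Handler for the 'print' method: prints its single string argument. */
static PyObject *MyPrint(PyObject *self, PyObject *args) {
    const char *text;
    int result;

    /* Pull one string out of the argument tuple; NULL signals an error. */
    if (!PyArg_ParseTuple(args, "s", &text))
        return NULL;

    result = printf("%s", text);

    /* Hand printf's result (characters written) back as a Python int. */
    return Py_BuildValue("i", result);
}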

The function is simple - we take a single argument from the Python argument tuple (args), convert it to a C string, and call printf to print it to standard output. Then we return printf’s return value (the number of characters written) as a Python int.

So, let’s suppose that MyPrint crashes. We’re now sitting in the debugger, and staring at a couple opaque PyObject * pointers. If we’re lucky, we got past the ParseTuple call and the ‘text’ pointer is valid too. Unfortunately, the debugger isn’t going to be very helpful. If you look at ‘self’ and ‘args’ in the Locals or Watch windows, you’re going to see something like this:

args     0x1179a650 {ob_refcnt=3 ob_type=0x023706a0}

We can see a reference count, and a pointer to the object’s type structure, but that’s about it. There are a few useful things in the type structure, like the name of the type, but nothing that shows us the values inside the object. The same goes for ‘self’. What we really want is to be able to look at MyPrint’s arguments, to see what was passed in.

If you look closely, you’ll notice that the type structure has a field named ‘tp_repr’. If you dig into the CPython documentation, you’ll realize that tp_repr is a pointer to the repr() function for that object. In Python, repr() converts any given object into a string, usable for debugging purposes.

Go ahead and open up Visual C++’s Immediate window. You’ll find it under the Debug menu’s Windows submenu. After it’s open, type this in:

args->ob_type->tp_repr(args)

If everything’s in good shape, you’ll see output like this appear in the Immediate window:

0x1196fd88 { ob_refcnt=2 ob_type=0x02370600 }

You just called repr() on your argument list, and it returned a Python string object. Now we’re in a bit of a tough situation - we still don’t have any way to look at our results. All we have is a Python object. We do, however, know that it’s a string, and as it turns out, in CPython, this means that the text of the string is directly adjacent to the PyObject structure. Knowing that, we can read the contents of memory to see the string.

Let’s take the PyObject * we got from repr(), and have Visual Studio show us the contents of the memory at that address. Type this into the Immediate window:

0x1196fd88,ma

The ‘,ma’ is a hint to the Visual Studio Expression Evaluator (used by the Watch and Immediate windows) that it should treat the specified value as a Memory address pointing to ASCII text. It’ll spit out something like this:

0x1196fd88  ..... ..6...........(<MyClass object at 0xbaadf00d>)............

The debugger wrote out the first 64 bytes of memory at that location. The first few bytes are the contents of the PyObject structure, and immediately following them, the contents of the string. From the result of repr(), we can now see that an instance of MyClass was passed to MyPrint, instead of a string. And from the address shown, it looks like it’s a bad pointer.
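
If you’d like to convince yourself of the ‘header first, characters right after’ layout without a debugger, here’s a small Python sketch that peeks at an object’s raw memory with ctypes. It leans on CPython implementation details (id() being the object’s address, and the character data living inline in the object), it uses a modern Python 3 bytes object as a stand-in for the old str type, and raw_object_bytes is a made-up helper name, so treat it as a curiosity rather than something to rely on.

```python
import ctypes
import sys


def raw_object_bytes(obj):
    """Returns the raw memory of a CPython object, header included.

    Relies on CPython details: id() is the object's address and
    sys.getsizeof() covers the header plus the inline character data.
    """
    return ctypes.string_at(id(obj), sys.getsizeof(obj))


text = b"<MyClass object at 0xbaadf00d>"
raw = raw_object_bytes(text)
header_size = raw.index(text)   # bytes of object header before the characters
```

The exact value of header_size depends on the Python version and whether the build is 32-bit or 64-bit, which is why the debugger dump above shows a handful of unreadable bytes before the text starts.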

This is already useful debugging information we might not have had access to otherwise. But we can go further!

Let’s suppose that the MyClass object that was passed to us is actually valid. In that case, you’ll see a repr() that looks innocuous, and have an idea of what the contents of your argument list are. But if your arguments are all objects, repr() won’t tell you that much about them.

If MyClass looks like this:

class MyClass:
  def __init__(self, number, name):
    self.__number = number
    self.name = name

Calling repr() on it won’t tell us the things we actually care about - its name and number. We could add a __repr__() method to MyClass to find this out, if we wanted. But if you’re trying to debug a rare crash that’s hard to reproduce, restarting your app to add a __repr__() method is a dangerous step to take - you might not see the crash again for hours, days, or weeks.
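Here’s a quick sketch of how unhelpful the default repr() is. It assumes Python 3 (the class is written with an explicit object base so it behaves the same way as a new-style class), and the ‘steve’ instance is just an example value.

```python
class MyClass(object):
    def __init__(self, number, name):
        self.__number = number
        self.name = name


steve = MyClass(42, "Steve")

# With no __repr__ defined, we get the type name and an address, nothing more.
text = repr(steve)
print(text)
```

The address in the output changes from run to run, but the name never shows up in it.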

So, let’s find out the name and number of the MyClass instance that was passed in.

The first step is to pull the instance out of our argument list. If we wanted, we could just grab the address straight out of the repr() - but let’s be SLIGHTLY less evil and do it using the CPython API. Type this into the Immediate window:

{,,python25.dll}PySequence_GetItem(args, 0)

If everything went well, you’ll see output like this:

0xbaadf00d { ob_refcnt=4 ob_type=0x023791010 }

The expression we just evaluated bears a little explanation: The ‘{,,python25.dll}’ part of the expression is the debugger’s context operator, and it tells the debugger that you want to resolve the following symbol using the module ‘python25.dll’. Otherwise, the debugger might not know the identity of PySequence_GetItem, depending on the quality of your loaded symbols. The two commas are there because the full form is {function,source,module}, and we’re leaving the first two fields empty.

So, now that we’ve successfully called GetItem on our argument list, we have the PyObject * for our MyClass instance. If things worked correctly, it will match what you saw in the result of repr().
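If you’d like to try PySequence_GetItem somewhere safer than a crashed process, here’s a sketch that calls the same C API function through ctypes. It assumes CPython (ctypes.pythonapi doesn’t exist elsewhere), and ‘marker’ and ‘args’ are stand-ins I made up for the MyClass instance and MyPrint’s argument tuple.

```python
import ctypes

# Assumption: CPython, whose C API is exposed through ctypes.pythonapi.
get_item = ctypes.pythonapi.PySequence_GetItem
get_item.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
get_item.restype = ctypes.py_object  # new reference, handed back as an object

marker = object()         # stand-in for the MyClass instance
args = (marker, "extra")  # stand-in for MyPrint's argument tuple

# Same call as in the Immediate window: PySequence_GetItem(args, 0)
first = get_item(args, 0)

print(first is marker)
```

In the debugger we only get the raw pointer back, which is why the next steps keep casting it to PyObject *.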

Now that we have a MyClass instance, we want to get the name and number. Let’s get the value of the name attribute:

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("name") )

There’s a lot going on here, so let’s take it apart. First, we tell the debugger to treat the pointer we were given as a PyObject *:

((PyObject *)0xbaadf00d)

Then we invoke its ‘tp_getattro’ handler, which is the generic getattr() handler for the object. Note that we have to pass in the pointer again, since it takes self as a parameter:

->ob_type->tp_getattro( (PyObject *)0xbaadf00d,

Then finally, we pass in the name of the attribute we wish to retrieve, as a PyObject. To do this, we create a PyString, using the CPython API:

{,,python25.dll}PyString_FromString("name") )

As before, we had to help the debugger find the function by telling it which module to look in.

The getattr call should succeed, and you should get a result:

0xabcd0123 { ob_refcnt=7 ob_type=0x023792222 }
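For comparison, the same lookup can be sketched in plain Python. This assumes Python 3, and uses __getattribute__ as the nearest Python-level counterpart of the tp_getattro slot; it’s an illustration of the shape of the call, not what the debugger does internally.

```python
class MyClass(object):
    def __init__(self, number, name):
        self.__number = number
        self.name = name


steve = MyClass(42, "Steve")

# Look the handler up on the type, then pass the instance in as self,
# mirroring obj->ob_type->tp_getattro(obj, name).
handler = type(steve).__getattribute__
name = handler(steve, "name")

print(name)
```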

Using the tricks we’ve already learned, we can peek at the name:

0xabcd0123, ma
0xabcd0123  ..... ..6...........Steve..............................

Great. Given this, we may already have enough information to hunt through our source code, looking for the code that constructs Steve. But let’s pretend we need the number:

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("number") )
0x00000000

Hm. That didn’t work. If we remember carefully, MyClass stores the number in an attribute named ‘__number’, not ‘number’. That’s easy to fix, right?

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("__number") )
0x00000000

No dice. What’s the problem? If you read the Python documentation, you’ll find that a double underscore at the beginning of an attribute’s name gets treated specially by the language, and ‘mangled’ in order to protect it from external access. As a result, when we wrote:

    self.__number = number

What actually got run was:

    self._MyClass__number = number

This results in __number being accessible under that name from inside MyClass, but not from outside. The attribute is still there; it’s just stored under the mangled name. Knowing this, we can fix our getattr call:

((PyObject *)0xbaadf00d)->ob_type->tp_getattro( (PyObject *)0xbaadf00d, {,,python25.dll}PyString_FromString("_MyClass__number") )
0xdcba3210 { ob_refcnt=5 ob_type=0x023794444 }
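You can watch the mangling happen from ordinary Python, too. This sketch assumes Python 3, and uses vars() to list what actually got stored on the instance.

```python
class MyClass(object):
    def __init__(self, number, name):
        self.__number = number  # stored as _MyClass__number
        self.name = name


steve = MyClass(42, "Steve")

# The names that actually exist on the instance.
attribute_names = sorted(vars(steve))

# Looking up the unmangled name from outside the class finds nothing.
plain = getattr(steve, "__number", None)

# The mangled name is what was really stored.
mangled = getattr(steve, "_MyClass__number")

print(attribute_names)
print(plain, mangled)
```

If you’re ever unsure what an attribute ended up being called, dumping the instance’s attribute names like this is usually quicker than working out the mangling rules by hand.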

Since number isn’t likely to be a string, let’s call repr and see what it is, and for convenience, display the result as a string directly:

((PyObject *)0xdcba3210)->ob_type->tp_repr( (PyObject *)0xdcba3210),ma
0xddddaaaa  ..... ..8...........42..............................

Now we know that MyPrint was called with an instance of MyClass that was constructed with the name ‘Steve’ and the number ‘42’, and we never had to leave the debugger. Our app is still running, so we can keep putting the debugger to use to investigate. If MyClass held references to other objects, we could follow those references until we found the information we were looking for.

If you find yourself working with large Python strings in the debugger (for example, the repr() of a complex object), you can use this snippet to see the entire contents of an object’s repr(), as a C string:

{,,python25.dll}PyString_AsString(obj->ob_type->tp_repr(obj))
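Here’s a rough modern equivalent of that snippet, driven through ctypes so you can experiment with it outside the debugger. It rests on some assumptions: it needs CPython 3, where PyString_AsString no longer exists, so I’ve swapped in PyUnicode_AsUTF8, and I’ve used PyObject_Repr in place of calling tp_repr directly. The ‘obj’ value is just an example.

```python
import ctypes

# Assumption: CPython 3, where PyObject_Repr and PyUnicode_AsUTF8 stand in
# for tp_repr and the Python 2 PyString_AsString used in this post.
py_repr = ctypes.pythonapi.PyObject_Repr
py_repr.argtypes = [ctypes.py_object]
py_repr.restype = ctypes.py_object

as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8
as_utf8.argtypes = [ctypes.py_object]
as_utf8.restype = ctypes.c_char_p

obj = {"name": "Steve", "number": 42}  # any object will do

repr_obj = py_repr(obj)       # a Python str object
c_string = as_utf8(repr_obj)  # its contents as a C string (bytes in ctypes)

print(c_string.decode("utf-8"))
```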

Hopefully the techniques I’ve shown you will prove useful the next time you’re trying to figure out a nasty Python crash!
