After finishing up the materials for my Level Up 2009 entry, today I spent a little while trying out an idea I had recently:
One of the problems with using vertical sync in a video game is that it eats into the available CPU time for performing game updates. The way vsync is implemented in most graphics APIs, it causes your Present/EndDraw/SwapBuffers call to block until the card enters vertical blank and the frame is shown to the user. While this is ideal from a correctness perspective, it’s a tremendous waste since it means you can end up sitting there for up to 16 milliseconds, waiting for vertical blank. If your game spends lots of time doing both updating and drawing, all that time could be spent performing updates instead. Ouch.
Currently, my game spends about as much time drawing as it does updating. A significant portion of the time spent drawing (20-30%) is within the EndDraw function. Turning off vertical sync drops the amount of time spent in EndDraw considerably, but introduces tearing. So, as a potential solution, why not call EndDraw on a background thread? While the thread waits for vertical blank, I can begin performing the next frame’s Update, and in the event that I finish updating before the previous frame is visible, I simply wait for that previous EndDraw call before beginning to paint the next frame. In the optimal case, this means I can come much closer to the best possible framerate without introducing tearing, and in the worst case, the cost of rendering an individual frame is only *slightly* increased by the use of thread synchronization. The fact that I’m only doing EndDraw on another thread means that I don’t have to worry about protecting my game data with locks and other synchronization techniques, since the GraphicsDevice doesn’t use any of my game data when performing the EndDraw operation.
So, to test this out, I overrode my Game class’s BeginDraw and EndDraw methods. This turns out to be all we have to do to change the way drawing is performed, because the XNA Framework developers were kind enough to make both of these methods virtual.
protected override bool BeginDraw () {
_DrawCompleteEvent.WaitOne();
_DrawCompleteEvent.Reset();
return base.BeginDraw();
}
protected override void EndDraw () {
_DrawRequiredEvent.Set();
}
Of course, at this point, the two events used here are never set, so this code won’t work. Thus, we add a background thread to perform our painting:
AutoResetEvent _DrawRequiredEvent = new AutoResetEvent(false);
ManualResetEvent _DrawCompleteEvent = new ManualResetEvent(true);
public Game () {
...
_DrawThread = new Thread(DrawThreadFunc);
_DrawThread.IsBackground = true;
_DrawThread.Start();
}
protected void DrawThreadFunc () {
while (true) {
_DrawRequiredEvent.WaitOne();
base.EndDraw();
_DrawCompleteEvent.Set();
}
}
Fairly simple thread programming here: We create a thread, and set IsBackground to true so that it will stop as soon as the main thread exits. The thread spends all of its time waiting for a ‘required draw’ signal, and then performs an EndDraw call. Once the call is complete, it sets another signal to inform the Game class that the previous draw has finished and it’s safe to perform a BeginDraw call (this lets us make sure that we never use the GraphicsDevice on the main thread while the background thread is performing an EndDraw).
Once this is all done, I start up my game, and… the framerate isn’t any different. Huh? What’s more, my frame profiler indicates that Update is now taking ten times as long as it used to, while Draw isn’t any faster. Huh???
In situations like this, it’s always good to consult a profiler to see if you’re missing something important:
So, we can see that the DrawThread is definitely doing its job – it calls EndDraw, then waits for a signal asking it to draw again. Both are taking about as much time as we’d expect. But why is Update taking so long…?
… oh. Oops.
So it turns out that I was using GraphicsDevice.Viewport.Width and GraphicsDevice.Viewport.Height in my camera code. Accessing the Viewport property caused the XNA framework to call into Direct3D to retrieve the viewport, which acquired the exact same lock being used by EndDraw, causing my main thread to stall until the draw completed. WHOOPS.
This is especially embarassing since the viewport size never changes anyway, so I could have just stored the width/height into constants. After doing just that and starting the game again, the profile looks more like you’d expect:
What’s more, this is actually an improvement: With vertical sync enabled, this results in a significant reduction in the amount of time spent inside the BeginDraw/Draw/EndDraw functions on the main thread, which means there’s more time left to perform Updates. This means that I can maintain a solid, smooth 60fps easier on dual-core/hyperthreaded machines.
Even with vertical sync disabled, this is still an improvement, though not as significant – apparently other things are happening inside EndDraw (not a big surprise), so by shifting that work off onto a second thread, I’m still gaining some time to spend performing the next update. When I disable the built in framerate balancer, this brings my framerate from ~350fps up to ~380fps. Not bad for a couple dozen lines of code!
Of course, it’s worth pointing out that the XNA Framework documentation doesn’t make any promises here, so it’s possible that this technique is unsafe. When it comes to concurrency, it’s very easy to do the wrong thing and get away with it – as you might have noticed here, I was doing something utterly stupid and unsafe in my Update function, and I got away with it because the DirectX developers had the foresight to put a lock in the right place. If they hadn’t, my game might have corrupted state from accessing the GraphicsDevice on two threads, and crashed intermittently.
Regardless, this is a handy technique – once I’ve had the chance to do lots of testing on various PC configurations (and the XBox 360), I’ll probably be using it in my game when I ship.





#1 by zaril on July 4, 2009 - 5:51 am
Quote
Awesome, I’ll give this a try as well. It’s nice that they’ve done things like that virtual, one would expect not having control over such elements just because of an “oh, you would like to use these?”