Oxide Games developer Dan Baker helped answer some questions we had about AMD’s new Mantle API. Oxide’s upcoming game, Star Swarm, will support Mantle out of the gate, and the company has been very vocal about Mantle, which it believes can help all gamers and also start a dialogue about the future of APIs on the PC.
Maximum PC: Most believe Mantle is a low-level API that is very close to the metal. Can you explain why this concept is wrong?
Baker: Relative to Microsoft's Direct3D (D3D), Mantle is indeed more low-level. But it's not low-level in the sense that we are exposed to individual architectural decisions. For example, Mantle still abstracts the details of the shader cores themselves, so that we don't even know if we are running on a vector machine or a scalar machine. What isn't abstracted is the basic way a GPU operates. The GPU is another processor, just like any other, that reads and writes memory. One thing that has happened is that GPUs are now pretty general in terms of functionality. They can read memory anywhere. They can write memory anywhere. A lot of the things an API has traditionally managed aren't really necessary any more. Mantle puts the responsibility onto the developer. Some feel that is too much, but this really isn't any different than managing multiple CPUs on a system, which we have gotten pretty good at. We don't program multiple CPUs with an API, we just handle it ourselves. Mantle gives us a similar capability for the GPU.
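To make the idea of developer-managed GPU memory concrete, here is a minimal sketch of a linear (bump) allocator written in Python. It is not Mantle code and does not reflect how Oxide manages memory: the class name, the 256-byte alignment, and the use of a bytearray as a stand-in for a GPU heap are all assumptions chosen for illustration. The point is only that the application, not the API, decides where each resource lives.

```python
class LinearAllocator:
    """Hypothetical bump allocator over one large block of memory."""

    def __init__(self, size_bytes):
        self.memory = bytearray(size_bytes)  # stand-in for a GPU heap
        self.offset = 0                      # next free byte

    def allocate(self, size_bytes, alignment=256):
        # Round the current offset up to the requested alignment.
        start = (self.offset + alignment - 1) // alignment * alignment
        end = start + size_bytes
        if end > len(self.memory):
            raise MemoryError("heap exhausted; the application must handle this")
        self.offset = end
        return start  # the application tracks offsets itself

    def write(self, offset, data):
        # Refuse writes that would run past the end of the heap.
        if offset < 0 or offset + len(data) > len(self.memory):
            raise ValueError("write out of bounds")
        self.memory[offset:offset + len(data)] = data

    def reset(self):
        # Release everything at once, for example at the end of a frame.
        self.offset = 0


heap = LinearAllocator(1024)
vertex_offset = heap.allocate(100)  # 0
index_offset = heap.allocate(50)    # 256, rounded up from 100
heap.write(vertex_offset, b"\x01" * 100)
print(vertex_offset, index_offset)
```

A real engine would layer lifetime tracking and synchronization with the GPU on top of something like this; the sketch leaves both out.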
Maximum PC: You’ve said Mantle really addresses the inefficiencies of DirectX, which was architected in the 1990s. Can you give us some examples of the inefficiencies of DX? We know you’ve mentioned draw calls as an example.
Baker: DirectX was architected in a time when two things were true. First, the hardware itself was very fixed-function. That is, there was a lot of secret sauce as to what exactly it did. As shading models became almost completely general, abstracting this level became less useful. The best way to think of a GPU is just a processor that runs programs. All we really want our API to do is give us a means of executing little programs on the GPU. These programs are what's in a batch, or a draw call. We don't want it to manage memory, we don't want it to 'make things easier for us.'
Oxide's Star Swarm RTS promises a huge boost in performance when running AMD's Mantle API
The second problem is that APIs are still designed in this functional threading model where you have a series of processes that pass work back and forth to each other. The idea is that you have, say, one thread for rendering, one thread for audio, one thread for gameplay, etc. This is really not a scalable way to build things. In situations where you have a shared L3 cache, you also create contention from all the different processes running, since they all access completely different memory. The industry continues to move to a job-based setup, where we have lots of tiny jobs that run asynchronously. This can now scale to a large number of CPUs, and we can fill up most of the previously unused time where one of the processors isn't doing something.
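The job-based setup Baker describes can be sketched in a few lines of Python. This is an illustration of the structure only, not Oxide's Nitrous engine: the function names and the two-stage frame are hypothetical, and Python threads do not run CPU-bound work in parallel, so the sketch shows how work is divided rather than how fast it runs. Instead of dedicating one thread to each subsystem, the frame is cut into many small jobs that any free worker can pick up.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def update_unit(unit_id):
    # Hypothetical small job: simulate one unit and return its new position.
    return unit_id, unit_id * 2


def build_draw_command(unit_state):
    # Hypothetical small job: turn one unit's state into a draw command.
    unit_id, position = unit_state
    return f"draw unit {unit_id} at {position}"


def run_frame(unit_ids, worker_count):
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # Stage 1: many tiny, independent simulation jobs.
        states = list(pool.map(update_unit, unit_ids))
        # Stage 2: many tiny command-building jobs; any worker takes any job.
        commands = list(pool.map(build_draw_command, states))
    return commands


commands = run_frame(range(1000), worker_count=os.cpu_count() or 4)
print(len(commands), commands[3])  # 1000 draw unit 3 at 6
```

Because the worker count is just a parameter, the same code spreads across 2 or 16 cores without restructuring, which is the scaling property a thread-per-subsystem design lacks.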
Maximum PC: Does OpenGL face the same limitations? It’s also a pretty old API at this point.
Baker: OpenGL has essentially all the core problems of D3D, except that one can add extensions to it. The main difference between OpenGL and D3D is that D3D made an attempt to be threaded, and failed, whereas OpenGL has not yet attempted it. One question is whether it is worth building a new API or making a bunch of extensions to an existing API. You can get some mileage out of making extensions, but extensions can't bridge things like being able to use multiple CPUs. Also, at some point it's cleaner and easier just to hit the reset button, rather than throw yet another feature in a fairly big API. Believe it or not, Mantle is actually easier to support than OpenGL. OpenGL has many unobvious pitfalls and traps, whereas Mantle really doesn't.
Oxide Games and Mantle Presentation Demo
Maximum PC: There have been some pretty wild claims of performance increases by going to Mantle. How much have you seen in your game?
Baker: This depends on how exploitative you are, and the specifics of your engine. For us, we have been completely limited in what we could do by driver overhead problems. We were actually making decisions where we traded GPU performance for CPU; that is, we’d end up doing things that are slower on the GPU, because we could get away with less driver overhead. When you talk about building an FPS, you probably spend much of your time optimizing for the GPU; when you try to build an RTS, you end up optimizing for the driver overhead. Nitrous is a new breed of rendering system. Oxide’s specialty is high throughput.
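The trade-off between GPU work and driver overhead can be shown with a rough cost model. Every number below is hypothetical and chosen only to make the arithmetic visible; none comes from Oxide or AMD. The model assumes the CPU and GPU work in parallel, so a frame takes as long as the slower of the two, and that CPU time is dominated by a fixed driver cost per draw call.

```python
def frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms):
    """Frame time when CPU submission and GPU execution overlap."""
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0
    return max(cpu_ms, gpu_ms)


# Hypothetical high-overhead API: 5 microseconds of driver cost per draw call.
# Path A: 10,000 small draw calls, efficient on the GPU (12 ms).
# Path B: 1,000 merged draw calls, less efficient on the GPU (20 ms).
print(frame_time_ms(10_000, 5.0, 12.0))  # 50.0, limited by the driver
print(frame_time_ms(1_000, 5.0, 20.0))   # 20.0, slower GPU path wins anyway

# Hypothetical low-overhead API: 0.5 microseconds per draw call.
print(frame_time_ms(10_000, 0.5, 12.0))  # 12.0, the GPU-efficient path wins
print(frame_time_ms(1_000, 0.5, 20.0))   # 20.0
```

Under the high-overhead assumption, the engine is better off choosing the path that is slower on the GPU, which is the kind of decision Baker describes. Once the per-call cost drops, the GPU-efficient path becomes the better choice again.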
When you look at Star Swarm, it's really a testament to brute force. For us, we can see cases where Mantle is many times faster, with especially big differences as we add more cores and slow down their clock speeds. We wouldn't expect most games to necessarily see this, as it will happen in cases where you have a really efficient, high throughput engine, but it will certainly make an impact everywhere. We aren't set up to do very precise testing, so we'd rather others do the analysis on this. However, we'd like to point out that our Direct3D performance is absolutely outstanding, relative to what is expected. We have spent a huge amount of time optimizing around D3D, and we feel we are actually pretty biased in D3D’s favor. Mantle, on the other hand, we've spent far less time with and currently have only pretty basic optimizations. But Mantle is such an elegant API that it still dwarfs our D3D performance.
Maximum PC: It seems that the main challenge for Mantle to succeed is getting support from Nvidia and Intel. Do you see that as actually happening from a developer point of view?
Baker: Yes and no. Mantle does detract from other platforms, and we are already seeing a big dialogue in terms of what future APIs should look like. Mantle is kind of the disruptive technology that gets everyone rethinking things. Whether this means a new version of OpenGL, or a new version of D3D, we can't say. But it is clear that they will have to adapt if they want to stay relevant. Some of us have been screaming for change for years. The arguments we got in the past were: 1) it couldn't be faster, 2) it would be too hard to use, and 3) we have enough performance, so more isn't useful. We wanted to show that all three of these things are provably false.
Maximum PC: Do you see a world where developers will have to write for DX and Mantle? How much of a challenge is it to write for both APIs?
Baker: APIs come and go. Once you support more than one, it's pretty easy to support a dozen, assuming there is parity in the hardware features, and assuming you don't have to rewrite your shaders in an entirely different language. If you release a title right now, you would likely end up with six paths: Xbox 360, PS3, PS4, Xbox One, DX9, and DX11. For us, the graphics system is just a module that talks to the API. All we did for Mantle was replace the D3D module with a Mantle one. It's about 3,000 to 4,000 lines of code for the Mantle version, which took me personally about two months to write. In terms of support, at least for us, it wasn't terribly difficult.
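The "graphics system as a module" design can be sketched as a small interface with interchangeable implementations. This Python sketch is an assumption about the general shape of such a design, not Oxide's code: the class and method names are hypothetical, and the backends return strings rather than calling any real graphics API.

```python
from abc import ABC, abstractmethod


class GraphicsBackend(ABC):
    """Hypothetical interface the rest of the engine talks to."""

    @abstractmethod
    def submit(self, batches):
        """Send a list of batches to the underlying API."""


class D3DBackend(GraphicsBackend):
    def submit(self, batches):
        # Placeholder for the API-specific translation layer.
        return [f"D3D:{batch}" for batch in batches]


class MantleBackend(GraphicsBackend):
    def submit(self, batches):
        # Same interface, different API-specific translation layer.
        return [f"Mantle:{batch}" for batch in batches]


class Renderer:
    """API-independent engine code; it never names a specific backend."""

    def __init__(self, backend):
        self.backend = backend

    def render(self, scene):
        batches = sorted(scene)  # stand-in for culling and sorting work
        return self.backend.submit(batches)


scene = ["terrain", "ships", "beams"]
print(Renderer(D3DBackend()).render(scene))
print(Renderer(MantleBackend()).render(scene))
```

Supporting another API then means writing one more backend class, while everything above the interface stays unchanged.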