
This new GPU trick just made games 1.64x faster

One of the biggest drawbacks of the way games currently render 3D scenes is that there's still a surprising amount of back-and-forth communication required between the CPU and GPU. This overhead can slow down graphics card processing in a number of ways. However, a new technique demonstrated by AMD has managed to massively reduce this, boosting performance by 1.64x with no extra processing power required.

The technique was demonstrated on an AMD Radeon RX 7900 XTX, the best graphics card you can currently buy for workloads without ray tracing, but the technique doesn't require such a high-end GPU. As such, it could bring performance increases in lots of games across a range of GPUs.

This breakthrough concerns the fact that in many workloads, an initial calculation done on the GPU determines that some subsequent work also needs doing on the GPU. However, in the current GPU workload setup, this subsequent work must be triggered by the CPU, so a round trip is required from the GPU to the CPU and back again (often using the ExecuteIndirect command in DirectX's D3D12). This is both inefficient and slow, relative to the GPU simply being able to handle the entire process itself.
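To make the round trip concrete, here is a minimal sketch of the host-driven pattern, written as a CPU-side toy model in Python rather than real D3D12 code. The function names (`gpu_pass`, `host_driven`) and the rule that each item spawns two smaller items are assumptions made purely for illustration.

```python
# Toy CPU-side model of the host-driven pattern; this is NOT D3D12 code.
# All names and the workload rule are hypothetical.

def gpu_pass(items):
    """Pretend GPU pass: each item n > 0 spawns two follow-up items of n - 1."""
    follow_up = []
    for n in items:
        if n > 0:
            follow_up.extend([n - 1, n - 1])
    return follow_up


def host_driven(root):
    """The host submits a pass, waits for the result, then submits the next."""
    submissions = 0
    pending = [root]
    while pending:
        submissions += 1             # one CPU -> GPU submission per level of work
        pending = gpu_pass(pending)  # results return to the host before resubmitting
    return submissions


print(host_driven(3))  # prints 4
```

In this model the number of submissions grows with the depth of the dependent work, which is the overhead the article describes.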

An initial workaround for this was proposed a few years ago, with a system called work graphs. Work graphs allow a developer to define an entire interrelated framework of possible functions and next steps, so that the GPU knows which function to perform next without having to go back to the CPU.
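The idea can be sketched with a small CPU-side simulation. This is a toy model only, not the D3D12 work graphs API: the node names (`classify`, `split`, `leaf`), the record format, and the `dispatch` helper are all hypothetical. Each node returns the records it wants other nodes to process, and a single dispatch drains the whole queue without handing control back to the caller.

```python
from collections import deque

# Toy CPU-side model of a work graph; NOT the D3D12 work graphs API.
# Node names and the record format are hypothetical.

def classify(n):
    # Decide which node should run next for this record.
    return [("split", n)] if n > 0 else [("leaf", n)]

def split(n):
    # Fan out: one record becomes two records for the classify node.
    return [("classify", n - 1), ("classify", n - 1)]

def leaf(n):
    # Terminal node: produces no further work.
    return []

GRAPH = {"classify": classify, "split": split, "leaf": leaf}

def dispatch(entry_node, record):
    """Single dispatch: nodes feed each other until no work is left."""
    host_submissions = 1  # the caller's only action is this one call
    executed = {name: 0 for name in GRAPH}
    queue = deque([(entry_node, record)])
    while queue:
        node, rec = queue.popleft()
        executed[node] += 1
        queue.extend(GRAPH[node](rec))  # follow-up work is queued in place
    return host_submissions, executed

print(dispatch("classify", 2))
# prints (1, {'classify': 7, 'split': 3, 'leaf': 4})
```

Fourteen node invocations run here from one submission. A real implementation schedules this on the GPU, which the simulation does not attempt to model.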

Today's demo, then, is an extension of work graphs called mesh nodes. As AMD's Matthäus Chajdas puts it on the AMD GPUOpen blog, “Mesh nodes … allow a work graph to feed directly into a mesh shader, turning the work graph itself into an amplification shader on steroids.”

Didn't understand all that? Well, in essence it allows those clever work graph frameworks to directly trigger mesh shaders, which are the programs used to generate in-game terrain on the fly. It's quite a specific use case of the work graph setup, but AMD demonstrates its power with a demo that procedurally generates a large number of elements (such as the ivy shown above: left is with less generated, right is with more), all using a single initial dispatch call to the GPU. As a result, in this demo AMD measured that the traditional ExecuteIndirect method was 1.64x slower than the mesh nodes system. You can see the video demo on AMD's blog linked above.
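It can help to translate the 1.64x figure into time saved. The short calculation below assumes the figure means the ExecuteIndirect path took 1.64 times as long as the mesh nodes path for the measured workload. The 10 ms starting time is a hypothetical number chosen only to make the arithmetic visible, not a measurement from AMD's demo.

```python
SPEEDUP = 1.64  # ratio reported for AMD's demo


def time_saved_fraction(speedup):
    """Fraction of time saved when the old method takes `speedup` times as long."""
    return 1.0 - 1.0 / speedup


def new_time(old_time_ms, speedup):
    """Time for the faster method, given the slower method's time."""
    return old_time_ms / speedup


print(f"{time_saved_fraction(SPEEDUP):.1%}")  # prints 39.0%
print(f"{new_time(10.0, SPEEDUP):.2f} ms")    # prints 6.10 ms (10.0 ms is hypothetical)
```

So a 1.64x speed-up corresponds to roughly 39% less time for that workload, not 64% less, and it applies to the measured rendering work rather than to a game's overall frame rate.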

[Image: AMD mesh nodes work graph GDC demo]

What does all this mean for current and future games? Well, it's one more technique developers can call upon to try to eke out more performance from our games. It's not clear just how much a technique like this might affect outright frame rate, but by freeing up system resources generally – and CPU resources specifically – there's potential for performance to improve as other system bottlenecks are relieved.
