Artificial Intelligence and deep learning are constantly in the headlines these days, whether it be ChatGPT generating poor advice, self-driving cars, artists being accused of using AI, medical advice from AI, and more. Most of these tools rely on complex servers with lots of hardware for training, but using the trained network via inference can be done on your PC, using its graphics card. But how fast are consumer GPUs for doing AI inference? We've benchmarked Stable Diffusion, a popular AI image creator, on the latest Nvidia, AMD, and even Intel GPUs to see how they stack up.

If you've by chance tried to get Stable Diffusion up and running on your own PC, you may have some inkling of how complex - or simple! - that can be. The short summary is that Nvidia's GPUs rule the roost, with most software designed using CUDA and other Nvidia toolsets. But that doesn't mean you can't get Stable Diffusion running on the other GPUs.

We ended up using three different Stable Diffusion projects for our testing, mostly because no single package worked on every GPU. For Nvidia, we opted for Automatic 1111's webui version; it performed best, had more options, and was easy to get running. AMD GPUs were tested using Nod.ai's Shark version - we checked performance on Nvidia GPUs (in both Vulkan and CUDA modes) and found it was lacking. Getting Intel's Arc GPUs running was a bit more difficult, due to lack of support, but Stable Diffusion OpenVINO gave us some very basic functionality.
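To make the "inference on your own PC" idea concrete, here's a minimal sketch of a local text-to-image run using Hugging Face's diffusers library with the SD1.4 checkpoint. This is not one of the three projects we benchmarked; it's just about the shortest path to generating an image on an Nvidia GPU, assuming PyTorch with CUDA support and the diffusers package are installed.

```python
import torch
from diffusers import StableDiffusionPipeline

# Download the SD1.4 weights (the model used by Automatic 1111 and OpenVINO
# in our tests) and move the pipeline onto the GPU in half precision.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# A single 512x512 generation; all the heavy lifting is GPU inference.
image = pipe("postapocalyptic steampunk city, cinematic, hyper detailed",
             width=512, height=512).images[0]
image.save("steampunk_city.png")
```

Each of the three projects we tested wraps a broadly similar pipeline in its own interface and backend - CUDA for Automatic 1111, Vulkan for Nod.ai's Shark, and OpenVINO for Intel.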
Disclaimers are in order. We didn't code any of these tools, but we did look for stuff that was easy to get running (under Windows) that also seemed to be reasonably optimized. We're relatively confident that the Nvidia 30-series tests do a good job of extracting close to optimal performance - particularly when xformers is enabled, which provides an additional ~20% boost in performance (though at reduced precision that may affect quality). RTX 40-series results were lower initially, but George SV8ARJ provided a fix: replacing the PyTorch CUDA DLLs gave a healthy boost to performance.

The AMD results are also a bit of a mixed bag: RDNA 3 GPUs perform very well, while the RDNA 2 GPUs seem rather mediocre. Nod.ai let us know they're still working on 'tuned' models for RDNA 2, which should boost performance quite a bit (potentially doubling it) once they're available. Finally, on Intel GPUs, even though the ultimate performance seems to line up decently with the AMD options, in practice the time to render is substantially longer - it takes 5–10 seconds before the actual generation task kicks off, and probably a lot of extra background stuff is happening that slows it down.

We're also using different Stable Diffusion models, due to the choice of software projects. Nod.ai's Shark version uses SD2.1, while Automatic 1111 and OpenVINO use SD1.4 (though it's possible to enable SD2.1 on Automatic 1111). Again, if you have some inside knowledge of Stable Diffusion and want to recommend different open source projects that may run better than what we used, let us know in the comments (or just email Jarred).

Our testing parameters are the same for all GPUs, though there's no option for a negative prompt on the Intel version (at least, not that we could find). The above gallery was generated using Automatic 1111's webui on Nvidia GPUs, with higher resolution outputs (that take much, much longer to complete). It's the same prompts but targeting 2048x1152 instead of the 512x512 we used for our benchmarks. Note that the settings we chose were selected to work on all three SD projects; some options that can improve throughput are only available on Automatic 1111's build, but more on that later.

Positive prompt: Postapocalyptic steampunk city, exploration, cinematic, realistic, hyper detailed, photorealistic maximum detail, volumetric light, (((focus))), wide-angle, (((brightly lit))), (((vegetation))), lightning, vines, destruction, devastation, wartorn, ruins

Sampling algorithm: Some Euler variant (Ancestral on Automatic 1111, Shark Euler Discrete on AMD)

The sampling algorithm doesn't appear to majorly affect performance, though it can affect the output. Automatic 1111 provides the most options, while the Intel OpenVINO build doesn't give you any choice.
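For anyone who wants to approximate this setup outside the three GUIs, the sketch below reproduces the settings above - 512x512 output, an Euler Ancestral sampler, and the same positive prompt - with a simple images-per-minute timer. It uses diffusers rather than any of the projects we actually tested, and the webui-style (((attention))) parentheses are passed through verbatim rather than parsed, so treat its numbers as comparable in spirit, not directly, to our charts.

```python
import time
import torch
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

PROMPT = ("Postapocalyptic steampunk city, exploration, cinematic, realistic, "
          "hyper detailed, photorealistic maximum detail, volumetric light, "
          "(((focus))), wide-angle, (((brightly lit))), (((vegetation))), "
          "lightning, vines, destruction, devastation, wartorn, ruins")

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# Swap in an Euler Ancestral sampler to mirror the settings above.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
    pipe.scheduler.config)

# Optional: xformers memory-efficient attention (the faster "green" results
# in our charts). Requires the xformers package to be installed.
pipe.enable_xformers_memory_efficient_attention()

pipe(PROMPT, width=512, height=512)  # warm-up run, excluded from timing

n = 5
start = time.time()
for _ in range(n):
    pipe(PROMPT, width=512, height=512)
print(f"{60 * n / (time.time() - start):.2f} images per minute")
```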
Here are the results from our testing of the AMD RX 7000/6000-series, Nvidia RTX 40/30-series, and Intel Arc A-series GPUs. As expected, Nvidia's GPUs deliver superior performance - sometimes by massive margins - compared to anything from AMD or Intel. Note that each Nvidia GPU has two results: one using the default computational model (slower, in black) and a second using Facebook's faster xformers library (in green).