Almost every year, we get a report telling us that something in the PC industry is dying or fading away, or that the days of some aspect of computer technology are numbered.
So, when I saw an article about Micron not selling enough memory chips for AI PCs and smartphones, which led the company to downgrade its revenue forecasts for the coming quarters, and some folks panicking that ‘AI is dying’ as a result, it did not surprise me in the slightest.
This industry does love a bit of doom and gloom at times, but much of this errant noise is purely down to a limited public understanding of modern-day AI as a whole – certainly in the enthusiast sector.
Let me be clear here: AI isn’t dying – we know that. Hell, all you have to do is look at how well Nvidia is doing to get a good grasp of just how wrong that assertion is. The thing is, out of all the numerous AI laptops and phones, or other gadgets, out there – everything that’s currently being marketed with the AI tagline (I go on another long rant about that here) – the fact is that the vast bulk of AI processing doesn’t come from your tiny laptop. It just doesn’t.
Even the best custom-built gaming PC right now would barely be capable of running ChatGPT at 10% of its total capacity. And that’s assuming you could run it at all, as it’s not an open-source program that anyone can just go and download.
Sadly, it requires far too much data and processing power to fully simulate that kind of program locally on the desktop. There are workarounds and alternative apps, but they generally pale in comparison to the likes of Gemini or GPT in both depth of knowledge and response times. Not exactly surprising given you’re trying to compete with multiple server blades operating in real time. I’m sorry, your RTX 4090 just ain’t going to cut it, my friend.
And that’s another important point here – even looking at your custom PC, anyone who tells you that a CPU with a built-in NPU can outgun something like an aging RTX 3080 in AI workloads is pulling the wool over your eyes. Use something like UL’s Procyon benchmark suite with its AI Computer Vision test, and you’ll see that a desktop RTX 4080 scores around 700% to 800% higher than an Intel Core Ultra 9 185H-powered laptop. That’s not a small margin, and that’s giving the Intel chip the benefit of the doubt by not using the Nvidia TensorRT API, where the results are even better for Team Green.
The thing is, the companies, tools, and techniques that are doing well in the AI ecosystem are already well-established. Firstly, if you have an RTX graphics card, the likelihood is you’ve already got plenty of performance to run rings around most modern-day ‘AI’ CPUs with an NPU built in. Secondly, pretty much every AI program worth running utilizes server blades to deliver that performance – there’s very little that runs locally or doesn’t have some form of hookup with the cloud.
Google has now pretty much rolled out Gemini to the bulk of its Android OS devices, and it’ll be landing on its Nest speakers as well in the coming months (with a beta version technically already being available, thanks to some fun Google Home Public Preview skullduggery). And to be clear, that’s a four-year-old speaker at this point, not exactly cutting-edge tech.
This is just the beginning
Many years back, I had a discussion with Roy Taylor, who at the time was at AMD as Corporate Vice President of Media & Entertainment, specializing in VR and the advancements in that field.
My memory is a little hazy, but the long and short of the conversation was that as far as graphics card performance was concerned, to get a true-to-life experience in VR, with a high enough pixel density and sufficient frame rate to ensure a human couldn’t tell the difference, we’d need GPUs capable of delivering petaflops of performance. I think the exact figure was around the 90 PFLOPS mark (for reference, an RTX 4090 is still well over 100x less potent than that).
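To put a rough number on that gap, here is a minimal back-of-the-envelope sketch. It assumes a peak FP32 figure of about 82.6 TFLOPS for the RTX 4090 (Nvidia's headline spec, which is not quoted in the conversation above), and it treats the hazily recalled 90 PFLOPS target as given. The function name and constants are purely illustrative.

```python
# Back-of-the-envelope comparison, not a benchmark.
# Assumption: ~82.6 TFLOPS peak FP32 for the RTX 4090 (headline spec).
# Assumption: the 90 PFLOPS VR target is the recalled figure from the text.

TERA = 1e12
PETA = 1e15


def performance_gap(target_flops: float, gpu_flops: float) -> float:
    """Return how many times larger the target is than the GPU's peak throughput."""
    if gpu_flops <= 0:
        raise ValueError("gpu_flops must be positive")
    return target_flops / gpu_flops


vr_target = 90 * PETA          # recalled "true-to-life VR" target
rtx_4090_fp32 = 82.6 * TERA    # assumed peak FP32 throughput

gap = performance_gap(vr_target, rtx_4090_fp32)
print(f"The target is roughly {gap:,.0f}x the GPU's peak throughput")
```

Under those assumptions the shortfall lands at around three orders of magnitude, which squares comfortably with the "well over 100x" estimate. Peak spec-sheet throughput also flatters real-world performance, so the practical gap would likely be wider still.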
In my mind, local AI feels like it falls very much in the same camp. It’s a realm of apps, utilities and tools that likely won’t ever inhabit your local gaming PC, but will instead reside solely on server blades and supercomputers. There’s just no way an isolated computer system can compete – even if we were to halt all AI development at its current state, it would take local hardware years to catch up in terms of overall performance. That’s not necessarily a bad thing or the end of the world either.
There is a silver lining for us off-the-grid folk, and it all hinges on GPU manufacturers. Naturally, AI programming, particularly machine learning, predominantly operates through parallel computing. This is something that GPUs are wildly good at doing, far better than CPUs, and particularly Nvidia GPUs utilizing Tensor cores. That parallel muscle is what powers upscalers like DLSS (which runs on those Tensor cores) and FSR, driving up frame rates without sacrificing in-game graphical fidelity.
However, developing a GPU from the ground up takes time – a long time. For a brand-new architecture, we’re talking several years. That means the RTX 40 series was likely in development in 2020/2021, at a guess, and similarly, the RTX 50 series (when the next-gen arrives, supposedly imminently) probably began life in 2022/2023, with different teams shuffling about from task to task as and when they became available. All of that prior to the thawing of the most recent AI winter and the arrival of ChatGPT.
What that tells us is that unless Nvidia can radically pivot its designs on the fly, it’s likely that the RTX 50 series will still continue on from Lovelace’s (RTX 40 series) success, giving us even better AI performance, for sure. But it won’t be until the RTX 60 series that we really see AI capacity and performance supercharged in a way that we’ve not seen before with these GPUs. That may be the generation of graphics cards that could make localized LLMs a reality rather than a pipe dream.