Hardly a week goes by without a breakthrough in LLMs (large language models). They’re getting cheaper to train and smarter, so why bother with their smaller siblings, SLMs (small language models)?

For any development team serious about delivering practical AI, it all comes down to focus and fit. LLMs are great for general, non-domain-specific tasks, but when AI needs to be truly useful in a business context, an SLM, or even a federation of SLMs supporting an LLM, is often the smarter choice.

Why? Because, as top reasoning engines show, using a general-purpose AI for a focused or numeric task is often overkill, and introduces risk. For example, DeepSeek R1 uses a “mixture of experts” setup with 671 billion parameters, but only up to 37 billion activate per query.

Why design it that way? Because it’s far more efficient to break a problem down and selectively call on smaller components than to run every parameter on every query. When the model spots, say, a user question that needs some mathematical sub-routines, it activates only those, not all of its encoded “brain power.”
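The routing idea can be sketched in a few lines. This is an illustrative toy, not DeepSeek R1’s actual implementation: the expert names, gate scores, and top-k value are all hypothetical stand-ins for learned sub-networks and a learned gating layer.

```python
# Toy sketch of "mixture of experts" routing: a gate scores each expert,
# and only the top-k experts run, instead of the whole model.
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Each "expert" stands in for a small specialized sub-network.
EXPERTS = {
    "math":    lambda q: f"[math expert handles: {q}]",
    "code":    lambda q: f"[code expert handles: {q}]",
    "general": lambda q: f"[general expert handles: {q}]",
}

def route(query, gate_scores, top_k=1):
    """Activate only the top-k experts for this query."""
    weights = softmax(gate_scores)
    ranked = sorted(zip(EXPERTS, weights), key=lambda pair: -pair[1])
    return [(name, EXPERTS[name](query)) for name, _ in ranked[:top_k]]

# A question the gate scores as mathematical: only the math expert fires.
print(route("What is the integral of x^2?", gate_scores=[2.5, 0.3, 0.1]))
```

The key property is the last line: the query touches one expert out of three, which is the same economy that lets a 671-billion-parameter model answer with only a fraction of its weights active.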

Dominik Tomicevic

CEO, Memgraph.

It’s easy to see how much more useful the results become when you stop assuming one LLM can do it all. A smarter approach is to deploy different SLMs to analyze specific areas of your business, such as finance, operations, and logistics, and then feed their focused outputs into a more general model that synthesizes the findings into a single, coherent response.

When you think about it, this model of coordination is deeply human. Our brains don’t fire up every region at once; we activate specific areas for language, memory, motor function, and more. It’s a modular, connected form of reasoning that mirrors how we solve problems in the real world. A physicist might stumble outside their domain, while a generalist can offer broader but less precise insights. Likewise, AI systems that recognize boundaries of expertise and delegate tasks accordingly can tackle far more complex problems than even the smartest standalone LLM.

SLMs vs LLMs

To test the case for SLMs, just try asking ChatGPT, or any general-purpose LLM, about your AWS infrastructure. Since LLMs are notoriously imprecise with numbers, even a basic question like “how many servers do we have?” will likely produce a guess or hallucination, not a reliable answer.

A better approach would chain together an SLM trained to generate accurate database queries, retrieve the exact data, and then pass that to an LLM to explain the result in natural language. For predictive tasks, classical statistical models often still outperform neural networks — and in those cases, an SLM could be used to optimize the model’s parameters, with an LLM summarizing and contextualizing the results.
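That chain can be sketched as a simple pipeline. All three functions here are hypothetical stand-ins: in practice the first would call a small query-generation model fine-tuned on your schema, the second your actual database driver, and the third a general-purpose LLM, with the server count coming from data rather than model weights.

```python
# Hedged sketch of the SLM -> database -> LLM chain described above.
# Every function body is a placeholder; only the shape of the pipeline
# is the point.

def slm_generate_sql(question: str) -> str:
    # A small model trained on your schema emits an exact query,
    # instead of letting a general LLM guess at numbers.
    return "SELECT COUNT(*) FROM servers WHERE provider = 'aws';"

def run_query(sql: str) -> int:
    # Stand-in for a real database call; 412 is a made-up result.
    return 412

def llm_explain(question: str, result: int) -> str:
    # The general model only phrases the verified number; it never invents it.
    return f"You currently run {result} AWS servers."

def answer(question: str) -> str:
    sql = slm_generate_sql(question)
    return llm_explain(question, run_query(sql))

print(answer("How many servers do we have?"))
```

The division of labor is what matters: the number in the final answer is retrieved, not generated, so the hallucination risk sits only in the wording, not the fact.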

SLMs aren’t just cheaper — they’re often more capable in niche domains. Take Microsoft’s Phi-2, a small model trained on high-quality math and coding data. Because of its focused, domain-specific training, it famously outperformed much larger models in its area of expertise.

Until (or unless) we reach true AGI, no single model will be great at everything. But an SLM trained for a specific task often outperforms a generalist. Give it the right context, and it delivers peak performance — simple as that.

Granularity matters. You don’t need an AI that knows who won the World Cup in 1930; you need one that understands how your company mixes paint, builds networks, or schedules deliveries. Domain focus is what makes AI useful in the real world.

And for mid-sized operations, SLMs are much more cost-effective. They require fewer GPUs, consume less power, and offer better ROI. That also makes them more accessible to smaller teams, who can afford to train and run models tailored to their needs.

Best to be flexible

So, is the case closed? Just pick an SLM and enjoy guaranteed ROI from enterprise AI? Not quite.

The real challenge is getting your supply chain, or any domain-specific data, into the model in a usable, reliable way. Both LLMs and SLMs rely on transformer architectures that are typically trained in large batches. They’re not naturally suited to continuous updates.

To keep an SLM relevant and accurate, you still need to feed it fresh, contextual data. That’s where graph technology comes in. A well-structured knowledge graph can act as a live tutor, constantly grounding the model in updated, trustworthy information.
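A minimal sketch of that grounding pattern: facts are looked up in the graph at question time and injected into the prompt, so the model reasons over current data rather than stale training weights. The tiny dict-based graph, the entity names, and the prompt format are all illustrative assumptions, not a real graph database API.

```python
# Toy knowledge graph: subject -> relation -> objects. In production this
# would be a real graph database queried at request time.
GRAPH = {
    "warehouse_A": {"stocks": ["paint_red", "paint_blue"], "located_in": ["Berlin"]},
    "paint_red":   {"mixed_from": ["pigment_r", "base_white"]},
}

def neighbors(entity: str, depth: int = 1) -> list:
    """Collect facts within `depth` hops of an entity as plain triples."""
    facts, frontier = [], [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation, objects in GRAPH.get(node, {}).items():
                for obj in objects:
                    facts.append(f"{node} {relation} {obj}")
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts

def grounded_prompt(question: str, entity: str) -> str:
    # The model sees fresh, trustworthy triples alongside the question.
    context = "\n".join(neighbors(entity, depth=2))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(grounded_prompt("What goes into our red paint?", "warehouse_A"))
```

When the warehouse’s stock changes, the next prompt changes with it: no retraining, just a fresh lookup, which is exactly the “live tutor” role a knowledge graph plays for an SLM.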

This combination, SLM plus knowledge graph, is proving particularly powerful in high-stakes domains. It delivers faster, more precise, and more cost-efficient outputs than a standalone LLM.

Add to this the growing adoption of Retrieval-Augmented Generation (RAG), especially in graph-enabled setups (GraphRAG), and you have a game-changer. By bridging structured and unstructured data and injecting just-in-time context, this architecture makes AI genuinely enterprise-ready.

GraphRAG also boosts reasoning by continuously retrieving relevant, real-world information, instead of relying on static or outdated data. The result? Sharper, more contextual responses that elevate tasks like query-focused summarization (QFS) and enable SLMs to operate with greater precision and adaptability.

In short, if we want AI systems that truly address real business challenges, rather than echo back what they think we want to hear, the future isn’t about building ever-bigger LLMs. For many enterprise scenarios, a hybrid SLM/GraphRAG model may be the real path forward for GenAI.


This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro
