Two years after the public release of ChatGPT, conversations around artificial intelligence are inevitable as companies in every industry look to leverage large language models (LLMs) to transform their business processes. However, although LLMs are powerful and promising, many business and IT leaders have come to over-rely on them and overlook their limitations. That's why I foresee a future in which specialized language models, or SLMs, will play a larger, complementary role in enterprise IT.
SLMs are more commonly referred to as "small language models" because they require less data and training time and are "more streamlined versions of LLMs." However, I prefer the term "specialized" because it better describes the ability of these purpose-built solutions to perform highly specialized work with greater accuracy, consistency and transparency than an LLM. By complementing LLMs with SLMs, organizations can create solutions that leverage the strengths of each model.
Trust and the LLM “black box” problem
LLMs are extremely powerful, but they are also known to sometimes "lose the plot," or offer outputs that veer off course due to their generalist training and massive data sets. The problem is compounded by the fact that ChatGPT and other leading LLMs are essentially "black boxes" that do not reveal how they arrive at their answers.
This black box problem is going to become more serious going forward, particularly for enterprises and business-critical applications where accuracy, consistency and compliance are paramount. Think of healthcare, financial services and legal as prime examples of professions where wrong answers can have huge financial and even life-or-death consequences. Regulators are already taking notice and will likely begin to demand explainable AI solutions, especially in industries where data privacy and accuracy are essential.
While companies often implement human-in-the-loop approaches to mitigate these issues, over-reliance on LLMs can create a false sense of security. Over time, complacency can set in and mistakes can slip through undetected.
SLMs = greater explainability
Fortunately, SLMs are better suited to address many of the limitations of LLMs. Rather than being designed for general-purpose tasks, SLMs are developed with a narrower focus and trained on domain-specific data. This specificity allows them to handle nuanced language requirements in areas where precision is most critical. Instead of relying on massive, heterogeneous datasets, SLMs are trained on targeted information, giving them the contextual intelligence to deliver more consistent, predictable and accurate responses.
This offers several benefits. First, SLMs are more explainable, making it easier to understand the source and rationale behind their outputs. This is crucial in regulated industries where decisions must be traced back to a source.
Second, their smaller size means they can often run faster than LLMs, which can be a key factor in real-time applications. Third, SLMs offer companies greater control over data privacy and security, especially if deployed internally or built specifically for the enterprise.
Furthermore, while SLMs may initially require specialized training, they reduce the risks associated with relying on external LLMs controlled by third-party providers. This control is invaluable in applications requiring strict data handling and compliance.
Focus on developing expertise (and beware of vendors who overpromise)
I want to make it clear that LLMs and SLMs are not mutually exclusive. In practice, SLMs can augment LLMs, creating hybrid solutions where LLMs provide broader context and SLMs deliver precise execution. It's still early days even for LLMs, which is why I always advise technology leaders to continue exploring the many opportunities and benefits of LLMs.
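As a simple illustration, one common way to build such a hybrid is a router that sends queries touching a regulated domain to a purpose-built SLM and everything else to a general LLM. The sketch below is purely illustrative: the keyword list and both model calls are hypothetical stubs standing in for real API or inference integrations, not a prescribed implementation.

```python
# Minimal sketch of a hybrid LLM/SLM routing layer.
# DOMAIN_KEYWORDS, call_slm and call_llm are hypothetical placeholders.

DOMAIN_KEYWORDS = {"contract", "diagnosis", "compliance", "audit"}

def call_slm(query: str) -> str:
    # Stand-in for a domain-tuned specialized model.
    return f"[SLM] {query}"

def call_llm(query: str) -> str:
    # Stand-in for a general-purpose large model.
    return f"[LLM] {query}"

def route(query: str) -> str:
    """Send domain-specific queries to the SLM; default to the LLM."""
    words = {w.strip(".,?!").lower() for w in query.split()}
    if words & DOMAIN_KEYWORDS:
        return call_slm(query)
    return call_llm(query)
```

In production, the routing decision would more likely come from a lightweight classifier than a keyword set, but the division of labor is the same: broad context from the LLM, precise execution from the SLM.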
Additionally, while LLMs can scale well across a variety of problems, SLMs may not perform well outside their intended use cases. Therefore, it is critical to have a clear understanding from the outset of which use cases need to be tackled.
It’s also critical that business and IT leaders devote more time and attention to building the distinct skills required to train, tune and test SLMs. Fortunately, there is a great deal of free information and training available from popular sources such as Coursera, YouTube and Huggingface.co. Leaders should ensure their developers have adequate time to learn and experiment with SLMs as the battle for AI expertise intensifies.
I also advise leaders to vet partners carefully. I recently spoke with a company that asked for my opinion on a certain technology vendor's claims. In my view, the vendor was either exaggerating its claims or simply did not understand the technology's capabilities.
The company wisely took a step back and implemented a controlled proof of concept to test the vendor’s claims. As I suspected, the solution simply wasn’t ready for primetime, and the company managed to move away from it with relatively little time and money invested.
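A lightweight way to run such a controlled proof of concept is to score the candidate model against a small gold set of questions with known answers before committing further. The harness below is a minimal sketch under assumed conditions: the gold questions and the stub model are invented for illustration, and `model` can be any callable wrapping the real vendor integration.

```python
# Minimal proof-of-concept harness: exact-match accuracy on a gold set.
# GOLD and stub_model are hypothetical examples, not real data.

def evaluate(model, gold_set):
    """Return the fraction of gold questions the model answers exactly."""
    correct = sum(
        1 for question, expected in gold_set
        if model(question).strip().lower() == expected.strip().lower()
    )
    return correct / len(gold_set)

GOLD = [
    ("What is the statutory filing deadline?", "april 15"),
    ("Which form covers quarterly estimates?", "form 1040-es"),
]

def stub_model(question: str) -> str:
    # Replace with a call into the vendor's system under test.
    answers = {"What is the statutory filing deadline?": "April 15"}
    return answers.get(question, "unknown")
```

Running `evaluate(stub_model, GOLD)` against an agreed accuracy threshold turns a vendor's marketing claims into a cheap, objective pass/fail decision, which is exactly what let the company above walk away early.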
Whether a company is starting with a proof of concept or a live deployment, my advice is to start small, test often and build on early successes. In my own experience, a model that performs well on a small set of instructions and data can produce unexpected results once it is fed more information. The cautious path, therefore, is slow and steady.
In summary, while LLMs will continue to provide increasingly valuable capabilities, their limitations will become more apparent as enterprises deepen their reliance on AI. Complementing them with SLMs offers a way forward, especially in high-stakes fields that demand accuracy and explainability. By investing in SLMs, companies can future-proof their AI strategies, ensuring that their tools not only drive innovation but also meet the requirements of trust, reliability and control.
AJ Sunder is co-founder, CIO and CPO at Responsive.