For many years, businesses and their data science teams valued accuracy above all else when it came to a model’s performance. Increasingly, however, other factors and trade-offs have come into play depending on the business context of models. 

From biases buried in training data to the runtime and resourcing costs of services underpinned by large language models (LLMs), businesses looking to harness the power of AI face many challenges. One way foundation model providers have sought to address these concerns is by offering smaller versions of their flagship products, known as small language models (SLMs).

Increasingly, SLMs are proving a viable and cost-effective alternative to LLMs, but as with the adoption of any advanced technology, use cases must be well defined to deliver true value.

Cost management

A good place to start when deciding between an SLM and an LLM is to understand how much power the solution you're implementing actually needs. Rudimentary tasks may not require AI at all to be automated, and many tasks will not require the full force of an LLM, whose costs can ramp up quickly due to the resources, such as GPU time, needed to support it. In most business contexts, LLMs will also be hosted by cloud service providers like Azure and AWS, so a high volume of calls to the model will push costs up further.

This is where the viability of SLMs over LLMs becomes apparent. Most foundation model providers today release their products in different sizes; Meta's Llama 3, for example, comes in 8B- and 70B-parameter versions. Each version is charged at a different rate, with SLM offerings being much cheaper. This matters when testing any use case, and particularly for small and mid-sized organizations operating under tight financial constraints.
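To make the cost difference concrete, the sketch below estimates monthly spend from call volume and token usage. The per-1K-token prices are hypothetical placeholders, not any provider's real rates; substitute your cloud provider's current pricing before drawing conclusions.

```python
# Rough cost comparison for an SLM vs. an LLM at a given call volume.
# The per-token prices below are hypothetical placeholders, not real
# provider rates -- substitute your provider's current pricing.

def monthly_cost(calls_per_month, tokens_per_call, price_per_1k_tokens):
    """Estimate monthly model spend from call volume and token usage."""
    total_tokens = calls_per_month * tokens_per_call
    return total_tokens / 1000 * price_per_1k_tokens

SLM_PRICE = 0.0002  # hypothetical $ per 1K tokens for a small model
LLM_PRICE = 0.003   # hypothetical $ per 1K tokens for a large model

calls, tokens = 500_000, 800  # e.g. a busy internal chatbot
print(f"SLM: ${monthly_cost(calls, tokens, SLM_PRICE):,.2f}/month")
print(f"LLM: ${monthly_cost(calls, tokens, LLM_PRICE):,.2f}/month")
```

Even with made-up prices, the point stands: at high call volumes, a per-token price gap of an order of magnitude compounds into a large monthly difference.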

More recent versions of a foundation model may yield better performance, but if an earlier, cheaper version or a smaller variant performs well enough, upgrading is likely an unnecessary expense. Some SLMs are small enough to run locally on a device such as a laptop or even a mobile phone. GPT-2, for example, is now fairly outdated but can be downloaded and run on a standard modern laptop. These local use cases are mostly restricted to inference, however: training a model locally would be so time-intensive that it would likely not be worth doing.

The best use cases for SLMs and LLMs

SLM-powered applications such as sentiment analysis and chatbots are quickly emerging as the most readily adoptable AI use cases. Chatbots can be deployed internally for knowledge discovery and Q&A tasks, or externally for customer service. These are often domain-specific and will require some level of fine-tuning, but once up and running they can quickly streamline workflows and increase efficiency.

A use case like clustering documents, such as grouping customer support tickets by topic and assigning a priority level to each one, is well served by an SLM. For more intricate tasks, however, such as parsing HR documents for niche information or building an advanced classification engine spanning documents and files across systems, an LLM is the more appropriate choice.
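The shape of the ticket-grouping task can be sketched in a few lines. A real pipeline would use an SLM's text embeddings to measure similarity; here, plain bag-of-words Jaccard overlap stands in so the example stays self-contained, and the tickets and threshold are invented for illustration.

```python
# Toy sketch of grouping support tickets by topic. A real pipeline would
# compare SLM embeddings; word-set overlap (Jaccard) stands in here.

def jaccard(a: set, b: set) -> float:
    """Overlap between two word sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def cluster(tickets, threshold=0.2):
    """Greedily group tickets whose word overlap with a cluster's first
    member exceeds the threshold."""
    clusters = []  # each cluster is a list of (ticket, word set) pairs
    for text in tickets:
        words = set(text.lower().split())
        for group in clusters:
            if jaccard(words, group[0][1]) >= threshold:
                group.append((text, words))
                break
        else:
            clusters.append([(text, words)])
    return [[t for t, _ in group] for group in clusters]

tickets = [
    "password reset link not working",
    "password reset email never arrives",
    "invoice shows wrong total",
    "wrong total on my invoice",
]
for group in cluster(tickets):
    print(group)
```

The greedy single-pass design keeps the sketch short; production systems would use a proper clustering algorithm over embedding vectors instead.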

This is because the context window (the amount of text a model can take into account when responding to a prompt) is generally much smaller for SLMs. SLMs are also more prone to hallucination because they are trained on far less data: with a smaller knowledge base, they are far more likely to produce inaccurate guesses at the answers needed. This immediately rules them out of more sensitive applications, such as medical diagnostics, engineering, and financial services, where AI adoption is in any case still at a fairly early stage.
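A small context window often forces you to chunk long inputs before the model ever sees them. The sketch below illustrates the arithmetic; the 512-token window and the reserve for the prompt and reply are illustrative figures, not any specific model's limits, and a word count stands in for a real tokenizer.

```python
# Sketch: fitting a long document into a small context window by
# chunking. Word count stands in for a real tokenizer, and the window
# size is an illustrative figure, not a specific model's limit.

CONTEXT_WINDOW = 512   # hypothetical SLM window, in tokens
RESERVED = 112         # room kept back for the prompt and the reply

def chunk(words, size):
    """Split a list of words into pieces that fit the available budget."""
    return [words[i:i + size] for i in range(0, len(words), size)]

document = ["word"] * 2000  # stand-in for a long document
budget = CONTEXT_WINDOW - RESERVED
pieces = chunk(document, budget)
print(len(pieces), "chunks of at most", budget, "words")
```

Every chunk then costs a separate model call, which is one way a small window quietly feeds back into the cost discussion above.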

A particularly useful architecture built from SLMs is the mixture-of-experts (MoE) model, such as Mistral's Mixtral 8x7B. As the name suggests, the model combines eight expert sub-models of 7 billion parameters each. Working together, this collection of smaller experts can often outperform larger models. And although running eight SLMs might seem to incur similar costs to an LLM, Mixtral 8x7B typically routes each token through only two of the eight experts at a time.
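The routing idea can be shown with a toy gate in the spirit of that design: score all eight experts, but run only the top two and mix their outputs by gate weight. The gate scores and the "expert" computations below are invented stand-ins, not Mixtral's actual weights or layers.

```python
import math

# Minimal sketch of top-2 mixture-of-experts routing. The gate scores
# and the "expert" computation are toy stand-ins for illustration.

NUM_EXPERTS, TOP_K = 8, 2

def softmax(xs):
    """Turn raw gate scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, token):
    """Run only the top-k experts and mix their outputs by gate weight."""
    weights = softmax(gate_scores)
    top = sorted(range(NUM_EXPERTS), key=lambda i: weights[i])[-TOP_K:]
    output, experts_run = 0.0, 0
    for i in top:
        expert_out = (i + 1) * token  # toy "expert" computation
        output += weights[i] * expert_out
        experts_run += 1
    return output, experts_run

scores = [0.1, 2.0, -1.0, 0.5, 1.8, 0.0, -0.3, 0.2]  # toy gate scores
_, active = route(scores, token=1.0)
print(active, "of", NUM_EXPERTS, "experts active")
```

Because only two experts execute per token, the per-call compute stays close to that of a 2x7B model even though the full parameter count is much larger.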

Of course, there will always be limitations for SLMs, as they are trained on much less data. Perhaps the most exciting field of AI today, with many emerging use cases, is that of multimodal models in which diverse types of data, such as text, images, and video can be processed simultaneously, moving us closer to mimicking the capabilities of a human brain. Currently, SLMs are not powerful enough for the more advanced multimodal use cases, such as video generation, as their “brains” are not big enough to handle such complex tasks. LLMs will therefore be at the forefront of AI-led innovation, but SLMs will likely deliver the most immediate business value. 

Don’t forget model principles 

Whether choosing an LLM or an SLM, the fundamentals of model selection must take precedence. An LLM trained on poor-quality data, for example, can perform worse than an SLM trained on high-quality data, which is why it's crucial to experiment with different offerings before committing to one.

Many models today are open source or openly available, allowing users to experiment and test use cases. If an SLM meets the performance requirements, there's no need to pay for an LLM. Starting small is generally a good idea, since it leaves room to upgrade to an LLM later; going the other way and downgrading from an LLM to an SLM, however, may significantly affect performance.

A key consideration for selecting either option is whether the solution will need to scale up, or if the use case is specific enough to be contained. For example, a customer service chatbot will likely not need to produce wildly different responses from one month to the next. 

To ensure the right choice is made, prioritize a robust discovery phase in which the likely evolution of the use case is fully considered. This will surface the technical and financial constraints that ultimately determine whether an SLM or an LLM is the right fit.

