Resources

Large language models

ChatGPT has pushed large language models, and artificial intelligence in general, to the forefront of media coverage and daily conversation. Many of us already turn to ChatGPT from time to time in our daily lives, whenever it suits us. Introducing LLMs in a systematic and controlled way within a company, however, is another ball game. This page provides an introduction to the impressive potential of LLMs and to how to set them up so that they do your bidding.

Understanding LLMs:
How do large language models work

ChatGPT hardly needs an introduction anymore. Large Language Models (LLMs), the power behind tools like ChatGPT, are transforming the way businesses operate by leveraging AI to understand both imagery and natural language and to generate suitable output. These models have been pre-trained on vast and varied data sets, giving them broad and diverse content knowledge. In addition, they have been designed to do a deceptively simple task: predicting the most likely next word. The combination of these two characteristics gives them the capability to handle a broad range of tasks, making them the most successful example of generative foundation models.

What are large language models

Large language models are advanced artificial intelligence (AI) systems that have revolutionized natural language processing. These models, such as OpenAI's GPT-4 (Generative Pre-trained Transformer 4) and its predecessors, are designed to understand and generate human-like text, making them powerful tools for a wide range of applications.

Most used LLMs

Despite the recent attention and excitement surrounding Large Language Models, their development is not new in the field of AI. OpenAI introduced the first GPT model in 2018, followed by GPT-2 in 2019. Even back then, ML6 was already working on leveraging LLMs for business use: for example, we fine-tuned the pre-trained English GPT-2 model for Dutch. Thanks to the exponential growth of LLM capabilities, LLMs have recently gained significant momentum, leading to the emergence of numerous new models, sometimes with weekly releases.

Some of the most widely used LLMs that are currently making waves in the field of AI include:

GPT-models

A family of LLMs introduced by OpenAI. With the recent releases of ChatGPT and GPT-4, GPT models have drawn a lot of interest from the AI community. These models have a proprietary licence: a non-open-source licence that requires users to pay a fee and imposes some usage restrictions.

LLaMA

An open-source collection of LLMs developed by Meta. LLaMA is designed to help researchers advance their work in the subfield of LLMs. It is available in multiple sizes, ranging from 7 to 65 billion parameters, and aims to democratise access to LLMs by requiring less computing power and resources. LLaMA may only be used for research.

PaLM-2

A next-generation Large Language Model developed by Google. PaLM-2 builds upon Google's previous research in AI and was announced during their annual I/O keynote in May 2023. Google also released a paper introducing Med-PaLM-2, a version of the model fine-tuned on medical data.

It is important to note that the field of LLMs is rapidly evolving, and there may be newer and even more advanced models available by the time you read this article.

Read our blogpost: ChatGPT & GPT-4: Gimmicks or game-changers?

Align LLMs and their architecture to your desired outcome

While LLMs are capable of performing a broad range of tasks, ensuring their output is aligned with the desired outcome is key.

There are three main approaches to aligning the model's output:
1. Prompting
2. Retrieval-Augmented Generation (RAG)
3. Fine-tuning, the more advanced option

Combining your LLM with the right knowledge (e.g. documents specific to your business) and with templates that define how it should act in certain cases (e.g. using a prompt management system) allows your solution to reach its full potential.

Prompting

A prompt is the text you provide to an LLM as input. Prompts can be short and concise, or extensive, including additional context and requirements you have regarding the output.

We describe two prompting techniques, both illustrated in the sketch below:

- Zero-shot Prompting: equivalent to "describing a task to a student"
- Few-shot Prompting: equivalent to "describing a task to a student and supplying some examples of similar tasks and how they were carried out"
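
To make this concrete, here is a minimal sketch of both techniques using the openai Python client (v1.x). The model name, the sentiment-classification task and the example reviews are illustrative assumptions, not part of the original text:

```python
# Minimal sketch of zero-shot vs. few-shot prompting with the openai Python
# client (v1.x). Model name, task and example reviews are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Zero-shot: describe the task and nothing more.
zero_shot = client.chat.completions.create(
    model="gpt-4",  # hypothetical choice; pick whatever model fits your case
    messages=[
        {"role": "user", "content": (
            "Classify the sentiment of this review as positive or negative: "
            "'The delivery was late and the box was damaged.'"
        )},
    ],
)

# Few-shot: the same task, preceded by worked examples of inputs and outputs.
few_shot = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Review: 'Fantastic service, will order again!'\nSentiment:"},
        {"role": "assistant", "content": "positive"},
        {"role": "user", "content": "Review: 'Support never answered my emails.'\nSentiment:"},
        {"role": "assistant", "content": "negative"},
        {"role": "user", "content": "Review: 'The delivery was late and the box was damaged.'\nSentiment:"},
    ],
)

print(zero_shot.choices[0].message.content)
print(few_shot.choices[0].message.content)
```

The only difference between the two calls is the worked examples prepended in the few-shot case; the task itself stays the same.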

Retrieval-Augmented Generation

A more advanced variant of prompting is called RAG. In short, a RAG architecture introduces a component that fetches documentation relevant to the question asked from your knowledge base. By placing this "Smart Retriever" component in front of your conversational LLM, you constrain the LLM to base its response on the information present within your documentation.

[Figure: schematic representation of a RAG (Retrieval-Augmented Generation) architecture]

The benefits of such an architecture are that (1) your LLM can explicitly refer to the sources on which it based its answer, (2) your LLM is less likely to hallucinate, because it receives the context within which it should stay, and (3) your complete solution remains maintainable, because the "Smart Retriever" component can be updated as your knowledge base grows.
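
As an illustration of this flow, below is a hedged, toy-sized sketch of a RAG pipeline in Python. The embedding model, chat model, knowledge-base sentences and the in-memory retriever are all illustrative assumptions; a production system would typically use a vector database as the "Smart Retriever":

```python
# Toy sketch of the RAG flow described above: a "Smart Retriever" fetches the
# most relevant documents, which are then injected into the LLM prompt.
# Models and knowledge base are illustrative; a real system would use a
# vector database instead of an in-memory list.
import numpy as np
from openai import OpenAI

client = OpenAI()
knowledge_base = [
    "Our support desk is reachable on weekdays from 9:00 to 17:00.",
    "Returns are accepted within 30 days of delivery.",
    "Shipping to Belgium and the Netherlands is free above 50 euros.",
]

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

doc_vectors = [embed(doc) for doc in knowledge_base]

def retrieve(question: str, k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the question embedding.
    q = embed(question)
    scores = [float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d))) for d in doc_vectors]
    top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    return [knowledge_base[i] for i in top]

question = "Can I still send my order back after two weeks?"
context = "\n".join(retrieve(question))
answer = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": (
            "Answer only with information from the context below, "
            "and cite the sentence you used.\n\nContext:\n" + context
        )},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```

Note how the system prompt both constrains the model to the retrieved context and asks it to cite its source, matching benefits (1) and (2) above.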
Note that the RAG architecture is an example of few-shot prompting: the LLM is presented with exemplary behaviour showing how it should respond to questions, and the prompt is additionally enriched with the information on which it should base its answer.


Read our blogpost: Leveraging LLMs on your domain-specific knowledge base

How Large Language Models are trained for your case

Large Language Models are capable of a wide assortment of tasks right out of the box. To get them to behave exactly as you want, however, you should consider the techniques described above. When prompting or a RAG approach does not suffice, (additional) fine-tuning can be considered. Fine-tuning alters the LLM itself by changing the weights that make up its neural network.

To explain things at an understandable level, we represent the LLM as a high school student.

Without going into technical details, we want to underline that when fine-tuning, you are changing the actual capacities of the LLM/student. You go further than providing ad-hoc examples, as in the few-shot prompting case. Instead, you present the student with input tasks and correct their output based on the "correct" outputs described by your fine-tuning dataset. One could see this as taking the student to the next level: allowing the student to specialise in the domain that your dataset implicitly describes. The analogy with the step from high school graduate to university graduate is then straightforward.

In conclusion, we note that for many business cases, the few-shot prompting approach may suffice to get your model to behave appropriately (e.g. when you are looking for a conversational LLM that replies based on information extracted from your knowledge base). In other cases, however, specific desired behaviour may require fine-tuning of your model to make it perform better.
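
For illustration, here is a minimal sketch of what a fine-tuning step could look like with OpenAI's fine-tuning API. The file name, the house-style example and the base model are hypothetical; the same idea (input tasks paired with "correct" outputs) applies to fine-tuning open-source models as well:

```python
# Hedged sketch of the fine-tuning step: present the model with input tasks
# plus the "correct" outputs, encoded as chat examples in a JSONL file.
# File name, example content and base model are illustrative assumptions.
import json
from openai import OpenAI

examples = [
    {"messages": [
        {"role": "system", "content": "You write product descriptions in our house style."},
        {"role": "user", "content": "Describe: stainless steel water bottle, 750 ml."},
        {"role": "assistant", "content": "Keep every sip cold for 24 hours with our 750 ml stainless steel bottle."},
    ]},
    # ... in practice you need at least a few dozen curated examples
]

with open("finetune_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

client = OpenAI()
training_file = client.files.create(file=open("finetune_data.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # poll this job until it completes, then use the resulting model name
```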


If you want a practical example of how a RAG architecture is built up and incorporated in use cases, you can rewatch our on-demand webinar on Generative AI for corporate use: How to get started with LLMs.


Watch the webinar: Generative AI for corporate use: How to get started with LLM

Large language model applications

Chatbots and virtual assistants

Chatbots and virtual assistants powered by LLMs are able to understand and respond to a wide range of customer or employee questions in an accessible, conversational, multilingual and efficient manner, improving your customer satisfaction or enhancing the productivity of your employees. Importantly, these virtual assistants can be enhanced by providing them with a specific source of data (e.g. your company knowledge base) to further raise the accuracy of the provided responses. Check out our blogpost for more information.

Content creation engines

Content creation engines can generate high-quality, human-like written content for a variety of use cases, including marketing content, sales materials, contract generation, media content and translations. This speeds up the content creation process and also allows for creating content variants in different languages and writing styles, tuned to client segments or employee groups with different preferences and cultures.
Different approaches to content creation include guided generation, summarisation, elongation, re-styling, translation and so on.

Natural Language Interface to Complex Tooling

LLMs can be taught to interact with existing tooling and perform tasks based purely on natural language input. Wielding the vast semantic capacity of LLMs, one can teach them how to talk to a certain tool (e.g. the API of a mechanical robot) and use the LLM to steer that tool as described in natural language.
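
A common way to implement this is function calling, sketched below with the openai Python client. The move_robot_arm tool is a hypothetical stand-in for your own API; the LLM only proposes a structured call, and your code performs the actual action:

```python
# Sketch of steering a tool through natural language via function calling.
# The "move_robot_arm" tool is a hypothetical stand-in for your own API.
import json
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "move_robot_arm",
        "description": "Move the robot arm to an (x, y, z) position in millimetres.",
        "parameters": {
            "type": "object",
            "properties": {
                "x": {"type": "number"},
                "y": {"type": "number"},
                "z": {"type": "number"},
            },
            "required": ["x", "y", "z"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Lift the arm 200 millimetres straight up from the origin."}],
    tools=tools,
)

# The model proposes a structured call (it may also answer in plain text,
# so production code should check for that case).
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
# Your own code validates the arguments and invokes the real robot API.
```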

Natural Language interrogation of data sets

Use the power of LLMs to analyse and understand extensive data sets and to share the key insights with you in natural language. Steer the LLM through focused questions to look for specific insights, or ask it to come up with the most important observations it finds in the data set. LLMs are surprisingly strong at this kind of analytical AI task, using their generative capacity to facilitate the dialogue with the user.
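
As a hedged sketch of this pattern: summarise the data set compactly (rather than sending the raw table) and let the LLM reason over the summary. The CSV file, its contents and the analyst persona are illustrative assumptions:

```python
# Sketch of natural-language interrogation of a data set: hand the LLM a
# compact textual summary of the data and ask focused questions about it.
# The CSV file and its columns are illustrative assumptions.
import pandas as pd
from openai import OpenAI

client = OpenAI()
df = pd.read_csv("monthly_sales.csv")  # hypothetical data set

# Keep the prompt small: send descriptive statistics, not the raw table.
summary = df.describe(include="all").to_string()

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a data analyst. Base every claim on the statistics provided."},
        {"role": "user", "content": (
            f"Here are summary statistics of our sales data:\n{summary}\n\n"
            "What are the three most important observations?"
        )},
    ],
)
print(resp.choices[0].message.content)
```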

And many more!

The field of Large Language Models is rapidly expanding, with new solutions and use cases emerging every other day. As a leading provider of ML services, ML6 is committed to staying at the forefront of these developments, constantly exploring new ways to apply LLMs to solve real-world business problems.

Typical LLM challenges: Functional & Technical

While building an LLM solution, it is important to consider the following typical challenges:

Speed of research

Research and development in the field of LLMs is happening at a rapid pace, and new models and functionalities are constantly being released. Keeping up with the latest developments and choosing the most suitable model for a specific use case can be overwhelming, and thus requires the necessary expertise to make the right choices.

Trustworthy AI

LLMs face ethical challenges, including the potential for bias and the perpetuation of harmful stereotypes due to biased training data, the risk of AI hallucinations generating incorrect or nonsensical content that could be maliciously exploited to spread misinformation, and a lack of knowledge of recent events or facts. For these reasons they are also receiving extra scrutiny in the draft European AI Act.

Read our GPT Guide

Interpretability and explainability

LLMs tend to be considered black boxes, because it can be difficult to interpret and understand how they generate their content. This can limit their trustworthiness and accountability, especially in sensitive domains such as healthcare, HR or finance. This limitation makes it necessary to keep a person in the loop of your solution. An ideal solution adheres to the principles of robust AI and thus behaves appropriately even in unexpected or changing environments.

High level outline of solutions with LLMs

Large Language Model solutions are available in various forms:

For some use cases, incorporating an existing LLM is all that is needed.

For other use cases, fine-tuning an existing LLM is required.

Yet in other scenarios, a custom language model trained from scratch may be the way to go.

Crucial steps in building an LLM solution include:

Large Language Model Choice

When choosing the right LLM for your use case, you need to take the following trade-offs into account: open-source models (e.g. LLaMA) vs. commercial models (e.g. GPT), self-hosting vs. API access, allowed commercial usage vs. research-only usage, as well as model performance and latency, pricing, and data ownership and protection.

Analysing Cost and TCO

To prevent unpleasant surprises, it's crucial to take initial capital expenditure and operating expenses into account when designing your solution. Given how you intend your solution to be used, some options (e.g. token-based API usage) may not be feasible.

Prompt Engineering

Prompt engineering involves defining, refining and optimising a prompt template to get the most accurate and relevant results from the language model. By providing additional information, you can steer the model's answers in the desired direction in terms of content, company style/tone and structure.
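
A minimal sketch of such a prompt template is shown below; the placeholders, brand guidelines and company name are illustrative assumptions:

```python
# Sketch of a reusable prompt template that fixes content, tone and structure.
# The placeholders and brand guidelines are illustrative assumptions.
TEMPLATE = """You are a customer support assistant for {company}.
Tone: {tone}.
Answer in at most {max_sentences} sentences, and end with a next step for the customer.

Question: {question}"""

prompt = TEMPLATE.format(
    company="ACME NV",
    tone="friendly and concise, no jargon",
    max_sentences=3,
    question="How do I reset my password?",
)
# The rendered prompt is then sent to the LLM; versioning these templates in a
# prompt management system keeps changes auditable.
print(prompt)
```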

MLOps/LLMOps/FMOps

A crucial part of working with LLM solutions in production relates to version control, fine-tuning pipelines, model swappability, performance monitoring, and more. To bring (and keep!) your model in production, you have to consider the fine art of MLOps for LLMs.
Implementing a user feedback loop is invaluable here. Imagine a generative solution that suggests content that users can correct however they see fit: we would want to know exactly what was changed and feed that back into the loop to re-train the model appropriately at the right time.

Read our in-depth expert blogposts on Foundation Model Operations (FMOps) to get a better understanding of FMOps and Foundation Models.

- Developing AI Systems in the Foundation Model Age: From MLOps to FMOps [Pt.1]
- Developing AI Systems in the Foundation Model Age: From MLOps to FMOps [Pt.2]

Data Processing

While some believe that AI is a model-centric field, it is our view that, especially with the dawn of Foundation Models, the field is becoming increasingly data-centric. Getting the right processes in place to process and store the data relevant to your use case, such as third-party or internal data, still plays a crucial role.

Prompt Analytics

When working with non-deterministic systems, it's crucial to capture exactly how your LLM solution is being used (input as well as output). For example, you can monitor the usage of your LLM solution or even explore the content provided to it, such as the top 10 most asked questions in the context of a chatbot.
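
One lightweight way to capture this is sketched below, under the assumption of a simple JSONL log file; a real deployment might use a database or a dedicated observability tool instead:

```python
# Sketch of capturing prompt analytics: log every input/output pair so usage
# can be analysed later (e.g. the top 10 most asked questions). The storage
# backend and record schema are illustrative assumptions.
import json
import time
import uuid

def log_interaction(prompt: str, completion: str, path: str = "llm_logs.jsonl") -> None:
    record = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "prompt": prompt,
        "completion": completion,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

# After each LLM call:
log_interaction("How do I reset my password?", "You can reset it via ...")

# Later, a simple aggregation over the log answers usage questions,
# e.g. counting the most frequent prompts with collections.Counter.
```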

Questions?

Get in touch with our experts

At ML6, we have extensive experience in delivering custom solutions that leverage LLMs to solve complex business problems. Our team of experts can work with you to build an LLM solution tailored to your business needs, whether it's automating customer service or improving content creation. LLMs are a game-changer for companies looking to innovate and maintain their competitive position within their industry.

Talk to Jens Bontinck

Office of the CTO

Contact Jens