August 10, 2023

How you can think about the future of LLMs

Six months ago, on an evening in December, I was sitting in my room in Brussels, staring at a computer screen. Nothing out of the ordinary for a friendly neighbourhood engineer like me.

But because of what I was witnessing on my screen, I felt like I had entered Narnia (yes, I know, I really couldn’t find a better analogy).

Letters were pouring onto my screen, forming words, forming sentences, explaining to me who had started the conflict between Israel and Palestine.

ChatGPT banged open the closet doors to a new world.

Using ChatGPT for the first time feels like a magical world opening up.

3 Troves of information

I tried to imagine what the future would look like for the family of GPT models (transformers). My quick first thoughts were extrapolations of its current form: better, faster and cheaper versions of ChatGPT.

After all, it had already been trained on all the text on the internet. So, without entirely new model architectures, not much could change.

But the fallacy is in thinking that all the text on the internet equals all the text in the world. In reality, it’s just a fraction of all the text in the world. And an even smaller fraction of all the information in the world.

I counted myself, so the numbers are correct.

Note: I believe information is the right term here, by the way. GPT-4 already takes in images next to language, and the goal is to train those models on all modalities of human-interpretable information transfer: audio, video, text, and images (smell, touch?).

So where do we find other troves of information?

At ML6, we started thinking about what LLMs would mean for organisations.

Well, every organisation generates documents, PowerPoints, emails, meeting notes, product schedules, sales pipelines… that are not part of the public domain. (Unless you’re Samsung, of course.)

Samsung doing an oopsie.

And feeding those documents to an LLM allows an organisation to train its own custom model, as sketched below.
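To make that a bit more concrete, here is a minimal fine-tuning sketch, assuming the Hugging Face transformers and datasets libraries; the base model, the hyperparameters and the internal_docs folder are all placeholders, not a recommendation:

```python
# Minimal fine-tuning sketch: adapt an open causal LLM to internal documents.
# Assumes the Hugging Face `transformers` and `datasets` libraries; the model
# name and data path are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in for whichever open model you'd actually use
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# One text file per internal document (hypothetical path).
dataset = load_dataset("text", data_files={"train": "internal_docs/*.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="custom-model", num_train_epochs=1),
    train_dataset=tokenized,
    # Causal LM objective: predict the next token, no masked-LM masking.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```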

So, if you counted correctly, we now have 2 different troves of information:

  1. The globally available information
  2. The information only available to a specific organisation

The last trove of information might make you want to crawl back into that closet that was the entrance to Narnia.

What if we train personal LLMs?

What if we train personal LLMs?

(Yes, I’m so confident that I just quoted my own sentence so you had to read it twice.)

So, we feed the LLM all the information that you personally generate but don’t want to share with anyone else. Chat messages, notes, emails… Maybe even transcriptions of all your personal conversations.

What could possibly go wrong with collecting your most intimate information? “The bread always falls on the buttered side.”

If this sounds like a nightmare to you, I tend to agree. But what if no person or company in the world had access to it except you?

Then it’s as if the LLM were a part of your own brain. Your own brain also registers and learns from lots of information, and it’s also private to you alone. Picture it as a little sidecar brain.

In this scenario, it’s worth imagining the possible upsides as well. This is where it gets interesting, but also a bit philosophical.

On the other hand, though, personal LLMs might have upsides. “The bread rises from its buttery ashes again and spreads its doughy wings like a phoenix!”

We can’t really predict what will happen if we have a “second brain”. Will it allow us to have thoughts that we can’t even think today?

Before people could write and do maths, reasoning was much harder: you had to hold every step of your argument in your head, all the time. That’s annoying. There were entire categories of thoughts that we literally couldn’t have!

Just like the invention of writing and maths enabled new kinds of thoughts, a second brain could unlock a new, superior kind of reasoning.

3 Levels of automation

When you think of AI, you think of jobs getting automated away. That’s correct. But here’s some relief: there are 3 different levels of automation.

Level 1: automating language tasks

A large language model is trained to… model language. So that makes it very good at automating language tasks like “Summarize a text”, “Extract named entities from the text”, “Change the tone of voice to a pirate” and “Write a grammatically correct piece about the conflict between Palestine and Israel”.
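As a minimal sketch, this is roughly what such a language task looks like in code, assuming the OpenAI Python client (v1-style API); the model name and the instruction are just illustrative:

```python
# Level 1 sketch: a summarization task sent to an LLM API.
# Assumes the `openai` Python package (v1+ client); model name illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

text = "..."  # whatever document you want summarized
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Summarize the user's text in three sentences."},
        {"role": "user", "content": text},
    ],
)
print(response.choices[0].message.content)
```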

But language is deeply human, and it describes and captures a lot of the interfaces and interactions of the world around us.

So, mastering language also means mastering our world.

It’s a bit like why the Tesla bots are humanoid: because our world is built for humans, they get interoperability with it for free. And that’s why a second level of automation emerges.

Tesla bots are humanoid so they fit into our world. Likewise, LLMs know how to operate our world because they speak our language.

Level 2: using tools

Just through modelling language, a model can also “think” and write its own manual for how to execute a task in the world. E.g. when you tell it to calculate the square root of 2, it will “know” from modelling language that you need a calculator to do so.

If you then give it a calculator when it asks for one, the model is effectively using tools. This is, obviously, what the ChatGPT plugin store is.

In other words, the model can use all software ever written on this planet to enhance its capabilities. And if we recall that effective tool usage made us Homo sapiens stand out from other primates, then that’s a pretty neat feature.
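To make the loop concrete, here’s a toy version: we give the model a single calculator tool, run it whenever the model asks, and feed the result back. The ASK_TOOL convention and the ask_llm function are invented purely for this sketch; real systems use structured function calling.

```python
# Toy tool-use loop: the model asks for a tool, we execute it, and we
# hand the result back. The "ASK_TOOL:" protocol is invented for this
# sketch; ask_llm stands in for any LLM call (str -> str).
import math

def calculator(expression: str) -> str:
    # A "tool": evaluate a math expression in a restricted namespace.
    return str(eval(expression, {"__builtins__": {}}, {"sqrt": math.sqrt}))

TOOLS = {"calculator": calculator}

def run_with_tools(ask_llm, question: str) -> str:
    prompt = question
    while True:
        reply = ask_llm(prompt)
        if reply.startswith("ASK_TOOL:"):  # e.g. "ASK_TOOL: calculator sqrt(2)"
            _, tool_name, arg = reply.split(" ", 2)
            prompt = f"{question}\nTool result: {TOOLS[tool_name](arg)}"
        else:
            return reply

# Fake LLM for demonstration: first asks for the calculator, then answers.
replies = iter(["ASK_TOOL: calculator sqrt(2)",
                "The square root of 2 is about 1.414."])
print(run_with_tools(lambda _: next(replies), "What is the square root of 2?"))
```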

#TBT: remember when the flint stone was the only plugin we had?

Level 3: writing code

The final level emerges because LLMs don’t just learn to model human-to-human language, but also human-to-machine language: code.

A big part of modern-day automation is giving instructions to machines in the form of code. Those machines then execute those instructions for us, without our intervention. Aka automation.

But because language models can also talk to those machines through their code, they’re a first step towards automating automation.

Automating the automation is when it gets interesting.

What you could see happening is one language model writing code to build a certain software package that another model can start using.
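Sketched naively, that chain could look like the snippet below: “model A” writes a small module, which is then loaded so another model (here just plain code) can call it as a tool. The ask_llm function is a placeholder with a canned answer, and executing generated code like this without a sandbox is, of course, asking for trouble.

```python
# Sketch of "automating automation": model A writes a module, and another
# process (or model) loads and uses it. ask_llm is a placeholder returning
# a canned answer; executing generated code like this is unsafe without
# sandboxing.
import importlib.util

def ask_llm(prompt: str) -> str:
    # Placeholder for a real LLM call; canned response for the sketch.
    return "def slugify(title):\n    return title.lower().replace(' ', '-')\n"

# 1. Model A writes the software package.
code = ask_llm("Write a Python module with a function slugify(title) that "
               "turns a blog title into a URL slug. Reply with code only.")
with open("generated_tools.py", "w") as f:
    f.write(code)

# 2. Load the generated module.
spec = importlib.util.spec_from_file_location("generated_tools", "generated_tools.py")
tools = importlib.util.module_from_spec(spec)
spec.loader.exec_module(tools)

# 3. A second model (or plain code) can now use the freshly written tool.
print(tools.slugify("How you can think about the future of LLMs"))
# -> "how-you-can-think-about-the-future-of-llms"
```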

Get the palm leaves to welcome our LLM messiahs?

Obviously, all three levels of automation have flaws. If not, we would be living in some sort of post-human society. (Unless the LLMs are so good that they’re fooling us while hiding somewhere and living off the fat of the land.)

Are LLMs chilling somewhere while fooling us that they’re dumb? Unlikely.

Recognizing those flaws helps us decide how to leverage LLMs for automation.

The flaws all testify to a fundamental problem in statistical modelling: correlation does not equal causation. The model captures the correlations between letters, words and sentences frighteningly well, but it does not understand causation. So it can’t really reason, and it does make factual and logical mistakes.

But apparently, you can get really far with correlations. And who knows how much further we can get with it?

Two years ago, my colleagues at ML6 fine-tuned an ancestor of GPT-4 and created a Dutch GPT-2 model. Back then, it was hard to imagine we would get this far with essentially the same technique.

Conclusion

We broke down LLMs along two axes:

  1. 3 troves of information
  2. 3 levels of automation

If you combine items from both axes, you can arrive at plausible predictions for the future. E.g. what we (will) see is:

personal LLMs using level 2 automation to be your assistant.

Such an assistant will use software tools to do what you ask, and more.

What I didn’t touch on is the massive asterisk of how immensely vulnerable a lot of these future applications are. It’s literally a playground for hackers.

A massive asterisk about security vulnerabilities in LLM applications.

In the next blog post, Jacob Cassiman will unwrap some of those vulnerabilities. That’s when the real fun begins. (This blog post is also available on Medium.)