October 10, 2023
LLAMA 2: So good they named it twice
A quick overview of LLAMA 2 and a short recap of what led up to it
So Meta did a thing. LLAMA 2 is now out!
But what the hell is a LLAMA, how did we get here and what do you (yes, you) need to know if you want to use it?
What the hell is a LLAMA?
LLAMA is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the Pre-Columbian era.
LLAMA is also Meta’s Large Language Model that got an update recently — which is what we’re talking about here.
(llama is apparently also a bowling term for four strikes in a row — which may win you a quiz one day)
How did we get here?
How LLAMA 2 came to be is quite the story. So buckle up because we’re in for a ride.
Let’s cast our minds back a few months.
- Google taunted the demo gods and did a Tesla during Bard’s release.
- Elon Musk pulled the old switcheroo — pushing for a halt on AI before starting his own AI company. Classic Elon. Oldest trick in the book.
- And Samsung decided to open-source their internal documents.
Good times.
BUT ALSO: Meta entered the chat.
Mark Zuckerberg laid out his carefully constructed vision that open-source LLMs are the future and boldly released LLAMA as a completely open model — no strings attached!
Which would have been a bold (or should I say re-Mark-able) move if not for one major flaw: that is not at all what actually happened.
So let’s recap what did happen:
February: Meta releases LLAMA
Mark and his funky bunch announce that the LLM hype train will not leave the station without them and release their very own model: LLAMA.
Or should I say LLAMA-RK. Sorry.
Unlike the usual suspects (aka Google, AWS, Microsoft and OpenAI), they decide to not keep the model under lock and key 🔐.
Instead opting to share the model with a restricted group of researchers under a non-commercial license.
In this way they want to “maintain integrity and prevent misuse”. Which they will — unless someone leaks LLAMA but that would never happen.
March: Meta does an oopsie
Pretty much immediately, someone leaks LLAMA. As in: the entire thing. So now, anyone can get their hands on the model and basically do whatever they want with it — completely bypassing the restrictions.
Meta underlines that — despite the leak — LLAMA’s license still applies.
You still can’t use it commercially and unauthorized people still can’t use it at all.
March: Unauthorized people use LLAMA anyway
The AI community reacted to Meta’s warning with a collective “damn that’s crazy” and proceeded to ignore it completely.
At this point, the model is in the hands of the masses. And those hands are notoriously sticky.
Meta realizes that they can’t sue literally everyone, basically accepts the L and lets it slide.
Side note: they are much less lenient about their ban on commercial use, and it’s definitely feasible for them to sue businesses, so don’t get any ideas.
March — July: People go ham on LLAMA
It didn’t take long for the AI community to do what it does best: optimize the hell out of something.
And boy, did they optimize LLAMA.
In a matter of weeks, people managed to get LLAMA running on a phone. People were training LLAMA variations such as Vicuna that rival Google’s Bard, spending just a few hundred bucks.
July: Things have gone full circle
The leak that originally bit Meta firmly in the derrière (pardon my French) ironically became their biggest advantage.
They had initially planned to give LLAMA to a select group of researchers who could tweak & optimize it.
Instead, the model got leaked to the general public. Which is obviously a bad look for a multi-billion-dollar company, but it also meant that everyone was optimizing LLAMA (not just a few researchers).
This accelerated progress beyond their most optimistic estimates.
Was this Meta’s master plan all along?
We’ll leave that thought to our colleagues in the conspiracy theory business.
July: Meta doubles down and releases LLAMA 2
Which brings us to today. Meta just released a big update to their LLAMA model. And it is open-source. On purpose this time!
There are a few strings attached though which we’ll circle back to (*cliffhanger*).
This more or less confirms that they are doubling down on the open-source trend they inadvertently kickstarted.
Gotta love how things play out sometimes.
Let’s take a closer look at what exactly they released, what those infamous few strings are and what this all means for you.
What you need to know about LLAMA 2
Small disclaimer: all technical details are nicely & succinctly put in Meta’s announcement so regurgitating the same information doesn’t add much.
Instead, we’ll skip over a lot of the fine print and focus solely on the highlights that you (yes, you) should know.
It’s good but we’re just getting started
LLAMA 2 comes in three sizes (7B, 13B and 70B).
- The smaller varieties achieve about the same performance as the current open-source state-of-the-art (albeit with a much smaller model).
- The bigger varieties are significantly better than the current open-source state-of-the-art.
Obligatory side note: evaluating LLMs is notoriously difficult. Currently, evaluations are based on GPT-4 acting as a judge, (subjective) human evaluation and scores on standardized tests (think SATs and the like).
So any statements about accuracy we make here are by no means very accurate (ironically) but good enough to get a general impression.
That’s all very nice but this is not the end of the road. Not even close.
Just like when LLAMA got leaked, LLAMA 2 is now also in the hands of the many. And since it’s not even a legal grey area this time, those hands will be firmly glued to their collective keyboards, working their magic and optimizing the hell out of it.
In fact, we are already seeing the first signs of the open-source magic, e.g.:
- Baby Llama: LLAMA 2 implementation fully in C
- Stable Beluga 2: Instruction fine-tuned LLAMA 2 by stability.ai
- …
Expect lots more open-source progress based on LLAMA 2.
It’s open-source but also not really?
This is where the infamous few strings attached come in.
Here’s the deal:
Is it open-source? Yes.
Can you use it commercially? Yes.
BUT there are some caveats to be aware of:
- If you have more than 700 million monthly active users, you need a license from Meta. Which probably isn’t an issue for you but it is for the cloud providers.
- You can’t use LLAMA 2 (or its output) to improve other LLMs. Again, this shouldn’t be much of an issue for you but it is for Meta’s competitors.
- You can’t use LLAMA 2 in an illegal, unethical or irresponsible way. By Meta’s definition of unethical and irresponsible, that is, as laid out in their acceptable use policy. This is very reminiscent of the OpenRAIL initiative that, for example, Stable Diffusion adopted in its license.
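If you like your caveats in checklist form, the three points above can be sketched as a tiny pre-flight check. This is only a rough illustration of the list in this post, not a reading of the actual license text: the `UseCase` fields and the `license_flags` helper are hypothetical names made up for this example, and the last two fields are your own judgment calls.

```python
from dataclasses import dataclass

# Threshold as quoted in this post: more than 700 million monthly active users.
MAU_LIMIT = 700_000_000


@dataclass
class UseCase:
    """Hypothetical description of how you plan to use LLAMA 2."""
    monthly_active_users: int
    improves_other_llm: bool    # using LLAMA 2 or its output to improve another LLM
    violates_use_policy: bool   # your own reading of Meta's acceptable use policy


def license_flags(use: UseCase) -> list[str]:
    """Return the caveats from the list above that this use case trips."""
    flags = []
    if use.monthly_active_users > MAU_LIMIT:
        flags.append("needs a separate license from Meta (over 700M MAU)")
    if use.improves_other_llm:
        flags.append("cannot use LLAMA 2 or its output to improve other LLMs")
    if use.violates_use_policy:
        flags.append("conflicts with Meta's acceptable use policy")
    return flags


if __name__ == "__main__":
    # A small startup with a chatbot: no flags expected.
    print(license_flags(UseCase(10_000, False, False)))
```

An empty list means none of the three caveats from this post apply. It does not mean a lawyer would agree.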
We highly recommend taking a look at Meta’s acceptable use policy. Although overall, it is largely what you would expect.
If you want to read the full LLAMA 2 license, by all means.
Side note: the license refers to their acceptable use policy via URL.
So if they made another oopsie and forgot something important, they can still go back and add it without needing to touch the actual license itself.
If you want to use the model and play it extra safe, we’d suggest making a dated PDF copy of the policy.
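For the extra-safe crowd, here is a minimal sketch of what "dated copy" could look like in practice. It assumes you already have the policy as bytes (say, a PDF you printed from your browser). The `save_dated_copy` helper, the default file stem and the `.sha256` sidecar file are all inventions for this example, not anything Meta provides.

```python
import hashlib
from datetime import date
from pathlib import Path


def save_dated_copy(content: bytes, out_dir: str,
                    stem: str = "llama2-use-policy", ext: str = "pdf") -> Path:
    """Write content to <out_dir>/<stem>-YYYY-MM-DD.<ext> and store its SHA-256 next to it."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)

    # Put today's date in the file name so the snapshot is self-describing.
    target = folder / f"{stem}-{date.today().isoformat()}.{ext}"
    target.write_bytes(content)

    # The hash lets you show later that the copy was not edited after the fact.
    digest = hashlib.sha256(content).hexdigest()
    target.with_name(target.name + ".sha256").write_text(digest + "\n")
    return target
```

Calling `save_dated_copy(Path("policy.pdf").read_bytes(), "snapshots")` would drop a dated PDF and its hash into a `snapshots` folder. Where the bytes come from is up to you.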
Solid business by Meta
All in all, it’s pretty smart business from Meta.
With their LLAMA 2 license, they basically combine the advantages of open-source with the advantages of closed-source:
- On one side, the general public can ethically use LLAMA 2 however they please. So Meta will benefit from all the open-source progress that will be made.
- On the flip side, other model providers essentially can’t use LLAMA 2 at all which protects Meta’s competitive edge.
And the major cloud providers can’t resell LLAMA 2 without a commercial agreement with Meta which is exactly how they want to monetize it.
Don’t believe me? Mark Zuckerberg said this during Meta’s quarterly earnings call:
“If you’re someone like Microsoft, Amazon or Google, and you’re going to basically be reselling the services, that’s something that we think we should get some portion of the revenue for”
~ Mark Zuckerberg, July 26, 2023
So basically everyone can ethically use it except for their competitors and the parties they can make the big bucks from 💸. Pretty clever.
So clever in fact that we could see these half-open-source / pseudo-open-source licenses becoming a wider industry standard (*foreshadowing*).
If this turns out to be spot on, you heard it here first.
If his MMA career doesn’t work out, this Mark Zuckerberg guy should look into starting a company.
Meta and Microsoft
Microsoft is officially Meta’s preferred partner for LLAMA 2.
However, the exact scope of being a LLAMA 2 preferred partner is not crystal clear. For example, the model is also available on AWS and Google Cloud, so we know that it’s not an exclusivity agreement.
Either way, it is interesting to see Microsoft doubling down on their partnership with Meta after their partnership with OpenAI.
- Does Microsoft see Meta’s open-source LLM strategy as the long-term play and OpenAI’s closed-source GPTs as more of a short-term bet to gain a first-mover advantage in the LLM space?
- Or are they simply opportunistically betting on all the most promising horses in the race to generative AI leadership? And one of these horses just happens to be a llama.
- Maybe they want to position OpenAI’s and Meta’s offerings differently and believe that there are synergies to banking on both.
Shameless speculation aside, the only real answer is that we don’t know and neither do you. Time will tell though.
Conclusion
Meta stumbled its way onto the open-source LLM podium and decided to stick around.
They released LLAMA 2 which is looking very decent but the real value will come from the general public tinkering around with it.
It’s pretty much open-source. As in: you can do whatever you want with it (both commercially and non-commercially), except:
- if your name is Google, Amazon or Microsoft
- if you’re planning to do something illegal, unethical or irresponsible. By Meta’s definition of unethical and irresponsible, that is.