As Meta announces its largest and most capable large language model yet, the Llama 3.1 405B, Mark Zuckerberg says it’s important that the technology isn’t controlled by a handful of companies.
Opting for a transparent approach, the CEO wrote a public letter on Tuesday (July 23) explaining why the LLM is open source and why this is ‘good for the world.’
“I believe that open source is necessary for a positive AI future,” writes the billionaire.
Rowan Cheung (@rowancheung), posting on July 24, 2024, wrote:
“Mark is an incredible CEO. What particularly stood out to me from our conversation was how deeply he thinks about open source and how it’ll benefit Meta (and the world) in the long term.
“A few of my favorite moments/quotes:
“On the future of AI agents: ‘I think we’re going to… https://t.co/rhuOVcapbW’”
“Open source will ensure that more people around the world have access to the benefits and opportunities of AI, that power isn’t concentrated in the hands of a small number of companies, and that the technology can be deployed more evenly and safely across society.”
The new Llama 3.1 405B is the company’s first frontier-level open source AI model. Unlike previous releases, this one comes with a greater focus on building a broader ecosystem around the model.
The team has been working with a range of companies to grow this ecosystem, citing Amazon, Databricks, and NVIDIA as all having “full suites of services to support developers fine-tuning and distilling their own models.”
Zuckerberg explains that this approach was taken to “ensure that we always have access to the best technology, and that we’re not locked into a competitor’s closed ecosystem where they can restrict what we build.
“One of my formative experiences has been building our services constrained by what Apple will let us build on their platforms.
“Between the way they tax developers, the arbitrary rules they apply, and all the product innovations they block from shipping, Meta and many other companies would be freed up to build much better services for people if we could build the best versions of our products and competitors were not able to constrain what we could build.”
What is the Llama 3.1 405B model by Mark Zuckerberg’s Meta?
Meta officially announced Llama 3.1 on Tuesday (July 23), describing it as having “state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation.”
The LLM was trained on over 15 trillion tokens using more than 16,000 H100 GPUs, making it the first Llama model trained at this scale.
Meta says it has spoken with the community and recognizes there’s far more to generative AI development than just prompting models.
As such, it believes developers can do the following with the ecosystem: “real-time and batch inference, supervised fine-tuning, evaluation of your model for a specific application, continual pre-training, retrieval-augmented generation, function calling, and synthetic data generation.”
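To make that list more concrete, below is a minimal sketch of the most basic item on it, real-time inference against a Llama 3.1 Instruct model, using the Hugging Face transformers library. The model ID, gated-access step, and hardware choices are illustrative assumptions rather than anything from Meta’s announcement, and the 405B flagship itself would need a multi-GPU serving setup rather than a single machine.

```python
# Minimal sketch: prompting a Llama 3.1 Instruct model for real-time inference
# via Hugging Face transformers. The model id and hardware assumptions are
# illustrative; access to Meta's Llama weights is gated and must be requested
# separately on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # smaller sibling; the 405B needs multi-GPU serving

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single large GPU
    device_map="auto",           # let accelerate decide where the weights live
)

# Build a chat prompt with the model's own chat template.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "In two sentences, why do open-weight models matter?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a short completion (the "real-time inference" item in Meta's list).
output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The other items on Meta’s list, such as supervised fine-tuning, distillation, and retrieval-augmented generation, build on the same openly released weights, which is what partners like Amazon, Databricks, and NVIDIA are wrapping in managed services.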
Featured Image: Via Anthony Quintano on Flickr