Forget ChatGPT, why Llama and open source AI win 2023

Could a furry camelid take the 2023 crown for the biggest AI story of the year? If we’re talking about Llama, Meta’s large language model that took the AI research world by storm in February — followed by the commercial Llama 2 in July and Code Llama in August — I would argue that the answer is… (writer takes a moment to duck) yes.

I can almost see readers getting ready to pounce. “What? Come on — of course ChatGPT was the biggest AI story of 2023!” I can hear the crowds yelling. “OpenAI’s ChatGPT, which launched on November 30, 2022 and reached 100 million users by February? ChatGPT, which brought generative AI into popular culture? It’s the bigger story by far!”

Hang on — hear me out. In the humble opinion of this AI reporter, ChatGPT was and is, naturally, a generative AI game-changer. It was, as Forrester analyst Rowan Curran told me, “the spark that set off the fire around generative AI.”

But starting in February of this year, when Meta released Llama, the first major free ‘open source’ LLM (Llama and Llama 2 are not fully open by traditional license definitions), open source AI began to have a moment, and a red-hot debate, that has not ebbed all year long. That has held true even as other Big Tech firms, LLM companies and policy makers have questioned the safety and security of AI models with open access to source code and model weights, and as the high costs of compute have led to struggles across the ecosystem.

According to Meta, the open source AI community has fine-tuned and released over 7,000 Llama derivatives on the Hugging Face platform since the model’s release, including a veritable animal farm of popular offspring such as Koala, Vicuna, Alpaca, Dolly and RedPajama. There are many other open source models, including those from Mistral, Hugging Face, and Falcon, but Llama was the first to have the data and resources of a Big Tech company like Meta behind it.

You could consider ChatGPT the equivalent of Barbie, 2023’s biggest blockbuster movie. But Llama and its open source AI cohort are more like the Marvel Universe, with its endless spinoffs and offshoots that have the cumulative power to offer the biggest long-term impact on the AI landscape.

This will lead to “more real-world, impactful GenAI applications and cementing the open-source foundations of GenAI applications going forward,” Kjell Carlsson, head of data science strategy and evangelism at Domino Data Lab, told me.

Open source AI will have the biggest long-term impact

The era of closed, proprietary models began, in a sense, with ChatGPT. OpenAI launched in 2015 as a more open-sourced, open-research company. But in 2023, OpenAI co-founder and chief scientist Ilya Sutskever told The Verge it was a mistake to share their research, citing competitive and safety concerns.

Meta’s chief AI scientist Yann LeCun, on the other hand, pushed for Llama 2 to be released with a commercial license along with the model weights. “I advocated for this internally,” he said at the AI Native conference in September. “I thought it was inevitable, because large language models are going to become a basic infrastructure that everybody is going to use, it has to be open.”

Carlsson, to be fair, considers my ChatGPT vs. Llama argument to be an apples-to-oranges comparison. Llama 2 is the game-changing model, he explained, because it is open-source, licensed for commercial use, can be fine-tuned, can be run on premises, and is small enough to be operationalized at scale.

But ChatGPT, he said, is “the game-changing experience that brought the power of LLMs to the public consciousness and, most importantly, business leadership.” Yet as models, he maintained, GPT-3.5 and GPT-4, which power ChatGPT, suffer “because they should not, except in exceptional circumstances, be used for anything beyond a PoC [proof of concept].”

Matt Shumer, CEO of Otherside AI, which developed Hyperwrite, pointed out that Llama likely would not have had the reception or influence it did if ChatGPT didn’t happen in the first place. But he agreed that Llama’s effects will be felt for years: “There are likely hundreds of companies that have gotten started over the last year or so that would not have been possible without Llama and everything that came after,” he said.

And Sridhar Ramaswamy, the former Neeva CEO who became an SVP at data cloud company Snowflake after it acquired Neeva, said “Llama 2 is 100% a game-changer; it is the first truly capable open source AI model.” ChatGPT had appeared to signal an LLM repeat of what happened with cloud, he said: “There would be three companies with capable models, and if you want to do anything you would have to pay them.”

Instead, Meta released Llama.

Early Llama leak led to a flurry of open source LLMs

Launched in February, the first Llama model stood out because it came in several sizes, from 7 billion to 65 billion parameters. Llama’s developers reported that the 13B-parameter model outperformed the much larger GPT-3 (175B parameters) on most NLP benchmarks, and that the largest model was competitive with state-of-the-art models such as PaLM and Chinchilla. Meta made Llama’s model weights available to academics and researchers on a case-by-case basis, including Stanford for its Alpaca project.

But the Llama weights were subsequently leaked on 4chan. This allowed developers around the world to fully access a GPT-level LLM for the first time — leading to a flurry of new derivatives. Then in July, Meta released Llama 2 free to companies for commercial use, and Microsoft made Llama 2 available on its Azure cloud-computing service.

Those efforts came at a key moment when Congress began to talk about regulating artificial intelligence — in June, two U.S. Senators sent a letter to Meta CEO Mark Zuckerberg that questioned the Llama leak, saying they were concerned about the “potential for its misuse in spam, fraud, malware, privacy violations, harassment, and other wrongdoing and harms.”

But Meta consistently doubled down on its commitment to open-source AI: In an internal all-hands meeting in June, for example, Zuckerberg said Meta was building generative AI into all of its products and reaffirmed the company’s commitment to an “open science-based approach” to AI research.

More than any other Big Tech company, Meta has long been a champion of open research — including, notably, creating an open source ecosystem around the PyTorch framework. And as 2023 draws to a close, Meta will celebrate the 10th anniversary of FAIR (Fundamental AI Research), which was created “to advance the state of the art of AI through open research for the benefit of all.” Ten years ago, on December 9, 2013, Facebook announced that NYU Professor Yann LeCun would lead FAIR.

In an in-person interview with VentureBeat at Meta’s New York office, Joelle Pineau, VP of AI research at Meta, recalled that she joined Meta in 2017 because of FAIR’s commitment to open research and transparency.

“The reason I came there without interviewing anywhere else is because of the commitment to open science,” she said. “It’s the reason why many of our researchers are here. It’s part of the DNA of the organization.”

But the reason to do open research has changed, she added. “I would say in 2017, the main motivation was about the quality of the research and setting the bar higher,” she said. “What is completely new in the last year is how much this is a motor for the productivity of the whole ecosystem, the number of startups who come up and are just so glad that they have an alternative model.”

But, she added, every Meta release is a one-off. “We’re not committing to releasing everything [open] all the time, under any condition,” she said. “Every release is analyzed in terms of the advantages and the risks.”

Reflecting on Llama: ‘a bunch of small things done really well’

Angela Fan, a Meta FAIR research scientist who worked on the original Llama, said she also worked on Llama 2 and on the efforts to convert these models into the user-facing product capabilities that Meta showed off at its Connect developer conference last month (some of which have caused controversy, like its newly launched stickers and characters).

“I think the biggest reflection I have is even though the technology is still kind of nascent and almost squishy across the industry, it’s at a point where we can build some really interesting stuff and we’re able to do this kind of integration across all our apps in a really consistent way,” she told VentureBeat in an interview at Connect.

She added that the company looks for feedback from its developer community, as well as the ecosystem of startups using Llama for a variety of different applications. “We want to know, what do people think about Llama 2? What should we put into Llama 3?” she said.

But Llama’s secret sauce all along, she said, has been “a bunch of small things done really well and right over a longer period of time.” There were so many different components, she recalled — like getting the original data set right, figuring out the number of parameters and pre-training it on the right learning rate schedule.

“There were many small experiments that we learned from,” she said, adding that for someone who doesn’t understand AI research, it can seem “like a mad scientist sitting somewhere. But it’s truly just a lot of hard work.”

The push to protect open source AI

A big open source ecosystem with a broadly useful technology has been “our thesis all along,” said Vipul Ved Prakash, co-founder of Together, a startup known for creating the RedPajama dataset in April, which replicated the Llama dataset, and releasing a full-stack platform and cloud service for developers at startups and enterprises to build open-source AI — including by building on Llama 2.

Prakash, not surprisingly, agreed that he considers Llama and open source AI to be the game-changer of 2023 — it is a story, he explained, of developing viable, high quality models, with a network of companies and organizations building on them.

“The cost is distributed across this network and then when you’re providing fine tuning or inference, you don’t have to amortize the cost of the model builds,” he said.

But at the moment, open source AI proponents feel the need to push to protect access to these LLMs as regulators circle. The overarching theme of this week’s UK AI Safety Summit was mitigating the risk of advanced AI systems wiping out humanity if they fall into the hands of bad actors, presumably those with access to open source AI.

But a vocal group from the open source AI community, led by LeCun and Google Brain co-founder Andrew Ng, signed a statement published by Mozilla saying that open AI is “an antidote, not a poison.”

Sriram Krishnan, a general partner at Andreessen Horowitz, tweeted in support of Llama and open source AI:

“Realizing how important it was for @ylecun and team to get llama2 out of the door. A) they may have never had a chance to later legally B) we would have never seen what is possible with open source ( see all the work downstream of llama2) and thought of LLMs as the birthright of 2-4 companies.”

The Llama vs. ChatGPT debate continues

The debate over Llama vs. ChatGPT — as well as the debate over open source vs. closed source generally — will surely continue. When I reached out to a variety of experts to get their thoughts, it was ChatGPT for the win.

“Hands down, ChatGPT,” wrote Nikolaos Vasiloglou, VP of ML research at RelationalAI. “The reason it is a game-changer is not just its AI capabilities, but also the engineering that is behind it and its unbeatable operational costs to run it.”

And John Lyotier, CEO of TravelAI, wrote: “Without a doubt the clear winner would be ChatGPT. It has become AI in the minds of the public. People who would never have considered themselves technologists are suddenly using it and they are introducing their friends and families to AI via ChatGPT. It has become the ‘every-day person’s AI.’”

Then there was Ben James, CEO of Atlas, a 3D generative AI platform, who pointed out that Llama has reignited research in a way ChatGPT did not, and this will bring about stronger, longer-term impact.

“ChatGPT was the clear game changer of 2023, but Llama will be the game-changer of the future,” he said.

Ultimately, perhaps what I’m trying to say (that Llama and open source AI win 2023 because of how they will impact 2024 and beyond) is similar to the way Forrester’s Curran puts it: “The zeitgeist generative AI created in 2023 would not have happened without something like ChatGPT, and the sheer number of humans who have now had the chance to interact with and experience these advanced tools, compared to other cutting edge technologies in history, is staggering,” he said.

But, he added, open source models, particularly those like Llama 2 that have seen significant uptake from enterprise developers, are providing a lot of the ongoing fuel for the on-the-ground development and advancement of the space.

In the long term, Curran said, there will be a place for both proprietary and open source models, but without the open source community the generative AI space would be a much less advanced, very niche market, rather than a technology which has the potential for massive impacts across many aspects of work and life.

“The open source community has been and will be where many of the significant long term impacts come from, and the open source community is essential for GenAI’s success,” he said.
