Snowflake releases a flagship generative AI model of its own

Kyle Wiggers

Updated April 24, 2024 at 2:20 PM

All-around, highly generalizable generative AI models were the name of the game once, and they arguably still are. But increasingly, as cloud vendors large and small join the generative AI fray, we're seeing a new crop of models focused on the deepest-pocketed potential customers: the enterprise.

Case in point: Snowflake, the cloud computing company, today unveiled Arctic LLM, a generative AI model that's described as "enterprise-grade." Available under an Apache 2.0 license, Arctic LLM is optimized for "enterprise workloads," including generating database code, Snowflake says, and is free for research and commercial use.

"I think this is going to be the foundation that's going to let us -- Snowflake -- and our customers build enterprise-grade products and actually begin to realize the promise and value of AI," CEO Sridhar Ramaswamy said in a press briefing. "You should think of this very much as our first, but big, step in the world of generative AI, with lots more to come."

An enterprise model

My colleague Devin Coldewey recently wrote about how there's no end in sight to the onslaught of generative AI models. I recommend you read his piece, but the gist is: Models are an easy way for vendors to drum up excitement for their R&D, and they also serve as a funnel to their product ecosystems (model hosting, fine-tuning and so on).

Arctic LLM is no different. The flagship of a family of generative AI models that Snowflake calls Arctic, it took around three months, 1,000 GPUs and $2 million to train, and it arrives on the heels of Databricks' DBRX, a generative AI model also marketed as optimized for the enterprise space.

Snowflake draws a direct comparison between Arctic LLM and DBRX in its press materials, saying Arctic LLM outperforms DBRX on the two tasks of coding (Snowflake didn't specify which programming languages) and SQL generation. The company said Arctic LLM is also better at those tasks than Meta's Llama 2 70B (but not the more recent Llama 3 70B) and Mistral's Mixtral-8x7B.

Snowflake also claims that Arctic LLM achieves "leading performance" on a popular general language understanding benchmark, MMLU. I'll note, though, that while MMLU purports to evaluate generative models' ability to reason through logic problems, it includes tests that can be solved through rote memorization, so take that bullet point with a grain of salt.

"Arctic LLM addresses specific needs within the enterprise sector," Baris Gultekin, head of AI at Snowflake, told TechCrunch in an interview, "diverging from generic AI applications like composing poetry to focus on enterprise-oriented challenges, such as developing SQL co-pilots and high-quality chatbots."

Arctic LLM, like DBRX and Google's top-performing generative model of the moment, Gemini 1.5 Pro, uses a mixture of experts (MoE) architecture. MoE architectures basically break down data processing tasks into subtasks and then delegate them to smaller, specialized "expert" models. So while Arctic LLM contains 480 billion parameters, it activates only 17 billion at a time -- enough to drive the 128 separate expert models. (Parameters essentially define the skill of an AI model on a problem, like analyzing and generating text.)
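To make the "only some experts run" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration, not Snowflake's implementation: the choice of two active experts per input, the random router scores and the toy "experts" that just scale a number are all assumptions made for the example. Only the 128-expert count and the 17 billion and 480 billion parameter figures come from the article.

```python
import math
import random

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_scores, k=2):
    """Combine the outputs of only the selected experts; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Toy setup: 128 hypothetical "experts", each of which just scales its input.
NUM_EXPERTS = 128
experts = [lambda x, s=i: x * (s + 1) for i in range(NUM_EXPERTS)]

# In a real model a learned router computes these scores from the input;
# here they are random numbers (an assumption for illustration).
rng = random.Random(0)
router_scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

chosen = route_top_k(router_scores, k=2)
output = moe_layer(1.0, experts, router_scores, k=2)

# Share of parameters active per input, using the figures quoted above.
active_fraction = 17 / 480  # roughly 3.5%
```

The point of the design is visible in `moe_layer`: the cost of a forward pass depends on the handful of experts that were picked, not on how many experts exist in total.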

Snowflake claims that this efficient design enabled it to train Arctic LLM on open public web data sets (including RefinedWeb, C4, RedPajama and StarCoder) at "roughly one-eighth the cost of similar models."

Running everywhere

Snowflake is providing resources like coding templates and a list of training sources alongside Arctic LLM to guide users through the process of getting the model up and running and fine-tuning it for particular use cases. But, recognizing that those are likely to be costly and complex undertakings for most developers (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake's also pledging to make Arctic LLM available across a range of hosts, including Hugging Face, Microsoft Azure, Together AI's model-hosting service and enterprise generative AI platform Lamini.

Here's the rub, though: Arctic LLM will be available first on Cortex, Snowflake's platform for building AI- and machine learning-powered apps and services. The company's unsurprisingly pitching it as the preferred way to run Arctic LLM with "security," "governance" and scalability.

"Our dream here is, within a year, to have an API that our customers can use so that business users can directly talk to data," Ramaswamy said. "It would've been easy for us to say, 'Oh, we'll just wait for some open source model and we'll use it.' Instead, we're making a foundational investment because we think [it's] going to unlock more value for our customers."

So I'm left wondering: Who's Arctic LLM really for besides Snowflake customers?

In a landscape full of "open" generative models that can be fine-tuned for practically any purpose, Arctic LLM doesn't stand out in any obvious way. Its architecture might bring efficiency gains over some of the other options out there. But I'm not convinced that they'll be dramatic enough to sway enterprises away from the countless other well-known and -supported, business-friendly generative models (e.g. GPT-4).

There's also a point in Arctic LLM's disfavor to consider: its relatively small context window.

In generative AI, the context window refers to the input data (e.g. text) that a model considers before generating output (e.g. more text). Models with small context windows are prone to forgetting the content of even very recent conversations, while models with larger context windows typically avoid this pitfall.

Arctic LLM's context window is between ~8,000 and ~24,000 words, depending on the fine-tuning method -- far below that of models like Anthropic's Claude 3 Opus and Google's Gemini 1.5 Pro.
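A rough sketch of why a small window leads to "forgetting": an application has to drop whatever doesn't fit before the model ever sees it. The helper below, `fit_to_context`, is hypothetical and counts words for simplicity; real models measure their windows in tokens, and real applications may trim or summarize differently.

```python
def fit_to_context(messages, max_words):
    """Keep the most recent messages whose combined word count fits the window."""
    kept = []
    used = 0
    # Walk backward from the newest message and stop once the window is full.
    for message in reversed(messages):
        n = len(message.split())
        if used + n > max_words:
            break
        kept.append(message)
        used += n
    return list(reversed(kept))

conversation = [
    "My order number is 1234",      # 5 words
    "It was placed last week",      # 5 words
    "Can you tell me where it is",  # 7 words
]

# A 12-word window has no room for the oldest message, so the order number is lost.
small = fit_to_context(conversation, max_words=12)

# A larger window keeps the whole conversation.
large = fit_to_context(conversation, max_words=100)
```

With the small window, the model is asked where the order is without ever seeing which order it was.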

Snowflake doesn't mention it in the marketing, but Arctic LLM almost certainly suffers from the same limitations and shortcomings as other generative AI models -- namely, hallucinations (i.e. confidently answering requests incorrectly). That's because Arctic LLM, along with every other generative AI model in existence, is a statistical probability machine -- one that, again, has a small context window. It guesses, based on vast amounts of examples, which data makes the most "sense" to place where (e.g. the word "go" before "to the market" in the sentence "I go to the market"). It'll inevitably guess wrong -- and that's a "hallucination."
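The "statistical probability machine" idea can be shown with a toy stand-in. The sketch below counts which word most often follows another in a tiny made-up corpus and always guesses the most common one. Large language models use neural networks over tokens rather than word-pair counts, so treat this as an analogy under that assumption, not as how Arctic LLM works.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Guess the follower seen most often in training, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# A made-up corpus for illustration.
corpus = [
    "I go to the market",
    "I go to the office",
    "I go to the market early",
    "We walk to the park",
]

model = train_bigrams(corpus)

# "market" follows "the" more often than "office" or "park", so it wins.
guess = most_likely_next(model, "the")
```

The guess is whatever was most frequent in the examples, which is not the same as what is true in a given case: someone headed to the office would still be told "market."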

As Devin writes in his piece, until the next major technical breakthrough, incremental improvements are all we have to look forward to in the generative AI domain. That won't stop vendors like Snowflake from championing them as great achievements, though, and marketing them for all they're worth.
