Why code-testing startup Nova AI uses open source LLMs more than OpenAI

Julie Bort

Updated April 24, 2024 at 6:57 PM

It is a universal truth of human nature that the developers who build the code should not be the ones to test it. First of all, most of them pretty much detest that task. Second, like any good auditing protocol, those who do the work should not be the ones who verify it.

Not surprisingly, then, code testing in all its forms -- usability, language- or task-specific tests, end-to-end testing -- has been a focus of a growing cadre of generative AI startups. Every week, TechCrunch covers another one like Antithesis (raised $47 million), CodiumAI (raised $11 million) and QA Wolf (raised $20 million). And new ones are emerging all the time, like new Y Combinator graduate Momentic.

Another is year-old startup Nova AI, an Unusual Academy accelerator grad that’s raised a $1 million pre-seed round. It is attempting to best its competitors with its end-to-end testing tools by breaking many of the Silicon Valley rules of how startups should operate, founder/CEO Zach Smith tells TechCrunch.

Whereas the standard Y Combinator approach is to start small, Nova AI is aiming at mid-size to large enterprises with complex code-bases and a burning need now. Smith declined to name any customers using or testing its product except to describe them as mostly late-stage (Series C or beyond) venture-backed startups in e-commerce, fintech or consumer products, and “heavy user experiences. Downtime for these features is costly.”

Nova AI's tech sifts through its customers' code to build tests automatically using GenAI. It is particularly geared toward continuous integration and continuous delivery/deployment (CI/CD) environments where engineers are constantly shipping bits and pieces into their production code.

The idea for Nova AI came from the experiences Smith and his co-founder Jeffrey Shih had when they were engineers working for big tech companies. Smith is a former Googler who worked on cloud-related teams that helped customers use a lot of automation technology. Shih previously worked at Meta (also at Unity and Microsoft before that) with a rare AI specialty involving synthetic data. They've since added a third co-founder, AI data scientist Henry Li.

Another rule Nova AI is not following: While boatloads of AI startups are building on top of OpenAI’s industry-leading GPT models, Nova AI is using OpenAI's GPT-4 as little as possible. No customer data is being fed to OpenAI.

While OpenAI promises that the data of those on a paid business plan is not being used to train its models, enterprises still do not trust OpenAI, Smith tells us. “When we're talking to large enterprises, they're like, ‘We don't want our data going into OpenAI,’” Smith said.

The engineering teams of large companies are not the only ones that feel this way. OpenAI is fending off a number of lawsuits from those who don’t want it to use their work for model training, or believe their work wound up, unauthorized and unpaid for, in its outputs.

Nova AI is instead relying heavily on open source models like Llama, developed by Meta, and StarCoder (from the BigCode project, which is led by ServiceNow and Hugging Face), as well as building its own models. The company isn’t yet using Google’s Gemma with customers, but has tested it and “seen good results,” Smith says.

For instance, he explains that OpenAI offers models for vector embeddings. Vector embeddings translate chunks of text into numbers so the LLM can perform various operations, such as clustering them with other chunks of similar text. Nova AI doesn’t use OpenAI's embeddings and instead runs open source embedding models on the customer's source code. It uses OpenAI tools only to help it generate some code and to do some labeling tasks, and is going to great lengths not to send any customer data to OpenAI.
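
To make the idea of embeddings concrete, here is a minimal Python sketch of turning text chunks into vectors and comparing them locally, with nothing sent to an outside service. It is a toy illustration only: the bag-of-words `embed` function stands in for a real open source embedding model, and the function names and sample chunks are hypothetical, not Nova AI's code.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word-like tokens (letters and underscores)."""
    return re.findall(r"[a-z_]+", text.lower())

def build_vocab(chunks: list[str]) -> dict[str, int]:
    """Assign each distinct token in the corpus a fixed vector position."""
    tokens = sorted({tok for chunk in chunks for tok in tokenize(chunk)})
    return {tok: i for i, tok in enumerate(tokens)}

def embed(text: str, vocab: dict[str, int]) -> list[float]:
    """Toy embedding: token counts over the vocabulary, scaled to unit length.
    A real system would call a trained embedding model here instead."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two unit-length vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical chunks of a customer's source code.
chunks = [
    "def test_login(): assert login(user, password)",
    "def test_logout(): assert logout(user)",
    "SELECT name FROM customers WHERE id = 1",
]
vocab = build_vocab(chunks)
vectors = [embed(chunk, vocab) for chunk in chunks]

# The two test functions share tokens, so they score as more similar
# to each other than either does to the SQL query.
sim_tests = cosine(vectors[0], vectors[1])
sim_sql = cosine(vectors[0], vectors[2])
print(f"test vs. test: {sim_tests:.2f}, test vs. SQL: {sim_sql:.2f}")
```

Chunks with high similarity scores can then be grouped together, which is the kind of clustering described above. Because every step runs in-process, no file contents leave the machine.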

"In this case, instead of using OpenAI's embedding models, we deploy our own open source embedding models so that when we need to run through every file, we aren't just sending it to OpenAI," Smith explained.

While not sending customer data to OpenAI appeases nervous enterprises, open source AI models are also cheaper and more than sufficient for doing targeted specific tasks, Smith has found. In this case, they work well for writing tests.

“The open LLM industry is really proving that they can beat GPT 4 and these big domain providers, when you go really narrow,” he said. “We don’t have to provide some massive model that can tell you what your grandma wants for her birthday. Right? We need to write a test. And that's it. So our models are fine-tuned specifically for that.”

Open source models are also progressing quickly. For instance, Meta recently introduced a new version of Llama that's earning accolades in technology circles and that may convince more AI startups to look at OpenAI alternatives.
