Why vector databases are having a moment as the AI hype cycle peaks

Vector databases are all the rage, judging by the number of startups entering the space and the investors ponying up for a piece of the pie. The proliferation of large language models (LLMs) and the generative AI (GenAI) movement have created fertile ground for vector database technologies to flourish.

While traditional relational databases such as Postgres or MySQL are well suited to structured data -- predefined data types that can be filed neatly in rows and columns -- they are far less suited to unstructured data such as images, videos, emails, social media posts, and any other data that doesn't adhere to a predefined data model.

Vector databases, on the other hand, store and process data in the form of vector embeddings: numerical representations of text, documents, images, and other data that capture the meaning of, and relationships between, data points. This is a natural fit for machine learning, because the database organizes data spatially by how similar each item is to the others, making it easy to retrieve semantically related data.
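To make that concrete, here is a minimal sketch of the idea. The embed() function below is a purely illustrative stand-in -- it produces random unit vectors, so it carries no real meaning -- whereas a production system would call an actual embedding model (for example, one from OpenAI or Hugging Face) so that texts with similar meanings land near each other in vector space. Only the storage and nearest-neighbor lookup are shown concretely:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for an embedding model: maps text to a fixed-length unit vector.
    The vectors are random, so they capture no semantics; a real model would."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)            # toy 384-dimensional embedding
    return v / np.linalg.norm(v)        # normalize so a dot product equals cosine similarity

# "Index" a handful of documents by storing their vectors.
documents = ["how to reset a password", "pasta recipes", "forgot my login credentials"]
index = np.stack([embed(d) for d in documents])

# Query: score every stored vector against the query and take the closest one.
query = embed("I can't sign in to my account")
scores = index @ query                  # cosine similarities against every document
print(documents[int(np.argmax(scores))])
# With real embeddings, the closest match would be "forgot my login credentials".
```

Dedicated vector databases do essentially this, but with indexing structures that make the nearest-neighbor lookup fast across millions or billions of vectors rather than a handful.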

This is particularly useful for LLMs, such as OpenAI's GPT-4, as it allows the AI chatbot to better understand the context of a conversation by analyzing previous similar conversations. Vector search is also useful for all manner of real-time applications, such as content recommendations in social networks or e-commerce apps, as it can look at what a user has searched for and retrieve similar items in a heartbeat. 

Vector search can also help reduce "hallucinations" in LLM applications by providing the model with additional information that might not have been available in the original training dataset.
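The pattern behind that claim is retrieval: pull the stored passages most similar to the user's question and put them in front of the model, so it can ground its answer in facts it was never trained on. Below is a hedged sketch under assumed placeholders -- embed() and generate() stand in for a real embedding model and a real LLM call -- with only the retrieval and prompt assembly shown concretely:

```python
import numpy as np

def retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray, docs: list[str], k: int = 3) -> list[str]:
    """Return the k passages whose (unit-length) vectors are closest to the query."""
    scores = doc_vecs @ query_vec              # cosine similarity for normalized vectors
    top = np.argsort(scores)[::-1][:k]         # indices of the k best matches
    return [docs[i] for i in top]

def build_prompt(question: str, passages: list[str]) -> str:
    """Prepend the retrieved context so the model answers from it rather than from memory."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

# Usage, assuming embed(), a document store, and an LLM call exist:
#   passages = retrieve(embed(question), doc_vecs, docs)
#   answer = generate(build_prompt(question, passages))
```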

"Without using vector similarity search, you can still develop AI/ML applications, but you would need to do more retraining and fine-tuning," Andre Zayarni, CEO and co-founder of vector search startup Qdrant, explained to TechCrunch. "Vector databases come into play when there’s a large dataset, and you need a tool to work with vector embeddings in an efficient and convenient way."

In January, Qdrant secured $28 million in funding to capitalize on growth that made it one of the top 10 fastest-growing commercial open source startups last year. And it's far from the only vector database startup to raise cash of late -- Vespa, Weaviate, Pinecone, and Chroma collectively raised $200 million last year for various vector offerings.

Qdrant founding team. Image Credits: Qdrant

Since the turn of the year, we've also seen Index Ventures lead a $9.5 million seed round into Superlinked, a platform that transforms complex data into vector embeddings. And a few weeks back, Y Combinator (YC) unveiled its Winter '24 cohort, which included Lantern, a startup that sells a hosted vector search engine for Postgres.

Elsewhere, Marqo raised a $4.4 million seed round late last year, swiftly followed by a $12.5 million Series A round in February. The Marqo platform provides a full gamut of vector tools out of the box -- spanning vector generation, storage, and retrieval -- all through a single API, allowing users to circumvent third-party tools from the likes of OpenAI or Hugging Face.

Marqo co-founders Tom Hamer and Jesse N. Clark previously worked in engineering roles at Amazon, where they realized the "huge unmet need" for semantic, flexible searching across different modalities such as text and images. And that is when they jumped ship to form Marqo in 2021.

"Working with visual search and robotics at Amazon was when I really looked at vector search -- I was thinking about new ways to do product discovery, and that very quickly converged on vector search," Clark told TechCrunch. "In robotics, I was using multi-modal search to search through a lot of our images to identify if there were errant things like hoses and packages. This was otherwise going to be very challenging to solve."

Marqo co-founders Jesse Clark and Tom Hamer. Image Credits: Marqo

Enter the enterprise

While vector databases are having a moment amid the hullabaloo of ChatGPT and the GenAI movement, they're not the panacea for every enterprise search scenario.

"Dedicated databases tend to be fully focused on specific use cases and hence can design their architecture for performance on the tasks needed, as well as user experience, compared to general-purpose databases, which need to fit it in the current design," Peter Zaitsev, founder of database support and services company Percona, explained to TechCrunch.

Specialized databases might excel at one thing to the exclusion of others, which is why we're starting to see database incumbents such as Elastic, Redis, OpenSearch, Cassandra, Oracle, and MongoDB adding vector search smarts to the mix, as are cloud service providers like Microsoft's Azure, Amazon's AWS, and Cloudflare.

Zaitsev compares this latest trend to what happened with JSON more than a decade ago, when web apps became more prevalent and developers needed a language-independent data format that was easy for humans to read and write. In that case, a new database class emerged in the form of document databases such as MongoDB, while existing relational databases also introduced JSON support.

"I think the same is likely to happen with vector databases," Zaitsev told TechCrunch. "Users who are building very complicated and large-scale AI applications will use dedicated vector search databases, while folks who need to build a bit of AI functionality for their existing application are more likely to use vector search functionality in the databases they use already."

But Zayarni and his Qdrant colleagues are betting that native solutions built entirely around vectors will provide the "speed, memory safety, and scale" needed as vector data explodes, compared to the companies bolting vector search on as an afterthought.

"Their pitch is, 'we can also do vector search, if needed,'" Zayarni said. "Our pitch is, 'we do advanced vector search in the best way possible.' It is all about specialization. We actually recommend starting with whatever database you already have in your tech stack. At some point, users will face limitations if vector search is a critical component of your solution."