TechCrunch

Data is at the heart of today's advanced AI systems, but it's costing more and more -- putting it out of reach of all but the wealthiest tech companies.

Last year, James Betker, a researcher at OpenAI, penned a post on his personal blog about the nature of generative AI models and the datasets on which they're trained. In it, Betker claimed that training data -- not a model's design, architecture or any other characteristic -- was the key to increasingly sophisticated, capable AI systems.