Anthropic's Claude AI is guided by 10 secret foundational pillars of fairness

What? No, of course the company isn't saying what those core tenets actually are.


Despite their ability to crank out incredibly lifelike prose, generative AIs like Google's Bard and OpenAI's ChatGPT (powered by GPT-4) have already shown the current limitations of gen-AI technology, as well as their own tenuous grasp of the facts: arguing that the JWST was the first telescope to image an exoplanet, and that Elvis' dad was an actor. But with this much market share at stake, what are a few misquoted facts against getting their product into the hands of consumers as quickly as possible?

The team over at Anthropic, conversely, is made up largely of ex-OpenAI folks, and it has taken a more pragmatic approach to the development of its own chatbot, Claude. The result is an AI that is “more steerable” and “much less likely to produce harmful outputs” than ChatGPT, per a report from TechCrunch.

Anthropic has been developing Claude in closed beta since late 2022, but has recently begun testing the AI's conversational capabilities with launch partners including Robin AI, Quora and the privacy-centered search engine DuckDuckGo. The company has not yet released pricing, but has confirmed to TC that two versions will be available at launch: the standard API and a faster, lightweight iteration dubbed Claude Instant.

“We use Claude to evaluate particular parts of a contract, and to suggest new, alternative language that’s more friendly to our customers,” Robin CEO Richard Robinson told TechCrunch. “We’ve found Claude is really good at understanding language — including in technical domains like legal language. It’s also very confident at drafting, summarizing, translations and explaining complex concepts in simple terms.”

Anthropic believes that Claude will be less likely to go rogue and start spitting racist obscenities the way Microsoft's Tay did, in part due to the AI's specialized training regimen, which the company is calling "constitutional AI." The company asserts that this provides a “principle-based” approach toward getting humans and robots on the same ethical page. Anthropic started with 10 foundational principles, though the company won't disclose what they are, specifically (an 11-secret-herbs-and-spices sort of marketing stunt); suffice it to say that "they’re grounded in the concepts of beneficence, nonmaleficence and autonomy," per TC.

The company then trained a separate AI to reliably generate text in accordance with those semi-secret principles by responding to myriad writing prompts like “compose a poem in the style of John Keats.” That model then trained Claude. But just because Claude is trained to be fundamentally less problematic than its competition doesn't mean it doesn't hallucinate facts like a startup CEO on an ayahuasca retreat. The AI has already invented a whole new chemical and taken artistic license with the uranium enrichment process; it has reportedly scored lower than ChatGPT on standardized tests for both math and grammar as well.
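Anthropic hasn't published Claude's training code, but its constitutional AI research describes that principle-guided step as a critique-and-revise loop: the model drafts a response, critiques its own draft against a principle, then rewrites it. Here's a minimal sketch of what that could look like, assuming a generic `generate(prompt)` text-completion call; the function, the example principle and the prompt templates are all illustrative assumptions, not Anthropic's actual code or principles.

```python
# Sketch of a constitutional-AI-style critique-and-revision loop.
# `generate` stands in for any text-completion model call; the principle
# and prompt templates below are illustrative, not Anthropic's own.

def generate(prompt: str) -> str:
    """Placeholder for a call to a text-generation model."""
    raise NotImplementedError("wire this up to a real model")

PRINCIPLES = [
    "Choose the response that is most helpful while avoiding harm.",  # example only
]

def constitutional_revision(user_prompt: str) -> str:
    response = generate(user_prompt)
    for principle in PRINCIPLES:
        # Ask the model to critique its own draft against one principle...
        critique = generate(
            f"Critique this response per the principle: {principle}\n\n"
            f"Prompt: {user_prompt}\nResponse: {response}"
        )
        # ...then revise the draft in light of that critique.
        response = generate(
            f"Rewrite the response to address this critique: {critique}\n\n"
            f"Prompt: {user_prompt}\nResponse: {response}"
        )
    # In the published research, the revised (prompt, response) pairs become
    # fine-tuning data rather than being served directly to users.
    return response
```

In Anthropic's published version of the approach, those revised outputs seed a supervised fine-tuning pass, and a separate preference model trained on AI feedback handles the reinforcement stage, which roughly matches the article's "that model then trained Claude."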

“The challenge is making models that both never hallucinate but are still useful — you can get into a tough situation where the model figures a good way to never lie is to never say anything at all, so there’s a tradeoff there that we’re working on,” an Anthropic spokesperson told TC. “We’ve also made progress on reducing hallucinations, but there is more to do.”