This Week in AI: Let us not forget the humble data annotator

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

This week in AI, I'd like to turn the spotlight on labeling and annotation startups -- startups like Scale AI, which is reportedly in talks to raise new funds at a $13 billion valuation. Labeling and annotation platforms might not get the attention flashy new generative AI models like OpenAI's Sora do. But they're essential. Without them, modern AI models arguably wouldn't exist.

The data on which many models train has to be labeled. Why? Labels, or tags, help the models understand and interpret data during the training process. For example, labels to train an image recognition model might take the form of markings around objects, "bounding boxes" or captions referring to each person, place or object depicted in an image.
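For the curious, here's a minimal sketch of what such labels often look like in practice. The field names and format loosely follow the common COCO convention, not any particular platform's schema:

```python
# One image's annotations, loosely COCO-style: each box is
# [x, y, width, height] in pixels, tied to a category label.
annotations = [
    {"label": "person",  "bbox": [34, 50, 120, 240]},
    {"label": "bicycle", "bbox": [160, 180, 200, 110]},
]

def box_area(bbox):
    """Area of a [x, y, width, height] box -- a common sanity check on labels."""
    _, _, w, h = bbox
    return w * h

areas = [box_area(a["bbox"]) for a in annotations]
print(areas)  # [28800, 22000]
```

Every one of those boxes and labels is drawn or typed by a human annotator, image after image, which is where the sheer scale of the work comes from.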

The accuracy and quality of labels significantly impact the performance -- and reliability -- of the trained models. And annotation is a vast undertaking, requiring thousands to millions of labels for the larger and more sophisticated datasets in use.

So you'd think data annotators would be treated well, paid living wages and given the same benefits that the engineers building the models themselves enjoy. But often, the opposite is true -- a product of the brutal working conditions that many annotation and labeling startups foster.

Companies with billions in the bank, like OpenAI, have relied on annotators in third-world countries paid only a few dollars per hour. Some of these annotators are exposed to highly disturbing content, like graphic imagery, yet aren't given time off (as they're usually contractors) or access to mental health resources.

An excellent piece in NY Mag peels back the curtain on Scale AI in particular, which recruits annotators in places as far-flung as Nairobi, Kenya. Some of the tasks required by Scale AI take labelers multiple eight-hour workdays -- no breaks -- and pay as little as $10. And these workers are beholden to the whims of the platform. Annotators sometimes go long stretches without receiving work, or they're unceremoniously booted off Scale AI -- as happened to contractors in Thailand, Vietnam, Poland and Pakistan recently.

Some annotation and labeling platforms claim to provide "fair-trade" work. They've made it a central part of their branding, in fact. But as MIT Tech Review's Kate Kaye notes, there are no regulations, only weak industry standards for what ethical labeling work means -- and companies' own definitions vary widely.

So, what to do? Barring a massive technological breakthrough, the need to annotate and label data for AI training isn't going away. We can hope that the platforms self-regulate, but the more realistic solution seems to be policymaking. That itself is a tricky prospect -- but it's the best shot we have, I'd argue, at changing things for the better. Or at least starting to.

Here are some other AI stories of note from the past few days:

  • OpenAI builds a voice cloner: OpenAI is previewing a new AI-powered tool it developed, Voice Engine, that enables users to clone a voice from a 15-second recording of someone speaking. But the company is choosing not to release it widely (yet), citing risks of misuse and abuse.

  • Amazon doubles down on Anthropic: Amazon has invested an additional $2.75 billion in the growing AI startup Anthropic, following through on the option it left open last September.

  • Google.org launches an accelerator: Google.org, Google’s charitable wing, is launching a new $20 million, six-month program to help fund nonprofits developing tech that leverages generative AI.

  • A new model architecture: AI startup AI21 Labs has released a generative AI model, Jamba, that employs a new(ish) model architecture -- state space models, or SSMs -- to improve efficiency.

  • Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model akin to OpenAI’s GPT series and Google’s Gemini. The company claims it achieves state-of-the-art results on a number of popular AI benchmarks, including several measuring reasoning.

  • Uber Eats and UK AI regulation: Natasha writes about how an Uber Eats courier’s fight against AI bias shows that justice under the U.K.'s AI regulations is hard won.

  • EU election security guidance: The European Union published draft election security guidelines Tuesday aimed at the around two dozen platforms regulated under the Digital Services Act, including guidelines pertaining to preventing content recommendation algorithms from spreading generative AI–based disinformation (aka political deepfakes).

  • Grok gets upgraded: X's Grok chatbot will soon get an upgraded underlying model, Grok-1.5. At the same time, all Premium subscribers on X will gain access to Grok. (Grok was previously exclusive to X Premium+ customers.)

  • Adobe expands Firefly: This week, Adobe unveiled Firefly Services, a set of more than 20 new generative and creative APIs, tools and services. It also launched Custom Models, which allows businesses to fine-tune Firefly models based on their assets -- a part of Adobe's new GenStudio suite.

More machine learnings

How's the weather? AI is increasingly able to tell you this. I noted a few efforts in hourly, weekly, and century-scale forecasting a few months ago, but like all things AI, the field is moving fast. The teams behind MetNet-3 and GraphCast have published a paper describing a new system called SEEDS (Scalable Ensemble Envelope Diffusion Sampler).

Animation showing how larger ensembles produce a more even distribution of weather outcomes. Image Credits: Google

SEEDS uses diffusion to generate "ensembles" of plausible weather outcomes for an area based on the input (radar readings or orbital imagery, perhaps) much faster than physics-based models. With bigger ensembles, the system can cover more edge cases (like an event that occurs in only 1 out of 100 possible scenarios) and be more confident about the likelier outcomes.
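Back-of-the-envelope, the edge-case point works out like this (a toy probability sketch of why ensemble size matters, not anything from the SEEDS paper itself):

```python
# Chance that an ensemble of n independent samples contains at least
# one draw of an outcome that occurs with probability p.
def chance_of_seeing(p, n):
    return 1 - (1 - p) ** n

p = 0.01  # a "1 in 100" weather scenario
for n in (10, 100, 500):
    print(n, round(chance_of_seeing(p, n), 3))
# 10 0.096, 100 0.634, 500 0.993
```

A 10-member ensemble will almost always miss that rare scenario entirely; a 500-member ensemble will almost always surface it -- which is why being able to generate ensembles cheaply is the whole point.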

Fujitsu is also hoping to better understand the natural world by applying AI image handling techniques to underwater imagery and lidar data collected by underwater autonomous vehicles. Improving the quality of the imagery will let other, less sophisticated processes (like 3D conversion) work better on the target data.

Image Credits: Fujitsu

The idea is to build a "digital twin" of these waters that can help simulate and predict new developments. We're a long way off from that, but you gotta start somewhere.

Over among the large language models (LLMs), researchers have found that they mimic intelligence by an even simpler-than-expected method: linear functions. Frankly, the math is beyond me (vector stuff in many dimensions) but this writeup at MIT makes it pretty clear that the recall mechanism of these models is pretty … basic.

"Even though these models are really complicated, nonlinear functions that are trained on lots of data and are very hard to understand, there are sometimes really simple mechanisms working inside them. This is one instance of that," said co-lead author Evan Hernandez. If you're more technically minded, check out the researchers' paper here.
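To make the "linear functions" idea concrete, here's a toy NumPy sketch (my own illustration, not the paper's code): if a relation really is stored roughly linearly, ordinary least squares can recover a single matrix that maps subject representations to attribute representations.

```python
import numpy as np

# Pretend embeddings: a "relation" (say, capital-of) is a linear transform.
rng = np.random.default_rng(0)
d = 8
W_true = rng.normal(size=(d, d))      # hidden linear relation
subjects = rng.normal(size=(20, d))   # toy subject embeddings
attributes = subjects @ W_true.T      # attributes depend linearly on subjects

# Fit one matrix by least squares; it recovers the relation exactly here.
W_fit, *_ = np.linalg.lstsq(subjects, attributes, rcond=None)
print(np.allclose(subjects @ W_fit, attributes))  # True
```

The surprise in the research is that something this simple approximates how real LLMs retrieve certain facts, despite all the nonlinearity layered on top.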

One way these models can fail is not understanding context or feedback. Even a really capable LLM might not "get it" if you tell it your name is pronounced a certain way, since it doesn't actually know or understand anything. In cases where that might be important, like human-robot interactions, it could put people off if the robot acts that way.

Disney Research has been looking into automated character interactions for a long time, and this name pronunciation and reuse paper just showed up a little while back. It seems obvious, but extracting the phonemes when someone introduces themselves and encoding that rather than just the written name is a smart approach.

Image Credits: Disney Research

Lastly, as AI and search overlap more and more, it's worth reassessing how these tools are used and whether there are any new risks presented by this unholy union. Safiya Umoja Noble has been an important voice in AI and search ethics for years, and her opinion is always enlightening. She did a nice interview with the UCLA news team about how her work has evolved and why we need to stay frosty when it comes to bias and bad habits in search.

[youtube https://www.youtube.com/watch?v=thTwpAJ9jQM]