Neon and its "artificial human" avatars were the first viral hit of CES. They had everything to get the internet excited: a corporate giant (the company is from Samsung's STAR Labs), buzzwords (Avatars! Realistic AI assistants!) and confusion. Redditors combed the internet for details, and YouTube channel Good Content pulled together a surprisingly comprehensive dossier on a company that's barely half a year old. Neon then officially announced itself at CES in a press release rich in hyperbole, complicated machine learning jargon and a pretty opaque mission statement. There was also the promise of Neons "reacting and responding in real-time." I had to see it for myself.
According to Neon creator and STAR Labs boss Pranav Mistry, "Neons will integrate with our world and serve as new links to a better future, a world where 'humans are humans' and 'machines are humane.'" That makes them sound like digital conversation partners, where interactions would approximate real humans. The mission seems to be a softer, more empathetic connection with future virtual assistants, which sounds... nice. I guess.
Figuring out how that's all meant to come together (and work) is complicated, so let's start with some clarification. Neon is both the company's name and what it calls these avatars. The technology behind the concept, at least, is split into two parts: Core R3 and Spectra.
Core R3 is short for "Reality, Realtime, Responsive." This is the process of generating how these avatars look and move, with the aim of creating a "reality that is beyond normal perception." It combines proprietary technology with neural networks to create these artificial humans, though the starting point was real humans.
The (slightly) interactive Neons demoed at CES were based on real people, but the gestures and expressions were, according to the company, generated independently.
The early Neon videos that wowed everyone last week were fluid and realistic. But that was because they were really just videos of humans. At the booth, Neon caveated its life-size avatars with little disclaimers at the bottom of each screen. These were just visions of how Neons could look and behave in the future. Sure, the actual Core R3 results looked promising, but they were far from what most people hoped to see.
Core R3 was "extensively trained" on how humans look, behave and interact -- this is where the neural networks would come into play. Onstage during his debut presentation, Mistry showed the incredible pace of improvements between early models and today, roughly four months later. He also offered the best insight into how the technology actually works.
After establishing facial models of one engineer and generating a copycat avatar of him, the team tried it with different people. They could then "talk" through this avatar, which sounds very similar to how deepfakes work. The next step is what sets it apart, though. The team established a system that generates facial expressions and mouth movements on its own. It's not a combination of people but something entirely new.
A Neon model will be able to generate facial animations from a multitude of options (the word "millions" was used during the press conference). There are countless ways to smile, and a Neon avatar apparently has countless ways to follow a command to smile. We saw the Neon smile in two different ways. Maybe there were countless other smiles available? I didn't get to find out, but Neon suggests as much in its press releases and in comments to other journalists. The team could even raise the avatar's eyebrows during these different expressions. It was impressive to see it all happen in real time, but it was all at a rudimentary level.
What else is a Neon capable of, then, at this point? In part of my demo, one avatar reeled off a few lines in Chinese, Korean and Hindi, all in response to a Neon employee's voice commands. But the "artificial person" was relatively static, and, barring some initial expressions in response to the handler's requests, the Neons were largely dead behind their eyes. There were uncanny valley vibes. The mouth tracking was especially rough, with snaggleteeth undulating as the avatar talked.
But perhaps the biggest disappointment for CES attendees was the low level of interaction, limited to a few light questions from the audience, in addition to repetitive answers we'd heard during Neon's launch event. The avatar answered when it understood the question, but it was nothing beyond the half-decent responses that you get from online chatbots.
Then there's Spectra: the platform that represents what would be truly new here. This is the exciting sci-fi part, intended to handle the learning and emotional responses of Neons. Unsurprisingly, it's the hardest part to understand and get a straight answer about, and it's all but missing from this launch. (It will form the focus for the company this year.)
Despite that, Neon has been quick to define what Neons can and can't do, keen to distance the small company's ambitions from incumbent tech giants' similar efforts. Neons are apparently not smart assistants. They won't spout random facts or sing a ditty on command. They will, if the company's Spectra platform is realized, be able to learn from experiences and converse and sympathize with humans.
Mistry mentioned to me that he imagined an old person who "doesn't want facts read out, they want to have a conversation, a 'Dear Diary.'" Alas, these early demos seemed like exactly what Neon is trying to avoid: graphically impressive Alexa-esque virtual assistant showcases.
Mistry is keen to see what people think of Neon and what is possible in these early stages. The amount of interest the team has garnered at launch will probably make STAR Labs backer Samsung very happy, but Neon's avatars need purpose. The company believes its creations could one day be used as banking assistants, actors or hotel concierges, and the press conference was packed with representatives from banks, resorts and retailers, squeezed in alongside rows of skeptical reporters.
By the end of the presentation and the few demos, the words "ambitious" and "naive" came to mind. You can see where this is going if Neon keeps iterating, fixing and polishing. But it's a victim of everyone's overexcitement. Neon's booth is a fair distance away from Samsung's own imposing area at CES, and while the association between them isn't a hugely strong one, the connection alone was enough to stoke the hype flames. Neon could have kept itself under the radar until it had a better idea of how to explain itself to people or had better demos that reflected the initial sales pitch.
The tech press can be unforgiving, but Neon hasn't done anything particularly wrong. Yes, it had an overenthusiastic press release (everyone's guilty of that), teasers that didn't represent the actual results on the ground and an excited CEO who truly believes in the tech his team is developing.
Let's see what Neon has to show at CES 2021.