TV Music Streamer Roxi’s CEO on U.S. Plans, Celeb Backers, Replacing Smart Speakers With “MTV 2.0”

U.K.-headquartered TV-based music streaming service Roxi, backed by such investors as Simon Cowell, Kylie Minogue and Sheryl Crow, is ready to move into the U.S. during the first quarter of 2024. The company recently unveiled partnerships with TV and set-top box makers, as well as TV station giant Sinclair.

Consumers can use the Roxi app free with ads under the Roxi Standard plan or for $8.99 per month in the U.S. (after a free 30-day trial) or £6.99 ($8.85) in the U.K. The firm also offers interactive music channels to the likes of Sinclair.

CEO Rob Lewis leads the company, founded in 2017, which has deals with all major labels and touts that it offers its partners “a unique full-catalog music video service – all the original music videos plus over 100 million Roxi virtual music videos.” The serial entrepreneur formerly founded Omnifone, a provider of cloud-based unlimited music services, among others.

In a conversation with The Hollywood Reporter, Lewis explained why his team sees a big market opportunity for offering consumers music videos on TV sets, how Roxi makes music more social, why the TV can replace the role of audio speakers in many homes and how he sees it as this generation’s answer to MTV.

Why did you decide to bring Roxi to the U.S., why now and how do you assess consumers’ appetite for the service?

Our ambition is to change the way consumers all over America enjoy music in the home. We basically believe that the TV is this wonderful connected device that we have in the living room and clearly has revolutionized the way we access TV and film. But these days it’s also connected increasingly to great sound bars and sound systems. It’s a great big TV screen. And ultimately, it’s a better way of doing music than with a smart speaker. It’s just that it’s never really been done very well before. It’s quite shocking that it hasn’t really been done very well before because it’s something that’s reasonably obvious. Of the smart TV vendors there are only three that actually own a music service. Two of them, Apple Music and Fire TV with Amazon Music, are delivering an audio-only experience on their TVs. And frankly, an audio-only music experience on the TV is totally 1980s. I remember when I grew up, my dad used to listen to BBC Radio on the TV. I always thought it was ridiculous. There was the BBC logo in the middle of the TV, but you’re listening to audio. That’s not what TV is for.

There is obviously YouTube Music, which does a decent job on TVs. But it’s still relatively audio-centric as there are a million music videos and over 100 million tracks that are actually still audio-only. Now there is YouTube, as in the video sharing platform, but we say that’s really not a family-friendly music experience, is very much about user-generated content and there’s quite a lot of stuff I wouldn’t like six-year-olds to consume by accident when they’re supposed to be listening to music. So what we set out to do with Roxi is “let’s turn the TV into something that’s going to be way better than a smart speaker.” Lots of things have collided in the last little while so that I think we’re going to achieve that in 2024.

How key is it that Roxi has licensing deals with all major labels and a broad offering of music and music videos?

We used to run something called Sony Music Unlimited for Sony Corp on Sony PlayStation. The number one issue with the service was that users would search for a track they loved and it wasn’t there. Now we also say audio-only on a TV is rubbish. Actually resolving that conundrum is quite an interesting one, because there are less than a million music videos in the world. But if you go to a music streaming service, there are well over 100 million songs that we need to cover. The really big recent pop-ish, like Taylor Swift, songs are going to have great music videos. But if you go to the ’60s, ’70s, Elvis Presley, EDM, classical, jazz or to smaller artists, or even quite big artists, there’s quite a lot of music out there that isn’t ever going to be available in an original music video.

The way we crack that problem, and it was a long negotiation with all the recorded rights owners, publishers and also Getty Images, is a system to build virtual music videos for all the music that doesn’t have an original music video. That’s quite a substantial exercise. We essentially take all the metadata associated with an individual track or artist – when was it recorded, the genre, the decade, the mood – and then take all of the photography that Getty Images has to create virtual music videos whilst that music is playing. That means that for the very first time on TV you have an audio-visual experience for every single track and a full catalog.

Does that mean Elvis photos morphing into each other or is there video or animation or so involved?

To a degree. You will know about sync rights. They give an individual songwriter or even somebody who owns a little bit of a song the ability to veto the synchronization of an audio and a video asset. So we have to do all of the creation of a music video on the fly, and it’s always marginally different for each individual audio track each time you play it. That avoids the requirement for a sync right, because the whole point is to be comprehensive.

If I listened to an Elvis Presley track on YouTube Music, I’d have a bit of artwork in the middle of the screen and otherwise an audio-only experience and feel like I’m back in the 1980s. If you do it on Roxi, you’ve got fabulous photography that could be on the front page of Time magazine and it’s all animated. Even if you go to a genre that doesn’t normally have a music video, let’s say classical or jazz, you will have fabulous photography of either the artists or the genre or the decade. For example, for classical, if you were listening to opera, you’d have amazing architectural pictures of opera houses on the screen, along with people up on stage. So it creates an experience that the TV deserves for music. And in addition to that, we also say the TV is at the center of a home, and it’s a shared experience.

How do you lean into that in the age of mobile phones?

There are lots of things we can do with music on the TV that we’d never think about doing on a mobile phone. One example is what we call Party Play Queue. There could be a party or even a restaurant or bar. Chances are that most people there don’t have access to the remote. Play Queue enables anyone in the room, and it could literally be a customer at a bar, to pick up their mobile phone, use a QR code and instantly choose and add music to the video play queue. On the screen it will say “John just added Taylor Swift’s latest track to the video play queue.” That is a key feature, particularly given that our competition is actually an Amazon Echo.

What we think is really significant is having a living room with our friends and family. So it’s not just about consuming content, it’s also about interacting with music. So we have music video karaoke where the music is playing in a music video format and you’re actually seeing Taylor Swift on the screen with the lyrics superimposed on top and, yes, you can sing with Taylor Swift. We also have interactive music games where you’ve got to quickly guess who’s the artist and what is the music video. It is a kind of Who Wants to Be a Millionaire-style environment and user experience. Those experiences are great for getting people in the family to get off the mobile phone and have fun together in front of a big screen.

How big is the market opportunity and who are key competitors for Roxi?

We will be available in the U.S. on Samsung, LG, Vizio, Roku, Comcast, Sky, Sony Bravia and Amazon Fire TV, all during the first quarter. Essentially, that’s all the platforms except Apple TV. We truly want to disintermediate or create an alternative experience to the smart speaker. In the U.S., 54 percent of households are using a smart speaker right now. There are 200 million smart speakers in the U.S., and 80 percent of the usage is actually music. What we’re trying to do is to say “there’s a much better way to do music in the home than that smart speaker.” The fact that it’s music videos on a great big screen is one reason; another is the fact that you’re likely to have a much better audio experience today with sound bars and sound systems.

If you can’t remember the name of a Taylor Swift track, you can’t really say to Amazon Echo, or Alexa, “tell me the name of all of the tracks Taylor Swift ever did.” It’s an extremely cumbersome way of trying to select music. We want to make it instantly accessible with your voice as easily as it would be with a smart speaker.

For Roxi, is it all about getting consumers to interact with music and each other?

We also believe a lot in the lean-back experience for curation. We know that consumers will often have had a long day at work, come home and want to put something on. They don’t necessarily want to build a playlist, they just want to put something on and if they don’t like a track, they want to be able to skip it. Part of that lean-back experience is not only the personalization we have and the interaction features we have but also making sure we have a huge number of interactive channels that are available to customers. And then some of those are actually curated by some of our shareholders, particularly people like Simon Cowell, Kylie Minogue and Sheryl Crow.

So if I listen to the Kylie Minogue-curated channel, for example, is this full of songs from her plus songs that she likes or what do I find?

It is basically what she listens to herself. It really is what our curators listen to – we insist on that. Simon Cowell creates absolutely everything that is Simon Cowell-branded on our service. His curated selection of festive favorites was very popular. He actually has access to our online content management suite on his own laptop.

One thing that we are moving to is to have live party events. Let’s say it’s eight o’clock on Saturday night. Simon Cowell is having a party and using that play queue functionality to add tracks to the list all the time. For this particular party, no one else except him can add music. But you can see Simon added this track to the play queue etc. And you can tune in to the party music videos. So consumers all over the country can tune in to someone else’s party and everyone is having a party in different locations at the same time. That is one of those kinds of things where we think the TV can offer very different things from what is typically behind headphones on a mobile device.

What happens in between music videos on Roxi? How do you handle the transitions?

There’s a graphical treatment of the Roxi logo that kind of slides in and fades out as you transition from one track to another. On the free version, you also might get an ad break sometimes. You have to pay the subscription to avoid that completely.

How does the company make revenue?

We are a direct-to-consumer music streaming service. We exist even on platforms where we know we’re never going to be their greatest friend. For example, Google owns YouTube Music. But we are available on Google TV and Android TV because we want to be available to every single consumer, whatever TV they have. But we’re never going to be the preferred music provider on Google TVs, because they’re always going to prefer YouTube Music. But we work with many partners.

We’re giving them a revenue share on the ad inventory and the premium revenues that are generated. That’s part of distribution arrangements. That effectively means that we’re almost like an in-house music service, except that we’re obviously interoperable across all the different device types and can be delivered very quickly into their ecosystem. Anyone building a new music service from scratch is going to need two or three years and have a cost of $200 million. There is a really high degree of risk involved in building new stuff and there is no upside to doing that, because they want a music service, and they want it now. So fundamentally, we make a margin on every subscription. And we make a margin on our advertising on the free tier.

We have a certain number of users we need to get to to be a profitable business. And it is a relatively small number compared with major streaming services because we are a connected TV app, rather than a mobile app, and ad revenues, in particular, for connected TV are significantly higher than they would be for mobile radio advertising.

Have you said or can you say how many subscribers Roxi has in the U.K. so far and when you’ll turn profitable?

We are a private company, so obviously we have to deliver a lot less information. But I can tell you that we have millions of users in the U.K., but that’s about as far as I can go. The real key in terms of the next step of the business is moving into the U.S. market, which is obviously the world’s biggest music market and the world’s biggest TV and streaming market. That suddenly enables a lot of these global deals with major smart TV vendors because many of them are reticent to do that until you are present in the world’s number one market.

Do you have plans for future launches in further markets?

We’re globally licensed, and the repertoire we have is basically the international, Anglo-American and Spanish repertoire. Effectively, it is everything that you need for any Western or any Spanish-speaking markets. We would be weak if we were to launch in some of the Asia-Pacific markets that have a very, very high level of localization. I think the most extreme market is Thailand, which is 90 percent local repertoire. But even in places like Singapore and Hong Kong local music is dominant. There is the Gold (Typhoon) label in Hong Kong, which doesn’t distribute its content internationally. But if you were in Hong Kong, you’d need to have content from local artists.

So we’re effectively able to roll out instantaneously anywhere, but we do a really good job in Western markets or Spanish-speaking markets. As you can probably imagine, a lot of the partners who’ve been encouraging us to launch in the U.S. would like a solution for all of their key markets. So, you can imagine that the goal is to roll out the service and integrate it into all of the key markets as quickly as possible.

How do music labels feel about licensing to Roxi? Are they happy to have a company focused on music videos or are there any reservations or challenges?

The music industry is highly fragmented, and for a service like this, you have to get consensus from all of the key parties, many of whom are competitors and not allowed to talk to each other. So licensing for a novel proposition is certainly not a task for the faint-hearted. There is a belief, I think, almost consistently within the music industry that there are now some fabulous audio-centric mobile streaming services out there that are fabulously popular in the home. But there is also a belief that no one ever effectively created MTV 2.0.

MTV was fabulously popular. They spent enormous amounts of money creating music videos, which looked great on a big screen and are really spoiled on a tiny mobile screen. But MTV and the music channels have been decaying at speed because the younger generation expects to be able to interact with and control the experience. The idea of turning on the TV, and you can’t switch and skip to the next track is totally absurd for the average young consumer. In a focus group that we did not long ago, a young chap said “well, what’s the point of broadcast TV? You turn on the TV, the program has already started, and you can’t choose what you’re going to watch. It’s ridiculous.” Fundamentally, we’re creating what MTV probably should have created a long time ago, which is the ultimate experience for music in the living room on a bigger screen, leveraging the fact that every artist spends a lot of time and energy and love and creative talent, creating fantastic audiovisual experiences, not just audio experiences. And a lot of younger customers are not necessarily satisfied with audio-only. They want an audio-visual experience. They spend a lot of time watching TikTok or YouTube, so audio-only does seem a bit weird to them.

Beyond Roxi’s focus on the big TV screen and interacting around music, is there a possible use case on mobile phones?

Fundamentally, we’re a bit like Netflix – we’re designed for the big screen. Netflix is best enjoyed on a big TV in your living room because a $100 million picture looks best on a great, big TV in the living room and all watching together as a family. But if you want to watch a Netflix program on your mobile phone, you can. You’ll probably do that because you’re in a place where you don’t have access to a TV. We’re not doing that at launch in the U.S., but that option is coming very soon. So consumers will have the freedom to access it on a very wide range of devices. But we will never be a Spotify competitor. We’re not trying to be a mobile streaming service.

Anything else we haven’t discussed yet that you feel is important to highlight?

One thing I very rarely get asked about is the relative complexity of music versus TV and film. Netflix or Disney+ may have 10,000, 20,000, 30,000 long-form bits. In some cases, they’ll probably all come from the same source. The thing about music, particularly when you’re doing it as an audiovisual experience, is that it’s well over 100 million and the catalog is coming from myriad different sources, where an individual music video might actually be owned by a different label in one country compared to another. There are multiple rights challenges where you may have things being sold by one party to another, different release dates and sometimes expiry dates. Then there are vast amounts of new content coming out every single day, some of which is available only in some markets, and some of which has different release dates. And then you’ve got all of the photography and lyrics data and everything else like karaoke and AI engines for the lyrics data.

Building a global music video streaming service is way harder than building a Netflix or Disney+, just because you’re dealing with so much more data from so many different types of people. I guess that is probably one of the reasons why parties like MTV didn’t undertake it. Consumers are completely unforgiving. It’s way harder than business-to-business. If you haven’t got a product that consumers immediately love, they go away and they don’t ever come back.

Interview edited for length and clarity.
