I know we've talked before about real people vs computers for narrations, but recent discourse over AI-generated images has brought up some other very valid points that also apply to this field. So let's talk about some of the other issues surrounding the growing attempts to replace real people with AI voices, and some of the implications and impacts this could have for the industry.
AI image generators aren't fabricating things out of thin air; they work from a database of images. These images are what the AI supposedly “learns” from, recognizing patterns and styles and subjects. It then also uses these images in its generated results, as a sort of melding or composite of concepts to attempt to fulfill the prompts. Unfortunately, since it is all just an algorithm and not based on any actual skill or artistic knowledge, the results often end up...rather off. Even results which may look passable at first glance become more uncomfortably wrong the more you look at them. Maybe it's because the lighting is wonky, or some element of perspective is askew. These are things which would have easily been caught and fixed by a skilled artist.
Similarly, AI voices are based on real voices. As Rosko pointed out previously, the AI voice companies try to find actual voice actors to contribute their voices to these projects. These banked recordings are then used by the AI to generate speech for whatever new script you feed it. But much like with the AI images, AI voices don't sound entirely right. There's a strange tonal quality which is basically the audio version of the uncanny valley. For those unfamiliar, the uncanny valley refers to things which closely resemble humans but are just off enough to make us uncomfortable. I have yet to encounter an AI voice that sounds entirely human, and the ones that try the hardest are often the most distracting because they are almost there while missing the mark enough to be unsettling.
One of the biggest controversies surrounding AI images is the unauthorized use of copyrighted content within the databases, with many artists not having been given a chance beforehand to opt out. While AI narration is different in that it is generated from the recorded voices of actors who very specifically opted in, there's still an exploitative feel to it. Think about it...a company will be able to use that actor's voice for countless projects in the future, but the actor will never be paid for any of it. That goes well beyond something like a company paying to use the same recorded commercial in perpetuity.
Also, there's this argument, which is often brought up with regard to AI images: If more and more potential clients turn to using AI instead of real people, what does that do to the industry itself? This applies to the voiceover industry as well. If opportunities evaporate, it becomes more difficult for new talent to get a foothold, and it discourages talent from pursuing that field. It's rather ironic that AIs which rely on the creativity and talent of real humans could work to limit the pool of creativity and talent of the future.
I personally think that unless someone manages to train an AI so that it has a human's ability to understand acting and narration, it will just remain a novelty. It's something fun to play around with, but using it in a professional capacity is little better than someone trying to save a few bucks by doing it themselves. Which, as we've discussed before, isn't the best option when you want to come across as polished and professional.
With the plethora of ways to find quality voice talent nowadays, there's no fathomable reason to take a turn into the unsettling realm of the uncanny valley instead.