Inappropriate Content Hallucination

Inappropriate Content Hallucination, as defined by a recent study conducted by researchers at the Rochester Institute of Technology, occurs when artificial intelligence systems insert dirty words into the subtitles of videos meant for kids. From their article:

Over the last few years, YouTube Kids has emerged as one of the highly competitive alternatives to television for children's entertainment. Consequently, YouTube Kids' content should receive an additional level of scrutiny to ensure children's safety. While research on detecting offensive or inappropriate content for kids is gaining momentum, little or no current work exists that investigates to what extent AI applications can (accidentally) introduce content that is inappropriate for kids.

In this paper, we present a novel (and troubling) finding that well-known automatic speech recognition (ASR) systems may produce text content highly inappropriate for kids while transcribing YouTube Kids' videos. We dub this phenomenon as inappropriate content hallucination. Our analyses suggest that such hallucinations are far from occasional, and the ASR systems often produce them with high confidence.

More info: Indian Express
     Posted By: Alex - Tue Apr 19, 2022
     Category: AI, Robots and Other Automatons | Mistranslations | Swears

Well, we all knew this was going to happen when the censoring was put on the parents so anything could be sold, instead of only wholesome content being spread everywhere. And here we are in 2022, and the censoring online is that they show us what they are told to, or whatever gets the most ad offers.
Posted by John Church on 04/19/22 at 09:46 AM
My profession is in the computer industry and I remember when the first human language translation programs would (and as far as I know, still do) put the wrong word into a phrase when translating between two languages that share no exactly equivalent concept. I was not impressed at the time and see that the problem hasn't been overcome.
Posted by KDP on 04/19/22 at 11:06 AM
The same thing occurs with videos for adults, only it's not considered as much of a problem because, well, not children. It's still irritating when you put on the subtitles to see what someone said in a Brummie accent only to find that YouTube didn't understand it, either, and put in something obviously nonsensical.

It seems to me, though, that this is easy to solve - the inappropriate bit at any rate, not the not understanding, that's a much harder problem. YouTube already knows which videos are for children - they mark them as such and kill the comment section and so on. (In fact, IMO they're being rather over-zealous in this, but that's another matter.) They also know, or can at least put together, a list of words they don't want in transcriptions for children. Then all they have to do is instruct their ASR that words on that list are always wrong in a children's video. That cannot possibly be difficult. (It might be a pain in the [inappropriate for children] having to redo all the transcriptions they already have, but that's not the same thing.)
Posted by Richard Bos on 04/23/22 at 09:40 AM
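
The blocklist idea in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how YouTube or any real ASR system works: it assumes the recognizer returns several candidate transcripts with confidence scores, and the names BLOCKLIST, contains_blocked, mask_blocked and pick_transcript are hypothetical, as are the placeholder words in the list.

```python
import re

# Hypothetical blocklist with mild stand-in words; a real list would be
# curated by the platform and is not described in the source.
BLOCKLIST = {"darn", "heck"}


def contains_blocked(text, blocklist=BLOCKLIST):
    """Return True if any word of the text is on the blocklist."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in blocklist for word in words)


def mask_blocked(text, blocklist=BLOCKLIST, placeholder="[inaudible]"):
    """Replace every blocklisted word in the text with a placeholder."""
    def repl(match):
        word = match.group(0)
        return placeholder if word.lower() in blocklist else word
    return re.sub(r"[A-Za-z']+", repl, text)


def pick_transcript(hypotheses, for_kids, blocklist=BLOCKLIST):
    """Choose a transcript from (text, confidence) candidates.

    Assumption: the ASR system exposes several candidates with scores.
    For children's videos, candidates with blocklisted words are treated
    as always wrong; if none is clean, the best one is masked instead.
    """
    if not hypotheses:
        raise ValueError("no ASR hypotheses given")
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    if not for_kids:
        return ranked[0][0]
    for text, _confidence in ranked:
        if not contains_blocked(text, blocklist):
            return text
    return mask_blocked(ranked[0][0], blocklist)


candidates = [("what the heck is that", 0.9), ("what the deck is that", 0.6)]
print(pick_transcript(candidates, for_kids=True))
print(pick_transcript(candidates, for_kids=False))
```

Note that a filter like this only hides the inappropriate word. As the comment says, it does nothing about the harder problem of the system mishearing the audio in the first place, so the chosen transcript may still be wrong.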