Child sex abuse images found in data used to train AI

(NewsNation) — Researchers at Stanford University discovered more than 1,000 images of child sexual abuse material (CSAM) in a database used to train artificial intelligence tools.

“We are now faced with the task of trying to mitigate the material present in extant training sets and tainted models; a problem gaining urgency with the surge in the use of these models to produce not just generated CSAM, but also CSAM and NCII (non-consensual intimate imagery) of real children, often for commercial purposes,” the study said.

Machine learning models that generate images, including explicit content, are trained on large datasets of pictures. More recent models such as Stable Diffusion were trained on LAION-5B, a dataset of billions of images scraped from the web.

“The dataset included known CSAM scraped from a wide array of sources, including mainstream social media websites and popular adult video sites,” Stanford’s Cyber Policy Center said.

Leah Plunkett, an author and research lecturer at Harvard Law School, said “countless” CSAM images are available on the open web, not just buried deep in the dark web.

“Basically, if you go crawling along the web to look for many images on which to train image generators, you do wind up, as LAION-5B apparently did, according to Stanford, getting some illegal, awful, horrific links out to things available on the web,” Plunkett told “NewsNation Now.”

According to Stanford, there are ways to “minimize CSAM” in datasets used to train AI models, but once an open dataset has been distributed, it is difficult to clean it or stop its spread. The researchers said deleting or removing the material from datasets is “the most obvious solution.”

“They’ve (Stanford) made recommendations that any company or institution using LAION-5B to train its image generators or anything else, pause what they’re doing and bring in a multistakeholder expert team to take a close look and make sure they don’t have anything in there that they shouldn’t,” Plunkett said.

Plunkett said companies should report any inappropriate images to law enforcement, and parents should be careful about posting images of their children on the internet.

“At this moment in time, in this age, as AI ramps up with limited to no guardrails — yes, I would advise parents to refrain from posting any images of their children on the open web, including social media sites, especially those where your privacy settings are set to public,” Plunkett said. “I would still have some pause about posting even when the social media privacy settings are set to private.”

She continued: “Right now, the risks of having any image of your child, even harmless, innocent, wonderful ones, out there really are starting at this current moment to outweigh the benefits.”

Stanford said removal of the inappropriate content discovered in the dataset is in progress. Researchers have reported the image links to the National Center for Missing and Exploited Children in the U.S. and the Canadian Centre for Child Protection.
