Dataset used to train AI models contains child sex abuse images, new Canadian analysis finds

OTTAWA — A Canadian organization focused on combatting the spread of child sex abuse images says a new analysis has found the presence of such images in a dataset used to train AI models.
The Canadian Centre for Child Protection said the findings include images of more than 120 victims across Canada and the United States and raise serious ethical concerns when it comes to the development of AI technologies.
“Many of the AI models used to support features in applications and research initiatives have been trained on data that has been collected indiscriminately or in ethically questionable ways,” Lloyd Richardson, the Winnipeg-based centre’s director of technology, said in a news release issued on Wednesday.
“This lack of due diligence has led to the appearance of known child sexual abuse and exploitation material in these types of datasets, something that is largely preventable.”
The centre for child protection said its analysis focused on an image collection known as Nudenet, which features tens of thousands of images used by researchers to develop AI tools for detecting nudity. Its images are collected from sources such as social media and adult pornography sites.
It says its analysis found around 680 images known to the centre as being suspected or verified as child sex abuse and exploitation material.
Of those images, the centre reported that more than 120 show victims in Canada and the U.S.; others depict minors in sexually explicit acts.
As a result of its findings, the centre said it issued a removal notification to Academic Torrents, a website used by researchers and universities to download datasets, adding that the flagged images were no longer available.
It says those distributing datasets used by researchers and academics ought to take additional steps to ensure they do not include child sex abuse images, and it calls for regulation of AI technologies.
The centre’s analysis comes after a 2023 investigation by Stanford University’s Cyber Policy Center, which found the presence of child sex abuse images in a dataset used to develop text-to-image AI models.
It warned that models being developed on that dataset were then being used to generate realistic-looking nudes, including those of minors.
Prime Minister Mark Carney has made developing Canada’s AI capacity a priority of his government’s approach to digital policy.
Artificial Intelligence Minister Evan Solomon, who became the first federal minister to hold such a title, has been tasked with steering that work and has so far signalled the government is not keen on a regulation-focused approach.
He said the government’s focus in an upcoming bill would be on privacy and data.
At the same time, Carney’s government has promised to criminalize the creation of non-consensual sexualized images known as “deepfakes,” which are generated by tools, including AI.
National Post