IBTimes US

An investigation by Human Rights Watch (HRW) has revealed that the artificial intelligence (AI) sector is widely violating the privacy of Australian children by using their names, locations, ages, and personal photos to train some of the world's most advanced AI models.

Researchers found that personal photos of Australian children, including infants, toddlers, and girls in swimwear at school events, were included in a well-known dataset called LAION-5B, which is used to train AI models that produce hyper-realistic images. These photos -- many of which contained personal information -- were taken from the internet without permission, ABC reported.

"Ordinary moments of childhood were captured and scraped and put into this dataset," said Hye Jung Han, a children's rights and technology researcher at HRW. "It's really quite scary and astonishing."

One such picture showed two boys, aged three and four, grinning while holding paintbrushes in front of a colorful mural. The caption revealed the boys' full names, their ages, and the name of their preschool in Perth, Western Australia.

Notably, this information about the children does not appear to be available anywhere else online, raising serious questions about consent and privacy.

HRW reviewed less than 0.0001% of the 5.85 billion images and captions in LAION-5B. Even within that tiny sample, it found 190 photos of children from every Australian state and territory, suggesting the figure is a significant undercount of the amount of children's personal data in the dataset.

Every stage of childhood is represented in the images reviewed by HRW, from babies born into the gloved hands of doctors, still attached to their mothers by the umbilical cord, to young children blowing bubbles or playing instruments in preschools, kids dressed up as their favorite Book Week characters, and girls in swimsuits at their school swimming carnival.

The dataset, compiled by web crawlers, contained private information that could be misused, for example to create convincing deepfakes. HRW and other experts called for immediate legislative changes to safeguard children's data privacy.

"There are very, very few instances where a breach of privacy leads to regulatory action," Edward Santow, a former Human Rights commissioner and current director at the Human Technology Institute, told ABC.

"That's one of the many reasons why we need to modernize Australia's Privacy Act," he said.

The LAION-5B dataset was created by a German not-for-profit organisation called LAION.

A LAION spokesperson told ABC that its datasets "are just a collection of links to images available on [the] public internet," and added that "the most effective way to increase safety is to remove private children's information from [the] public internet."