A new tool for copyright holders can show if their work is in AI training data

These AI copyright traps tap into one of the biggest fights in AI. A number of publishers and writers are in the middle of litigation against tech companies, claiming their intellectual property has been scraped into AI training data sets without their permission. The New York Times’ ongoing case against OpenAI is probably the most high-profile of these.

The code to generate and detect traps is currently available on GitHub, but the team also intends to build a tool that allows people to generate and insert copyright traps themselves.

“There is a complete lack of transparency in terms of which content is used to train models, and we think this is preventing finding the right balance [between AI companies and content creators],” says Yves-Alexandre de Montjoye, an associate professor of applied mathematics and computer science at Imperial College London, who led the research. It was presented at the International Conference on Machine Learning, a top AI conference being held in Vienna this week.

To create the traps, the team used a word generator to create thousands of synthetic sentences. These sentences are long and full of gibberish, and could look something like this: ”When in comes times of turmoil … whats on sale and more important when, is best, this list tells your who is opening on Thrs. at night with their regular sale times and other opening time from your neighbors. You still.”

The team generated 100 trap sentences and then randomly chose one to inject into a text many times, de Montjoy explains. The trap could be injected into text in multiple ways—for example, as white text on a white background, or embedded in the article’s source code. This sentence had to be repeated in the text 100 to 1,000 times.

To detect the traps, they fed a large language model the 100 synthetic sentences they had generated, and looked at whether it flagged them as new or not. If the model had seen a trap sentence in its training data, it would indicate a lower “surprise” (also known as “perplexity”) score. But if the model was “surprised” about sentences, it meant that it was encountering them for the first time, and therefore they weren’t traps.

In the past, researchers have suggested exploiting the fact that language models memorize their training data to determine whether something has appeared in that data. The technique, called a “membership inference attack,” works effectively in large state-of-the art models, which tend to memorize a lot of their data during training.

In contrast, smaller models, which are gaining popularity and can be run on mobile devices, memorize less and are thus less susceptible to membership inference attacks, which makes it harder to determine whether or not they were trained on a particular copyrighted document, says Gautam Kamath, an assistant computer science professor at the University of Waterloo, who was not part of the research.

A new tool for copyright holders can show if their work is in AI training data

Revisiting a year of Roundtables, MIT Technology Review’s subscriber-only events

These stunning images trace ships’ routes as they move

The humans behind the robots

You may have missed

Manhattan nail salon files for bankruptcy after attracting celeb investors including Olivia Culpo

Dual-functional Qx-D scaffolds hold immense promise for treating infected bone defects

In-N-Out heiress details dangerous conditions in Calif. neighborhood that forced her to close outpost

Trump’s FCC pick sends stern letter to Bob Iger ripping Disney-owned ABC News for role in ‘erosion in public trust’

Elon Musk urges supporters not to donate to Wikipedia after it spent $50M on DEI: 'Wokepedia'

Categories

Useful Links

More Stories

You may have missed

Categories

Useful Links