China has a new plan for judging the safety of generative AI—and it’s packed with details
Last week we got some clarity about what all this may look like in practice.
On October 11, a Chinese government organization called the National Information Security Standardization Technical Committee released a draft document that proposed detailed rules for how to determine whether a generative AI model is problematic. Often abbreviated as TC260, the committee consults corporate representatives, academics, and regulators to set up tech industry rules on issues ranging from cybersecurity to privacy to IT infrastructure.
Unlike many manifestos you may have seen about how to regulate AI, this standards document is very detailed: it sets clear criteria for when a data source should be banned from training generative AI, and it gives metrics on the exact number of keywords and sample questions that should be prepared to test out a model.
Matt Sheehan, a global technology fellow at the Carnegie Endowment for International Peace who flagged the document for me, said that when he first read it, he “felt like it was the most grounded and specific document related to the generative AI regulation.” He added, “This essentially gives companies a rubric or a playbook for how to comply with the generative AI regulations that have a lot of vague requirements.”
It also clarifies what companies should consider a “safety risk” in AI models—since Beijing is trying to get rid of both universal concerns, like algorithmic biases, and content that’s only sensitive in the Chinese context. “It’s an adaptation to the already very sophisticated censorship infrastructure,” he says.
So what do these specific rules look like?
On training: All AI foundation models are currently trained on many corpora (text and image databases), some of which have biases and unmoderated content. The TC260 standards demand that companies not only diversify the corpora (mixing languages and formats) but also assess the quality of all their training materials.
How? Companies should randomly sample 4,000 “pieces of data” from one source. If over 5% of the data is considered “illegal and negative information,” this corpus should be blacklisted for future training.