Cross-posted on the EA Forum. This article is the fourth in a series of ~10 posts comprising a 2024 State of the AI Regulatory Landscape Review, conducted by the Governance Recommendations Research Program at Convergence Analysis. Each post will cover a specific domain of AI governance (e.g. incident reporting, safety evals, model registries, etc.). We’ll provide an overview of existing regulations, focusing on the US, EU, and China as the leading governmental bodies currently developing AI legislation. Additionally, we’ll discuss the relevant context behind each domain and conduct a short analysis.
This series is intended to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current AI governance space. We’ll publish individual posts on our website and release a comprehensive report at the end of this series.
What are disclosures and why do they matter?

The public and regulators have legal rights to understand goods and services. For example, food products must have clear nutritional labels; medications must disclose their side effects and contraindications; and machinery must come with safety instructions.
In the case of AI, these legally mandated disclosures can cover several topics, such as:
Clearly labeling AI-generated content: This allows people to immediately recognize that the image (or text, audio, etc.) they’re looking at was AI-generated. For example, the proposed AI Disclosure Act would require all generative AI content to include the text “Disclaimer: this output has been generated by artificial intelligence.”
Watermarking content generated by AI: This involves adding some detectable but not necessarily obvious mark. Watermarking has several purposes, for example letting us identify the provenance or source of AI-generated content.
Disclosure of training data: Since models are trained on huge amounts of data, but this data isn’t identifiable or reconstructable from the final model, some regulators require AI developers to disclose information about the data used to train models. For example, the EU AI Act requires AI developers to publicly disclose any copyrighted material used in their training data.
Notifying people that they’re being processed by an AI: For example, if video footage is analyzed by an AI to identify people’s age, the EU AI Act requires those people to be informed.
Labels and watermarks
Labels and watermarks vary in design; some are subtle, some conspicuous; some easy to remove, some difficult. For example, Dall-E 2 images have 5 coloured squares in their bottom right corner, a conspicuous label that’s easy to remove.
However, Dall-E 3 will add invisible watermarks to generated images, which are much harder to remove. Watermarking techniques are less visible than labels, and are evaluated on criteria such as perceptibility and robustness. A technique is considered robust if the resulting watermark resists both benign and malicious modifications; semi-robust if it resists benign modifications; and fragile if the watermark isn’t detectable after any minor transformation. Note that fragile and semi-robust techniques are still useful, for example in detecting tampering.
Imperceptible watermarking methods might embed a signal in the “noise” of the image such that it isn’t detectable to the human eye, and is difficult to fully remove, while still being clearly identifiable to a machine. This is part of steganography, the field of “representing information within another message or physical object”.
For example, the Least Significant Bit (LSB) technique adjusts unimportant bits in images or sound files to carry messages. For instance, 73 represented in binary is 1001001. The leftmost “1” is the most significant bit, representing 2⁶ = 64, while the rightmost “1” represents just 1, meaning it can be adjusted to carry part of a message without much significant change. LSB is relatively fragile, while other techniques like the Discrete Cosine Transform (DCT) use Fourier-related transforms to subtly adjust images at a more fundamental level, and are thus robust against attacks such as adding noise, compressing the image, or applying filters. Other popular techniques include the Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD), and there are open-source technical standards such as C2PA that have been adopted by organizations like OpenAI.
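To make the LSB idea concrete, here is a minimal sketch in Python. The function names and the toy pixel list are invented for illustration; real implementations operate on actual image or audio data via a media library.

```python
# Toy illustration of Least Significant Bit (LSB) embedding (hypothetical
# names): hide one message bit in the lowest bit of each "pixel" byte.
# A plain list of byte values keeps the sketch self-contained.

def embed_lsb(pixels, message_bits):
    """Overwrite the least significant bit of each pixel with a message bit."""
    marked = [(p & ~1) | bit for p, bit in zip(pixels, message_bits)]
    return marked + pixels[len(message_bits):]   # remaining pixels unchanged

def extract_lsb(pixels, n_bits):
    """Read the hidden bits back out of the low bits."""
    return [p & 1 for p in pixels[:n_bits]]

pixels = [73, 200, 41, 156, 77, 90]   # e.g. grayscale intensities 0-255
message = [1, 0, 1, 1, 0]

stego = embed_lsb(pixels, message)
# Each pixel changes by at most 1, so the change is imperceptible to the eye:
assert all(abs(a - b) <= 1 for a, b in zip(pixels, stego))
assert extract_lsb(stego, len(message)) == message
```

Note how fragile this is: any operation that rewrites low-order bits (lossy compression, resizing, added noise) wipes the message, which is why LSB sits at the “fragile” end of the robustness spectrum.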
Text is harder to watermark subtly, as text carries far less redundant information (“noise”) than an image, for example. Watermarking can still be applied to metadata, and there are techniques derived from steganography that add hidden messages to text, though these can be disrupted and aren’t under major consideration by legislators or AI labs.
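As an illustration of one such steganographic technique, and of why it is easily disrupted, here is a toy sketch that hides bits in text using zero-width Unicode characters. The helper names are invented for this example.

```python
# Toy text steganography (illustration only): each message bit becomes an
# invisible zero-width character appended to a word. Stripping non-printing
# characters or normalizing whitespace destroys the message entirely, which
# is exactly why schemes like this are fragile.

ZW0, ZW1 = "\u200b", "\u200c"   # zero-width space / zero-width non-joiner

def hide(text, bits):
    words = text.split(" ")
    assert len(bits) < len(words), "not enough words to hold the message"
    marked = [
        word + ((ZW0 if bits[i] == 0 else ZW1) if i < len(bits) else "")
        for i, word in enumerate(words)
    ]
    return " ".join(marked)

def reveal(text):
    return [
        0 if ZW0 in word else 1
        for word in text.split(" ")
        if ZW0 in word or ZW1 in word
    ]

stego = hide("the quick brown fox jumps over", [1, 0, 1, 1, 0])
assert reveal(stego) == [1, 0, 1, 1, 0]
# Once the invisible characters are stripped, the text is indistinguishable:
assert stego.replace(ZW0, "").replace(ZW1, "") == "the quick brown fox jumps over"
```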
Importantly, these labeling and watermarking mechanisms can be built into generative AI models themselves, for example in a final layer of a neural network. This makes it possible to embed robust but invisible signals in AI-generated content that, if interpreted correctly, could identify which particular model generated a piece of work.
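A highly simplified sketch of this idea, with all names and parameters invented for illustration: a “final layer” adds a faint pattern, derived from a secret per-model key, to the model’s numeric output, and a detector with the same key recovers it by correlation. Production watermarking schemes are far more sophisticated than this.

```python
# Illustrative sketch only: a keyed pseudorandom +/-1 pattern is mixed
# faintly into the output; correlating suspect content against the same
# keyed pattern reveals whether (and by which model) it was generated.
import random

STRENGTH = 0.05                 # how strongly the pattern is mixed in

def watermark_layer(output, model_key):
    rng = random.Random(model_key)            # pattern is keyed to the model
    return [x + STRENGTH * rng.choice([-1.0, 1.0]) for x in output]

def detect(content, model_key, threshold=0.5):
    rng = random.Random(model_key)
    pattern = [rng.choice([-1.0, 1.0]) for _ in content]
    score = sum(c * p for c, p in zip(content, pattern)) / (STRENGTH * len(content))
    return score > threshold                  # score ~1 if watermarked, ~0 if not

rng = random.Random(42)
raw = [rng.random() for _ in range(50_000)]   # stand-in for model output
marked = watermark_layer(raw, "model-v3")

assert detect(marked, "model-v3")     # the right key finds the signal
assert not detect(raw, "model-v3")    # unmarked content does not match
```

The tradeoff discussed below is visible even here: increasing `STRENGTH` makes detection more reliable under modification, but also alters the content more and makes the pattern itself easier to find.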
Watermarking also involves a tradeoff between robustness and detectability: robust watermarking techniques alter the content more fundamentally, which makes them easier to detect. This means robustness can also trade off against security, as more obscure and less detectable watermarks are harder to extract information from, and thus more secure. For example, brain scans contain incredibly sensitive information, so researchers have developed fragile but secure watermarking techniques for fMRI data. In summary, to quote a thorough review of watermarking and steganography:
It is tough to achieve a watermarking system that is simultaneously robust and secure.
Current Regulatory Policies

The US
The Executive Order on AI states that Biden’s administration will “develop effective labeling and content provenance mechanisms, so that Americans are able to determine when content is generated using AI and when it is not.” In particular:
Section 4.5(a): Requires the Secretary of Commerce to submit a report identifying existing and developable standards and tools for authenticating content, tracking its provenance, and detecting and labeling AI-generated content.
Section 10.1(b)(viii)(C): Requires the Director of OMB to issue guidance to government agencies that includes the specification of reasonable steps to watermark or otherwise label generative AI output.
Section 8(a): Encourages independent regulatory agencies to emphasize requirements related to the transparency of AI models.
The AI Disclosure Act was proposed in 2023, though it has not yet passed the House or Senate, instead being referred to the Subcommittee on Innovation, Data, and Commerce. If passed, the act would require any output generated by AI to include the text: “Disclaimer: this output has been generated by artificial intelligence.”
China
China’s 2022 rules for deep synthesis, which address the online provision and use of deep fakes and similar technology, require providers to watermark and conspicuously label deep fakes. The regulation also requires the notification and consent of any individual whose biometric information is edited (e.g. whose voice or face is edited or added to audio or visual media).
The 2023 Interim Measures for the Management of Generative AI Services, which address public-facing generative AI in mainland China, require content created by generative AI to be conspicuously labeled as such and digitally watermarked. Developers must also clearly label the data they use to train AI, and disclose the users and user groups of their services.
The EU
Article 52 of the draft EU AI Act lists the transparency obligations for AI developers. These largely relate to AI systems “intended to directly interact with natural persons”, where natural persons are individual people (as opposed to legal persons, which can include businesses). For concision, we’ll call these “public-facing” AIs. Notably, the following requirements have exemptions for AI used to detect, prevent, investigate, or prosecute crimes (assuming other laws and rights are observed).
Article 52.1: Requires developers to ensure users of public-facing AI are informed or obviously aware that they are interacting with an AI.
Article 52.1a: Requires AI-generated content to be watermarked (with an exemption for AI assisting in standard editing or which doesn’t substantially alter input data).
Article 52.2: Requires developers of AI that recognizes emotions or categorizes biometric data (e.g. distinguishing children from adults in video footage) to inform the people being processed.
Article 52.3: Requires deep fakes to be labeled as AI-generated (with a partial exemption for use in art, satire, etc, in which case developers can disclose the existence of the deep fake less intrusively). AI-generated text designed to inform on matters of public interest must disclose that it’s AI-generated, unless the text undergoes human review, and someone takes editorial responsibility.
Article 52b: Requires developers of general purpose AI with systemic risk to notify the EU Commission within 2 weeks of meeting any of the following requirements defined in Article 52a.1:
Possessing “high impact capabilities”, as evaluated by appropriate technical tools.
By decision of the Commission, if they believe a general purpose AI has capabilities or impact equivalent to “high impact capabilities”.
Article 52c: Requires providers of general purpose AI (GPAI) to publish a summary of the content used to train the model, and Articles 60f and 60k require developers to disclose any copyrighted material in their training data in that summary.
Convergence’s Analysis
Mandatory labeling of AI-generated content is a lightweight but imperfect method to keep users informed and reduce the spread of misinformation and similar risks from generative AI.
Labeling AI-generated text, images, video, and so on is a simple way to make users clearly understand that content is AI-generated. Further, it’s not expensive or complex to add labeling mechanisms to generative AI.
Labeling has extensive precedent in most legal systems, such as food and medication labels.
While compliance can be high for such mandatory labeling, there’s variance in efficacy. For example, the World Health Organization found that inadequate labeling of medication plays a role in non-adherence to medication prescriptions, and some studies have found that improving labeling improves health outcomes.
Further, compliance can be low, especially when violations by smaller organizations or individuals aren’t actively addressed. For example, though many major websites are GDPR-compliant, a 2020 survey found that only 11.8% of (a scrapable subset of) the top 10,000 websites in the UK were compliant.
Mandatory watermarking is a lightweight way to improve traceability and accountability for AI developers.
Like labeling, watermarking is easy for developers to do, and invisible watermarks have the advantage of not interfering with the users’ experience.
If AI developers include watermarking in their generative AI models, these can be used to precisely identify which model was used to generate a piece of content. This is especially important when generative AI is used to generate harmful content, such as misinformation, deep fake porn, or other provocative material, as models should be trained not to produce such content. Watermarking allows us to find and address the root of the problem and hold the developers legally accountable.
Labels and watermarks can be disrupted or removed by motivated users, especially in text generation.
This means that it’s unlikely any content platform could guarantee that AI-generated content is always clearly distinguishable to people.
Despite the potential fragility of labeling and watermarking, they can still be important aspects of a larger, layered strategy, making it more difficult to produce misinformation, or for AI developers to avoid accountability.
In particular, societal education about AI will be a critical aspect of such a layered strategy.
Research labs such as Meta and Google DeepMind are researching more advanced methods of watermarking during AI development.
Unclear definitions of what constitutes an application of AI will lead to inconsistent disclosure requirements and enforcement.
AI is becoming embedded in many creative tools, such as image-editing tools like Photoshop and GIMP. Among other functions, these can be used to “uncrop” images, generating additional content. AI is also important in procedurally generated video games and VR spaces.
These uses of AI lead to gray areas and edge cases that aren’t clearly covered by legislation, and individuals using these tools may not be able to tell whether they’re using compliant or illegal tools.
Current legal definitions are far from comprehensive enough to fully distinguish and legislate these overlapping use cases.