[Paper Translation] SEAL: Semantic Aware Image Watermarking


Original paper: https://arxiv.org/pdf/2503.12172


SEAL: Semantic Aware Image Watermarking

Kasra Arabi, R. Teal Witter, Chinmay Hegde, Niv Cohen New York University

Abstract

Generative models have rapidly evolved to generate realistic outputs. However, their synthetic outputs increasingly challenge the clear distinction between natural and AI-generated content, necessitating robust watermarking techniques. Watermarks are typically expected to preserve the integrity of the target image, withstand removal attempts, and prevent unauthorized replication onto unrelated images. To address this need, recent methods embed persistent watermarks into images produced by diffusion models using the initial noise. Yet, to do so, they either distort the distribution of generated images or rely on searching through a long dictionary of used keys for detection.

In this paper, we propose a novel watermarking method that embeds semantic information about the generated image directly into the watermark, enabling a distortion-free watermark that can be verified without requiring a database of key patterns. Instead, the key pattern can be inferred from the semantic embedding of the image using locality-sensitive hashing. Furthermore, conditioning the watermark detection on the original image content improves robustness against forgery attacks. To demonstrate this, we consider two largely overlooked attack strategies: (i) an attacker extracting the initial noise and generating a novel image with the same pattern; (ii) an attacker inserting an unrelated (potentially harmful) object into a watermarked image, possibly while preserving the watermark. We empirically validate our method’s increased robustness to these attacks. Taken together, our results suggest that content-aware watermarks can mitigate risks arising from image-generative models. Our code is available at https://github.com/Kasraarabi/SEAL.

1. Introduction

The growing capabilities of generative models pose risks to society, including misleading public opinion, violating privacy or intellectual property, and fabricating legal evidence [5, 14, 22]. Watermarking methods aim to mitigate such risks by allowing the detection of generated content.

Yet, many conventional watermarking techniques lack robustness against adversaries who attempt to remove them using regeneration attacks powered by recent generative models [9, 18, 24]. To address this, new watermarking techniques leveraging advances in generative models offer increased robustness against such attacks [4, 25, 27]. Namely, these methods embed a watermarking pattern in the initial noise used by a diffusion model. These patterns have been shown to be more robust against existing removal attacks.


Figure 1. Illustration of different watermarking frameworks using the initial noise of diffusion models. No Watermark: A diffusion model maps pure Gaussian noise to an image. Tree-Ring: A pattern is added to the initial noise, modifying the distribution of generated images in a detectable way. Key-Based Watermarking: A key is sampled to generate distortion-free images linked to the key. Ours (SEAL): The initial noise is conditioned on multiple keys derived from the image’s semantic embedding, with each key influencing a different patch.

However, existing watermarks that utilize the diffusion model's initial noise tend to be vulnerable to other attacks that aim to “steal” the watermark and apply it to images unrelated to the watermark owner. Some of these watermark forgery attacks can be evaded by using a distortion-free watermark, that is, by generating watermarked images from a distribution similar to that of non-watermarked images, thereby exposing less information about the watermark identity. Even so, using an extremely large number of watermark identities requires maintaining a database of used noises, and the watermark might still be forgeable by other attacks.

To address these challenges, we introduce SEAL (Semantic Embedding for AI Lineage), a method that embeds watermark patterns directly tied to image semantics. Our approach enables direct watermark detection from image samples and offers the following key properties: (i) Distortion-free: As in previous works, we utilize pseudo-random hash functions to generate an initial noise that is similar to the noise used by non-watermarked models, ensuring a similar distribution of generated images. (ii) Robust to regeneration attacks: Similar to prior watermarking methods based on DDIM inversion, our approach demonstrates resilience against regeneration-based removal attempts [28]. (iii) Correlated with image semantics: The applied watermark encodes semantic information from the image. (iv) Independent of a historical database: Unlike previous methods that rely on a database of past generations, our approach embeds watermarks without requiring access to such a database, making detection possible without prior stored data.

Our key insight is that we can encode semantic information about the image content in a distortion-free watermark by embedding the semantic encoding directly into the initial noise. We use projections of the user prompt embedding to seed different pseudo-random patches that compose the initial noise. We ensure the encoded embedding correlates strongly with the resulting image content, not just with the prompt, which is important since the prompt is not available during detection. At detection time, our approach identifies an image as watermarked only when the watermark pattern is both present and properly correlated with the image semantics. We describe in detail our watermarking technique in Section 3.

Correlating our watermarking algorithm to the image semantics also allows us to resist forgery attacks that are challenging for many existing approaches. An attacker attempting to forge such a watermark onto unauthorized content would alter the image’s semantic embedding, breaking its correlation with the embedded pattern and rendering the watermark invalid.

One mostly overlooked attack involves an adversary altering only small portions of a watermarked image while preserving the rest of its content. In such cases, the attacker can manipulate the image to be offensive, illegal, or damaging to the watermark owner’s reputation, all while the original watermark remains detectable. We term this attack the CAT ATTACK, as the attacker may add an object to the image (e.g., a cat) and expect the watermark to persist. We evaluate the potential damage of such attacks and demonstrate that our method uniquely provides robustness against both the CAT ATTACK and forgery attempts by adversaries who obtain accurate copies of our initial noise. Our experiments confirm our method’s effectiveness against these novel threats as well as previously studied attack vectors.

Our contributions are as follows:

2. Related works

Recent research on image watermarking can be broadly categorized into post-processing and in-processing approaches, each offering distinct trade-offs between quality, robustness, and deployment practicality [2]. We cover in-processing methods here; for post-processing methods, see Appendix 8.

In-Processing Methods. In-processing approaches integrate watermark embedding directly within the image generation process. These methods are often used in diffusion models to achieve minimal perceptual impact. Some methods modify the generative model itself by fine-tuning specific components, as demonstrated in Stable Signature [10]. An alternative class of techniques manipulates the initial noise of the generation process, thereby embedding the watermark without extensive model retraining. For example, Tree-Ring [25] embeds a Fourier-domain pattern into the initial noise, which can be detected through DDIM inversion [23], while RingID [7] extends this idea to support multiple keys. Other notable methods include Gaussian Shading, which produces a unique key for each watermark owner [27]; PRC, which leverages pseudo-random error-correcting codes for computational undetectability [12]; and WIND, which employs a two-stage detection process to enable a very large number of keys [4].

Locally Sensitive Hashing in High-Dimensional Spaces.

Figure 2. Illustration of the SEAL watermarking framework for diffusion models using semantic-aware noise patterns. Watermark Generation: A textual prompt (e.g., “Beach at sunset.”) is first embedded into a semantic space. The embedding is then processed using SimHash to generate discrete keys, which are used in Encrypted Sampling to choose the initial noise $\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$. The watermarked noise then undergoes standard diffusion to generate the final image. Detection: The image is captioned to obtain an embedding, which is then processed by SimHash to generate a reference noise, similar to watermark generation. This noise remains correlated with the initial noise used during generation as long as the image semantics remain unchanged. The image is also processed through Inverse Diffusion to estimate the actual initial noise used during its generation. If there are insufficient matches between the reference noise and the noise obtained from inversion, the watermarking framework flags the image as non-watermarked. If a key match is found but the image is still deemed suspicious, a detailed inspection of the patches can be performed to identify local edits.

Recent advances in approximate nearest neighbor (ANN) search have increasingly relied on the power of Locally Sensitive Hashing (LSH) to address the challenges inherent in high-dimensional data. Originally introduced by Indyk and Motwani [13] and further refined by Gionis et al. [11], LSH employs randomized hash functions that ensure similar data points are mapped to the same bucket with high probability. Formally, for a hash family $\mathcal{H}$, the collision probability is given by

[Equation image]

Subsequent improvements by Datar et al. [8] and Andoni and Indyk [3] have enhanced both the efficiency and robustness of LSH methods, making them key for large-scale, high-dimensional search tasks.

3. SEAL: Semantic Aware Watermarking

3.1. Motivation

Watermarking methods suffer from an inherent trade-off: a watermark that is harder to remove is also easier to attach to unrelated generations, compromising the reputation of the watermark owner [5]. One suggested solution to this trade-off is to maintain a database of past generations, so that the owner can compare a seemingly watermarked image to the actual past generations. Yet this solution is not without its problems. First, maintaining and searching a rapidly growing database, which expands with each new generation, can be challenging. Second, safeguarding the database itself may pose security risks. Finally, in various use cases, the watermark owner may wish not only to detect whether an image is watermarked but also to provide evidence of this to a third party. We therefore propose a watermarking scheme that is hard to remove, hard to forge, and does not rely on maintaining a database of past generations.

Our core idea is to use a distortion-free initial noise pattern not only to indicate the origin of the image but also to encode which semantic information the image may contain. We do so in three stages (see also Figure 2): (i) Semantic Embedding: we predict a vector representing the expected semantic content of each generated image; (ii) SimHash Encoding: we encode the semantic vector using a set of multi-bit hash functions; (iii) Encrypted Sampling: the pseudorandom outputs of these functions are combined to produce the initial noise for the denoising process. Taken together, these steps set an initial noise that is both distortion-free with respect to standard random initialization and correlated with the semantics of the input prompt (see Section 3.3). We describe our watermarking method in detail below.


Figure 3. Effect of the CAT ATTACK on SEAL. (Left) A cat image is pasted onto a watermarked image at a random position and scale. (Right) Our method detects this modification by identifying elevated $\ell_2$ norms in affected patches.

3.2. Method

Formally, our method first creates a semantic vector v and uses it to sample the initial noise z for the watermarked image. During detection, we aim to verify the connection between the used initial noise z and the semantic embedding of the image. When approximating z from the generated image during detection and verifying it, we consider the following error sources:

Ideally, $\mathbf{z}^{\mathrm{inv}}$ would align with $\tilde{\mathbf{z}}$, but this is not guaranteed because both differ from $\mathbf{z}$ due to the error sources mentioned above. Instead, we separate each noise vector into patches and compare them patch by patch. With high likelihood, even if some patches do not match because of the challenges discussed above, many of the patches will match as long as the suspect image is watermarked.

Semantic Patterns with SimHash

The core subroutine of our watermarking method is SimHash [6], used to generate initial noise maps correlated to a given vector (Algorithm 1). SimHash takes a vector $\mathbf{v}$ and generates an initial noise $\mathbf{z}_i$ for patch $i$, allowing a verifier to later determine, with some probability, whether $\mathbf{z}_i$ is related to $\mathbf{v}$. Namely, the semantic vector $\mathbf{v}$ is passed through a locality-sensitive hashing method that generates representations of $\mathbf{v}$ in terms of its projections in random directions.

Algorithm 1 SimHash

1: Input: $\mathbf{v}$: semantic vector, $i$: patch index, salt: secret salt, $b$: number of bits, hash: cryptographic hash function
2: Output: Semantic, secure, normally distributed noise
3: bits $\gets \mathbf{0}$ // Initialize hash input
4: for $j = 1, \ldots, b$ do
5: // Reproducibly sample random vector
6: $s \gets \mathrm{hash}(i, j, \mathrm{salt})$
7: Sample $\mathbf{r}_j^{(i)} \overset{s}{\sim} \mathcal{N}(\mathbf{0}, \mathbf{I})$
8: bits$[j] \gets \mathrm{sign}(\langle \mathbf{v}, \mathbf{r}_j^{(i)} \rangle)$ // Random projection
9: end for
10: $s_i \gets \mathrm{hash}(\mathrm{bits}, i, \mathrm{salt})$
11: return $\mathbf{z}_i \overset{s_i}{\sim} \mathcal{N}(\mathbf{0}, \mathbf{I})$

Specifically, SimHash projects $\mathbf{v}$ onto a set of random vectors. The input to the hash function is determined by the sign of each projection, ensuring that similar vectors yield similar hash values. For $i \in \{1, \ldots, k\}$, the seed and the noise for patch $i$ are:

$$
s_i = \mathrm{hash}\big(\mathrm{sign}(\langle \mathbf{v}, \mathbf{r}_1^{(i)} \rangle), \ldots, \mathrm{sign}(\langle \mathbf{v}, \mathbf{r}_b^{(i)} \rangle), i, \mathrm{salt}\big), \qquad \mathbf{z}_i \overset{s_i}{\sim} \mathcal{N}(\mathbf{0}, \mathbf{I}).
$$

Having repetitive patches in the initial noise may distort image generation. Therefore, we include the patch index in the hash function input to ensure that $s_i \neq s_j$ for $i \neq j$ even when the input bits are identical (see Figure 8 for a visualization of what happens to the noise without the patch-dependent input). For cryptographic security, we also hash a secret salt.
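The SimHash subroutine can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the use of SHA-256, the `repr`-based serialization of the hash input, NumPy's `default_rng` as the seeded Gaussian sampler, and the names `_seed`, `simhash_bits`, and `simhash_noise` are all assumptions made for the example.

```python
import hashlib
import numpy as np

def _seed(*parts) -> int:
    """Hash arbitrary parts into a 64-bit integer seed (SHA-256 is an assumption)."""
    digest = hashlib.sha256(repr(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def simhash_bits(v, i, salt, b):
    """Sign bits of v projected onto b reproducible random directions for patch i."""
    bits = []
    for j in range(b):
        # Re-create the same random direction r_j^(i) from (i, j, salt).
        rng = np.random.default_rng(_seed(i, j, salt))
        r = rng.standard_normal(v.shape[0])
        bits.append(int(np.dot(v, r) >= 0))
    return tuple(bits)

def simhash_noise(v, i, salt, b, patch_size):
    """Gaussian noise for patch i, seeded by hash(bits, i, salt)."""
    bits = simhash_bits(v, i, salt, b)
    rng = np.random.default_rng(_seed(bits, i, salt))
    return rng.standard_normal(patch_size)

# Usage: a rescaled vector has the same sign bits, hence the same patch noise.
v = np.random.default_rng(0).standard_normal(384)
z_a = simhash_noise(v, i=3, salt="secret", b=7, patch_size=16)
z_b = simhash_noise(2.0 * v, i=3, salt="secret", b=7, patch_size=16)
```

Because the patch index enters both hash calls, two patches with identical sign bits still receive different seeds, which is the property the paragraph above relies on.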

Watermark Generation

Algorithm 2 Watermark Generation

The first step of the generation process is to find a semantic vector describing the image that will be generated. Ideally, the semantic vector depends only on the prompt and correlates exclusively with images generated from it. Yet, in practice, predicting the image semantics based on the prompt is difficult.

Our solution begins by generating a proxy image $\mathbf{x}^{\mathrm{pre}}$. Then, we use a captioning model to obtain a text description of the generated image. The caption is embedded into a latent semantic space, resulting in a semantic vector $\mathbf{v}$, which captures the high-level semantics of the image generated from the prompt. Next, we generate the watermarked noise $\mathbf{z}$ in patches using the semantic vector $\mathbf{v}$ and SimHash. Finally, we apply diffusion to the watermarked initial noise.

Embedding Optimization. During detection, the generated image will be captioned to obtain a semantic vector $\tilde{\mathbf{v}}$. To make sure that $\mathbf{v}$ correlates with $\tilde{\mathbf{v}}$ and not with unrelated vectors, we fine-tune the embedding model to improve the similarity between the embeddings of different images generated from the same prompt.

Algorithm 3 Watermark Detection

1: Input: $\tilde{\mathbf{x}}$: suspect image, $\tau$: patch distance threshold, $n$: number of patches, $m_{\mathrm{match}}$: match threshold, $b$: number of bits, salt: secret salt
2: Output: Watermark detection (True/False)
3: $\tilde{\mathbf{v}} \gets \mathrm{Embed}(\mathrm{Caption}(\tilde{\mathbf{x}}))$
4: $\mathbf{z}^{\mathrm{inv}} \gets \mathrm{InverseDiffusion}(\tilde{\mathbf{x}})$
5: $m \gets 0$
6: for $i = 1, \ldots, n$ do
7: $\tilde{\mathbf{z}}_i \gets \mathrm{SimHash}(\tilde{\mathbf{v}}, i, \mathrm{salt})$
8: if $\|\tilde{\mathbf{z}}_i - \mathbf{z}_i^{\mathrm{inv}}\|_2 < \tau$ then
9: $m \gets m + 1$
10: end if
11: end for
12: return $m \geq m_{\mathrm{match}}$

Watermark Detection

For detection, we generate noise based on the semantic content of the image and check how well it corresponds to the reconstructed noise obtained through DDIM inversion (Algorithm 3). We begin by embedding the image to get a semantic vector $\tilde{\mathbf{v}}$ that captures the content of the image. SimHash is then applied to this vector as in the watermark generation process, generating an estimated initial noise $\tilde{\mathbf{z}}$. Finally, we use inverse diffusion (e.g., DDIM [23]) to approximately reconstruct the initial noise $\mathbf{z}^{\mathrm{inv}}$ from the image.

Since $\mathbf{v}$ and $\tilde{\mathbf{v}}$ may differ, $\mathbf{z}$ and $\tilde{\mathbf{z}}$ are not necessarily the same. However, by the similarity property of SimHash, $\mathbf{z}$ and $\tilde{\mathbf{z}}$ will be identical on some patches as long as $\mathbf{v}$ and $\tilde{\mathbf{v}}$ are close. On the patches $i$ where $\tilde{\mathbf{z}}_i = \mathbf{z}_i$,

$$
\|\tilde{\mathbf{z}}_i - \mathbf{z}_i^{\mathrm{inv}}\|_2 = \|\mathbf{z}_i - \mathbf{z}_i^{\mathrm{inv}}\|_2.
$$

For such $i$, the only error stems from the diffusion and inverse diffusion processes. Empirically, we find that there is a threshold $\tau$ such that if two patches have an $\ell_2$-norm difference of at most $\tau$, it is likely that they were generated from the same random seed.


Figure 4. Watermark Detection vs. Semantic Similarity. We plot the empirical probability of declaring an image as watermarked as a function of the angle between the semantic embedding used for watermark generation and that of the inspected image ($n = 1024$ and $b = 7$). Our use of locally sensitive hashing allows us to constrain the semantic embedding of the image we deem watermarked according to the initial noise used.

Semantic Similarity Detection. In order to detect whether an image was initially generated with our watermark, we count the number of patches that match (i.e., their $\ell_2$-norm distance is at most $\tau$). If the number of matches is above a set threshold $n_{\mathrm{match}}$, then we declare the image watermarked. In Section 3.3, we analyze the probability of correctly identifying a watermarked image.
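The patch-counting step of detection can be illustrated with a short Python sketch. It assumes the reference noise and the inverted noise have already been computed and reshaped into arrays of shape `(n, patch_dim)`; the function names, the synthetic data, and the values of `tau` and the match threshold are illustrative choices, not values from the paper.

```python
import numpy as np

def count_matching_patches(z_ref, z_inv, tau):
    """Count patches whose l2 distance is below tau.

    z_ref, z_inv: arrays of shape (n, patch_dim), one row per patch.
    """
    dists = np.linalg.norm(z_ref - z_inv, axis=1)
    return int(np.sum(dists < tau))

def is_watermarked(z_ref, z_inv, tau, m_match):
    """Declare a detection when at least m_match patches match."""
    return count_matching_patches(z_ref, z_inv, tau) >= m_match

# Synthetic demo: 40 of 64 patches are noisy copies of the reference,
# the remaining 24 are independent Gaussian noise.
rng = np.random.default_rng(0)
z_ref = rng.standard_normal((64, 64))
z_inv = rng.standard_normal((64, 64))
z_inv[:40] = z_ref[:40] + 0.1 * rng.standard_normal((40, 64))
print(count_matching_patches(z_ref, z_inv, tau=4.0))
```

Independent Gaussian patches in 64 dimensions sit at a typical distance of about $\sqrt{128} \approx 11.3$ from each other, so a threshold of 4 cleanly separates them from the lightly perturbed copies in this toy setting.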

Tampering Detection. In addition to the association between the watermark and the semantic embedding, edits such as object addition, removal, or modification are likely to alter the estimated initial noise in the affected image regions. This enables our watermark to provide localized information about edits that might have been made to the image. Consequently, even if the semantic embedding $\tilde{\mathbf{v}}$ aligns well with the initial embedding $\mathbf{v}$ after edits, such tampering can still be detected by identifying localized patches in the reconstructed initial noise that match neither the expected noise nor any other valid input to the hash function. Inspecting the patches one by one, the model owner may recover the $b$ input bits for each patch with an exhaustive search over the $2^b$ options per patch, and recover a matching initial noise.

Comparing this reconstructed noise to the inverted noise $\mathbf{z}^{\mathrm{inv}}$ allows us to detect which patches may have been modified. The total time for this search scales as $n \cdot 2^b$ (which is much faster than naively searching over all $2^{bn}$ possible initial noises). After obtaining a per-patch map (see Figure 3b), we may apply a spatial test such as the one described in Section 10.
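The per-patch exhaustive search can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `patch_noise` stands in for the seeded sampling step of SimHash (SHA-256 and NumPy's `default_rng` are assumed), and `recover_bits` and `tamper_map` are hypothetical helper names.

```python
import hashlib
import itertools
import numpy as np

def patch_noise(bits, i, salt, dim):
    """Noise for patch i given its b sign bits (stand-in for SimHash's last step)."""
    digest = hashlib.sha256(repr((bits, i, salt)).encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    return rng.standard_normal(dim)

def recover_bits(z_inv_patch, i, salt, b, tau):
    """Try all 2^b bit patterns; return the one whose noise is within tau, else None."""
    dim = z_inv_patch.shape[0]
    for bits in itertools.product((0, 1), repeat=b):
        if np.linalg.norm(patch_noise(bits, i, salt, dim) - z_inv_patch) < tau:
            return bits
    return None

def tamper_map(z_inv, salt, b, tau):
    """True for each patch that no valid hash input explains (possibly edited)."""
    return [recover_bits(p, i, salt, b, tau) is None for i, p in enumerate(z_inv)]

# Demo: four patches generated from known bits, then patch 2 is overwritten.
rng = np.random.default_rng(1)
true_bits = [(0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 0, 1), (1, 0, 1, 1)]
z_inv = np.stack([patch_noise(bt, i, "salt", 64) + 0.05 * rng.standard_normal(64)
                  for i, bt in enumerate(true_bits)])
z_inv[2] = rng.standard_normal(64)
print(tamper_map(z_inv, "salt", b=4, tau=4.0))
```

Each patch is searched independently, so the loop performs at most $n \cdot 2^b$ noise regenerations, matching the cost stated above.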

In any case, the local patch inspection is only required when an image is deemed watermarked by semantic similarity detection, but the watermark owner would like a finer understanding of the edits that might have been applied to it. This inspection is especially useful against the CAT ATTACK, described in Section 4.

3.3. Analysis

Before formally analyzing our watermarking scheme, we state a simplifying assumption on the distance between the initial and reconstructed noise patches. We assume the noise patches are close if and only if the suspect image was produced from the same noise as the one given by our watermarking scheme. The impact of low-likelihood events, where unrelated patches end up close after noise reconstruction, remains part of our empirical analysis in Section 5.

Assumption 3.1 (Patch Distance Separation). There is a threshold $\tau_{\mathrm{dist}}$ so that, for all generation noises $\mathbf{z}$, inverted noises $\mathbf{z}^{\mathrm{inv}}$, and patches $i \in [k]$,

$$
\|\mathbf{z}_i - \mathbf{z}_i^{\mathrm{inv}}\|_2 < \tau_{\mathrm{dist}}
$$

if and only if $\mathbf{z}^{\mathrm{inv}} \doteq \mathrm{InverseDiffusion}(\mathrm{Diffusion}(\mathbf{z}))$.

An immediate consequence of the patch distance separation assumption is that we never declare an image as watermarked if its initial noise was not generated using our watermarking scheme. In practice, such unrelated images can match on a few patches; however, it is highly unlikely for them to match on the $n_{\mathrm{match}}$ patches needed for our method to declare a detection.

Unrelated prompts. A key property of our watermarking approach is its resistance to forgeries generated from unrelated prompts. Prior watermarking methods declare an image as watermarked as long as the pattern is embedded in the initial noise and the diffusion and inverse diffusion processes remain reasonably accurate. However, this creates a vulnerability: an adversary could take an existing watermark and apply it to an unrelated, potentially offensive, or misleading prompt.

In contrast, our approach strengthens watermark integrity by requiring that the new prompt remains semantically close to the original. This ensures that watermarks are not erroneously detected in entirely unrelated images.

Lemma 3.2 (Detection Probability). Consider a suspect image $\tilde{\mathbf{x}}$ produced from our watermarking scheme with initial semantic vector $\mathbf{v}$. Let $\tilde{\mathbf{v}}$ be the (possibly quite different) semantic embedding of $\tilde{\mathbf{x}}$, and $\theta \in [-90^\circ, 90^\circ]$ be the angle between $\mathbf{v}$ and $\tilde{\mathbf{v}}$. Set $\theta_{\mathrm{mid}}$ as the threshold between related and unrelated semantic vectors. The probability that we identify the image as watermarked is

[Equation image]

where $\rho(\theta) = \left(1 - \frac{\theta}{180}\right)^b$.

Table 1. Watermark Detection Probabilities. Example detection probabilities for suspect images generated by our watermarking scheme with an angle threshold of $\theta_{\mathrm{mid}} = 70^\circ$, $n = 1024$ patches, and $b = 4$ bits. For images with angles deviating by more than $5^\circ$ from the threshold, our method distinguishes between related and unrelated watermarked images.

| Semantic angle $\theta(\mathbf{v}, \tilde{\mathbf{v}})$ | Detection probability |
|---|---|
| $80^\circ$ | $3.06 \times 10^{-6}$ |
| $75^\circ$ | $0.0111$ |
| $70^\circ$ | $0.507$ |
| $65^\circ$ | $0.992$ |
| $60^\circ$ | $1.0$ |
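To get a feel for how sharply the detection probability changes with the angle, the Python sketch below evaluates a binomial tail built from $\rho(\theta)$. It rests on two assumptions that the text here does not confirm: each of the $n$ patches matches independently with probability $\rho(\theta)$, and the match threshold is $\lceil n \cdot \rho(\theta_{\mathrm{mid}}) \rceil$. It is a plausible reading of Lemma 3.2 and Table 1, not a restatement of the lemma.

```python
import math

def rho(theta_deg, b):
    """Probability that all b sign bits agree at angle theta (in degrees)."""
    return (1.0 - theta_deg / 180.0) ** b

def binom_tail(n, p, k_min):
    """P[X >= k_min] for X ~ Binomial(n, p), with terms computed in log space."""
    if p <= 0.0:
        return 1.0 if k_min <= 0 else 0.0
    if p >= 1.0:
        return 1.0 if k_min <= n else 0.0
    total = 0.0
    for k in range(max(k_min, 0), n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)

def detection_probability(theta_deg, theta_mid=70.0, n=1024, b=4):
    """Chance that at least n_match of n patches match (assumed threshold rule)."""
    n_match = math.ceil(n * rho(theta_mid, b))  # assumption, see lead-in
    return binom_tail(n, rho(theta_deg, b), n_match)

for theta in (80, 75, 70, 65, 60):
    print(theta, detection_probability(theta))
```

Under these assumptions, the probability moves from near zero to near one within roughly ten degrees around $\theta_{\mathrm{mid}}$, which is the qualitative behavior Table 1 reports.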

We illustrate in the example below the sharp detection thresholds Lemma 3.2 implies; specifically how watermark detection probability varies with semantic similarity between the original and a potentially modified image. We delay the proof of Lemma 3.2 to the appendix.

Example 3.3 (Sharp Detection Thresholds). Our watermarking scheme embeds a semantic vector $\mathbf{v}$ into an image at generation time. When evaluating a suspect image that was generated via our watermark, we extract its current semantic vector $\tilde{\mathbf{v}}$. The probability of a watermark detection depends on the semantic angle $\theta(\mathbf{v}, \tilde{\mathbf{v}})$ between $\mathbf{v}$ and $\tilde{\mathbf{v}}$.

For instance, Figure 5c illustrates a separation between vectors associated with the original image and those that are unrelated, occurring at a threshold of $\theta_{\mathrm{mid}} \approx 70^\circ$. When our watermarking scheme is run with $\theta_{\mathrm{mid}} = 70^\circ$, $n = 1024$, and $b = 4$, Table 1 quantifies the probability of a watermark detection. For images with angles more than $5^\circ$ from the threshold, we almost always correctly dist