Perceptual Hashing of Deep Convolutional Neural Networks for Model Copy Detection

In recent years, many model intellectual property (IP) proof methods for IP protection have been proposed, such as model watermarking and model fingerprinting. However, with the increasing number of models transmitted and deployed on the Internet, quickly finding the suspect model among thousands of models on model-sharing platforms such as GitHub is in great demand, which concurrently triggers the new security problem of model copy detection for IP protection. As an important part of the model IP protection system, the model copy detection task has not received enough attention. Due to their high computational complexity, both model watermarking and model fingerprinting lack the capability to efficiently find suspected infringing models among tens of millions of models. In this article, inspired by hash-based image retrieval methods, we introduce a novel model copy detection mechanism: perceptual hashing for convolutional neural networks (CNNs). The proposed perceptual hashing algorithm converts the weights of a CNN to a fixed-length binary hash code so that a lightly modified version has a hash code similar to that of the original model. By comparing the similarity of a pair of hash codes between a query model and a test model in the model library, similar versions of a query model can be retrieved efficiently. To the best of our knowledge, this is the first perceptual hashing algorithm for deep neural network models. Specifically, we first select the important model weights based on the model compression theory, then calculate the normal test statistics (NTS) on the segments of important weights, and finally encode the NTS features into hash codes. Experiments performed on a model library containing 3,565 models indicate that our perceptual hashing scheme has superior copy detection performance.


INTRODUCTION
In recent years, deep learning has achieved great success on a wide variety of tasks, such as image recognition [31,59], image-to-image translation [53], action recognition [42], knowledge graph reasoning [62], cross-view geo-localization [48], and video captioning [33]. But the training of high-quality models is often costly, which can be considered the intellectual property (IP) of the model creator. Thus, it is vital to pay attention to the IP protection of pretrained models. Recently, there have been some effective methods to protect deep neural networks against infringement by validating the authorship information of suspected illegal models, such as model watermarking [2,50,55,56,58] and model fingerprinting [7,30,60].
Model watermarking allows model producers to hide the IP in the models during the training phase for future authorship verification. Current works either add a regularization term to the loss function [34] or regard the predictions of a special set of indicator images as the required watermarks [32,37,51]. However, they all need to modify the models (e.g., retraining or fine-tuning) to embed the IP message, which is time-consuming and not applicable to large-scale model libraries. Moreover, the watermarking process [60] often affects the prediction accuracy of the original models.
To avoid the performance degradation caused by model modification, model fingerprinting [7,27,30,60], which is free of modifying model parameters, has been proposed to protect the IP by designing a set of special adversarial examples. By comparing the predicted label of the specified adversarial example with the pre-defined target labels, or by comparing the decision patterns of the models on a set of normal and special adversarial inputs, the model owner can determine whether a suspect is a plagiarized model. However, the explosion of convolutional neural network (CNN) models over the Internet poses a new problem for IP protection. Specifically, there are millions of pretrained models on model-sharing platforms such as GitHub. It is a critical challenge for artificial intelligence companies to efficiently and accurately detect suspect illegal copy versions of their commercial models on those platforms. After obtaining a suspect model, the model owners can further conduct model watermarking or model fingerprinting for identification and decision-making about whether to take legal action. In other words, copy model retrieval can be seen as the pre-step of model ownership verification. We name this new task model copy detection, analogous to image copy detection [29,61] in traditional multimedia.
Although model fingerprinting can be used for copy detection, the dependence on trigger sets makes fingerprinting unsuitable for large-scale retrieval. Current model fingerprinting algorithms are solely designed for protecting classification models, which limits their generalizability across different tasks. Although model fingerprinting can be extended to other tasks by simply modifying the adversarial sample generation schemes, such trigger-set-based fingerprints are not generic and need to be optimized in a model-specific way, and their generation process is still very slow. More importantly, to retrieve illegal copies, the query model's specific trigger sets must be fed into every model in the model library to obtain their predictions, which is also extremely costly. To solve the efficient copy model detection problem, and inspired by hash-based image retrieval methods [13,49,63], we propose the first perceptual hashing algorithm for CNNs, which converts the weights of CNNs to fixed-length binary hash codes. By comparing the hash code similarity between a query model (the model a user submits to find suspected copies) and test models (models to be checked) from the model library, the copy versions of a query model can be detected efficiently. Table 1 shows the comparison among model watermarking, model fingerprinting, and the proposed model perceptual hashing.
Since the characteristics of parameters in CNNs are different from pixels, traditional perceptual hashing methods for images cannot be applied to neural networks directly:
• First, due to the difference between the distributions of neural network weights and images, normalization methods for image perceptual hashing such as color space dimension reduction [52] and illumination normalization [46] are not suitable for neural networks.
• Second, neural networks do not have a unified representation [22], i.e., they can have different topological structures, including different depths, different numbers of nodes, different activation functions, and so on.
• Third, the requirements of robustness and discrimination are different in the image and neural network domains (see Section 3.1 for more details). For images, copy versions are often generated by lossy compression, cropping, and resizing, while for CNNs, copied models are often generated by fine-tuning and model pruning.
Therefore, it is necessary to construct a preferable hashing algorithm suitable for CNNs. The key point of constructing a model hash is how to extract robust and representative features from CNNs. In this work, we extract features from the important weights of CNNs. The features extracted from the weights can be seen as unique marks of the model, which can be used to distinguish between different models. Meanwhile, to avoid a significant accuracy drop, function-preserving modifications such as model fine-tuning and pruning will not change the essential statistical characteristics of the important weights too much, which guarantees the robustness of the hash. Specifically, we first use the model compression theory [5,9,15,17] to select important weights of the target CNN model and calculate the normal test statistics (NTS) [36,41] for each weight segment to get fixed-length robust features. Finally, the robust features are encoded and encrypted to obtain the final hash codes.

Fig. 1. Illustration of the perceptual hashing of CNNs for illegal copy detection. Given a query model and a test model, the perceptual hashing of CNNs converts their weights to hash codes and calculates the hamming distance between the hash codes. Thus, the similarity between the query model and the test model can be measured. If the distance is lower than the threshold value, then the test model is considered an illegal copy version of the query model.

As shown in Figure 1, by comparing the differences of the hash codes, the similarity between each test model and the query model can be easily obtained. If the similarity distance is lower than a threshold value, then the test model is potentially a copy version of the query model.
In summary, our contributions are threefold:
• We introduce a new model copy detection mechanism, i.e., perceptual hashing for CNN models. To the best of our knowledge, this is the first perceptual hashing algorithm for deep neural network models.
• We design an efficient and effective perceptual hashing algorithm based on the model compression theory, which is applicable to large-scale model retrieval.
• We perform comprehensive experiments to validate the effectiveness and robustness of perceptual hashing on different models of different tasks.
The rest of the article is organized as follows: Section 2 presents related works. In Section 3, we describe our proposed hashing algorithm for CNN model copy detection. Section 4 gives the experimental results on 10 widely used models, a large deep-learning library called TIMM, and a model library we constructed containing 3,565 models. The article is concluded in Section 5.

RELATED WORK

Model Watermarking
The concept of model watermarking was first proposed by Uchida et al. [50], who designed a regularization loss to embed information in the weights while ensuring that the watermark embedding does not degrade network performance. However, extracting the watermark requires that all parameters be accessible. To enable watermark extraction from remote models, black-box model watermarking has been proposed in many works [2,11,16,54-56,58]. For example, Zhang et al. [58] designed a black-box watermark based on the signature of a specific author, proposing three CNN-applicable watermark generation algorithms. These watermarks were labeled as the target category and added into the training dataset. After training, the network behaves normally on clean images and outputs the specified predictions on the specially designed watermark patterns. Further, to resist ambiguity attacks, Zhang et al. [57] proposed a new passport-aware normalization layer that acts as a plug-in module for the target model. Despite their effectiveness, all the above watermarking-based defense methods need to modify the models by retraining or fine-tuning, which is not only time-consuming but may also affect the original model performance.

Model Fingerprinting
To avoid the risk of modifying models, Zhao et al. [60] proposed adversarial fingerprinting authentication, which aims at extracting the inherent features of the model itself instead of embedding fixed watermarks. The features they selected as model fingerprints were a set of specially crafted adversarial examples called Adversarial-Marks, which have good transferability and can help model owners determine whether a model is a plagiarized model. Similarly, Cao et al. [7] also used adversarial examples near the classification boundary as model fingerprints. To make the fingerprinting robust to model distillation, Lukas et al. [30] further proposed conferrable adversarial examples as model fingerprints. To detect plagiarized models that have a different output space from the source model, Li et al. [27] proposed a model fingerprinting scheme that compares the behavioral patterns on a set of normal and special adversarial inputs. As described above, optimizing such trigger sets is both time-consuming and model-specific. The model-specific property also makes fingerprinting unsuitable for large-scale retrieval, because each test model needs to be evaluated on the trigger set of the query model on-the-fly.

Image Hashing
Image hashing is the process of mapping images of arbitrary size to fixed-size values, which can identify an input image. Image hashing can be divided into cryptographic hashing and perceptual hashing. Cryptographic hashing such as message-digest algorithm 5 [38] can verify the integrity of the data when all bits of data are transmitted correctly. Therefore, it is extremely sensitive to bit changes, i.e., changing only 1 bit of the data will significantly change the output. Different from cryptographic hashing, perceptual hashing [1,12,21,28,35,44,46,47] can verify the integrity of the data when the data are modified in a content-preserving way. With perceptual hashing, the hash codes of the original image and the modified image are still similar if images are modified by image operators such as image filtering, JPEG compression, or image resizing. In this article, inspired by hash-based image retrieval methods, we propose a perceptual hashing algorithm for CNNs. Because the characteristics of parameters in CNNs are totally different from pixels in images, traditional image perceptual hashing methods cannot be applied to the models directly, and it is crucial to construct hash codes that are suitable for CNNs.

PERCEPTUAL HASHING OF CNNS
In this article, we propose to leverage hashing algorithms for CNN model copy detection. Considering that the illegally copied models may be pruned and fine-tuned while preserving accuracy, perceptual hashing is more suitable than cryptographic hashing. However, as the characteristics of CNNs are totally different from pixels in images, it is not suitable to apply traditional image perceptual hashing methods to CNNs directly. Inspired by the model compression theory, we design a new perceptual hashing algorithm tailored for CNN models.

Problem Formulation
Given an input CNN model M and a secret key K, the target of perceptual hashing of CNNs is to design a function f that generates a t-bit vector h ∈ {0, 1}^t. Let M_1 and M_2 be two input models; their corresponding binary hash codes h_1 and h_2 are calculated by the following:

$$h_1 = f(M_1, K), \quad h_2 = f(M_2, K).$$

For CNN model copy detection, the following three requirements should be considered.

Robustness.
If the test model is a common-model-modified version (the model after common modification such as model fine-tuning and model pruning) of the query model (we call the two models related), then the distance of the hash codes should be smaller than the threshold th. Formally, ∀M_1, M_2, we have the following:

$$P\{D(h_1, h_2) < th\} \approx 1,$$

where M_2 is the common-model-modified version of the query model M_1, D is the distance between the two hash codes, th is the threshold, and P denotes the probability.
The related models represent the test models that are generated from the query model, which accords with IP infringement in practice. Specifically, we follow the setting in recent model IP protection works [26,60], where the attacker tries to obtain a pirated model by fine-tuning the original model or pruning it with common model compression techniques.

Discrimination.
The hash codes of unrelated models (i.e., a test model that is neither the query model nor its common-model-modified version) should be discriminative. In other words, if two models are unrelated, then the distance of their hash codes should be higher than the threshold th. Formally,

$$P\{D(h_1, h_2) > th\} \approx 1.$$

The unrelated models represent the test models that cannot be obtained by modifying the query model; those models cannot be copied versions of the query model. In this article, we consider two kinds of these models: models with different structures and models with the same structure but trained with different random initializations. Note that, the same as IPGuard [7], we consider models trained with the same training setting but different random initializations to be unrelated models, i.e., one model cannot be an illegal copy version of the other. Specifically, to get a model with a different random initialization from the stolen model, the adversary needs to reinitialize and retrain the stolen model on the original dataset. However, for the adversary, (1) it is almost impossible to obtain the same dataset and training hyper-parameters as the original model along with sufficient computing resources to reinitialize and retrain the model, and (2) rather than stealing the model, the adversary would rather train his/her own model in this scenario, since the two procedures require similar amounts of training data and computing resources.

Security.
For security considerations, it is essential to ensure that the hash code cannot be estimated without knowing the secret key. Otherwise, an adversary would be able to conduct intentional attacks using an estimated hash. Specifically, assume an attacker has a copied model and has estimated its hash code. The attacker can fine-tune the copied model by regularizing the model loss with the negative ℓ2 distance between the model's hash code and the estimated hash. In this way, it is possible for the attacker to generate models whose performance is similar to the copied model but whose hash is significantly different from it. Thus, the attacker could use the copied model illegally and escape copy detection.
Therefore, the hash value must be highly dependent on the key. If two different keys are used for the same model, then the corresponding hash values should be totally different. Formally, ∀M, we have the following:

$$P\{D(f(M, K_1), f(M, K_2)) > th\} \approx 1,$$

where K_1 and K_2 are two different secret keys. The corresponding requirements for perceptual hashing in the image domain and the model domain can be summarized as follows.

Robustness. Images: the hash code should be robust against common signal processing operations such as resizing, cropping, and so on. Models: the hash code should be robust against common model modifications such as fine-tuning and model pruning.

Discrimination. Images: if a test image is neither the query image nor its modified version after common signal processing operations, the distance of the hash codes should be larger than the threshold. Models: if a test model is neither the query model nor its modified version after common model modifications, the distance of the hash codes should be larger than the threshold.

Security. Without the correct key, the hash code should be extremely difficult to approximate.

In other words, for an image modified by various signal processing operations, the hash code should change little even after these operations. However, the most likely modification operations for a CNN model are model pruning and fine-tuning [2,50,58]. That is, the hash code of a neural network after fine-tuning or other possible modifications should remain similar to the original one.

Proposed Framework
In this section, we will elaborate on the framework of our perceptual hashing algorithm. As shown in Figure 2, our hashing technique starts with the model pruning operation to select some important weights. The pruned weights are then normalized, reshaped, and divided into several segments. After that, the NTS of each segment is computed as the fixed-length features. Finally, quantization and encryption are used to convert the NTS into hash codes. By comparing the hash similarity between the target query model and all the test models in the model library, all the possible copy versions of a query model can be detected efficiently. Below, we will explain each part in detail.

Weights Selection.
As mentioned above, it is difficult to apply traditional image perceptual hashing directly to CNNs, i.e., we must construct the hash codes for CNN models by utilizing their own characteristics. Considering that there is much redundancy in the CNN weights, the perceptual hashing algorithm should remain robust if the target model is modified in a function-preserving way, e.g., pruning and fine-tuning. Inspired by model compression theory [17], we design the hash codes to be relevant to the important network weights and adopt the absolute value of weights to rank their importance. That is, the hash codes are constructed from the weights with large absolute values. Specifically, let v be the list of the absolute values of all the weights in model M and c be the ratio of weights to be kept (we call c the weights selection ratio in this work); we then calculate a threshold th_v such that a ratio 1 − c of the weight magnitudes falls under that threshold:

$$th_v = \mathrm{quantile}(v, 1 - c),$$

where quantile(·, q) denotes the function that gets the qth quantile of a list. The qth quantile of a list v = [v_i] can be computed by the following equation:

$$\tilde{v}_q = v_{(\lceil q \cdot |v| \rceil)},$$

where ṽ_q means the qth quantile of the list v and v_(k) denotes the kth smallest element of v. Afterward, the weights whose magnitudes are under th_v will not be involved in hash code computation. To further enhance the robustness, we normalize the pruned weights of each layer. We will discuss this in more detail in Section 4.6.
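As a concrete illustration, the selection step might be implemented as follows. This is a minimal NumPy sketch: the function name, the 1/16 keep ratio, and the per-layer z-score normalization are our illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_important_weights(layers, keep_ratio=1/16):
    """Magnitude-based selection sketch: keep the top `keep_ratio` fraction
    of weights (by absolute value) across all layers, then normalize the
    survivors per layer for extra robustness."""
    all_abs = np.concatenate([np.abs(w).ravel() for w in layers])
    # th_v = quantile(v, 1 - c): magnitudes below th_v are dropped
    th_v = np.quantile(all_abs, 1 - keep_ratio)
    selected = []
    for w in layers:
        flat = w.ravel()
        kept = flat[np.abs(flat) >= th_v]
        if kept.size:  # per-layer normalization of the surviving weights
            kept = (kept - kept.mean()) / (kept.std() + 1e-12)
        selected.append(kept)
    return np.concatenate(selected)

# Two toy "convolution layers" standing in for a real model's filters
rng = np.random.default_rng(0)
layers = [rng.normal(size=(8, 4, 3, 3)), rng.normal(size=(16, 8, 3, 3))]
v = select_important_weights(layers, keep_ratio=1/16)
```

With 1,440 weights in total, keeping the top 1/16 fraction leaves 90 surviving weights, which are then passed on to feature generation.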

Fixed-length Features Generation.
In this section, we calculate the NTS in the segments of pruned weights to get the robust feature. The normal test statistics [36,41] are a set of statistics for measuring the distance of a random process from Gaussianity, which are often used to analyze the statistical characteristics of a random process.
The reason we choose normal test statistics as features for hashing is as follows. As mentioned above, for the robustness of the hash, the features should be extracted from the weights that have important effects on the overall deep neural network (DNN) outcome. The impact of weights can be measured by the distance between the weight distribution and the Gaussian distribution [9]. This is because it is common practice in the deep learning community to use ℓ2 norm regularization during training in an attempt to avoid overfitting, and ℓ2 regularization forces the CNN filter parameters to follow a Gaussian distribution [6]. However, Gaussianity is not an ideal characteristic for CNNs: important weights are known to be non-Gaussian, so the distance of a random process from Gaussianity can be used to characterize the important weights, i.e., accuracy-preserving distortion will not change the essential statistical characteristics of these weights much. Another advantage of using NTS is that the one-way property is well satisfied, i.e., due to the high complexity, it is difficult to reconstruct meaningful weights from NTS.
Specifically, for a convolution layer i in model M, we flatten the filter of this layer w_i ∈ R^{d×h×l×l} into a one-dimensional vector r_i, where h and d denote the number of input channels and output channels, respectively, and l is the size of the convolution kernel. For a model with L convolution layers, we obtain a flattened-weights sequence R = {r_1, r_2, . . . , r_L}. Then we concatenate R and divide the concatenated vector equally into N non-overlapping segments, R = {r_1, r_2, . . . , r_N}.
After that, we calculate the NTS of each segment r_i as the features that represent the model's perceptual content. For N segments, we obtain an NTS sequence s = {s_1, s_2, . . . , s_N}. Following Reference [36], we adopt the Shapiro-Wilk test statistic [41] as the NTS because of its robustness. Given r_i = [x_1, . . . , x_n], where x_j is a weight, the Shapiro-Wilk test statistic can be computed as follows:

$$W = \frac{\left(\sum_{k=1}^{n} o_k x_{(k)}\right)^2}{\sum_{k=1}^{n} \left(x_k - \bar{x}\right)^2},$$

where x_{(k)} is the kth order statistic, i.e., the kth smallest number in r_i, and x̄ = (x_1 + · · · + x_n)/n is the mean. The coefficients o_k are computed by

$$(o_1, \ldots, o_n) = \frac{m^T V^{-1}}{\left(m^T V^{-1} V^{-1} m\right)^{1/2}},$$

where the vector m = (m_1, . . . , m_n)^T is composed of the expected values of the order statistics of independent and identically distributed random variables sampled from the standard normal distribution, and V is the covariance matrix of those normal order statistics.
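Assuming SciPy is available, the segmentation and NTS computation can be sketched as follows (`scipy.stats.shapiro` computes the Shapiro-Wilk statistic W; the segment count and input size here are illustrative, not the paper's settings):

```python
import numpy as np
from scipy.stats import shapiro

def nts_features(weights, num_segments):
    """Split the flattened weight vector into equal, non-overlapping
    segments and compute the Shapiro-Wilk statistic W of each one.
    W lies in (0, 1]; values near 1 mean 'close to Gaussian'."""
    seg_len = len(weights) // num_segments
    segs = weights[:seg_len * num_segments].reshape(num_segments, seg_len)
    # shapiro() returns (statistic W, p-value); we keep only W
    return np.array([shapiro(seg)[0] for seg in segs])

rng = np.random.default_rng(1)
s = nts_features(rng.normal(size=4096), num_segments=8)
```

Since the toy input is actually Gaussian, all eight W values come out close to 1; real important weights would deviate further from Gaussianity.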

Hash Generation.
In this section, the NTS features are converted into the final hash codes by quantization and encryption. Quantization converts the features into binary codes so that only the XOR operation is required to compute the similarity between features [19], which improves the computational efficiency of copy detection. Specifically, the NTS features are quantized into the binary hash code a = {a_1, a_2, . . . , a_{N−1}} by the following equation:

$$a_i = \begin{cases} 1, & s_{i+1} > s_i, \\ 0, & \text{otherwise}, \end{cases}$$

where i = 1, 2, . . . , N − 1.
To improve the security of the scheme, the finally generated t-bit (t = N − 1) hash sequence h is encrypted using a secret key by the following equation:

$$h = a \oplus k,$$

where k is a pseudo-random bit sequence generated by the secret key and ⊕ is the exclusive OR. The complexity of hash generation is O(n_c + n_w · log(n_w) + t), where n_c is the number of connections of the input model M, n_w is the number of weights of all convolution layers of M, and t is the hash length. In addition, the complexity is O(t · n_hl) in the retrieval stage, where n_hl is the number of models in the model library. To summarize, the computational complexity of our method is linear in n_c and t and linearithmic in n_w in the hash generation stage, and the search complexity is linear in n_hl and t. Therefore, the proposed hash is scalable to the large-scale model copy detection task.
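The quantization and encryption steps might be sketched as follows. Two caveats: the differential comparison rule is our reading of the N − 1-bit code length, and the SHA-256 counter-mode keystream is an illustrative generator that the paper does not specify.

```python
import hashlib
import numpy as np

def quantize(nts):
    """Differential quantization sketch: bit i compares consecutive NTS
    values, so N features yield N - 1 bits."""
    return (nts[1:] > nts[:-1]).astype(np.uint8)

def keystream(secret_key, nbits):
    """Pseudo-random bits derived from the secret key (illustrative:
    SHA-256 in counter mode; any keyed PRG would do)."""
    out, counter = [], 0
    while len(out) < nbits:
        digest = hashlib.sha256(f"{secret_key}:{counter}".encode()).digest()
        out.extend((byte >> j) & 1 for byte in digest for j in range(8))
        counter += 1
    return np.array(out[:nbits], dtype=np.uint8)

def encrypt(bits, secret_key):
    # h = a XOR k; applying it twice with the same key recovers a
    return bits ^ keystream(secret_key, len(bits))

nts = np.array([0.91, 0.95, 0.93, 0.97, 0.96])
a = quantize(nts)          # 5 features -> 4 bits
h = encrypt(a, "secret")
```

Because XOR is an involution, decrypting with the same key recovers the raw code, while a different key yields an uncorrelated bit pattern.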

EXPERIMENTS
In this section, we first introduce the experimental settings and then evaluate our perceptual hashing algorithm from seven aspects: robustness, discrimination, impact of parameters on hashing performance, security, adaptive attacks, copy detection performance on the model library, and computational cost.

Models.
To test the basic robustness and discrimination of our scheme, as shown in Table 3, we consider 10 model structures that are widely used for evaluating our algorithm. The first 5 models are pretrained on CIFAR-10 [23]. MNIST-Net is pretrained on MNIST [10]. LeNet is pretrained on GTSRB [45]. ResNet18 is pretrained on ImageNet [39]. SRResNet is pretrained on 291 images with data augmentation including flipping, rotation, and downsizing and is evaluated on Set5 [4], a widely used benchmark dataset for super-resolution that consists of natural scenes. In Table 3, IR means Image Recognition, TSR means Traffic Sign Recognition, and IS means Image Super-resolution. The performance of the first nine models is measured by top-1 accuracy (%), and the performance of SRResNet is measured by the peak signal-to-noise ratio (PSNR) between the recovered high-resolution image and the ground truth.
To further verify the discrimination of the proposed scheme, a large deep-learning library called TIMM is selected, which includes many state-of-the-art computer vision models trained on ImageNet, each with a unique structure. We select the 347 CNNs that only contain convolution layers and linear layers for the discrimination test.
To test the performance of our method in an environment as close to the real application as possible, we constructed a model library containing 3,565 models and tested the copy detection performance on it. The model library is formed by mixing the 10 basic models in Table 3 and their modified versions with models from TIMM and their fine-tuned versions.

Metric.
To measure the similarity between hashes, we exploit the normalized hamming distance as the evaluation metric:

$$D(h_1, h_2) = \frac{1}{t} \sum_{i=1}^{t} \left|h_1^i - h_2^i\right|,$$

where h_1 and h_2 are the two CNN hashes to compare and t is the hash length. If D(h_1, h_2) is smaller than a pre-defined threshold, then the CNNs of the input hashes are considered similar models.
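A minimal implementation of this metric (the function name is ours):

```python
import numpy as np

def normalized_hamming(h1, h2):
    """D(h1, h2) = (1/t) * sum_i |h1_i - h2_i| for two t-bit codes,
    computed as the mean of the bitwise XOR."""
    h1 = np.asarray(h1, dtype=np.uint8)
    h2 = np.asarray(h2, dtype=np.uint8)
    assert h1.shape == h2.shape, "hash codes must have equal length"
    return float(np.mean(h1 ^ h2))
```

Identical codes give 0.0, complementary codes give 1.0, and unrelated random codes concentrate around 0.5.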

Implementation Details.
We set the threshold of the hash code th to 0.32, the weights selection ratio c to 16, and the length of the hash code t to 1,024 bits. The effect of different th, c, and t on hashing performance will be discussed in Section 4.4.

Robustness
To validate robustness, we compare the hash of a CNN with that of its common-model-modified version. Two common operations, pruning and fine-tuning, are applied to these CNNs to generate common-model-modified CNNs. The validation starts by extracting the hashes of the original CNNs and their similar versions, and the similarity between each pair of hashes is calculated with the normalized hamming distance. In theory, the normalized hamming distances between the original models and their similar versions are supposed to be smaller than the hash code threshold th.

Model Pruning.
Assume the attacker uses amplitude-based model pruning to sparsify the weights of models to evade retrieval. To prune a specific layer, one sets the weights with small magnitudes to zero and then sparsely fine-tunes the model using the cross-entropy loss for 10 epochs to recoup the drop in accuracy.
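The amplitude-based pruning attack can be sketched layer-wise as follows (the sparse fine-tuning step that follows pruning is omitted; the function name and the 0.5 compression ratio are illustrative):

```python
import numpy as np

def magnitude_prune(w, compression_ratio):
    """Zero out the fraction `compression_ratio` of weights with the
    smallest magnitudes in one layer (amplitude-based pruning sketch)."""
    th = np.quantile(np.abs(w).ravel(), compression_ratio)
    mask = np.abs(w) >= th   # keep only the large-magnitude weights
    return w * mask

# Toy convolution filter standing in for a real layer
rng = np.random.default_rng(2)
w = rng.normal(size=(16, 8, 3, 3))
pruned = magnitude_prune(w, compression_ratio=0.5)
```

Because the large-magnitude weights that drive the hash survive pruning unchanged, the hash code of the pruned model stays close to the original one.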
As explained, the results of normalized hamming distance under different compression ratios are presented in the left sub-figure of Figure 3, where the x-axis indicates the compression ratio of model pruning and the y-axis represents the corresponding normalized hamming distance. From the left sub-figure of Figure 3, it can be observed that the distances in all 45 cases are below the threshold, which indicates that our hashing can detect the similar models well. In other words, our scheme achieves good robustness against pruning attacks.

Model Finetuning.
Model fine-tuning is an attack in which an adversary attempts to evade retrieval by modifying the weights of models. Specifically, the adversary retrains the model with half of the original test dataset, aiming to change the weights so that the hash code changes greatly while the accuracy is preserved. For the image super-resolution model SRResNet, we assume that the adversary uses four images of the DIV2K [3] dataset to generate a fine-tuning dataset by data augmentation.
The right sub-figure of Figure 3 shows the normalized hamming distance between the model hashes and those of the fine-tuned versions with various parameters, where the x-axis is the number of fine-tuning epochs and the y-axis is the corresponding normalized hamming distance. We can see that all the normalized hamming distances are below the threshold. Therefore, we conclude that our proposed hashing scheme is robust to model fine-tuning.

Discrimination
Indeed, a fairly small distance between the hashes extracted from the original models and their common-model-modified versions does not necessarily mean that the hashing system is reliable unless it can also efficiently distinguish between unrelated models. To support this argument, we evaluate the discrimination performance on models with different structures, models with the same structure but different random initializations, and 347 models of the TIMM library.

Discrimination on Models with Different Structures.
The hamming distance between hashes is computed to validate discrimination. First, we extract hashes from these CNNs. Then, we calculate the hamming distance between each CNN's hash and the hashes of the other CNNs. The hash distances for the same (on the diagonal) and unrelated model pairs (off-diagonal) are shown in Table 4. As illustrated, most of the hash distances for the same models are close to zero, while for unrelated model pairs they are close to 0.5 and higher than the hash code threshold of 0.32. Notice that although the structures of ResNet32, ResNet56, and ResNet110 are similar, their hashes are very different, which demonstrates the discrimination ability of our scheme.

Discrimination on Models with Different Random Initialization.
As described in Section 3.1, models with different initializations should be distinguished by the hashing system. To test the proposed scheme in such scenarios, we train 10 models for each architecture in Table 3 using different initializations. Due to our limited computation resources, we do not test ResNet18 for ImageNet and SRResNet for Set5. Table 5 reports the average and maximum of the hash distances for the independently trained model pairs. We can see that all averages/maximums are higher than the threshold 0.32, which indicates that our scheme has satisfactory discrimination performance for independently trained models. The reason why the hash can distinguish the models is that the initialization has an important impact on the performance and convergence of the model. Because of the non-convexity of the model's loss function, random initialization and stochastic gradient descent can cause the optimization to find a new local minimum and lead to a different weight distribution.

Discrimination on Models of TIMM Model Library.
To further verify the discrimination of the proposed scheme, the 347 CNNs in TIMM that contain only convolution layers and linear layers are selected for the discrimination test. The hashes of all 347 models are extracted, the Hamming distance between each pair of hash codes is calculated, and we finally obtain C(347, 2) = 60,031 results. The Hamming distances are then compared with the threshold th, and the number of unrelated model pairs judged as related is counted. When th = 0.32, only 62/60,031 = 0.10% of unrelated model pairs are misjudged as related, which further shows that our scheme has satisfactory discrimination.
A neural network consists of two parts: weights and topological structure. Although only the weight information is encoded into the hash, the experimental results on the TIMM library indicate that our perceptual hashing scheme achieves good copy detection performance. This means our scheme works well even when only the weights are accessible in the model library.
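The all-pairs discrimination test described above amounts to counting unrelated pairs that fall below the threshold. A minimal sketch with stand-in random hashes (20 models instead of 347, sizes and names are ours):

```python
import itertools
import numpy as np

def count_false_matches(hashes, threshold=0.32):
    """Count unrelated pairs whose normalized Hamming distance is below threshold."""
    false_matches, total = 0, 0
    for h1, h2 in itertools.combinations(hashes, 2):
        total += 1
        if np.mean(h1 != h2) < threshold:
            false_matches += 1
    return false_matches, total

rng = np.random.default_rng(1)
# Stand-in for the TIMM hashes: independent 1,024-bit codes
library = [rng.integers(0, 2, size=1024) for _ in range(20)]
fm, total = count_false_matches(library)   # total == C(20, 2) == 190
```

With independent 1,024-bit codes the pairwise distance concentrates tightly around 0.5, so essentially no pair falls below 0.32.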

Impact of Parameters on Hashing Performance
4.4.1 Impact of Threshold th. To quantify the robustness and discrimination of the proposed scheme under different thresholds th, two criteria are introduced: the true-positive rate (P_TPR) and the false-positive rate (P_FPR). P_TPR measures how often related models are correctly detected and thus reflects robustness: the larger P_TPR, the better the robustness. P_FPR measures how often unrelated models are misidentified as related and thus reflects discrimination: the smaller P_FPR, the better the discrimination. Formally, P_TPR = n_1/N_1 and P_FPR = n_2/N_2, where n_1 is the number of related models correctly detected, N_1 is the total number of related models, n_2 is the number of unrelated models judged as related, and N_2 is the total number of unrelated models. The P_TPR and P_FPR under different threshold values are shown in Table 6. We observe P_TPR = 1 and P_FPR = 0.0010 when th = 0.32, meaning all related models are detected and only 0.10% of unrelated models are misjudged. Therefore, we take th = 0.32 as the threshold to achieve a satisfactory tradeoff between robustness and discrimination.
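The two criteria can be computed directly from the pairwise distances. A small sketch with toy distance lists (the numbers are illustrative, not the paper's data):

```python
def tpr_fpr(dist_related, dist_unrelated, th):
    """TPR = fraction of related pairs detected (distance < th);
    FPR = fraction of unrelated pairs misjudged as related."""
    n1 = sum(d < th for d in dist_related)     # related pairs below threshold
    n2 = sum(d < th for d in dist_unrelated)   # unrelated pairs below threshold
    return n1 / len(dist_related), n2 / len(dist_unrelated)

# Toy distances: related pairs near 0, unrelated pairs near 0.5
related = [0.01, 0.05, 0.12, 0.20]
unrelated = [0.45, 0.50, 0.48, 0.30, 0.52]
tpr, fpr = tpr_fpr(related, unrelated, th=0.32)   # tpr = 1.0, fpr = 0.2
```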

Impact of Weight Selection Ratio c.
We discuss the impact of different weight selection ratios c on hashing performance through the receiver operating characteristic (ROC) curve [14]. The ROC curve is formed by a set of points (P_FPR, P_TPR) across varying thresholds and is used to evaluate the balance between robustness and discrimination. The x-axis of the ROC diagram is P_FPR and the y-axis is P_TPR; a curve closer to the top-left corner corresponds to a small P_FPR and a large P_TPR, i.e., a better tradeoff between robustness and discrimination than curves farther from the top-left corner.
Specifically, only the weight selection ratio c is changed while the other parameters remain fixed, taking c = 4, 8, 16, 32, and 64. Based on the 95 related model pairs in Table 3 and the 60,031 unrelated model pairs in TIMM, we vary the threshold to obtain the final ROC curves, shown in the left subfigure of Figure 4. The curve for c = 16 is closest to the top-left corner (with the largest area under the ROC curve, 0.99953), which means the tradeoff between robustness and discrimination at c = 16 is better than that of the others. Therefore, 16 is a moderate weight selection ratio for reaching a better tradeoff between robustness and discrimination.
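An ROC curve of this kind is built by sweeping the threshold and recording (P_FPR, P_TPR) at each value. A minimal sketch on synthetic, well-separated distance samples (stand-ins for the real related/unrelated pairs):

```python
import numpy as np

def roc_points(dist_related, dist_unrelated, thresholds):
    """One (FPR, TPR) point per threshold; a pair is 'related' when distance < th."""
    pts = []
    for th in thresholds:
        tpr = float(np.mean(np.asarray(dist_related) < th))
        fpr = float(np.mean(np.asarray(dist_unrelated) < th))
        pts.append((fpr, tpr))
    return pts

rng = np.random.default_rng(2)
d_rel = rng.uniform(0.0, 0.25, size=95)     # stand-in for 95 related pairs
d_unr = rng.uniform(0.35, 0.55, size=500)   # stand-in for unrelated pairs
pts = roc_points(d_rel, d_unr, np.linspace(0.0, 1.0, 101))
fprs, tprs = zip(*pts)
auc = np.trapz(tprs, fprs)   # area under the ROC curve; 1.0 for separated data
```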

Impact of Hashing Length t.
In this section, we discuss the effect of the hashing length on hashing performance via the ROC curve. The right subfigure of Figure 4 compares the ROC curves of different hashing lengths. The curve of t = 1,024 is closer to the top-left corner than the other curves (with the largest area under the ROC curve, 0.99956). Thus, t = 1,024 is a good choice for keeping a desirable tradeoff between robustness and discrimination.

Security
In this section, we analyze the security of the proposed approach. It is critical that the hash code cannot be estimated without knowing the secret key; otherwise, an adversary could conduct intentional attacks using an estimated hash code. Ideally, hash codes generated with wrong keys should be uncorrelated with the original hash code. In our experiments, a total of 10,000 hashes were generated using 10,000 wrong keys and compared with the original hash. The entire process was repeated for all 10 models in Table 3. Figure 5 shows the distributions of distances between the hashes using wrong keys and the hash using the correct key on the 10 models. The distances in all cases are close to 0.5, which means that hashes generated with wrong keys are uncorrelated with the original hash. Moreover, the minimum distance in every case is larger than the threshold 0.32, so no hash with a wrong key is misjudged.
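One way a key can decorrelate hash codes is by seeding the randomness used to binarize the features. The sketch below is purely illustrative and is NOT the paper's construction: we assume, hypothetically, a key-seeded random projection of the NTS features, so a wrong key yields an unrelated code:

```python
import hashlib
import numpy as np

def keyed_hash(features: np.ndarray, key: str, t: int = 256) -> np.ndarray:
    """Hypothetical key-dependent binarization (our assumption, not the paper's):
    the secret key seeds a random projection, then signs give the bits."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((t, features.size))   # key-dependent projection
    return (proj @ features > 0).astype(np.uint8)

features = np.random.default_rng(3).standard_normal(256)  # stand-in NTS features
h_right = keyed_hash(features, key="correct-key")
h_wrong = keyed_hash(features, key="wrong-key")
dist = float(np.mean(h_right != h_wrong))   # concentrates near 0.5
```

With independent projections, each bit agrees with probability 0.5, so the distance stays far above the 0.32 threshold.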

Adaptive Attacks
It is essential to verify the robustness of our method against adaptive attacks in which attackers are aware of the hashing scheme; for example, attackers may try to invalidate the hashing algorithm by exploiting the non-identifiability of convolutional networks through weight scaling and weight permuting attacks. We measure the accuracy (ACC) of the model and the hash distance between the attacked model and the original model when randomly extracting and scrambling p% of the weights of convolution layers. The performance of SRResNet is measured by the peak signal-to-noise ratio (PSNR) between the recovered high-resolution image and the ground truth.

Weight Scaling Attack.
To escape copy detection, attackers may scale the weights so as to change their distribution while preserving the functionality of the model. For example, an attacker can multiply the weights of one layer by a factor and divide the weights of the next layer by the same factor. However, because of the weight normalization, scaling the weights does not change the hash codes; in other words, the proposed algorithm is robust against the weight scaling attack.
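The scale invariance can be checked directly. Here we assume z-score normalization as a concrete stand-in for the paper's weight normalization (the exact normalization is our assumption); any positive scale factor then cancels out:

```python
import numpy as np

def normalize(w: np.ndarray) -> np.ndarray:
    """Z-score normalization: subtract the mean, divide by the std."""
    return (w - w.mean()) / w.std()

w = np.random.default_rng(4).standard_normal(1000)  # stand-in conv-layer weights
scaled = 2.5 * w                                    # attacker's scaled copy

# (s*w - mean(s*w)) / std(s*w) == (w - mean(w)) / std(w) for any s > 0,
# so the normalized weights, and hence the hash, are unchanged.
assert np.allclose(normalize(w), normalize(scaled))
```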

Weight Permuting Attack.
Attackers randomly select p% of the weights of a convolution layer and permute them within that layer, hoping to change the hash code of the scrambled model significantly while maintaining model performance. Table 7 shows the classification accuracies of the models in Table 3 and the hash distances between the tampered models and the original models. Each group of experiments was repeated 5 times to reduce the impact of the choice of scrambled layer. In the table, the distance measures the hash distance between the scrambled model and the original model. Almost all average model accuracies drop by at least 30% when p ≥ 15, while all average/maximum hash distances stay below the threshold 0.32. In other words, the attacker cannot make the hash code of the scrambled model significantly different from that of the original model without a significant accuracy drop, which shows that our algorithm is robust to the weight permuting attack.
The results for SRResNet, MNIST-Net, and LeNet further verify the robustness of the proposed scheme to the weight permuting attack. Results for SRResNet are summarized in Table 8, and those for MNIST-Net and LeNet in Table 9. In Table 8, all average/maximum hash distances are below the threshold 0.32, and when p ≥ 10, the PSNR drops by at least 10 dB. In Table 9, when permuting either of the two layers in MNIST-Net or LeNet, all average/maximum hash distances are below the threshold 0.32, and in most cases, the average model accuracy drops by at least 20%.
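The attack itself is easy to state precisely: shuffle a random p% subset of a layer's weights among themselves, leaving the multiset of values (and hence any permutation-insensitive statistics) unchanged. A minimal sketch on a stand-in kernel tensor (names are ours):

```python
import numpy as np

def permute_fraction(weights: np.ndarray, p: float, seed: int = 0) -> np.ndarray:
    """Randomly pick p% of a layer's entries and shuffle them among themselves,
    as in the weight permuting attack. Returns a new tensor."""
    flat = weights.flatten()                 # flatten() copies, original untouched
    rng = np.random.default_rng(seed)
    idx = rng.choice(flat.size, size=int(flat.size * p / 100), replace=False)
    flat[idx] = flat[rng.permutation(idx)]   # shuffle the selected entries
    return flat.reshape(weights.shape)

w = np.random.default_rng(5).standard_normal((64, 3, 3, 3))  # stand-in conv kernel
attacked = permute_fraction(w, p=15)

# The values are identical as a multiset; only their positions differ.
assert np.allclose(np.sort(w.ravel()), np.sort(attacked.ravel()))
```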

Adaptive Model Fine-tuning Attack.
The attacker can fine-tune the model to escape copy detection by regularizing the model loss with the negative L2-norm loss between the hash code of the fine-tuned model and the original hash. However, because the hash depends on a secret key, the attacker cannot know the true hash value.

Copy Detection Performance on the Model Library
In this section, we analyze the copy detection performance of our method on a model library containing many models. Specifically, we take 347 pretrained models from TIMM and fine-tune each for nine epochs with 500 images from a test dataset to obtain 3,470 different models. We then mix the 95 models (the 10 models in Table 3 together with their modified versions from Section 4.2) with the 3,470 models to form the model library. The 3,470 models are different from the models in Table 3, i.e., they are not common-model-modified versions. The 10 models in Table 3 are used as query models to test copy detection by comparing each of them with every model in the library.
To evaluate the copy detection performance of our scheme, we adopt the average true-positive rate (P_ATPR) and average false-positive rate (P_AFPR). P_ATPR measures how often similar models are correctly identified, and P_AFPR measures how often unrelated models are wrongly identified as similar: P_ATPR = (1/m) Σ_j n_1^j / N_1^j and P_AFPR = (1/m) Σ_j n_2^j / N_2^j, where m is the number of query models, n_1^j is the number of correctly detected copy models of the jth query model, n_2^j is the number of unrelated models mistakenly detected as copy models, N_1^j is the true number of copy models, and N_2^j is the true number of unrelated models. By default, m = 10, N_1^j = 10, and N_2^j = 3,470 + 95 − 10 = 3,555. With the threshold for identifying copy models set to th = 0.32, P_ATPR is 100% and P_AFPR is only 0.017%, which indicates that copy detection achieves a high P_ATPR and a low P_AFPR. We therefore argue that the proposed scheme has good copy detection performance on the model library.
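The retrieval step itself reduces to comparing the query hash against every library hash and keeping the entries below the threshold. A minimal sketch with toy hashes (a modified copy plus unrelated random codes; names and sizes are ours):

```python
import numpy as np

def retrieve_copies(query_hash, library_hashes, th=0.32):
    """Indices of library models whose normalized Hamming distance to the query
    is below th, i.e., the detected copies."""
    dists = [float(np.mean(query_hash != h)) for h in library_hashes]
    return [i for i, d in enumerate(dists) if d < th]

rng = np.random.default_rng(6)
q = rng.integers(0, 2, size=1024)       # query model's hash
copy = q.copy()
copy[:64] ^= 1                          # modified copy: distance 64/1024 = 0.0625
unrelated = [rng.integers(0, 2, size=1024) for _ in range(50)]
library = [copy] + unrelated
hits = retrieve_copies(q, library)      # only the modified copy is retrieved
```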

Computational Cost
We validate the practicality of the proposed scheme by evaluating its computational cost, i.e., the average execution time of hash generation. We measure the total time of extracting hashes from 357 different models (the 10 models in Table 3 and the 347 models in TIMM) and record the average time per hash. The average computation time of the proposed method is only 10.23 s using PyTorch on a single Intel Xeon Gold 5120 CPU @ 2.20 GHz, which is acceptable for practical use. The weight selection stage and the feature generation stage take 1.13 s and 9.09 s on average, respectively. We also test the average time of model copy retrieval by measuring the average time of 34,700 hash matches (the 10 models in Table 3 as query models and the 3,470 fine-tuned TIMM models from Section 4.7 as test models). The average time of a single hash match is only 2.13 × 10⁻⁵ s, which shows the efficiency of the proposed hash.

CONCLUSION AND DISCUSSION
In this article, we present a new problem: computing perceptual hashes of DNNs for model IP protection. Inspired by hash-based image retrieval methods, we design a perceptual hashing scheme and apply it to deep CNNs. Specifically, the important weights of a model are selected based on model compression theory, the NTS are then calculated on segments of the important weights, and finally the NTS features are encoded into hash codes. Experiments demonstrate that the proposed method achieves satisfactory robustness, discrimination, and security. However, some limitations remain to be addressed in future work; for example, more advanced features are needed to achieve satisfactory robustness against more challenging attacks such as knowledge distillation.