Private and Threshold Set-Intersection

In this paper we consider the problem of privately computing the set-intersection (private matching) of sets, as well as several variations on this problem: cardinality set-intersection, threshold set-intersection, and over-threshold set-intersection. Cardinality set-intersection is the problem of determining the size of the intersection set, without revealing the actual set. In threshold set-intersection, only the elements which appear at least a threshold number $t$ times in the players' private inputs are revealed. Over-threshold set-intersection is a variation on threshold set-intersection in which not only the threshold set is revealed, but also the popularity of each element in the set. Let there be $n \geq 2$ players, $c < n$ dishonestly colluding, each with a private input set of $k$ elements. Our protocols for two parties are comparable to those of Freedman, Nissim, and Pinkas [12], except in the malicious case, in which we achieve security without the use of the expensive cut-and-choose technique. Freedman, Nissim, and Pinkas also address the problem of private set-intersection for multiple parties, giving a protocol secure against honest-but-curious parties with $O(n^2 k)$ total communication complexity, but no protocol secure against malicious adversaries. We give a protocol for this scenario secure against honest-but-curious parties with $O(cnk)$ total communication complexity, and against malicious parties with communication complexity $O(n^2 k^2)$. There was no previous efficient protocol for the problems of cardinality set-intersection (for $n \geq 3$ players or $n \geq 2$ malicious players), threshold set-intersection, or over-threshold set-intersection. We give the first efficient protocols for these problems without generic secure circuit computation.


Introduction
In this paper we consider problems related to privately computing the intersection of sets: set-intersection, cardinality set-intersection, threshold set-intersection, and over-threshold set-intersection. Let each party hold a private input set. The set-intersection is the intersection of the private sets. Cardinality set-intersection is the problem of determining the size of the intersection set, without revealing the actual set. In threshold set-intersection, only the elements which appear at least a threshold number $t$ times in the players' private inputs are revealed. Over-threshold set-intersection is a variation on threshold set-intersection in which not only the threshold set is revealed, but also the number of times each element in the set appeared in the private inputs.
These problems have many applications in sharing personal or private information, such as medical databases, online dating profiles, and distributed network monitoring. We give several specific examples:
1. To determine the number of prisoners who have not paid their income tax, one would use the cardinality set-intersection protocol; the list of prisoners is private, the list of people with outstanding income tax is private, and so is the intersection set.
2. A group of people might wish to determine shared movie preferences without embarrassment, revealing only the names of movies which are liked by all members of the group, using the set-intersection protocol.
3. Companies that have had their networks scanned by hackers can determine which hackers are a significant problem to many companies by using the threshold set-intersection protocol.
4. Pharmacies can use a variation on the threshold set-intersection protocol to find people filling the same prescription at multiple pharmacies. Unfair threshold set-intersection lets only those pharmacies who filled a prescription for a cheating patient learn their identity.
5. Results of a survey to determine popular musical artists can be distorted by people's desire to avoid embarrassment, as they might not include the names of artists who they do not believe to be sufficiently 'cool' and popular. Allowing people to profess an interest in a musical artist privately, where it will be suppressed if unpopular and never linked to a specific person's preferences, might well increase the honesty of participants. The over-threshold set-intersection protocol will reveal the popular artists, and how popular they are.
Our Contributions. Freedman, Nissim, and Pinkas proposed protocols that improved substantially on the best known solution for the set-intersection problem [12]. We propose protocols that are more efficient in the malicious case for set-intersection, as well as new protocols for problems which had no previous efficient solution:
• Set-intersection protocols for the case of $n = 2$ malicious parties that do not utilize the cut-and-choose technique, and for the case of $n \geq 3$ malicious parties, for which there was no previous efficient solution
• Cardinality set-intersection protocols secure against $n \geq 2$ malicious parties and $n \geq 3$ honest-but-curious parties, for which there were no previous efficient solutions
• Threshold set-intersection protocols for $n \geq 2$ honest-but-curious parties, for which there was no previous efficient solution
• Over-threshold set-intersection protocols for the case of $n \geq 2$ honest-but-curious parties and the case of $n \geq 2$ malicious parties, for which there was no previous efficient solution
• Fair protocols for all problems
These results are summarized in Table 1. The communication complexity of protocols is shown in terms of ciphertexts. The size of these ciphertexts is $\max\left(\kappa,\ O\left(\lg |P| + \lg \frac{1}{\epsilon}\right)\right)$, where $\kappa$ represents the size of the ciphertext domain used in the protocols, and is dependent on the security parameter for the cryptosystem.
In the protocols secure against malicious parties presented in this paper, the overhead in communication complexity is a result of the size of the zero-knowledge proofs employed. These larger proofs, however, are generally built out of simple, efficient proofs, such as a proof of equality of discrete logarithms [4].
We offer security proofs for our protocols for both the honest-but-curious case (indistinguishability proofs) and the malicious case (simulation proofs). Due to space limitations, the theorems on the correctness and security of our protocols, and their proofs, are in Appendices D and F.

Problems and Models
In this section we define the problems for which we propose protocols, and discuss the adversary models under which we secure these protocols.

Problem Definitions
Assume we have $n$ parties; each has a private input set $S_i$ ($1 \leq i \leq n$) of size $k$. By engaging in the protocols for the problems defined below, every player learns the specified answer set.
Set-Intersection. All players learn the intersection of all private input sets $S_i$; that is, each player learns $\bigcap_{i=1}^{n} S_i$. No element may appear in any player's private input more than once.
Cardinality Set-Intersection. All players learn the size of the intersection set of all private input sets $S_i$; that is, each player learns $\left|\bigcap_{i=1}^{n} S_i\right|$. No element may appear in any player's private input more than once.
Threshold Set-Intersection. All players learn which elements appear in the combined private input of the players at least a threshold number $t$ times. For example, assume that $a$ appears in the combined private input of the players 15 times. If $t = 10$, then all players learn $a$. However, if $t = 16$, then no player learns $a$. An element may appear in a player's private input more than once.
We offer protocols for several variants on threshold set-intersection: unfair (in which each player learns which of his elements are in the threshold set), perfect (which conforms exactly to the definition, but is less efficient), and semi-perfect (in which the protocol is secure if the coalition does not submit more than $t - 1$ copies of any element $a$, and in which the coalition may learn that there exists some other player with at least one copy of $a$ in his private input). We do not consider the difference in security between the semi-perfect and perfect variants to be significant.
Over-Threshold Set-Intersection. All players learn which elements appear in the combined private input of the players at least a threshold number $t$ times, and the number of times each such element appears. For example, assume that $a$ appears in the combined private input of the players 15 times. If $t = 10$, then all players learn that $a$ has appeared 15 times. However, if $t = 16$, then no player learns $a$ or the number of times it has appeared. As in threshold set-intersection, an element may appear in a player's private input more than once.
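The four problem definitions can be read as the functions a trusted third party would compute on the players' inputs. The following is a minimal, non-private Python sketch of those ideal outputs; the function names are hypothetical and chosen only for illustration, and nothing here is part of the protocols themselves.

```python
from collections import Counter

def set_intersection(inputs):
    """Elements that appear in every player's private input set."""
    return set.intersection(*(set(s) for s in inputs))

def cardinality_set_intersection(inputs):
    """Only the size of the intersection set is revealed."""
    return len(set_intersection(inputs))

def over_threshold_set_intersection(inputs, t):
    """Elements appearing at least t times in the combined inputs,
    together with the number of times each one appears."""
    counts = Counter()
    for s in inputs:
        counts.update(s)  # inputs are multisets: repeats are counted
    return {a: v for a, v in counts.items() if v >= t}

def threshold_set_intersection(inputs, t):
    """Same threshold set, but without the counts."""
    return set(over_threshold_set_intersection(inputs, t))

if __name__ == "__main__":
    multisets = [["a", "b", "a"], ["a", "c"], ["b", "a"]]
    print(threshold_set_intersection(multisets, 3))
    print(over_threshold_set_intersection(multisets, 2))
```

In this toy input the element "a" appears four times in total and "b" twice, so "a" is the only member of the threshold set for t = 3, while both are reported with their counts for t = 2.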

Adversary Models
In this paper, we present protocols for the aforementioned problems in two standard adversary models. We give only an intuitive notion of each. These notions are formalized in [15].
Honest-But-Curious. An honest-but-curious player is assumed to follow the protocol exactly. Security in this model is straightforward: no player or coalition of players (who cheat by sharing their private information) gains information which is not inherent in the output of the calculated function. Formally, consider an ideal scenario in which a trusted third party receives the input of each party, calculates the function output, and broadcasts it to each player. We require that when the parties perform the protocol, each party does not learn information it does not learn in the ideal case, nor can any player or cheating coalition distinguish between possible sets of private input sets held by non-cheating players, where each possible set of input sets results in the same answer set.
Malicious. A malicious player (or coalition of such players) will do anything within their power to extract extra information in the course of the protocol. We cannot, however, prevent malicious players from choosing their 'private input' arbitrarily or refusing to participate in the protocol at any point. The standard definition of security in this case does not require enforcement against these actions [15]. Instead, the definition of security compares the ideal model (where the trusted third party computes the function, but malicious parties may submit any value as their 'private input') with performing the protocol in the real scenario. If the protocol is secure, we can construct a simulation that translates any strategy of the coalition of malicious players in the real model into the ideal model such that the coalition gains computationally indistinguishable information in the two scenarios.

Preliminaries
In this section, we introduce the mathematical and cryptographic tools that we use to construct our protocols.

Additively Homomorphic Cryptosystem
In this paper we utilize a semantically-secure [16], additively homomorphic public-key cryptosystem. Let $E_{pk}(\cdot)$ denote the encryption function with public key $pk$. The cryptosystem supports the following two operations that can be performed without knowledge of the private key: (1) Given the encryptions of $a$ and $b$, $E_{pk}(a)$ and $E_{pk}(b)$, we can efficiently compute the encryption of $a + b$, denoted as $E_{pk}(a + b) := E_{pk}(a) +_h E_{pk}(b)$; (2) Given a constant $c$ and the encryption $E_{pk}(a)$, we can efficiently compute the encryption of $ca$, denoted as $E_{pk}(c \cdot a) := c \times_h E_{pk}(a)$. Moreover, we also require the homomorphic public-key cryptosystem to support ciphertext re-randomization, i.e., one can transform a ciphertext into a different ciphertext encrypting the same plaintext, in such a way that it is difficult to determine that it is a transformation of the original ciphertext.
We also require the homomorphic public-key cryptosystem to support secure $(n, n)$-threshold decryption, i.e., the corresponding private key is shared by a group of $n$ players, and the decryption is performed by all players acting together, but cannot be performed by fewer than $n$ players. In our protocols for the malicious case, we require that the decryption protocol be secure against malicious players; typically, this is done by requiring each player to prove in zero-knowledge that he has followed the threshold decryption protocol correctly [14].
In the protocols we give for the malicious case, we also require the homomorphic cryptosystem to allow efficient zero-knowledge proofs of plaintext knowledge and zero-knowledge proofs for the correctness of certain operations, as detailed in Section A.1.
In the rest of this paper, we simply use $E_{pk}(\cdot)$ to denote the encryption function of a homomorphic cryptosystem which satisfies all the aforementioned properties; one could use Paillier's cryptosystem for the concrete instantiation.
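To make the two homomorphic operations and re-randomization concrete, here is a minimal sketch of textbook Paillier in Python (3.8 or later, for modular inverses via `pow`). It is an illustration under loud assumptions: the primes are tiny and insecure, a single party holds the whole private key (no threshold decryption), and the function names `keygen`, `encrypt`, `decrypt`, `add_h`, `mul_h`, and `rerandomize` are hypothetical.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation; p and q are tiny illustrative primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lambda mod n (valid for g = n + 1)
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """E(m) = (1 + n)^m * r^n mod n^2 for a random unit r."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, sk, c):
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_h(n, c1, c2):
    """E(a) +_h E(b): multiplying ciphertexts adds plaintexts."""
    return (c1 * c2) % (n * n)

def mul_h(n, k, c):
    """k x_h E(a): exponentiating a ciphertext scales the plaintext."""
    return pow(c, k % n, n * n)

def rerandomize(n, c, rng=random):
    """Fresh-looking ciphertext of the same plaintext: add E(0)."""
    return add_h(n, c, encrypt(n, 0, rng))

if __name__ == "__main__":
    n, sk = keygen()
    ca, cb = encrypt(n, 20), encrypt(n, 22)
    print(decrypt(n, sk, add_h(n, ca, cb)))   # plaintext sum
    print(decrypt(n, sk, mul_h(n, 3, ca)))    # plaintext times constant
```

Both operations use only the public modulus, which is what lets players compute on each other's encrypted data without the private key.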

Polynomials
Let $R$ denote the domain of the plaintext of the homomorphic public-key cryptosystem (in Paillier's cryptosystem, $R$ is $\mathbb{Z}_N$). We can define the polynomial ring $R[x]$ where the coefficients of the polynomials are from $R$. Let $f$ be a polynomial in $R[x]$, $f(x) = \sum_{i=0}^{\deg(f)} f_i x^i$. We define the formal derivative of $f$ as $f'(x) = \sum_{i=0}^{\deg(f)-1} (i + 1) f_{i+1} x^i$, and the $d$th formal derivative $f^{(d)}$ as the result of taking the formal derivative sequentially $d$ times. We also define the encryption of polynomial $f$ as the ordered list of the encryptions of its coefficients $E_{pk}(f_0), \ldots, E_{pk}(f_{\deg(f)})$ under the homomorphic cryptosystem.

Algorithms for Operations on Encrypted Polynomials
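Let $f$, $g$, and $h$ be polynomials in $R[x]$ with coefficients $f_i$, $g_i$, and $h_i$ (a coefficient whose index is out of range is taken to be 0). Let $a$ and $b$ be elements in $R$. Using the homomorphic properties of the homomorphic cryptosystem, we can efficiently perform the following operations on encrypted polynomials without the knowledge of the private key:
• Evaluation of an encrypted polynomial at an unencrypted point: given the encryption of polynomial $f$, we can efficiently compute the encryption of $b := f(y)$, by calculating $E_{pk}(b) := (y^0 \times_h E_{pk}(f_0)) +_h (y^1 \times_h E_{pk}(f_1)) +_h \cdots +_h (y^{\deg(f)} \times_h E_{pk}(f_{\deg(f)}))$.
• Sum of encrypted polynomials: given the encryptions of polynomials $f$ and $g$, we can efficiently compute the encryption of the polynomial $h := f + g$, by calculating $E_{pk}(h_i) := E_{pk}(f_i) +_h E_{pk}(g_i)$ for $0 \leq i \leq \max(\deg(f), \deg(g))$.
• Product of an unencrypted polynomial and an encrypted polynomial: given a polynomial $g$ and the encryption of polynomial $f$, we can efficiently compute the encryption of polynomial $h := f * g$ (also denoted $g *_h E_{pk}(f)$) by calculating the encryption of each coefficient $E_{pk}(h_i) := (g_0 \times_h E_{pk}(f_i)) +_h (g_1 \times_h E_{pk}(f_{i-1})) +_h \cdots +_h (g_i \times_h E_{pk}(f_0))$ for $0 \leq i \leq \deg(f) + \deg(g)$.
• Derivative of an encrypted polynomial: given the encryption of polynomial $f$, we can efficiently compute the encryption of polynomial $h := \frac{d}{dx} f$, by calculating the encryption of each coefficient $E_{pk}(h_i) := (i + 1) \times_h E_{pk}(f_{i+1})$ for $0 \leq i \leq \deg(f) - 1$.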
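The four operations above can be sketched on top of any additively homomorphic scheme. The Python sketch below (3.8 or later) inlines a compact toy Paillier with tiny, insecure primes so that it runs on its own; an encrypted polynomial is a list of encrypted coefficients, lowest degree first. All names (`enc`, `dec`, `eval_enc`, `add_enc`, `mul_plain_enc`, `deriv_enc`) are hypothetical and serve only as illustration.

```python
import math
import random

# Compact toy Paillier (tiny primes, illustration only).
P_, Q_ = 1009, 1013
N = P_ * Q_
N2 = N * N
LAM = (P_ - 1) * (Q_ - 1) // math.gcd(P_ - 1, Q_ - 1)
MU = pow(LAM, -1, N)

def enc(m):
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(N + 1, m % N, N2) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def add_h(c1, c2):
    return c1 * c2 % N2

def mul_h(k, c):
    return pow(c, k % N, N2)

def enc_poly(f):
    return [enc(c) for c in f]

def dec_poly(ef):
    return [dec(c) for c in ef]

def eval_enc(ef, y):
    """E(f(y)) from E(f) and a public point y."""
    acc = enc(0)
    for i, c in enumerate(ef):
        acc = add_h(acc, mul_h(pow(y, i, N), c))
    return acc

def add_enc(ef, eg):
    """E(f + g), coefficient by coefficient."""
    size = max(len(ef), len(eg))
    ef = ef + [enc(0) for _ in range(size - len(ef))]
    eg = eg + [enc(0) for _ in range(size - len(eg))]
    return [add_h(a, b) for a, b in zip(ef, eg)]

def mul_plain_enc(g, ef):
    """E(f * g) from a public polynomial g and E(f)."""
    h = [enc(0) for _ in range(len(g) + len(ef) - 1)]
    for i, gi in enumerate(g):
        for j, cj in enumerate(ef):
            h[i + j] = add_h(h[i + j], mul_h(gi, cj))
    return h

def deriv_enc(ef):
    """E(f'): coefficient i of f' is (i + 1) * f_{i+1}."""
    return [mul_h(i + 1, ef[i + 1]) for i in range(len(ef) - 1)]

if __name__ == "__main__":
    ef = enc_poly([6, -5, 1])          # f(x) = (x - 2)(x - 3)
    print(dec(eval_enc(ef, 2)), dec(eval_enc(ef, 4)))
    print(dec_poly(deriv_enc(ef)))     # coefficients of 2x - 5 mod N
```

Negative coefficients are stored modulo N, so the decrypted coefficient of -5 appears as N - 5.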

Polynomial Factoring
Some protocols in this paper can yield better efficiency if efficient polynomial factoring is possible without knowledge of the private key. If $R$ is a field of known order, then we can have efficient polynomial factoring [28]. If $R$ is a ring with $s$ subfields of known order, then we can have efficient polynomial factoring through polynomial factoring in the subfields. In this case, however, we may obtain a larger number of roots than the degree of the polynomial (in particular, there can be $k^s$ roots for a polynomial of degree $k$), which will make polynomial factoring more computationally expensive. Moreover, we are not aware of an additively homomorphic cryptosystem whose domain is a field of publicly-known order larger than $GF(2)$. Note that we cannot simply use a single subfield of the plaintext ring in the otherwise acceptable Paillier cryptosystem; doing so requires revealing that subfield, and thus the factorization of $N$ from which the scheme's security is drawn. The Naccache-Stern cryptosystem gives subfields of publicly-known orders, but cannot include a large subfield, making factoring inefficient [23].
Thus in this paper, when appropriate, we give protocols for both the cases where we assume efficient polynomial factoring is possible and where we do not use polynomial factoring. For protocols where we assume efficient polynomial factoring is possible, we use a technique called simulated calculations over a field. Instead of drawing coefficients directly from a field $\mathbb{Z}_p$, we may instead choose the homomorphic cryptosystem to have a sufficiently large plaintext domain (say $\mathbb{Z}_m$) so as to perform all calculations without ever "wrapping around" (being taken modulo $m$). Threshold decryption can be performed so as to only reveal the element modulo $p$ [20]. However, this is generally quite inefficient. For example, for the over-threshold set-intersection protocol (with polynomial factoring) described in this paper, the cryptosystem must have a domain that is of size approximately n lg |P | + kn 2 where $P$ is the valid set, $n$ the number of players, $k$ the size of each player's private input, and $\epsilon$ the negligible probability that a random element will represent a member of $P$. The protocols that require polynomial factoring, however, may have certain advantages over the alternate protocols, such as removing the need for mix-nets and reducing the number of rounds in the protocol. Also, in the future, when new homomorphic cryptosystems are developed that have the required properties of enabling efficient polynomial factoring without the private key, they can be directly plugged into our protocols to achieve better efficiency.

Other Tools
Key-Private Cryptosystem. Given a ciphertext, any player who does not hold the private key cannot distinguish which key was used to create it [2].
Equivocal Commitment. A standard commitment scheme allows parties to give a "sealed envelope" that can be later opened to reveal exactly one value. We use an equivocal commitment scheme in our protocols, where the simulator can open the 'envelope' to an arbitrary value without being detected by the adversary [19,22].
Mix-Net and Shuffling Protocol. Either a standard mix-net [5,17,8,13,25] or the multi-party shuffling computation given in this paper (Fig. 9) can be used to distribute data to all players without revealing the origin.
Hash Function. In this paper, let $h(\cdot)$ denote a hash function from $\{0, 1\}^*$ to $\{0, 1\}^{\ell}$ ($\ell = \lg \frac{1}{\epsilon}$, where $\epsilon$ is a probability parameter chosen to be negligible). This hash function maps to each output bitstring with uniform and $\omega$-wise independent probability (where $\omega$ is polynomial in $nk$). This can be approximated by a cryptographic hash function.

Notation
• $P$ - the set of elements which can be members of a private input set
• $k$ - the number of elements in each private input set
• $n$ - the number of players in a protocol
• $t$ - threshold number; an element must appear $t$ times in private input sets to be included in the threshold set
• $E_{pk}(\cdot)$ - encryption under the additively homomorphic, public-key cryptosystem to which all players share a secret key
• $p^{(d)}$ - the result of taking the formal derivative of $p$ $d$ times
• $\gcd(p, q)$ - the greatest common divisor of $p$, $q$
• $S_j$ - the $j$th element of the set $S$ under some arbitrary ordering
• $\mathrm{Dom}(q)$ - the domain of function $q$

Overview and Mathematical Intuition
In this section, we give the overview and the mathematical intuition of our protocols.

Polynomial Representation of Sets
In our problem setting, there are $n$ players, each with a private input set $S_i$, where $|S_i| = k$. We denote the domain of the elements in these sets as $P$ ($\forall_{i \in [n]}\ S_i \subseteq P$).
Let $R$ denote the plaintext domain of the homomorphic cryptosystem we use in our protocols. $R$ must be larger than $P$, so that a random element drawn from the plaintext domain has only negligible probability of representing an element of $P$. For example, we could require that only elements of the form $a \,||\, h(a)$, $a \in P$, could represent elements in $P$. If $|h(\cdot)| = \lg \frac{1}{\epsilon}$, then there is only $\epsilon$ probability that a random element from the plaintext domain represents an element of $P$.
We represent a set of elements as a polynomial in $R[x]$ where the elements of the set are the roots of the polynomial. For example, given a set of $k$ elements $S_i = \{(S_i)_j\}_{1 \leq j \leq k}$, we construct its polynomial representation as $f_i(x) = \prod_{1 \leq j \leq k} (x - (S_i)_j)$.
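The encoding $a \,||\, h(a)$ and the polynomial representation can be sketched in a few lines of Python. This is a toy under stated assumptions: SHA-256 truncated to 32 bits stands in for $h$, the Mersenne prime $2^{127} - 1$ stands in for the plaintext modulus, and the names `encode`, `is_valid`, `poly_from_set`, and `evaluate` are hypothetical.

```python
import hashlib

ELL = 32            # hash length in bits, playing the role of lg(1/epsilon)
M = 2**127 - 1      # stand-in for the plaintext modulus (assumption)

def h(a: int) -> int:
    """Toy hash: first ELL bits of SHA-256 of the decimal string of a."""
    digest = hashlib.sha256(str(a).encode()).digest()
    return int.from_bytes(digest[: ELL // 8], "big")

def encode(a: int) -> int:
    """Representation a || h(a) as a single integer."""
    return (a << ELL) | h(a)

def is_valid(x: int) -> bool:
    """True if x has the form a || h(a), i.e. it represents an element."""
    return encode(x >> ELL) == x

def poly_from_set(S):
    """Coefficients (lowest degree first) of prod_j (x - encode(S_j)) mod M."""
    f = [1]
    for a in S:
        root = encode(a)
        g = [0] * (len(f) + 1)
        for i, c in enumerate(f):
            g[i] = (g[i] - root * c) % M      # contribution of -root * c
            g[i + 1] = (g[i + 1] + c) % M     # contribution of x * c
        f = g
    return f

def evaluate(f, x):
    """Horner evaluation of f at x, modulo M."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % M
    return acc

if __name__ == "__main__":
    f = poly_from_set([3, 7, 11])
    print([evaluate(f, encode(a)) for a in (3, 7, 11)])  # all roots
    print(is_valid(encode(7)), is_valid(encode(7) ^ 1))
```

Because only a $2^{-32}$ fraction of values pass `is_valid` in this toy, a uniformly random plaintext is unlikely to be mistaken for a set element; the protocols rely on the same idea with a negligible $\epsilon$.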

Mathematical Intuition
By representing sets of elements as polynomials, we can use mathematical properties of polynomials to help compute the different variations of the set-intersection and threshold set-intersection problems:
Intersection using polynomial addition. When two polynomials $f$ and $g$ are added, the shared roots of $f$ and $g$ will be preserved: $(f(a) = 0) \wedge (g(a) = 0) \rightarrow (f + g)(a) = 0$. This approximately represents an intersection operator.
Union using polynomial multiplication. Multiplying two such polynomials approximately represents a union operator that preserves duplicates: $(f(a) = 0) \vee (g(a) = 0) \rightarrow (f * g)(a) = 0$.
Removing duplicate elements using polynomial derivative. To reduce the number of duplicate elements represented in a polynomial $p$ by $d$, one may take the $d$th derivative of the polynomial $p$, denoted as $p^{(d)}$, as formalized by Theorem 1 (proof given in Section B). Informally, a linear factor $(x - a)$ that appears $v > d$ times in $p$ appears $v - d$ times in $p^{(d)}$.
Masking non-common factors of polynomials. To hide factors which are not shared between two polynomials $f$ and $g$, we select two random polynomials $r$ and $s$ in $R[x]$ with sufficiently high degree, and calculate $f * r + g * s$. The resulting polynomial completely hides all information except for those factors shared between $f$ and $g$, as formalized by Theorem 2 (proof given in Section B), whose conclusion is that $\forall_{0 \leq i \leq x+y}$ the coefficients $u_i$ are distributed uniformly and independently over $R$.
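The four properties can be checked numerically on unencrypted polynomials. The sketch below works modulo the prime $2^{61} - 1$, an assumption made for simplicity (the protocols work over the plaintext ring of the cryptosystem), and uses raw integers as set elements rather than the $a \,||\, h(a)$ encoding. Helper names are hypothetical.

```python
import random

Q = 2**61 - 1   # prime modulus standing in for the plaintext domain (assumption)

def mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % Q
    return h

def add(f, g):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [(a + b) % Q for a, b in zip(f, g)]

def from_roots(roots):
    f = [1]
    for a in roots:
        f = mul(f, [(-a) % Q, 1])      # multiply by (x - a)
    return f

def deriv(f, d=1):
    for _ in range(d):
        f = [(i * c) % Q for i, c in enumerate(f)][1:] or [0]
    return f

def evaluate(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % Q
    return acc

def rand_poly(deg, rng):
    return [rng.randrange(Q) for _ in range(deg + 1)]

if __name__ == "__main__":
    rng = random.Random(0)
    f, g = from_roots([2, 3, 5]), from_roots([3, 5, 7])
    # Intersection with masking: f*r + g*s keeps only the shared roots 3 and 5.
    masked = add(mul(f, rand_poly(3, rng)), mul(g, rand_poly(3, rng)))
    print([evaluate(masked, a) == 0 for a in (2, 3, 5, 7)])
    # Union with duplicates: f*g has roots 2, 3, 3, 5, 5, 7.
    union = mul(f, g)
    # One derivative removes one copy of each root: 3 and 5 survive, 2 and 7 do not.
    print([evaluate(deriv(union), a) == 0 for a in (2, 3, 5, 7)])
```

A non-shared root could still evaluate to zero in the masked sum if a random polynomial happened to vanish there, but over a domain this large that chance is negligible, which matches the "overwhelming probability" language used in the paper.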

Overview of Protocols
In all our protocols, we have $n$ players, and each player has an input of a set of $k$ elements $S_i = \{(S_i)_j\}_{1 \leq j \leq k}$. We call the polynomial representation of each player's input set $S_i$ its (input) polynomial, $f_i(x) = \prod_{1 \leq j \leq k} (x - (S_i)_j)$. We give a brief overview of our protocols below.
In the set-intersection protocol (Fig. 1), the players add all their input polynomials to obtain a polynomial that preserves the roots that appear in each private input set $S_i$. By multiplying the polynomials $f_i$ by the random polynomials $r_{i+j,j}$ before adding the products $f_i * r_{i+j,j}$ together to obtain $p = \sum_{i=1}^{n} f_i * \sum_{j=0}^{c} r_{i+j,j}$, they hide all information about the input sets except for the intersection set.
The cardinality set-intersection protocol (Fig. 5) calculates $p$ in the same way as the set-intersection protocol, but instead of decrypting the polynomial, all players evaluate the encrypted polynomial at the elements of their input sets, to determine the number of roots of the polynomial.
In the over-threshold set-intersection protocol with polynomial factoring (Fig. 2), the players multiply the polynomials $f_i = \prod_{j=1}^{k} (x - (S_i)_j)$ to obtain $p = \prod_{i=1}^{n} f_i$, which represents a union of all private input elements, with duplicates preserved. Taking the derivative of a polynomial reduces the number of duplicates, so $p^{(t-1)}$ only retains representations of those input elements which appeared at least $t$ times in $p$. By multiplying the random polynomials $r_i$ and $s_i$ by $p^{(t-1)}$ and $p$, all information about the input sets except for the threshold set (and how many times each element in it is repeated) is completely hidden.
The over-threshold protocol without polynomial factoring (Fig. 3) and threshold set-intersection protocol (Fig. 4) calculate a polynomial representing the threshold set in the same way as the over-threshold protocol with polynomial factoring, but instead of decrypting the polynomial, all players evaluate the encrypted polynomial at the elements of their input sets, to determine which are roots of the polynomial.

Protocols for the Honest-But-Curious Case
We present protocols for the set-intersection problems in this section, which are secure against honest-but-curious players.

Set-intersection
The first protocol we present is for the set-intersection problem, given in Fig. 1. In step 1, each player $i$ ($1 \leq i \leq n$) first calculates his input polynomial $f_i$ and sends the encryption of $f_i$ to $c$ other players (where $c$ is the maximum dishonest coalition size), making $c + 1$ players in all who have this encrypted polynomial. Each of these players $i + j$ ($0 \leq j \leq c$) chooses a random polynomial $r_{i+j,j}$, and computes the encryption of $f_i * r_{i+j,j}$. Note that, as no coalition of players can be of size $c + 1$, not all of the random polynomials multiplied by any polynomial $f_i$ can be known to the dishonest coalition. Each player $i \in [n]$ adds their polynomials to produce $\phi_i$, and these polynomials are then added in steps 2-3 to produce $p = \sum_{i=1}^{n} f_i * \sum_{j=0}^{c} r_{i+j,j}$.
If $a$ is a root of some polynomial $f_i$, it is also a root of $f_i * r$ for a random polynomial $r$, and thus all members of the private input sets are preserved as roots when the random polynomials are multiplied in. If $a$ is a root of the polynomials $f$ and $g$, then it is also a root of the polynomial $f + g$. Thus, when all the polynomials $f_i * \sum_{j=0}^{c} r_{i+j,j}$ are added together, if an element was in every private input set, its representation will be a root of the final polynomial. If it was not, the use of the random polynomials ensures that it is both hidden and does not appear as a root with overwhelming probability (see Theorem 2). The players jointly decrypt this result polynomial and each player tests to see if the representation of each member of their private input set is a root of the polynomial. Thus every player learns the set-intersection of all players' private inputs.
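The arithmetic of the set-intersection protocol can be simulated on unencrypted polynomials to see why the final root test works. The Python sketch below is not the protocol itself: there is no encryption, no message passing, and no threshold decryption. It assumes a prime modulus, raw integer elements, and random polynomials of degree $k$; all of these choices, and the function names, are illustrative assumptions.

```python
import random

Q = 2**61 - 1   # prime modulus standing in for the plaintext domain (assumption)

def mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % Q
    return h

def add(f, g):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [(a + b) % Q for a, b in zip(f, g)]

def from_roots(roots):
    f = [1]
    for a in roots:
        f = mul(f, [(-a) % Q, 1])
    return f

def evaluate(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % Q
    return acc

def rand_poly(deg, rng):
    return [rng.randrange(Q) for _ in range(deg + 1)]

def set_intersection_sim(input_sets, c, rng):
    """Plaintext simulation of p = sum_i f_i * sum_{j=0..c} r_{i+j,j}."""
    k = len(input_sets[0])
    p = [0]
    for S in input_sets:
        f_i = from_roots(S)
        mask = [0]
        for _ in range(c + 1):       # one random polynomial per holder of E(f_i)
            mask = add(mask, rand_poly(k, rng))
        p = add(p, mul(f_i, mask))
    # After "decryption", each player tests only his own elements.
    return [{a for a in S if evaluate(p, a) == 0} for S in input_sets]

if __name__ == "__main__":
    sets = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
    print(set_intersection_sim(sets, c=1, rng=random.Random(1)))
```

Each player's answer is computed from the shared polynomial and his own set only, which mirrors the last step of the protocol.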

Over-Threshold Set-Intersection (With Polynomial Factoring)
Protocol: Set-Intersection-HBC
Input: There are $n \geq 2$ honest-but-curious players, $c < n$ dishonestly colluding, each with a private input set $S_i$, such that $|S_i| = k$. The players share the secret key $sk$, to which $pk$ is the corresponding public key to a homomorphic cryptosystem (Sec. 3.2.2).
5. All players perform a group decryption to obtain the polynomial $p$.
6. Each player $i = 1, \ldots, n$ may determine which of his items $(S_i)_j$ are in the intersection of all private inputs as follows: if $p((S_i)_j) = 0$, then it is in the intersection; otherwise it is not.
Figure 1: Set-intersection protocol for honest-but-curious parties.

Protocol: OverThreshold-Factor-HBC
Input: There are $n \geq 2$ honest-but-curious players, $c < n$ dishonestly colluding, each with a private input set $S_i$, such that $|S_i| = k$. The players share the secret key $sk$, to which $pk$ is the corresponding public key for a homomorphic cryptosystem (Sec. 3.2.2). The threshold number of repetitions at which an element appears in the output is $t$. $F$ is a fixed polynomial of degree $t - 1$ which has no roots representing elements of $P$.

The protocol for the over-threshold set-intersection is given in Fig. 2. To recover the answer set, the players must be able to factor polynomials (see Sec. 3.2.2). In this protocol the players calculate the product of the polynomials $f_i$ created from their private inputs, creating a polynomial $p = \prod_{i=1}^{n} f_i$. Duplicate elements are preserved in $p$. Players $1, \ldots, c + 1$ then calculate the $(t-1)$th derivative of that polynomial, reducing the number of repetitions of each linear factor in $p$ by $t - 1$ (see Theorem 1). Each player $i \in [c + 1]$ chooses random polynomials $r_i$ and $s_i$, and calculates $p * r_i + F * p^{(t-1)} * s_i$, where $F$ is a polynomial used to pad $p^{(t-1)}$ to the same degree as $p$. All players then add these polynomials to obtain $\Phi = p * \sum_{i=1}^{c+1} r_i + F * p^{(t-1)} * \sum_{i=1}^{c+1} s_i$. Each linear factor $(x - a)$ which appears $v \geq t$ times in $p$ appears $v - t + 1$ times in $p^{(t-1)}$, and for each element in the threshold set which appeared $v \geq t$ times in the private inputs, $(x - a)^{v-t+1} \mid \Phi$. When $\Phi$ is factored, these linear factors can be recovered, so that every player learns the solution to the over-threshold problem. The random polynomials $\sum_{i=1}^{c+1} r_i$ and $\sum_{i=1}^{c+1} s_i$ ensure all information except for the answer set is hidden.
To see that, with overwhelming probability, no extra linear factors representing private set elements are included in $\Phi$, we may note that unless a factor appears in both $p$ and $p^{(t-1)}$, with overwhelming probability it will not appear in $\Phi$ (see Theorem 2). $F$ is chosen so as to exclude possible factors of $p$, and the only factors that appear in both $p$ and $p^{(t-1)}$ are those that represent the threshold set, with $v - t + 1$ repetitions in $p^{(t-1)}$ if there were $v \geq t$ repetitions in $p$ (see Theorem 1).
This protocol requires polynomial factoring, but has two advantages over the version which does not: it does not need mix-nets and utilizes fewer rounds.
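The construction of $\Phi$ can be simulated on unencrypted polynomials. The sketch below is a toy under several assumptions: it works modulo a large prime, uses raw integers as elements, takes $F = x^{t-1}$ (so 0 must not be a valid element), gives the random polynomials the same degree as $p$, and, in place of polynomial factoring, scans a small candidate universe and measures the multiplicity of each root. The swapped-in root scan and all function names are illustrative, not the paper's method.

```python
import random

Q = 2**61 - 1   # prime modulus standing in for the plaintext domain (assumption)

def mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % Q
    return h

def add(f, g):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [(a + b) % Q for a, b in zip(f, g)]

def from_roots(roots):
    f = [1]
    for a in roots:
        f = mul(f, [(-a) % Q, 1])
    return f

def deriv(f, d=1):
    for _ in range(d):
        f = [(i * c) % Q for i, c in enumerate(f)][1:] or [0]
    return f

def evaluate(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % Q
    return acc

def rand_poly(deg, rng):
    return [rng.randrange(Q) for _ in range(deg + 1)]

def multiplicity(f, a):
    """Number of times (x - a) divides f; valid since Q exceeds deg(f)."""
    m = 0
    while any(f) and evaluate(f, a) == 0:
        f = deriv(f)
        m += 1
    return m

def over_threshold_sim(input_sets, t, c, universe, rng):
    p = [1]
    for S in input_sets:
        p = mul(p, from_roots(S))            # union with duplicates
    deg = len(p) - 1
    F = from_roots([0] * (t - 1))            # x^(t-1): pads p^(t-1) to deg(p)
    r, s = [0], [0]
    for _ in range(c + 1):                   # sums of c + 1 random polynomials
        r = add(r, rand_poly(deg, rng))
        s = add(s, rand_poly(deg, rng))
    phi = add(mul(p, r), mul(mul(F, deriv(p, t - 1)), s))
    found = {}
    for a in universe:                       # stand-in for factoring phi
        m = multiplicity(phi, a)
        if m > 0:
            found[a] = m + t - 1             # v = (v - t + 1) + (t - 1)
    return found

if __name__ == "__main__":
    sets = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(over_threshold_sim(sets, t=2, c=1, universe=range(1, 10),
                             rng=random.Random(7)))
```

The count of each element is recovered by adding $t - 1$ back to its multiplicity in $\Phi$, following the statement that a factor appearing $v \geq t$ times in $p$ appears $v - t + 1$ times in $p^{(t-1)}$.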

Over-Threshold Set-Intersection (Without Polynomial Factoring)
The protocol for the over-threshold set-intersection problem is presented in Fig. 3. This protocol does not require polynomial factoring. In this protocol, each player creates a polynomial $f_i$ with roots representing the elements of their private input. The players then calculate the encrypted polynomial $\Phi = p * \sum_{i=1}^{c+1} r_i + F * p^{(t-1)} * \sum_{i=1}^{c+1} s_i$, where $p = \prod_{i=1}^{n} f_i$ and $r_i$ and $s_i$ are chosen randomly. As described in Section 5.2, the roots of this polynomial either (1) are exactly those that represent elements in the threshold set, or (2) are random, and thus with overwhelming probability are irrelevant, as they do not represent elements from the valid set $P$.
Each player then computes the encrypted evaluation of $\Phi$ at the points that represent their private input. With overwhelming probability, each such encrypted evaluation is an encryption of 0 if that element is in the threshold set, and non-zero otherwise. The encrypted element $(V_i)_j$ calculated from this encrypted evaluation is thus either: (1) an encryption of the private input element $(S_i)_j$ (if $(S_i)_j$ is in the threshold set) or (2) an encryption of a random element (otherwise). (This technique is related to the conditional disclosures of [1].) These ciphertexts are shuffled and then decrypted, revealing each element $a$ that appears in the threshold set, with as many repetitions $c \geq t$ as appeared in the initial private inputs. The other decrypted elements are random, and thus with overwhelming probability do not represent members of the valid set $P$. This both hides unpopular elements from the players' private inputs and ensures that incorrect elements are not inserted into the answer set.
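The step that turns an evaluation of $\Phi$ into "the element itself or a random value" can be illustrated on plaintexts. The sketch below assumes the blinded value has the form $r \cdot \Phi(a) + a$ for a fresh random $r$; this formula is an assumption made for illustration (it is one common way to realise a conditional disclosure) and is not quoted from the text above. The polynomial `phi` in the usage example is a hand-picked stand-in for the polynomial the protocol would compute, and all names are hypothetical.

```python
import random
from collections import Counter

Q = 2**61 - 1   # prime modulus standing in for the plaintext domain (assumption)

def evaluate(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % Q
    return acc

def conditional_disclosure(phi, a, rng):
    """Returns a if phi(a) = 0, otherwise a uniformly blinded value.
    Assumed form: r * phi(a) + a with random non-zero r."""
    r = rng.randrange(1, Q)
    return (r * evaluate(phi, a) + a) % Q

def reveal_over_threshold(phi, input_sets, valid, rng):
    """Blind every input element, shuffle, and keep values that are valid."""
    V = [conditional_disclosure(phi, a, rng) for S in input_sets for a in S]
    rng.shuffle(V)               # stand-in for the mix-net / shuffling protocol
    return Counter(v for v in V if v in valid)

if __name__ == "__main__":
    # Stand-in for Phi: roots exactly at the threshold elements 2 and 3.
    phi = [6, (-5) % Q, 1]                       # (x - 2)(x - 3)
    inputs = [[1, 2], [2, 3], [3, 2]]
    print(reveal_over_threshold(phi, inputs, valid={1, 2, 3},
                                rng=random.Random(5)))
```

Threshold elements come out once per copy in the inputs, so their counts are revealed, while the blinded value for the unpopular element 1 lands outside the valid set with overwhelming probability.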

Threshold Set-Intersection
The protocol for the threshold set-intersection problem (semi-perfect variant), given in Fig. 4, proceeds largely as the protocol for over-threshold set-intersection given in Fig. 3. We explain other variants and do not provide figures due to space constraints. Each player constructs encryptions of the elements $\Phi((S_i)_j)$ from his private input set in step 6.
Unfair. Instead of continuing the protocol as written, the players decrypt the encrypted elements $\Phi((S_i)_j)$ immediately. This decryption must take place in such a way that only player $i$ learns the element $\Phi((S_i)_j)$. Typically, parties produce decryption shares and reconstruct the element from them; player $i$ simply retains his decryption share, so that only he learns the decryption. Thus each player learns which of his elements appear in the threshold set.
Semi-Perfect The encrypted element (U i ) j calculated from the encrypted evaluation of Φ((S i ) j ) is either: (1) an encryption of the private input element (S i ) j (if (S i ) j is in the intersection set) or (2) an encryption of a random element (otherwise).However, the player also constructs a corresponding encrypted tag for each Protocol: OverThreshold-NoFactor-HBC Input: There are n ≥ 2 honest-but-curious players, c < n dishonestly colluding, each with a private input set S i , such that |S i | = k.The players share the secret key sk, to which pk is the corresponding public key to a homomorpic cryptosystem (Sec.3.2.2).The threshold number of repetitions at which an element appears in the output is t ≥ 2. F is a fixed polynomial of degree t − 1 which has no roots representing elements of P .
7. All players perform shuffling on their private input sets V i either through use of a mix-net or by using the shuffling protocol given in Fig. 9, obtaining a joint set V .(U i ) j , T ij .We require that the cryptosystem used to construct these tags be key-private, so that the origin of ciphertext pairs T, U cannot be ascertained by the key used to construct the tags.The players then correctly obtain a decryption of each element in the threshold set exactly once.Any other time a ciphertext U for an element in the threshold set is decrypted, a player sabotages it.In group decryption schemes, players generally produce shares of the decrypted element; if one player sends a random share instead of a valid one, the decrypted element is random and the decryption is sabotaged.To ensure an encryption of an element in the threshold set is not decrypted once the element is known to be in the threshold set, a player sabotages the decryption under the following conditions: (1) he can decrypt the tag to a || h(a) for some a and (2) a has already been determined to be a member of the threshold set.All other ciphertexts should be correctly decrypted; either they are encryptions of elements in the threshold set which have not yet been decrypted, or they are encryptions of random elements.
Note that this is the only protocol proposed in this paper with a non-constant number of rounds. Because of the need to sabotage decryptions based on the results of past decryptions, there are O(nk) rounds in this protocol.
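The sequential dependence behind the O(nk) round count can be illustrated with a plaintext sketch. This is a simplified simulation under stated assumptions, not the protocol itself: encryption, key-private tags, and decryption shares are replaced by plain values, and the names `reveal_threshold_set`, `pairs`, and `valid` are hypothetical. Each pair holds the element a that the tag would decrypt to for its creator, together with the plaintext of the disclosure U.

```python
def reveal_threshold_set(pairs, valid):
    """Process shuffled (tag_element, disclosure_plaintext) pairs one at a time.

    Assumption: a pair is skipped (standing in for a sabotaged decryption)
    when its tag element is already known to be in the threshold set.
    """
    revealed = []
    rounds = 0
    for tag_element, plaintext in pairs:
        rounds += 1  # each pair depends on the outcome of all earlier ones
        if tag_element in revealed:
            continue  # sabotage: this element was already disclosed once
        if plaintext in valid:
            revealed.append(plaintext)  # a legitimate element of P was decrypted
    return revealed, rounds


# Elements 3 and 7 meet the threshold; 5 does not, so its disclosure is random.
pairs = [(3, 3), (5, 918273), (3, 3), (7, 7), (7, 7), (3, 3)]
revealed, rounds = reveal_threshold_set(pairs, valid=set(range(100)))
```

Because the decision for each pair depends on what earlier decryptions revealed, the pairs cannot be handled in a single parallel round in this sketch.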
Perfect Due to space constraints, the description of this protocol is given in Section A.5.
Protocol: Threshold-HBC
Input: There are n ≥ 2 honest-but-curious players, c < n dishonestly colluding, each with a private input set $S_i$, such that $|S_i| = k$. The players share the secret key sk, to which pk is the corresponding public key to a homomorphic cryptosystem (Sec. 3.2.2). Each player has their own key to a key-private cryptosystem, where encryption and decryption are denoted $(Enc_i, Dec_i)$. The threshold number of repetitions at which an element appears in the output is t ≥ 2. F is a fixed polynomial of degree t − 1 which has no roots representing elements of P.
7. All players perform shuffling on their private input sets $V_i$ either through use of a mix-net or by using the protocol given in Fig. 9.
8. For each shuffled element T || U in sorted order, each player i = 1, ..., n
   (a) if $D_i(T) = h(a) || a$ for some a
       i. if a has previously been revealed to be in the threshold set, then calculate an incorrect decryption share of U, and send it to all other players
   (b) else calculate a decryption share of U, and send it to all other players
   (c) reconstruct the decryption of U. If the element a ∈ P, then a is in the threshold set

Figure 4: Threshold set-intersection protocol for the honest-but-curious case (semi-perfect variant). (Does not require polynomial factoring.)

Cardinality Set-Intersection
The cardinality set-intersection protocol, given in Fig. 5, is essentially a combination of the set-intersection protocol in Fig. 1 and the over-threshold protocol in Fig. 3. Jointly, the players calculate the polynomial p whose roots represent the set-intersection of the private inputs, as described in Section 5.1. Instead of decrypting the polynomial, as in the set-intersection protocol, the players evaluate this encrypted polynomial at the elements of their private input sets, as in the over-threshold protocol described in Section 5.3. After shuffling these encrypted results, the players decrypt them. If an element a is in the intersection set, then p(a) = 0 and all n players have this element in their private input. If an element a is not in the intersection set, then with overwhelming probability p(a) ≠ 0. Thus for each element in the intersection set there will be n elements decrypted to 0, and elements not in the intersection set will decrypt to uniformly distributed, non-zero elements.
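The counting argument can be sketched on plaintext values. The following is an illustration under stated assumptions, not the protocol of Fig. 5: there is no encryption, each player contributes a single random polynomial (rather than the sum over c + 1 players), the field F_Q with Q = 2^61 − 1 stands in for the plaintext ring R, and all function names are hypothetical.

```python
import random

Q = 2**61 - 1  # a Mersenne prime; F_Q stands in for the plaintext ring R


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out


def poly_add(a, b):
    """Add polynomials, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_from_roots(roots):
    """Build f = (x - a_1)...(x - a_k) from a player's input set."""
    p = [1]
    for a in roots:
        p = poly_mul(p, [(-a) % Q, 1])
    return p


def evaluate(p, x):
    """Evaluate p at x with Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = (acc * x + c) % Q
    return acc


def cardinality(sets, rng):
    """Count zeros among the shuffled evaluations and divide by n."""
    n, k = len(sets), len(sets[0])
    p = [0]
    for s in sets:
        r = [rng.randrange(Q) for _ in range(k + 1)]  # random polynomial r_i
        p = poly_add(p, poly_mul(poly_from_roots(s), r))
    values = [evaluate(p, a) for s in sets for a in s]
    rng.shuffle(values)  # stands in for the mix-net
    return values.count(0) // n


size = cardinality([[1, 2, 3], [2, 3, 4], [3, 2, 9]], random.Random(0))
```

Here the intersection is {2, 3}, so six of the nine evaluations are zero and the count divided by n = 3 gives 2, except with negligible probability over the random polynomials.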

Security and Correctness
A protocol is correct if each player learns the appropriate answer set at its termination. This is proved for our set-intersection, over-threshold set-intersection (with and without polynomial factoring), and threshold set-intersection protocols in Theorems 4, 9, 11, and 7. Proof for cardinality set-intersection follows from the proofs for set-intersection and over-threshold set-intersection (without polynomial factoring).
Each of these protocols is secure in the honest-but-curious model; no player gains information that it would not gain when using its input in the ideal model. A formal statement of our security property is as follows: In the protocol, any honest-but-curious player learns no more than would be gained by using the same private input in an ideal setting, and cannot distinguish any element of any private input set that does not appear in the answer set.
Application and proof of this theorem to the set-intersection, over-threshold set-intersection (with and without polynomial factoring), and threshold set-intersection protocols is given in Theorems 5, 10, 12, and 8. Proof for cardinality set-intersection follows from the proofs for set-intersection and over-threshold set-intersection (without polynomial factoring).

The Malicious Case
Protocols secure against malicious players largely follow those secure against honest-but-curious players, with the addition of zero-knowledge proofs, verified by all players, to ensure the correctness of all computation. We give protocols for set-intersection, cardinality set-intersection, and over-threshold set-intersection (with polynomial factoring) for the malicious case in Section A.
Each of these protocols is secure in the simulation model; an intermediary G translates between the real world with malicious, colluding players Γ and the ideal world, where a trusted third party computes the answer set. This proof shows that no information other than that in the answer set can be gained by malicious players. A formal statement of our security property is as follows: In the protocol, for any coalition Γ of colluding players (at most n − 1 such colluding parties), there is a player (or group of players) G operating in the ideal model, such that the views of the players in the ideal model are computationally indistinguishable from the views of the honest players and Γ in the real model.
Application and proof of this theorem for set-intersection and over-threshold set-intersection are given in Theorems 6 and 13. The proof for cardinality set-intersection follows from these proofs.

Related Work
The problems of set-intersection and cardinality set-intersection (for n = 2 players) are addressed by Freedman, Nissim, and Pinkas [12]. They do not, however, address certain problems, such as n ≥ 3 malicious player set-intersection, n ≥ 3 cardinality set-intersection, threshold set-intersection, and over-threshold set-intersection, which are addressed in our paper. Our complexity results are comparable to or more efficient than those for the protocols proposed in their paper. A summary comparison of the results of our paper to theirs is given in Table 1.
Private equality testing is the set-intersection problem for the limited case k = 1. Generalized circuit evaluation gives a protocol for privately computing equality with O(lg |P|) overhead, where P is the domain from which elements are chosen. Protocols for this problem are proposed in [9,24,21], and are approximately as expensive. Fairness is added in [3].
Determining whether input sets (subsets of [|P |]) are disjoint (without privacy) has communication overhead of Θ(|P |) [18,27].This implies that determining the cardinality of the set-intersection requires at least Θ(|P |) communication as well, and the communication complexity of cardinality set-intersection is proportional to the size of the input set k.
Protocol: Cardinality-HBC
Input: There are n ≥ 2 honest-but-curious players, c < n dishonestly colluding, each with a private input set $S_i$, such that $|S_i| = k$. The players share the secret key sk, to which pk is the corresponding public key to a homomorphic cryptosystem (Sec. 3.2.2).

Each player
6. All players perform shuffling on their private input sets $V_i$ either through use of a mix-net, obtaining a joint set V, in which all ciphertexts have been re-randomized.
7. All players
   (a) decrypt each element of the shuffled set V
   (b) if na of the decrypted elements are 0, then the size of the set intersection is a

[27] Alexander A. Razborov. Application of matrix methods to the theory of lower bounds in computational complexity.

A Protocols for Perfect Threshold Set-Intersection and The Malicious Case
In this section, we first introduce notation for the zero-knowledge proofs we use to ensure security of our protocols for the malicious case, then give the protocols secure against malicious parties. We also include the protocol for the threshold set-intersection problem (perfect variant).

A.1 Zero-Knowledge Proofs
We utilize several zero-knowledge proofs in our protocols for the malicious case. We introduce the notation for these zero-knowledge proofs below. For any additively homomorphic cryptosystem of which we are aware, we can efficiently construct these zero-knowledge proof protocols using standard proof constructions [6,4].
• POPK{E pk (x)} is a zero-knowledge proof that given ciphertext E pk (x), the player knows the corresponding plaintext x [7].
Protocol: Cardinality-Mal
Input: There are n ≥ 2 players, c < n malicious and dishonestly colluding, each with a private input set $S_i$, such that $|S_i| = k$. The players share the secret key sk, to which pk is the corresponding public key to a homomorphic cryptosystem (Sec. 3.2.2). The commitment scheme used in this protocol is an equivocal commitment scheme.
All players verify the correctness of all proofs sent to them, and refuse to participate in the protocol if any are not correct.
Each player i = 1, ..., n:
1. (a) calculates the polynomial $f_i$ such that the k roots of the polynomial are the elements of $S_i$, as ) to all other players, along with proofs of plaintext knowledge (POPK{E pk (y i,j )}, 1
       ii. sends a commitment to $\Lambda(r_{i,j})$ to all players, where $\Lambda(r_{i,j}) = E_{pk}(r_{i,j})$
2. for 1 ≤ j ≤ n
   (a) opens the commitment to $\Lambda(r_{i,j})$
   (b) verifies proofs of plaintext knowledge for the encrypted coefficients of $f_j$
   (c) sets the leading encrypted coefficient (for $x^k$) to a known encryption of 1
   (d) calculates $\tau_{i,j}$, the encryption of the polynomial $p_{i,j} = f_j * r_{i,j}$, with proofs of correct multiplication ZKPK{r i,j | (τ i,j = r i,j * h δ j ) ∧ (Λ(r i,j ) = E pk (r i,j )) } and sends it to all other players
3. Each player i = 1, ..., n:
   (a) calculates µ, the encryption of the polynomial $p = \sum_{i=1}^{n} \sum_{j=1}^{n} p_{i,j}$, as in Sec. 3.2.1, and verifies all attached proofs
   (b) evaluates the encryption of the polynomial p at each input $(S_i)_j$, obtaining encrypted elements $E_{pk}(c_{ij})$ where $c_{ij} = p((S_i)_j)$, using the algorithm given in Sec. 3.2.1.
   (c) for each j ∈ [k] chooses a random element $r_{ij}$, calculates an encrypted element and sends the encrypted element $(V_i)_j$ and the proof of correct construction to all players
4.
All players perform shuffling on the sets $V_i$ through use of a mix-net, obtaining a joint set V, in which all ciphertexts have been re-randomized.
5. All players
   (a) decrypt each element of the shuffled set V (and send proofs of correct decryption to all other players)
   (b) if na of the decrypted elements are 0, then the size of the set intersection is a

commitments to the data items $\Lambda(r_{i,j})$ are purely for the purposes of a simulation proof. We add zero-knowledge proofs of knowledge to prevent five forms of misbehavior: choosing $f_i$ without knowledge of its roots, choosing $f_i$ such that it is not the product of linear factors, not performing the polynomial multiplication of $f_j * r_{i,j}$ correctly, not calculating encrypted elements $(V_i)_j$ correctly (either not from the data items $(S_i)_j$ or not evaluating the encrypted polynomial p), and not performing decryption correctly. We can thus detect or prevent misbehavior from malicious players, forcing this protocol to operate like the honest-but-curious protocol in Fig. 5.

in step 6 of Figure 4. The players then use a mix-net to shuffle them. As in the semi-perfect protocol, the players then correctly obtain a decryption of each element in the threshold set exactly once. Instead of sabotaging the decryption process directly, they add the encryption of a random element to every ciphertext U whose decryption would be sabotaged under the semi-perfect protocol. Let the shuffled ciphertexts U have an arbitrary ordering $U_1, \ldots, U_{nk}$. Eq(C, C') = 1 if the ciphertexts C, C' encode the same plaintext, and 0 otherwise. (This calculation can be achieved with the techniques in [20].) The players i ∈ [n] then choose random elements $q_i \leftarrow R$ and decrypt the ciphertexts
Proof. This theorem follows from 9.16 in Shoup's A Computational Introduction to Number Theory and Algebra [28].
Theorem 2: Let f, g be polynomials in R[x] where R is a ring, deg(f) = deg(g) = x, and gcd(f, g) = 1. Let $r = \sum_{i=0}^{y} r_i x^i$ and $s = \sum_{i=0}^{y} s_i x^i$, where $\forall\, 0 \le i \le y:\ r_i \leftarrow R$ and $\forall\, 0 \le i \le y:\ s_i \leftarrow R$ (independently) and y ≥ x. Let $u = f * r + g * s = \sum_{i=0}^{x+y} u_i x^i$. Then the coefficients $u_i$, $0 \le i \le x+y$, are distributed uniformly and independently over R.
Proof. Firstly note that the number of possible r, s pairs is $|R|^{2y+2}$, and thus there are that many potentially unique mappings from f, g to u. However, deg(u) = x + y, and so there are only $|R|^{x+y+1}$ result polynomials. We thus must show that the same number of r, s pairs map to each result polynomial, and each result polynomial can be mapped to by at least one choice of r, s. This implies that f * r + g * s is distributed uniformly over all polynomials of deg(f * r + g * s).
Note that if two pairs of polynomials r, s and r', s' (r ≠ r', s ≠ s') map f, g to the same output polynomial u: We wish to show that the number of pairs of such polynomials that map to any given result polynomial u is equal. Pick any pair of polynomials r, s such that f * r + g * s = u. Choose p such that deg(p) = y − x. This fixes $\frac{r - r'}{g}$ and $\frac{s' - s}{f}$. As we have already chosen r and s, this fixes r' and s'. We may then simply count how many polynomials p exist, as each choice counts exactly one pair r', s' such that f * r' + g * s' = u, and this counts every such pair, as shown above (our original pair r, s is counted with the polynomial p = 0). There are $|R|^{y-x+1}$ such polynomials p, and thus an equal number of pairs of polynomials that map to any given output polynomial for which there exists at least one mapping.
We now show that every result polynomial must have this same number of polynomial pairs r, s that map to it. There are exactly $|R|^{y-x+1}$ mappings to any polynomial that has at least one mapping. There are exactly $|R|^{y+x+1}$ possible result polynomials u, as deg(u) = y + x. The number of mappings to each result polynomial multiplied by the number of result polynomials must exactly equal the total number of mappings $|R|^{2y+2}$.
Thus every possible result polynomial must have |R| y−x+1 mappings to it, and the mappings are thus distributed uniformly over the entire space of polynomials of deg(f * r + g * s).
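The counting argument can be checked exhaustively on a tiny instance. The sketch below assumes the field F_3 in place of the ring R, with f = x + 1, g = x + 2 (so deg(f) = deg(g) = 1 and gcd(f, g) = 1) and y = 1; these parameters and the function names are illustrative choices, not taken from the paper.

```python
from collections import Counter
from itertools import product

Q = 3  # the field F_3 stands in for the ring R


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out


def preimage_counts(f, g, y):
    """Count how many pairs (r, s) of degree at most y map to each u = f*r + g*s."""
    counts = Counter()
    for r in product(range(Q), repeat=y + 1):
        for s in product(range(Q), repeat=y + 1):
            fr = poly_mul(f, list(r))
            gs = poly_mul(g, list(s))
            u = tuple((a + b) % Q for a, b in zip(fr, gs))
            counts[u] += 1
    return counts


# f = 1 + x and g = 2 + x have distinct roots over F_3, so gcd(f, g) = 1.
counts = preimage_counts([1, 1], [2, 1], y=1)
```

With x = y = 1 there are |R|^(2y+2) = 81 pairs and |R|^(x+y+1) = 27 result polynomials, and every result polynomial is reached by exactly |R|^(y−x+1) = 3 pairs, matching the count in the proof.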
all polynomials of the appropriate degree.This random polynomial s is of polynomial size, and thus has a polynomial number of roots.Each of these roots is a representation of an element from P with only negligible probability.Thus, the probability that an erroneous element is included in the answer set is also negligible, and all players learn exactly the intersection set.
Theorem 5. In the set-intersection protocol of Fig. 1, any honest-but-curious player learns no more than would be gained by using the same private input in an ideal setting, and cannot distinguish any element of any private input set (of a player not in the coalition) that does not appear in the answer set.
Proof. We assume that the homomorphic cryptosystem (E, D) used in the protocol is in fact secure as we required. Thus, as the inputs of the other players are all encrypted until the decryption is performed, nothing can be learned by any player before that point. Each player j then learns only the summed polynomial $p = \sum_{i=1}^{n} f_i * \sum_{j=0}^{c} r_{i+j,j}$. Note that to every coalition of c players, for every i, $\sum_{j=0}^{c} r_{i+j,j}$ is completely random, as at least one player in the c + 1 players who chose that random polynomial is not a member of the coalition, and so $\sum_{j=0}^{c} r_{i+j,j}$ is uniformly distributed and unknown. By Theorem 2, $p = \sum_{i=1}^{n} f_i * \sum_{j=0}^{c} r_{i+j,j} = \left(\prod_{a \in I} (x - a)\right) * s$, where I is the intersection set and s is uniformly distributed over the polynomials of appropriate length. Note that for any private inputs held by the honest players such that I remains the intersection set, s is still uniformly distributed. Thus no information about the private inputs of the honest players can be recovered from p, other than that given by revealing the intersection set.

D.2 Malicious Case
Theorem 6. In the set-intersection protocol for the malicious case in Fig. 6, for any coalition Γ of colluding players (at most n − 1 such colluding parties), there is a player (or group of players) G operating in the ideal model, such that the views of the players in the ideal model are computationally indistinguishable from the views of the honest players and Γ in the real model.
Proof. In this simulation proof, we give an algorithm for a player G. This player communicates with the malicious players Γ, pretending to be one or more honest players in such a fashion that Γ cannot distinguish that he is not in the real world. We assume that all malicious players can collude. The trusted third party takes the input from G and the honest parties, and gives both G and the honest parties the intersection set. G then communicates with the malicious players Γ, so they also learn the intersection set.
We give a sketch of how the player G operates (note that G can prevaricate when opening commitments, as we use an equivocal commitment scheme):
1. For each simulated honest player i, G:
   (a) chooses a polynomial $f_i$ such that each such polynomial is relatively prime (for randomly generated polynomials, this is true with overwhelming probability)
   (b) chooses arbitrary polynomials $r_{i,1}, \ldots, r_{i,n}$ and creates data items $\Lambda(r_{i,j})$ from them (in the case of Paillier, specially construct encryptions of those polynomials, and proofs of knowledge of each coefficient, see Section A.1)
2. Performs step 1 of the protocol:
   (a) sends the encryption of $f_i$ to all malicious players Γ, along with proofs of plaintext knowledge
   (b) sends data items $\Lambda(r_{i,j})$ to all malicious players Γ
   (c) receives from each malicious player α ∈ Γ:
       i. encryption of a polynomial $f_\alpha$ and proofs of plaintext knowledge for its coefficients
       ii. trapdoor commitments to data items $\Lambda(r_{\alpha,j})$ for each random polynomial $r_{\alpha,j}$, 1 ≤ j ≤ n
3. The player G (who knows the trapdoor commitment information) extracts from the proofs of plaintext knowledge and trapdoor commitments to $\Lambda(r_{i,j})$ (in the case of Paillier, the extraction is from the proof of knowledge of the discrete logarithm) the polynomials $f_\alpha$, and the random polynomials $r_{\alpha,j}$ the malicious players Γ have chosen.
4. G obtains the roots of each polynomial $f_\alpha$ (as these exactly determine, for the purposes of the protocol, his set):
   • If polynomial factoring is possible, G may factor $f_\alpha$. $f_\alpha(a) = 0 \Leftrightarrow (x - a) \mid f_\alpha$, so all roots of $f_\alpha$ may be determined by examining the linear factors.
   • If we are working in the random oracle model, then, with overwhelming probability, to correctly represent any element of the valid set P, a player must consult the random oracle. As there can be only a polynomial number of such queries, for each query a, G may check if $f_\alpha(a || h(a)) = 0$.
   • If neither of these routes is feasible, then a proof that $f_\alpha$ was constructed by multiplying k linear factors of the form x − a may be added to the protocol instead of proofs of plaintext knowledge. This proof is of size $O(k^3)$, and is constructed by using proofs of plaintext knowledge for some linear factors, and layering proofs of correct multiplication to obtain the complete polynomial $f_\alpha$. From this proof, each linear factor of $f_\alpha$ can be obtained, and thus all roots of $f_\alpha$.
5. G submits the sets represented by these roots to the trusted third party and obtains the intersection set I.
6. G prepares to reveal the intersection set to the malicious players Γ:
   (a) selects a target polynomial $p = \left(\prod_{a \in I} (x - a)\right) * s$, where s is chosen uniformly from those polynomials of degree k − |I| (note that, by Theorem 2, this is exactly the polynomial calculated by simply running the protocol)
   (b) chooses a set of polynomials $r_{i,j}$ (where i is one of the simulated honest players) such that $\sum_{i=1}^{n} f_i \sum_{j=1}^{n} r_{i,j} = p$ (from the proof of Theorem 2, we know that such polynomials exist, and can be determined through simple polynomial manipulation)
7. G follows the rest of the protocol with the malicious players Γ as written, except that he opens the trapdoor commitment to reveal an appropriate $\Lambda(r_{i,j})$ for the newly chosen $r_{i,j}$. In this way, the players calculate an encryption of the polynomial p chosen by G, and then decrypt it. The coalition players thus learn the intersection set.
Note that the dishonest players cannot distinguish that they are talking to G instead of the honest clients, and the correct answer is learned by all parties, in both the real and ideal models.

E Proofs of Correctness and Security for Threshold Set-Intersection Protocols
For lack of space, all proofs presented in this section are proof sketches.

E.1 Honest-But-Curious Case
Theorem 7. In the threshold set-intersection protocol of Fig. 4 (semi-perfect variant), every player learns each element a which appears at least t times in the n players' private inputs.
Proof. As shown in Theorem 1, if an element a appears at least t times in the players' private inputs, it is a root of the polynomial $p^{(t-1)}$, as $(x - a)^t \mid p$. Thus if an element a is a root of both p and $p^{(t-1)}$, it is in the threshold intersection set. The polynomial evaluated to create conditional disclosures is $\Phi = p * \sum_{i=1}^{c+1} r_i + F * p^{(t-1)} * \sum_{i=1}^{c+1} s_i$. As $\sum_{i=1}^{c+1} r_i$ and $\sum_{i=1}^{c+1} s_i$ are chosen by more players than can be in the coalition, they are both uniformly distributed and unknown to any coalition of players. Thus, by Theorem 2, $\Phi = \left(\prod_{a \in I} (x - a)\right) * s$, where I is the threshold intersection set and s is a polynomial uniformly distributed over those of the appropriate size. As s has only a polynomial number of uniformly distributed roots, and each root has negligible probability of representing an element from the valid set P, with overwhelming probability, the only roots representing elements of P are those in the threshold set. Thus, if the element $(S_i)_j$ appears at least t times, the conditional disclosure $U_{ij} = E_{pk}((S_i)_j)$, and if it does not, $U_{ij}$ is the encryption of a uniformly distributed element in R.
At this point in the protocol, all players engage in shuffling of inputs. As explained in Section 5.4, the players obtain a decryption of each element in the threshold set exactly once. All players therefore learn the answer set.
Theorem 8. In the threshold set-intersection protocol of Fig. 4 (semi-perfect variant), any honest-but-curious player learns no more than would be gained by using the same private input in an ideal setting, and cannot distinguish any element of any private input set (of a player not in the coalition) that does not appear in the answer set.
Proof. All polynomials representing private input are encrypted. We will assume that the cryptosystem we use is semantically secure, so no information is revealed about the players' private inputs when calculating $\Phi = p * \sum_{i=1}^{c+1} r_i + F * p^{(t-1)} * \sum_{i=1}^{c+1} s_i$. First we observe that, by Theorem 2, $\Phi = \gcd(p, p^{(t-1)}) * s$ where s is a random polynomial. (Note that F is chosen to not share factors with p, and so is irrelevant to determining shared factors.) By Theorem 1, the only factors shared between p and $p^{(t-1)}$ are those in the threshold set.
Each player constructs tag/disclosure pairs T, U from all of his inputs.At this point in the protocol, all players engage in shuffling of inputs (either through a separate protocol or trusted third party).Thus all players receive all tag/disclosure pairs T, U , without any indication as to the origin.As the cryptosystem used to encrypt the tags is key-private, no player can gain information about which honest player created any tag.As the disclosures are all encrypted under the same key, no information can be gleaned about their origin.
Let $a = (S_i)_j$ for some i, j be the representation of the private input element used to create the encrypted element U. If Φ(a) = 0, then, with overwhelming probability, a is a representation of an element in the threshold intersection set. Then U is an encryption of a. If a is not in the threshold intersection set, then, with overwhelming probability, Φ(a) ≠ 0, and U is thus an encryption of a uniformly distributed element, which does not reveal any information.
When the players consider each T, U pair, they either correctly decrypt it, or they do not.If they do not correctly decrypt it, the decryption shares do not reveal information about the contents of the encryption U .If they do correctly decrypt it, the tag indicates that the element a used to create the ciphertext U has not been determined to be in the threshold intersection set.If the decryption of U is not a member of the valid set P , then the player who created it learns that his input element is not in the intersection set; this is information he can gain directly from the answer set.If the decryption of U is a representation of an element a from P , then all players learn that a is in the answer set.Thus, no player learns any information except the answer set.
Note that if Alice and Bob both have some element a in their private inputs, each may learn that some other player holds the same element.However, as t ≥ 2, if they have fewer than t copies of that element in their private input, they could determine that directly from their private input and the answer set.If they have at least t copies of that element in their private input, they may also learn that there exists some other player(s) holding that element, which they cannot learn directly from their input and the answer set.We do not consider this leak of information to be a problem in most circumstances.
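The root criterion used in the proofs of Theorems 7 and 8 can be checked on a small plaintext example. This sketch is restricted to t = 2, where a is a repeated root of p exactly when p(a) = 0 and p'(a) = 0; it assumes the prime field F_101 in place of R and omits encryption, the random polynomials r_i, s_i, and F. The function names are hypothetical.

```python
Q = 101  # small prime field standing in for R; larger than deg(p) here


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out


def poly_from_roots(roots):
    """Build p as the product of (x - a) over all input elements."""
    p = [1]
    for a in roots:
        p = poly_mul(p, [(-a) % Q, 1])
    return p


def derivative(p):
    """Formal derivative of p over F_Q."""
    return [(i * c) % Q for i, c in enumerate(p)][1:] or [0]


def evaluate(p, x):
    """Evaluate p at x with Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = (acc * x + c) % Q
    return acc


def threshold_roots(inputs, t):
    """Return inputs that are roots of both p and its (t-1)th derivative."""
    p = poly_from_roots(inputs)
    d = p
    for _ in range(t - 1):
        d = derivative(d)
    return sorted({a for a in inputs if evaluate(p, a) == 0 and evaluate(d, a) == 0})


# Combined inputs of all players: 3 appears twice, 5 and 7 once each.
result = threshold_roots([3, 5, 3, 7], t=2)
```

For these inputs p = (x − 3)^2 (x − 5)(x − 7), so only 3 is a common root of p and p', and `result` is `[3]`.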

F Proofs of Correctness and Security for Over-Threshold Set-Intersection Protocols
F.1 Honest-But-Curious Case (With Polynomial Factoring)
Theorem 9. In the over-threshold set-intersection protocol of Fig. 2 (with polynomial factoring), every honest-but-curious player learns each element a which appears at least t times in the n players' private inputs, as well as the number of times it so appears.
Proof. All players calculate and decrypt $\Phi = F * p^{(t-1)} * \sum_{i=1}^{c+1} r_i + p * \sum_{i=1}^{c+1} s_i$. As $\sum_{i=1}^{c+1} r_i$ and $\sum_{i=1}^{c+1} s_i$ are distributed uniformly over all polynomials of approximate size nk, Theorem 2 tells us that $\Phi = \gcd(p^{(t-1)}, p) * r$, where r is a random polynomial of the appropriate size. As r has only a polynomial number of roots, each

the set of polynomials of degree between 0 and a
• [c] for an integer c denotes the set {1, ..., c}
• a := b denotes that the variable a is given the value b
• a || b denotes a concatenated with b
• a ← S denotes that element a is sampled uniformly from set S
• deg(p) is the degree of polynomial p

0, utilizing the algorithms given in Sec. 3.2.1.
2. Player 1 sends the encryption of the polynomial $\lambda_1 = \phi_1$ to player 2
3. Each player i = 2, ..., n in turn
   (a) receives the encryption of the polynomial $\lambda_{i-1}$ from player i − 1
   (b) calculates the encryption of the polynomial $\lambda_i = \lambda_{i-1} + \phi_i$ by utilizing the algorithms given in Sec. 3.2.1.
   (c) sends the encryption of the polynomial $\lambda_i$ to player i + 1 mod n
4. Player 1 distributes the encryption of the polynomial $p = \lambda_n = \sum_{i=1}^{n} f_i * \sum_{j=0}^{c} r_{i+j,j}$ to all other players.

1. Each player i = 1, ..., n calculates the polynomial $f_i = (x - (S_i)_1) \cdots (x - (S_i)_k)$
2. Player 1 sends the encryption of the polynomial $\lambda_1 = f_1$ to player 2
3. Each player i = 2, ..., n
   (a) receives the encryption of the polynomial $\lambda_{i-1}$ from player i − 1
   (b) calculates the encryption of the polynomial $\lambda_i = \lambda_{i-1} * f_i$ by utilizing the algorithm given in Sec. 3.2.1.
   (c) sends the encryption of the polynomial $\lambda_i$ to player i + 1 mod n
4. Player 1 distributes the encryption of the polynomial $p = \lambda_n = \prod_{i=1}^{n} f_i$ to players 2, ..., c + 1
5. Each player i = 1, ..., c + 1
   (a) calculates the encryption of the (t − 1)th derivative of p, denoted $p^{(t-1)}$, by repeating the algorithm given in Sec. 3.2.1.
   (b) chooses random polynomials $r_i, s_i \leftarrow R^{nk}[x]$
   (c) calculates the encryption of the polynomial $p * r_i + F * p^{(t-1)} * s_i$ and sends it to all other players
6. All players perform a group decryption to obtain the polynomial $\Phi = F * p^{(t-1)} * \sum_{i=1}^{c+1} r_i + p * \sum_{i=1}^{c+1} s_i$.
7. Each player factors Φ. Each factor of the form x − a where a ∈ P (i.e., a is a legitimate element of a private input) indicates that a is in the set intersection. If the factor x − a appears b times, a appeared t + b − 1 times in the players' private inputs.

Figure 2: Over-threshold set-intersection protocol for the honest-but-curious case (with polynomial factoring)
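The multiplicity rule in step 7 can be traced with a short derivation. This is a sketch under two assumptions not stated in the figure: the element a appears exactly m ≥ t times, and the characteristic of R is large enough that m!/(m − t + 1)! is non-zero. The symbols m, h, and \tilde{h} are introduced here for illustration only.

```latex
% Assumption: a appears exactly m >= t times in the combined private inputs.
\begin{align*}
p &= (x-a)^{m}\, h, \qquad h(a) \neq 0,\\
p^{(t-1)} &= (x-a)^{m-t+1}\, \tilde{h}, \qquad
  \tilde{h}(a) = \frac{m!}{(m-t+1)!}\, h(a) \neq 0,\\
\gcd\bigl(p,\ p^{(t-1)}\bigr) &\ \text{contains}\ (x-a)^{b}
  \quad\text{with}\quad b = \min(m,\ m-t+1) = m-t+1,\\
m &= t + b - 1 .
\end{align*}
```

For instance, with t = 2 and an element that appears m = 3 times, the factor x − a appears b = 2 times in the gcd, and t + b − 1 = 3 recovers the count.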

1. Each player i = 1, ..., n calculates the polynomial $f_i = (x - (S_i)_1) \cdots (x - (S_i)_k)$
2. Player 1 sends the encryption of the polynomial $\lambda_1 = f_1$ to player 2
3. Each player i = 2, ..., n
   (a) receives the encryption of the polynomial $\lambda_{i-1}$ from player i − 1
   (b) calculates the encryption of the polynomial $\lambda_i = \lambda_{i-1} * f_i$ by utilizing the algorithm given in Sec. 3.2.1.
   (c) sends the encryption of the polynomial $\lambda_i$ to player i + 1 mod n
4. Player 1 distributes the encryption of the polynomial $p = \lambda_n = \prod_{i=1}^{n} f_i$ to players 2, ..., c + 1
5. Each player i = 1, ..., c + 1
   (a) calculates the encryption of the (t − 1)th derivative of p, denoted $p^{(t-1)}$, by repeating the algorithm given in Sec. 3.2.1.
   (b) chooses random polynomials $r_i, s_i \leftarrow R^{nk}[x]$
   (c) calculates the encryption of the polynomial $p * r_i + F * p^{(t-1)} * s_i$ and sends it to all other players
6. Each player i = 1, ..., n
   (a) evaluates the encryption of the polynomial $\Phi = p * \sum_{i=1}^{c+1} r_i + F * p^{(t-1)} * \sum_{i=1}^{c+1} s_i$ at each input $(S_i)_j$, obtaining encrypted elements $E_{pk}(c_{ij})$ (where $c_{ij} = \Phi((S_i)_j)$), using the algorithm given in Sec. 3.2.1.
   (b) for each j = 1, ..., k chooses a random number $r_{ij} \leftarrow Dom(E_{pk})$ and calculates an encrypted element (

8. All players
   (a) decrypt each element of the shuffled set V
   (b) for each element, if the element a is in the valid set P, then a is in the threshold set. If a appears b times, then a appeared b times in the players' private inputs.

Figure 3: Over-threshold set-intersection protocol for the honest-but-curious case (Does not require polynomial factoring.)

1. Each player i = 1, ..., n calculates the polynomial $f_i = (x - (S_i)_1) \cdots (x - (S_i)_k)$
2. Player 1 sends the encryption of the polynomial $\lambda_1 = f_1$ to player 2
3. Each player i = 2, ..., n
   (a) receives the encryption of the polynomial $\lambda_{i-1}$ from player i − 1
   (b) calculates the encryption of the polynomial $\lambda_i = \lambda_{i-1} * f_i$ by utilizing the algorithm given in Sec. 3.2.1.
   (c) sends the encryption of the polynomial $\lambda_i$ to player i + 1 mod n
4. Player 1 distributes the encryption of the polynomial $p = \lambda_n = \prod_{i=1}^{n} f_i$ to players 2, ..., c + 1
5. Each player i = 1, ..., c + 1
   (a) calculates the encryption of the (t − 1)th derivative of p, denoted $p^{(t-1)}$, by repeating the algorithm given in Sec. 3.2.1.
   (b) chooses random polynomials $r_i, s_i \leftarrow R^{nk}[x]$
   (c) calculates the encryption of the polynomial $p * r_i + F * p^{(t-1)} * s_i$ and sends it to all other players
6. Each player i = 1, ..., n
   (a) evaluates the encryption of the polynomial $\Phi = p * \sum_{i=1}^{c+1} r_i + F * p^{(t-1)} * \sum_{i=1}^{c+1} s_i$ at each input $(S_i)_j$, obtaining encrypted elements $E_{pk}(c_{ij})$ where $c_{ij} = \Phi((S_i)_j)$, using the algorithm given in Sec. 3.2.1.
   (b) for each j = 1, ..., k calculates an encrypted tag

0, utilizing the algorithms given in Sec. 3.2.1.
2. Player 1 sends the encrypted polynomial $\lambda_1 = \phi_1$ to player 2
3. Each player i = 2, ..., n in turn
   (a) receives the encryption of the polynomial $\lambda_{i-1}$ from player i − 1
   (b) calculates the encryption of the polynomial $\lambda_i = \lambda_{i-1} + \phi_i$ by utilizing the algorithms given in Sec. 3.2.1.
   (c) sends the encryption of the polynomial $\lambda_i$ to player i + 1 mod n
4. Player 1 distributes the encryption of the polynomial $p = \lambda_n = \sum_{i=1}^{n} f_i * \sum_{j=0}^{c} r_{i+j,j}$ to all other players.
5. Each player i = 1, ..., n
   (a) evaluates the encryption of the polynomial p at each input $(S_i)_j$, obtaining encrypted elements $E_{pk}(c_{ij})$ where $c_{ij} = p((S_i)_j)$, using the algorithm given in Sec. 3.2.1.
   (b) for each j = 1, ..., k chooses a random number $r_{ij} \leftarrow Dom(E_{pk})$ and calculates an encrypted element (

Figure 5: Cardinality set-intersection protocol for the honest-but-curious case.(Does not require polynomial factoring.)

Figure 7: Cardinality set-intersection protocol for the malicious case.
) and $f \mid (s' - s)$, this transformation is valid. Thus $r - r' = g * p$ and $s' - s = f * p$ for some polynomial p such that deg(p) = y − x.

utilize general multi-party computation:
Table 1: Communication complexity comparison for our protocols and previous solutions.