Types of quantum information

Quantum information theory, in contrast to classical information theory, allows for different incompatible types (or species) of information which cannot be combined with each other. Distinguishing these incompatible types is useful in understanding the role of the two classical bits in teleportation (or one bit in one-bit teleportation), for discussing decoherence in information-theoretic terms, and for giving a proper definition, in quantum terms, of "classical information." Various examples (some updating earlier work) are given of theorems which relate different incompatible kinds of information, and thus have no counterparts in classical information theory.


I Introduction
Despite an enormous number of publications in the field of quantum information (see [1,2] for useful introductions), neither the fundamental principles underlying the subject, nor its connection with classical information theory as developed by Shannon and his successors [3], is altogether clear. On the one hand there has been some dispute [4,5] about whether Shannon's ideas can be applied at all in the quantum domain. On the other hand there have been suggestions that the connection with Shannon's ideas occurs only for macroscopic systems or asymptotically large N (number of transmissions, or whatever) limits, as in what is sometimes called "quantum Shannon" theory [6,7]. The author's position is that, to the contrary, there are perfectly consistent ways of applying the basic ideas of classical information theory to small numbers (even one) of microscopic quantum systems provided attention is paid to the Hilbert space structure of quantum theory, and probabilities are introduced in a consistent fashion. And, further, that this approach has advantages in that simple systems are simpler to think about than complicated systems, so it is useful to develop some intuition as to how they behave. One goal is to understand both classical and quantum information theory in fully quantum terms, since the world is (most physicists believe) fundamentally quantum mechanical.
The basic strategy of this paper is based on the idea that quantum information comes in a variety of incompatible types or species. Each type or species refers to a certain class of (typically microscopic) mutually-compatible properties of a quantum system. As long as the discussion of information is limited to a single type, all the usual formalism and intuition provided by classical information theory apply directly to the quantum domain. On the other hand, incompatible types of quantum information cannot be combined, as this makes no sense in the context of standard Hilbert space quantum mechanics. Since in classical information theory there is only a single type or species of information, or, equivalently, all different types are compatible with each other, the main respect in which quantum information theory needs to go beyond its classical counterpart is in relating incompatible types of information in a useful way. Various examples are given below.
In the real (i.e., quantum) world it must, of course, be the case that so-called "classical" information, as in the acronym LOCC, "local operations and classical communication," is describable in quantum terms. A relatively precise definition can be given as indicated in Sec. IV: classical information is a particular type of quantum information, the only one that survives under circumstances (implicitly assumed in much writing on the subject) where there is strong decoherence.
The remainder of this paper is organized as follows. The concept of quantum information types is introduced in the context of a discussion of quantum incompatibility in Sec. II. The idea is illustrated by the simple examples of one- and (standard) two-bit teleportation in Sec. III, where using quantum information types and what we call the Presence theorem helps one understand why one or two bits, respectively, are needed in these protocols, or "dits" in their d-dimensional ("qudit") generalizations. Decoherence and "classical" information are the subjects of Sec. IV, which begins with a simple beam splitter example that illustrates the importance of the Exclusion theorem, derived from a more general Truncation theorem; this sets the stage for a proper understanding of classical information in quantum terms.
Quantum information theory requires theorems that relate different types of information, and hence go beyond anything in classical information theory. Those used in Secs. III and IV and some others closely related to them are given precise formulations in Sec. V, extending earlier work in [8]. They all have the "smell" of no-cloning [9], but the connection is not altogether straightforward, as shown by an additional Generalized No-Cloning theorem. Proofs and some additional technical details are found in the appendices. A summary and an indication of various ways the present work needs to be extended comprise the concluding Sec. VI.

II Types of Information
Central to the following discussion will be the notion of quantum incompatibility [10], which can be illustrated using the familiar two-dimensional Hilbert space of a spin-half particle. Each one-dimensional subspace or ray, which is to say all complex multiples of a fixed nonzero ket $|w\rangle$, is associated with the property that a particular component of angular momentum is positive, $S_w = +1/2$ (in units of $\hbar$) for some direction $w$ in space, e.g., $w = z$ or $w = x$ or $w = -z$, etc. The negation of the property (or proposition) $S_w = +1/2$ is the property $S_w = -1/2$, corresponding to the orthogonal complement of the ray associated with $S_w = +1/2$. In the notation commonly used in quantum information theory, $S_z = +1/2$ and $-1/2$ correspond to rays passing through (i.e., multiples of) $|0\rangle$ and $|1\rangle$, respectively, and of course these are orthogonal, $\langle 0|1\rangle = 0$.
It always makes sense to talk about the conjunction P AND Q of two properties P and Q of a classical system, such as $p > 0$ and $x < 0$ for a harmonic oscillator. The result may be a property that is always false, as when P is $p > 0$ and Q is $p < 0$, and in this case the negation (NOT P) OR (NOT Q) of P AND Q is the property $p \le 0$ OR $p \ge 0$, which is always true. But in quantum theory it is possible to write down conjunctions, such as
$$S_x = +1/2 \;\;\text{AND}\;\; S_z = -1/2, \tag{1}$$
which make no sense. Obviously (1) cannot correspond to any ray in the Hilbert space, since each ray is associated with $S_w = +1/2$ for some direction $w$, and (1) is not of this form. Could it be a proposition that is always false? Then its negation, $S_x = -1/2$ OR $S_z = +1/2$, must always be true, which does not seem very plausible. Indeed, assuming that (1) and similar conjunctions are false swiftly leads to a contradiction if one follows the usual rules of logic; for details, see Sec. 4.6 of [11]. This was understood by Birkhoff and von Neumann [12], who proposed altering the rules of propositional logic to get around this difficulty. Their proposal has not been of much use for interpreting quantum mechanics, which may merely mean that we physicists are not smart enough. By contrast, if one restricts the domain of meaningful discourse so as to exclude (1) and similar things, in particular conjunctions (AND) and disjunctions (OR) of properties corresponding to projectors that do not commute, it is possible to produce a consistent interpretation of quantum mechanics [11,13-20] that follows the usual rules of logic (as applied to meaningful statements), and resolves all the standard paradoxes [21]. The compatible propositions $S_z = +1/2$ and $S_z = -1/2$, corresponding to mutually orthogonal projection operators, form a quantum sample space of mutually exclusive possibilities: their conjunction is always false, and since each is the negation of the other, one or the other is always true.
This makes physical sense in that one can in principle carry out a Stern-Gerlach measurement to determine whether $S_z = +1/2$ or $-1/2$ [22]. (By contrast, there is no measurement which can determine the truth or falsity of (1), as one would expect for something that is meaningless.) Information that answers the question of whether $S_z = +1/2$ or $-1/2$ is what we shall call the Z type (or species) of information. Similarly, X information answers the question whether $S_x = +1/2$ or $-1/2$. It is incompatible with Z information in that there is no way in which the two can be meaningfully combined: (1) makes no sense, and asking whether $S_z = +1/2$ or $S_x = +1/2$ is equally meaningless. For a spin-half particle there is a type of information associated with each pair $w$ and $-w$ of opposite directions in three-dimensional space, and the different species associated with distinct pairs are incompatible.
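The incompatibility of the Z and X types can be checked directly at the level of projectors. The following is a minimal sketch, assuming the usual matrix representation in the $|0\rangle$, $|1\rangle$ basis; the helper names `matmul` and `commute` are introduced here only for illustration. It shows that the two Z projectors commute with each other, while a Z projector and an X projector do not.

```python
# Projectors for a spin-half particle as 2x2 matrices (lists of rows)
# in the |0>, |1> basis, i.e. the S_z = +1/2, -1/2 basis.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commute(a, b, tol=1e-12):
    """True if the 2x2 matrices a and b commute (within tol)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return all(abs(ab[i][j] - ba[i][j]) < tol
               for i in range(2) for j in range(2))

Z_PLUS = [[1.0, 0.0], [0.0, 0.0]]      # |0><0|, property S_z = +1/2
Z_MINUS = [[0.0, 0.0], [0.0, 1.0]]     # |1><1|, property S_z = -1/2
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]      # |+><+|, property S_x = +1/2
X_MINUS = [[0.5, -0.5], [-0.5, 0.5]]   # |-><-|, property S_x = -1/2

# Projectors within one type commute: they form a sample space.
z_projectors_commute = commute(Z_PLUS, Z_MINUS)
# Projectors from different types do not commute: the types are incompatible.
z_and_x_commute = commute(Z_PLUS, X_PLUS)
```

Here `z_projectors_commute` comes out true and `z_and_x_commute` false, which is the algebraic counterpart of the statement that (1) has no meaning.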
In larger Hilbert spaces a quantum sample space or type of information always corresponds to a decomposition of the identity, a collection $V = \{V_j\}$ of mutually orthogonal projectors, $V_j = V_j^\dagger$ and $V_j V_k = \delta_{jk} V_j$, that sum to the identity, $\sum_j V_j = I$. In the case of an orthonormal basis, $V_j = |v_j\rangle\langle v_j|$ and $\langle v_j|v_k\rangle = \delta_{jk}$, we also write $V = \{|v_j\rangle\}$, since the meaning is obvious. Two such collections or types of information V and W are compatible if and only if all projectors in one commute with all projectors in the other; otherwise they are incompatible. The "single framework" rule of quantum reasoning [23] generalizes the example discussed above, and states that incompatible quantum descriptions (decompositions, information types) cannot be meaningfully combined.

III Teleportation
Let us see how using incompatible types of information assists in understanding why quantum teleportation uses two classical bits of information in the standard protocol [24]. It is simplest to start with a variant known as "one bit" teleportation [25], corresponding to the quantum circuit in Fig. 1(a), where the teleportation process transports the state $|\psi\rangle$ from the upper left $a$ to the lower right $b'$. First, a CNOT, shown as a controlled-X (CX) gate, acts between qubits a and b, and then an $S_x$ measurement is carried out on qubit a. In the figure this measurement is indicated by the Hadamard gate H that interchanges $S_x$ and $S_z$, followed by a measurement in the standard or $S_z$ or "computational" basis indicated by the D-like symbol. If the measurement reveals $S^a_x = -1/2$, a classical bit (dashed line labeled x) is transmitted to where it actuates a Z gate on qubit b, whereas if $S^a_x = +1/2$ nothing is done. It is an easy exercise to show that whatever initial state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ enters at $a$ will later reappear at $b'$.
In the case of Z information, meaning the input is $|\psi\rangle = |0\rangle$ or $|1\rangle$, corresponding to $S^a_z = +1/2$ or $-1/2$, the CX (CNOT) gate copies it from the a to the b qubit so that $S^b_z = S^a_z$, and the later Z gate has no effect, since even if it acts it only changes the phase of $|1\rangle$, leaving the ray (or projector) corresponding to $S^b_z = -1/2$ the same. Thus failing to do the measurement, or throwing away the classical bit, has no influence so far as transporting the Z information is concerned.
In the case of X information the input $|\psi\rangle$ is either $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ or $|-\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$, corresponding to $S_x = +1/2$ or $-1/2$, and the analysis is somewhat more complicated. The CX gate maps $|\psi\rangle = |+\rangle$ into the two qubit state $(|{+}{+}\rangle + |{-}{-}\rangle)/\sqrt{2}$, corresponding to $S^a_x = S^b_x$, and $|\psi\rangle = |-\rangle$ into $(|{+}{-}\rangle + |{-}{+}\rangle)/\sqrt{2}$, corresponding to $S^a_x = -S^b_x$. This means the original X information is not present in either qubit by itself, since the corresponding reduced density operator is $\frac{1}{2} I$, but resides in a correlation between the two. Information residing in a correlation is not in itself a quantum effect. One can, for instance, encode a classical bit $\{0_L, 1_L\}$ in two coding bits by letting 00 or 11, chosen at random, represent $0_L$, and 01 or 10, again chosen at random, represent $1_L$. From either coding bit alone one can extract no information about $0_L$ versus $1_L$, but it is obviously present in the two together through their correlation. In the case under discussion the measurement in Fig. 1(a) extracts the value of $S^a_x$ after the CX has acted (note that this is not the original X information), and if this is negative the Z gate applied to qubit b changes the sign of $S^b_x$. The net effect is that at the end of the process the value of $S^b_x$ is exactly the same as that of $S^a_x$ at the beginning, so the X information has also been successfully transmitted from $a$ to $b'$.
One could continue to check what happens to other types of information, but that is not necessary. The Presence theorem of Sec. V A says that once it is known that two suitably incompatible types of information, Z and X in the case at hand, are present in the output $b'$, all other types of information about the input $a$ are also present, so there is a perfect quantum channel from $a$ to $b'$, the desired result for teleportation. In summary, the transmission of two (suitably incompatible) types of information is needed to ensure that there is a good quantum channel from $a$ to $b'$. The CX gate by itself transmits the Z type, while the later measurement and the single classical bit carrying its outcome are needed to transmit the incompatible X type of information.
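The division of labor between the CX gate and the classical bit can be checked numerically. The following is a minimal sketch of one-bit teleportation, under two assumptions that are not spelled out in the text: qubit b starts in $|0\rangle$ (as the copying behavior described above requires), and two-qubit amplitudes are stored in the order 00, 01, 10, 11 with qubit a first. The function names are hypothetical.

```python
import math

def cx(state):
    """CX with qubit a as control and b as target; index = 2*a + b."""
    s00, s01, s10, s11 = state
    return [s00, s01, s11, s10]

def measure_a_in_x(state, outcome):
    """Project qubit a onto (|0> + outcome*|1>)/sqrt(2), outcome = +1 or -1.

    Returns (probability of this outcome, normalized state of qubit b)."""
    b = [(state[j] + outcome * state[2 + j]) / math.sqrt(2) for j in range(2)]
    prob = sum(abs(x) ** 2 for x in b)
    return prob, [x / math.sqrt(prob) for x in b]

def teleport(psi, outcome, use_bit=True):
    """One-bit teleportation of psi = [alpha, beta] for a given S_x outcome.

    If use_bit is False the classical bit is thrown away (no Z correction)."""
    alpha, beta = psi
    state = cx([alpha, 0.0, beta, 0.0])   # assumption: qubit b starts in |0>
    prob, b = measure_a_in_x(state, outcome)
    if use_bit and outcome == -1:
        b = [b[0], -b[1]]                  # Z gate on qubit b
    return prob, b

def fidelity(psi, phi):
    """|<psi|phi>|^2 for two single-qubit kets."""
    return abs(sum(p.conjugate() * q for p, q in zip(psi, phi))) ** 2

r = 1 / math.sqrt(2)
zero, one = [1.0, 0.0], [0.0, 1.0]       # Z information
plus, minus = [r, r], [r, -r]            # X information
general = [0.6, 0.8j]                    # an arbitrary normalized input

# Z information survives even when the classical bit is discarded.
z_without_bit = [fidelity(k, teleport(k, s, use_bit=False)[1])
                 for k in (zero, one) for s in (+1, -1)]
# X information is lost for outcome -1 unless the bit is used.
x_without_bit = fidelity(plus, teleport(plus, -1, use_bit=False)[1])
x_with_bit = fidelity(plus, teleport(plus, -1, use_bit=True)[1])
# With the bit, an arbitrary input is recovered for either outcome.
general_with_bit = [fidelity(general, teleport(general, s)[1])
                    for s in (+1, -1)]
```

In this sketch each measurement outcome occurs with probability 1/2 for any normalized input, so the outcome alone reveals nothing about $|\psi\rangle$.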
The Presence theorem is a statement about quantum information discussed in fully quantum terms, so to apply it to the system in Fig. 1(a) one needs to understand the measurement and the "classical" bit in quantum terms. This can be done in the manner indicated in Sec. IV. But for present purposes it is convenient to avoid having to introduce the Hilbert space of a complicated macroscopic system, by "quantizing" the circuit in Fig. 1(a) so that it takes the form shown in (b), with the measurement of the a qubit following the part of the circuit where it controls (in the usual quantum sense) a Z gate. (See [26] and pp. 186f in [1] for this "trick," based on ideas in [27].) The two circuits in (a) and (b) are equivalent so far as teleportation is concerned, but the second is simpler to analyze in fully quantum terms. Indeed, the later measurement of qubit a in (b) need not be made at all, which is why it is not shown, as its outcome is not used in the protocol. (This discussion continues in the latter part of Sec. IV.)
In addition, the Presence theorem is stated in Sec. V A in the language of entangled kets, rather than in terms of the input and output of a quantum channel. One way of connecting the two is indicated in Fig. 1(c), where an auxiliary qubit $\bar a$ has been introduced, and
$$|\Phi\rangle = \frac{1}{\sqrt{2}} \sum_j |\bar a_j\rangle \otimes |a_j\rangle,$$
where $\{|a_j\rangle\}$ and $\{|\bar a_j\rangle\}$ are orthonormal bases of $\mathcal{H}_a$ and $\mathcal{H}_{\bar a}$, is a fully-entangled state. The result is a final state $|\Psi\rangle \in \mathcal{H}_{\bar a} \otimes \mathcal{H}_a \otimes \mathcal{H}_b$, referred to as a "channel ket" in [8] (which see for additional details). That there is a perfect quantum channel from $a$ to $b'$ in Fig. 1(a) or (b) is the same as saying that all the information about qubit $\bar a$ is in qubit $b'$ if one uses $|\Psi\rangle$, or equivalently the reduced density operator obtained by tracing $|\Psi\rangle\langle\Psi|$ over $\mathcal{H}_a$, to generate probabilities for correlations between the two in the usual way.
An alternative way to associate channels with kets is to use map-state duality [8,28], in which an entangled ket
$$|\psi\rangle = \sum_j |a_j\rangle \otimes |\phi_j\rangle \tag{3}$$
on the tensor product of $\mathcal{H}_a$ and another space $\mathcal{H}_f$ is expanded in an orthonormal basis of $\mathcal{H}_a$, with $\{|\phi_j\rangle\}$ the (unnormalized) expansion coefficients. One can always "transpose" $|\psi\rangle$ to an operator
$$M = \sum_j |\phi_j\rangle\langle a_j| \tag{4}$$
mapping $\mathcal{H}_a$ to $\mathcal{H}_f$. In the particular case in which $|\psi\rangle$ is maximally entangled, which is to say $\mathrm{Tr}_f(|\psi\rangle\langle\psi|)$ is proportional to the identity $I_a$, the map M is, up to normalization, an isometry (a unitary operator from $\mathcal{H}_a$ to the subspace $M\mathcal{H}_a$ of $\mathcal{H}_f$), that is, $M^\dagger M = c\, I_a$ for some constant $c > 0$. Conversely, given a map M from $\mathcal{H}_a$ to $\mathcal{H}_f$ and an orthonormal basis of $\mathcal{H}_a$, it can be expanded in the form (4), and (3) then defines a corresponding entangled state (typically not normalized), which when M is an isometry is maximally entangled. Of course, if $|\psi\rangle$ is given, M depends on the choice of orthonormal basis $\{|a_j\rangle\}$, and vice versa, but the Presence theorem is unaffected by the basis choice. While this and the other theorems in Sec. V can be expressed either in "map" or "entanglement" language, the latter has the advantage of being more symmetrical (see remarks in Sec. VI of [8]). The idea of regarding the input and output of a quantum channel as corresponding to the tensor product of two Hilbert spaces, as suggested by the preceding discussion, is a very natural notion when using atemporal diagrams [29], and within the consistent histories approach to probabilistic time development [11], where the idea of such a tensor product goes back to Isham [30].
Conventional "two bit" teleportation, Fig. 2, with $\sqrt{2}\,|B_0\rangle = |00\rangle + |11\rangle$, can be analyzed in the same way; the details are left as an exercise. The two classical bits in (a) are labeled x and z to indicate that they are essential for correct transmission of the X and Z information, respectively; throwing z away will not affect X information, and x is dispensable if only Z information is of interest.
Neither classical bit, nor the two together, actually contain any information in themselves, a consequence of No Splitting, Sec. V B. All the information is in correlations of the classical bits with the b qubit (more details in [31]). One classical bit is needed for each of these two incompatible types of quantum information, but that is enough, for the Presence theorem then guarantees that all other species are correctly transmitted, so one has a perfect quantum channel from $a$ to $b'$. Figure 2(b) is the quantized version of (a), which is convenient for analyzing the situation in quantum terms, and one could once again introduce a channel ket $|\Psi\rangle$ (not shown), this time on four qubits, in analogy with Fig. 1(c). The Presence theorem can be applied to the channel ket, or one can regard the channel entrance as constituting its own Hilbert space. Generalization to the teleportation of a qudit (with Hilbert space of dimension $d > 2$) is straightforward: if two suitably incompatible types of information are correctly transmitted (each type requires a classical "dit"), one has a perfect quantum channel.

IV Decoherence and Classical Information
An application of types of information to a simple case of decoherence is shown in Fig. 3, where a particle (neutron or photon) enters an interferometer on path d at beamsplitter $B$ and, because at an intermediate time it is in a coherent superposition $(|e\rangle + |f\rangle)/\sqrt{2}$, leaves the second beamsplitter $B'$ in channel h rather than g. But if while inside the interferometer some interaction with the environment leaves a trace indicating that the particle took path e rather than f, or vice versa, the interference effect is lost, and the particle emerges with equal probability in g or h. Let Z be the e vs. f "which way" information, and X be the
Decoherence, the disappearance of coherence, in this case X information, when Z information about the path resides in the environment, illustrates the Exclusion theorem of Sec. V: one type of information about $S_a$ perfectly present in $S_b$ means that a mutually-unbiased type is completely absent from $S_c$. Two types of information X and Z are said to be mutually unbiased if they correspond to mutually unbiased orthonormal bases $\{|x_j\rangle\}$ and $\{|z_k\rangle\}$, with $|\langle x_j|z_k\rangle|^2$ equal to 1 divided by the dimension of the Hilbert space, independent of j and k.
To apply this theorem to the situation in Fig. 3, think of the particle that has just passed through the first beam splitter as system $S_a$, and just before it reaches the second beam splitter $B'$ as $S_c$, while $S_b$ is the environment at this second time. (See the discussion in Sec. III on why one can regard the particle at two different times as two separate systems, and how to apply the Exclusion theorem, worded in terms of entangled states, to situations with unitary time evolution.) For our purposes it suffices to model $S_a$ and $S_c$ using a $d = 2$ dimensional Hilbert space spanned by $|e\rangle$ and $|f\rangle$; this is analogous to focusing on the spin of a particle when its other degrees of freedom are not relevant to the analysis. The Exclusion theorem says that when the Z or which-way information about $S_a$ is perfectly present in the environment, i.e., at the time the particle reaches the second beam splitter, the (mutually unbiased) X or coherence information must be perfectly absent from $S_c$, i.e., from the particle itself at this later time. And in the absence of coherence all interference effects disappear: the situation after the second beam splitter is, statistically, just the same as if the particle arrived at random on path e or path f. All this is well known, and the connection between decoherence and information in the environment has been previously pointed out by Zurek and his collaborators [32-35]. The use of types of information, not tied to some notion of measurement [36], is our attempt to add further clarity and precision to these seminal ideas.
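The loss of interference can be reproduced in a toy calculation. The sketch below rests on assumptions made only for illustration: the environment is a two-state system whose states recording paths e and f are real unit vectors separated by an angle `theta`, and the second beam splitter acts as $|e\rangle \to (|g\rangle + |h\rangle)/\sqrt{2}$, $|f\rangle \to (-|g\rangle + |h\rangle)/\sqrt{2}$, a phase convention chosen so that $(|e\rangle + |f\rangle)/\sqrt{2}$ exits in channel h. The function name is hypothetical.

```python
import math

def output_probabilities(theta):
    """Return (P_g, P_h) for the particle after the second beam splitter.

    theta = 0      : the environment keeps no which-way record
    theta = pi / 2 : the record is perfect (orthogonal environment states)
    """
    env_e = [1.0, 0.0]                          # environment state if path e
    env_f = [math.cos(theta), math.sin(theta)]  # environment state if path f
    # State inside the interferometer: (|e>|env_e> + |f>|env_f>)/sqrt(2).
    # Applying the assumed beam splitter map gives, for each output channel,
    # an unnormalized environment vector:
    amp_g = [(env_e[k] - env_f[k]) / 2 for k in range(2)]
    amp_h = [(env_e[k] + env_f[k]) / 2 for k in range(2)]
    p_g = sum(a * a for a in amp_g)
    p_h = sum(a * a for a in amp_h)
    return p_g, p_h

no_record = output_probabilities(0.0)               # full interference
perfect_record = output_probabilities(math.pi / 2)  # strong decoherence
partial_record = output_probabilities(math.pi / 4)  # partial decoherence
```

With no record the particle always leaves in h; with a perfect record the two channels are equally likely, as the Exclusion theorem requires. In this model $P_h = (1 + \cos\theta)/2$, so intermediate angles give partial decoherence, a case the theorem itself does not address.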
The situation to which the Exclusion theorem applies is that of strong, meaning essentially complete, decoherence. Clearly extensions are needed (Sec. VI) to cases of only partial decoherence. Nevertheless, strong decoherence is a useful idealization both because it is often a good approximation to what is realized in the laboratory (to the dismay of those who want to build quantum computers), and because it yields a precise definition of another idealization, classical information. Indeed, it is rather odd to find the term "classical information" floating around in technical books and articles on quantum information theory when most, even if not all, physicists believe that all physical processes in the real world are quantum mechanical, with classical physics a good approximation in appropriate circumstances, but hardly part of our fundamental understanding of nature. A good way to see how "classical" information can arise in quantum mechanics is to note that one consequence of the Truncation theorem as discussed in Sec. V B is the fact that if a particular type of information about $S_a$ associated with an orthonormal basis $\{|v_j\rangle\}$ is perfectly present in $S_b$, it is the only type of information about $S_a$ which can be present in a third system $S_c$, in the sense that any other species of information is parasitic upon, or controlled by, or compatible with the $\{|v_j\rangle\}$ type. Whenever only one type of information needs to be considered all the rules of classical information theory apply to it; conversely, "classical information" in the quantum context refers to the single dominant type of quantum information available in a situation of strong decoherence. Typically it is the presence of this type of information in the environment that means that other types can be ignored in systems which are not isolated from the environment. In particular, the measurements indicated in part (a) of Figs. 1 and 2, when instantiated in physical apparatus, amplify a particular type of information, and the environment rapidly copies the "pointer positions," resulting in strong decoherence. To avoid the rather unwieldy task of trying to describe this amplification process and interaction with the environment in correct quantum mechanical terms, which is certainly possible in principle, it is often preferable (as noted earlier) to employ a simple quantum circuit in which the decoherence is "built in": the a qubit in Fig. 1(b), and the a and c qubits in Fig. 2(b), are at later times good copies of the Z information preceding the final control gates, and since no further use is made of them, they may be regarded as carrying this type of information off into the environment.

V Theorems
In this section we state and prove results used in the preceding sections, plus some additional ones that are closely related. The treatment builds upon ideas and terminology from [8], repeated here to the extent needed to make the exposition self-contained. Note in particular that $\mathcal{H}_a$ is the Hilbert space of system $S_a$, $\mathcal{H}_{ab} = \mathcal{H}_a \otimes \mathcal{H}_b$ that of $S_{ab}$, the systems $S_a$ and $S_b$ regarded as a single system, $\rho_{ab}$ is a density operator on $\mathcal{H}_{ab}$, often traced down from that of a larger system, $d_a$ the dimension of $\mathcal{H}_a$, and so forth. All Hilbert spaces are assumed to be of finite dimension in order to avoid technical complications.

V A Presence
Theorem (Presence). Let $S_a$ and $S_b$ be two quantum systems with Hilbert spaces $\mathcal{H}_a$ and $\mathcal{H}_b$, and $V = \{V_j\}$ and $W = \{W_k\}$ two strongly incompatible projective decompositions of the identity $I_a$. If both the V and the W information is perfectly present in $S_b$ for a density operator $\rho$ on $\mathcal{H}_{ab}$ (possibly a pure state), then all types of information about $S_a$ are perfectly present in $S_b$.
The terms are to be understood as follows. The density operator $\rho$ on $\mathcal{H}_{ab}$ or pure state $|\Psi\rangle \in \mathcal{H}_{ab}$ will be called a pre-probability using the terminology of Ch. 9 of [11], because it can be used to generate probabilities once a quantum sample space (an orthonormal basis of $\mathcal{H}_{ab}$ or a decomposition of the identity $I_{ab}$) has been specified, following the usual rule that the probability associated with a projector P is
$$\Pr(P) = \langle P\rangle = \mathrm{Tr}(P\rho) = \langle\Psi|P|\Psi\rangle, \tag{5}$$
with $\rho = |\Psi\rangle\langle\Psi|$ for a pure state. For example, $\rho$ as a density operator is represented by different matrices if different orthonormal bases are chosen. The diagonal elements of one of these matrices form a probability distribution associated with the corresponding basis (or type of information), whereas the single density operator giving rise to the different distributions is the pre-probability. The V type of information is perfectly present in $S_b$ for a given pre-probability (Sec. III C of [8]) when the unnormalized conditional density operators
$$\rho_{bj} = \mathrm{Tr}_a(V_j\, \rho) \tag{6}$$
on $\mathcal{H}_b$ are mutually orthogonal, i.e., $\rho_{bj}\,\rho_{bk} = 0$ for $j \neq k$.
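As a worked illustration of this definition (the two states below are chosen here for illustration and are not taken from [8]), let $S_a$ and $S_b$ be qubits and compare a maximally entangled pure state with a classically correlated mixture, using the Z decomposition $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$ and the X decomposition $\{|+\rangle\langle +|, |-\rangle\langle -|\}$ of $I_a$:

```latex
% Case 1: |\Psi\rangle = (|00\rangle + |11\rangle)/\sqrt{2}
%                      = (|{+}{+}\rangle + |{-}{-}\rangle)/\sqrt{2}
\begin{align*}
  \rho_{b0} &= \tfrac12 |0\rangle\langle 0|, &
  \rho_{b1} &= \tfrac12 |1\rangle\langle 1|, &
  \rho_{b0}\,\rho_{b1} &= 0 \quad\text{(Z perfectly present)},\\
  \rho_{b+} &= \tfrac12 |+\rangle\langle +|, &
  \rho_{b-} &= \tfrac12 |-\rangle\langle -|, &
  \rho_{b+}\,\rho_{b-} &= 0 \quad\text{(X perfectly present)}.
\end{align*}
% Case 2: \rho = \tfrac12\,(|00\rangle\langle 00| + |11\rangle\langle 11|)
\begin{align*}
  \rho_{b0} &= \tfrac12 |0\rangle\langle 0|, &
  \rho_{b1} &= \tfrac12 |1\rangle\langle 1|, &
  \rho_{b0}\,\rho_{b1} &= 0 \quad\text{(Z perfectly present)},\\
  \rho_{b+} &= \tfrac14 I_b, &
  \rho_{b-} &= \tfrac14 I_b, &
  \rho_{b+}\,\rho_{b-} &= \tfrac1{16} I_b \neq 0
  \quad\text{(X not perfectly present)}.
\end{align*}
```

In the first case two incompatible types are perfectly present, which is the situation the Presence theorem addresses; in the second only the Z type is, and the state behaves like a shared classical bit.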
In the language of measurements, if one thinks of carrying out a projective measurement on $\mathcal{H}_a$ corresponding to $\{V_j\}$, then there is a corresponding decomposition $\{T_k\}$ of $I_b$ such that the measurement outcomes are in one-to-one correspondence. An analogous definition applies to W information. The conclusion of the theorem, that all species of information about $S_a$ are perfectly present in $S_b$, conveniently abbreviated to "all information about $S_a$ is in $S_b$," means that for any decomposition of the identity, in particular for any orthonormal basis of $\mathcal{H}_a$, that kind of information is perfectly present in $S_b$ in the sense just discussed. When the pre-probability is a pure state $|\Psi\rangle$, this implies it is maximally entangled, i.e., $\rho_a = \mathrm{Tr}_b(|\Psi\rangle\langle\Psi|)$ is proportional to the identity $I_a$. For a general $\rho$ a similar but more complicated result obtains (see theorem 3(ii) in [8]), and once again $\rho_a = \mathrm{Tr}_b(\rho)$ is proportional to $I_a$.
The decompositions V and W are said to be strongly incompatible (Sec. IV of [8]) when the only projector P that commutes with every $V_j$ and every $W_k$ is either $P = 0$ or $P = I_a$. While concise, this definition is not very intuitive. In the case of orthonormal bases $V = \{|v_j\rangle\}$ and $W = \{|w_k\rangle\}$ one can use a somewhat simpler definition. Construct a graph containing $2d_a$ nodes, one for each $|v_j\rangle$ and one for each $|w_k\rangle$. Whenever the inner product $\langle v_j|w_k\rangle$ is nonzero, draw an edge between the corresponding nodes. Then V and W (i.e., the corresponding collections of projectors) are strongly incompatible if and only if this graph is connected. The proof is at the end of App. A. In the case of two mutually unbiased bases, as in Sec. IV, every $|v_j\rangle$ node is connected to every $|w_k\rangle$ node, so connectivity of the graph is obvious. It is equally obvious for two bases in which $\langle v_j|w_k\rangle$ is never zero. However, strong incompatibility can still hold if some of the $\langle v_j|w_k\rangle$ are zero, provided the graph remains connected.
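The graph criterion is easy to mechanize. The following is a minimal sketch, assuming each basis is supplied as a list of kets written as lists of amplitudes in some common reference basis; the function name `strongly_incompatible` and the example bases are introduced here for illustration only.

```python
import math

def strongly_incompatible(v_basis, w_basis, tol=1e-12):
    """Graph test for two orthonormal bases of the same d-dimensional space.

    Nodes 0..d-1 stand for the |v_j>, nodes d..2d-1 for the |w_k>; an edge
    joins |v_j> and |w_k> whenever <v_j|w_k> is nonzero.  Returns True if
    the resulting graph is connected."""
    d = len(v_basis)

    def inner(x, y):
        return sum(complex(a).conjugate() * b for a, b in zip(x, y))

    neighbours = {n: [] for n in range(2 * d)}
    for j, v in enumerate(v_basis):
        for k, w in enumerate(w_basis):
            if abs(inner(v, w)) > tol:
                neighbours[j].append(d + k)
                neighbours[d + k].append(j)

    # Depth-first search starting from node 0.
    seen, stack = {0}, [0]
    while stack:
        n = stack.pop()
        for m in neighbours[n]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == 2 * d

r = 1 / math.sqrt(2)
standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# <v_2|w_0> = 0, but the graph is still connected.
connected_w = [[r, r, 0], [0.5, -0.5, r], [0.5, -0.5, -r]]
# |w_0> = |v_0>, so the projector |v_0><v_0| commutes with everything.
block_w = [[1, 0, 0], [0, r, r], [0, r, -r]]

qubit_z_vs_x = strongly_incompatible([[1, 0], [0, 1]], [[r, r], [r, -r]])
qutrit_connected = strongly_incompatible(standard, connected_w)
qutrit_block = strongly_incompatible(standard, block_w)
```

The second qutrit example fails the test because the node for $|v_0\rangle$ is joined only to $|w_0\rangle$, leaving the graph in two pieces.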
The proof of the Presence theorem, extending a weaker theorem in [8], is in App. A.

V B Truncation, Exclusion, No Splitting, Somewhere
A series of useful "all-or-nothing" results about information in three systems $S_a$, $S_b$, and $S_c$ begins with:
Theorem (Truncation). Let $S_a$, $S_b$ and $S_c$ be three quantum systems, and suppose that for some decomposition $V = \{V_j\}$ of $I_a$ and for some density operator $\rho$ on $\mathcal{H}_{abc}$ all the V information about $S_a$ is present in $S_b$. Then any other type of information $W = \{W_k\}$ about $S_a$ will be "truncated" (or "censored") in the sense that
$$\rho_{ac} = \sum_j V_j\, \rho_{ac}\, V_j,$$
that is, $\rho_{ac}$, the partial trace of $\rho$ over $\mathcal{H}_b$, commutes with all the $V_j$. (Note that $V_j$ is here understood as $V_j \otimes I_c$ on $\mathcal{H}_{ac}$.) Equivalently, all correlations between $S_a$ and the third system $S_c$ satisfy $\langle AC\rangle = \langle \bar A C\rangle$ for any operators A and C on $\mathcal{H}_a$ and $\mathcal{H}_c$, respectively (one could write $A \otimes C$ in place of $AC$), where
$$\bar A = \sum_j V_j A V_j$$
is the truncated version of the operator A, and the average is taken with respect to $\rho$, as in (5).
This theorem is closely related to, but not the same as, theorem 6(i) in [8], and its proof is in App. B. Since any operator A can be written as $A = \sum_{jk} V_j A V_k$, its truncated version $\bar A$ is obtained by throwing away the off-diagonal blocks. To understand the implications of the theorem it helps to consider the case in which $V = \{|v_j\rangle\}$ is an orthonormal basis of $\mathcal{H}_a$, so that
$$\bar A = \sum_j \langle v_j|A|v_j\rangle\, |v_j\rangle\langle v_j|$$
is diagonal in this basis, meaning that all correlations between A and C can be computed from the correlations $\langle V_j C\rangle$, that is, from V information about $S_a$ in $S_c$. Equivalently,
$$\rho_{ac} = \sum_j |v_j\rangle\langle v_j| \otimes \Gamma_j,$$
where the $\Gamma_j$ are operators on $\mathcal{H}_c$. All other information about $S_a$ in $S_c$, of whatever kind, is then "parasitic upon," "truncated by," or "censored relative to" the V information. When the $V_j$ projectors have rank greater than 1 the truncation or censorship is less extreme, but it remains true that the only sort of information about $S_a$ allowed in $S_c$ is represented by $\bar A$-type operators which are compatible with V in the sense of commuting with every $V_j$ or, equivalently, $\rho_{ac}$ commutes with every $V_j$. The situation is particularly clear if there is another basis $W = \{|w_k\rangle\}$ which is mutually unbiased with respect to $V = \{|v_j\rangle\}$, i.e., $|\langle w_k|v_j\rangle|^2 = 1/d_a$ for all j and k. In that case the truncated projectors $\bar W_k$ are not only diagonal in the V representation, but are all equal to $I_a/d_a$ independent of k, proving the next theorem (which is the same as theorem 7(ii) in [8]):
Theorem (Exclusion). Let $S_a$, $S_b$ and $S_c$ be three quantum systems, and $V = \{|v_j\rangle\}$ and $W = \{|w_k\rangle\}$ two mutually unbiased orthonormal bases of $\mathcal{H}_a$. Then if the V information about $S_a$ is perfectly present in $S_b$, the W information about $S_a$ is perfectly absent from $S_c$.
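For a single qubit the truncation map can be written out explicitly. The following worked example is a sketch that takes V to be the standard basis $\{|0\rangle, |1\rangle\}$ and W the basis $\{|+\rangle, |-\rangle\}$, a choice made here for illustration; the matrix entries $a_{jk}$ are generic.

```latex
% Truncation of a general operator A relative to V = {|0>, |1>}:
\[
  A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}
  \quad\longmapsto\quad
  \bar A = \sum_{j=0}^{1} |j\rangle\langle j|\, A\, |j\rangle\langle j|
         = \begin{pmatrix} a_{00} & 0 \\ 0 & a_{11} \end{pmatrix}.
\]
% Truncation of the projectors of the mutually unbiased basis W = {|+>, |->}:
\[
  W_\pm = |\pm\rangle\langle\pm|
        = \frac12 \begin{pmatrix} 1 & \pm 1 \\ \pm 1 & 1 \end{pmatrix}
  \quad\longmapsto\quad
  \bar W_\pm = \frac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
             = \frac{I_a}{d_a}, \qquad d_a = 2 .
\]
```

Since $\bar W_+ = \bar W_-$, the correlations $\langle W_\pm C\rangle = \langle \bar W_\pm C\rangle = \frac12 \langle C\rangle$ do not depend on the sign, which is the content of the Exclusion theorem in this case.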
The perfect absence of some type of information can be defined using reduced density operators, as in (6), but now they are required to be the same up to a multiplicative constant. That is, the W or $\{W_k\}$ information about $S_a$ is perfectly absent from $S_c$ if and only if for every k
$$\mathrm{Tr}_a(W_k\, \rho_{ac}) = p_k\, \rho_c,$$
where $\rho_c = \mathrm{Tr}_a(\rho_{ac})$, $\rho_{ac}$ is the density operator (pre-probability) on $\mathcal{H}_{ac}$, and the $p_k$ are nonnegative numbers. One can think of this in terms of measurements as saying that when a projective measurement corresponding to $\{W_k\}$ is made on $S_a$, the probability of any measurement on $S_c$ conditioned on the outcome k will be independent of k. Below we will need the notion of the perfect absence of all types of information about $S_a$ from $S_c$, conveniently abbreviated to "no information about $S_a$ is in $S_c$." This is equivalent to $\rho_{ac} = \rho_a \otimes \rho_c$, or to $|\Psi\rangle = |\alpha\rangle \otimes |\gamma\rangle$ for a pure state, theorem 1(iii) of [8]. As the relationship is obviously symmetrical, one can also say that $S_a$ and $S_c$ are uncorrelated.
An important corollary of the Exclusion theorem is:
Theorem (No Splitting). Let $S_a$, $S_b$ and $S_c$ be three quantum systems. If all types of information about $S_a$ are perfectly present in $S_b$, then all types will be perfectly absent from $S_c$. That is, if all information about $S_a$ is in $S_b$, no information about $S_a$ is in $S_c$.
This is theorem 8(i) in [8]. It follows at once from the Exclusion theorem, because to show the absence of some species of information about $S_a$ in $S_c$ it suffices to consider orthonormal bases, and for each of these we know that there is at least one mutually unbiased basis for which all the corresponding information is, by hypothesis, perfectly present in $S_b$. The No Splitting theorem has many applications. For example, in either one or two bit teleportation after the final corrections have been made, there is no information about the input state $|\psi\rangle$ remaining in the environment treated as a quantum system, and since copies of the classical bits x and z used to complete the protocol can remain in the environment, it is evident that they, as has often been observed, can contain no information about the input: their probabilities cannot depend upon $|\psi\rangle$. In the case of quantum codes the presence of the encoded information in some subset of the coding bits (which is what makes error correction possible) means its absence from the complementary subset of coding bits, and this can provide additional intuition about the coding process [8].
Is there a converse to the Exclusion theorem which says that if the $W$ information about $S_a$ is perfectly absent from $S_c$, then that associated with any mutually unbiased basis $V$ must be present in $S_b$? No, not even if one knows that all information about $S_a$ is present in the combined system $S_{bc}$; see the end of App. B for a counterexample. There is, on the other hand, a partial converse of the No Splitting theorem:
Theorem (Somewhere). If for a pure state pre-probability $|\Psi\rangle$ on $\mathcal{H}_{abc}$ it is the case that all the information about $S_a$ is in the combined system $S_{bc}$, and none of it is in $S_c$, then it is all in $S_b$.
The name "Somewhere" comes from the idea that if we know that an object is in one of two rooms and it is not in the second, it has to be in the first: it must be somewhere. However, information is very different from a lost child, as it can be present in correlations between two systems, while not being available in either system by itself. See, for example, the discussion of X information in one bit teleportation in Sec. III. Consequently, the Somewhere theorem is a decidedly quantum mechanical result. Also, it fails (in general) if the pure state is replaced by a density operator. The theorem itself is proved in [8] as theorem 8(ii), where one will also find an application to quantum codes.

V C Absence
Given the Presence theorem one might anticipate a similar Absence theorem. It comes in two versions:
Theorem (Absence). Let $S_a$ and $S_b$ be two quantum systems.
i) Simple version. If the pre-probability is a pure state $|\Psi\rangle$ on $\mathcal{H}_a \otimes \mathcal{H}_b$ and the information associated with a single orthonormal basis $\{|v_j\rangle\}$ of $\mathcal{H}_a$ is completely absent from $\mathcal{H}_b$, then $|\Psi\rangle$ is a product state of the form $|a\rangle \otimes |b\rangle$, so there is no information about $S_a$ in $S_b$ or vice versa; the two are uncorrelated.
ii) Complicated version. Let the pre-probability be a general density operator $\rho$ on $\mathcal{H}_a \otimes \mathcal{H}_b$, and let $\{V^{(m)}\}$ be a collection of decompositions of the identity $I_a$ of $\mathcal{H}_a$, $V^{(m)} = \{V^{(m)}_j\}$ for each $m$, where the $V^{(m)}_j$ for all $j$ and $m$ together span the space of linear operators on $\mathcal{H}_a$. If the information associated with every $V^{(m)}$ is completely absent from $\mathcal{H}_b$, then $\rho = \rho_a \otimes \rho_b$, so there is no information about $S_a$ in $S_b$ or vice versa; the two are uncorrelated.
The proof of both versions is given in App. C. For version (ii) the conditions are definitely more complicated than for the Presence theorem: if $\mathcal{H}_a$ has dimension $d_a$, one needs to check not two but at least $d_a + 1$ orthonormal bases (see end of App. C) in order to be sure that all information is absent. For instance, if $S_a$ is a qubit, $d_a = 2$, it suffices to check that the $X$, $Y$, and $Z$ types of information are absent, but two out of the three is not enough, as shown in the example at the end of App. B.
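The qubit remark, that two absent types out of three do not guarantee the absence of all information, can be illustrated numerically. The sketch below uses an assumed three-qubit state $(|{+}\rangle|00\rangle + |{-}\rangle|11\rangle)/\sqrt{2}$, chosen here because it has the stated properties; it is not necessarily the state used in App. B. The helper names `kron`, `trace_a` and `absent` are hypothetical.

```python
import numpy as np

# Single-qubit basis kets for the X, Y and Z types of information.
zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)
bases = {
    "X": [(zero + one) / np.sqrt(2), (zero - one) / np.sqrt(2)],
    "Y": [(zero + 1j * one) / np.sqrt(2), (zero - 1j * one) / np.sqrt(2)],
    "Z": [zero, one],
}

def kron(*kets):
    """Tensor product of several kets, in the order given."""
    out = np.array([1], dtype=complex)
    for k in kets:
        out = np.kron(out, k)
    return out

# Assumed example state (|+>|00> + |->|11>)/sqrt(2), qubits ordered a, b, c.
plus, minus = bases["X"]
psi = (kron(plus, zero, zero) + kron(minus, one, one)) / np.sqrt(2)

# Reduced density operator rho_ab = Tr_c |psi><psi|.
t = psi.reshape(2, 2, 2)
rho_ab = np.einsum("abc,dec->abde", t, t.conj()).reshape(4, 4)

def trace_a(op):
    """Partial trace over qubit a of an operator on H_a (x) H_b."""
    return np.trace(op.reshape(2, 2, 2, 2), axis1=0, axis2=2)

rho_b = trace_a(rho_ab)
rho_a = np.trace(rho_ab.reshape(2, 2, 2, 2), axis1=1, axis2=3)

def absent(kets):
    """True if Tr_a(W_k rho_ab) is proportional to rho_b for every k."""
    for v in kets:
        W = np.outer(v, v.conj())
        cond = trace_a(np.kron(W, np.eye(2)) @ rho_ab)
        if not np.allclose(cond, np.trace(cond) * rho_b):
            return False
    return True

absent_by_type = {name: absent(kets) for name, kets in bases.items()}
is_product = bool(np.allclose(rho_ab, np.kron(rho_a, rho_b)))
print(absent_by_type, is_product)
```

For this state the $Y$ and $Z$ checks pass while the $X$ check fails, and $\rho_{ab}$ is not a product, so testing only two mutually unbiased bases would give the wrong conclusion.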

V D No Cloning
One might suspect that the No-Splitting theorem is the same as, or at least closely related to, the well-known no cloning result [9]. However, the two seem to be different, since neither the conditions nor the consequences of no-cloning are expressed in terms of types of information as used here. The following theorem is the closest we have been able to come in finding a connection between the two.
Theorem (Generalized No Cloning). Let $M$ be an isometry from $\mathcal{H}_a$ to $\mathcal{H}_{bc}$, and $\{|\alpha_j\rangle\}$, with $j$ lying in a finite index set $J$, a collection of normalized kets on $\mathcal{H}_a$ with the property that the pairs $(j, k)$ for which the inner product $\langle\alpha_j|\alpha_k\rangle$ is nonzero, when treated as edges, produce a connected graph on the set $J$. Assume that for each $|\alpha_j\rangle$ its image under $M$ is a product state
$$ M|\alpha_j\rangle = |\beta_j\rangle \otimes |\gamma_j\rangle, $$
where both $|\beta_j\rangle$ and $|\gamma_j\rangle$ are normalized, and that
$$ |\langle\alpha_j|\alpha_k\rangle| = |\langle\beta_j|\beta_k\rangle| $$
whenever the left side is nonzero. Under these conditions $M$ restricted to the subspace $\mathcal{G}_a$ spanned by the $\{|\alpha_j\rangle\}$ is of the form
$$ M = U \otimes |\gamma_1\rangle, $$
where $U$ is a unitary map of $\mathcal{G}_a$ onto the subspace $\mathcal{G}_b$ of $\mathcal{H}_b$ spanned by the $\{|\beta_j\rangle\}$, and $|\gamma_1\rangle$ is a fixed ket in $\mathcal{H}_c$.
The proof is in App. D. The connection with no-cloning, not obvious given the somewhat abstract statement of the theorem, is the following. Suppose $j$ takes on just two values 1 and 2, the states $|\alpha_1\rangle$ and $|\alpha_2\rangle$ are linearly independent, and $\langle\alpha_1|\alpha_2\rangle \neq 0$. Imagine these are two states to be cloned, and the isometry $M$ (which can be replaced with a unitary acting on the tensor product of $\mathcal{H}_a$ and an additional space $\mathcal{H}_s$ initially in a state $|s_0\rangle$) is supposed to carry out the cloning process. If $|\beta_1\rangle$ and $|\beta_2\rangle$ are good copies up to some unitary transformation of $\mathcal{H}_b$, their inner product must equal $\langle\alpha_1|\alpha_2\rangle$ apart from an unimportant phase. As the conditions of the theorem are fulfilled (the graph consists of two nodes joined with the edge (1,2)), it follows that $M$ is not only unable to produce additional copies in $\mathcal{G}_c$, but in fact there is no information at all in $\mathcal{G}_c$ which would allow distinguishing the states $|\alpha_1\rangle$ and $|\alpha_2\rangle$. Thus at least for the subspace $\mathcal{G}_a$ (which could be all of $\mathcal{H}_a$ if the span of $\{|\alpha_j\rangle\}$ is large enough) one arrives at the same conclusion as with the No Splitting theorem, but using somewhat different hypotheses.

VI Conclusion
Identifying types or species of quantum information and noting when they are compatible (i.e., the projectors commute) or incompatible looks like a promising approach to the foundations of quantum information for the following reasons. First, it allows a more intuitive, as well as a fully consistent, approach to quantum probabilities at the microscopic level, in contrast to the usual textbook approach, with its preparations, measurements, and "great smoky dragon" [37], long known to provide an awkward and difficult (and internally inconsistent [38,39]) way of thinking about the quantum world, however effective it may be as a calculational tool for the final outcomes of measurements. Second, the ideas of classical information theory [3] are directly applicable to quantum systems as long as one restricts oneself to a single type of quantum information, or to two or more compatible types (which can then be combined to form a single type), because there is a properly defined sample space on which probabilities of quantum events and processes, and their correlations, satisfy the standard rules of probability theory, which are fundamental to the structure of information theory as developed by Shannon and his successors. Note that it is not necessary to restrict oneself to macroscopic systems or asymptotically large N (number of transmissions, or whatever) limits.
Third, the existence of different incompatible species of quantum information is at the heart of the objections raised in [4] to extending Shannon's theory to the quantum domain. Recognizing the role of quantum incompatibility and using different information types gets around these problems and allows a fully consistent formulation of the microscopic statistical correlations needed to properly begin the "quantization" of classical information theory. Fourth, one of the principal ways quantum information goes beyond its classical counterpart is in its discussion of how incompatible types of information relate to, or so-to-speak constrain, each other for a given setup, or quantum circuit, or entangled state. The Presence, Truncation, and Absence theorems and their various corollaries in Sec. V clearly do not belong to the domain of classical information, since their very formulation requires reference to noncommuting operators, the hallmark of "quantum" effects.
The approach presented here provides, we believe, new intuitive insight into the processes of teleportation and decoherence, and into how "classical" information can be consistently described as a quantum phenomenon. It is obviously incomplete in two respects. First, the theorems of Sec. V are of the "all or nothing" variety: they apply to extreme situations in which information is either completely present or completely absent. Obviously it would be valuable to have quantitative extensions of these theorems, presumably in the form of inequalities, that apply to situations where information of different kinds is partially present or absent. Finding suitable information measures and proving appropriate bounds looks like a challenging problem, but one that needs to be addressed given that one is often interested in situations where there is noise, so different types of quantum information will be degraded in different ways. There are, of course, many inequalities involving quantum information in the published literature, and some of them, such as those of Hall [40,41], look as if they can be reformulated to apply to different species of information as discussed here.
Second, the examples and theorems given in this paper (and their extensions beyond "all or nothing" noted above) need to be generalized to cases in which microscopic quantum properties are considered at a large number of successive times, as in the case of "quantum jumps" [42][43][44]. For this purpose it is likely that the full machinery of quantum histories [11] will be needed in order to provide consistent probabilistic descriptions without having to invoke the awkward concepts of macroscopic "preparation" and "measurement," which are obviously not a fundamental part of microscopic quantum mechanics.

A Presence Theorem
The Presence theorem of Sec. V is an extension of theorem 4 of [8], where it was shown to hold when $\rho$ is a pure state on $\mathcal{H}_{ab}$. Our task here will be to remove that restriction and prove it for a general density operator $\rho = \rho_{ab}$, where subscripts have been added to avoid any confusion in the following argument. To that end first "purify" $\rho_{ab}$ by introducing an auxiliary system $\mathcal{H}_c$ and a ket
$$ |\Phi\rangle = \sum_q \sqrt{p_q}\, |\psi_q\rangle \otimes |c_q\rangle \tag{A.1} $$
on $\mathcal{H}_{abc}$, with $\{|c_q\rangle\}$ an orthonormal basis of $\mathcal{H}_c$, and coefficients $\{|\psi_q\rangle\}$ chosen so that
$$ \rho_{ab} = \mathrm{Tr}_c(|\Phi\rangle\langle\Phi|) = \sum_q p_q\, |\psi_q\rangle\langle\psi_q|. \tag{A.2} $$
Lemma. Suppose that the $V = \{V_j\}$ information about $S_a$ is perfectly present in $S_b$ for $\rho_{ab}$. Then that is also true for every $|\psi_q\rangle$ in (A.1) for which $p_q > 0$.
To show this, insert (A.2) in place of $\rho$ on the right side of (6), so each $\rho_{bj}$ is a sum over $q$ of positive operators $\rho_{bqj}$. Then in order that (7) hold it is necessary (and obviously sufficient) that $\rho_{bqj}\,\rho_{bq'k} = 0$, for all $q$ and $q'$, whenever $j \neq k$. Setting $q' = q$ gives the desired result.
Now apply the lemma to both the $V$ and the strongly incompatible $W$ type of information. Since for one of the pure states $|\psi_q\rangle$ on $\mathcal{H}_{ab}$ both types of information about $S_a$ are in $S_b$, theorem 4 of [8] tells us that for this pure state all information about $S_a$ is present in $S_b$. This implies (and is implied by, see theorem 3(i) of [8]) that $|\psi_q\rangle$ is maximally entangled, or $\mathrm{Tr}_b(|\psi_q\rangle\langle\psi_q|) = I_a/d_a$. The same conclusion holds for any other choice of the orthonormal basis $\{|c_q\rangle\}$ of $\mathcal{H}_c$ used in (A.1): as is well known, changing that basis does not alter the density operator $\rho = \rho_{ab}$ we began with, but simply expresses it in terms of a different ensemble. Consequently, all types of information about $S_c$ are absent from $S_a$, which is to say $\rho_{ac} = \rho_a \otimes \rho_c$, theorem 1(iii) of [8], and thus all types of information about $S_a$ are also absent from $S_c$.
One more step is needed. The presence of $V$ information about $S_a$ in $S_b$ means it is also present in $S_{bc}$. (This is intuitively obvious, but can also be shown formally from the definition in (35) of [8], where one simply replaces $B_k$ with $B_k \otimes I_c$. Or one can use the definition in (7) of the present paper, along with the fact that if $P$ and $Q$ are positive operators on $\mathcal{H}_e \otimes \mathcal{H}_f$ with $PQ = 0$, then $P_e Q_e = 0$, where $P_e$ and $Q_e$ are partial traces over $\mathcal{H}_f$; to see this, use the spectral representations and take traces.) Of course the same is true of the $W$ information. Hence, applying theorem 4 of [8] to the bipartite system consisting of $S_a$ on the one hand and $S_{bc}$ on the other, with $|\Phi\rangle$ an entangled ket on $\mathcal{H}_a \otimes \mathcal{H}_{bc}$, we conclude, from the presence of two strongly incompatible types of information, that all the information about $S_a$ is present in the combined system $S_{bc}$.
From this and the absence of all information about $S_a$ from $S_c$ demonstrated earlier, it follows from theorem 8 (iii) of [8], using $|\Phi\rangle$ as a pure state pre-probability on $\mathcal{H}_{abc}$, that all information about $S_a$ is in $S_b$. This completes the proof of the theorem.
Next we will show that two orthonormal bases $V = \{|v_j\rangle\}$ and $W = \{|w_k\rangle\}$ are strongly incompatible if and only if the graph described in Sec. V A is connected. First, assume $V$ and $W$ are not strongly incompatible, so there is a projector $P$ that commutes with every $|v_j\rangle\langle v_j|$ and every $|w_k\rangle\langle w_k|$, and is neither 0 nor $I$, so projects onto a proper subspace $\mathcal{P}$ of the Hilbert space. Then some subset of the $\{|v_j\rangle\}$ are inside $\mathcal{P}$ and the rest are in its orthogonal complement $\mathcal{P}^\perp$, for otherwise $P$ would not commute with them, and the same is true of the $\{|w_k\rangle\}$. Evidently there can be no nonzero $\langle v_j|w_k\rangle$ for a $|v_j\rangle$ (or $|w_k\rangle$) in $\mathcal{P}$ and a $|w_k\rangle$ (or $|v_j\rangle$) in $\mathcal{P}^\perp$. Consequently the graph cannot be connected, as it has at least two components, one for the $\mathcal{P}$ and one for the $\mathcal{P}^\perp$ vertices.
For the converse, assume the graph is not connected, and renumber the vertices so that the $\{|v_j\rangle\}$ with $1 \le j \le m < d_a$ are all the $v$ vertices in one connected component $C$ of the graph, with the others in its complement. The projector
$$ P = \sum_{j=1}^{m} |v_j\rangle\langle v_j| $$
obviously commutes with all the $\{|v_j\rangle\langle v_j|\}$, and in addition commutes with every $|w_k\rangle\langle w_k|$ when $|w_k\rangle$ is not in $C$, as otherwise there would be an edge from some $|v_j\rangle$ with $j \le m$ to this $|w_k\rangle$. Now apply the same argument with $I - P$ in place of $P$ to show that $I - P$ commutes with every $|w_k\rangle\langle w_k|$ when $|w_k\rangle$ is in $C$. Since $P$ and $I - P$ commute with the same things, we have shown that $P$ commutes with all the $|w_k\rangle\langle w_k|$ as well as the $|v_j\rangle\langle v_j|$. As $P$ is neither 0 nor $I$, $V$ and $W$ are not strongly incompatible.
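The graph criterion just proved lends itself to a direct numerical test. The following sketch assumes, as the argument above indicates, that the vertices are the kets of the two bases and that an edge joins $|v_j\rangle$ and $|w_k\rangle$ whenever their inner product is nonzero; the function name `strongly_incompatible` and the tolerance are choices made here for illustration.

```python
import numpy as np

def strongly_incompatible(V, W, tol=1e-12):
    """V, W: square arrays whose columns are orthonormal basis kets.

    Returns True if the bipartite graph with an edge v_j -- w_k whenever
    <v_j|w_k> is nonzero is connected.
    """
    V = np.asarray(V, dtype=complex)
    W = np.asarray(W, dtype=complex)
    d = V.shape[1]
    edges = np.abs(V.conj().T @ W) > tol  # edges[j, k]: v_j joined to w_k

    # Depth-first search starting from vertex v_0.
    seen = {("v", 0)}
    stack = [("v", 0)]
    while stack:
        side, i = stack.pop()
        if side == "v":
            neighbours = [("w", int(k)) for k in np.flatnonzero(edges[i, :])]
        else:
            neighbours = [("v", int(j)) for j in np.flatnonzero(edges[:, i])]
        for node in neighbours:
            if node not in seen:
                seen.add(node)
                stack.append(node)
    return len(seen) == 2 * d

# Qubit Z and X bases (mutually unbiased, so every pair is joined).
Zb = np.eye(2)
Xb = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# Dimension 4: W mixes only within two 2-dimensional blocks,
# so the projector onto either block commutes with everything.
V4 = np.eye(4)
W4 = np.kron(np.eye(2), Xb)

print(strongly_incompatible(Zb, Xb),
      strongly_incompatible(Zb, Zb),
      strongly_incompatible(V4, W4))
```

In the four-dimensional example the graph splits into two components, matching the existence of a nontrivial projector that commutes with all the basis projectors.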

B Truncation Theorem and Related
If the $V = \{V_j\}$ information about $S_a$ is perfectly present in $S_b$, there is, as noted following (7), a decomposition $\{T_k\}$ of $I_b$ such that The equivalence is shown in [8], where in fact (B.1) is the primary definition. If the pre-probability defining the averages $\langle\,\cdot\,\rangle$ is a pure state $|\Psi\rangle \in \mathcal{H}_{abc}$, (B.1) implies that As a consequence, and using the fact that $I_a = \sum_j V_j$, $I_b = \sum_k T_k$, we have Therefore it follows that which is (9) when the pre-probability is a pure state. In (B.5) we have used the fact that $T_j$ commutes with $AC$, $T_j T_k = \delta_{jk} T_j$, and (B.4).
To extend the argument to a general density operator $\rho$ on $\mathcal{H}_{abc}$, introduce a fictitious system $\mathcal{H}_r$, purify $\rho$ to a ket $|\Psi\rangle \in \mathcal{H}_{abcr}$, and apply (B.5) to the three part system consisting of $S_a$, $S_b$, and in place of $S_c$ the combined system $S_{cr}$. The significance of $V_j$, $T_j$, and $A$ is the same as before, while $C$ can be replaced by any operator on $\mathcal{H}_{cr}$. If in particular we use $C \otimes I_r$, the result is (9). The equivalence of (8) and (9) is a straightforward exercise when one notes that $\langle AC\rangle = \mathrm{Tr}(AC\rho_{ac})$, and that (9) holds for all $A$ and $C$ (operating on $\mathcal{H}_a$ and $\mathcal{H}_c$). This completes the proof.
The following example shows that the Exclusion theorem does not possess a simple converse of the type mentioned in Sec. V B. The entangled state on $\mathcal{H}_{abc}$, with qubits in the order $|abc\rangle$, has the property that all information about $S_a$ is present in the combined system $S_{bc}$, the $X$ information about $S_a$ is perfectly present in both $S_b$ and in $S_c$, whereas both the $Y$ (basis $\{(|0\rangle \pm i|1\rangle)/\sqrt{2}\}$) and the mutually unbiased $Z$ information about $S_a$ are perfectly absent from both $S_b$ and from $S_c$. Perhaps the easiest way to check this is to expand $|\Psi\rangle$ in the $X$, $Y$, and $Z$ bases of $\mathcal{H}_a$ in turn, and look at the coefficients in $\mathcal{H}_{bc}$. That all the information about $S_a$ is in $S_{bc}$ follows from the observation that any one of these expansions (and therefore all three) is in Schmidt form with Schmidt coefficients of equal magnitude, so theorem 3(i) of [8] applies. Note that we have an example in which if $S_a$ and $S_b$ are considered two parts of a bipartite system with pre-probability given by the density operator $\mathrm{Tr}_c(|\Psi\rangle\langle\Psi|)$, one would be mistaken to suppose that the absence of the two mutually unbiased types of information $Y$ and $Z$ about $S_a$ from $S_b$ implied the complete absence of all information. This confirms the remarks at the end of Sec. V C.

C Absence Theorems
To prove (i), expand the pure state in the orthonormal basis $\{|v_j\rangle\}$ of $\mathcal{H}_a$,
$$ |\Psi\rangle = \sum_j |v_j\rangle \otimes |\beta_j\rangle, \tag{C.1} $$
where the $|\beta_j\rangle$ are expansion coefficients. The fact that the $\{|v_j\rangle\}$ or $\{|v_j\rangle\langle v_j|\}$ information about $S_a$ is absent from $S_b$ means that the $|\beta_j\rangle$ are all proportional to one another, thus multiples of $|\beta_1\rangle$, assuming it is nonzero. Inserting $|\beta_j\rangle = c_j|\beta_1\rangle$ in (C.1) shows that $|\Psi\rangle = |a\rangle \otimes |\beta_1\rangle$, with $|a\rangle = \sum_j c_j |v_j\rangle$, is a product state, so no information about $S_a$ is in $S_b$ or vice versa.
To prove (ii) we employ an orthonormal basis $\{Q_r\}$, $0 \le r \le d_a^2 - 1$, of the space $\hat{\mathcal{H}}_a$ of linear operators on $\mathcal{H}_a$, in the sense that
$$ \langle Q_r, Q_s\rangle := (1/d_a)\,\mathrm{Tr}_a(Q_r^\dagger Q_s) = \delta_{rs}, \tag{C.2} $$
with $Q_0 = I_a$, and thus $\mathrm{Tr}_a(Q_r) = 0$ for $r > 0$. Expand $\rho$ as where, see (C.2), the expansion coefficients are given by with suitable coefficients $c_{rjm}$. (These may not be unique, but that does not matter.) Insert (C.7) in (C.4) and use (C.6) to conclude that every $B_r$ is a multiple of $B_0$, and therefore $\rho = \rho_a \otimes B_0$ is a product.
The need for at least $d_a + 1$ orthonormal bases of $\mathcal{H}_a$ in order to check that all information about $S_a$ is absent from $S_b$ can be seen in the following way. Each basis of $\mathcal{H}_a$ gives rise to $d_a$ orthogonal, and hence linearly independent, operators in the $d_a^2$-dimensional space $\hat{\mathcal{H}}_a$. But these $d_a$ projectors sum to the identity $I$ for each such basis, and therefore $\nu$ such bases will give rise to at most $\nu(d_a - 1) + 1$ linearly independent operators, which is $d_a^2$ when $\nu = d_a + 1$.
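The counting argument can be checked for a qubit, where $d_a = 2$ and the bound $\nu(d_a - 1) + 1$ gives 2, 3, 4 for $\nu = 1, 2, 3$. The sketch below is only an illustration under the assumption that the three bases are the usual $X$, $Y$, and $Z$ bases; `projector_rank` is a hypothetical helper that counts linearly independent projectors by flattening them into vectors.

```python
import numpy as np

zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)

# The three mutually unbiased qubit bases, in the order X, Y, Z.
qubit_bases = [
    [(zero + one) / np.sqrt(2), (zero - one) / np.sqrt(2)],
    [(zero + 1j * one) / np.sqrt(2), (zero - 1j * one) / np.sqrt(2)],
    [zero, one],
]

def projector_rank(bases):
    """Number of linearly independent projectors |v><v| from the given bases."""
    rows = [np.outer(v, np.conj(v)).ravel() for basis in bases for v in basis]
    return int(np.linalg.matrix_rank(np.array(rows)))

d = 2
ranks = [projector_rank(qubit_bases[:nu]) for nu in (1, 2, 3)]
bounds = [nu * (d - 1) + 1 for nu in (1, 2, 3)]
print(ranks, bounds)
```

Only with all three bases do the projectors span the full four-dimensional operator space, which is why two bases cannot certify the absence of all information about a qubit.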