A potential long-range RNA-RNA interaction in the HIV-1 RNA

Abstract It is well-established that viral and cellular mRNAs alike harbour functional long-range intra-molecular RNA-RNA interactions. Despite the biological importance of such interactions, their identification and characterization remain challenging. Here we present a computational method for the identification of certain kinds of long-range intra-molecular RNA-RNA interactions involving the loop nucleotides of a hairpin loop. Using the computational method, we analysed 4272 HIV-1 genomic mRNAs. A potential long-range intra-molecular RNA-RNA interaction within the HIV-1 genomic RNA was identified. The long-range interaction is mediated by a kissing loop structure between two stem-loops of the previously reported SHAPE-based secondary structure of the entire HIV-1 genome. Structural modelling studies were carried out to show that the kissing loop structure not only is sterically feasible, but also contains a conserved RNA structural motif often found in compact RNA pseudoknots. The computational method should be generally applicable to the identification of potential long-range intra-molecular RNA-RNA interactions in any viral or cellular mRNA sequence. Communicated by Ramaswamy H. Sarma


Introduction
The genomic RNAs of viruses harbour many secondary and tertiary structures that are involved in the regulation of various viral processes (Liu et al., 2009). While most of the structures are confined to local RNA sequences, many cases of long-range intragenomic RNA-RNA interactions with essential biological functions have also been reported (Chkuaseli and White, 2018; Hu et al., 2007; Kim and Hemenway, 1999; Klovins et al., 1998; Licis et al., 1998; Mateos-Gomez et al., 2013; Miller and White, 2006; Tajima et al., 2011; van Himbergen et al., 1993). A long-range RNA-RNA interaction involves two stretches of nucleotides distantly separated within the same RNA sequence. The actual distance that classifies an RNA-RNA interaction as long-range can vary in different contexts, ranging from tens to tens of thousands of nucleotides. So far, all of the experimentally verified long-range interactions are mediated by complementary base pairing, and the nucleotides participating in the interactions are often located in the loop or internal bulge regions within local RNA structures.
Long-range intragenomic RNA-RNA interactions are also present in other RNA viruses. In HIV-1, several such interactions, including R-GAG, LDI, U5-AUG, TAR-TAR and GAG-U3R, have been identified and characterized (Abbink and Berkhout, 2003; Andersen et al., 2004; Beerens and Kjems, 2010; Huthoff and Berkhout, 2001; Ooms et al., 2007; Paillart et al., 2002). Each of the R-GAG, LDI and U5-AUG interactions spans a few hundred residues and involves nucleotides in the untranslated R-U5 region and the Gag coding region. The interactions regulate various viral processes such as intermolecular dimerization of the viral genomic RNAs and packaging of the viral genomic ribonucleoproteins (gRNPs) into virions. The TAR-TAR and GAG-U3R interactions have much longer ranges, spanning thousands of residues. The interactions involve nucleotides located in the 5′- and 3′-ends of the genomic RNA, and mediate the circularization of the HIV-1 genome. Some of the long-range intragenomic interactions observed in HIV-1 are conserved in other retroviruses (Kalloush et al., 2016).
Functional intramolecular long-range RNA-RNA interactions are not limited to viral RNAs. A limited number of publications have shown that cellular mRNAs can also harbour intramolecular long-range interactions that assume diverse functions, including pre-mRNA splicing and mRNA translation (Braun et al., 2017; Chen and Kastan, 2010; Lovci et al., 2013; Ruiz de los Mozos et al., 2013).
Despite the importance of long-range intra-molecular RNA-RNA interactions, the identification and characterization of such interactions remain challenging. We have developed a strategy to computationally identify certain kinds of long-range intra-molecular interactions involving the loop nucleotides of a hairpin loop. In this report, we present computational and structural modelling studies on a potential long-range intra-molecular RNA-RNA interaction within the HIV-1 genomic RNA. The methods should be generally applicable to the identification of potential long-range intra-molecular RNA-RNA interactions in any viral or cellular mRNA sequence.

A computationally identified potential long-range RNA-RNA interaction within the HIV-1 genomic RNA
We had previously developed a computer program (called PKscan) for the detection of possible hairpin (H-) type pseudoknots in RNAs (Huang, Du, et al., 2013; Huang et al., 2013a; 2013b; 2014). An important feature of the program is that computationally there is no upper limit on the lengths of the stems and loops of the pseudoknots, or on the length of the RNA sequence to be analysed. With this feature, the program can easily be adapted for the identification of those kinds of potential long-range intramolecular RNA-RNA interactions that are mediated by hairpin loop nucleotides.
As shown in Figure 1A, in an H-type pseudoknot, a stretch of nucleotides outside of a hairpin forms complementary base-pairings with the loop nucleotides (Dam et al., 1992; Pleij, 1990; Pleij et al., 1985). Herein, the term pseudoknot refers to the H-type unless otherwise stated. The pseudoknot must have two stems (S1 and S2) and two loops (L1 and L2), with a third loop (L3) being optional. Most naturally occurring pseudoknots have relatively compact sizes with limited lengths of stems and loops. Long-range intramolecular interactions involving the loop nucleotides of a hairpin are equivalent (at least computationally) to H-type pseudoknots with an extraordinarily long loop2 (L2). These kinds of long-range interactions can be identified by using the PKscan program with an appropriate setting for the length of loop2.
To identify potential long-range RNA-RNA interactions in the HIV-1 genomic RNA, we analysed the 4272 available HIV-1 genomic sequences using the PKscan program, with the range of length for loop2 being set to 100-8000 nt. In the present study, the arbitrary value of 100 nt is used to classify an intra-molecular RNA-RNA interaction as long-range. When loop2 has at least 100 nt, the two stretches of nucleotides that form stem2 are separated by at least 100 nt plus the number of base-pairs in stem1.
In the search, the other parameters of the pseudoknots were set as follows: S1 = 6-15 bp, S2 = 6-7 bp, L1 = 1-2 nt and L3 = 0. The range for S1 ensures a relatively strong stem of the hairpin. The ranges for the other three parameters (S2, L1 and L3) restrict the pseudoknots to a particular pseudoknot family known as CPK-1 (Common Pseudoknot Motif-1) (Figure 1B) (Du et al., 1996; Du and Hoffman, 1997; Huang et al., 2013a, 2014). A CPK-1 pseudoknot is formed when a 3′-strand of RNA binds asymmetrically to 6 or 7 nucleotides within the loop region of a hairpin, leaving only 1-2 unpaired nucleotides at the 5′-end of the loop. In the tertiary structure (see Figure 1B, which shows the SRV-1 frameshift-stimulating pseudoknot as an example), the two stems (stem1 and stem2) of the pseudoknot stack coaxially to form a quasi-continuous helix; the 1-2 unpaired nucleotides in loop1 cross the major groove of stem2, with the base(s) being embedded inside the major groove. Compact CPK-1 pseudoknots have a widespread occurrence in natural and SELEX (Systematic Evolution of Ligands by EXponential Enrichment) RNAs (Du et al., 1996; Du and Hoffman, 1997; Huang et al., 2013a, 2014). Conceivably, the conserved structural features originally observed in compact CPK-1 pseudoknots (two co-axially stacked stems and a minimal number of 1-2 nucleotides crossing the major groove of a 6-7 base-pair stem) can also mediate other types of RNA-RNA interactions, including long-distance intramolecular interactions and intermolecular interactions. We speculate that these kinds of interactions may also have a widespread occurrence in natural systems, as do the compact CPK-1 pseudoknots.
Computationally, these kinds of long-distance intramolecular RNA-RNA interactions are equivalent to CPK-1 pseudoknots with a long loop2 (L2).
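The search criteria described above can be summarized in a few lines of code. The following is only an illustrative sketch, not part of PKscan: the class name `HTypePseudoknot` and its methods are hypothetical, and the numerical limits are simply the ranges quoted in this section (S2 = 6-7 bp, L1 = 1-2 nt, L3 = 0, and the arbitrary 100-nt threshold for loop2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HTypePseudoknot:
    """Element lengths of an H-type pseudoknot (hypothetical helper, not PKscan)."""
    s1: int      # base-pairs in stem1
    l1: int      # nucleotides in loop1
    s2: int      # base-pairs in stem2
    l2: int      # nucleotides in loop2
    l3: int = 0  # nucleotides in the optional loop3

    def is_cpk1(self) -> bool:
        # CPK-1 family as defined in the text: S2 = 6-7 bp, L1 = 1-2 nt, L3 = 0.
        return 6 <= self.s2 <= 7 and 1 <= self.l1 <= 2 and self.l3 == 0

    def is_long_range(self, min_l2: int = 100) -> bool:
        # The 100-nt default is the arbitrary threshold used in this study.
        return self.l2 >= min_l2


# Element lengths reported for the conserved candidate in the Nef coding region.
candidate = HTypePseudoknot(s1=7, l1=1, s2=6, l2=109)
print(candidate.is_cpk1(), candidate.is_long_range())
```

Keeping the family definition and the long-range threshold as separate predicates reflects the way the text treats them: the first is a structural restriction, the second an arbitrary cut-off that a user could change.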
In each of the HIV-1 genomic sequences, a certain number (from several to tens) of potential long-range pseudoknots were detected by the program. A few of the detected pseudoknots are highly conserved among the HIV-1 sequences.
One of the most conserved potential long-range pseudoknots is located in the 3′-end long-terminal repeat (LTR), within the Nef protein coding region (Figure 2). The predicted pseudoknot has 7 and 6 base-pairs in stem1 and stem2, respectively, a single nucleotide in loop1, and 109 nt in loop2. The nucleotides in the long loop2 may participate in the formation of other secondary structures. With the long loop2 and its possible associated structures, the long-range pseudoknot could better be described as a long-range intramolecular RNA-RNA interaction, mediated by complementary base-pairing between the two strands of nucleotides that form stem2. Importantly, this computationally identified long-range intramolecular RNA-RNA interaction fits in well with the previously predicted secondary structures of an entire HIV-1 genomic RNA based on SHAPE data (Watts et al., 2009).
In the SHAPE-based secondary structure of the entire HIV-1 genomic RNA (strain NL4-3) (Pollom et al., 2013; Watts et al., 2009), nucleotides 8753-8773 form a stem-loop, and nucleotides 8867-8906 form another stem-loop (Figure 3A). The stem of the 8753-8773 stem-loop (except the A8760-U8766 base-pair) forms stem1 of the computationally identified long-range pseudoknot. Stem2 of the potential long-range pseudoknot is formed by nucleotides C8761-U8766 in the 8753-8773 stem-loop and nucleotides G8883-G8888 in the 8867-8906 stem-loop. Except for U8766, all of these nucleotides are in the loop regions of the 8753-8773 or the 8867-8906 stem-loops, with the potential to be involved in base-pairing interactions. Therefore, the long-range RNA-RNA interaction we identified by computational methods may actually be a kissing-loop interaction, in the context of the SHAPE-based secondary structures of the entire HIV-1 genomic RNA.
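As a consistency check, the element lengths of the pseudoknot can be recomputed from the NL4-3 coordinates quoted above. The sketch below is illustrative only; the function name `element_lengths` is hypothetical, and the stem1 strand boundaries (8753-8759 and 8767-8773) are an inference from the 7 base-pairs of stem1, the single loop1 nucleotide A8760 and the 8753-8773 stem-loop, rather than values stated explicitly in the text.

```python
from typing import Dict, Tuple

Segment = Tuple[int, int]  # inclusive (first, last) nucleotide numbers


def element_lengths(s1_5p: Segment, s2_5p: Segment,
                    s1_3p: Segment, s2_3p: Segment) -> Dict[str, int]:
    """Derive pseudoknot element lengths from strand coordinates.

    Assumed 5'->3' order of the strands (as described in the text):
    S1 5'-strand, L1, S2 5'-strand, L3, S1 3'-strand, L2, S2 3'-strand.
    """
    def length(seg: Segment) -> int:
        return seg[1] - seg[0] + 1

    return {
        "S1": length(s1_5p),
        "L1": s2_5p[0] - s1_5p[1] - 1,
        "S2": length(s2_5p),
        "L3": s1_3p[0] - s2_5p[1] - 1,
        "L2": s2_3p[0] - s1_3p[1] - 1,
        # Number of nucleotides lying between the two strands of stem2.
        "S2_separation": s2_3p[0] - s2_5p[1] - 1,
    }


# Stem1 boundaries are inferred (see lead-in); stem2 strands are from the text.
hiv = element_lengths(s1_5p=(8753, 8759), s2_5p=(8761, 8766),
                      s1_3p=(8767, 8773), s2_3p=(8883, 8888))
print(hiv)
```

Under these assumptions the result reproduces the reported values (7 bp, 1 nt, 6 bp, 109 nt), and the two strands of stem2 are separated by 116 nt, that is, the 109 nt of loop2 plus the 7 base-pairs of stem1.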
After the SHAPE-based secondary structure of the entire HIV-1 genomic RNA (strain NL4-3) was reported (Watts et al., 2009) (referred to as the Watts09 model herein), a few other computational predictions of full-length HIV-1 genomic RNA secondary structures were published (Pollom et al., 2013; Skittrall et al., 2019; Sükösd et al., 2015). Pollom and co-workers revised the original SHAPE-directed secondary structural model of HIV-1 NL4-3 using newly optimized parameters for calculating the pseudo-free energy term (Pollom et al., 2013; Skittrall et al., 2019; Sükösd et al., 2015) [...] bioinformatics approach which combined phylogenetic and SHAPE data in the prediction (referred to as the Sükösd15 model herein). The Sükösd15 model shares many secondary structures with the Watts09 and Pollom13 models, but there are also some differences. Most noticeably, the Sükösd15 model contains far fewer secondary structures (only about 8% of the nucleotides are involved in base-pairing, compared with ~60% in the previous models). Different interpretations of the SHAPE reactivity values and consideration of phylogenetic base-pair co-variations might be the reasons for the different predictions of the Sükösd15 model. In the Sükösd15 model, nucleotides 8753-8773 form a stem-loop that is almost identical to that in the previous models, except that A8760 does not form a base-pair with U8766 (see Figure 3A). The 8774-8906 region downstream from the 8753-8773 stem-loop is largely unstructured in the Sükösd15 model. The 8867-8906 stem-loop in the previous models is not present in the Sükösd15 model. The nucleotides G8883-G8888 are single-stranded and fully flexible in the Sükösd15 model, i.e. no secondary structure would impose any constraints upon the base-pairing interaction between G8883-G8888 and C8761-U8766. While the Sükösd15 model predicted many long-range interactions, it failed to predict some of the known long-range interactions, presumably due to the highly conserved sequences (lack of co-variation data). The sequences involved in the long-range interaction described in this paper are also highly conserved in HIV-1, making it impossible to detect the long-range interaction by algorithms that are based on phylogenetic analysis.

Modelling of the long-range RNA-RNA interaction
The computationally identified long-range pseudoknot (Figure 2, left) has six base-pairs in stem2, one nucleotide in loop1, and no intervening sequence between stem1 and stem2, conforming to the CPK-1 family of pseudoknots (Du et al., 1996). Structurally, stem1 and stem2 could stack coaxially to form a quasi-continuous helix, with the base of the loop1 nucleotide in the major groove of stem2. Interestingly, formation of stem2 involves six nucleotides from the loop of the 8867-8906 stem-loop distantly located in the primary sequence. The kissing-loop nature of the potential long-range RNA-RNA interaction adds to the structural complexity of the interaction.
Within the loop region of the 8867-8906 stem-loop, residues G8883-G8888 participate in the formation of stem2 of the long-range pseudoknot, leaving one nucleotide at the 5′-end (A8882) and four nucleotides at the 3′-end (ACAG, starting at position 8889) of the loop unpaired. These relatively short single-stranded intervening sequences may impose constraints on the kissing loop structures.
To explore whether the computationally identified long-range RNA pseudoknot with CPK-1 structural features can coexist with the SHAPE-based predicted 8867-8906 stem-loop, we carried out structural modeling studies of the RNA molecule containing the 8753-8773 stem-loop and the 8867-8906 stem-loop.
In the modeling studies, the three stems (stem1, stem2 and the stem of the 8867-8906 stem-loop, see Figure 3A) were restrained to adopt the A-form helical structure. Stem1 and stem2 were restrained to stack co-axially upon each other as in a compact CPK-1 pseudoknot. No restraints were applied to the single-stranded nucleotides (A8760, A8882 and A8889-G8892).
As shown in Figures 3B, 4A and 4B, the kissing-loop interaction between the 8753-8773 stem-loop and the 8867-8906 stem-loop can exist in such a way that CPK-1 structural features are utilized to mediate the interaction. Despite the limited number of single-stranded loop nucleotides left unpaired upon formation of the kissing loop structure, multiple conformations are possible, with very different relative orientations of the two helical segments (the co-axially stacked S1-S2 segment of the long-range pseudoknot and the stem segment of the 8867-8906 stem-loop). In the structure shown in Figure 3B, the two segments assume a rather extended conformation. In the structure shown in Figure 4A, the two segments assume a bent conformation, lying close to each other in a roughly parallel manner. In the structure shown in Figure 4B, the two segments assume an intermediate, roughly perpendicular conformation. In all three structures, the S1, S2 and L1 (consisting of a single nucleotide, A8760) regions assume the characteristic CPK-1 fold, with S1 and S2 stacking co-axially and L1 crossing the major groove of a six-base-pair stem.
These results from the structural modeling studies indicate that CPK-1 structural features can be utilized to mediate the kissing loop interaction. While other base-pairing schemes are also possible, the structures containing the CPK-1 fold may be energetically more favorable by maximizing the base-pairing and base-stacking potentials.

Discussion
In the computational search for long-range intramolecular RNA-RNA interactions (equivalent to RNA pseudoknots with a long loop2), only the primary sequence of the HIV-1 genomic RNA is used as the input to the program. Prior knowledge about the local secondary structures of the RNA is not used in the process. Therefore, it is particularly interesting to see that the computationally identified pseudoknot shown in Figure 2 is largely compatible with a possible kissing loop interaction between the SHAPE-based 8753-8773 and 8867-8906 stem-loops (Figure 3A). This coincidence greatly enhances the credibility of the computationally identified long-range interaction and suggests that the computational method could be useful as a complement to other experiment-based methods in the studies of RNA structures.
Results from the structural modelling studies show that the kissing loop interaction not only is sterically possible, but also can be mediated by the key structural features of the CPK-1 motif. The CPK-1 motif is common among naturally occurring and SELEX compact RNA pseudoknots (Du et al., 1996; Du and Hoffman, 1997; Huang et al., 2013a, 2014). Conceivably, the conserved structural features originally observed in compact CPK-1 pseudoknots (two co-axially stacked stems and a minimal number of 1-2 nucleotides crossing the major groove of a 6-7 base-pair stem) can also mediate other types of RNA-RNA interactions, including long-distance intramolecular interactions and intermolecular interactions. The case presented here may represent a CPK-1 motif-mediated long-range intramolecular interaction, embedded in an elaborate kissing loop structure.
The possible biological function of the potential kissing-loop structure is not clear at this time. The sequences involved are highly conserved among the HIV-1 strains. The structure is located in the 3′-end long-terminal repeat (LTR), within the Nef coding region. It is possible that the structure may present certain cis-acting signals for the regulation of mRNA stability, translation of the Nef protein, reverse transcription, or polyadenylation. Further experiments are needed to investigate the existence of the structure and its biological function.
In this study, the search for long-range pseudoknots is restricted to the CPK-1 family of pseudoknots, by setting the ranges of stem2 (S2) and loop1 (L1) to 6-7 base-pairs and 1-2 nucleotides, respectively. This restriction is based on the speculation that the CPK-1 motif, as a recurring structural theme in compact pseudoknots, would also be able to mediate long-range intramolecular interactions. With the narrow ranges of S2 and L1, the computational search was also much more efficient. It should be noted that, although this study only searched for long-range interactions mediated by the CPK-1 motif, the computational method should be generally applicable to the search for any long-range intramolecular RNA-RNA interaction, as long as the interaction involves a stretch of nucleotides in the loop region of a stem-loop. The search ranges of the stems and loops can be adjusted by the user of the program, computationally with no upper limits. Except for loop2 (L2), an upper limit of 20 would be generous enough for the other elements (S1, S2, L1 and L3). A survey of known stem-loop-mediated long-range intramolecular RNA-RNA interactions would be insightful for setting proper ranges of the elements to run a general search.
Long-range intramolecular RNA-RNA interactions are known to play important biological roles in cellular and viral RNAs, but the identification of such interactions remains challenging. Computational structure prediction and subsequent experimental investigation may provide an efficient way to identify functional long-range RNA-RNA interactions. However, RNA secondary structure prediction remains a fundamental challenge, due to the lack of robustness. Prediction of long-range interactions is even more elusive. For the prediction of secondary structures within the full-length HIV-1 genomic RNA, a few studies have been published (Pollom et al., 2013; Skittrall et al., 2019; Sükösd et al., 2015; Watts et al., 2009). While the predictions share some common structures, the differences are also significant. Inclusion of phylogenetic data in the algorithm has helped the prediction of some long-range interactions, but cannot predict those interactions involving highly conserved sequences (Sükösd et al., 2015). The computational method presented in this paper simplifies the prediction of certain kinds of long-range intramolecular RNA-RNA interactions into the search for H-type pseudoknots with a long loop2. The method is thus highly efficient and effective. At a minimum, the only input required is a single RNA sequence. Of course, the availability of a set of homologous sequences is always desirable. The predicted structures could be supported by either the existence of base-pair covariation or sequence conservation. It should be noted that the method only detects the potential formation of the specific types of long-range interactions. No attempt is made to find the optimal folding state of the sequences involved. Therefore, interpretation of the results would be much more meaningful when RNA secondary structure predictions by minimum free energy algorithms are available, as in the case of HIV-1 RNA. The particular case of potential long-range interaction identified by the method herein has not been reported in the previous predictions, and it is compatible with the different previously predicted secondary structure models. The stage is set for experimental verification of the predicted long-range interaction and assignment of its possible biological functions. This case exemplifies how the computational method presented herein could be used as a good complement to existing RNA secondary structure prediction algorithms. Perhaps more significantly, the method should be generally applicable to any RNA sequence. With its robustness and ease of use, the method would be a valuable addition to the computational tools for the studies of RNA structures.

The computational method
Previously, we had developed a program called PKscan for the identification of potential H-type pseudoknots within any given RNA sequence without length limit (Huang et al., 2013a). Briefly, a pseudoknot-forming RNA sequence must contain two pairs of complementary stretches (forming S1 and S2) separated by two or three connecting unpaired regions (L1, L2 and optionally L3) (Figure 1A). The program tests all possible combinations of stem and loop lengths within certain ranges to see whether the pseudoknot-forming criteria can be met. The ranges for the lengths of S1, S2, L1, L2 and L3 can be set by the user. We had used the program to detect potential compact pseudoknots in viral genomic mRNAs (Huang et al., 2013a, 2014). A unique feature of the PKscan program is that there is computationally no upper limit for the pseudoknot-forming elements (S1, S2, L1, L2 and L3) and the input RNA sequence. With this feature, the program can also be used to detect possible long-range intramolecular RNA-RNA interactions, which are computationally equivalent to pseudoknots with a long loop2 (L2) (Figure 1A). By setting the lower limit of L2 to an arbitrarily high value, the program will be able to search for long-range (longer than the set lower limit of L2) RNA-RNA interactions.
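The exhaustive test of stem and loop length combinations described above can be illustrated with a naive scanner. This is a minimal sketch written for this explanation and is not the PKscan implementation: the function names are hypothetical, L3 is fixed at 0, the strand order (S1 5′-strand, L1, S2 5′-strand, S1 3′-strand, L2, S2 3′-strand) follows the description in this paper, and the acceptance of G-U wobble pairs alongside Watson-Crick pairs is an assumption, since the pairing rules used by PKscan are not stated here.

```python
from typing import Iterator, NamedTuple, Tuple

# Assumption: Watson-Crick and G-U wobble pairs are all allowed in a stem.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(five: str, three: str) -> bool:
    """True if two strands (both written 5'->3') can form an antiparallel stem."""
    return len(five) == len(three) and all(
        (a, b) in PAIRS for a, b in zip(five, reversed(three)))


class Hit(NamedTuple):
    start: int  # 0-based index of the first stem1 nucleotide
    s1: int
    l1: int
    s2: int
    l2: int


def scan_long_range_pseudoknots(
        seq: str,
        s1_range: Tuple[int, int] = (6, 15),
        l1_range: Tuple[int, int] = (1, 2),
        s2_range: Tuple[int, int] = (6, 7),
        l2_range: Tuple[int, int] = (100, 8000)) -> Iterator[Hit]:
    """Enumerate candidate H-type pseudoknots with L3 = 0 by brute force."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    for start in range(n):
        for s1 in range(s1_range[0], s1_range[1] + 1):
            for l1 in range(l1_range[0], l1_range[1] + 1):
                for s2 in range(s2_range[0], s2_range[1] + 1):
                    b = start + s1 + l1   # S2 5'-strand begins
                    c = b + s2            # S1 3'-strand begins (L3 = 0)
                    d = c + s1            # loop2 begins
                    if d > n:
                        break
                    if not can_pair(seq[start:start + s1], seq[c:d]):
                        continue
                    for l2 in range(l2_range[0], l2_range[1] + 1):
                        e = d + l2        # S2 3'-strand begins
                        if e + s2 > n:
                            break
                        if can_pair(seq[b:c], seq[e:e + s2]):
                            yield Hit(start, s1, l1, s2, l2)


# Synthetic toy sequence (not HIV-1): S1 = 6 bp, L1 = 1 nt, S2 = 6 bp, L2 = 100 nt.
toy = "GGGGGG" + "A" + "CACACA" + "CCCCCC" + "A" * 100 + "UGUGUG"
hits = list(scan_long_range_pseudoknots(toy))
```

The sketch favours clarity over speed: stem1 is checked before the long inner loop over loop2 lengths, so most starting positions are rejected cheaply, but no attempt is made to reproduce the efficiency of the actual program.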

The RNA sequences
The 4272 full-length HIV-1 genomic sequences were downloaded from the HIV databases (http://www.hiv.lanl.gov/). The sequence shown in the figures is for the HIV-1 strain NL4-3, accession number AF324493.2.

Detection of long-range pseudoknots
Using the computational method described above, we performed a search for potential long-range intramolecular RNA-RNA interactions in the HIV-1 genomic mRNAs. Each of the 4272 different strains was analysed by PKscan, with the following settings for the ranges of the pseudoknot-forming sequence elements: S1 = 6-15 bp, S2 = 6-7 bp, L1 = 1-2 nt, L3 = 0 and L2 = 100-8000 nt. With these particular ranges for S2, L1 and L3, the search is limited to a particular family of RNA pseudoknots known as CPK-1 (common pseudoknot motif-1) (Figure 1B). The range for L2 ensures that the two stretches of nucleotides forming the long-range interaction (equivalent to S2) are separated from each other in the primary sequence by at least 100 nucleotides (more exactly, 100 plus the number of base-pairs in S1). For each of the HIV-1 sequences, tens of potential long-range pseudoknots were detected. One of the most conserved cases is presented in this paper (Figure 2).

Structural modelling of the kissing loop interaction
The computationally identified long-range pseudoknot (Figure 2) might actually nest inside a kissing loop structure between the 8753-8773 and 8867-8906 stem-loops of the previously reported SHAPE-based secondary structures of the entire HIV-1 genome (Figure 3A) (Watts et al., 2009). To explore whether the kissing loop structure is sterically feasible, modelling studies were performed.
The model structures were built using the same computational procedures as in RNA structure determination by NMR, except that the distance and torsion angle restraints were artificially generated. The distance restraints include: (1) inter-residue NOE-like proton-proton distance restraints for the A-form helical stem regions; (2) hydrogen bond restraints (two distance restraints per hydrogen bond) for standard Watson-Crick GC and AU base pairs, as well as G-U wobble base pairs, based on standard base-pair geometry of nucleic acids; (3) across-strand phosphorus-phosphorus distance restraints for the A-form helical stem regions; (4) restraints across the S1-S2 helical junctions, which were treated as in a normal A-form helix. Torsion angle restraints were used to keep the sugar rings of the nucleotides in the stem and loop regions in a C3′-endo and C2′-endo conformation, respectively. Backbone torsion angles for residues in the stem regions were restricted to the A-form helical values. Backbone torsion angles for residues in the loop regions were not constrained. Structure calculation and refinement were performed using CNS 1.3 (Brünger et al., 1998). Two hundred random initial structures were generated and subjected to simulated annealing in torsion angle space, followed by variable target function minimization. Three structures with the lowest target function values were chosen as the representative model structures (Figures 3B, 4A and 4B). The structures have good geometries (for modelled structure 1, rmsd bonds = 0.003119, rmsd angles = 0.66495; the other structures have similar values). RNA structural analysis using the pseudotorsion angles eta and theta shows that only one to two unpaired loop nucleotides of the 8867-8906 stem-loop have eta or theta values in the outlier regions (Supplementary Figure 1).
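For readers unfamiliar with the pseudotorsion analysis mentioned above, the following sketch shows how eta and theta can be computed from atomic coordinates. It assumes the commonly used definition, in which eta for residue i is the dihedral C4′(i-1)-P(i)-C4′(i)-P(i+1) and theta is the dihedral P(i)-C4′(i)-P(i+1)-C4′(i+1); the software actually used for the analysis is not stated in this paper, and the function names and the input format (one dictionary of coordinates per residue) are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    k0, k2 = _dot(b0, b1), _dot(b2, b1)
    v = _sub(b0, (b1[0] * k0, b1[1] * k0, b1[2] * k0))
    w = _sub(b2, (b1[0] * k2, b1[1] * k2, b1[2] * k2))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def eta_theta(residues: Sequence[Dict[str, Vec]], i: int) -> Tuple[float, float]:
    """Pseudotorsions of residue i, which needs a neighbour on each side.

    Each residue is a dict with the keys "P" and "C4'" (hypothetical format).
    """
    prev, cur, nxt = residues[i - 1], residues[i], residues[i + 1]
    eta = dihedral(prev["C4'"], cur["P"], cur["C4'"], nxt["P"])
    theta = dihedral(cur["P"], cur["C4'"], nxt["P"], nxt["C4'"])
    return eta, theta


# Artificial coordinates chosen only to exercise the functions.
demo = [
    {"P": (2.0, 0.0, -1.0), "C4'": (1.0, 0.0, 0.0)},
    {"P": (0.0, 0.0, 0.0), "C4'": (0.0, 0.0, 1.0)},
    {"P": (1.0, 0.0, 1.0), "C4'": (1.0, 0.0, 2.0)},
]
eta, theta = eta_theta(demo, 1)
```

Only the P and C4′ atoms are needed, which is why eta and theta offer a compact, two-angle description of backbone conformation for each nucleotide.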
All molecular graphics were generated using Chimera from the computer graphics laboratory of UCSF (Pettersen et al., 2004).
Figure 1. (A) A schematic representation of an H-type pseudoknot, with two stems S1 and S2, and three connecting loops L1, L2 and L3. (B) The secondary and tertiary structures for a representative pseudoknot of the CPK-1 (common pseudoknot motif-1) family. Shown is the frameshift-stimulating pseudoknot at the gag-pro junction of SRV-1 (PDB code: 1E95). S1, S2, L1 and L2 of the pseudoknots are coloured orange, blue, red and cyan, respectively.

Figure 2. A computationally identified long-range H-type pseudoknot in HIV-1 RNA. (Left) The pseudoknot is drawn in such a way as to highlight the base-pairing interaction between a stretch of nucleotides (boxed in blue) outside of a stem-loop and a stretch of the loop nucleotides (magenta), forming the stem S2. The loop L2 contains 109 nt. (Right) The same pseudoknot is drawn in a typical CPK-1 configuration with co-axial stacking of the two stems and a single nucleotide L1 crossing the major groove of S2. The A8760-U8766 base-pair is disrupted. Residue numbering is based on the HIV-1 strain NL4-3 (accession number AF324493.2).

Figure 3. A potential kissing-loop structure that encompasses the computationally identified long-range pseudoknot as shown in Figure 2. (A) Secondary structure of the potential kissing-loop interaction. The base-pairing schemes of the two stem-loops were previously reported based on SHAPE data (Ref). The two stretches of nucleotides involved in the potential loop-loop kissing interaction are identical to those that form the stem S2 in Figure 2. (B) A modelled structure of the kissing-loop interaction. The structure encompasses the CPK-1 pseudoknot as shown in Figure 2, right panel, which is redrawn in the boxed insert. Coloring of nucleotides is the same in the secondary and tertiary structure representations. S1 nucleotides are colored in orange, S2 nucleotides 8761-8766 in magenta and nucleotides 8883-8888 in blue, and the L1 nucleotide A8760 in red; other nucleotides in the 8867-8906 stem-loop that are not involved in the loop-loop kissing interaction are colored in cyan. The sequence between A8773 and G8867 (having 94 nt) is not included in the modelled structure but should be long enough to span the distance.

Figure 4. Two other modelled structures of the kissing-loop interaction. Compared to the modelled structure 1 in Figure 3B, the major difference is the relative orientation of the stem region and the unpaired loop nucleotides (A8882 and A8889-G8892) of the 8867-8906 stem-loop. Nucleotides are colored the same way as in Figure 3B.