Molecular simulation study of assembly of DNA-grafted nanoparticles: effect of bidispersity in DNA strand length

In this paper, we use molecular dynamics simulations to study the assembly of DNA-grafted nanoparticles to demonstrate specifically the effect of bidispersity in grafted DNA strand length on the thermodynamics and structure of nanoparticle assembly at varying number of grafted single-stranded DNA (ssDNA) strands and number of guanine/cytosine (G/C) bases per strand. At constant number of grafted ssDNA strands and G/C nucleotides per strand, as bidispersity in strand lengths increases, the number of nanoparticles that assemble as well as the number of neighbours per particle in the assembled cluster increases. When the number of G/C nucleotides per strand in short and long strands is equal, the long strands hybridise with the other long strands with higher frequency than the short strands hybridise with short/long strands. This dominance of the long strands leads to bidisperse systems having similar thermodynamics to that in corresponding systems with monodisperse long strands. Structurally, however, as a result of long–long, long–short and short–short strand hybridisation, bidispersity in DNA strand length leads to a broader inter-particle distance distribution within the assembled cluster than seen in systems with monodisperse short or monodisperse long strands. The effect of increasing the number of G/C bases per strand or increasing the number of grafted DNA strands on the thermodynamics of assembly is similar for bidisperse and monodisperse systems. The effect of increasing the number of grafted ssDNA strands on the structure of the assembled cluster is dependent on the extent of strand bidispersity because the presence of significantly shorter ssDNA strands among long ssDNA strands reduces the crowding among the strands at high grafting density. 
This relief in crowding leads to larger number of strands hybridised and as a result larger coordination number in the assembled cluster in systems with high bidispersity in strands than in corresponding monodisperse or low bidispersity systems.


Introduction
Functionalisation or grafting of single-stranded DNA (ssDNA) on nanoparticle surfaces has been used to direct the assembly of the nanoparticles into target nanostructures through the specific Watson-Crick base pairing between complementary ssDNAs grafted on nanoparticles. [1-13] Single-stranded DNA consists of a series of nucleotides, in which a nucleotide consists of a sugar, a phosphate group and a base [adenine (A), guanine (G), cytosine (C) or thymine (T)]. Through Watson-Crick base pairing, in which A specifically forms two hydrogen bonds with T and G forms three hydrogen bonds with C, a sequence of nucleotides in an ssDNA hybridises specifically with a complementary sequence of nucleotides on another ssDNA to form a double-stranded DNA (dsDNA) helix. As the temperature is raised above the melting temperature of the dsDNA, it separates into the two constituent ssDNAs. The thermal stability of the dsDNA, characterised by its melting temperature and melting curve, is strongly dependent on the number, composition and sequence of nucleotides in the constituent ssDNA. Through this thermoreversible and specific nature of hybridisation between the ssDNAs that are grafted on two or more nanoparticles, one can assemble DNA-grafted nanoparticles into clusters of particles with specific structure, shape and size. [1-13] Assembly of inorganic nanoparticles (in particular gold) into structures of specific size and shape with a defined inter-particle spacing is useful in various optical and photonic applications. [14-16] Experimentally, the assembly of the DNA-grafted particles is directed through one of the following protocols.
Strands grafted on one particle hybridise with complementary strands on another particle either (a) in a solution of a binary population of particles, with one set of particles grafted with ssDNA that is complementary to the ssDNA grafted on the second set of nanoparticles, or (b) in a solution with a single population of particles in which the grafted ssDNA sequence is self-complementary (e.g. ACGT). Alternatively, strands on two or more particles hybridise via free linker strands which, when added to the system of DNA-grafted particles, induce nanoparticle assembly. [17-25] Synthetically, it is now possible to design DNA-grafted nanoparticles [22,26-28] and colloids [20,29,30] with a specific ssDNA sequence, length and composition to tune nanoparticle/colloid assembly into target nano/microstructures. Precision in functionalising particles with a desired number of strands in specific locations makes it possible to program unique assembly instructions into the building blocks, such as the dimensionality of the assembled nanoparticle structure (e.g. 1D nanowires to 2D sheets to 3D gels or crystals). For example, Ohya et al. [31] achieved the formation of nanowires by synthesising building blocks with ssDNA grafted at diametrically opposed locations on a particle surface. Particles grafted with ssDNA at low grafting density, with strands that vary in length and are placed at precise locations on the particle, allow one to create unique finite-sized nanoclusters, e.g. dimers and trimers, where the trimers can be linear or triangular, [32] pyramids, [33] or satellite structures in which small particles grafted with one long strand each hybridise only with a central larger particle.
[34] Considering the rapid progress in the synthesis of these designed DNA-grafted particles, computational scientists have been studying DNA-grafted nanoparticles to predict their assembly in silico, and to provide fundamental understanding of how various parameters (particle size, grafted ssDNA length, sequence, grafting density (number of grafted strands per unit particle surface area), G/C content or composition) impact the assembly of the DNA-grafted nanoparticles/colloids. The challenge in simulating the assembly of DNA-grafted particles lies in maximising the amount of detail in the model, so as to mimic the right physical and chemical nature of these DNA-grafted particles, while minimising the computational intensity of the simulations, so as to capture experimentally relevant length and timescales. Atomistically detailed simulations of DNA-grafted particles would be too computationally intensive if one chose to simulate the assembly of multiple particles from a completely disordered (unassembled) state to an assembled state. To overcome this, atomistic molecular simulation studies focus on (a) systems of multiple grafted particles that begin in an assembled state, so as to examine the stability of the assembled state at a specific temperature and pressure, (b) smaller system sizes, such as hybridisation between two strands grafted on two surfaces, or (c) the stability of a hybridised double strand tethered to a surface. [35-37] To facilitate studies of the assembly of multiple DNA-grafted particles from an unassembled state, without the computational costs of atomistic models, a vast number of coarse-grained models have been built. [12,13,38-43] Simulation studies with these less detailed, yet physically relevant, coarse-grained models provide valuable insight into both the dynamics during the assembly and the structure of the assembled state at equilibrium.
A review/critique of all these models is outside the scope of this paper, so we only briefly comment on these coarse-grained models. For example, Starr and Sciortino developed a coarse-grained model in which a DNA-grafted particle is represented as a central core, with each nucleotide along the DNA strands (arms) represented by one large coarse-grained bead that hosts a smaller coarse-grained bead, embedded within the nucleotide bead, to mimic the hydrogen-bonding site on the nucleotide. [12,13] This model was adapted in our previous work, in which we elucidated the effect of DNA strand design (composition, sequence and length) on the assembly of DNA-grafted particles. [44] Theodorakis et al. [40] also used the model of Starr and Sciortino, in conjunction with a parallel replica protocol, to calculate the potential of mean force between two DNA-grafted particles with tetrahedral and octahedral grafting patterns. Recently, another coarse-grained model has been developed by Travesset and coworkers, [11,38,39] similar in detail to the Starr and Sciortino model, with additional flanking beads connected to the nucleotide beads that ensure that the hydrogen bonding is directional and restricted to be between only two nucleotides (and not more). In addition, they used graphical processing unit (GPU)-based simulations [45] with these coarse-grained models to accelerate the simulations. Mladek et al. [42,43] used a 'core-blob' model, in which all parts of the DNA-grafted particle except for the 'sticky' ends of the DNA strands were represented by a soft core while the 'sticky' ends were represented with explicit attractive coarse-grained beads, to calculate effective interactions or potential of mean force between two such core-blob DNA-grafted particles; they showed that by using these effective pairwise interactions the three-body interactions are overestimated.
All the above studies show the importance of using appropriate models that do not overly coarse grain these complex soft material systems.
These past computational and experimental studies have established the effect of grafted ssDNA length, sequence, grafting density (number of grafted strands per unit particle surface area), G/C content or composition on the thermodynamics and kinetics of colloidal and nanoparticle assembly. We review briefly the effects of some of these parameters on structure and thermodynamics. As grafted ssDNA length increases, (i) the hybridisation/melting temperature ($T_m$) of dsDNA increases and, as a result, the assembly/dissociation transition temperature of nanoparticles ($T_d$) increases, [18,25] because the number of base pairs in the dsDNA increases and the enthalpic gain from the larger number of base pairs is higher, shifting $T_m$ and $T_d$ to higher temperatures; and (ii) the inter-particle spacing within the assembled structure increases. As the particle size increases at constant grafting density, the number of grafted strands increases, and as a result, $T_d$ increases and the melting transition sharpens. [5,25,46,47] As the percentage of G/C bases in the grafted strands increases, the enthalpic drive for hybridisation between the complementary strands is higher (and $T_d$ of the assembled nanocluster is higher) because of the three hydrogen bonds in a G-C pair in contrast to two hydrogen bonds in an A-T pair. [47] In most of the above studies, the ssDNA strands grafted on the nanoparticles are homogeneous in composition, sequence and length and, accordingly, affect the structure and thermodynamics of assembly homogeneously. As discussed in the earlier paragraphs, all grafted ssDNAs have a specific length that leads to a specific inter-particle spacing in the assembled nanoparticle cluster or have a specific G/C content that causes the particles to assemble at a specific temperature.
In this paper, we study the effect of heterogeneity on the assembly of DNA-grafted particles, by introducing bidispersity in the lengths of the grafted ssDNA strands.
In this study, we use molecular dynamics simulations, with a coarse-grained model used in our previous work, that is based on Starr and Sciortino's model, [12,13] to study a system of DNA-grafted nanoparticles that assemble (without linkers) through hybridisation of complementary grafted strands in an implicit solvent. Our specific goal in this study was to understand the effect of bidispersity in strand lengths on the structure and thermodynamics of the assembly at varying number of grafts and number of G/C nucleotides at constant particle size and particle concentration. The results in this paper provide valuable understanding of the complex interplay of bidispersity in strand length, number of grafts and graft composition on the structure and thermodynamics of particle assembly, which should guide experiments aimed at tuning particle assembly through the design of grafted strands.
The content in this paper is organised as follows. In Section 2, we provide the details of our model and the simulation method as well as analysis techniques and the table of parameters. In Section 3, we present the results of the assembly of DNA-grafted nanoparticles as a function of bidispersity in ssDNA strand length. We conclude with the key observed results and some limitations of this work.

Model and simulation
We model a system of DNA-grafted nanoparticles in an implicit solvent using a coarse-grained model, as described in our recent work. [44] This coarse-grained model is capable of capturing the timescale and length scale of DNA hybridisation-driven assembly of many nanoparticles [12,13] that atomistically detailed models would not be able to capture. We direct the reader to our previous paper [44] for details of the model and its validation, and only present the essential details here for brevity. In our model, generic hard spherical nanoparticles of diameter $D$, which can be organic or inorganic in chemistry, are grafted with ssDNA strands at fixed, symmetrically placed locations on the nanoparticle surface. The strands are modelled as semiflexible chains composed of $N_{bases}$ 'monomer' beads of diameter $\sigma_{mon}$, where $\sigma_{mon} \approx 1$ nm, as each monomer bead represents a complete nucleotide (sugar, phosphate and A, C, G or T base). Each monomer bead contains an attractive site that mimics hydrogen bonding. This attractive hydrogen-bonding site is restricted to interact with another hydrogen-bonding site on a complementary monomer bead, thus mimicking Watson-Crick base pairing. This model of DNA-grafted nanoparticles is adapted from a recent simulation study on DNA dendrimers. [12,13] We note that although in those studies the DNA dendrimers are modelled with a tetrahedral hub to which ssDNA is bound, we modelled the nanoparticle as a hard core of a given diameter.
All non-bonded pairwise interactions (nanoparticle-nanoparticle, nanoparticle-monomer, monomer-monomer, hydrogen-bonding site-monomer, hydrogen-bonding site-nanoparticle and hydrogen-bonding site-hydrogen-bonding site) were modelled using a truncated and shifted Lennard-Jones (LJ) potential:

$$U(r) = \begin{cases} U_{LJ}(r) - U_{LJ}(r_c), & r < r_c \\ 0, & r \ge r_c \end{cases} \qquad (1)$$

where $U_{LJ}(r) = 4\epsilon\left[(\sigma/r)^{12} - (\sigma/r)^{6}\right]$, $\sigma$ is the sum of the radii of the interacting spheres, $\epsilon$ is the energetic well depth, $r_c$ is the distance where the potential is truncated and $r$ represents the centre-to-centre distance between the coarse-grained beads of interest. All pairwise interactions involving nanoparticle-nanoparticle, nanoparticle-monomer, monomer-monomer, hydrogen-bonding site-monomer, hydrogen-bonding site-nanoparticle and non-complementary hydrogen-bonding site-non-complementary hydrogen-bonding site pairs are modelled as repulsive interactions with $r_c = 2^{1/6}\sigma$. The pairwise interactions between the complementary hydrogen-bonding sites include both the repulsive and attractive portions of the potential, with $r_c = 2.5\sigma$. The reduced value of $\epsilon$ in Equation (1) is assigned as follows: $\epsilon_{nanoparticle} = 1$, $\epsilon_{monomer} = 1$, $\epsilon_{\text{hydrogen-bonding site}} = 1$ for hydrogen-bonding sites on monomer beads representing A or T nucleotides and $\epsilon_{\text{hydrogen-bonding site}} = 1.5$ for hydrogen-bonding sites on monomer beads representing G or C nucleotides. All energies are represented in terms of $\epsilon_{mon}$ (the LJ energy parameter for the monomer spheres), where $\epsilon_{mon} \approx 8kT$. The value of $\epsilon$ for the pairwise interaction between two sites is the geometric average of the values for the individual sites forming the pair, $\epsilon = (\epsilon_1 \epsilon_2)^{1/2}$. The value of $\sigma$ in Equation (1) is assigned as follows: $\sigma_{nanoparticle} = 4\sigma_{mon}$ and $\sigma_{\text{hydrogen-bonding site}} = 0.35\sigma_{mon}$, where $\sigma_{mon}$ is the diameter of the monomer spheres and $\sigma_{mon} \approx 1$ nm.
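The non-bonded interactions described above can be illustrated with a minimal Python sketch. It assumes reduced units ($\sigma_{mon} = 1$, $\epsilon_{mon} = 1$); the function and constant names are hypothetical and are not taken from the simulation code used in this work.

```python
import math

def lj(r, sigma, eps):
    """Full 12-6 Lennard-Jones energy at centre-to-centre distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def lj_truncated_shifted(r, sigma, eps, r_cut):
    """LJ energy shifted so that it is zero at r_cut, and zero beyond r_cut."""
    if r >= r_cut:
        return 0.0
    return lj(r, sigma, eps) - lj(r_cut, sigma, eps)

def mix_eps(eps_1, eps_2):
    """Geometric-mean well depth for a pair of interacting sites."""
    return math.sqrt(eps_1 * eps_2)

# Reduced units (assumption for this sketch): sigma_mon = 1, eps_mon = 1.
SIGMA_MON = 1.0
SIGMA_HB = 0.35 * SIGMA_MON   # hydrogen-bonding site diameter
EPS_GC = 1.5                  # well depth of a G or C hydrogen-bonding site

# Monomer-monomer pair: purely repulsive, cutoff at the LJ minimum.
u_repulsive = lj_truncated_shifted(0.95, SIGMA_MON, 1.0, 2 ** (1 / 6) * SIGMA_MON)

# Complementary G-C sites: attractive tail kept, cutoff at 2.5 sigma.
r_min = 2 ** (1 / 6) * SIGMA_HB
u_gc_min = lj_truncated_shifted(r_min, SIGMA_HB, mix_eps(EPS_GC, EPS_GC), 2.5 * SIGMA_HB)
```

With the short cutoff the potential is positive everywhere inside $r_c$, so the pair only repels; with the long cutoff the well is retained, slightly shallower than $\epsilon$ because of the shift.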
Bonded interactions between various beads were simulated using a finitely extensible nonlinear elastic (FENE) potential: [48]

$$U_{FENE}(r) = -\tfrac{1}{2} K R_0^2 \ln\left[1 - (r/R_0)^2\right] \qquad (2)$$

where $K = 30\epsilon_{FENE}/\sigma^2$ and $R_0 = 1.5\sigma$, $K$ is the force constant, $R_0$ is the maximum extension of the bond, $\epsilon_{FENE}$ is an energy parameter which is equal to $\epsilon_{mon}$ and the values of $\sigma$ and $r$ depend on the type of beads involved in the bond. For a bond between a nanoparticle and the first monomer of a DNA strand, $\sigma$ is defined as the radius of the monomer and $r$ is defined as the distance between the centre of the monomer and the surface of the nanoparticle. For a bond between two monomers, $\sigma$ is defined as the sum of the two monomer radii and $r$ is defined as the distance between the monomer centres. Finally, a pseudo-bond between a hydrogen-bonding site and the host monomer uses $\sigma$ equivalent to the diameter of the hydrogen-bonding site, and $r$ is defined as the centre-to-centre distance of the hydrogen-bonding site and the host monomer. The equilibrium position of the hydrogen-bonding sites is such that the surface of the hydrogen-bonding site protrudes $0.02\sigma_{mon}$ from the surface of the host monomer.
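As a rough illustration of the bonded term, the sketch below evaluates a FENE bond energy with the stated parameters $K = 30\epsilon_{FENE}/\sigma^2$ and $R_0 = 1.5\sigma$. It assumes the standard FENE functional form, $U = -\tfrac{1}{2} K R_0^2 \ln[1 - (r/R_0)^2]$, and reduced units; the function name is hypothetical.

```python
import math

def fene(r, sigma, eps_fene=1.0):
    """FENE bond energy with K = 30*eps/sigma^2 and R0 = 1.5*sigma.

    Assumes the standard FENE form; r must be smaller than R0.
    """
    k = 30.0 * eps_fene / sigma ** 2
    r0 = 1.5 * sigma
    if not 0.0 <= r < r0:
        raise ValueError("bond length must lie in [0, R0)")
    return -0.5 * k * r0 ** 2 * math.log(1.0 - (r / r0) ** 2)

# Monomer-monomer bond in reduced units (sigma = 1): the energy grows
# steeply as the bond length approaches the maximum extension R0 = 1.5.
energies = [fene(r, 1.0) for r in (0.5, 1.0, 1.4)]
```

The logarithm diverges at $r = R_0$, which is what makes the bond finitely extensible.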
A three-body potential between bonded monomer beads along the ssDNA regulates the characteristic stiffness of the DNA strands, where $K$ is a stiffness factor equal to $2\epsilon_{3\text{-body}}$, $\epsilon_{3\text{-body}}$ is 10 times $\epsilon_{mon}$, $\theta$ is the angle made by the three adjacent monomer beads and $\theta_0$ is the ideal angle, equal to 180° for the preferred linear orientation of DNA. Hydrogen-bonding sites are not subject to three-body interactions. A three-body interaction is also applied between the nanoparticle and the first two monomers of each chain to keep the monomer chains oriented perpendicular to the surface of the nanoparticle. Lastly, a three-body interaction is applied between the first monomer of each DNA graft, the nanoparticle and the first monomer of every other DNA strand on the same nanoparticle. The $\theta_0$ value for this three-body interaction is set to force the DNA grafts to the desired relative positions on the surface of the nanoparticle, which ensures that the grafts are placed symmetrically on the particle surface.
The above model was incorporated into a locally authored MD code in the NVT ensemble, where the temperature is controlled via a Nosé-Hoover thermostat. [49,50] We refer the reader to our previous publication [44] for details of the reduced model parameters and validation of the code.

Thermodynamics
We characterise the thermodynamics of DNA-grafted nanoparticle assembly by calculating the assembly/dissociation transition of the nanoparticles as a function of reduced temperature. As the temperature is lowered and the strands hybridise, the grafted particles assemble into a cluster. We define a cluster as two or more particles with at least one hybridised base pair between DNA strands of those nanoparticles. Free nanoparticles are the particles for which the grafted ssDNA is not hybridised with the strands on another particle at that time step. We track the average number of free nanoparticles as a function of temperature. We define the assembly/dissociation transition temperature, $T_d$, as the temperature at which half the total number of particles has hybridised to be part of one or more clusters.
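The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, locating $T_d$ by linear interpolation between sampled temperatures is an assumption of the sketch, and the numbers in the example are made up for demonstration and are not simulation results.

```python
def count_free(n_particles, hybridised_pairs):
    """Number of particles with no base pair to any other particle.

    hybridised_pairs holds (i, j) particle-index pairs that share at
    least one hybridised base pair at the current time step.
    """
    bound = {p for pair in hybridised_pairs for p in pair}
    return n_particles - len(bound)

def dissociation_temperature(temps, free_counts, n_particles):
    """Temperature at which the mean free-particle count crosses N/2.

    Uses linear interpolation between neighbouring samples (an
    assumption of this sketch); returns None if there is no crossing.
    """
    half = n_particles / 2.0
    points = sorted(zip(temps, free_counts))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        if (f_lo - half) * (f_hi - half) <= 0 and f_lo != f_hi:
            return t_lo + (half - f_lo) * (t_hi - t_lo) / (f_hi - f_lo)
    return None

# Illustrative (made-up) melting curve for 10 particles.
temps = [0.08, 0.09, 0.10, 0.11]
mean_free = [0.0, 2.0, 8.0, 10.0]
t_d = dissociation_temperature(temps, mean_free, 10)
```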

Structure
To characterise the structural features of the particle assembly, we first calculate for each nanoparticle its coordination number, $Z$, defined as the number of particle neighbours the nanoparticle has within the cluster, where two particles are considered as neighbours if they have at least one base pair hybridised between their DNA strands. We also calculate the average of the coordination number distribution to obtain $\langle Z \rangle$ of the system at a specific temperature. In addition, to characterise the size of the clusters, we calculate the ensemble average number of nanoparticles in the assembled cluster, $\langle N \rangle$, and the average radius of gyration of the assembled cluster as a function of reduced temperature. We also calculate the frequency of occurrence of various inter-particle distances, $N(r)$ versus $r$, where $r$ is the inter-particle distance. To understand the types of hybridisation that lead to the above structural characteristics, we also calculate a histogram of types of strand hybridisation: hybridisation between short strands, between one short strand and one long strand, and between long strands (see Figure S1 in supplementary information available via the article webpage). To determine the average shape of the cluster at varying temperatures, we calculate the average relative shape anisotropy (RSA) parameter, [51] where an RSA value of 0 denotes that the particles are arranged perfectly isotropically (i.e. with spherical symmetry) and an RSA value of 1 denotes that the particles are perfectly anisotropic (i.e. rod-like) in their arrangement. The details of the RSA calculation are given in our recent work. [44] Lastly, to characterise the network structure within the assembled cluster, we quantify the size and number of the smallest loops formed by linked nanoparticles within the assembled cluster. We find loops by tracking, for every nanoparticle, paths that start from that particle, go through its neighbours within the assembled cluster and end in that particle.
We then select from these loops the unique loops that do not contain any other loop. In the supplementary information (Figure S7, available via the article webpage), we show two examples: in one assembled cluster, five nanoparticles form three unique loops of size three, and in another assembled cluster, six nanoparticles form three unique loops, with one loop of size four and two loops of size three. The error bars for the analysis presented in the figures are calculated as the standard error from 10 simulation trials for each system.
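A minimal Python sketch of part of this analysis is given below: building the neighbour map, identifying clusters as connected components, computing the mean coordination number and computing a standard error over trials. The function names are hypothetical, and averaging $Z$ only over particles that belong to a cluster is an assumption of the sketch; the loop analysis is not covered.

```python
import math
from collections import defaultdict

def neighbour_map(pairs):
    """Map each particle to the set of particles it shares >= 1 base pair with."""
    nbrs = defaultdict(set)
    for i, j in pairs:
        if i != j:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs

def clusters(pairs):
    """Connected components of the neighbour graph (each has >= 2 particles)."""
    nbrs = neighbour_map(pairs)
    seen, found = set(), []
    for start in list(nbrs):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            p = stack.pop()
            if p in members:
                continue
            members.add(p)
            stack.extend(nbrs[p] - members)
        seen |= members
        found.append(members)
    return found

def mean_coordination(pairs):
    """Average number of neighbours per clustered particle."""
    nbrs = neighbour_map(pairs)
    return sum(len(v) for v in nbrs.values()) / len(nbrs) if nbrs else 0.0

def standard_error(values):
    """Standard error of the mean over independent trials."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(var / n)

# Example: a triangle of particles (0, 1, 2) and a separate dimer (3, 4).
example_pairs = [(0, 1), (1, 2), (0, 2), (3, 4)]
cluster_sizes = sorted(len(c) for c in clusters(example_pairs))
z_mean = mean_coordination(example_pairs)
```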

Parameters
We study 10 DNA-grafted nanoparticles in a simulation box of size $(100\ \text{nm})^3$, corresponding to a dilute concentration of $c = 0.00001$ particles/nm$^3$. The nanoparticles are of diameter $D = 4\sigma_{mon}$ with $\sigma_{mon} \approx 1$ nm, as described in the model section. We vary the number of grafts from $N_g = 8$ to 16, which on particles of $D = 4$ nm corresponds to a grafting density of 0.16 and 0.32 chains/nm$^2$, respectively. Each particle has an equal number of short strands and long strands. We vary the ratio of short to long strand length, $N_{short}:N_{long} = 1:1.5$, 1:2 and 1:3. All strands in a system, short and long, have the same number of G/C nucleotides, so that we can isolate the effects that come purely from the physical heterogeneity introduced by bidispersity. The short strands are composed only of G/C nucleotides (no spacers), and therefore have $N_{short}$ G/C nucleotides. The long strands are composed of both G/C nucleotides and spacer nucleotides that do not participate in hybridisation. In these long strands, the G/C nucleotides are located on the outermost portion of the strand (farthest from the particle surface), and the remaining nucleotides simply serve as spacers. We vary the number of G/C nucleotides relative to the particle diameter, $N_{short}/D = 1$, 1.5 and 2. The sequences that correspond to each bidisperse $N_{short}:N_{long}$ ratio and $N_{short}/D$ are summarised in Table 1.
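The quoted concentration, grafting densities and long-strand lengths follow from simple arithmetic, which the short Python sketch below reproduces. The function names are hypothetical, and the grafting density is taken as the number of grafts divided by the surface area $\pi D^2$ of a sphere of diameter $D$.

```python
import math

def number_concentration(n_particles, box_edge_nm):
    """Particles per nm^3 in a cubic box."""
    return n_particles / box_edge_nm ** 3

def grafting_density(n_grafts, diameter_nm):
    """Grafted chains per nm^2 of sphere surface (area = pi * D^2)."""
    return n_grafts / (math.pi * diameter_nm ** 2)

def long_strand_length(n_short, length_ratio):
    """Number of nucleotides in the long strand for a given N_long/N_short."""
    return round(n_short * length_ratio)

c = number_concentration(10, 100.0)
densities = [round(grafting_density(n_g, 4.0), 2) for n_g in (8, 16)]
n_long = [long_strand_length(4, ratio) for ratio in (1.5, 2, 3)]
```

For $N_{short} = 4$, the three length ratios give $N_{long} = 6$, 8 and 12, matching the 4:6, 4:8 and 4:12 systems discussed in the results.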

Effect of bidispersity in DNA strand length at constant number of grafts and number of G/C nucleotides
In Figure 1(a), we show the effect of increasing bidispersity in DNA strand length on the thermodynamics of cluster formation for 10 particles in a $(100\ \text{nm})^3$ simulation box, with particles of diameter $D = 4$ nm each grafted with $N_g = 8$ strands (an equal number of short, $N_{short}$, and long, $N_{long}$, strands), number of G/C nucleotides relative to the particle diameter $N_{short}/D = 1$, and sequences 1-3 of Table 1. We also show the monodisperse cases corresponding to $N_{short} = 4$ (black filled squares, solid line) and $N_{long} = 12$ (black filled triangles, solid line). For all systems, as temperature decreases, the number of free particles decreases and DNA-grafted nanoparticles assemble into a cluster. As bidispersity in strand length increases, the particles begin to assemble at higher temperatures; however, the cluster dissociation temperature, $T_d$, for all systems in Figure 1 is similar, and occurs at $T^* \approx 0.095$. Nanoparticle assembly is driven by the system maximising enthalpically favourable contacts between the grafted DNA strands while minimising the loss of particle translational and rotational entropy incurred when particles pack closely in the assembled state. As all strands have the same number of G/C nucleotides per strand in these systems, all systems should have a similar enthalpic drive for hybridisation. While $T_d$ does not vary significantly with the bidispersity in these conditions, at the low temperature $T_{low} = 0.08$, the number of free particles decreases with increasing bidispersity. In contrast, in the case of monodisperse short strands, there is a non-zero number of free nanoparticles at $T_{low} = 0.08$. This suggests that with increasing bidispersity the entropic losses from the assembly decrease, thus favouring the assembly of all particles.
Table 1. Sequences studied for each ratio of short to long strand length, $N_{short}:N_{long}$, and G/C content of $N_{short}$ relative to the particle diameter, $N_{short}/D$.

In the bidisperse cases, when the long strands on one particle hybridise with short strands or long strands on another particle, the inter-particle distance is larger than when short strands of one particle hybridise with short strands of another particle. Larger inter-particle distances relieve particle crowding upon assembly, and are favourable entropically. In the case of monodisperse short strands, the particles have to assemble at closer distances than with bidisperse strands, and because the entropic loss from tightly packed particles in the cluster is high, not all the particles in the system are able to cluster together at the lowest temperature, with the number of free nanoparticles equal to $1.8 \pm 0.2$ at $T_{low} = 0.08$. In contrast, the presence of long strands, either in a bidisperse setting or in a monodisperse setting, reduces the entropic losses coming from tightly packed particles, thus favouring the particle assembly and resulting in zero free particles at $T_{low} = 0.08$. Going from monodisperse short to monodisperse long strands, and with increasing bidispersity, we observe the transition becoming slightly sharper. Thus, the long strands in the bidisperse cases dominate the thermodynamics of assembly, causing the bidisperse system to behave differently from the corresponding monodisperse short systems, but similarly to the corresponding monodisperse long systems. Next, we characterise the structure of the assembled nanoclusters as a function of bidispersity in strand lengths. Figure 1(b) presents the ensemble average number of nanoparticles per cluster, $\langle N \rangle$, at each temperature. Below $T_d$, as bidispersity in strand lengths increases, $\langle N \rangle$ increases.
With increasing bidispersity and increasing length of the long strands ($N_{long}$ increases), the assembly of particles is increasingly favoured, leading to a larger number of particles coming together in a cluster. The 4:12 system has a similar $\langle N \rangle$ at temperatures below $T_d$ to the 12:12 system, the corresponding monodisperse long system, and has a significantly different $\langle N \rangle$ from the 4:4 system, the corresponding monodisperse short system, yet again suggesting that the long strands drive the assembly of the particles with bidisperse strands. Figure 1(c) shows the average coordination number, $\langle Z \rangle$, at each temperature. For the monodisperse short strands case, which forms relatively small clusters, the average number of neighbours per particle in the cluster is low ($\langle Z \rangle = 1.6 \pm 0.2$). As bidispersity in strand lengths increases, $\langle Z \rangle$ increases from $2.5 \pm 0.2$ for $N_{short}:N_{long} = 4:6$ to $3.0 \pm 0.2$ for $N_{short}:N_{long} = 4:12$, in part because $\langle N \rangle$ increases (Figure 1(b)). Interestingly, unlike the prior metrics where 4:12 and 12:12 were similar, the 4:12 system has a significantly different $\langle Z \rangle$ from the 4:4 and 12:12 systems, suggesting that the presence of short and long strands leads to different structures than the corresponding monodisperse systems. In Figure 1(d), we observe that with increasing bidispersity in strand length the radius of gyration of the cluster increases as well. As $N_{long}$ increases, one can expect nanoparticles to be spaced farther apart, which in combination with increasing $\langle N \rangle$ leads to an increase in the overall size of the cluster. From the RSA data, we characterise the shape of the cluster. Figure S2 in supplementary information, available via the article webpage, shows how the shape and size of the cluster evolve from $T_{high}$ to $T_{low}$. When the assembled clusters have fewer particles, the shape of the cluster is relatively anisotropic (high values of $\langle RSA \rangle$ and low values of $\langle N \rangle$ in Figure S2).
When the assembled clusters have a large number of nanoparticles, the clusters are relatively isotropic (low $\langle RSA \rangle$ and high $\langle N \rangle$ in Figure S2). Based on the results in Figure 1, it is not surprising that as the temperature decreases $\langle N \rangle$ increases and $\langle RSA \rangle$ decreases in Figure S2, and that with increasing bidispersity in strand lengths, the transition to higher $\langle N \rangle$ and lower $\langle RSA \rangle$ shifts to higher temperatures.
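To make the two limits of the RSA parameter concrete, the sketch below computes it from particle centre coordinates using the commonly used gyration-tensor definition, $\kappa^2 = \tfrac{3}{2}\,\mathrm{tr}(S^2)/(\mathrm{tr}\,S)^2 - \tfrac{1}{2}$. That this matches the calculation of ref. [44] in every detail is an assumption; the function name is hypothetical, and periodic boundary conditions are not handled.

```python
def relative_shape_anisotropy(coords):
    """Relative shape anisotropy of a set of 3D points.

    Uses kappa^2 = 1.5 * tr(S^2) / tr(S)^2 - 0.5, where S is the
    gyration tensor; 0 for a perfectly isotropic arrangement and 1 for
    points on a line. Coordinates are assumed to be unwrapped.
    """
    n = len(coords)
    centre = [sum(c[k] for c in coords) / n for k in range(3)]
    # Gyration tensor: mean outer product of displacements from the centre.
    s = [[sum((c[a] - centre[a]) * (c[b] - centre[b]) for c in coords) / n
          for b in range(3)] for a in range(3)]
    trace = s[0][0] + s[1][1] + s[2][2]
    if trace == 0.0:
        raise ValueError("all points coincide; shape is undefined")
    trace_of_square = sum(s[a][b] * s[b][a] for a in range(3) for b in range(3))
    return 1.5 * trace_of_square / trace ** 2 - 0.5

# Three collinear particles (rod-like) and six particles on the vertices
# of an octahedron (isotropic gyration tensor).
rod = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
rsa_rod = relative_shape_anisotropy(rod)
rsa_octahedron = relative_shape_anisotropy(octahedron)
```

Small clusters such as dimers and linear trimers sit near the rod-like limit, which is consistent with the high RSA values reported when few particles have assembled.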
Figure 1. [...] Table 1 for $N_{short}/D = 1$. Also shown are monodisperse cases 4:4 with sequence GGCC and 12:12 with sequence AAAAAAAAGGCC.

Next, we present the results that show how bidispersity in DNA strand length affects the inter-particle distances in the assembled cluster (Figure 2, top panel) and the ssDNA:ssDNA hybridisation histogram between the particles (Figure 2, middle panel). In Figure 2(a), at $T_{low} = 0.08$ for $N_{short}:N_{long} = 4:6$, we observe a small peak in $N(r)$ at $r - D = 4$ nm, which corresponds to the distance between the surfaces of particles that have a short strand hybridised to another short strand, and a broader peak at $r - D = 5$-7 nm, which corresponds to the distance between the surfaces of particles that have a short strand partially hybridised to a long strand or a long strand hybridised to another long strand. For the strand length ratio $N_{short}:N_{long} = 4:8$ (Figure 2(b)), there is a small peak at $r - D = 4$ nm, which corresponds to the distance between the surfaces of particles with short-short hybridisation, a larger peak at $r - D = 7$-8 nm, which corresponds to the distance between the surfaces of particles with short-long hybridisation, and a large peak at $r - D = 10$-11 nm, which corresponds to the distance between the surfaces of particles with long-long hybridisation. For the strand length ratio $N_{short}:N_{long} = 4:12$ (Figure 2(c)), $N(r)$ has peaks at $r - D = 4$, $r - D = 11$ and $r - D = 18$ nm, which correspond to short-short, short-long and long-long hybridisation, respectively.
The ssDNA:ssDNA hybridisation histogram in Figure 2 (middle panel) shows the frequency of a short strand hybridising to another short strand (short-short), a short strand hybridising partially to a long strand (short-long) or a long strand hybridising to another long strand (long-long). Short-long and long-long hybridisations are more frequent than short-short hybridisation in all bidisperse cases. The results shown in Figure 2 are at the lowest temperature; it is fair to expect that the strand hybridisation changes with changing temperature. When we compare the results shown in the middle row of Figure 2 with those at $T^* = 0.09$ and 0.10, we find the following. In cases with low bidispersity, at high temperatures, the long-long hybridisation dominates, and as temperature decreases, the short-short and short-long hybridisations grow. In cases with intermediate bidispersity, at high temperatures, short-long and long-long hybridisations dominate (and short-short hybridisation is zero), and as temperature decreases, the short-short hybridisation grows while still remaining lower in frequency than long-long and short-long hybridisation. At high bidispersity, the relative ratio of short-short, short-long and long-long hybridisation does not change as temperature decreases, and all occurrences grow by a small amount. This suggests that in general, in the presence of bidisperse strands, the long strands dominate the hybridisation between the particles at high temperatures, and as the temperature decreases, the short strands on the particles find the nearest short strands to hybridise to.
[Figure caption fragment: Table 1 for N_short/D = 1. For (d–f), x-axis s:s denotes short strand to short strand hybridisation, s:l denotes short strand to long strand hybridisation and l:l denotes long strand to long strand hybridisation. Plots (g–i) include monodisperse N_short (GGCC) and N_long. N_long sequence for (g) (AAGGCC), for (h) (AAAGGCC) and for (i) (AAAAAAAAGGCC). Vertical dashed lines (in each plot from left to right) indicate for each bidisperse system the expected peaks that would result from short–short, short–long and long–long ssDNA:ssDNA hybridisation. The y-values of the monodisperse cases are divided by (g) 10, (h) 10 and (i) 15 units and presented in the curves to facilitate comparison to bidisperse cases in the same plot.]
In Figure 2 (bottom panel), we compare explicitly bidisperse systems with the corresponding monodisperse systems at T* = 0.08. In Figure 2(g), N(r) for the monodisperse short strands (4:4) case has a single peak at r − D ~ 4 nm, which corresponds to a distance between the surfaces of particles with short–short hybridisation, and N(r) for the monodisperse long strands (6:6) case has a single peak at r − D ~ 7 nm, which corresponds to a distance between the surfaces of particles with long–long hybridisation. The bidisperse case (4:6) has the additional broad peak at r − D = 5–7 nm, which as previously discussed corresponds to short–long hybridisation. For both N_short:N_long = 4:8 (Figure 2(h)) and 4:12 (Figure 2(i)), we observe a monodisperse short strand peak at r − D ~ 4 nm, which corresponds to a distance between the surfaces of particles with short–short hybridisation. The monodisperse long strands have r − D peaks at ~12 nm for N_long = 8 (Figure 2(h)) and 18 nm for N_long = 12 (Figure 2(i)), both of which correspond to a spacing between the surfaces of particles with long–long hybridisation.
The bidisperse cases in Figure 2(h),(i) have an additional peak that corresponds to short–long hybridisation (at r − D = 8 nm in Figure 2(h) and r − D = 12 nm in Figure 2(i)). Thus, because of the three types of hybridisation possible between the short and long strands, the bidisperse systems produce a broader distribution of inter-particle distances than that seen in the monodisperse ssDNA-grafted nanoparticle clusters.
The dominance of the long strands leads to the bidisperse case and monodisperse long strands having similar thermodynamics of assembly and similar number of particles in the assembled cluster. However, the presence of the short strands amidst long strands in the bidisperse case changes the coordination number within the assembled structure and the distribution of interparticle distances within the cluster from the corresponding monodisperse long and monodisperse short cases.

Effect of varying number of G/C nucleotides and average strand length at constant bidispersity in DNA strand length and number of grafts
We next determine the effect of varying the number of G/C nucleotides at constant bidispersity in DNA strand length, N_short:N_long = 1:1.5. Corresponding figures for strand length ratios N_short:N_long = 1:2 and 1:3 are shown in the supplementary information (Figures S3–S5, available via the article webpage). We study numbers of G/C nucleotides equal to 4, 6 and 8, corresponding to sequences 1, 4 and 7, respectively, in Table 1. At a bidispersity ratio of 1:1.5, the increasing number of G/C nucleotides results in actual strand length ratios of 4:6, 6:9 and 8:12. As the number of G/C nucleotides increases, the cluster dissociation temperature, T_d, increases (Figure 3(a)). In our previous study with monodisperse DNA strands, where we varied the G/C content at constant number of grafts, we found that T_d increases as G/C content increases because of the increased enthalpic drive for the assembly.[44] In the presence of bidispersity in grafts, this trend is preserved. Figure 3(b)–(d) presents ⟨N⟩, ⟨Z⟩ and the radius of gyration of the cluster for increasing number of G/C nucleotides. In contrast to the systems with the lower number of G/C nucleotides (4:6), in the systems with the higher number of G/C nucleotides (8:12) all 10 nanoparticles assemble into a cluster by T_low (Figure 3(b)). This is due to the increased enthalpic drive arising from the higher number of G/C nucleotides in the 6:9 and 8:12 cases compared with that in the 4:6 case. In addition, as the average strand length increases from 4:6 to 8:12, the particles do not have to pack closely upon assembly, thereby decreasing the crowding in the assembled cluster and leading to a lower loss in entropy upon assembly. The differences in ⟨N⟩ between 4:6, 6:9 and 8:12 are larger at higher temperatures than at lower temperatures.
Average coordination number or number of neighbours per particle, ⟨Z⟩, for bidispersity ratios 6:9 and 8:12 is ~3.5 (Figure 3(c)), while for bidispersity ratio 4:6, which forms clusters of fewer particles, ⟨Z⟩ ~ 2.5, suggesting a complex interplay of bidispersity and actual strand lengths. Also, as the number of G/C nucleotides increases, and as the average strand length increases, the radius of gyration of the assembled clusters increases (Figure 3(d)). For example, the radius of gyration of the cluster at T_low = 0.08 for the bidisperse case of 4:6 is ~10 nm versus ~18 nm for 8:12.
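For readers who wish to compute comparable structural quantities from particle coordinates, a minimal Python sketch follows. It rests on assumptions that are not taken from this paper: neighbours are identified with a simple centre-to-centre distance cutoff (the actual neighbour criterion is not restated in this section), all particles have equal mass, and periodic boundaries are ignored. The function names and the example coordinates are hypothetical.

```python
import math


def radius_of_gyration(positions):
    """Root-mean-square distance of equal-mass particles from their centroid."""
    n = len(positions)
    centre = [sum(p[i] for p in positions) / n for i in range(3)]
    mean_sq = sum(
        sum((p[i] - centre[i]) ** 2 for i in range(3)) for p in positions
    ) / n
    return math.sqrt(mean_sq)


def mean_coordination(positions, cutoff):
    """Average number of neighbours per particle within a distance cutoff.

    Assumption: a neighbour is any other particle whose centre lies within
    `cutoff`; no periodic-image handling is attempted.
    """
    n = len(positions)
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(positions[i], positions[j]) <= cutoff:
                total += 1
    return total / n


# Hypothetical coordinates in nm: three particles on a line, 10 nm apart.
example = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(radius_of_gyration(example))
print(mean_coordination(example, cutoff=12.0))
```

In this toy configuration the middle particle has two neighbours and the end particles have one each, so the average is below two even though the cluster is fully connected.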
The above trends at N_short:N_long = 1:1.5 are also seen at DNA strand length ratios N_short:N_long = 1:2 and 1:3, shown in Figure S3 of the supplementary information, available via the article webpage.
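The strand-length pairs behind the ratios quoted in this section (for example 4:6, 6:9 and 8:12 at 1:1.5) follow from simple arithmetic, sketched below in Python. The sketch assumes that the short strand length equals the number of G/C nucleotides per strand, which is consistent with the pairs listed in the text; the function name is hypothetical.

```python
from fractions import Fraction


def strand_lengths(n_gc, ratio):
    """Return (N_short, N_long) for a given G/C count and N_long/N_short ratio.

    Assumption: the short strand consists only of the G/C bases, so
    N_short = n_gc and N_long = ratio * n_gc.
    """
    n_long = n_gc * Fraction(ratio)
    if n_long.denominator != 1:
        raise ValueError("ratio does not give an integer long-strand length")
    return n_gc, int(n_long)


# Ratios 1:1.5, 1:2 and 1:3 expressed as N_long/N_short.
for ratio in ("3/2", "2", "3"):
    print(ratio, [strand_lengths(n_gc, ratio) for n_gc in (4, 6, 8)])
```

Exact fractions are used rather than floating-point numbers so that a ratio yielding a non-integer strand length is rejected instead of being silently rounded.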
When we compare the inter-particle distances of clusters formed (Figure 4(a)–(c)), as the number of G/C nucleotides increases, there is a noticeable increase in N(r) at intermediate inter-particle distances. However, the corresponding short–long hybridisation (Figure 4(d)–(f)) does not change significantly with increasing number of G/C nucleotides. In the presence of bidispersity, as the number of G/C nucleotides increases, there can be partial hybridisation between the long and short strands. This was visually confirmed in the simulation trajectories. As a result of this increasing partial short–long hybridisation, the corresponding inter-particle distances between the surfaces of particles increase without significant change in the histogram of short–long hybridisation.

3.3 Effect of number of grafts at varying number of G/C nucleotides and average strand length at constant bidispersity in DNA strand length ratio
Figure 5 presents the effect of increasing G/C nucleotides at a higher number of grafts, N_g = 16, at bidispersity ratio 1:1.5 (in contrast to N_g = 8 in Figures 3 and 4). Corresponding figures for strand length ratios N_short:N_long = 1:2 and 1:3 are shown in the supplementary information (Figures S6, S8 and S9, available via the article webpage). Again, we observe that as the number of G/C nucleotides increases, T_d increases; however, the T_d for each bidispersity ratio is shifted to a higher value on moving from N_g = 8 (Figure 3(a)) to N_g = 16 (Figure 5(a)). As the number of grafts increases, the total number of G/C nucleotides available for hybridisation between the particles in a cluster increases, causing the cluster dissociation temperature to increase due to the higher enthalpic drive. As the particle diameter is maintained constant in our study, the grafting density, defined as the number of grafts per unit surface area of the particle, increases with the number of grafts. As grafting density increases, the strands are more crowded and therefore have lower entropy in the unassembled state; the loss in conformational entropy of the strands on going from the unassembled (free particle) state to the assembled state is thus reduced, which further causes the cluster dissociation temperature to increase. These trends are similar to those seen in our previous work with monodisperse strands, where increasing the number of grafts while keeping the G/C content per strand constant leads to increasing T_d.
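The grafting density defined above (number of grafts per unit surface area of the particle) can be evaluated directly for the two graft numbers studied. The short Python sketch below assumes a spherical particle of diameter D = 4 nm, the value given in the Figure 9 caption, so that the surface area is πD²; the function name is hypothetical.

```python
import math


def grafting_density(n_grafts, diameter_nm):
    """Grafts per nm^2 on a sphere of the given diameter (area = pi * D^2)."""
    return n_grafts / (math.pi * diameter_nm ** 2)


for n_g in (8, 16):
    print(n_g, round(grafting_density(n_g, diameter_nm=4.0), 3), "grafts/nm^2")
```

At fixed diameter the grafting density is simply proportional to N_g, so doubling the number of grafts from 8 to 16 doubles the density.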
Some trends in the structure of the clusters with increasing G/C nucleotides at bidispersity ratio 1:1.5 at N_g = 16 are similar to those at N_g = 8. For example, increasing the number of G/C nucleotides leads to clusters of increasing number of particles (Figure 5(b)) and increasing size (Figure 5(d)) by T_low. Interestingly, the number of neighbours per particle for all three systems is similar both at N_g = 16, as seen by ⟨Z⟩ ~ 3 at T_low (Figure 5(c)), and at N_g = 8 with ⟨Z⟩ ~ 3 at T_low (Figure 3(c)), suggesting that at this bidispersity ratio of 1:1.5, the increased crowding among the 16 strands limits the number of strands hybridised, and the number of neighbours a particle can have in the assembled state. However, one could expect that as the bidispersity increases, the presence of significantly shorter DNA strands among long DNA strands could reduce the crowding and lead to a larger number of strands hybridised, and a higher number of neighbours that a particle can have (or larger coordination number) in the assembled cluster.
[Figure caption fragment: Table 1). For (d–f), x-axis s:s denotes short strand to short strand hybridisation, s:l denotes short strand to long strand hybridisation and l:l denotes long strand to long strand hybridisation.]
To understand the effect of increasing number of grafts with increasing bidispersity, while maintaining the same number of G/C nucleotides per strand, in Figure 6(a), we present the number of free nanoparticles for N_g = 8 (dashed lines) and N_g = 16 (solid lines) for two different strand length ratios: 4:6 (green) and 4:12 (blue). For both bidispersity ratios, we observe that T_d for the higher number of grafts is shifted to higher temperature than for the lower number of grafts. With regard to the structure of clusters at higher number of grafts, we observe that ⟨N⟩ (Figure 6(b)) and the radius of gyration of the cluster (Figure 6(d)) change with the number of grafts in a similar fashion for both bidispersities. This is not the case for the average coordination number. Interestingly, for the bidisperse case 4:12 at T_low, ⟨Z⟩ increases from 2.8 ± 0.2 for N_g = 8 to 3.7 ± 0.3 for N_g = 16, which is a larger increase than the minimal one seen for 4:6. Thus, increasing bidispersity reduces the crowding among the strands, leading to a larger increase in ⟨Z⟩ with increasing number of grafts than that seen in low bidispersity or monodisperse systems.
Figure 7 shows the inter-particle distances in the assembled clusters as well as the ssDNA:ssDNA hybridisation histograms for N_g = 8 and 16 for bidisperse cases 4:6 and 4:12. For both bidisperse cases, we observe a higher frequency of long–long hybridisation and negligible short–short hybridisation at higher number of grafts (Figure 7 [...] systems, in the presence of longer strands, it is harder to make short–short hybridisation due to larger crowding of grafts.
[Figure caption fragment: Table 1 for N_short/D = 1 and 2).]
We see that N(r) for N_short:N_long = 4:6 and 4:12 (Figure 7(c),(d)) has peaks at r − D ~ 4 nm, but the absence of short–short hybridisation in the ssDNA:ssDNA hybridisation histograms of Figure 7(g),(h) indicates that particles with a higher number of grafts assemble by forming partial bonds between the strands of neighbouring particles at close range. Simulation snapshots (Figure 9) confirm that systems with a higher number of grafts form both complete and partial hybridisation between the strands.
Figure 8 shows a comparison of the structure within the assembled clusters for N_g = 8 and 16 by plotting the number of loops formed in the equilibrium clusters at T_low for varying number of G/C bases. At lower grafting density (Figure 8(a)), monodisperse short strands primarily have loops of size three. The simulation snapshot for monodisperse short strands (Figure 9(a)) shows a hexagonal packing motif, confirming that unique loops within the assembled cluster should contain three nanoparticles. As the bidispersity ratio increases, loops of size four begin to form in addition to the loops of size three. This suggests that the bidispersity in DNA strand lengths disrupts the packing of the nanoparticles. This can also be seen in the simulation snapshots in Figure 9.
As the number of G/C nucleotides increases and the average length of the DNA strands increases, we find that loops of size three and four increase in frequency. When the number of G/C nucleotides increases, so does the average length of the strands, which causes percolation of the nanoparticle clusters within the simulation box (e.g. see Figure 9(i)) and results in loops of larger sizes. At the highest bidispersity for these larger cases (6:18 in Figure 8(b) and 8:24 in Figure 8(c)), the system has a similar number of loops of size five and six as loops of size three and four, with loops of size three occurring less often than loops of size four in the case of 8:24. As the number of grafts increases, the number of loops of size three increases for all bidisperse cases as well as for monodisperse long strands (Figure 8(d)). As the number of G/C bases increases at higher number of grafts, the highest bidisperse cases (6:18 in Figure 8(e) and 8:24 in Figure 8(f)) form loops of size five and six in addition to the loops of size three and four.
[Figure caption fragment: Table 1 for N_short/D = 1). For (e–h), x-axis s:s denotes short strand to short strand hybridisation, s:l denotes short strand to long strand hybridisation and l:l denotes long strand to long strand hybridisation.]
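One possible way to count loops of a given size in an assembled cluster is sketched below in Python. This is an illustration only, not the loop analysis used in this study, whose exact definition is not restated in this section. The sketch assumes that the cluster is represented as an undirected neighbour graph (particles as nodes, hybridisation-linked neighbours as edges) and that a loop of size n is a simple cycle through n distinct particles; the function name and the example graph are hypothetical.

```python
def count_loops(adjacency, size):
    """Count simple cycles containing exactly `size` nodes (size >= 3).

    `adjacency` maps each node to the set of its neighbours in an
    undirected graph. Each cycle is counted once.
    """
    if size < 3:
        raise ValueError("a loop needs at least three particles")
    found = 0

    def extend(start, node, visited, depth):
        nonlocal found
        if depth == size:
            # Close the path back to its starting node.
            if start in adjacency[node]:
                found += 1
            return
        for nxt in adjacency[node]:
            # Only visit nodes larger than the start, so every cycle is
            # enumerated from its smallest node alone.
            if nxt > start and nxt not in visited:
                visited.add(nxt)
                extend(start, nxt, visited, depth + 1)
                visited.remove(nxt)

    for start in adjacency:
        extend(start, start, {start}, 1)
    # Each cycle is traversed once in each direction.
    return found // 2


# Hypothetical neighbour graph: two triangles sharing the edge 1-2.
example_graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
print(count_loops(example_graph, 3), count_loops(example_graph, 4))
```

Note that this definition also counts the four-particle cycle around the two edge-sharing triangles; an analysis that reports only minimal (chordless) loops would need an additional filter.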

Conclusion
Using molecular dynamics simulations, we have studied systems of DNA-grafted nanoparticles that assemble through hybridisation of the grafted DNA strands, and demonstrated how the bidispersity in strand lengths induces structural changes in assembled clusters as a function of number of grafts as well as G/C content. We find that the number of grafts and the number of G/C nucleotides significantly impact how the structure of the assembled nanoclusters changes with increasing bidispersity in strand lengths. The effects depend on how the DNA-grafted nanoparticles can maximise favourable enthalpic contacts (e.g. a high number of G/C bases leads to a higher enthalpic drive) while minimising the entropic losses that arise when the particles pack closely (e.g. nanoparticle translation and rotation are restricted and the number of strand conformations is reduced). As bidispersity in strand lengths increases, at constant number of grafts and number of G/C bases, the number of nanoparticles in the assembled cluster and the number of neighbours per particle, or coordination number, in the cluster increase. This is because when the total number of G/C nucleotides is constant, and thus the enthalpic drive for the assembly is constant, the presence of long strands in the bidisperse systems helps alleviate the entropic losses seen in tightly packed particles by hybridising with the longer strands on neighbouring particles and by increasing the inter-particle spacing.
Figure 9. (Colour online) Representative snapshots of cluster formation at T_low = 0.08 for 10 particles, in a (100 nm)^3 box, each of diameter D = 4 nm and grafted with (a–j) N_g = 8 or (k–t) N_g = 16 strands with bidispersity in DNA strand length ratio N_short:N_long = 1:1.5 (column 2), 1:2 (column 3) or 1:3 (column 4) and number of G/Cs per strand (a–e, k–o) = 4 or (f–j, p–s) = 8 (sequences listed in Table 1). For each row, the monodisperse N_short (column 1) and N_long (column 5) are shown as well.
As a result of the possible short–short, short–long and long–long hybridisation between the short and long strands within a bidisperse system, there is a broad distribution of inter-particle distances within the assembled cluster in the bidisperse case, in contrast to the narrow distribution (single peak in the histogram) of inter-particle distances in the monodisperse cases. At low number of grafts, there are few short–short hybridisations and mostly short–long and long–long hybridisations, whereas at high number of grafts, short–short hybridisation is negligible and long–long hybridisation is more frequent than short–long hybridisation. At constant bidispersity, the effect of increasing the number of G/C nucleotides on the structure and thermodynamics of assembly is similar to that seen in monodisperse systems. At constant bidispersity, the effect of increasing the number of grafted DNA strands on the thermodynamics of assembly is mostly similar to the effect in monodisperse cases, studied in our previous work. In contrast, structurally, we see an interesting effect of introducing bidispersity. With increasing bidispersity, there is relief in strand crowding at higher number of grafts, and as a result, the coordination number within the assembled cluster increases more with increasing number of grafted DNA strands at high bidispersity than in low bidispersity or monodisperse cases.
We note that our model has one major limitation: it only mimics cases where the electrostatic interactions are completely screened, and the solvent and counterions are implicit. Without explicit electrostatic interactions, it is difficult for us both to replicate experimental findings on the effect of salt concentration on the assembly and to determine the differences in the secondary structure of the DNA. Despite this limitation, our study is fundamentally useful to experimentalists because it demonstrates how the structure and thermodynamics of the nanoparticle assembly depend on a complex interplay between the bidispersity in the strand lengths of DNA, or of other polymers that mimic nucleic acids, and the number of grafts and composition of the grafted strands.