PARALLEL MOLECULAR DYNAMICS SIMULATION OF DNA BASE PAIR OPENING BY SHARP BENDING
Molecular dynamics (MD) simulations were carried out to study the bending of a 20bp long B-DNA fragment with the force exerted by an external spring. The simulations run in various conditions and each simulation run for more than 70 ns and consumes a large amount of CPU time. By running parallel Gromacs with the parallelism algorithm at the instruction, data and task level, a 60-fold speedup was achieved which greatly helped to reduce the run time.
I. Introduction
Many processes inside cells and viruses require sharp bending of DNA, such as wrapping of 50 nm long DNA by 1.7 turns around histone octamers to form nucleosomes, looping of 30nm long DNA in Lac repressor stabilized DNA loops, and packaging of 6.6 m long DNA into 42 x 54 nm bacteriophage Ф29 capsid. Thus, accurate knowledge of DNA micromechanics under sharply bending conditions is critical to understand the free energy cost for DNA bending in these processes.
Under sharp bending conditions, the applicability of DNA is under debate. Recent experiments show the BDNA elastic abnormalities that deviate from the predictions by WLC model (A ≈ 50 nm): (i) DNA looping probabilities was several magnitude higher for 94 bp DNA, (ii) AFM imaging revealed higher probabilities to observe larger bending angles over short couture length, and (iii) FRET measurements of force required to bend a 25 bp is several folds lower than prediction based on Euler instability. It is suspected that DNA cannot be treated as a homogeneous elastic rod anymore, but structural molecules at such small length scale and shape bending. As a result, it is our interest to investigate the micromechanics of DNA under such conditions at atomic level. Up to date, MD simulation is the only approach to achieve this, because experiments cannot measure at such high spatial and temporal resolutions, due to the small size and transient nature of the target.
II. Simulations with parallel computing
A 20 bp DNA fragment in the middle region of 94 bp E6 DNA sequence used in Cloutier et. al. looping essay experiments was extracted:
50’GTGCGCACGAAATGCTATGC-30’
30’ CACGCGTGCTTTACGATACG– 50’
Its initial smoothly pre-bent structure was generated using X3DNA, and sharp bending conditions were implemented by connecting a zero length spring to the connected bases of the second and last second base pairs (i.e. harmonic energy constraint to terminals of DNA fragment), with spring constant k exceeds certain threshold. [Fig. 1(a)]
The MD simulations were conducted using GROMACS (version 4.5.5) package under parm99 force field with parmbsc0 corrections. The DNA was merged in rhombic dodecahedron TIP3P water box, with inscribed sphere diameter greater than the largest extension of the DNA in the simulation. The DNA negative charges were neutralized by sodium counter ions and the water was further ionized using sodium chloride to achieve 150 mM ionic strength. The system was prepared, by energy minimization using steepest descent method, then by thermolization using 200 ps velocity rescaling and 200 ps Parrinello-Rahman pressure coupling simulations to adjust its temperature and volume [Fig. 1(b)]. Finally, the prepared system was simulated using periodic boundary conditions and fast Particle-Mesh Ewald summation method, under NVT ensemble, and with constant temperature 300 K and volume of 1170 nm3.
The system shown in Fig. 1(b) consists of ~ 115, 000 atoms and requires at least 70 ns simulations for various values of spring constants k in unit of pN/nm, in order to scan the targeted conformational landscape. The DNA conformations were then collected per 1000 time steps (∆t = 2 fs) during each simulation for further analysis. This consumes lots of computational power, given the fact that a single core MD simulation on this system yields ̴0.4 ns/day using GROMACS. In order to save the overall clock time, it is necessary to utilize the parallel computing techniques in both instrument and data task level. Instrument level parallel computing assigns independent tasks across multiple nodes, and the large cluster size in HPC allows us to conduct simulations perpendicularly with each other under different k spring constraint. Data-task level parallel computing bases on the development of GROMACS, which enables a load balanced, well scalable multicore simulations. It distributes different atoms and procedures into different cores to simultaneously proceed the simulation with minimized cross communications to reduce the work load, and also optimizes the balances of usages for each core to avoid unnecessary waiting time. With the help of minimal-communication domain decomposition algorithm, full dynamic load balancing, a state-of-the-art parallel constraint solver, and efficient virtual site algorithms, the performance of GROMACS achieves a nearly linear scalability up to 64 cores (Fig. 2). As a result, we improved our productivity 50 folds to 20 ns/day by using 8 or 12 cores for each simulation and 4 to 6 simulations in parallel at the same time.
III. Results
The results show that the DNA fragment remains in intact B-form and slightly uniformly curved under k < 26 pN/nm, while generated defects usually involves dissociations of base pairs under k > 26 pN/nm. As shown in Fig. 3(a), the snapshot structure of DNA fragment at 60 ns from the simulation under spring constraint k = 28.2 pN/nm indicates the dissociated base pairs in the middle region colored by red surface. The base pairing integrity analysis, which monitors the minimum value among 2 or 3 hydrogen bond lengths for AT or GC base pairs, respectively, for each site, identifies the locations of defects. A base pairing is considered intact Watson-Crick base pairing if 0.17 < min(hi,j) < 0.23 nm, otherwise it is considered as dissociated base pairing. The above case contains dissociated 11th, 12th and 13th base pairs [Fig. 3(b)].
In order to study the effect of defects on bending, we plotted the time evolution of bending angle q10,14 in Fig. 3(c) (black), which is defined by angle between two normal directions of base pairs that enclose the defected region. It increases rapidly from 40 to 170 within 10 ns, and remains at a high level throughout the rest of the simulation. While, in comparison, the bending angles in nearby regions, q6,10 (blue) and q14, 18 (orange) more specifically, evolves from initial 40 to less than 20, simultaneously with the kink formation in q10,14, and keeps at a low level till 70ns. This indicates the local kink formation absorbs the bending constraint and releases the other region into a straighter B-form. Due to the stochastic nature of the defects excitation, multiple breaking events has been repeated and analyzed. Fig. 3 (b) shows the histogram of the breaking counts out of 12 independent events, which reveals the position preference of defects excitation: defects mostly occur at middle multi AT regions.
IV. Summary
Based on large HPC clusters in Computer Centre and development of GROMACS version 4.0, we are able to achieve a 50-fold boost in productivity using parallel computing, which enables us to tackle computational expensive MD simulations on complex system, such as DNA micromechanics under sharp bending conditions. From the results, we conclude that the DNA fragment under strong spring constraints no longer follow the WLC – homogeneous elastic rod model. Instead, it generates local defects at middle AT rich region mostly in the form of base pair dissociations. The sudden decrease of End-to-End distance (d) relaxes the rest intact regions into a straighter B-form, resulting in B-DNA separated by local defected region. The force response profiles f (d) of the B-form vs defected DNA (Data not shown), reveal a 50 pN decrease as DNA transfers from Bform to defected form in the narrow transient region, which indicates that DNA fragments with excited defects indeed become softer.