Title:

Computational methods for firstprinciples molecular dynamics with linearscaling density functional theory

Nowadays, KohmSham density functional theory (KSDFT) calculations are routinely employed in several research fields, due to the ability of KSDFT to provide great accuracy for a wide class of molecular systems and materials. Unfortunately, conventional KSDFT calculations, although very powerful, require a computational cost that goes with the cube of the system size, also known as ON³ scaling, undermining in practice the study of large systems. The advent of linearscaling, or O (N ), DFT (LSDFT) methods, relying on the locality of the electronic matter, has enabled calculations on increasingly large systems, up to tens of thousands of atoms. The central tenet of linearscaling methods is the exponential decay in real space of the singleparticle reduced density matrix. This property allows to enforce localisation constraints on the electronic structure, significantly reducing the size of the matrices, such as the Hamiltonian matrix, and increasing their sparsity. The singleparticle density matrix in the LSDFT formalism is expanded in terms of a set of atomcentred, strictly localised functions. Employing periodic boundary conditions (PBCs), the energy is minimised with respect to all the degrees of freedom in the density matrix, which allows to attain chemical accuracy using a highresolution minimal basis set. The combination of localisation constraints and sparse algebra form the substrate for O (N ) calculations. In this thesis, we used Onetep, a linearscaling DFT program, to carry out our calculations. The aim of our research is to combine molecular dynamics simulations, within the BornOppenheimer approximation (BOMD), with linearscaling DFT methods. In particular, our main goal is to advance current methodology by developing new algorithms to better exploit locality in BOMD and to reduce the computational load while maintaining DFT accuracy. Dipole moment autocorrelation functions can directly be employed to obtain the IR spectrum of a given molecular system. DFTMD simulations offer the perfect tool to generate accurate autocorrelation functions which automatically take into account the anharmonicity of the potential energy surface and temperature effects. Computational IR spectroscopy plays a pivotal role in the understanding of conformational changes of biomolecules, which tend to show several almostdegenerate conformers at room temperatures (floppy molecules). It is particularly valuable when interpreting their fingerprint in solution in combination with experimental spectra. We have implemented two algorithms for the computation of the local electronic dipole moments of molecules in solution. Both methods demand a strategy to partition the density. These methods enable the computation of IR spectra of large molecules, such as polypeptide, in explicit solvent. In the resulting IR spectrum the effect of the solvent on the target molecule is automatically captured, whereas its IR signature is removed. We expect these new functionalities to be very helpful in the understanding of how biomolecules interact with the solvent at room temperature and the effect of these interactions on conformational changes. Computationally, the most demanding step in molecular dynamics simulations is the evaluation of energies and forces. This has particular severe consequences on BOMDbased approaches. In fact, a selfconsistent field (SCF) step is required at each MD step, which in turn requires multiple energy evaluations. As a consequence, the SCF loop has a major effect on the computational load and overall walltime. MD Schemes that are capable to bypass the SCF loop altogether, e.g. CarParrinello MD or fixed charge force fields in classical MD, are inherently faster in terms of walltime per MD step, although they usually demand a much smaller timestep. Moreover, the quality of the converged density matrix is crucial for energy conservation and forces in the LSDFT BOMD approaches. In theory, the selfconsistent solution does not depend on the initial guess. In practice, the SCF optimisation is always incomplete, leading to memory effects and the breaking of timereversal symmetry, which gives rise to systematic errors in energy gradients that manifest as a drift in microcanonical energy. To ameliorate this problem, we present two integration schemes based on an extendedLagrangian (XL) approach which introduces extended or auxiliary electronic degrees of freedom to generate good quality timereversible initial guesses in the SCF loop. Both schemes are improvements over the original XL formulation, which suffered from numerical noise accumulation. The first approach, known as dissipative XL (dXL), introduces a dissipativelike term in the Verlet integration (modified Verlet) of the auxiliary degrees of freedom; the second approach, known as inertial XL (iXL), introduces a thermostat, hence requiring a velocity Verlet integrator. We have implemented both schemes in Onetep and studied their performance using liquid water as test case. In collaboration with the authors of the schemes, we have analysed the energy driftt in both classical polarisable force field MD calculations and LSDFT BOMD. We have found that both schemes are very efficient in reducing the number of SCF iterations while maintaining good energy driftss and have similar performance. We believe that our implementation and analysis will be very beneficial for future applications both with LSDFT BOMD and classical polarisable force field MD since both schemes constitute important algorithmic improvements that markedly extend the timescales accessible to classical and LSDFT MD simulations alike.
