azTotMD 2.0: Molecular dynamics with the radiative thermostat and temperature-dependent force field (CUDA version)

a

The naive implementation of molecular dynamics algorithms corresponds to NVE ensemble (Number of particles, Volume and total Energy are constants), in which a molecular system behaves quite similar to the mechanical one.In practice, simulations at constant temperature are of the greatest interest.There are some special algorithms, so called thermostats, to maintain constant temperature during simulation.Most often, for these purposes, the kinetic definition of temperature is used, which follows from the kinetic theory of gases and the ideal gas equation of state: where T is temperature of the system, k B is the Boltzmann constant, K is the kinetic energy of the system and <. . .> denotes the ensemble-averaged quantity.In addition to the ''kinetic temperature'', other definitions of temperature are also used for thermostating, although much less frequently due to their greater computational complexity.A good review of deterministic thermostats operating with different temperature definitions is given in [1].The most common deterministic thermostats are the Evans [2,3], Berendsen [4] and Nose-Hoover [5] thermostats.Evans thermostat modifies atomic velocities exponentially with the kinetic temperature constraint calculated at the beginning and at the end of a timestep.Berendsen thermostat scales atomic velocities in proportion to the difference between desired and actual temperature at each timestep.Nose-Hoover thermostat introduces an additional degree of freedom s and solves extended Lagrangian of the system.In addition to deterministic thermostats, there are stochastic ones, which use random numbers to imitate the chaotic nature of thermal motion.Among stochastic thermostats, the Andersen [6] and Langevin thermostats are widespread.In the Andersen thermostat, particles suffer from random collisions, after which their momentum is randomly selected from the Boltzmann distribution at temperature T. The Langevin thermostat uses the Langevin equation instead of Newton's equations of motion.The combination of random force and dissipative term in the Langavin equation allows to maintain kinetic temperature constant.Based on a Langevin thermostat Duffy and Rutherford [7] proposed inhomogeneous (or two-temperature) thermostat that allows the energy exchange between the atomic and electronic subsystems.Further development of this methodology led to the creation of a threetemperature thermostat to accounting of spin-lattice-electron dynamics [8].Almost all thermostats are designed as an additional stage after the motion integrator which modifies atomic velocities.For some thermostats, preliminary preparation is also required at the beginning of each timestep, a detailed description of the thermostating algorithms can be found, for example, in the DLPOLY [9] user manual.
Despite the widespread us age of these thermostats, the applicability of the kinetic molecular theory to an arbitrary aggregation state is debatable.Perhaps, by reason of this, MD often poorly reproduces experimental data on temperature dependences of some properties.In our software, we propose to use another method of stochastic thermostating in MD simulations, based on the spectrum of the black-body radiation, which unambiguously depends on temperature.In our approach the thermostating procedure is implemented as adsorption and radiation of virtual photons.It can be described as follows.At the first step, every atom adsorbs a randomly directed photon with the energy taken from the distribution for black-body radiation at the desired temperature.The atom's velocity changes in accordance with the momentum conservation principle, but the increase in the atom's kinetic energy is much smaller than the photon's energy and exactly this difference is added to a newly introduced energy term of the atom which we call ''thermal energy'', H.This energy symbolizes thermal excitation of the atom.Next, every atom radiates a photon of energy ε = γ H, where γ is an empirical constant, which is equal to 0.9.To avoid the unlimited growth of the kinetic energy of the system we need to use the following restriction on the direction of the radiated photons: where ϕ is the angle between the atom's velocity and the direction of the radiation, ε is the photon energy, m is the atom mass, c is the speed of light and v is the atom velocity before radiation.
We call the proposed thermostat as ''radiative''.
The thermostating procedure affects the kinetic energy of the system.Does the potential energy depend on temperature?The recent papers demonstrate that temperature-dependent interatomic potentials, also known as a force field, allow to describe the thermal behavior of substances correctly.For example, in paper [10] authors use the following form of the interatomic pair potential: where r is the interatomic distance, α and a i are empirical constants.This force field allows to predict thermal dependences of the density for a wide range of the organic crystals.Gong et al. [11] suggest to use the classical Lennard-Jones potential but with the temperature-dependent parameters ε and σ .In paper [12], parameters of Lennard-Jones potential depend linearly on temperature.Griffiths and Shinoda [13] use the Lennard-Jones type potentials with T-and P-dependent parameters for coarse grained MD.In the mentioned papers interatomic potentials depend explicitly on the temperature.In our opinion, inter atomic potentials should take into account the thermal excitation of atoms, which is not the same for all atoms, but is somehow distributed in the system.This distribution, in turn, depends on temperature.Thus, we suggest the following functional form of a pair potential: ) , where u ij is energy of interaction between the ith and jth atoms, C 1 , C 2 , a and b are empirical constants, r i and r j are radii of ith and jth atoms respectively, r is the interatomic distance.Here the atom radius means the Bohr radius, i.e. the distance between the atomic nucleus and the maximal electronic density.The detailed derivation of this expression will be published separately.We guess that the mentioned above thermal energy is accumulated in the form of an increase in the size of the atomic orbitals and propose the simple relation between the atomic radius and thermal energy: where r is the atomic Bohr radius, A and B are constants for a given atom type and H is the atoms thermal energy.The main benefit of the radiative thermostat is the possibility of introducing the degree thermal excitation for each atom.The thermal excitation, in turn, allows to design a temperature-dependent force field in a natural way, without the explicit use of temperature in pair potentials.The second advantage of the radiative thermostat is that it is not based on the kinetic definition of temperature derived for an ideal gas.The disadvantages of the radiative thermostat include a weak theoretical basis, further work in this area is needed.The present paper briefly describes the designed software architecture and demonstrates main results of the application of the radiative thermostat and the temperature-dependent force field.

Software description
azTotMD 2.0 is based on its previous version [14], but rewritten in SIMD parallelism with CUDA (NVidia) extension of C++ language.The program carries out all meaningful calculations on a video card (device).The process of data exchange between the device and CPU (host) is minimized, all statistics data is collected in a buffer on the device.The execution of algorithms with independent data access (the integrator of the motion, thermostats, valent bonds' and angles' processing) is distributed among the streaming multiprocessors as evenly as possible.The cell list algorithm [15] is used to speed up the calculations of the pair interactions.The cell list algorithm processing is implemented in two steps: (i) calculation of interactions between the atoms inside one cell, then (ii) between the atoms of the cells located within a cutoff radius.The program uses counting sort, which moves nearby atoms to adjacent array elements to achieve even greater performance [16].The thermostat routine is called after the second step of the Velocity Verlet integrator [17] and change the atomic velocities and thermal energies.This procedure is executed for each atom and can be performed separately.So, in parallel computing maximal possible acceleration will be achieved in the case, then number of threads is equal to the number of atoms.The program does not have graphical user interface and reads data from text files, including initial coordinates (atoms.xyz),general directives (control.txt),a forcefield (field.txt),CUDA directives (cuda.txt),list of bonds (bonds.txt,optional) and list of valent angles (angles.txt,optional).The program writes output data, including atomic coordinates, radial distribution functions, some statistics and supplement information into text files.

Software architecture
The program code consists of several modules, some of them are written in CUDA.These modules have the prefix 'cu'.The main file is named main.cu.The modules with key functionality of the software are following: • cuAngles.cu:valent angles processing; • cuBonds.cu:valent bonds processing, including bond breaking and formation; • cuElec.cu:accounting of electrostatic, including particle mesh Ewald technique; • cuInit.cu:transfer data to a video card and back; • cuMDFunc.cu:integrators of the motion and other basic functions; • cuPairs.cu:calculation of the pair interactions; • cuSort.cu:counting sort of atoms and related data; • cuStat.cu:statistics calculation and output; • cuTemp.cu:thermostats, including radiative one; • cuVdW.cu:pair potential functions, including our temperature-dependent potential; • sys_init.cpp:reading data for the simulation.

Software functionalities
The software is based on the classical molecular dynamics method with the Velocity Verlet integrator [17].The program supports the following features: • orthorhombic periodic boundary conditions; • thermostating (Nose-Hoover and radiative); • electrostatic calculations (naive, particle mesh Ewald and Fennel and Geselter potential [18]); • short-range pair potentials (including Lennard-Jones, Buckingham, Born-Mayer-Huggins and our temperaturedependent one given by Eq. ( 4)), an individual cutoff radius for each pair potential; • intramolecular potentials including valent bonds and angles; • external electric field; • a non-constant force field including electron transfer [19], dynamic formation and removing valent bonds and angles and hydrogen bonds.

Radiative thermostat
Here we are briefly describing numerical experiments with a large system of weakly interacting particles (the case study 1 in the repository).The studied system is 40'000 argon atoms in the cubic box with linear size of 1141.5 Å and interacting via pair potential from paper [20] in the Lennard-Jones form.The box size is chosen in such manner that the density is close to the density of gaseous Ar at temperature of 298 K and pressure of 1 atm.As in this case the mean distance between atoms is much higher than the potential cutoff, therefore, the most of the time particles do not interact with each other and the potential energy of the system is close to zero.The temperature of the system is controlled by the suggested radiative potential.Simulation results show that, first, the system kinetic energy as a function of simulation time reaches a plateau.This plateau value does not depend on the initial atomic velocities setting, but depends on temperature.Thus, the suggested thermostat can both accelerate and decelerate atoms depending on the target temperatures and initial velocities.Second, we establish the Maxwell-like distribution, Fig. 1, regardless of the initial distribution of atomic velocities in the system.As one can see, the narrower peaks correspond to lower temperatures.

Phase transformations
In this section we consider an effect of both the radiative thermostat and the temperature-dependent force field on an abstract system.The system consists of 4000 abstract atoms interacting via potential given by Eq. ( 4) with a cutoff distance of 6 Å.The box size is chosen in such manner that the system has enough free volume to rearrange.The radiative thermostat is applied to maintain the temperature (the case study 2 in the repository).First, initial system in the form of a cubic piece of the fcc-crystal is smoothed during the system's evolution at temperature of 200 K.This and all subsequent simulations last 200'000 steps with timestep of 1 fs.Then the system is transformed into a liquid drop by holding the temperature at 1000 K.It is of the interest that placing the system back into low temperatures (400 K and lower) gives the crystalline phases again, Fig. 2(a).It is a significant result, because a direct crystallization is a challenge for the classical molecular dynamics.Usually, researchers use some tricks to observe a crystallization like the surface potential [21], the external electric field [22] or a presence of the crystalline phase [23].One can see some defects in the system crystallized at 400 K. Radial distribution functions (Fig. 2b and c) show that the position of the first peak shifts to higher values with the growth of the temperature.This means an increase in the interatomic distances, i.e., we observe a thermal expansion.At high temperatures (1200, 1500 K) the system becomes a two-phase mixture of the gas and liquid and the density of the vapor phase grows with temperature increasing.Finally, at 3000 K we obtain a fully gaseous state.A subsequent decrease in temperature to 100 K leads to the formation of an amorphous structure, which is slowly relaxing to crystal state.However, if we put this amorphous system at a bit higher temperature, example, 200 K it crystallizes fast, 3. We exactly the same as the glass crystallization!Thus, we obtain all aggregate states of the system, including the metastable one, just by switching the temperature.Also, we observe a wide range of phenomena: the surface tension, the defect formation, the thermal expansion and the evaporation.It should be noted that these effects are obtained as a result of simultaneous use of both the radiative thermostat and the temperature-dependent pair potential.By fixing the atomic radii as A/B from formula (5), the fact is revealed that the system cannot melt even at temperature of 3000 K.

Computational efficiency and thermostats comparison
The computational efficiency of the software is tested in some ways including strong and weak scaling tests.All calculations are performed on a personal computer with NVIDIA GeForce RTX 3090 video card (clock rate of 1755 MHz).Both strong and weak scaling are performed only for the velocity integrator and radiative thermostat, the force calculations are omitted.The system of 4096 atoms is chosen for strong scaling and since there is a big difference in the execution the are obtained for two tasks of 1000 and 20000 timesteps (Table 1).The speedup is given as the ratio S(N)/S(4), where S is a speed in timesteps per second and N is the number of computing nodes.The results demonstrate good strong scaling behavior, Fig. 4(a).The weak scaling is done for 100'000 timesteps with 64 nodes per streaming multiprocessor, Table 2.The efficiency is calculated as the ratio S(N)/(N×S(1)), where S is a speed in timesteps × atoms per second and N is the number of streaming multiprocessors, Fig. 4(b).The whole software (the integrator, thermostat and force calculations) performance is tested as a function of the number of atoms in the system on all available multiprocessors with 64 nodes per the multiprocessor.Table 3 summarizes elapsed time for the simulations during 200'000 timesteps.Fig. 5(c) shows that the maximal efficiency is observed for the number of atoms close  The comparison of the effect of the radiative and Nose-Hoover thermostats on the kinetic temperature is given in Fig. 5.The results are obtained for 4096 atoms of two types with the opposite sign interacting via the Coulomb potential and short-range Buckingham or temperature-dependent (Eq.( 5)) potential and kept at 400 K.As one can see, all thermostats give oscillations around some constant temperature value.For the Nose-Hoover thermostat the amplitude of the oscillations depends on a relaxation  constant and the used force field.Sometimes, the oscillations become too high to by physical (the case of Buckingham potential and relaxation constant of 5 ps, not shown on the plot).Thus, the choice of the relaxation constant is essential for the Nose-Hoover thermostat.Note that the kinetic temperature in the case of Nose-Hoover thermostat fluctuates around the target value, while the temperature in the case of the radiative thermostat depends on a force field.Thus, in the proposed thermostating scheme, mean kinetic energy depends not only on temperature, but also on the nature of the substance.

Impact
At the moment, there is no unambiguously correct way to set the temperature in a molecular simulation, this is evidenced by the very fact of the existence of a large number of thermostats.From our point of view, temperature cannot be reduced to the average kinetic energy of molecules; or rather, temperature is the energy state of the system.Moreover, the energy state of atoms inside the system is not uniform, but is a distribution associated with the photons energy distribution in the black-body radiation.The presented software includes the thermostatic algorithm designed to test this hypothesis.In addition, we have developed and integrated in the program the temperature-dependent pair potential, which naturally follows from our ''energetic'' definition of temperature.The results mentioned in Section 3 show that the suggested approach makes it possible to obtain many temperature-dependent phenomena in a simple way.We hope that our concept is productive and allows a better understanding of the phenomenon of temperature and it will give an opportunity for more accurate prediction of temperature-dependent properties via computer simulations.
In the comparison with the previous version of azTotMD, the calculations have become parallel with usage of GPU.It allows to perform simulations of enough large systems on relatively chip personal computers in a reasonable amount of time.This parallel version has been successfully used for non-constant force field simulation of glasses [24][25][26][27][28].Although the mentioned calculations did not use the radiative thermostat, the main algorithms including the motion integrator, force calculations and etc performs well.

Conclusions
azTotMD 2.0 is a parallel molecular dynamics program which includes both conventional algorithms and the novel ''radiative'' thermostat and the physically based temperature-dependent pair potential.The combination of these novelties allows to reproduce many phenomena as the velocity's distributions, the phase transitions, the thermal expansion, the defect formation, the surface tension, the vapor saturation, the glass formation and devitrification.The thermostat algorithm has complexity of O(N).

Fig. 1 .
Fig.1.Velocity distributions for 100, 298 and 700 K in the system affected by radiative thermostat.

Fig. 2 .
Fig. 2. (a) Evolution of the system from liquid drop to crystalline phases.Radial distribution functions for the system at different temperatures: (b) a wide range and (c) the first peak.

Fig. 4 .
Fig. 4. Computational efficiency of the software: (a) the strong scaling test for the integrator and radiative thermostat; (b) the weak scaling test for the integrator and radiative thermostat; (c) the relative speedup of the whole software on the maximal available multiprocessors as a function of the system size.

Fig. 5 .
Fig. 5. Evolution of the kinetic temperature as a function of time for Nose-Hoover (NH with the specified relaxation constant in ps) or radiative thermostating and different force field (buck -Buckingham potential or tdep -temperature-dependent potential, Eq. (5)).

Table 1
Strong scaling test: Time spent on the motion integration and radiative thermostating of 4096 atoms with the different number of computing nodes.

Table 3
Total performance: Time spent on the simulation of systems of different sizes during 200'000 timesteps with 82 multiprocessors and 64 nodes per multiprocessor.