# New algorithms in Octopus for petaflops computing

#### PRACE

*Status: finished project*

**Contract Number:**

*2010PA0415*

**Starting date**

**Ending date**

The main objective of this project is to enhance the parallel capabilities of the first-principles simulation code Octopus (http://www.tddft.org/programs/octopus) and to produce a highly efficient, massively parallel version that meets the petaflops (and beyond) challenge. We have identified several important bottlenecks that have hampered the performance of the code at large scale; these should be overcome during the present project with the help of PRACE. Octopus is used to study the excited-state properties of large biological molecules and of nanostructures of complex solids using first-principles simulations. For instance, Octopus is currently used to understand the mechanisms of light absorption and energy transfer in photosynthetic complexes, including both the electronic and the ionic dynamics triggered by the absorption of light (“light-induced photophysical processes in biocomplexes”). The range of applicability of the code suite spans nano, bio and materials science, with clear multidisciplinary development contributions from physics, chemistry, biology and computer science.

The theoretical framework of Octopus relies on the time-dependent density functional theory (TDDFT) formulation of quantum mechanics. The main quantities to represent are three-dimensional functions: the density and the single-particle orbitals. The single-particle orbitals are evolved according to the time-dependent Kohn-Sham equations, in most cases taking as the initial condition the solution of the ground-state density functional theory problem, which is also obtained with Octopus. In the code, functions are represented on a real-space grid, and differential operators are approximated by high-order finite differences.
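
To make the idea of high-order finite differences on a real-space grid concrete, here is a minimal one-dimensional sketch. It is not taken from Octopus: the function name `laplacian_4th_order`, the fourth-order stencil, the periodic boundaries and the test function `sin(x)` are all assumptions chosen for illustration.

```python
import numpy as np

def laplacian_4th_order(f, h):
    """Second derivative of f on a uniform periodic 1D grid with spacing h.

    Illustrative only: uses the standard fourth-order central stencil,
    not the stencil or boundary handling of any particular code.
    """
    # Weights for grid offsets -2..2 of the fourth-order central stencil.
    weights = {-2: -1.0 / 12.0, -1: 4.0 / 3.0, 0: -5.0 / 2.0,
               1: 4.0 / 3.0, 2: -1.0 / 12.0}
    result = np.zeros_like(f, dtype=float)
    for offset, w in weights.items():
        # np.roll(f, -offset)[i] equals f[i + offset] with periodic wrap-around.
        result += w * np.roll(f, -offset)
    return result / h**2

# Check against the exact second derivative of sin(x), which is -sin(x).
n = 64
h = 2.0 * np.pi / n
x = np.arange(n) * h
max_error = np.max(np.abs(laplacian_4th_order(np.sin(x), h) + np.sin(x)))
print(f"max error on {n} points: {max_error:.2e}")
```

Widening the stencil raises the order of accuracy at the cost of more neighbouring points per grid point, which in a domain-decomposed parallel run also means wider boundary regions to exchange between processes.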

We have developed a parallel version of Octopus that has been used, for instance, to study systems of around 2,600 atoms. However, to deal with bigger systems in a bounded time, we need to substantially improve the current parallel version of the code.

Therefore, our objective is to exploit the capabilities of the new High Performance Computing (HPC) facilities to address, for the first time, the excited-state properties of large biological molecules by first-principles simulations.

### Objectives

The main problem that we had to solve was the Poisson solver. For this we proposed two main alternatives: using the libfm (FMM) library or using the PFFT library.

* FMM shows almost perfect scaling, but it has a large prefactor, so it is only useful for very big systems running on many nodes. Despite its very good scaling, it has another drawback: its precision is lower than that of the Poisson solver we currently use. A correction is in development, but it was not finished before the end of the project. Once it is finished, we expect FMM to be accurate enough for use in time-dependent iterations.

* The accuracy of PFFT is as good as that of FFT (our best and reference implementation). The disadvantage of the PFFT implementation is that it uses a completely different mesh partition from the one used in Octopus. For this reason we have implemented a new partition inside Octopus that is equivalent to the PFFT one. Nevertheless, the PFFT Poisson solver can also be used with the other partitions available in Octopus (computed with METIS or Zoltan). Regardless of the partition used, global data communication has to be done at the beginning and at the end of the PFFT Poisson solver. A solution that avoids this global communication is in development. If we hide the communication time (which is possible with an adapted FFT solver, at lower precision), we see that the scaling is even better than with FMM, and without any prefactor.
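
As an illustration of the FFT approach discussed above, the following serial sketch solves the Poisson equation $\nabla^2 v = -4\pi\rho$ in reciprocal space with NumPy. It is not the Octopus or PFFT implementation: the function name `solve_poisson_periodic`, the cubic grid, the periodic boundary conditions and the dropped zero-wavevector component are assumptions made to keep the example short. A production solver for finite systems needs a different treatment of the boundaries.

```python
import numpy as np

def solve_poisson_periodic(rho, box_length):
    """Solve laplacian(v) = -4*pi*rho on a periodic cubic grid using FFTs.

    Illustrative assumptions: rho has shape (n, n, n), the box is periodic
    with side box_length, and the average of v is fixed to zero.
    """
    n = rho.shape[0]
    # Angular wavevectors along one axis for a grid spacing of box_length / n.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2

    rho_k = np.fft.fftn(rho)
    v_k = np.zeros_like(rho_k)
    nonzero = k2 > 0.0
    # In reciprocal space the equation becomes k^2 * v(k) = 4*pi * rho(k).
    v_k[nonzero] = 4.0 * np.pi * rho_k[nonzero] / k2[nonzero]
    return np.fft.ifftn(v_k).real

# Check with rho = cos(x) in a box of side 2*pi, where v = 4*pi*cos(x) exactly.
n = 16
box_length = 2.0 * np.pi
x = np.arange(n) * box_length / n
rho = np.cos(x)[:, None, None] * np.ones((n, n, n))
potential = solve_poisson_periodic(rho, box_length)
potential_error = np.max(np.abs(potential - 4.0 * np.pi * rho))
print(f"max error: {potential_error:.2e}")
```

In a distributed run, each forward and inverse transform requires data to be redistributed between processes, and the density and potential must be mapped between the solver's data layout and the mesh partition of the rest of the code. That mapping is the global communication step described in the PFFT item above.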

The results of our work will be seen first in the HPC field and afterwards in physics. We will try to demonstrate the good scalability of the TDDFT algorithm using big physical systems and many nodes. From the physical point of view, we are trying to simulate the different fragments of the spinach photosynthetic molecule with a very basic yet accurate theory. Once the scalability issues are solved, long runs can be carried out to study how the molecule obtains energy from the sun.

### Participants

Joseba Alberdi-Rodriguez

Pablo García-Risueño

Angel Rubio

Xavier Andrade

Javier Muguerza

Agustin Arruabarrena