Flash-X, a multiphysics simulation software instrument

Flash-X is a highly composable multiphysics software system that can be used to simulate physical phenomena in several scientific domains. It derives some of its solvers from FLASH, which was first released in 2000. Flash-X has a new framework that relies on abstractions and asynchronous communications for performance portability across a range of increasingly heterogeneous hardware platforms. Flash-X is meant primarily for solving Eulerian formulations of applications with compressible and/or incompressible reactive flows. It also has a built-in, versatile Lagrangian framework that can be used in many different ways, including implementing tracers, particle-in-cell simulations, and immersed boundary methods.

Contact: flash-x-users@lists.cels.anl.gov
Table 1: Code metadata. Please note that the code repository is private only because our funding agency requires us to keep a list of people who obtain the code directly from our repository. Anyone can provide their GitHub ID and be added to the list of collaborators.

Motivation and Significance
Flash-X [1] is a new incarnation of FLASH [2,3], a multiphysics software system that has been used by multiple science communities. Flash-X is meant for use beyond existing FLASH science communities. It is designed to be easily adaptable for use by any computational scientists who rely upon differential equations as their primary mathematical model with finite-volume or finite-difference discretization. FLASH was designed only for a homogeneous, distributed-memory parallel model with bulk-synchronism, which has rendered it unsuitable for use on many newer system architectures that are heavily reliant on disparate memory spaces (e.g., accelerators). This difficulty is further exacerbated by increasing heterogeneity in hardware as well as solvers within the code. Flash-X has a fundamentally redesigned architecture that uses abstractions and asynchronous operations for performance portability across a variety of platforms, both with and without accelerators. Our design is forward-looking in that it makes minimal assumptions about which parallelization or memory models are likely to be prevalent in future platforms. The design relies upon self-describing code components of varying granularity and a toolchain that can interpret the metadata of the code components to synthesize application instances. The synthesis is done partly through assembly, partly through code translation, and partly through code generation. Some code assembly features have been imported from FLASH, but have been significantly enhanced to discretize components at a finer scope than subroutines or functions. Tools for code translation and runtime management are new and will enable orchestration of computation and data movement between distinct compute devices on a node.
In addition to the new architecture, Flash-X has newer and higher-fidelity physics solvers. Most notable among these are Spark [4] for magnetohydrodynamics, XNet [5,6] for nuclear burning, thornado [7,8] for neutrino radiation transport, and WeakLib [9,10,11] for tabulated microphysics. Additionally, Flash-X can support multiphase flow through a level-set method, which did not exist in FLASH releases [12]. Flash-X has been exercised on small clusters at Argonne National Laboratory and on leadership-class machines at Oak Ridge National Laboratory and Argonne National Laboratory. Flash-X will showcase the key performance parameters of ExaStar [13], a project under the Exascale Computing Project [14,15] (ECP), through a core-collapse supernova (CCSN) simulation on exascale machines to be deployed by the US Department of Energy. To run effectively at scale, Flash-X will rely upon the toolchain described above. Some components of the toolchain are embedded in Flash-X, while others are encapsulated into independent libraries that can be used by other codes. Note that compilation and execution of the code do not require using these external libraries; they are used only to orchestrate data movement and computation for better performance.
Along with a new architecture, Flash-X also adopts a community-based, open development model. The stewardship of the code is guided by a Council representing all the major science communities of FLASH/Flash-X. More details of our community development model are available at https://flashx.org.

Software Description
The Flash-X code is a component-based software system for simulation of multiphysics applications that can be formulated largely as a collection of partial and ordinary differential equations (PDEs and ODEs), as well as algebraic equations. The equations are discretized and solved on a domain that can have uniform resolution (UG) or adaptive mesh refinement (AMR). In Flash-X, one can select between PARAMESH [16], an octree-based library written in Fortran, and AMReX [17,18], a highly flexible, patch-based, C++ AMR library. Both AMR frameworks can interface to math libraries such as hypre [19] and PETSc [20], making those solvers available to Flash-X. Physics units are designed to be oblivious of domain decomposition. The bulk of their code is written as block-by-block updates, interspersed as needed with calls to the Grid unit's API functions for fine-coarse boundary resolution.
Hyperbolic equations are solved using explicit methods commonly used for compressible flows with strong shocks, described in Section 2.2. For elliptic equations, one can either use an included multipole solver [21], AMReX's multigrid solver, or an interface to one of the math libraries. For parabolic equations, one must rely upon library interfaces.
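As a small illustration of an explicit method for a hyperbolic equation, the sketch below applies a two-stage, second-order strong stability preserving (SSP) Runge-Kutta step, the family of time integrators that this paper attributes to Spark, to linear advection with first-order upwind differences on a periodic mesh. It is not Flash-X code: the function names, the upwind discretization, and the test problem are assumptions chosen for brevity.

```python
def upwind_rhs(u, speed, dx):
    """Spatial operator L(u) = -speed * du/dx, first-order upwind, periodic.

    Assumes speed > 0, so the upwind neighbor of cell i is cell i - 1.
    u[-1] wraps around to the last cell, which gives periodic boundaries.
    """
    return [-speed * (u[i] - u[i - 1]) / dx for i in range(len(u))]


def ssp_rk2_step(u, dt, rhs):
    """Advance u by one step of the two-stage, second-order SSP RK scheme."""
    # Stage 1: a plain forward-Euler step.
    u1 = [ui + dt * ki for ui, ki in zip(u, rhs(u))]
    # Stage 2: a second forward-Euler step, averaged with the starting state.
    return [0.5 * ui + 0.5 * (vi + dt * ki) for ui, vi, ki in zip(u, u1, rhs(u1))]


def total_variation(u):
    """Sum of jumps between neighboring cells on the periodic mesh."""
    return sum(abs(u[i] - u[i - 1]) for i in range(len(u)))


# Advect a square pulse once around a periodic unit domain.
n_cells, speed = 64, 1.0
dx = 1.0 / n_cells
dt = 0.5 * dx / speed  # CFL number 0.5
u = [1.0 if 16 <= i < 32 else 0.0 for i in range(n_cells)]
for _ in range(2 * n_cells):
    u = ssp_rk2_step(u, dt, lambda v: upwind_rhs(v, speed, dx))
```

Each stage is a forward-Euler step and the result is a convex combination of such steps, so bounds that forward Euler satisfies under its time-step restriction (here, no new extrema and no growth in total variation) carry over to the full step. This is why SSP schemes are a common choice for flows with strong shocks.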
The maintained code components are written in a combination of high-level languages such as Fortran, C, and C++, with an embedded domain-specific configuration language (DSCL) that also supports Flash-X custom macros. The DSCL permits multiple alternative definitions of macros with a built-in arbitration mechanism to select the appropriate definition for an instance of code assembly. The accompanying configuration toolchain can translate and assemble different combinations of the components to configure a diverse set of applications. Flash-X has been designed from the outset to be performant with increasing heterogeneity of both the platforms and the solvers within the code.
The code uses the Message-Passing Interface (MPI) library for communication between nodes, though more than one MPI rank can also be placed on a node. HDF5 is the default mode for I/O. Support for OpenMP, both for threading and for offloading to accelerators, is built into several, though not all, of the solvers.

Software Architecture
Flash-X has composable components with accompanying metadata that can express, for example, inter-component dependency and exclusivity, necessary state variables, etc. The metadata is encapsulated within the code components by accompanying config files and is parsed and interpreted by the configuration tool, Setup. Setup parses config files recursively, aggregates requisite components, and assembles a complete application. It also assembles the compilation/make system and runtime parameters for each component included in the application. The Setup tool also implements code inheritance through a combination of keywords in the config files and the Unix directory structure instead of using programming language supported inheritance mechanisms. When Setup parses the source tree, it treats each subdirectory as inheriting all of the files in its parent's directory. While source files at a given level of the directory hierarchy override files with the same name at higher levels, config files accumulate all definitions encountered. The schematic for inheritance is shown in Figure 1.
Figure 1: Schematic for the implementation of inheritance in Flash-X. Handling of inheritance and variants assumes three lists: one for files, one for macros, and one for runtime parameters. The flowchart in the right box gives details of how keys and runtime parameters are updated as the source tree is traversed.
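The two inheritance rules described above (source files override by name, config definitions accumulate) can be sketched in a few lines of Python. This is a toy model, not the Setup tool: the in-memory `tree` layout, the directory and file names, and the config strings are all hypothetical and stand in for a real source tree.

```python
from pathlib import PurePosixPath


def resolve(tree, leaf):
    """Walk from the top of the tree down to `leaf`, applying both rules.

    Returns (files, definitions): `files` maps each file name to the
    directory that supplies it, and `definitions` lists every config
    definition met on the way down.
    """
    files = {}
    definitions = []
    parts = PurePosixPath(leaf).parts
    for depth in range(1, len(parts) + 1):
        directory = str(PurePosixPath(*parts[:depth]))
        node = tree.get(directory, {})
        for name in node.get("files", []):
            files[name] = directory  # a deeper level overrides a same-named file
        definitions.extend(node.get("config", []))  # config entries accumulate
    return files, definitions


# Hypothetical three-level slice of a source tree.
tree = {
    "Hydro": {
        "files": ["Hydro_init.F90", "Hydro_update.F90"],
        "config": ["parameter cfl"],
    },
    "Hydro/HydroMain": {
        "files": ["Hydro_update.F90"],
        "config": ["variable dens"],
    },
    "Hydro/HydroMain/Accel": {
        "files": ["Hydro_update.F90", "Hydro_offload.F90"],
        "config": ["parameter use_gpu"],
    },
}

files, definitions = resolve(tree, "Hydro/HydroMain/Accel")
```

In this example `Hydro_init.F90` is inherited unchanged from the top level, `Hydro_update.F90` comes from the deepest directory that defines it, and all three config entries survive.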
In Flash-X parlance, the highest-level code component for a specific type of functionality is called a unit. Units can have subunits. A unit includes an API accessible to the whole code through which it interacts with other units and the driver. While each subunit can have its own sub-components with no restriction on how fine-grained they can become, the general rule of thumb is to keep them as coarse-grained as feasible for ease of maintenance. A unit can have multiple alternative implementations, one of which is required to be a null implementation. If a unit is not needed in a simulation, the null implementation is included. This feature facilitates maintaining very few implementations of the main driver while permitting many combinations of capabilities to be included in an application. Any code component can have multiple alternative implementations, though unlike the unit-level API, lower-level components do not require null implementations.
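The role of the null implementation can be illustrated with a short Python sketch. Flash-X selects implementations when the application is assembled, not at run time, so the dictionary lookup below only mimics that choice. The class names, the `advance` method, and the `configure` helper are hypothetical.

```python
class GravityNull:
    """Null implementation: same API as every other variant, no effect."""

    def advance(self, state, dt):
        return state


class GravityConstant:
    """One real alternative: constant acceleration applied to the velocity."""

    def __init__(self, g=-10.0):
        self.g = g

    def advance(self, state, dt):
        return {**state, "velocity": state["velocity"] + self.g * dt}


# Alternatives available for each unit; every unit offers a null variant.
IMPLEMENTATIONS = {"Gravity": {"null": GravityNull, "constant": GravityConstant}}


def configure(requested):
    """Pick one implementation per unit, falling back to the null variant."""
    return {
        unit: variants[requested.get(unit, "null")]()
        for unit, variants in IMPLEMENTATIONS.items()
    }


def driver_step(units, state, dt):
    """The driver calls the unit API unconditionally, whatever was selected."""
    return units["Gravity"].advance(state, dt)


without_gravity = driver_step(configure({}), {"velocity": 1.0}, 0.5)
with_gravity = driver_step(configure({"Gravity": "constant"}), {"velocity": 1.0}, 0.5)
```

The driver never tests whether gravity is present, which is what allows a small number of drivers to serve many combinations of capabilities.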
A different mechanism is used when a code component needs to become smaller than a function or a subroutine. Here, we rely on macros to implement alternative definitions of an operation, including the null case. The inheritance mechanism shown in Figure 1 arbitrates on which definition to select. The macros may also have arguments, be inlined, and be recursive. This mechanism serves two purposes. The first is developer convenience. Certain code patterns repeat often in the code, for example, invocation of iterators, bounds for loop-nests, and bounds for arrays. We have provided macros for such repeated patterns, and developers can use these at their discretion. Macros make the code compact, reduce cut-and-paste errors, and help to clarify the control flow and semantics of the code. The second, more powerful motivation is that with alternative definitions, we can generate many variants of a code component from the same source. This functionality is particularly useful when different control flow is more suitable for different compute devices. We can keep arithmetic expressions invariant while using macros for the control flow, or vice versa, thus not only eliminating code duplication but also keeping the maintained code more compact. The schematic for generating variants from a single source, where specializations are obtained through alternative macro definitions, is shown in Figure 2.
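A minimal sketch of the idea of generating variants from one source follows. It is not the Flash-X macro processor: the `@name@` marker syntax, the macro names, and the `select_definitions` helper are invented for illustration, and macro arguments are left out. The arithmetic line is written once, and only the control-flow macros differ between the two variants.

```python
import re

MACRO = re.compile(r"@(\w+)@")  # hypothetical marker syntax: @name@


def expand(text, definitions, depth=0):
    """Replace every @name@ marker, expanding nested markers recursively."""
    if depth > 20:
        raise RecursionError("macro expansion is nested too deeply")

    def substitute(match):
        return expand(definitions[match.group(1)], definitions, depth + 1)

    return MACRO.sub(substitute, text)


def select_definitions(shared, variant):
    """Arbitration: a variant's definition wins over the shared one."""
    return {**shared, **variant}


# One maintained source: control flow is macros, arithmetic is invariant.
source = "@loop_begin@\n  u(i) = u(i) + dt * rhs(i)\n@loop_end@"

shared = {
    "lo": "1",
    "hi": "n",
    "loop_begin": "do i = @lo@, @hi@",
    "loop_end": "end do",
}
# Alternative definition of the same macro for an accelerator variant.
offload = {
    "loop_begin": "!$omp target teams distribute parallel do\ndo i = @lo@, @hi@",
}

cpu_code = expand(source, select_definitions(shared, {}))
gpu_code = expand(source, select_definitions(shared, offload))
```

Both generated variants contain the same update line, so a fix to the arithmetic is made in one place and reaches every variant.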

Software Functionalities
The Flash-X distribution includes solvers for compressible and incompressible fluids, several methods for handling equations of state (EOS), source terms for nuclear burning, several methods for computing effects of gravity, level-set methods for multiphase flow, and several others. The primary formulation for PDEs in Flash-X is Eulerian, although a versatile Lagrangian framework is also included that can be configured to do computations such as tracers, particles-in-cell, immersed boundaries, etc. The vast majority of applications using Flash-X include some form of hydrodynamics or magnetohydrodynamics in their configuration. However, it is possible to configure applications that completely bypass those solvers.
Magnetohydrodynamics and Hydrodynamics: a compressible magnetohydrodynamics/hydrodynamics solver with second- or third-order strong stability preserving (SSP) Runge-Kutta (RK) time integration (Spark) [4], another compressible hydrodynamics solver with a predictor-corrector formulation [22,23], and an incompressible hydrodynamics solver with fluid-structure interaction [24] are included in the distribution. All of the solvers can be used in 1-, 2-, or 3-dimensional configurations.
Equations of State: the code supports several EOS versions suitable for a range of regimes in astrophysical flows. The simplest one is a perfect-gas EOS with a multispecies variant. Another implementation with two variants uses a fast Helmholtz free-energy table interpolation to handle degenerate relativistic electrons and positrons and also includes radiation pressure and ions (via the perfect gas approximation) [25].
Nuclear Burning: three nuclear reaction networks of varying numbers of species are included in the distribution. Approx-13 and approx-19 [26] are inherited from FLASH. XNet is a standalone code for evolving astrophysical nuclear burning and is generalizable to arbitrarily large networks as needed for improved physical fidelity of some applications.
Gravity: the gravitational potential can be treated very simply as constant, or through a Poisson solve using a multipole or multigrid method depending upon the symmetry of the density field.
Particles: this component of the code forms the basis for the Lagrangian framework [27]. Particles maintain their own spatial coordinates and are independently integrated in time. They interact with the Eulerian mesh either to obtain physical quantities needed for their advancement or to deposit quantities such as mass, charge, or energy to the mesh, depending on usage.
Incompressible Fluid Dynamics: this component of the code solves the incompressible Navier-Stokes equations for single and multiphase flow simulations with options for heat transfer and phase transitions [12]. The Navier-Stokes solver is implemented using a fractional-step temporal integration scheme that uses a Poisson solver for pressure. Multiphase interfaces are tracked with a level-set function and use ghost-fluid methods to account for forces due to surface tension and mass transfer [28]. The effect of solid bodies on the fluid is modeled using an immersed boundary method that uses Lagrangian particles [29].
Importable Modules: Flash-X uses Git submodules to import some capabilities that are independently developed and hosted in their own repositories. These include WeakLib for tabulated nuclear EOS and neutrino-matter interaction rates, and thornado for spectral neutrino radiation transport.
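To make the particle-to-mesh interaction described above concrete, the sketch below deposits particle masses onto a periodic, one-dimensional, cell-centered mesh with linear (cloud-in-cell) weights. The paper does not state which deposition scheme Flash-X uses, so the scheme, the function name `deposit_cic`, and the periodic boundary are assumptions made for illustration.

```python
import math


def deposit_cic(positions, masses, n_cells, dx):
    """Deposit particle masses onto a periodic 1D mesh of cell-centered values.

    Cell i has its center at (i + 0.5) * dx. Each particle shares its mass
    between the two nearest cell centers, weighted linearly by distance.
    Returns the mass in each cell (divide by dx to obtain a density).
    """
    grid = [0.0] * n_cells
    for x, m in zip(positions, masses):
        s = x / dx - 0.5          # position measured in cells from center 0
        left = math.floor(s)      # index of the cell center at or left of x
        w_right = s - left        # fraction of the mass given to the right cell
        grid[left % n_cells] += m * (1.0 - w_right)
        grid[(left + 1) % n_cells] += m * w_right
    return grid


# Three particles on a four-cell mesh with unit spacing.
mesh_mass = deposit_cic([1.5, 2.0, 0.25], [1.0, 2.0, 4.0], n_cells=4, dx=1.0)
```

The first particle sits on a cell center and deposits all of its mass there, the second sits on a cell face and splits evenly, and the third lies left of the first cell center, so part of its mass wraps around to the last cell. The weights for each particle sum to one, so the total deposited mass equals the total particle mass.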

Illustrative Examples
We describe two example simulations using Flash-X from two different science communities. The first is a CCSN simulation that uses compressible hydrodynamics, nuclear EOS, neutrino radiation transport, and self-gravity solvers. The second is a subcooled flow boiling simulation that uses multiphase incompressible Navier-Stokes and heat advection-diffusion solvers.
We perform a CCSN simulation in spherical symmetry, initiated with a low-mass pre-collapse progenitor star previously modeled throughout all stages of stellar evolution [30]. Electron-type neutrinos and anti-neutrinos are evolved using thornado's two-moment neutrino transport solver and WeakLib's tabulated nuclear EOS [31] and neutrino-matter interaction rates [32]. Compressible hydrodynamics are evolved with Spark, and Newtonian self-gravity is computed using the multipole Poisson solver. For a more detailed description of the physics included, see [33]. Figure 3 shows the evolution of the ratio of electrons to baryons (electron fraction) versus radius during a critical epoch in the simulation that spans the formation of the primary shock wave during core "bounce", the phenomenon of infalling matter colliding with, and bouncing off of, the newly formed neutron star.
Figure 4 provides details for the subcooled flow boiling simulation, which was designed to replicate experiments performed at different gravity levels by Lebon et al. [34]. These computations used the multiphase incompressible Navier-Stokes solver along with the phase transition capability, and were performed at a resolution almost twice the previous state of the art [35,28]. Liquid coolant flows over a heater surface with a mean velocity U_0, leading to phase change and formation of vapor bubbles. These vapor bubbles grow, merge, and finally depart the heater surface due to buoyancy, which introduces turbulence in the domain. The heat transfer associated with this turbulence is an important parameter in designing cooling systems for automotive and industrial components, but is difficult to quantify through experimental observations and measurements. With Flash-X we are able to address this challenge through targeted high-fidelity simulations that quantify the contribution of turbulent heat flux.

Impact
FLASH has been an influential code for computing astrophysical flows almost since its inception. FLASH's scientific impact is clearly demonstrated by the citation history of the original paper describing the code, as shown in Figure 5. Analysis in [36,37] further quantifies the scientific significance and impact of the code on science. FLASH has not only been used extensively for science, it has also been among the pioneers in giving due importance to software quality and adopting rigorous auditing and productivity practices [38].
In recent years, FLASH's use has been diminishing in several communities because of its inability to use accelerators effectively. Flash-X is designed to fill this gap and become a reliable multiphysics simulation code for the communities that earlier relied on FLASH. At least two major communities of FLASH users, stellar astrophysics [4,33] and fluid-structure interactions [12], are already transitioning to Flash-X, with some users now exclusively using Flash-X. These use cases have also reported performance gains with the use of GPUs. Not all of FLASH's physics capabilities are available in Flash-X yet. However, since Flash-X is open source, it is expected that interested users will assist in transitioning their capabilities of interest to the Flash-X architecture and help grow the Flash-X community. Additionally, the new toolchain for orchestration of data and work movement is still in the early stages of being exercised. Preliminary performance studies of the runtime tool have been very encouraging [39]. It is expected that full performance gains will have been realized by the next major release.

Conclusions
Sustained funding under the ECP has permitted modernization of a highly capable community code for current and future platforms. With Flash-X, the FLASH science communities can embrace heterogeneity and use available hardware effectively. With the move to an open, community-based development model, users are assured of continuity and support for the code without depending on a single funding source. FLASH has had a long history of scientific discovery, and Flash-X aims to follow in that tradition. With more modern solvers and flexible architecture, Flash-X can continue to be a very useful resource for science domains that rely on modeling of partial differential equations.

Conflict of Interest
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

ecosystem, including software, applications, hardware, advanced system engineering, and early testbed platforms, in support of the nation's exascale computing imperative.