Theoretical Foundation

This section describes the mathematical and numerical foundations of FLUXOS.

Governing Equations

Shallow Water Equations (Hydrodynamics)

FLUXOS solves the 2D shallow water equations (Saint-Venant equations) in conservative form:

Mass Conservation:

\[\frac{\partial h}{\partial t} + \frac{\partial (hu)}{\partial x} + \frac{\partial (hv)}{\partial y} = S_h\]

Momentum Conservation (x-direction):

\[\frac{\partial (hu)}{\partial t} + \frac{\partial}{\partial x}\left(hu^2 + \frac{1}{2}gh^2\right) + \frac{\partial (huv)}{\partial y} = gh(S_{0x} - S_{fx})\]

Momentum Conservation (y-direction):

\[\frac{\partial (hv)}{\partial t} + \frac{\partial (huv)}{\partial x} + \frac{\partial}{\partial y}\left(hv^2 + \frac{1}{2}gh^2\right) = gh(S_{0y} - S_{fy})\]

Where:

  • \(h\) = water depth [m]

  • \(u, v\) = depth-averaged velocities in x and y directions [m/s]

  • \(g\) = gravitational acceleration [m/s²]

  • \(S_h\) = source/sink term for water [m/s]

  • \(S_{0x}, S_{0y}\) = bed slopes in x and y directions [-]

  • \(S_{fx}, S_{fy}\) = friction slopes in x and y directions [-]

Advection-Dispersion-Reaction Equation (Solute Transport)

For solute transport, the 2D ADE is solved:

\[\frac{\partial (hC)}{\partial t} + \frac{\partial (huC)}{\partial x} + \frac{\partial (hvC)}{\partial y} = \frac{\partial}{\partial x}\left(hD_x\frac{\partial C}{\partial x}\right) + \frac{\partial}{\partial y}\left(hD_y\frac{\partial C}{\partial y}\right) + S_C\]

Where:

  • \(C\) = solute concentration [kg/m³]

  • \(D_x, D_y\) = dispersion coefficients [m²/s]

  • \(S_C\) = source/sink term for solute [kg/m³/s]

Numerical Methods

Finite Volume Method

FLUXOS uses a cell-centered finite volume method. Two mesh types are supported:

Regular Cartesian Mesh:

  • Structured grid with rectangular cells of uniform size dxy

  • State variables (h, qx, qy, C) stored at cell centers

  • Fluxes computed at cell faces in x and y directions (directional sweep)

Unstructured Triangular Mesh:

  • Arbitrary triangular cells with edge-based flux computation

  • State variables stored at cell centroids

  • Fluxes computed at each edge, rotated into the edge-normal frame

  • Cell update: \(\Delta U_i = -\frac{\Delta t}{A_i} \sum_{e \in \partial \Omega_i} F_e \cdot L_e\)

Roe’s Approximate Riemann Solver

The numerical fluxes at cell interfaces are computed using Roe’s approximate Riemann solver, which provides:

  • Upwind discretization for advective terms

  • Proper treatment of shock waves and discontinuities

  • Robust handling of transcritical flows

The Roe flux is computed as:

\[F_{i+1/2} = \frac{1}{2}(F_L + F_R) - \frac{1}{2}\sum_{k=1}^{3}|\tilde{\lambda}_k|\tilde{\alpha}_k\tilde{r}_k\]

Where \(\tilde{\lambda}_k\), \(\tilde{\alpha}_k\), and \(\tilde{r}_k\) are the Roe-averaged eigenvalues, wave strengths, and eigenvectors.

Rotated Roe Solver (Triangular Mesh):

For the unstructured triangular mesh, the Roe solver operates in an edge-normal reference frame:

  1. MUSCL-reconstruct left/right states to the edge midpoint

  2. Apply hydrostatic reconstruction: \(z_{b,face} = \max(z_{b,L}, z_{b,R})\)

  3. Rotate velocities into edge-normal frame: \(q_n = q_x n_x + q_y n_y\), \(q_t = -q_x n_y + q_y n_x\)

  4. Compute HLL/Roe flux in the 1D normal frame

  5. Add turbulent stress contribution

  6. Rotate flux back to global frame: \(F_x = F_n n_x - F_t n_y\), \(F_y = F_n n_y + F_t n_x\)

  7. Multiply by edge length

MUSCL Reconstruction

Second-order spatial accuracy is achieved using MUSCL (Monotone Upstream-centered Schemes for Conservation Laws) reconstruction with slope limiters:

Regular Mesh:

  • Linear reconstruction of state variables at cell faces

  • MinMod or Van Leer slope limiters to prevent spurious oscillations

  • TVD (Total Variation Diminishing) property preservation

Triangular Mesh:

  • Least-squares gradient computation using neighbor cell values

  • Precomputed 2x2 LSQ inverse matrix per cell for efficiency

  • Barth-Jespersen slope limiter for monotonicity preservation on unstructured meshes:

\[\begin{split}\phi_i = \min_{e \in \partial \Omega_i} \begin{cases} \min\left(1, \frac{U_{\max} - U_i}{\tilde{U}_e - U_i}\right) & \text{if } \tilde{U}_e > U_i \\ \min\left(1, \frac{U_{\min} - U_i}{\tilde{U}_e - U_i}\right) & \text{if } \tilde{U}_e < U_i \\ 1 & \text{otherwise} \end{cases}\end{split}\]

where \(\tilde{U}_e\) is the reconstructed value at edge midpoint, and \(U_{\max}\), \(U_{\min}\) are the maximum and minimum values among the cell and its neighbors.

Hydrostatic Reconstruction (C-Property)

The triangular mesh solver uses hydrostatic reconstruction to preserve the well-balanced property (C-property / lake-at-rest):

\[z_{b,face} = \max(z_{b,L}, z_{b,R}), \quad h_L^* = \max(0, z_L - z_{b,face}), \quad h_R^* = \max(0, z_R - z_{b,face})\]

This ensures that a lake at rest with varying bathymetry produces zero flux, preventing spurious currents over uneven beds.

Time Integration

Adaptive Time Stepping:

The time step is computed based on the CFL (Courant-Friedrichs-Lewy) condition:

Regular mesh:

\[\Delta t = CFL \cdot \min_{i,j}\left(\frac{\Delta x}{\sqrt{gh_{i,j}} + |u_{i,j}|}, \frac{\Delta y}{\sqrt{gh_{i,j}} + |v_{i,j}|}\right)\]

Triangular mesh (inradius-based):

\[\Delta t = CFL \cdot \min_i \frac{r_i}{\sqrt{u_i^2 + v_i^2} + \sqrt{g h_i}}\]

Where \(r_i = 2 A_i / P_i\) is the inradius (inscribed circle radius) of cell \(i\), precomputed from the mesh geometry. This ensures stability on cells of varying size and shape.

Where CFL < 1.0 (typically 0.5-0.9) ensures numerical stability.

Explicit Euler Method:

Time advancement uses an explicit Euler scheme:

\[U^{n+1} = U^n + \Delta t \cdot R(U^n)\]

Where \(U\) is the vector of conserved variables and \(R\) is the residual.

Wetting and Drying

FLUXOS includes robust wetting and drying treatment:

  • Cells with depth below threshold (\(h_{dry}\)) are treated as dry

  • Momentum is set to zero in dry cells

  • Flux limiters prevent negative depths

  • Continuous tracking of wet/dry fronts

Velocity Desingularization (Triangular Mesh):

To handle the singularity at \(h \to 0\), the triangular mesh solver uses:

\[u = \frac{2 h q}{h^2 + \max(h^2, \varepsilon^2)}\]

Where \(\varepsilon = h_{dry}\). This smoothly transitions velocities to zero as depth vanishes, preventing division-by-zero and large spurious velocities.

Ritter Dry-Front Solution:

At wet-dry interfaces on the triangular mesh, the analytical Ritter dam-break solution is used to compute the flux across the edge, providing a physics-based treatment of the advancing wet front.

Friction Treatment

Bed friction is modeled using Manning’s equation:

\[S_{fx} = \frac{n^2 u \sqrt{u^2 + v^2}}{h^{4/3}}, \quad S_{fy} = \frac{n^2 v \sqrt{u^2 + v^2}}{h^{4/3}}\]

Where \(n\) is Manning’s roughness coefficient [s/m^(1/3)].

Edge-Based ADE Solver (Triangular Mesh)

The advection-dispersion equation on the triangular mesh uses an edge-based formulation:

  • Advection: Upwind scheme based on the hydrodynamic mass flux direction at each edge

  • Dispersion: Fick’s law with effective dispersion coefficient: \(F_{disp} = -D_{eff} \cdot \frac{C_R - C_L}{d_{LR}} \cdot L_e\)

  • Mass conservation: Outgoing flux limited to available mass in the donor cell

  • Bounding: Concentrations bounded by neighbor min/max to prevent spurious oscillations

Boundary Conditions

Regular Mesh:

  • Dirichlet (Fixed Value): Specified water level or discharge at domain boundaries

  • Neumann (Zero Gradient): Free outflow boundaries with zero normal gradient

  • Internal Weirs: Weir flow equations for internal hydraulic structures

Triangular Mesh:

Boundary conditions are assigned via mesh physical group tags in the Gmsh/Triangle mesh file:

  • Wall (default, tag 1): Reflective boundary - normal velocity reflected, tangential preserved

  • Outflow (tag 2): Transmissive boundary - zero-gradient (free outflow)

  • Custom boundary tags can be defined in the JSON configuration

Parallel Computing Implementation

Domain Decomposition

Regular Mesh (MPI):

  • 2D Cartesian domain decomposition using MPI_Cart_create

  • The global domain is divided into rectangular subdomains

  • Each MPI process handles one subdomain

  • Ghost cells (halo regions) used for inter-process communication

Triangular Mesh (MPI):

  • Graph-based partitioning using METIS (if available) or naive block partitioning

  • Halo cells: cells across partition-boundary edges exchanged as packed buffers

  • Per-neighbor communication via MPI_Isend/MPI_Irecv

Ghost Cell Exchange

At each time step, ghost cells are exchanged between neighboring processes:

  1. Pack boundary data into send buffers

  2. Non-blocking MPI communication (Isend/Irecv)

  3. Wait for communication completion

  4. Unpack received data into ghost regions

This ensures each process has the necessary data from neighbors for flux computation.

OpenMP Parallelization

Within each MPI process, OpenMP is used for shared-memory parallelism:

  • Loop parallelization with #pragma omp parallel for

  • Collapse(2) for efficient nested loop parallelization (regular mesh)

  • Edge-parallel and cell-parallel loops (triangular mesh)

  • Static scheduling for predictable load balancing

  • Reduction operations for global CFL computation

CUDA GPU Acceleration

Regular Mesh Kernels:

  • 2D thread grid matching the domain (1 thread per cell)

  • Separate kernels for Courant condition, hydrodynamics, and ADE

  • Block-level reduction for global minimum (CFL time step)

Triangular Mesh Kernels (7 specialized kernels):

  1. kernel_tri_wetdry - wet/dry classification (1 thread/cell)

  2. kernel_tri_gradient - least-squares gradient per cell (1 thread/cell)

  3. kernel_tri_limiter - Barth-Jespersen limiter per cell (1 thread/cell)

  4. kernel_tri_edge_flux - rotated Roe flux at each edge (1 thread/edge) - main compute kernel

  5. kernel_tri_accumulate - flux summation into cells using atomicAdd (1 thread/edge)

  6. kernel_tri_update - state update + wet/dry clamping (1 thread/cell)

  7. kernel_tri_courant - CFL reduction with block-level parallel minimum (1 thread/cell)

No race conditions: edge flux kernel writes only to its own edge, and atomicAdd ensures safe concurrent accumulation.

References

Costa, D., Shook, K., Spence, C., Elliott, J., Baulch, H., Wilson, H., & Pomeroy, J. W. (2020). Predicting variable contributing areas, hydrological connectivity, and solute transport pathways for a Canadian Prairie basin. Water Resources Research, 56, e2020WR027984. https://doi.org/10.1029/2020WR027984

Costa, D., Burlando, P., and Liong, S.-Y. (2016). Coupling spatially distributed river and groundwater transport models to investigate contaminant dynamics at river corridor scales. Environmental Modelling & Software, 86:91-110, https://doi.org/10.1016/j.envsoft.2016.09.009

Roe, P.L. (1981). Approximate Riemann solvers, parameter vectors, and difference schemes. Journal of Computational Physics, 43:357-372.

Van Leer, B. (1979). Towards the ultimate conservative difference scheme. V. A second-order sequel to Godunov’s method. Journal of Computational Physics, 32:101-136.