Building Requirements

This section describes the software dependencies and build requirements for FLUXOS.

Required Dependencies

Dependency     Minimum Version   Description
-------------  ----------------  --------------------------------------------------------------
C++ Compiler   C++11             GCC 7+, Clang 8+, or Intel C++ 18+ recommended
CMake          3.10              Build system generator
Armadillo      9.9               C++ linear algebra library
OpenMP         4.5               Shared-memory parallelization (usually bundled with compiler)
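The minimum versions above can be checked from the command line before configuring. The following is a minimal sketch, not part of FLUXOS: version_ge and check_min are hypothetical helper names, and the comparison assumes a sort that supports -V (version sort), as GNU coreutils does. Note that gcc -dumpversion may print only the major version, and on macOS gcc is often an alias for Clang.

```shell
# Hypothetical helpers for checking the minimum versions listed above.

# Succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reports the status of one tool.
# $1 = label, $2 = minimum version, $3 = detected version (may be empty)
check_min() {
    if [ -z "$3" ]; then
        echo "$1: not found (need >= $2)"
        return 1
    elif version_ge "$3" "$2"; then
        echo "$1: $3 (ok, need >= $2)"
    else
        echo "$1: $3 is too old (need >= $2)"
        return 1
    fi
}

# Detect versions; a missing tool simply leaves the variable empty.
cmake_ver=$(cmake --version 2>/dev/null | head -n 1 | awk '{print $3}') || true
gcc_ver=$(gcc -dumpversion 2>/dev/null) || true

check_min "CMake" 3.10 "$cmake_ver" || true
check_min "GCC"   7    "$gcc_ver"   || true
```

Armadillo and OpenMP are libraries rather than commands, so they are easier to verify through the CMake configure output than through a script like this one.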

Optional Dependencies

Dependency     Version   Description
-------------  --------  ---------------------------------------------------------------------
MPI            3.0+      Required for distributed computing (OpenMPI, MPICH, or Intel MPI)
CUDA Toolkit   11.0+     Required for GPU acceleration (NVIDIA GPUs, Compute Capability 6.0+)
METIS          5.0+      Optional graph partitioning for triangular mesh MPI decomposition
HDF5           1.10+     Optional parallel I/O support
LAPACK/BLAS    Any       High-performance linear algebra (Armadillo backend)

Build Modes

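FLUXOS supports several build configurations. The commands below assume an out-of-source build directory inside the repository (for example, created with mkdir -p build && cd build), because cmake .. expects the source tree one level up and the later examples run build/bin/fluxos: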

Standard Build (OpenMP only)

cmake -DMODE_release=ON ..
make

MPI+OpenMP Hybrid Build

cmake -DMODE_release=ON -DUSE_MPI=ON ..
make

CUDA GPU Build

cmake -DMODE_release=ON -DUSE_CUDA=ON ..
make

Triangular Mesh Build

cmake -DMODE_release=ON -DUSE_TRIMESH=ON ..
make

Full-Feature Build (Triangular Mesh + GPU + MPI)

cmake -DMODE_release=ON -DUSE_TRIMESH=ON -DUSE_CUDA=ON -DUSE_MPI=ON ..
make

Debug Build

cmake -DMODE_debug=ON ..
make

CMake Options

Option         Default   Description
-------------  --------  --------------------------------------------------------------------------
MODE_release   OFF       Enable release build with optimizations (-O3)
MODE_debug     OFF       Enable debug build with symbols (-g)
USE_MPI        OFF       Enable MPI for distributed computing
USE_CUDA       OFF       Enable CUDA GPU acceleration (requires NVIDIA GPU and CUDA Toolkit 11.0+)
USE_TRIMESH    OFF       Enable unstructured triangular mesh support (Gmsh/Triangle mesh formats)

Note

USE_CUDA and USE_TRIMESH can be combined. When both are enabled, the triangular mesh solver uses GPU acceleration with 7 specialized CUDA kernels.
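Since every option defaults to OFF, each build is a combination of -D switches. As an illustration only, the sketch below maps short feature names to the options documented above; fluxos_cmake_flags and the feature keywords (mpi, cuda, trimesh) are hypothetical and not part of FLUXOS. It prints the configure command rather than running it.

```shell
# Hypothetical helper: build the -D option string from a mode and feature names.
# Usage: fluxos_cmake_flags <release|debug> [mpi] [cuda] [trimesh]
fluxos_cmake_flags() {
    case "$1" in
        release|debug) flags="-DMODE_$1=ON" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    shift
    for feature in "$@"; do
        case "$feature" in
            mpi)     flags="$flags -DUSE_MPI=ON" ;;
            cuda)    flags="$flags -DUSE_CUDA=ON" ;;
            trimesh) flags="$flags -DUSE_TRIMESH=ON" ;;
            *) echo "unknown feature: $feature" >&2; return 1 ;;
        esac
    done
    echo "$flags"
}

# Print the configure command for a GPU-accelerated triangular mesh build.
echo "cmake $(fluxos_cmake_flags release cuda trimesh) .."
```

Printing the command first makes it easy to review the option set before running it from the build directory.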

Compiler Optimization Flags

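The release build enables the following optimization flags. Because -march=native and -mtune=native target the CPU of the build machine, the resulting binary may not run on older or different processors: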

  • -O3: High-level optimization

  • -march=native: Optimize for the current CPU architecture

  • -mtune=native: Tune for the current CPU

  • -funroll-loops: Unroll loops for better performance

  • -ftree-vectorize: Enable automatic vectorization

  • -fno-math-errno: Do not set errno after math library calls (faster)

  • -flto: Link-time optimization

Platform-Specific Notes

Linux

Most HPC clusters run Linux. Ensure you load the appropriate modules before building:

module load gcc/11.2.0
module load cmake/3.20
module load armadillo/11.0
module load openmpi/4.1.1  # if using MPI

macOS

On macOS, install dependencies via Homebrew:

brew install cmake armadillo libomp open-mpi

For Apple Silicon (M1/M2), ensure you’re using native ARM builds of dependencies.

Windows

Windows builds are supported via:

  • Visual Studio 2019+ with C++11 support

  • MSYS2/MinGW-w64 with GCC

  • Windows Subsystem for Linux (WSL) - recommended

Running the Example

FLUXOS includes a test case in the Working_example/ directory (Rosa Creek watershed, 859 x 618 cells at 2 m resolution). The repository’s bin/ directory is reserved for compiled binaries.

Regular Mesh:

mkdir -p Results
./build/bin/fluxos Working_example/modset.json

Triangular Mesh:

First, generate the mesh and the modset JSON configuration from the DEM: set mesh_type = "triangular" in the _config dict of the template script, then run it:

cd supporting_scripts/1_Model_Config
python model_config_template.py

Then run FLUXOS with the triangular mesh config produced by the template:

cd ../..  # back to the repository root
mkdir -p Results
./build/bin/fluxos Working_example/modset_trimesh.json

Visualizing Results in Google Earth

Export simulation results as KMZ files for animated visualization:

# Regular mesh results
python supporting_scripts/2_Read_Outputs/output_supporting_lib/fluxos_viewer.py \
    --results-dir Results --dem Working_example/Rosa_2m.asc --utm-zone 10

# Triangular mesh results
python supporting_scripts/2_Read_Outputs/output_supporting_lib/fluxos_viewer.py \
    --results-dir Results --dem Working_example/Rosa_2m.asc \
    --mesh-type triangular --utm-zone 10

# Open in Google Earth
open fluxos_regular.kmz    # macOS
xdg-open fluxos_regular.kmz  # Linux

Use the time slider in Google Earth to animate through simulation timesteps.

Benchmark Results

Measured on the Rosa Creek example (859 x 618 grid, 5 h simulation, output every 1 h) on an Apple M-series machine:

Configuration      Mesh Type                 Wall Time   Output Size
-----------------  ------------------------  ----------  -------------------------
OpenMP (release)   Regular (530K cells)      4.73 s      210 MB (6 x 35 MB .txt)
OpenMP (release)   Triangular (4528 cells)   0.85 s      9.2 MB (6 x 1.5 MB .vtu)
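The two rows are not a like-for-like comparison, because the meshes differ greatly in cell count. The short calculation below, using only the figures reported above, makes the ratios explicit; it is a worked arithmetic example, not an additional measurement.

```shell
# Figures copied from the benchmark results above.
nx=859; ny=618        # regular grid dimensions
tri_cells=4528        # triangular mesh cell count
t_regular=4.73        # wall time in seconds, regular mesh
t_tri=0.85            # wall time in seconds, triangular mesh

# Total regular cells: 859 * 618 = 530862, reported as 530K.
regular_cells=$((nx * ny))
echo "regular cells: $regular_cells"

# Shell arithmetic is integer-only, so awk handles the floating-point ratios.
awk -v rc="$regular_cells" -v tc="$tri_cells" -v tr="$t_regular" -v tt="$t_tri" 'BEGIN {
    printf "cell ratio: %.0f\n", rc / tc
    printf "wall-time ratio: %.2f\n", tr / tt
}'
```

The regular mesh has roughly 117 times as many cells as the triangular mesh, while its wall time is about 5.6 times longer.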

Note

CUDA acceleration requires an NVIDIA GPU (not available on macOS ARM). MPI domain decomposition is available for distributed computing on HPC clusters.

Verifying the Build

After building, verify the executable:

# Check executable exists
ls -la build/bin/fluxos

# Run with the example case
./build/bin/fluxos Working_example/modset.json
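The two checks can be combined so that a missing binary or configuration file produces a readable message before the run starts. This is a minimal sketch: check_fluxos_inputs is a hypothetical helper, and its default paths are simply the ones used in this section.

```shell
# Hypothetical helper: confirm the executable and configuration file exist.
# $1 = path to the executable (default: build/bin/fluxos)
# $2 = path to the configuration file (default: Working_example/modset.json)
check_fluxos_inputs() {
    exe=${1:-build/bin/fluxos}
    config=${2:-Working_example/modset.json}
    if [ ! -x "$exe" ]; then
        echo "executable not found or not executable: $exe" >&2
        return 1
    fi
    if [ ! -f "$config" ]; then
        echo "configuration file not found: $config" >&2
        return 1
    fi
    echo "ready: $exe $config"
}

# Typical use from the repository root:
#   check_fluxos_inputs && ./build/bin/fluxos Working_example/modset.json
```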

Troubleshooting

Armadillo not found:

Ensure Armadillo is installed and its include/library paths are accessible:

# Check Armadillo installation
find /usr -name "armadillo" 2>/dev/null

# Set paths if needed
cmake -DARMADILLO_INCLUDE_DIR=/path/to/include \
      -DARMADILLO_LIBRARY=/path/to/libarmadillo.so ..

OpenMP not found:

GCC usually ships with OpenMP support. Apple Clang on macOS does not bundle an OpenMP runtime, so install libomp and point CMake at it:

brew install libomp
export OpenMP_ROOT=$(brew --prefix)/opt/libomp

MPI not found:

Ensure the MPI compiler wrappers are on your PATH:

which mpicc mpicxx

# If using environment modules
module load openmpi

Link-time optimization (LTO) errors:

If LTO causes compile or link errors, configure with explicit compiler flags that omit -flto:

cmake -DMODE_release=ON -DCMAKE_CXX_FLAGS="-O3 -march=native" ..

CUDA not found:

Ensure CUDA Toolkit is installed and nvcc is in your PATH:

# Check CUDA installation
nvcc --version
nvidia-smi

# Set CUDA path if needed
export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH

CUDA compute capability mismatch:

If you get architecture-related errors, specify your GPU’s compute capability:

cmake -DMODE_release=ON -DUSE_CUDA=ON -DCUDA_ARCH=75 ..  # For RTX 2080
cmake -DMODE_release=ON -DUSE_CUDA=ON -DCUDA_ARCH=86 ..  # For RTX 3090
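The CUDA_ARCH value is the compute capability with the dot removed (7.5 becomes 75). The sketch below derives it from the installed GPU where possible. It assumes a driver recent enough for nvidia-smi to support the compute_cap query field; cap_to_arch is a hypothetical helper, and the script only prints the suggested configure command.

```shell
# Hypothetical helper: convert a compute capability such as "7.5"
# into the two-digit form passed to -DCUDA_ARCH.
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

# Query the first GPU if nvidia-smi is available.
# The compute_cap field is an assumption about the installed driver version.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1) || true
    if [ -n "$cap" ]; then
        echo "cmake -DMODE_release=ON -DUSE_CUDA=ON -DCUDA_ARCH=$(cap_to_arch "$cap") .."
    else
        echo "could not query the compute capability; check the GPU model manually"
    fi
else
    echo "nvidia-smi not found; look up the compute capability for your GPU model"
fi
```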