GPU Acceleration in COMSOL Multiphysics®


The latest versions of COMSOL Multiphysics® introduce new capabilities for accelerating simulations using NVIDIA® graphics processing units (GPUs). These enhancements broaden the range of models that can benefit from GPU hardware. They include direct sparse solvers that apply to any single-physics or multiphysics application, support for time-explicit pressure acoustics simulations, and deep-neural-network (DNN) surrogate model training. In version 6.4, GPU support for direct solvers is fully integrated into the standard solver framework, so users can take advantage of GPU acceleration for existing models without changing the underlying physics settings.

GPU Acceleration for Direct Sparse Solvers

One of the most time-consuming stages in many finite element simulations is the repeated solution of large sparse linear systems. Such systems arise from implicit time stepping, nonlinear iterations, eigenfrequency analysis, and parameter sweeps. To accelerate these types of studies, COMSOL Multiphysics® version 6.4 includes the NVIDIA CUDA® direct sparse solver (cuDSS). This solver performs matrix factorizations with one or more GPUs on a single computer, taking advantage of the high memory bandwidth and massive parallelism provided by recent GPU hardware.
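To make the cost of a direct solve concrete, here is a minimal sketch of the factorize-then-solve pattern that direct solvers follow. It is a small dense, CPU-only illustration in plain Python, not cuDSS or any COMSOL API; the function names `lu_factor` and `lu_solve` and the 3-by-3 example matrix are assumptions chosen for illustration. The factorization is the expensive step, and once it exists, each additional right-hand side needs only two cheap triangular solves.

```python
def lu_factor(a):
    """Factor a square matrix as P*A = L*U with partial pivoting.

    Returns (lu, perm): lu stores L below the diagonal (unit diagonal implied)
    and U on and above it; perm[i] is the original row now in position i.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest pivot in column k for numerical stability.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A*x = b using a factorization returned by lu_factor."""
    n = len(lu)
    # Forward substitution: L*y = P*b.
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    # Back substitution: U*x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lu[i][j] * x[j] for j in range(i + 1, n))) / lu[i][i]
    return x


# Hypothetical example: a tiny 1D stiffness-like matrix and several load cases.
stiffness = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
loads = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

lu, perm = lu_factor(stiffness)                          # done once
solutions = [lu_solve(lu, perm, b) for b in loads]       # reused per load case
```

In a real sparse direct solver the factorization also involves reordering and symbolic analysis, which this sketch omits.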

The COMSOL Multiphysics UI showing the Model Builder with the Direct node highlighted, the corresponding Settings window, and a wheel rim model in the Graphics window.
GPU acceleration with NVIDIA cuDSS also benefits conventional structural finite element analyses on standard workstation hardware. In this wheel rim example, the effective stress is visualized, and the GPU-based solve on an NVIDIA RTX™ 5000 Ada Generation workstation GPU achieved a 2× speedup compared with a CPU-based solve on an Intel® W5-2465X processor.

Performance improvements vary between applications, but significant reductions in wall-clock time have been observed for models with several million degrees of freedom (DOFs). For example, for a thermoviscous acoustics benchmark simulation involving a multiphysics analysis of the acoustic transmission through a perforated plate, solving on multiple NVIDIA® H100 GPUs resulted in notably shorter runtimes compared to a dual-processor CPU system. Standard structural mechanics models also show clear improvements when offloading the direct solver phase to workstation-class GPUs such as the RTX 5000 Ada.

The cuDSS implementation supports both double-precision and single-precision arithmetic. Because single precision halves the storage needed for each floating-point value, it reduces memory usage and can increase performance on cards where the computation is memory bound, including lower-cost GPUs. Whether a particular model is well suited to single precision depends on its numerical conditioning, which is influenced by mesh quality, material parameters, and the underlying physics. Users can test the precision modes directly within the solver settings and select the mode that provides both stable results and the desired performance.
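The link between conditioning and precision can be illustrated with a small experiment. The sketch below is an illustration only and is unrelated to the cuDSS implementation: it solves a hypothetical 2-by-2 system whose exact solution is (1, 1) and whose conditioning worsens as the parameter `eps` shrinks. Single precision is emulated by rounding every intermediate result through a 4-byte float with the standard `struct` module; the helper names `f32`, `solve_2x2`, and `max_error` are assumptions.

```python
import struct


def f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_2x2(eps, rnd):
    """Solve [[1, 1], [1, 1 + eps]] x = [2, 2 + eps] by elimination.

    rnd is applied to every stored value and every arithmetic result, so
    rnd=f32 emulates single precision and rnd=float keeps double precision.
    """
    a22 = rnd(1.0 + eps)
    b1, b2 = rnd(2.0), rnd(2.0 + eps)
    # Subtract row 1 from row 2 (the multiplier is exactly 1), then back-substitute.
    x2 = rnd(rnd(b2 - b1) / rnd(a22 - 1.0))
    x1 = rnd(b1 - x2)
    return x1, x2


def max_error(eps, rnd):
    """Largest deviation from the exact solution (1, 1)."""
    x1, x2 = solve_2x2(eps, rnd)
    return max(abs(x1 - 1.0), abs(x2 - 1.0))


for eps in (1e-1, 1e-3):
    print(f"eps={eps:g}  single: {max_error(eps, f32):.3e}  double: {max_error(eps, float):.3e}")
```

For these inputs, the single-precision error is many orders of magnitude larger than the double-precision error. The actual error for a given `eps` depends on how the inputs happen to round, so results from a toy system like this indicate a trend rather than a bound for a real model.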

A perforated plate model showing the acoustic particle velocity and a graph showing the computational speedup for three different model sizes.
Acoustic transfer-impedance multiphysics model of a perforated plate used in mufflers and acoustic liners, solved with cuDSS on four NVIDIA® H100 GPUs. The image shows the acoustic particle velocity. Benchmarking at three model sizes (0.9–2.4 million DOFs) shows nearly a 5× speedup over a CPU-based direct solver on a dual Intel® Xeon® Platinum 8260 system.

GPU-Accelerated Time-Explicit Pressure Acoustics

NVIDIA® GPU support is also available for time-explicit pressure acoustics simulations. Explicit time-stepping methods avoid the need to solve large linear systems at each time step; they rely instead on repeated vector operations and local element updates. These operations are highly parallelizable and map efficiently onto GPU hardware.
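The structure of an explicit update can be sketched in a few lines. The example below uses a second-order finite-difference (leapfrog) scheme for the 1D wave equation as a simple stand-in; it is not the formulation used by the Pressure Acoustics, Time Explicit interface, and the names `step_wave_1d` and `simulate` and all parameter values are assumptions. The point is that each new nodal value depends only on old values at that node and its neighbors, so no linear system is assembled or solved and all nodes can be updated independently.

```python
import math


def step_wave_1d(p_prev, p_curr, courant):
    """Advance the 1D wave equation by one explicit (leapfrog) step.

    courant = c * dt / h. The end nodes are held at p = 0.
    """
    n = len(p_curr)
    c2 = courant * courant
    p_next = [0.0] * n
    for i in range(1, n - 1):
        # Local update: only node i and its two neighbors are read.
        laplacian = p_curr[i + 1] - 2.0 * p_curr[i] + p_curr[i - 1]
        p_next[i] = 2.0 * p_curr[i] - p_prev[i] + c2 * laplacian
    return p_next


def simulate(n_nodes=201, n_steps=200, courant=0.9):
    """Propagate a Gaussian pressure pulse that starts at rest in the middle of the domain."""
    center = n_nodes // 2
    pulse = [math.exp(-((i - center) / 8.0) ** 2) for i in range(n_nodes)]
    pulse[0] = pulse[-1] = 0.0
    p_prev, p_curr = pulse[:], pulse[:]
    for _ in range(n_steps):
        p_prev, p_curr = p_curr, step_wave_1d(p_prev, p_curr, courant)
    return p_curr


final_pressure = simulate()
```

On a GPU, the loop over `i` is the part that would be distributed across threads, since the iterations do not depend on one another.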

This capability is particularly relevant for wideband acoustics simulations and large 3D domains, where fine spatial resolution leads to a large number of time steps because the stable time step of an explicit method shrinks with the element size. For example, room acoustics models, such as office spaces or concert halls, may require tens of thousands of time steps to resolve wave propagation accurately. Offloading these operations to GPUs can shorten overall simulation time substantially.
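A rough estimate shows how quickly the step count grows. The sketch below assumes a generic stability limit of the form dt = CFL · h / c and a mesh size set by a fixed number of points per wavelength at the highest frequency of interest. The function name `explicit_step_count` and the default values (speed of sound 343 m/s, 6 points per wavelength, CFL number 0.5) are illustrative assumptions, not settings taken from the software.

```python
import math


def explicit_step_count(f_max_hz, duration_s, c=343.0,
                        points_per_wavelength=6.0, cfl=0.5):
    """Estimate mesh size, time step, and step count for an explicit wave simulation.

    Returns (h, dt, steps) with h in meters and dt in seconds.
    """
    wavelength = c / f_max_hz                 # shortest wavelength to resolve
    h = wavelength / points_per_wavelength    # mesh size needed to resolve it
    dt = cfl * h / c                          # stable time step for that mesh size
    steps = math.ceil(duration_s / dt)
    return h, dt, steps


# Hypothetical room acoustics case: resolve up to 1 kHz over 2 s of simulated time.
h, dt, steps = explicit_step_count(f_max_hz=1000.0, duration_s=2.0)
print(f"h = {h:.4f} m, dt = {dt:.2e} s, steps = {steps}")
```

Under these assumptions the step count is proportional to both the maximum frequency and the simulated duration, which is consistent with room-scale models needing tens of thousands of steps.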

The GPU-accelerated formulation for explicit acoustics supports both single-GPU and multi-GPU systems, on a single computer as well as on cluster nodes. This makes it possible to simulate domains with hundreds of millions of DOFs. For example, in a wave-based model of a chamber music hall, a simulation involving approximately 300 million DOFs was completed in a few hours on a single data-center-grade NVIDIA® H100 GPU, compared with several hours on multiple CPU nodes. Similar reductions in runtime can be observed in automotive acoustics examples and other large-scale transient analyses.

Note: The Pressure Acoustics, Time Explicit interface is supported for all license types when using a single GPU but requires a floating network license when using multiple GPUs.

Propagation of an initial pulse (centered at 500 Hz) in a model of a chamber music hall with 300 million DOFs, solved on a data-center-grade NVIDIA® H100 GPU.

GPU Support for Surrogate Model Training

COMSOL Multiphysics® also provides tools for generating DNN surrogate models that approximate high-fidelity numerical simulations. Training these networks requires repeated evaluation of large datasets and many optimization cycles, which are well suited to GPU acceleration. By performing the training process on an NVIDIA® GPU, users can reduce the time required to explore network architectures or adjust hyperparameters.
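The training loop behind such a surrogate can be sketched as follows. This is a minimal plain-Python illustration, not the Surrogate Model functionality of the software: `expensive_model` is a hypothetical stand-in for a simulation output, the network has a single hidden layer, and all names and hyperparameters are assumptions. Every epoch evaluates the network on the whole dataset and accumulates gradients sample by sample; those per-sample evaluations are independent of one another, which is why this workload parallelizes well.

```python
import math
import random


def expensive_model(x):
    """Hypothetical stand-in for one output of a high-fidelity simulation."""
    return math.sin(3.0 * x)


def init_network(hidden, rng):
    """One hidden tanh layer with a scalar input and a scalar output."""
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "b2": 0.0,
    }


def predict(net, x):
    """Return the network output and the hidden activations for input x."""
    h = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    return sum(w * hi for w, hi in zip(net["w2"], h)) + net["b2"], h


def mean_squared_error(net, data):
    return sum((predict(net, x)[0] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs, lr):
    """Full-batch gradient descent on the mean squared error."""
    n, hidden = len(data), len(net["w1"])
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in data:  # independent per-sample work
            y_hat, h = predict(net, x)
            err = 2.0 * (y_hat - y) / n  # derivative of the loss w.r.t. y_hat
            g_b2 += err
            for j in range(hidden):
                g_w2[j] += err * h[j]
                dz = err * net["w2"][j] * (1.0 - h[j] ** 2)  # backprop through tanh
                g_w1[j] += dz * x
                g_b1[j] += dz
        for j in range(hidden):
            net["w1"][j] -= lr * g_w1[j]
            net["b1"][j] -= lr * g_b1[j]
            net["w2"][j] -= lr * g_w2[j]
        net["b2"] -= lr * g_b2
    return net


rng = random.Random(0)
samples = [-1.0 + 0.1 * i for i in range(21)]
data = [(x, expensive_model(x)) for x in samples]
net = init_network(hidden=8, rng=rng)
loss_before = mean_squared_error(net, data)
train(net, data, epochs=300, lr=0.02)
loss_after = mean_squared_error(net, data)
print(f"MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Exploring a different architecture or learning rate means rerunning this entire loop, so the time per training run directly limits how many configurations can be tried.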

Larger networks, which may be required for capturing complex multiphysics behavior or spatial model reconstruction, also benefit from the increased memory bandwidth and parallel computation capability of GPUs. GPU support for DNN training is enabled directly in the Surrogate Model interface and functions without add-on products.

The UI of an opened Thermal Microactuator Surrogate Model app, with various Input and Results sections, and a 3D plot in the Graphics window.
A simulation app of a MEMS thermal actuator powered by a DNN surrogate model enables extremely fast model evaluation of quantities such as temperature, displacement, voltage, and stress. The surrogate model was trained using GPU acceleration on a standard workstation.

Further Reading

To learn more about GPU acceleration in COMSOL Multiphysics®, see: