Opm::gpuistl Namespace Reference

A small, fixed-dimension MiniVector class backed by std::array that can be used in both host and CUDA device code.

Detailed Description

A small, fixed-dimension MiniVector class backed by std::array that can be used in both host and CUDA device code. The implementation purposefully remains lightweight, containing only the utilities required for element access, iteration, and initialization. It avoids dynamic memory and leverages the compile-time size.

Enumeration Type Documentation

◆ MatrixStorageMPScheme
Function Documentation

◆ amgxSafeCall()
Safe call wrapper for AMGX functions. Checks the return code from AMGX functions and throws an AmgxError if an error occurred.
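The pattern here is a thin check-and-throw wrapper around a C-style return code. The following host-only sketch illustrates it; plain `int` return codes, `std::runtime_error`, and the names `errorMessage`/`safeCall` are stand-ins for the real AMGX_RC codes, AmgxError exception, and amgxSafeCall/getAmgxErrorMessage, which are not assumed here:

```cpp
#include <stdexcept>
#include <string>

// Stand-in for getAmgxErrorMessage(): maps a return code to a message.
inline std::string errorMessage(int rc)
{
    return rc == 0 ? std::string("success") : "error code " + std::to_string(rc);
}

// Stand-in for amgxSafeCall(): checks a return code and throws with context
// if the wrapped call failed, so call sites stay free of error plumbing.
inline void safeCall(int rc, const std::string& expression)
{
    if (rc != 0) {
        throw std::runtime_error(expression + " failed: " + errorMessage(rc));
    }
}
```

A call site then wraps each library call, e.g. `safeCall(someLibraryCall(), "someLibraryCall()")`, and the error propagates as a C++ exception instead of a silently ignored return code.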
References getAmgxErrorMessage().

◆ copyFromGPU() [1/3]
template<class T >
Copies a value from GPU-allocated memory to the host.
References copyFromGPU().

◆ copyFromGPU() [2/3]
template<class T , class Deleter >
Copies a value from GPU-allocated memory to the host.
References copyFromGPU().

◆ copyFromGPU() [3/3]
template<class T >
Copies a value from GPU-allocated memory to the host.
References Opm::gpuistl::detail::isGPUPointer(), and OPM_GPU_SAFE_CALL. Referenced by copyFromGPU().

◆ copyToGPU() [1/3]
template<class T >
Copies a value from the host to GPU-allocated memory using a shared_ptr.
References copyToGPU().

◆ copyToGPU() [2/3]
template<class T , class Deleter >
Copies a value from the host to GPU-allocated memory using a unique_ptr.
References copyToGPU().

◆ copyToGPU() [3/3]
template<class T >
Copies a value from the host to GPU-allocated memory.
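As the cross-references show, the smart-pointer overloads funnel into the raw-pointer version, which performs the actual copy. A host-only sketch of that funneling pattern follows; `std::memcpy` and the name `copyToDeviceSketch` stand in for the `cudaMemcpy` call that the real helper wraps in OPM_GPU_SAFE_CALL after checking detail::isGPUPointer():

```cpp
#include <cstring>
#include <memory>

// Raw-pointer version: does the actual copy. The real helper would instead
// issue a guarded cudaMemcpy(devicePtr, &value, sizeof(T), cudaMemcpyHostToDevice).
template <class T>
void copyToDeviceSketch(const T& value, T* devicePtr)
{
    std::memcpy(devicePtr, &value, sizeof(T));
}

// Smart-pointer overload: forwards to the raw-pointer version, mirroring how
// the shared_ptr/unique_ptr overloads above reference copyToGPU().
template <class T>
void copyToDeviceSketch(const T& value, const std::shared_ptr<T>& ptr)
{
    copyToDeviceSketch(value, ptr.get());
}
```

Keeping the copy logic in one raw-pointer overload means the pointer-ownership flavors differ only in how they extract the raw pointer.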
References Opm::gpuistl::detail::isGPUPointer(), and OPM_GPU_SAFE_CALL. Referenced by copyToGPU().

◆ getAmgxErrorMessage()
Get a descriptive error message for an AMGX error code.
Referenced by amgxSafeCall().

◆ getHypreErrorMessage()
Get a descriptive error message for a Hypre error code.
Referenced by hypreSafeCall().

◆ hypreSafeCall()
Safe call wrapper for Hypre functions. Checks the return code from Hypre functions and throws a HypreError if an error occurred.
References getHypreErrorMessage().

◆ make_gpu_shared_ptr() [1/2]
template<typename T >
Creates a shared pointer managing GPU-allocated memory of the specified element type. This function allocates memory on the GPU for the type T.
References OPM_GPU_SAFE_CALL, and OPM_GPU_WARN_IF_ERROR.

◆ make_gpu_shared_ptr() [2/2]
template<typename T >
Creates a shared pointer managing GPU-allocated memory of the specified element type. This function allocates memory on the GPU for the type T.
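A shared_ptr that owns device memory must be given a custom deleter at construction so that the memory is released with the matching free call. A host-only sketch of that pattern follows; `std::malloc`/`std::free` and the names `make_shared_buffer`/the initial-value parameter stand in for the cudaMalloc/cudaFree the real helper wraps in OPM_GPU_SAFE_CALL and OPM_GPU_WARN_IF_ERROR:

```cpp
#include <cstdlib>
#include <memory>

// Allocate a single T and hand ownership to a shared_ptr whose deleter
// releases the memory with the matching deallocation call. For shared_ptr
// the deleter is a runtime argument, not part of the pointer's type.
template <typename T>
std::shared_ptr<T> make_shared_buffer(const T& initialValue)
{
    T* ptr = static_cast<T*>(std::malloc(sizeof(T))); // real code: cudaMalloc
    *ptr = initialValue; // the real helper copies the value to the device instead
    return std::shared_ptr<T>(ptr, [](T* p) { std::free(p); }); // real code: cudaFree
}
```

Because the deleter travels with the control block, any copy of the shared_ptr releases the buffer correctly when the last owner goes away.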
References OPM_GPU_SAFE_CALL.

◆ make_gpu_unique_ptr() [1/2]
template<typename T >
Creates a unique pointer managing GPU-allocated memory of the specified element type. This function allocates memory on the GPU for the type T.
References OPM_GPU_SAFE_CALL, and OPM_GPU_WARN_IF_ERROR.

◆ make_gpu_unique_ptr() [2/2]
template<typename T >
Creates a unique pointer managing GPU-allocated memory of the specified element type. This function allocates memory on the GPU for the type T.
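Unlike shared_ptr, a unique_ptr carries its deleter in the type itself, so the helper's return type names a custom deleter. A host-only sketch under the same stand-in assumptions follows; `std::malloc`/`std::free` and the names `BufferDeleter`/`make_unique_buffer` are illustrative, not the real API:

```cpp
#include <cstdlib>
#include <memory>

// Deleter type baked into the unique_ptr's type. The real deleter would call
// cudaFree (guarded by the GPU safe-call macros); free stands in here.
struct BufferDeleter {
    void operator()(void* ptr) const { std::free(ptr); }
};

template <typename T>
using unique_buffer = std::unique_ptr<T, BufferDeleter>;

// Allocate a single T; the real helper uses cudaMalloc instead of malloc.
template <typename T>
unique_buffer<T> make_unique_buffer()
{
    return unique_buffer<T>(static_cast<T*>(std::malloc(sizeof(T))));
}
```

A stateless deleter type like this adds no per-pointer storage, so `unique_buffer<T>` stays the size of a raw pointer.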
References OPM_GPU_SAFE_CALL.

◆ make_view() [1/2]
template<class T >
◆ make_view() [2/2]
template<class T , class Deleter >
◆ makeGpuOwnerOverlapCopy()
template<class field_type , int block_size, class OwnerOverlapCopyCommunicationType >
◆ makeMatrixStorageMPScheme()
◆ OPM_CREATE_GPU_RESOURCE() [1/3]
Manages a CUDA event resource. This resource encapsulates a cudaEvent_t handle and provides automatic creation and destruction of the CUDA event. Use this resource to measure elapsed time or synchronize GPU executions between different streams.

◆ OPM_CREATE_GPU_RESOURCE() [2/3]

Manages a CUDA graph resource. This resource encapsulates a cudaGraph_t handle and provides automatic creation and destruction of a CUDA graph. It represents a series of operations captured for efficient replay, execution, or modification.

◆ OPM_CREATE_GPU_RESOURCE() [3/3]

Manages a CUDA stream resource. This resource encapsulates a cudaStream_t handle and provides automatic creation and destruction of the CUDA stream. Use this resource to schedule and synchronize GPU kernels or other asynchronous operations.

◆ OPM_CREATE_GPU_RESOURCE_NO_CREATE()

Manages a CUDA graph execution resource. This resource encapsulates a cudaGraphExec_t handle and provides automatic destruction of the CUDA graph execution object. It represents the compiled and optimized version of a CUDA graph ready for efficient execution.

◆ printDevice() [1/2]
Referenced by Opm::Main::initialize_().

◆ printDevice() [2/2]
◆ setDevice() [1/2]
◆ setDevice() [2/2]
Sets the correct CUDA device when running with MPI.
◆ setZeroAtIndexSet()
The GpuVector class is a simple (arithmetic) vector class for the GPU.
Example usage:

#include <opm/simulators/linalg/gpuistl/GpuVector.hpp>
void someFunction() {
auto someDataOnCPU = std::vector<double>({1.0, 2.0, 42.0, 59.9451743, 10.7132692});
auto dataOnGPU = GpuVector<double>(someDataOnCPU);
// Multiply by 4.0:
dataOnGPU *= 4.0;
// Get data back on CPU in another vector:
auto stdVectorOnCPU = dataOnGPU.asStdVector();
}
@tparam T the type to store. Can be either float, double or int.
template <typename T>
class GpuVector
{
public:
using field_type = T;
using size_type = size_t;
GpuVector(const GpuVector<T>& other);
explicit GpuVector(const std::vector<T>& data);
GpuVector& operator=(const GpuVector<T>& other);
template<int BlockDimension>
explicit GpuVector(const Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector)
: GpuVector(bvector.dim())
{
copyFromHost(bvector);
}
GpuVector& operator=(T scalar);
GpuVector() : m_dataOnDevice(nullptr), m_numberOfElements(0), m_cuBlasHandle(detail::CuBlasHandle::getInstance()) {}
explicit GpuVector(const size_t numberOfElements);
GpuVector(const T* dataOnHost, const size_t numberOfElements);
virtual ~GpuVector();
T* data();
const T* data() const;
template <int BlockDimension>
void copyFromHost(const Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector)
{
// TODO: [perf] vector.dim() can be replaced by bvector.N() * BlockDimension
if (detail::to_size_t(m_numberOfElements) != bvector.dim()) {
OPM_THROW(std::runtime_error,
fmt::format("Given incompatible vector size. GpuVector has size {}, \n"
"however, the BlockVector has N() = {}, and dim() = {}.",
m_numberOfElements,
bvector.N(),
bvector.dim()));
}
const auto dataPointer = static_cast<const T*>(&(bvector[0][0]));
copyFromHost(dataPointer, m_numberOfElements);
}
template <int BlockDimension>
void copyFromHostAsync(const Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector, cudaStream_t stream = detail::DEFAULT_STREAM)
{
// TODO: [perf] vector.dim() can be replaced by bvector.N() * BlockDimension
if (detail::to_size_t(m_numberOfElements) != bvector.dim()) {
OPM_THROW(std::runtime_error,
fmt::format("Given incompatible vector size. GpuVector has size {}, \n"
"however, the BlockVector has N() = {}, and dim() = {}.",
m_numberOfElements,
bvector.N(),
bvector.dim()));
}
const auto dataPointer = static_cast<const T*>(&(bvector[0][0]));
copyFromHostAsync(dataPointer, m_numberOfElements, stream);
}
template <int BlockDimension>
void copyToHost(Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector) const
{
// TODO: [perf] vector.dim() can be replaced by bvector.N() * BlockDimension
if (detail::to_size_t(m_numberOfElements) != bvector.dim()) {
OPM_THROW(std::runtime_error,
fmt::format("Given incompatible vector size. GpuVector has size {},\n however, the BlockVector "
"has N() = {}, and dim() = {}.",
m_numberOfElements,
bvector.N(),
bvector.dim()));
}
const auto dataPointer = static_cast<T*>(&(bvector[0][0]));
copyToHost(dataPointer, m_numberOfElements);
}
template <int BlockDimension>
void copyToHostAsync(Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector, cudaStream_t stream = detail::DEFAULT_STREAM) const
{
// TODO: [perf] vector.dim() can be replaced by bvector.N() * BlockDimension
if (detail::to_size_t(m_numberOfElements) != bvector.dim()) {
OPM_THROW(std::runtime_error,
fmt::format("Given incompatible vector size. GpuVector has size {},\n however, the BlockVector "
"has N() = {}, and dim() = {}.",
m_numberOfElements,
bvector.N(),
bvector.dim()));
}
const auto dataPointer = static_cast<T*>(&(bvector[0][0]));
copyToHostAsync(dataPointer, m_numberOfElements, stream);
}
void copyFromHost(const T* dataPointer, size_t numberOfElements);
void copyFromHostAsync(const T* dataPointer, size_t numberOfElements, cudaStream_t stream = detail::DEFAULT_STREAM);
void copyToHost(T* dataPointer, size_t numberOfElements) const;
void copyToHostAsync(T* dataPointer, size_t numberOfElements, cudaStream_t stream = detail::DEFAULT_STREAM) const;
void copyFromHost(const std::vector<T>& data);
void copyToHost(std::vector<T>& data) const;
void copyFromDeviceToDevice(const GpuVector<T>& other) const;
GpuVector<T>& operator*=(const T& scalar);
GpuVector<T>& axpy(T alpha, const GpuVector<T>& y);
GpuVector<T>& operator+=(const GpuVector<T>& other);
GpuVector<T>& operator-=(const GpuVector<T>& other);
T dot(const GpuVector<T>& other) const;
T two_norm() const;
T dot(const GpuVector<T>& other, const GpuVector<int>& indexSet, GpuVector<T>& buffer) const;
T two_norm(const GpuVector<int>& indexSet, GpuVector<T>& buffer) const;
T dot(const GpuVector<T>& other, const GpuVector<int>& indexSet) const;
T two_norm(const GpuVector<int>& indexSet) const;
size_type dim() const;
void resize(size_t new_size);
std::vector<T> asStdVector() const;
template <int blockSize>
Dune::BlockVector<Dune::FieldVector<T, blockSize>> asDuneBlockVector() const
{
OPM_ERROR_IF(dim() % blockSize != 0,
fmt::format("dim() is not a multiple of blockSize. Given blockSize = {}, and dim() = {}",
blockSize,
dim()));
Dune::BlockVector<Dune::FieldVector<T, blockSize>> returnValue(dim() / blockSize);
copyToHost(returnValue);
return returnValue;
}
};

◆ toDebugString()
References Opm::to_string().

◆ writeMatrixMarket()
template<typename T >
Variable Documentation

◆ is_gpu_type_v
template<typename T >
Helper variable template for easier usage.
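The usual shape of such a helper is a class-template trait plus an inline constexpr variable template shorthand. The sketch below shows that pattern with an illustrative placeholder type; the real trait presumably specializes for the GPU containers in this namespace, not for `FakeGpuVector`:

```cpp
#include <type_traits>

// Primary template: by default a type is not a GPU type.
template <typename T>
struct is_gpu_type : std::false_type {};

// Illustrative placeholder standing in for a real GPU container type.
struct FakeGpuVector {};

// Specialization marking the GPU container as a GPU type.
template <>
struct is_gpu_type<FakeGpuVector> : std::true_type {};

// The helper variable template: write is_gpu_type_v<T>
// instead of the longer is_gpu_type<T>::value.
template <typename T>
inline constexpr bool is_gpu_type_v = is_gpu_type<T>::value;
```

The `_v` shorthand mirrors the C++17 standard-library convention (e.g. std::is_same_v) and is usable directly in `if constexpr` and static_assert conditions.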