Opm::cuistl Namespace Reference

Namespaces

namespace  detail
 

Classes

class  CuBlockPreconditioner
 An adaptation of Dune::BlockPreconditioner that works within the CuISTL framework. More...
 
class  CuDILU
 DILU preconditioner on the GPU. More...
 
class  CuJac
 Jacobi preconditioner on the GPU. More...
 
class  CuOwnerOverlapCopy
 CUDA-compatible variant of Dune::OwnerOverlapCopyCommunication. More...
 
class  CuSeqILU0
 Sequential ILU0 preconditioner on the GPU through the CuSparse library. More...
 
class  CuSparseMatrix
 The CuSparseMatrix class is a simple wrapper class for a CuSparse matrix. More...
 
class  GPUAwareMPISender
 Derived class of GPUSender that handles MPI communication with CUDA-aware MPI. The copyOwnerToAll function uses MPI calls referring to data that resides on the GPU in order to send it directly to other GPUs, skipping the staging step on the CPU (see the sketch after the class list). More...
 
class  GPUObliviousMPISender
 Derived class of GPUSender that handles MPI calls that should NOT use GPU-direct communication. The implementation moves the data from the GPU to the CPU and then sends it using regular MPI. More...
 
class  GPUSender
 GPUSender is a wrapper class for classes that implement copyOwnerToAll. It exists so that communicators can be created against a generic GPUSender, hiding whether the underlying implementation uses GPU-aware MPI or not. More...
 
class  PreconditionerAdapter
 Makes a CUDA preconditioner available to a CPU simulator. More...
 
class  PreconditionerConvertFieldTypeAdapter
 Converts the field type (e.g. double to float) to benchmark single-precision preconditioners. More...
 
class  PreconditionerHolder
 Common interface for adapters that hold preconditioners. More...
 
class  SolverAdapter
 Wraps a CUDA solver to work with CPU data. More...
 

Functions

void setZeroAtIndexSet (const CuVector< int > &indexSet)
 The CuVector class is a simple (arithmetic) vector class for the GPU. More...
 
std::string toDebugString ()
 
void setDevice (int mpiRank, int numberOfMpiRanks)
 Sets the correct CUDA device for the given MPI rank. More...
 

Function Documentation

◆ setDevice()

void Opm::cuistl::setDevice (int mpiRank, int numberOfMpiRanks)

Sets the correct CUDA device for the given MPI rank.

Note
This assumes that every node has equally many GPUs, all of the same caliber.
This probably needs to be called before MPI_Init if one uses GPUDirect transfers (see e.g. https://devtalk.nvidia.com/default/topic/752046/teaching-and-curriculum-support/multi-gpu-system-running-mpi-cuda-/ ).
If no CUDA device is present, this does nothing.
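
The sketch below is a minimal, hypothetical usage example and is not taken from the OPM sources. It assumes the common case where the rank is obtained after MPI_Init; as per the note above, with GPUDirect transfers the device selection may instead have to happen before MPI_Init. The header location opm/simulators/linalg/cuistl/set_device.hpp is an assumption.

#include <mpi.h>
#include <opm/simulators/linalg/cuistl/set_device.hpp> // assumed header location

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0;
    int worldSize = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &worldSize);

    // Bind this rank to a CUDA device; a no-op if no CUDA device is present.
    Opm::cuistl::setDevice(rank, worldSize);

    // ... set up and run the GPU-accelerated linear solver here ...

    MPI_Finalize();
    return 0;
}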

◆ setZeroAtIndexSet()

void Opm::cuistl::setZeroAtIndexSet (const CuVector< int >& indexSet)

The CuVector class is a simple (arithmetic) vector class for the GPU.

Note
We currently only support simple raw primitives for T (double, float and int).
We currently only support arithmetic operations on double and float.
This vector has no notion of block size. The user is responsible for allocating the correct number of primitives (doubles or floats).

Example usage:

void someFunction() {
    auto someDataOnCPU = std::vector<double>({1.0, 2.0, 42.0, 59.9451743, 10.7132692});
    auto dataOnGPU = CuVector<double>(someDataOnCPU);
    // Multiply by 4.0:
    dataOnGPU *= 4.0;
    // Get data back on CPU in another vector:
    auto stdVectorOnCPU = dataOnGPU.asStdVector();
}
Template parameter T: the type to store. Can be either float, double or int.

Class synopsis:
template <typename T>
class CuVector
{
public:
    using field_type = T;
    using size_type = size_t;

    CuVector(const CuVector<T>& other);
    explicit CuVector(const std::vector<T>& data);
    CuVector& operator=(const CuVector<T>& other);
    CuVector& operator=(T scalar);
    explicit CuVector(const size_t numberOfElements);
    CuVector(const T* dataOnHost, const size_t numberOfElements);
    virtual ~CuVector();

    T* data();
    const T* data() const;

    template <int BlockDimension>
    void copyFromHost(const Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector)
    {
        // TODO: [perf] vector.dim() can be replaced by bvector.N() * BlockDimension
        if (detail::to_size_t(m_numberOfElements) != bvector.dim()) {
            OPM_THROW(std::runtime_error,
                      fmt::format("Given incompatible vector size. CuVector has size {},\n"
                                  "however, BlockVector has N() = {}, and dim() = {}.",
                                  m_numberOfElements,
                                  bvector.N(),
                                  bvector.dim()));
        }
        const auto dataPointer = static_cast<const T*>(&(bvector[0][0]));
        copyFromHost(dataPointer, m_numberOfElements);
    }

    template <int BlockDimension>
    void copyToHost(Dune::BlockVector<Dune::FieldVector<T, BlockDimension>>& bvector) const
    {
        // TODO: [perf] vector.dim() can be replaced by bvector.N() * BlockDimension
        if (detail::to_size_t(m_numberOfElements) != bvector.dim()) {
            OPM_THROW(std::runtime_error,
                      fmt::format("Given incompatible vector size. CuVector has size {},\n"
                                  "however, the BlockVector has N() = {}, and dim() = {}.",
                                  m_numberOfElements,
                                  bvector.N(),
                                  bvector.dim()));
        }
        const auto dataPointer = static_cast<T*>(&(bvector[0][0]));
        copyToHost(dataPointer, m_numberOfElements);
    }

    void copyFromHost(const T* dataPointer, size_t numberOfElements);
    void copyToHost(T* dataPointer, size_t numberOfElements) const;
    void copyFromHost(const std::vector<T>& data);
    void copyToHost(std::vector<T>& data) const;

    void prepareSendBuf(CuVector<T>& buffer, const CuVector<int>& indexSet) const;
    void syncFromRecvBuf(CuVector<T>& buffer, const CuVector<int>& indexSet) const;

    CuVector<T>& operator*=(const T& scalar);
    CuVector<T>& axpy(T alpha, const CuVector<T>& y);
    CuVector<T>& operator+=(const CuVector<T>& other);
    CuVector<T>& operator-=(const CuVector<T>& other);

    T dot(const CuVector<T>& other) const;
    T two_norm() const;
    T dot(const CuVector<T>& other, const CuVector<int>& indexSet, CuVector<T>& buffer) const;
    T two_norm(const CuVector<int>& indexSet, CuVector<T>& buffer) const;
    T dot(const CuVector<T>& other, const CuVector<int>& indexSet) const;
    T two_norm(const CuVector<int>& indexSet) const;

    size_type dim() const;
    std::vector<T> asStdVector() const;

    template <int blockSize>
    Dune::BlockVector<Dune::FieldVector<T, blockSize>> asDuneBlockVector() const
    {
        OPM_ERROR_IF(dim() % blockSize != 0,
                     fmt::format("blockSize is not a multiple of dim(). Given blockSize = {}, and dim() = {}",
                                 blockSize,
                                 dim()));
        Dune::BlockVector<Dune::FieldVector<T, blockSize>> returnValue(dim() / blockSize);
        copyToHost(returnValue);
        return returnValue;
    }
};
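
As a complement to the synopsis above, here is a hedged sketch of moving a Dune::BlockVector to the GPU and back using the copyFromHost/copyToHost interface shown; the CuVector include path is an assumption and not taken from this page.

#include <dune/common/fvector.hh>
#include <dune/istl/bvector.hh>
#include <opm/simulators/linalg/cuistl/CuVector.hpp> // assumed header location

void roundTripExample()
{
    using BlockVector = Dune::BlockVector<Dune::FieldVector<double, 3>>;
    BlockVector x(10);
    x = 1.0; // fill every scalar entry with 1.0

    // The GPU vector is flat: allocate room for all scalar entries.
    Opm::cuistl::CuVector<double> xGPU(x.dim());
    xGPU.copyFromHost(x);

    // Scale on the GPU, then copy the result back into the Dune vector.
    xGPU *= 2.0;
    xGPU.copyToHost(x);
}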

◆ toDebugString()

std::string Opm::cuistl::toDebugString ( )

References Opm::to_string().