Array distributed across many processes supporting remote-memory-access, access to process local buffer, and some linear algebra operations.

#include <DistrArray.h>

Detailed Description

Array distributed across many processes supporting remote-memory-access, access to process local buffer, and some linear algebra operations.

This class implements one-sided remote-memory-access (RMA) operations for getting or putting a copy of any section of the array, provides access to the array data local to the current process, and implements simple linear algebra operations. It also exposes synchronization of processes, and fencing to ensure RMA operations complete.

This class is designed for the following usage:

performing simple linear algebra on whole arrays
getting sections of the array, transforming them, and accumulating the result into a different array
initializing the array using put operations

All RMA operations are blocking: a put returns as soon as the data is in the network buffer (not necessarily in the array), and a get returns once the data has been copied into the supplied buffer. A synchronisation call ensures that all outstanding RMA operations have completed.
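To make the index convention of the RMA calls concrete, the following single-process sketch mimics get, put and acc on a half-open range [lo, hi), as the member documentation describes it. ToyArray is a hypothetical stand-in backed by std::vector, not part of the library; it performs no communication and only illustrates which elements each call touches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-process stand-in for a distributed array (assumption, not the library class).
struct ToyArray {
  std::vector<double> data;
  explicit ToyArray(std::size_t n) : data(n, 0.0) {}

  // buf[i - lo] = data[i] for lo <= i < hi (hi is past-the-end)
  void get(std::size_t lo, std::size_t hi, double* buf) const {
    assert(lo <= hi && hi <= data.size());
    for (std::size_t i = lo; i < hi; ++i) buf[i - lo] = data[i];
  }

  // data[i] = src[i - lo] for lo <= i < hi
  void put(std::size_t lo, std::size_t hi, const double* src) {
    assert(lo <= hi && hi <= data.size());
    for (std::size_t i = lo; i < hi; ++i) data[i] = src[i - lo];
  }

  // data[i] += src[i - lo] for lo <= i < hi
  void acc(std::size_t lo, std::size_t hi, const double* src) {
    assert(lo <= hi && hi <= data.size());
    for (std::size_t i = lo; i < hi; ++i) data[i] += src[i - lo];
  }
};
```

With this convention a range of bs elements starting at lo is written as (lo, lo + bs), and an empty range has lo == hi.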

The LocalBuffer nested class gives access to the section of the distributed array that exists on the current process. It is up to the specific implementation of DistrArray whether exclusive access to the buffer is granted.
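As an illustration of what "the section that exists on the current process" can mean, the sketch below computes a contiguous block range for each process. The function block_range and its even-split rule are assumptions made for illustration only; the Distribution used by a concrete implementation may assign sections differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (assumption): split `dimension` elements over `n_proc`
// processes as evenly as possible and return the half-open range [lo, hi)
// owned by `rank`. The first (dimension % n_proc) ranks get one extra element.
std::pair<std::size_t, std::size_t> block_range(std::size_t dimension, std::size_t n_proc,
                                                std::size_t rank) {
  assert(n_proc > 0 && rank < n_proc);
  const std::size_t base = dimension / n_proc;
  const std::size_t rem = dimension % n_proc;
  const std::size_t lo = rank * base + std::min(rank, rem);
  const std::size_t hi = lo + base + (rank < rem ? 1 : 0);
  return {lo, hi};
}
```

Under this assumed scheme the ranges of consecutive ranks tile the whole index space without gaps or overlap, so every global index belongs to exactly one local section.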

The linear algebra operations can be collective or non-collective. In the former, all processes have to participate in the function call, for example dot, which requires a collective broadcast. In the latter, each process only needs to operate on its local section of the array and no communication is required, for example scaling the array by a constant. Collective operations are naturally synchronised.
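The difference between the two kinds of operation can be sketched without MPI by modelling each process as one local chunk. The names Chunks, scal_local, local_dot and dot_all are hypothetical; in a real implementation the loop in dot_all would be replaced by a reduction across processes (for instance MPI_Allreduce), which is what makes the call collective.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// One local buffer per simulated process (assumption for illustration).
using Chunks = std::vector<std::vector<double>>;

// Non-collective: a process scales only its own chunk, no communication needed.
void scal_local(std::vector<double>& local, double a) {
  for (auto& v : local) v *= a;
}

// Partial scalar product over one process's local section.
double local_dot(const std::vector<double>& x, const std::vector<double>& y) {
  assert(x.size() == y.size());
  return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}

// Collective: every process contributes its partial result and all receive the sum.
// Here the reduction is simulated by a plain loop over the chunks.
double dot_all(const Chunks& x, const Chunks& y) {
  assert(x.size() == y.size());
  double total = 0.0;
  for (std::size_t r = 0; r < x.size(); ++r) total += local_dot(x[r], y[r]);
  return total;
}
```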

Warning: The base class does not enforce any locking or exclusive access mechanism. It is up to the specific implementation to decide whether this is necessary. The only rule is that a synchronisation call must complete any outstanding RMA and linear algebra operations.

Example: blocked matrix vector multiplication

// y = A x
auto x = Array(n, comm);
auto y = Array(n, comm);
auto A = Array(n*n, comm);
// Initialize
x.allocate();
x.zero(); // non-collective operation. Each process sets local buffer to zero
y.zero(); // y uses same communicator as x, so does not need a separate synchronisation. It would for RMA.
x.sync(); // Synchronize to make sure all elements are zero
if (rank == 0) { // Simple initialization
  x.put(lo, hi, values);
  x.scatter(indices, values2);
}
initialize(A); // assume A is stored in row major format
x.sync();
// blocked matrix vector multiplication: y[i] = A[i,j] x[j]
// Let's assume there are nb blocks each of size bs (n = nb * bs) to keep things simple.
// RMA operations need a buffer to copy the data into. Upper bounds are past-the-end.
std::vector<double> result_block(bs);
std::vector<double> x_block(bs);
std::vector<double> a_block(bs*bs);
for (auto i = 0; i < nb ; ++i){
  auto i_lo = i * bs;
  auto i_hi = i_lo + bs;
  std::fill_n(begin(result_block), bs, 0.);
  for (auto j = 0; j < nb ; ++j){
    auto j_lo = j * bs;
    auto j_hi = j_lo + bs;
    if (NextTask()){ // task counter assigns operation to current process
      x.get(j_lo, j_hi, x_block.data());
      // row-major A: row r of block (i,j) starts at (i_lo + r) * n + j_lo
      for (auto r = 0; r < bs; ++r)
        A.get((i_lo + r) * n + j_lo, (i_lo + r) * n + j_hi, a_block.data() + r * bs);
      // matrix vector multiplication with accumulation into result vector
      matrix_vector_multiply(a_block, x_block, result_block);
    }
  }
  y.acc(i_lo, i_hi, result_block.data());
}
y.sync();
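
The example calls a helper matrix_vector_multiply that is not part of this class. A minimal sketch of what such a helper might look like is given below, assuming the block is a square bs x bs matrix stored in row-major order and that the result is accumulated rather than overwritten; the signature is an assumption based on how the example uses it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (assumption): result_block += a_block * x_block,
// where a_block is a row-major bs x bs matrix and bs = x_block.size().
void matrix_vector_multiply(const std::vector<double>& a_block,
                            const std::vector<double>& x_block,
                            std::vector<double>& result_block) {
  const std::size_t bs = x_block.size();
  assert(a_block.size() == bs * bs);
  assert(result_block.size() == bs);
  for (std::size_t r = 0; r < bs; ++r) {
    double sum = 0.0;
    for (std::size_t c = 0; c < bs; ++c) sum += a_block[r * bs + c] * x_block[c];
    result_block[r] += sum; // accumulate so contributions from several j blocks add up
  }
}
```

Accumulating into result_block is what lets the inner loop over j build up one block of y before it is sent with a single acc call.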

Classes
class	LocalBuffer
	Provides access to the local portion of the array.

Public Types
using	distributed_array = void
	a compile time tag that this is a distributed array

using	value_type = double

using	index_type = size_t

using	SparseArray = std::map< index_type, double >

using	Distribution = util::Distribution< index_type >

Public Member Functions
virtual	~DistrArray ()=default

MPI_Comm	communicator () const
	return a copy of the communicator

virtual void	sync () const
	Synchronizes all processes in this group and ensures any outstanding operations on the array have completed.

size_t	size () const
	total number of elements, same as overall dimension of array

bool	compatible (const DistrArray &other) const
	Checks that arrays are of the same dimensionality.

virtual void	zero ()
	Set all local elements to zero.

virtual void	error (const std::string &message) const
	stops application with an error

value_type	operator[] (size_t index)

Local buffer
Access the section of the array local to this process
virtual std::unique_ptr< LocalBuffer >	local_buffer ()=0
	Access the buffer local to this process.

virtual std::unique_ptr< const LocalBuffer >	local_buffer () const =0

virtual const Distribution &	distribution () const =0
	Access distribution of the array among processes.

One-sided RMA
One-sided remote-memory-access operations. They are non-collective.
virtual value_type	at (index_type ind) const =0

virtual void	set (index_type ind, value_type val)=0
	Set one element to a scalar. Global operation.

virtual void	get (index_type lo, index_type hi, value_type *buf) const =0
	Gets array[lo:hi) from the global array into buf (hi is past-the-end). Blocking.

virtual std::vector< value_type >	get (index_type lo, index_type hi) const =0

virtual void	put (index_type lo, index_type hi, const value_type *data)=0
	array[lo:hi) = data[:] (hi is past-the-end). Blocking.

virtual void	acc (index_type lo, index_type hi, const value_type *data)=0
	array[lo:hi) += data[:] (hi is past-the-end). Blocking.

virtual std::vector< value_type >	gather (const std::vector< index_type > &indices) const =0
	Gets elements with discontinuous indices from the array. Blocking.

virtual void	scatter (const std::vector< index_type > &indices, const std::vector< value_type > &data)=0
	array[indices[i]] = data[i]. Puts values into elements of the array with discontinuous indices. Blocking.

virtual void	scatter_acc (std::vector< index_type > &indices, const std::vector< value_type > &data)=0
	array[indices[i]] += data[i]. Accumulates values into elements of the array with discontinuous indices. Atomic, blocking, with one-sided communication.

virtual std::vector< value_type >	vec () const =0
	Copies the whole buffer into a vector. Blocking.

Asynchronous linear algebra. No synchronisation on entry or exit.
virtual void	fill (value_type val)

virtual void	copy (const DistrArray &y)

virtual void	copy_patch (const DistrArray &y, index_type start, index_type end)
	Copies elements in a patch of y. If both arrays are empty then does nothing. If only one is empty, throws an error.

virtual void	axpy (value_type a, const DistrArray &y)
	this[:] += a * y[:]. Adds a multiple of another array to this one. Blocking, collective. Throws an error if any array is empty.

virtual void	axpy (value_type a, const SparseArray &y)

virtual void	scal (value_type a)
	Scale by a constant. Local.

virtual void	add (const DistrArray &y)
	Add another array to this. Local. Throws error if any array is empty.

virtual void	add (value_type a)
	Add a constant. Local.

virtual void	sub (const DistrArray &y)
	Subtract another array from this. Local. Throws error if any array is empty.

virtual void	sub (value_type a)
	Subtract a constant. Local.

virtual void	recip ()
	Take element-wise reciprocal of this. Local. No checks are made for zero values.

virtual void	times (const DistrArray &y)
	this[i] *= y[i]. Throws error if any array is empty.

virtual void	times (const DistrArray &y, const DistrArray &z)
	this[i] = y[i]*z[i]. Throws error if any array is empty.

Collective linear algebra operations, synchronisation on exit
virtual value_type	dot (const DistrArray &y) const
	Scalar product of two arrays. Collective. Throws error if any array is empty. Both arrays should be part of the same processor group (same communicator). The result is broadcast to each process.

virtual value_type	dot (const SparseArray &y) const

void	divide (const DistrArray &y, const DistrArray &z, value_type shift=0, bool append=false, bool negative=false)
	this[i] = y[i]/(z[i]+shift). Collective. Throws error if any array is empty.

std::list< std::pair< index_type, value_type > >	min_n (int n) const
	Returns the n smallest elements of the array. Collective operation; must be called by all processes in the group.

std::list< std::pair< index_type, value_type > >	max_n (int n) const
	Returns the n largest elements of the array. Collective operation; must be called by all processes in the group.

std::list< std::pair< index_type, value_type > >	min_abs_n (int n) const
	Returns the n elements of the array that are smallest by absolute value. Collective operation; must be called by all processes in the group.

std::list< std::pair< index_type, value_type > >	max_abs_n (int n) const
	Returns the n elements of the array that are largest by absolute value. Collective operation; must be called by all processes in the group.

std::vector< index_type >	min_loc_n (int n) const
	Finds the indices of the n smallest components of the array. Collective operation; must be called by all processes in the group.

std::map< size_t, value_type >	select_max_dot (size_t n, const DistrArray &y) const

std::map< size_t, value_type >	select_max_dot (size_t n, const SparseArray &y) const

std::map< size_t, value_type >	select (size_t n, bool max=false, bool ignore_sign=false) const

Protected Member Functions
	DistrArray (size_t dimension, MPI_Comm commun)
	Initializes array without allocating any memory.

	DistrArray ()=default

virtual void	_divide (const DistrArray &y, const DistrArray &z, value_type shift, bool append, bool negative)

Protected Attributes
index_type	m_dimension = 0
	number of elements in the array

MPI_Comm	m_communicator

Member Typedef Documentation

◆ distributed_array

using molpro::linalg::array::DistrArray::distributed_array = void

a compile time tag that this is a distributed array

◆ Distribution

using molpro::linalg::array::DistrArray::Distribution = util::Distribution<index_type>

◆ index_type

using molpro::linalg::array::DistrArray::index_type = size_t

◆ SparseArray

using molpro::linalg::array::DistrArray::SparseArray = std::map<index_type, double>

◆ value_type

using molpro::linalg::array::DistrArray::value_type = double

Constructor & Destructor Documentation

◆ DistrArray() [1/2]

molpro::linalg::array::DistrArray::DistrArray	(	size_t	dimension,
		MPI_Comm	commun
	)

protected

Initializes array without allocating any memory.

◆ DistrArray() [2/2]

molpro::linalg::array::DistrArray::DistrArray ( )

protected default

◆ ~DistrArray()

virtual molpro::linalg::array::DistrArray::~DistrArray ( )

virtual default

Member Function Documentation

◆ _divide()

void molpro::linalg::array::DistrArray::_divide	(	const DistrArray &	y,
		const DistrArray &	z,
		DistrArray::value_type	shift,
		bool	append,
		bool	negative
	)

protected virtual

◆ acc()

virtual void molpro::linalg::array::DistrArray::acc	(	index_type	lo,
		index_type	hi,
		const value_type *	data
	)

pure virtual

array[lo:hi) += data[:] (hi is past-the-end). Blocking.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ add() [1/2]

void molpro::linalg::array::DistrArray::add ( const DistrArray & y )

virtual

Add another array to this. Local. Throws error if any array is empty.

◆ add() [2/2]

void molpro::linalg::array::DistrArray::add ( DistrArray::value_type a )

virtual

Add a constant. Local.

◆ at()

virtual value_type molpro::linalg::array::DistrArray::at ( index_type ind ) const

pure virtual

get element at the offset. Blocking.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ axpy() [1/2]

void molpro::linalg::array::DistrArray::axpy	(	value_type	a,
		const DistrArray &	y
	)

virtual

this[:] += a * y[:]. Adds a multiple of another array to this one. Blocking, collective. Throws an error if any array is empty.

◆ axpy() [2/2]

void molpro::linalg::array::DistrArray::axpy	(	value_type	a,
		const SparseArray &	y
	)

virtual

◆ communicator()

MPI_Comm molpro::linalg::array::DistrArray::communicator ( ) const

inline

return a copy of the communicator

◆ compatible()

bool molpro::linalg::array::DistrArray::compatible ( const DistrArray & other ) const

Checks that arrays are of the same dimensionality.

◆ copy()

void molpro::linalg::array::DistrArray::copy ( const DistrArray & y )

virtual

Copies all elements of y. If both arrays are empty then does nothing. If only one is empty, throws an error.

Reimplemented in molpro::linalg::array::DistrArrayDisk.

◆ copy_patch()

void molpro::linalg::array::DistrArray::copy_patch	(	const DistrArray &	y,
		DistrArray::index_type	start,
		DistrArray::index_type	end
	)

virtual

Copies elements in a patch of y. If both arrays are empty then does nothing. If only one is empty, throws an error.

Parameters

y	array to copy
start	index of first element to copy
end	index of last element to copy

◆ distribution()

virtual const Distribution & molpro::linalg::array::DistrArray::distribution ( ) const

pure virtual

Access distribution of the array among processes.

Implemented in molpro::linalg::array::DistrArrayDisk, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ divide()

void molpro::linalg::array::DistrArray::divide	(	const DistrArray &	y,
		const DistrArray &	z,
		value_type	shift = 0,
		bool	append = false,
		bool	negative = false
	)

inline

this[i] = y[i]/(z[i]+shift). Collective. Throws error if any array is empty.

negative? (append? this -=... : this =-...) : (append? this +=... : this =...)

Parameters

y	array in the numerator
z	array in the denominator
shift	denominator shift
append	Whether to += or =
negative	Whether to scale right hand side by -1
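
The interaction of the append and negative flags can be read off the expression above. The following sketch spells it out element by element on plain std::vector data; divide_sketch is a hypothetical free function written for illustration, not the member function itself, and it ignores distribution and error handling for empty arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration (assumption): element-wise form of
//   negative ? (append ? x -= q : x = -q) : (append ? x += q : x = q)
// with q[i] = y[i] / (z[i] + shift).
void divide_sketch(std::vector<double>& x, const std::vector<double>& y,
                   const std::vector<double>& z, double shift = 0, bool append = false,
                   bool negative = false) {
  assert(x.size() == y.size() && y.size() == z.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    double q = y[i] / (z[i] + shift);
    if (negative) q = -q;  // scale right hand side by -1
    if (append)
      x[i] += q;           // accumulate into the existing value
    else
      x[i] = q;            // overwrite the existing value
  }
}
```

No check is made here for a zero denominator; whether the library guards against z[i] + shift == 0 is not stated in this page.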

◆ dot() [1/2]

DistrArray::value_type molpro::linalg::array::DistrArray::dot ( const DistrArray & y ) const

virtual

Scalar product of two arrays. Collective. Throws error if any array is empty. Both arrays should be part of the same processor group (same communicator). The result is broadcast to each process.

Reimplemented in molpro::linalg::array::DistrArrayDisk.

◆ dot() [2/2]

DistrArray::value_type molpro::linalg::array::DistrArray::dot ( const SparseArray & y ) const

virtual

Reimplemented in molpro::linalg::array::DistrArrayDisk.

◆ error()

void molpro::linalg::array::DistrArray::error ( const std::string & message ) const

virtual

stops application with an error

Reimplemented in molpro::linalg::array::DistrArrayGA, and molpro::linalg::array::DistrArrayMPI3.

◆ fill()

void molpro::linalg::array::DistrArray::fill ( DistrArray::value_type val )

virtual

Set all local elements to val.

Note: each process has its own val, there is no communication

◆ gather()

virtual std::vector< value_type > molpro::linalg::array::DistrArray::gather ( const std::vector< index_type > & indices ) const

pure virtual

Gets elements with discontinuous indices from the array. Blocking.

Returns: res[i] = array[indices[i]]

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ get() [1/2]

virtual std::vector< value_type > molpro::linalg::array::DistrArray::get	(	index_type	lo,
		index_type	hi
	)		const

pure virtual

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ get() [2/2]

virtual void molpro::linalg::array::DistrArray::get	(	index_type	lo,
		index_type	hi,
		value_type *	buf
	)		const

pure virtual

Gets array[lo:hi) from the global array into buf (hi is past-the-end). Blocking.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ local_buffer() [1/2]

virtual std::unique_ptr< const LocalBuffer > molpro::linalg::array::DistrArray::local_buffer ( ) const

pure virtual

Implemented in molpro::linalg::array::DistrArrayDisk, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ local_buffer() [2/2]

virtual std::unique_ptr< LocalBuffer > molpro::linalg::array::DistrArray::local_buffer ( )

pure virtual

Access the buffer local to this process.

Implemented in molpro::linalg::array::DistrArrayDisk, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ max_abs_n()

std::list< std::pair< DistrArray::index_type, DistrArray::value_type > > molpro::linalg::array::DistrArray::max_abs_n ( int n ) const

Returns the n elements of the array that are largest by absolute value. Collective operation; must be called by all processes in the group.

Returns: list of index and value pairs, or empty list if array is empty.

◆ max_n()

std::list< std::pair< DistrArray::index_type, DistrArray::value_type > > molpro::linalg::array::DistrArray::max_n ( int n ) const

Returns the n largest elements of the array. Collective operation; must be called by all processes in the group.

Returns: list of index and value pairs, or empty list if array is empty.

◆ min_abs_n()

std::list< std::pair< DistrArray::index_type, DistrArray::value_type > > molpro::linalg::array::DistrArray::min_abs_n ( int n ) const

Returns the n elements of the array that are smallest by absolute value. Collective operation; must be called by all processes in the group.

Returns: list of index and value pairs, or empty list if array is empty.

◆ min_loc_n()

std::vector< DistrArray::index_type > molpro::linalg::array::DistrArray::min_loc_n ( int n ) const

Finds the indices of the n smallest components of the array. Collective operation; must be called by all processes in the group.

Returns: list of indices for smallest n values, or empty list if array is empty.

◆ min_n()

std::list< std::pair< DistrArray::index_type, DistrArray::value_type > > molpro::linalg::array::DistrArray::min_n ( int n ) const

Returns the n smallest elements of the array. Collective operation; must be called by all processes in the group.

Returns: list of index and value pairs, or empty list if array is empty.
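
The selection functions min_n, max_n, min_abs_n and max_abs_n all return index and value pairs. A single-process sketch of that selection is shown below, assuming the result is ordered from the most extreme element onwards (the ordering is not stated in this page). The helper extrema_n and the free functions here are hypothetical; a distributed implementation would additionally have to merge the candidates found by each process.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

using Entry = std::pair<std::size_t, double>; // (index, value)

// Hypothetical helper (assumption): pick the n entries that come first under `better`.
template <class Compare>
std::list<Entry> extrema_n(const std::vector<double>& a, std::size_t n, Compare better) {
  std::vector<Entry> entries;
  entries.reserve(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) entries.emplace_back(i, a[i]);
  n = std::min(n, entries.size()); // empty array gives an empty list
  const auto mid = entries.begin() + static_cast<std::ptrdiff_t>(n);
  std::partial_sort(entries.begin(), mid, entries.end(),
                    [&](const Entry& l, const Entry& r) { return better(l.second, r.second); });
  return std::list<Entry>(entries.begin(), mid);
}

std::list<Entry> min_n(const std::vector<double>& a, std::size_t n) {
  return extrema_n(a, n, [](double l, double r) { return l < r; });
}

std::list<Entry> max_n(const std::vector<double>& a, std::size_t n) {
  return extrema_n(a, n, [](double l, double r) { return l > r; });
}

std::list<Entry> min_abs_n(const std::vector<double>& a, std::size_t n) {
  return extrema_n(a, n, [](double l, double r) { return std::abs(l) < std::abs(r); });
}

std::list<Entry> max_abs_n(const std::vector<double>& a, std::size_t n) {
  return extrema_n(a, n, [](double l, double r) { return std::abs(l) > std::abs(r); });
}
```

std::partial_sort is used so that only the first n entries are fully ordered; the relative order of elements with equal keys is unspecified.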

◆ operator[]()

value_type molpro::linalg::array::DistrArray::operator[] ( size_t index )

inline

◆ put()

virtual void molpro::linalg::array::DistrArray::put	(	index_type	lo,
		index_type	hi,
		const value_type *	data
	)

pure virtual

array[lo:hi) = data[:] (hi is past-the-end). Blocking.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ recip()

void molpro::linalg::array::DistrArray::recip ( )

virtual

Take element-wise reciprocal of this. Local. No checks are made for zero values.

◆ scal()

void molpro::linalg::array::DistrArray::scal ( DistrArray::value_type a )

virtual

Scale by a constant. Local.

◆ scatter()

virtual void molpro::linalg::array::DistrArray::scatter	(	const std::vector< index_type > &	indices,
		const std::vector< value_type > &	data
	)

pure virtual

array[indices[i]] = data[i]. Puts values into elements of the array with discontinuous indices. Blocking.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ scatter_acc()

virtual void molpro::linalg::array::DistrArray::scatter_acc	(	std::vector< index_type > &	indices,
		const std::vector< value_type > &	data
	)

pure virtual

array[indices[i]] += data[i]. Accumulates values into elements of the array with discontinuous indices. Atomic, blocking, with one-sided communication.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.
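
The three indexed operations gather, scatter and scatter_acc differ only in the direction of the copy and in whether existing values are overwritten. The sketch below shows their element-wise meaning on a plain std::vector; the free functions are hypothetical stand-ins with no communication, and how a real implementation handles repeated indices in scatter is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins (assumption): local, single-process versions of the indexed RMA calls.

// res[i] = array[indices[i]]
std::vector<double> gather(const std::vector<double>& array,
                           const std::vector<std::size_t>& indices) {
  std::vector<double> res;
  res.reserve(indices.size());
  for (std::size_t idx : indices) {
    assert(idx < array.size());
    res.push_back(array[idx]);
  }
  return res;
}

// array[indices[i]] = data[i]
void scatter(std::vector<double>& array, const std::vector<std::size_t>& indices,
             const std::vector<double>& data) {
  assert(indices.size() == data.size());
  for (std::size_t i = 0; i < indices.size(); ++i) {
    assert(indices[i] < array.size());
    array[indices[i]] = data[i];
  }
}

// array[indices[i]] += data[i]; repeated indices accumulate every contribution
void scatter_acc(std::vector<double>& array, const std::vector<std::size_t>& indices,
                 const std::vector<double>& data) {
  assert(indices.size() == data.size());
  for (std::size_t i = 0; i < indices.size(); ++i) {
    assert(indices[i] < array.size());
    array[indices[i]] += data[i];
  }
}
```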

◆ select()

std::map< size_t, DistrArray::value_type > molpro::linalg::array::DistrArray::select	(	size_t	n,
		bool	max = false,
		bool	ignore_sign = false
	)		const

◆ select_max_dot() [1/2]

std::map< size_t, DistrArray::value_type > molpro::linalg::array::DistrArray::select_max_dot	(	size_t	n,
		const DistrArray &	y
	)		const

◆ select_max_dot() [2/2]

std::map< size_t, DistrArray::value_type > molpro::linalg::array::DistrArray::select_max_dot	(	size_t	n,
		const SparseArray &	y
	)		const

◆ set()

virtual void molpro::linalg::array::DistrArray::set	(	index_type	ind,
		value_type	val
	)

pure virtual

Set one element to a scalar. Global operation.

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ size()

size_t molpro::linalg::array::DistrArray::size ( ) const

inline

total number of elements, same as overall dimension of array

◆ sub() [1/2]

void molpro::linalg::array::DistrArray::sub ( const DistrArray & y )

virtual

Subtract another array from this. Local. Throws error if any array is empty.

◆ sub() [2/2]

void molpro::linalg::array::DistrArray::sub ( DistrArray::value_type a )

virtual

Subtract a constant. Local.

◆ sync()

void molpro::linalg::array::DistrArray::sync ( ) const

virtual

Synchronizes all processes in this group and ensures any outstanding operations on the array have completed.

Reimplemented in molpro::linalg::array::DistrArrayGA, and molpro::linalg::array::DistrArrayMPI3.

◆ times() [1/2]

void molpro::linalg::array::DistrArray::times ( const DistrArray & y )

virtual

this[i] *= y[i]. Throws error if any array is empty.

◆ times() [2/2]

void molpro::linalg::array::DistrArray::times	(	const DistrArray &	y,
		const DistrArray &	z
	)

virtual

this[i] = y[i]*z[i]. Throws error if any array is empty.

◆ vec()

virtual std::vector< value_type > molpro::linalg::array::DistrArray::vec ( ) const

pure virtual

Copies the whole buffer into a vector. Blocking.

Note: This is only meant for debugging small arrays!

Implemented in molpro::linalg::array::DistrArrayFile, molpro::linalg::array::DistrArrayGA, molpro::linalg::array::DistrArrayMPI3, and molpro::linalg::array::DistrArraySpan.

◆ zero()

void molpro::linalg::array::DistrArray::zero ( )

virtual

Set all local elements to zero.

Member Data Documentation

◆ m_communicator

MPI_Comm molpro::linalg::array::DistrArray::m_communicator

protected

Outer communicator

◆ m_dimension

index_type molpro::linalg::array::DistrArray::m_dimension = 0

protected

number of elements in the array
