Hello,
welcome back. In the following I will explain how define an index set which describes an augmented partition of a matrix
among two processes.
Let’s suppose to have a global matrix \(A\) and a global vector \(x\) defined as
$$ A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 0 & 3 \\ 0 & 1 & 0 \end{pmatrix},\qquad x = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} $$
and we want to evaluate the product
$$ y = Ax = \begin{pmatrix} 5 \\ 9 \\ 2 \end{pmatrix} $$
in parallel.
In order to do a parallel product we need to partition them among 2 processes. We arbitrary choose to split the vector assigning \(x(0)\) to the process with rank 0 and \(x(1)\), \(x(2)\) to the process with rank 1.
The set of global indices is \(I = \{0, 1, 2\}\) and on each partition we have
$$ I_0 = \{0\},\qquad I_1 = \{1, 2\}. $$
The index set created from this partition will describe also the matrix. We notice that \(A(0, 2) \neq 0\) and \(A(1, 0) \neq 0\) therefore, to reduce the communication, we need to augment the partitions adding some global indices to the partitions as follows
$$ \tilde{I_0} = I_0 \cup \{2\} = \{0, 2\},\qquad \tilde{I_1} = I_1 \cup \{0\} = \{0, 1, 2\} $$
where \(\tilde{I}_p\) is the augmented set of global indices.
The indices in \(I_p\) will be tagged as owner while the ones in \(\tilde{I_p} \setminus I_p\) will be tagged as copy. The corresponding index set is
- \(\{(0, 0, o), (2, 1, c)\}\), on rank 0;
- \(\{(1, 0, o), (2, 1, o), (0, 2, c)\}\), on rank 1;
where \((i_g, i_l, \textrm{flag})\). This index set define the following map
$$ m_p(i_g) = i_l $$
where \(i_g\) is the global index while \(i_l\) is the local one.
Therefore for the vector holds
$$ x(i_g) = x_p(m_p(i_g)) $$
and the augmented vectors are
$$ x_0 = \begin{pmatrix} x(0) \\ x(2) \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} $$
and
$$ x_1 = \begin{pmatrix} x(1) \\ x(2) \\ x(0) \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}. $$
Similarly for the matrix holds \(A(i_g, j_g) = A_p(m_p(i_g), m_p(j_g))\) therefore the augmented matrices are
$$ A_0 = \begin{pmatrix} A(0, 0) & A(0, 2) \\ A(2, 0) & A(2, 2) \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} $$
and
$$ A_1 = \begin{pmatrix} A(1, 1) & A(1, 2) & A(1, 0) \\ A(2, 1) & A(2, 2) & A(2, 0) \\ A(0, 1) & A(0, 2) & A(0, 0) \end{pmatrix} = \begin{pmatrix} 0 & 3 & 2 \\ 1 & 0 & 0 \\ 0 & 2 & 1 \end{pmatrix}. $$
Therefore on each process we have defined an index set, an augmented matrix and an augmented vector. We have now everything to perform a parallel matrix-vector product.
We use as communicator the class Dune::OwnerOverlapCopyCommunication while, for the product, we use the operator Dune::OverlappingSchwarzOperator.
The result of this product is
$$ y_0 = \begin{pmatrix} y(0) \\ y(2) \end{pmatrix} = \begin{pmatrix} 5 \\ 2 \end{pmatrix} $$
and
$$ y_1 = \begin{pmatrix} y(1) \\ y(2) \\ y(0) \end{pmatrix} = \begin{pmatrix} 9 \\ 2 \\ 5 \end{pmatrix}. $$
This example can be found at the following link
The additional bool value set in the ParallelIndexSet tells the library whether this index might also be stored on another processes: if that is false then the library assumes that there are no copies anywhere when computing the RemoteIndices.
For further details see On the Generic Parallelisation of Iterative Solvers for the Finite Element Method, P. Bastian and M. Blatt.
Stay tuned!
Marco.