@@ -9,13 +9,99 @@ <h3 class="ui header">Overview</h3>
9
9
10
10
</ div >
11
11
12
+
13
+ < div class ="ui container raised segment fluid ">
14
+
15
+ < h3 class ="ui header "> 2-D Data Distribution</ h3 >
16
+
17
+ < p class ="ui ">
18
+ We consider the standard outer-product parallel matrix multiplication, as described throughout this module,
19
+ for a < i > 2-D block data distribution</ i > . More precisely, we consider the C = A×B multiplication
20
+ where all three matrices are square of dimensions N×N, and contain double precision floating point numbers.
21
+ The execution takes place on p processors, < b > where p is a perfect square and √p
22
+ divides N</ b > . Each processor thus holds an
23
+ N/√p × N/√p square block of each matrix. </ p >
24
+
25
+ < p class ="ui ">
26
+ For instance, with N=6, consider example matrix A as shown below:
27
+ </ p >
28
+
29
+ < p style ="text-align:center; " class ="ui ">
30
+ < pre >
31
+ 10 20 30 40 50 60
32
+ 11 21 31 41 51 61
33
+ 12 22 32 42 52 62
34
+ 13 23 33 43 53 63
35
+ 14 24 34 44 54 64
36
+ 15 25 35 45 55 65
37
+ </ pre >
38
+ </ p >
39
+
40
+ < p class ="ui ">
41
+ Now let's assume that p=4. The 4 processes are logically organized in a 2x2 grid (hence the name of the data
42
+ distribution) in row-major order as follows:
43
+ </ p >
44
+
45
+ < p style ="text-align:center; " class ="ui ">
46
+
47
+ < table class ="ui collapsing celled table ">
48
+ < tbody >
49
+ < tr >
50
+ < td > process #0</ td >
51
+ < td > process #1</ td >
52
+ </ tr >
53
+ < tr >
54
+ < td > process #2</ td >
55
+ < td > process #3</ td >
56
+ </ tr >
57
+ </ tbody >
58
+ </ table >
59
+
60
+ </ p >
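+ < p class ="ui ">
+ The row-major layout above can be sketched in C as follows. This is only an illustrative sketch: the helper names
+ grid_dim, grid_row, and grid_col are hypothetical and not part of the module's code, and the rank is assumed to be
+ the usual 0-based process number.
+ </ p >

```c
#include <assert.h>

/* Hypothetical helper: returns sqrt(p) if p is a perfect square, -1 otherwise. */
int grid_dim(int p) {
    int q = 0;
    while (q * q < p) {
        q++;
    }
    return (q * q == p) ? q : -1;
}

/* Row of the process in the sqrt(p) x sqrt(p) grid, row-major order. */
int grid_row(int rank, int p) {
    int q = grid_dim(p);
    assert(q > 0 && rank >= 0 && rank < p);
    return rank / q;
}

/* Column of the process in the sqrt(p) x sqrt(p) grid, row-major order. */
int grid_col(int rank, int p) {
    int q = grid_dim(p);
    assert(q > 0 && rank >= 0 && rank < p);
    return rank % q;
}
```

+ < p class ="ui ">
+ With p=4, process #2 sits at grid row 1, column 0, and process #3 at grid row 1, column 1.
+ </ p >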
61
+
62
+ < p class ="ui ">
63
+ In this setup, for instance, process #0 holds the following 3x3 block of matrix A:
64
+ </ p >
65
+
66
+ < p style ="text-align:center; " class ="ui ">
67
+ < pre >
68
+ 10 20 30
69
+ 11 21 31
70
+ 12 22 32
71
+ </ pre >
72
+ </ p >
73
+
74
+ < p class ="ui ">
75
+ and process #3 holds the following 3x3 block of matrix A:
76
+ </ p >
77
+
78
+ < p style ="text-align:center; " class ="ui ">
79
+ < pre >
80
+ 43 53 63
81
+ 44 54 64
82
+ 45 55 65
83
+ </ pre >
84
+ </ p >
85
+
86
+ < p > At process #3, the element with value 54 has local indices (1,1), but its global indices are (4,4), assuming that
87
+ indices start at 0.
88
+ </ p >
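+ < p class ="ui ">
+ The local-to-global index conversion can be sketched in C as follows, under the assumption that indices start at 0
+ and that each block has N/√p rows and columns. The function names local_to_global and example_value are
+ hypothetical, and example_value only reproduces the entries of the N=6 example matrix A shown above.
+ </ p >

```c
#include <assert.h>

/* Global index along one dimension: offset of the block plus the local index.
 * grid_coord is the process's row (or column) in the sqrt(p) x sqrt(p) grid,
 * block is the block size N / sqrt(p). */
int local_to_global(int grid_coord, int local_idx, int block) {
    assert(block > 0 && grid_coord >= 0);
    assert(local_idx >= 0 && local_idx < block);
    return grid_coord * block + local_idx;
}

/* Entry of the example matrix A at global indices (i, j):
 * values grow by 1 down a column and by 10 along a row, starting at 10. */
int example_value(int i, int j) {
    return 10 * (j + 1) + i;
}
```

+ < p class ="ui ">
+ For the process at grid row 1, column 1, with N=6 and p=4 (so blocks are 3x3), local indices (1,1) map to global
+ indices (4,4), which is where the value 54 sits in the example matrix.
+ </ p >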
89
+
90
+
91
+
92
+ </ div >
93
+
94
+
12
95
< div class ="ui container segment raised fluid ">
13
96
< h3 class ="ui header "> Roadmap</ h3 >
14
- < p class ="ui "> This module consists of < b > XXXX activities, each described in its own tab above, which should be done in
97
+ < p class ="ui "> This module consists of < b > 2 activities, each described in its own tab above, which should be done in
15
98
sequence:</ b >
16
99
</ p >
17
100
< ul class ="ui list ">
18
- < li class ="item "> < b > Activity #1:</ b > XXXX.
101
+ < li class ="item "> < b > Activity #1:</ b > Have each MPI process allocate and initialize its own
102
+ block of particular matrices, using the 2-D distribution scheme.
103
+ </ li >
104
+ < li class ="item "> < b > Activity #2:</ b > Implement the outer-product matrix multiplication algorithm.
19
105
</ li >
20
106
</ ul >
21
107
</ div >
@@ -28,7 +114,6 @@ <h3 class="ui header">What to turn in</h3>
28
114
< div class ="item "> All source code</ div >
29
115
< div class ="item "> XML platform files (see details in the activities)</ div >
30
116
< div class ="item "> A Makefile that compiles all executables (and has a 'clean' target!)</ div >
31
- < div class ="item "> A README file with answers to the questions asked in the activities</ div >
32
117
</ div >
33
118
</ p >
34
119
</ div >