Add more results, authored Jan 14, 2021 by Jan Eitzinger
AMD-Rome-S2-M4-C64.md
# System
* **Processor:** AMD EPYC 7662 64-Core Processor
* **Base frequency:** 2.0 GHz
* **Number of sockets:** 2
* **Number of memory domains per socket:** 4
* **Number of cores per socket:** 64
* **Number of HWThreads per core:** 2
* **[MachineState](https://github.com/RRZE-HPC/MachineState) output:** NA
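The node-level totals implied by the list above can be worked out with a few multiplications. The following is only a small sketch; the variable names are illustrative and the input values are copied from the system description.

```python
# Topology values copied from the System list above.
sockets = 2
domains_per_socket = 4
cores_per_socket = 64
hwthreads_per_core = 2

# Derived quantities (plain arithmetic, no measurement involved).
total_domains = sockets * domains_per_socket            # memory domains per node
cores_per_domain = cores_per_socket // domains_per_socket
total_cores = sockets * cores_per_socket
total_hwthreads = total_cores * hwthreads_per_core

print(f"memory domains per node: {total_domains}")
print(f"cores per memory domain: {cores_per_domain}")
print(f"cores per node:          {total_cores}")
print(f"HWThreads per node:      {total_hwthreads}")
```

These derived values line up with the result tables on this page, which go up to 16 cores per memory domain and up to 8 memory domains.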
# Tool chain
```
+----------+-------------------------------------------------------------------+
| Compiler | AMD clang                                                         |
|----------|-------------------------------------------------------------------|
| Version  | AMD clang version 10.0.0 (CLANG: AOCC_2.2.0-Build#93 2020_06_25)  |
+----------+-------------------------------------------------------------------+
```
Optimization flags: ```-Ofast -fnt-store=aggressive -std=c99 -fopenmp```
# Results
All results are in ```GB/s```.
Summary results:
```
+-----------------------------------------------------------------+
| Single core   | 35.83 (Triad)                                   |
| Memory domain | 43.15 (Sum with 16 cores)                       |
| Socket        | 182.28 (Update with 16 cores per memory domain) |
| Node          | 414.18 (Update with 16 cores per memory domain) |
+-----------------------------------------------------------------+
```
Results for scaling within a memory domain:
```
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 23.42 8.17 33.14 32.21 35.83 35.13 35.21 35.34
2 23.43 16.24 40.58 41.31 40.87 39.69 40.03 39.25
3 23.44 24.04 40.64 39.47 39.31 37.95 38.23 37.44
4 23.42 31.52 40.02 38.78 38.53 37.03 37.45 36.51
5 23.41 36.46 39.65 38.67 38.16 36.84 37.18 36.27
6 23.42 35.78 39.13 38.34 37.49 36.12 36.57 35.62
7 23.44 35.96 38.59 37.69 36.85 35.48 36.00 35.02
8 23.45 36.72 38.08 37.15 36.35 35.00 35.53 34.56
9 25.88 39.38 39.13 38.56 37.67 36.54 36.83 36.05
10 28.14 40.35 39.57 39.16 38.53 37.63 37.75 37.12
11 30.25 40.67 40.00 39.74 39.17 38.49 38.39 37.95
12 32.27 41.30 40.30 40.24 39.72 39.20 38.94 38.68
13 34.16 41.96 40.62 40.96 40.21 39.90 39.52 39.37
14 35.96 42.56 40.78 41.29 40.59 40.42 39.93 39.89
15 37.53 43.02 40.78 41.49 40.81 40.77 40.19 40.20
16 39.00 43.15 40.69 41.70 40.89 40.90 40.22 40.38
```
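The "Single core" and "Memory domain" entries of the summary can be recovered from the table above by taking the maximum over the relevant rows. The sketch below assumes the table is available as plain text; `parse_table` and `best` are hypothetical helper names, not part of the benchmark.

```python
# Within-domain scaling table, copied verbatim from the results above (GB/s).
TABLE = """\
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 23.42 8.17 33.14 32.21 35.83 35.13 35.21 35.34
2 23.43 16.24 40.58 41.31 40.87 39.69 40.03 39.25
3 23.44 24.04 40.64 39.47 39.31 37.95 38.23 37.44
4 23.42 31.52 40.02 38.78 38.53 37.03 37.45 36.51
5 23.41 36.46 39.65 38.67 38.16 36.84 37.18 36.27
6 23.42 35.78 39.13 38.34 37.49 36.12 36.57 35.62
7 23.44 35.96 38.59 37.69 36.85 35.48 36.00 35.02
8 23.45 36.72 38.08 37.15 36.35 35.00 35.53 34.56
9 25.88 39.38 39.13 38.56 37.67 36.54 36.83 36.05
10 28.14 40.35 39.57 39.16 38.53 37.63 37.75 37.12
11 30.25 40.67 40.00 39.74 39.17 38.49 38.39 37.95
12 32.27 41.30 40.30 40.24 39.72 39.20 38.94 38.68
13 34.16 41.96 40.62 40.96 40.21 39.90 39.52 39.37
14 35.96 42.56 40.78 41.29 40.59 40.42 39.93 39.89
15 37.53 43.02 40.78 41.49 40.81 40.77 40.19 40.20
16 39.00 43.15 40.69 41.70 40.89 40.90 40.22 40.38
"""

def parse_table(text):
    """Return (kernel names, {thread count: [bandwidth per kernel]})."""
    lines = text.strip().splitlines()
    kernels = lines[0].split()[1:]  # drop the '#nt' column label
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        rows[int(fields[0])] = [float(v) for v in fields[1:]]
    return kernels, rows

def best(kernels, rows, threads):
    """Return (bandwidth, kernel, thread count) of the fastest entry
    among the given thread counts."""
    return max((bw, k, nt) for nt in threads for k, bw in zip(kernels, rows[nt]))

kernels, rows = parse_table(TABLE)
single_core = best(kernels, rows, [1])
memory_domain = best(kernels, rows, rows.keys())
print("Single core:  ", single_core)
print("Memory domain:", memory_domain)
```

With the data above, the single-core maximum is Triad and the memory-domain maximum is Sum with 16 threads, matching the summary.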
Results for scaling across memory domains. Each column corresponds to the number of memory domains used (nm), and each row to the number of cores used per memory domain.
Init:
```
#nm 1 2 3 4 5 6 7 8
1 23.42 46.80 70.17 93.39 116.72 139.90 163.25 186.51
2 23.43 46.81 70.20 93.40 116.91 140.07 163.24 186.93
3 23.44 46.83 70.24 93.48 116.78 140.02 163.63 186.35
4 23.42 46.83 70.14 93.64 116.82 140.02 163.03 186.49
5 23.41 46.86 70.25 93.63 116.89 140.05 163.55 186.24
6 23.42 46.86 70.19 93.65 116.96 140.53 163.64 186.85
7 23.44 46.86 70.30 93.67 117.12 140.58 163.98 186.95
8 23.45 46.89 70.32 93.76 117.13 140.42 163.94 187.27
9 25.88 51.73 77.62 103.46 129.34 155.20 180.96 206.69
10 28.14 56.28 84.35 112.41 140.65 168.66 196.57 224.67
11 30.25 60.49 90.74 120.87 151.30 181.45 211.47 241.65
12 32.27 64.49 96.82 129.12 161.07 193.26 225.11 257.60
13 34.16 68.39 102.58 136.61 170.89 204.79 239.30 273.49
14 35.96 71.92 107.87 143.59 179.81 215.61 250.92 287.76
15 37.53 75.10 112.48 150.01 187.76 225.06 263.02 300.74
16 39.00 78.07 117.07 155.87 194.85 234.03 272.51 312.25
```
Sum:
```
#nm 1 2 3 4 5 6 7 8
1 8.17 16.28 24.26 32.33 40.33 48.46 56.48 64.87
2 16.24 32.45 48.33 64.36 80.45 96.45 112.24 128.16
3 24.04 48.10 72.43 96.67 119.72 144.96 167.81 192.66
4 31.52 62.56 94.35 125.68 86.72 186.48 218.07 252.19
5 36.46 73.01 109.20 146.27 182.43 218.32 254.82 292.77
6 35.78 71.55 107.27 142.58 128.45 213.63 246.21 284.28
7 35.96 71.96 107.48 143.37 179.03 212.19 248.61 287.04
8 36.72 73.41 109.78 146.40 182.22 218.01 255.47 291.35
9 39.38 78.64 117.82 156.32 195.02 234.33 273.41 311.02
10 40.35 80.60 120.55 160.54 200.38 239.11 278.60 319.35
11 40.67 81.10 121.70 162.13 201.35 241.95 279.86 322.24
12 41.30 82.52 123.36 164.17 204.71 245.30 285.33 326.99
13 41.96 83.88 125.24 166.52 208.33 249.05 289.74 330.07
14 42.56 84.96 127.11 168.45 211.13 252.30 291.75 334.65
15 43.02 85.76 127.99 170.28 212.49 253.65 294.62 336.58
16 43.15 86.15 128.82 170.71 212.87 254.60 296.24 337.88
```
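Most rows of the Sum table grow steadily with the number of memory domains, so a simple monotonicity check can point out entries worth a second look. The sketch below uses three rows copied from the table; `non_monotonic` is a hypothetical helper, and the source does not say what causes the flagged values.

```python
def non_monotonic(row):
    """Return the 1-based domain counts whose bandwidth is lower
    than the value in the previous column."""
    return [n + 1 for n in range(1, len(row)) if row[n] < row[n - 1]]

# Rows for 4, 5 and 6 cores per memory domain, copied from the Sum table (GB/s).
sum_rows = {
    4: [31.52, 62.56, 94.35, 125.68, 86.72, 186.48, 218.07, 252.19],
    5: [36.46, 73.01, 109.20, 146.27, 182.43, 218.32, 254.82, 292.77],
    6: [35.78, 71.55, 107.27, 142.58, 128.45, 213.63, 246.21, 284.28],
}

flagged = {cores: non_monotonic(row) for cores, row in sum_rows.items()}
for cores, domains in flagged.items():
    print(f"{cores} cores per domain: drop at nm = {domains}")
```

For these rows, the check flags the 5-domain column for 4 and 6 cores per domain, while the row for 5 cores per domain passes.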
Copy:
```
#nm 1 2 3 4 5 6 7 8
1 33.14 66.22 100.61 132.80 165.62 198.30 232.71 263.91
2 40.58 80.93 121.07 161.23 201.26 241.39 280.37 322.12
3 40.64 81.29 121.72 162.27 202.93 243.46 283.27 323.31
4 40.02 79.96 119.80 159.40 197.78 238.99 278.07 316.26
5 39.65 79.32 118.89 158.29 198.21 238.02 278.26 317.75
6 39.13 78.24 117.07 156.25 195.10 233.99 272.62 310.97
7 38.59 77.18 115.70 153.83 192.31 229.83 268.74 305.98
8 38.08 76.07 114.03 152.01 189.34 227.66 265.25 302.02
9 39.13 78.23 117.47 156.45 195.88 235.18 274.36 312.44
10 39.57 79.13 118.75 158.06 197.35 236.92 277.64 316.05
11 40.00 79.99 119.85 159.56 199.29 238.81 279.20 317.96
12 40.30 80.56 120.77 160.86 200.73 240.13 280.81 318.94
13 40.62 81.31 122.19 163.75 205.94 247.65 289.42 331.40
14 40.78 81.48 122.10 162.53 203.86 245.19 288.91 335.03
15 40.78 81.54 122.24 162.58 203.24 244.12 286.56 331.70
16 40.69 81.32 121.87 162.54 202.72 243.18 285.64 329.20
```
Update:
```
#nm 1 2 3 4 5 6 7 8
1 32.21 64.50 96.82 129.53 162.49 193.57 224.44 259.21
2 41.31 82.63 124.20 165.50 206.95 248.68 289.79 332.64
3 39.47 79.20 119.26 159.33 200.45 241.61 280.88 323.21
4 38.78 77.86 117.09 156.39 195.74 235.17 275.95 315.99
5 38.67 78.07 118.46 158.73 200.38 242.20 281.01 327.63
6 38.34 77.22 116.79 157.32 198.43 239.64 282.76 326.79
7 37.69 76.09 115.36 154.97 195.94 237.15 279.56 322.33
8 37.15 74.95 113.25 152.32 192.08 232.98 273.96 315.53
9 38.56 77.88 117.97 158.68 200.08 242.66 284.03 328.92
10 39.16 79.03 119.90 161.39 204.38 248.19 292.73 338.33
11 39.74 80.39 121.76 164.08 207.84 251.65 297.63 343.84
12 40.24 81.28 123.19 166.43 210.07 254.30 300.00 346.20
13 40.96 83.53 127.59 173.09 219.23 265.50 314.49 364.42
14 41.29 84.26 128.40 174.20 222.00 272.58 325.20 374.63
15 41.49 84.46 128.84 175.51 225.82 277.54 335.09 395.36
16 41.70 85.97 132.67 182.28 235.50 290.26 348.92 414.18
```
Triad:
```
#nm 1 2 3 4 5 6 7 8
1 35.83 71.78 107.20 142.32 177.47 213.48 247.02 283.48
2 40.87 81.62 122.39 163.07 203.44 243.03 283.41 323.40
3 39.31 78.59 118.07 157.10 196.21 235.74 273.83 313.68
4 38.53 76.79 115.30 153.26 191.35 229.25 266.80 304.19
5 38.16 76.21 114.58 152.45 190.61 228.50 266.43 304.31
6 37.49 74.84 112.32 149.64 187.13 224.39 261.21 298.23
7 36.85 73.67 110.37 147.05 183.74 220.05 256.51 292.53
8 36.35 72.67 108.75 145.25 181.45 217.76 253.14 288.85
9 37.67 75.36 113.14 150.45 188.15 225.74 262.60 299.94
10 38.53 76.91 115.50 154.15 192.40 230.79 269.53 307.39
11 39.17 78.37 117.71 156.74 196.09 235.00 273.80 312.87
12 39.72 79.33 119.20 159.03 198.52 238.05 277.01 316.48
13 40.21 80.50 120.74 160.92 200.90 241.13 280.83 321.22
14 40.59 81.17 121.82 162.15 202.54 242.53 282.45 323.20
15 40.81 81.66 122.33 163.02 204.04 244.11 284.20 324.86
16 40.89 81.67 122.67 163.15 203.70 244.54 284.19 325.05
```
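One way to read the cross-domain tables is as scaling efficiency: the bandwidth measured on n memory domains divided by n times the single-domain bandwidth. The following is a minimal sketch under that definition, which is an assumption made here and not a metric defined by the benchmark; `scaling_efficiency` is a hypothetical helper.

```python
def scaling_efficiency(row):
    """Bandwidth on n domains relative to n times the single-domain bandwidth.

    row[0] is the value for one memory domain, row[1] for two, and so on.
    """
    base = row[0]
    return [bw / ((n + 1) * base) for n, bw in enumerate(row)]

# Rows for 16 cores per memory domain, copied from the Update and Triad tables (GB/s).
update_16 = [41.70, 85.97, 132.67, 182.28, 235.50, 290.26, 348.92, 414.18]
triad_16 = [40.89, 81.67, 122.67, 163.15, 203.70, 244.54, 284.19, 325.05]

for name, row in (("Update", update_16), ("Triad", triad_16)):
    eff = scaling_efficiency(row)
    print(f"{name:6s}", " ".join(f"{e:.2f}" for e in eff))
```

For these two rows, Triad stays close to linear scaling up to 8 domains (about 0.99), whereas Update ends above it (about 1.24). The source does not comment on the reason for this difference.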
# Scaling
Memory bandwidth scaling within one memory domain:

The following plots illustrate the performance scaling across multiple memory domains for different numbers of cores per memory domain.
Memory bandwidth scaling across memory domains for Init:

Memory bandwidth scaling across memory domains for Sum:

Memory bandwidth scaling across memory domains for Copy:

Memory bandwidth scaling across memory domains for Triad:
