High-Performance Linpack
HPL Benchmark on my laptop
It's Top500 season, so I ran HPL on my laptop using Intel's latest oneAPI release, version 2021.1.10.2261.
The CPU specifications, as reported by lscpu:
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Stepping: 9
CPU MHz: 874.469
CPU max MHz: 3800.0000
CPU min MHz: 800.0000
BogoMIPS: 5599.85
Virtualization: VT-x
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 1 MiB
L3 cache: 6 MiB
The laptop is an Asus ROG running Linux Mint 20.
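As a rough yardstick for a benchmark on this CPU, the theoretical double-precision peak can be estimated from the lscpu figures above. This is a minimal sketch, not a measured value. It assumes 16 double-precision FLOPs per cycle per core (two 256-bit FMA units), which lscpu does not report, and the function name `peak_gflops` is only illustrative.

```python
# Rough theoretical double-precision peak for the CPU listed above.
# ASSUMPTION: 16 DP FLOPs per cycle per core (2 FMA units x 4 doubles x 2 ops);
# this figure is not part of the lscpu output.
CORES = 4              # "Core(s) per socket" x "Socket(s)" from lscpu
FLOPS_PER_CYCLE = 16   # assumed, see note above


def peak_gflops(freq_ghz, cores=CORES, flops_per_cycle=FLOPS_PER_CYCLE):
    """Return the theoretical peak in GFlops at the given clock (GHz)."""
    return cores * flops_per_cycle * freq_ghz


if __name__ == "__main__":
    # Base clock from the model name, maximum clock from "CPU max MHz".
    for label, ghz in [("base 2.8 GHz", 2.8), ("max 3.8 GHz", 3.8)]:
        print(f"{label}: {peak_gflops(ghz):.1f} GFlops")
```

Under that assumption the peak is about 179 GFlops at the base clock and about 243 GFlops at the maximum clock. The maximum clock is unlikely to be sustained on all four cores under a heavy vector load, so the second number is an upper bound only.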
Here are the benchmark results:
$ ./runme_xeon64
This is a SAMPLE run script for running a shared-memory version of
Intel(R) Distribution for LINPACK* Benchmark. Change it to reflect
the correct number of CPUs/threads, problem input files, etc..
*Other names and brands may be claimed as the property of others.
Fri 13 Nov 2020 08:29:57 IST
Sample data file lininput_xeon64.
Current date/time: Fri Nov 13 08:29:57 2020
CPU frequency: 3.391 GHz
Number of CPUs: 1
Number of cores: 4
Number of threads: 4
Parameters are set to:
Number of tests: 12
Number of equations to solve (problem size) : 1000 2000 5000 10000 15000 18000 20000 22000 25000 26000 27000 30000
Leading dimension of array : 1000 2000 5008 10000 15000 18008 20016 22008 25000 26000 27000 30000
Number of trials to run : 4 2 2 2 2 2 2 2 2 2 1 1
Data alignment value (in Kbytes) : 4 4 4 4 4 4 4 4 4 4 4 1
Maximum memory requested that can be used=7200601024, at the size=30000
=================== Timing linear equation system solver ===================
Size LDA Align. Time(s) GFlops Residual Residual(norm) Check
1000 1000 4 0.007 96.3645 1.022959e-12 3.033181e-02 pass
1000 1000 4 0.006 103.2200 1.022959e-12 3.033181e-02 pass
1000 1000 4 0.006 104.5280 1.022959e-12 3.033181e-02 pass
1000 1000 4 0.007 96.2256 1.022959e-12 3.033181e-02 pass
2000 2000 4 0.054 99.1910 5.619838e-12 4.375464e-02 pass
2000 2000 4 0.053 99.9669 5.619838e-12 4.375464e-02 pass
5000 5008 4 0.634 131.5344 2.548040e-11 3.392018e-02 pass
5000 5008 4 0.636 131.2024 2.548040e-11 3.392018e-02 pass
10000 10000 4 4.641 143.6870 1.054555e-10 3.553909e-02 pass
10000 10000 4 4.506 147.9811 1.054555e-10 3.553909e-02 pass
15000 15000 4 14.650 153.6162 2.368669e-10 3.581348e-02 pass
15000 15000 4 15.110 148.9348 2.368669e-10 3.581348e-02 pass
18000 18008 4 26.769 145.2679 3.162348e-10 3.349350e-02 pass
18000 18008 4 27.580 140.9929 3.162348e-10 3.349350e-02 pass
20000 20016 4 38.582 138.2543 3.807211e-10 3.257923e-02 pass
20000 20016 4 40.702 131.0518 3.807211e-10 3.257923e-02 pass
22000 22008 4 52.958 134.0617 4.590843e-10 3.258820e-02 pass
22000 22008 4 53.794 131.9777 4.590843e-10 3.258820e-02 pass
25000 25000 4 79.499 131.0447 5.770316e-10 3.184866e-02 pass
25000 25000 4 80.791 128.9492 5.770316e-10 3.184866e-02 pass
26000 26000 4 91.586 127.9534 6.257559e-10 3.196386e-02 pass
26000 26000 4 92.436 126.7756 6.257559e-10 3.196386e-02 pass
27000 27000 4 104.169 125.9827 5.721172e-10 2.712944e-02 pass
30000 30000 1 143.041 125.8508 7.350489e-10 2.829664e-02 pass
Performance Summary (GFlops)
Size LDA Align. Average Maximal
1000 1000 4 100.0845 104.5280
2000 2000 4 99.5789 99.9669
5000 5008 4 131.3684 131.5344
10000 10000 4 145.8340 147.9811
15000 15000 4 151.2755 153.6162
18000 18008 4 143.1304 145.2679
20000 20016 4 134.6530 138.2543
22000 22008 4 133.0197 134.0617
25000 25000 4 129.9969 131.0447
26000 26000 4 127.3645 127.9534
27000 27000 4 125.9827 125.9827
30000 30000 1 125.8508 125.8508
Residual checks PASSED
End of tests
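The GFlops column can be reproduced from the Size and Time(s) columns. The sketch below assumes the conventional LINPACK operation count of 2/3 n^3 + 2 n^2. That formula is not printed in the output, but it matches the reported figures to within the rounding of the times. The function names are illustrative and not part of the benchmark.

```python
def linpack_gflops(n, seconds):
    """GFlops for solving an n x n system in `seconds`.

    ASSUMPTION: operation count of 2/3*n^3 + 2*n^2, the usual LINPACK
    convention; the benchmark output itself does not state the formula.
    """
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / seconds / 1e9


def matrix_bytes(n, lda):
    """Bytes needed for an n-column matrix of doubles with leading dimension lda."""
    return 8 * n * lda


if __name__ == "__main__":
    # (size, time in seconds) pairs copied from the results table above.
    runs = [(27000, 104.169), (30000, 143.041)]
    for n, t in runs:
        print(f"N={n}: {linpack_gflops(n, t):.4f} GFlops")
    print(f"Matrix at N=30000: {matrix_bytes(30000, 30000)} bytes")
```

For the N=30000 run this gives roughly 125.85 GFlops, in line with the reported 125.8508. The matrix alone takes 7,200,000,000 bytes, slightly less than the 7200601024 bytes the benchmark reports as the maximum requested memory.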
Below are three screen captures showing the load on the machine during the test (taken during an earlier run): top (top), netdata (middle), and gkrellm (bottom).